I am reading https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/data_stream_api/#examples-for-fromchangelogstream.
Example 1:
// === EXAMPLE 1 ===
// interpret the stream as a retract stream
// create a changelog DataStream
val dataStream = env.fromElements(
Row.ofKind(RowKind.INSERT, "Alice", Int.box(12)),
Row.ofKind(RowKind.INSERT, "Bob", Int.box(5)),
Row.ofKind(RowKind.UPDATE_BEFORE, "Alice", Int.box(12)),
Row.ofKind(RowKind.UPDATE_AFTER, "Alice", Int.box(100))
)(Types.ROW(Types.STRING, Types.INT))
// interpret the DataStream as a Table
val table = tableEnv.fromChangelogStream(dataStream)
// register the table under a name and perform an aggregation
tableEnv.createTemporaryView("InputTable", table)
tableEnv
.executeSql("SELECT f0 AS name, SUM(f1) AS score FROM InputTable GROUP BY f0")
.print()
// prints:
// +----+--------------------------------+-------------+
// | op | name | score |
// +----+--------------------------------+-------------+
// | +I | Bob | 5 |
// | +I | Alice | 12 |
// | -D | Alice | 12 |
// | +I | Alice | 100 |
// +----+--------------------------------+-------------+
Example 2:
// === EXAMPLE 2 ===
// convert to DataStream in the simplest and most general way possible (no event-time)
val simpleTable = tableEnv
.fromValues(row("Alice", 12), row("Alice", 2), row("Bob", 12))
.as("name", "score")
.groupBy($"name")
.select($"name", $"score".sum())
tableEnv
.toChangelogStream(simpleTable)
.executeAndCollect()
.foreach(println)
// prints:
// +I[Bob, 12]
// +I[Alice, 12]
// -U[Alice, 12]
// +U[Alice, 14]
For these two examples, why does the first one print -D and +I for the last two records, while the second one prints -U and +U? What is the rule for determining the kind of change? Thanks.
The reason for the difference has two parts, both defined in GroupAggFunction, the process function used to execute this query.
The first is this part of the code:
// update aggregate result and set to the newRow
if (isAccumulateMsg(input)) {
// accumulate input
function.accumulate(input);
} else {
// retract input
function.retract(input);
}
When a new value is received for a given key, the method first checks whether it is an accumulate message (RowKind.INSERT or RowKind.UPDATE_AFTER) or a retract message (RowKind.UPDATE_BEFORE or RowKind.DELETE).
In your first example, you explicitly state the RowKind yourself. When execution reaches Row.ofKind(RowKind.UPDATE_BEFORE, "Alice", Int.box(12)), which is a retract message, the function first retracts the input from the existing accumulator. After the retraction, we are left with a key whose accumulator is empty. When that happens, the code below is reached:
} else {
// we retracted the last record for this key
// sent out a delete message
if (!firstRow) {
// prepare delete message for previous row
resultRow.replace(currentKey, prevAggValue).setRowKind(RowKind.DELETE);
out.collect(resultRow);
}
// and clear all state
accState.clear();
// cleanup dataview under current key
function.cleanup();
}
Since this is not the first row received for the key "Alice", we emit a delete message (-D) for the previous row and clear the key's state. Because the state was cleared, the subsequent UPDATE_AFTER is treated as the first row for that key again, so it emits an INSERT (+I).
For your second example, where you don't explicitly specify the RowKind, all messages arrive with RowKind.INSERT by default. This means the existing accumulator is never retracted, and the following code path is taken:
if (!recordCounter.recordCountIsZero(accumulators)) {
// we aggregated at least one record for this key
// update the state
accState.update(accumulators);
// if this was not the first row and we have to emit retractions
if (!firstRow) {
if (stateRetentionTime <= 0 && equaliser.equals(prevAggValue, newAggValue)) {
// newRow is the same as before and state cleaning is not enabled.
// We do not emit retraction and acc message.
// If state cleaning is enabled, we have to emit messages to prevent too early
// state eviction of downstream operators.
return;
} else {
// retract previous result
if (generateUpdateBefore) {
// prepare UPDATE_BEFORE message for previous row
resultRow
.replace(currentKey, prevAggValue)
.setRowKind(RowKind.UPDATE_BEFORE);
out.collect(resultRow);
}
// prepare UPDATE_AFTER message for new row
resultRow.replace(currentKey, newAggValue).setRowKind(RowKind.UPDATE_AFTER);
}
Since the row count is greater than 0 (we didn't retract), this is not the first row received for the key, and the aggregation has generateUpdateBefore set to true, we first emit an UPDATE_BEFORE message (-U) followed immediately by an UPDATE_AFTER (+U).
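The rule can be summarized: an accumulate message on a key that already has a result emits -U/+U (when generateUpdateBefore is true), while a retract message that empties the key's accumulator emits -D and resets the key, so the next result is a fresh +I. Below is a small plain-Java sketch of that decision logic. This is a simplified model of the rule described above, not Flink's actual GroupAggFunction, and it hardcodes a per-key SUM:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the emission rule (NOT Flink's real code):
// one instance models the state kept for a single grouping key.
class SumAggModel {
    long sum = 0;            // accumulator value
    long count = 0;          // number of accumulated records for this key
    boolean firstRow = true; // no result emitted for this key yet

    // kind is "+I" / "+U" (accumulate) or "-U" / "-D" (retract)
    List<String> process(String kind, String key, int value) {
        List<String> out = new ArrayList<>();
        long prev = sum;
        boolean accumulate = kind.equals("+I") || kind.equals("+U");
        if (accumulate) { sum += value; count++; } else { sum -= value; count--; }

        if (count > 0) {
            if (firstRow) {
                out.add("+I[" + key + ", " + sum + "]"); // first result for this key
                firstRow = false;
            } else {
                out.add("-U[" + key + ", " + prev + "]"); // generateUpdateBefore == true
                out.add("+U[" + key + ", " + sum + "]");
            }
        } else {
            if (!firstRow) {
                out.add("-D[" + key + ", " + prev + "]"); // retracted the last record
            }
            // state is cleared, so the next record for this key emits +I again
            firstRow = true;
            sum = 0;
        }
        return out;
    }
}
```

Feeding it Example 1's input for "Alice" (accumulate 12, retract 12, accumulate 100) yields +I, then -D, then +I; feeding Example 2's (accumulate 12, accumulate 2) yields +I, then -U/+U, matching both outputs above.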
All the relevant code can be found in GroupAggFunction in the Flink source.
I am trying to automate a scenario using Cucumber.
The step Then Create item actually takes values from the first row only.
What I want is to execute the step Then Create item twice, once per row, before moving on to the step Then assigns to CRSA.
But my code is taking values from the first row only (0P00A). How do I take values from both rows?
Background: Application login
Given User launch the application on browser
When User logs in to application
Scenario: Test
Then Create item
| Item ID | Attribute Code | New Value | Old Value |
| 0P00A | SR | XYZ21 | ABC21 |
| 0P00B | CA | XYZ22 | ABC22 |
Then assigns to CRSA
@Then("Create item")
public void createItem(DataTable dataTable) {
List<Map<String, String>> inputData = dataTable.asMaps();
}
You can use a for-each loop like below:
List<Map<String, String>> inputData = dataTable.asMaps();
for (Map<String, String> columns : inputData) {
    columns.get("Item ID");
    columns.get("Attribute Code");
}
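As a sanity check, here is a self-contained sketch using plain Java stand-ins for the Cucumber objects (the hardcoded rows mirror the table above). dataTable.asMaps() returns one Map per data row, keyed by the header cells, so the loop visits both items:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CreateItemDemo {
    // Stand-in for iterating dataTable.asMaps(): one Map per row.
    static List<String> itemIds(List<Map<String, String>> inputData) {
        List<String> ids = new ArrayList<>();
        for (Map<String, String> columns : inputData) {
            ids.add(columns.get("Item ID")); // runs once per row, not just the first
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
            Map.of("Item ID", "0P00A", "Attribute Code", "SR"),
            Map.of("Item ID", "0P00B", "Attribute Code", "CA"));
        System.out.println(itemIds(rows)); // both rows are visited
    }
}
```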
So I'm a newbie when it comes to creating GUIs in Python, and I was wondering how I can add values/integers to an Entry field whose state is disabled. I've done the same thing in Java, but I can't figure out how to translate it from Java to Python.
Java Version
C_ID.setText(String.valueOf(Integer.parseInt(C_ID.getText())+1));
Python Code
# Title for the Registration Form
label = Label(window, text="Customer Registration System", width=30, height=1, bg="yellow", anchor="center")
label.config(font=("Courier", 10))
label.grid(column=2, row=1)
# Customer ID Label (Left)
ID_Label = Label(window, text="Customer ID:", width=14, height=1, bg="yellow", anchor="w")
ID_Label.config(font=("Courier", 10))
ID_Label.grid(column=1, row=2)
# Customer ID Input Field
C_ID = StringVar()
C_ID = Entry(window, textvariable=C_ID, text="1")
C_ID.insert(0, "1")
C_ID.config(state=DISABLED)
C_ID.grid(column=2, row=2)
Extra Info:
The ID needs to increment by 1 every time I press the save button.
def save():
if len(C_Name.get()) == 0 or len(C_Email.get()) == 0 or len(C_Birthday.get()) == 0 or len(
C_Address.get()) == 0 or len(C_Contact.get()) == 0:
msg_box("All Input Fields Must Be Complete", "Record")
elif not check_email:
msg_box("Please Input a Valid Email", "Record")
elif not check_dateValid:
msg_box("Please Input a Valid date", "Record")
elif not check_minor:
msg_box("Minors are Not Allowed to Register", "Record")
else:
msg_box("Save Record", "Record")
I tried using this code:
C_ID.config(text=str(int(C_ID.get())+1))
But it doesn't seem to increment no matter what I do.
You're close, but a tkinter Entry can't be modified while its state is DISABLED, and an Entry has no text option, so config(text=...) won't work.
You need to set the state to NORMAL, update the value, and then set the state back to DISABLED:
C_ID.config(state=NORMAL)          # enable editing first
new_value = int(C_ID.get()) + 1    # read and increment the current value
C_ID.delete(0, END)                # delete the current value in the entry
C_ID.insert(0, str(new_value))     # insert the incremented value
C_ID.config(state=DISABLED)        # set the state back to disabled
I am reading a JSON file and creating a view in Spark with Java. When I display it, two extra rows appear, one at the start and one at the end, with all null values.
I have tried different options like multiLine=true, but it's not working.
class Something
{
public void DoSomething() {
SparkSession session = SparkSession.builder().appName("jsonreader")
.master("local[4]").getOrCreate();
Dataset<Row> jsondataset = session.read()
.json("G:\\data\\employee.json");
jsondataset.select("id","name","age").show();
}
}
+----+-------+----+
| id| name| age|
+----+-------+----+
|null| null|null|
|1201| satish| 25|
|1202|krishna| 28|
|null| null|null|
+----+-------+----+
{
{"id" : "1201", "name" : "satish", "age" : "25"}
{"id" : "1202", "name" : "krishna", "age" : "28"}
}
That is my JSON file, and I am getting output rows with null values as shown above.
Can anyone help me understand why?
The extra curly brackets are causing this. You will have to handle them either before reading the JSON or after reading it, i.e. through Spark. Also note that the NULLs are read as the string "null", not actual NULLs. Below is my workaround; the filter condition uniquely identifies these faulty rows because "null" is a string:
jsondataset = jsondataset.select("age","id","name").filter("age <> 'null'")
jsondataset.show()
// Result
// +---+----+-------+
// |age|id |name |
// +---+----+-------+
// |25 |1201|satish |
// |28 |1202|krishna|
// +---+----+-------+
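A cleaner fix is to repair the input file itself: by default, Spark's JSON reader expects the JSON Lines format, i.e. one complete JSON object per line with no surrounding braces. Rewriting employee.json like this makes the null rows disappear without any filtering:

```json
{"id" : "1201", "name" : "satish", "age" : "25"}
{"id" : "1202", "name" : "krishna", "age" : "28"}
```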
For example:
An attribute value in the domain is changed (e.g. a string or boolean value changes from 'true' to 'false'). Everything happens in one session, and the REST service that should pick up the new data does not seem to be updating.
This is often caused by incorrect domain usage in the Hyperon runtime.
Compare these two usages (incorrect and correct):
1) Incorrect - the domain element is reused (it should not be):
// incorrect:
HyperonDomainObject lob = engine.getDomain("GROUP", "LOB[GROUP]");
HyperonDomainObject trm = lob.getChild("PRODUCT", "PRD3");
HyperonDomainObject adb = trm.getChild("RIDER", "ADB");
// adb is domain element snapshot
while (true) {
log.info("code = {}", adb.getAttrString("CODE", ctx));
// sleep 3 sec
...
}
// == console == (in meantime user changes attribute CODE from "A" to "BBB")
// but adb is frozen - it is snapshot
code = A
code = A
code = A
code = A
...
2) Correct usage - always get fresh domain objects:
while (true) {
HyperonDomainObject lob = engine.getDomain("GROUP", "LOB[GROUP]");
HyperonDomainObject trm = lob.getChild("PRODUCT", "PRD3");
HyperonDomainObject adb = trm.getChild("RIDER", "ADB");
log.info("code = {}", adb.getAttrString("CODE", ctx));
// sleep 3 sec
...
}
// == console == (in meantime user changes attribute CODE from "A" to "BBB")
// adb is always fresh
code = A
code = A
code = BBB
code = BBB
...
Remember, engine.getDomain() returns a domain object snapshot, and this object is frozen.
Treat engine.getDomain() as a cache lookup: it finds objects in an in-memory structure, and that structure is refreshed whenever a user modifies the domain in Hyperon Studio.
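The distinction can be illustrated with a toy example. These classes are hypothetical plain-Java analogies, not the Hyperon API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy analogy (NOT the Hyperon API): a snapshot taken from the engine is frozen,
// while a fresh lookup reflects later modifications.
class ToyEngine {
    private final Map<String, String> live = new HashMap<>();

    ToyEngine() { live.put("CODE", "A"); }

    // Like engine.getDomain(): returns an immutable copy of the current state.
    Map<String, String> getSnapshot() { return Map.copyOf(live); }

    // Like editing the attribute in Hyperon Studio.
    void update(String attr, String value) { live.put(attr, value); }
}
```

A snapshot held across an update keeps returning "A" forever; calling getSnapshot() again after the update returns "BBB", which is exactly the difference between the incorrect and correct loops above.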
I would like to create a HashMap where the key is a String and the value is a List. All the values are taken from a MySQL table. The problem is that the keys are right but the values are not, because they get overwritten: for all the different keys I end up with the same list with the same content.
This is the code:
public static HashMap<String,List<Table_token>> getHashMapFromTokenTable() throws SQLException, Exception{
DbAccess.initConnection();
List<Table_token> listFrom_token = new ArrayList();
HashMap<String,List<Table_token>> hMapIdPath = new HashMap<String,List<Table_token>>();
String query = "select * from token";
resultSet = getResultSetByQuery(query);
while(resultSet.next()){
String token=resultSet.getString(3);
String path=resultSet.getString(4);
String word=resultSet.getString(5);
String lemma=resultSet.getString(6);
String postag=resultSet.getString(7);
String isTerminal=resultSet.getString(8);
Table_token t_token = new Table_token();
t_token.setIdToken(token);
t_token.setIdPath(path);
t_token.setWord(word);
t_token.setLemma(lemma);
t_token.setPosTag(postag);
t_token.setIsTerminal(isTerminal);
listFrom_token.add(t_token);
System.out.println("path "+path+" path2: "+token);
int row = resultSet.getRow();
if(resultSet.next()){
if((resultSet.getString(4).compareTo(path)!=0)){
hMapIdPath.put(path, listFrom_token);
listFrom_token.clear();
}
resultSet.absolute(row);
}
if(resultSet.isLast()){
hMapIdPath.put(path, listFrom_token);
listFrom_token.clear();
}
}
DbAccess.closeConnection();
return hMapIdPath;
}
You can find an example of the content of the HashMap below:
key: p000000383
content: [t0000000000000019231, t0000000000000019232, t0000000000000019233]
key: p000000384
content: [t0000000000000019231, t0000000000000019232, t0000000000000019233]
The values in "content" are the ones from the last key's rows in the MySQL table.
mysql> select * from token where idpath='p000003361';
+---------+------------+----------------------+------------+
| idDoc | idSentence | idToken | idPath |
+---------+------------+----------------------+------------+
| d000095 | s000000048 | t0000000000000019231 | p000003361 |
| d000095 | s000000048 | t0000000000000019232 | p000003361 |
| d000095 | s000000048 | t0000000000000019233 | p000003361 |
+---------+------------+----------------------+------------+
3 rows in set (0.04 sec)
You need to allocate a new listFrom_token each time instead of clear()ing it. Replace this:
listFrom_token.clear();
with:
listFrom_token = new ArrayList<Table_token>();
Putting the list in the HashMap does not make a copy of the list. You are clearing and refilling the same list over and over.
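Here is a minimal, self-contained demonstration of the aliasing (the keys and values are made up): in the buggy version both map entries point at the same list object, so clearing and refilling it changes what both keys see.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListReuseDemo {
    // Bug: the SAME list object is put under both keys, then mutated.
    static Map<String, List<String>> buggy() {
        Map<String, List<String>> map = new HashMap<>();
        List<String> list = new ArrayList<>();
        list.add("t1");
        map.put("p1", list);
        list.clear();          // "p1" now maps to an empty list!
        list.add("t2");
        map.put("p2", list);   // "p1" and "p2" both see ["t2"]
        return map;
    }

    // Fix: allocate a fresh list for each key.
    static Map<String, List<String>> fixed() {
        Map<String, List<String>> map = new HashMap<>();
        List<String> list = new ArrayList<>();
        list.add("t1");
        map.put("p1", list);
        list = new ArrayList<>();  // new list instead of clear()
        list.add("t2");
        map.put("p2", list);
        return map;
    }
}
```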
Your data shows that idPath is not a primary key. That's what you need to be the key in the Map. Maybe you should make idToken the key in the Map - it's the only thing in your example that's unique.
Your other choice is to make the column name the key and give the values to the List. Then you'll have four keys, each with a List containing four values.