Objects in the HashMap are overwritten - java

I would like to create a HashMap where the key is a String and the value is a List. All the values are taken from a MySQL table. The problem is that the keys in my HashMap are right but the values are not, because they get overwritten: every key ends up with the same list containing the same content.
This is the code:
public static HashMap<String,List<Table_token>> getHashMapFromTokenTable() throws SQLException, Exception {
    DbAccess.initConnection();
    List<Table_token> listFrom_token = new ArrayList();
    HashMap<String,List<Table_token>> hMapIdPath = new HashMap<String,List<Table_token>>();
    String query = "select * from token";
    resultSet = getResultSetByQuery(query);
    while (resultSet.next()) {
        String token = resultSet.getString(3);
        String path = resultSet.getString(4);
        String word = resultSet.getString(5);
        String lemma = resultSet.getString(6);
        String postag = resultSet.getString(7);
        String isTerminal = resultSet.getString(8);
        Table_token t_token = new Table_token();
        t_token.setIdToken(token);
        t_token.setIdPath(path);
        t_token.setWord(word);
        t_token.setLemma(lemma);
        t_token.setPosTag(postag);
        t_token.setIsTerminal(isTerminal);
        listFrom_token.add(t_token);
        System.out.println("path " + path + " path2: " + token);
        int row = resultSet.getRow();
        if (resultSet.next()) {
            if ((resultSet.getString(4).compareTo(path) != 0)) {
                hMapIdPath.put(path, listFrom_token);
                listFrom_token.clear();
            }
            resultSet.absolute(row);
        }
        if (resultSet.isLast()) {
            hMapIdPath.put(path, listFrom_token);
            listFrom_token.clear();
        }
    }
    DbAccess.closeConnection();
    return hMapIdPath;
}
You can find an example of the content of the HashMap below:
key: p000000383
content: [t0000000000000019231, t0000000000000019232, t0000000000000019233]
key: p000000384
content: [t0000000000000019231, t0000000000000019232, t0000000000000019233]
The values under "content" come from the last rows of the MySQL table, and they are identical for every key.
mysql> select * from token where idpath='p000003361';
+---------+------------+----------------------+------------+
| idDoc | idSentence | idToken | idPath |
+---------+------------+----------------------+------------+
| d000095 | s000000048 | t0000000000000019231 | p000003361 |
| d000095 | s000000048 | t0000000000000019232 | p000003361 |
| d000095 | s000000048 | t0000000000000019233 | p000003361 |
+---------+------------+----------------------+------------+
3 rows in set (0.04 sec)

You need to allocate a new listFrom_token each time instead of clear()ing it. Replace this:
listFrom_token.clear();
with:
listFrom_token = new ArrayList<Table_token>();
Putting the list in the HashMap does not make a copy of the list. You are clearing and refilling the same list over and over.
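A minimal, self-contained sketch (not from the original answer) showing that the map stores a reference, not a copy:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AliasDemo {
    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        List<String> list = new ArrayList<>();

        list.add("a");
        map.put("key1", list);   // stores a reference, not a copy
        list.clear();            // also empties the list that "key1" points to
        list.add("b");
        map.put("key2", list);   // both keys now share the same single list

        System.out.println(map); // {key1=[b], key2=[b]} (iteration order may vary)
    }
}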

Your data shows that idPath is not a primary key, yet that's what you are using as the key in the Map. Maybe you should make idToken the key in the Map - it's the only thing in your example that's unique.
Your other choice is to make the column name the key and give the values to the List. Then you'll have four keys, each with a List containing three values.
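If you do want one list per idPath, a sketch of the usual grouping pattern (Java 8+, reusing the question's column indexes and Table_token setters) that also removes the lookahead logic entirely:
// Sketch: group rows by idPath; computeIfAbsent creates a fresh list
// the first time each path is seen.
Map<String, List<Table_token>> hMapIdPath = new HashMap<>();
while (resultSet.next()) {
    String path = resultSet.getString(4);
    Table_token t_token = new Table_token();
    t_token.setIdToken(resultSet.getString(3));
    t_token.setIdPath(path);
    t_token.setWord(resultSet.getString(5));
    t_token.setLemma(resultSet.getString(6));
    t_token.setPosTag(resultSet.getString(7));
    t_token.setIsTerminal(resultSet.getString(8));
    hMapIdPath.computeIfAbsent(path, k -> new ArrayList<>()).add(t_token);
}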

Related

How to run a Cucumber Step multiple times with different data?

I am trying to automate a scenario using Cucumber.
The step Then Create item actually takes values from the first row only.
What I want is to execute the step Then Create item twice, before moving on to the step Then assigns to CRSA.
But my code takes values from the first row only (0P00A). How can I take values from both rows?
Background: Application login
  Given User launch the application on browser
  When User logs in to application

Scenario: Test
  Then Create item
    | Item ID | Attribute Code | New Value | Old Value |
    | 0P00A   | SR             | XYZ21     | ABC21     |
    | 0P00B   | CA             | XYZ22     | ABC22     |
  Then assigns to CRSA

@Then("Create item")
public void createItem(DataTable dataTable) {
    List<Map<String, String>> inputData = dataTable.asMaps();
}
You can use a for-each loop over the row maps, like below:
List<Map<String, String>> inputData = dataTable.asMaps();
for (Map<String, String> columns : inputData) {   // one map per data row
    columns.get("Item ID");
    columns.get("Attribute Code");
}
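To actually run the step once per row, put the loop inside the step definition. A sketch, where createSingleItem is a hypothetical helper standing in for your item-creation logic:
@Then("Create item")
public void createItem(DataTable dataTable) {
    List<Map<String, String>> rows = dataTable.asMaps();
    for (Map<String, String> row : rows) {        // executes once per row: 0P00A, then 0P00B
        // createSingleItem is hypothetical; replace with your actual logic
        createSingleItem(row.get("Item ID"),
                         row.get("Attribute Code"),
                         row.get("New Value"),
                         row.get("Old Value"));
    }
}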

Java: increasing speed of parsing large file

I have a CSV file, let's call it product.csv:
Product_group | Product producer | Product_name | CODE | RANDOM_F_1 | ... | RANDOM_F_25
--------------|------------------|--------------|------|------------|-----|------------
Electronic    | Samsung          | MacBook_1    | 60   | 0.8        | ... | 1.2
Electronic    | Samsung          | MacBook_2    |      | 0.8        | ... | 1.2
...           | ...              | ...          |      | ...        | ... | ...
Electronic    | Samsung          | MacBook_9999 | 63   | 1.2        | ... | 3.1
Electronic    | Samsung          | MacBook_9999 | 64   | 1.2        | ... | 3.1
Let me explain this CSV file: the pair Product_name + CODE is unique (when CODE is present), and the RANDOM_F_* columns hold random values.
So, my goal:
I have a Java class which generates this CSV file. When it generates a new file, it wipes product.csv and writes a new one with different random attributes.
Now my goal is to not overwrite these random fields on each new CSV generation.
So I have one idea: create a copy of the CSV file before cleaning it, and if MacBook_9999 is present in the copy, just reuse that row when generating the new file.
My code now looks like this:
public void createProducts(List<Product> products) {
    // copying file
    for (Product newProduct : products) {
        Product previousProduct = findPreviousProduct(newProduct);
        if (previousProduct != null) {
            newProduct = previousProduct;
        }
        addToCsv(newProduct);
    }
}
private void copyFile() {
    // here I am copying the file via FileInputStream and FileOutputStream
}

private Product findPreviousProduct(Product product) {
    File copyFile = new File(...);
    // creation of BufferedReader br here, in try-with-resources
    previousProduct = br.lines().parallel()
            .map(Product::new)
            .filter(e -> e.getName().equals(product.getName()) /* plus the comparison by code */)
            .findFirst().orElse(...);
    // return statement here
}
Everything works fine, but I faced one performance problem after adding this check; see the test results below (file with 12k rows):
BEFORE: 3 seconds
AFTER: 2 minutes 20 seconds
So the question is: how can I speed this up? Should I use some other way to preserve my RANDOM fields?
Because the performance is really poor: if I have 100k rows it will take 22 minutes :(
The idea of saving the data in a hash map (Blaž Mrak's comment) and getting rows by key is brilliant, but if I have 500-700k objects my heap memory will run out.
Thank you, developers
I don't think you have O(n) complexity, but rather O(n^2), which means that for 100k lines your code will run for 220 minutes, not 22. What makes it worse is that you are re-reading the file on every call to findPreviousProduct. I would suggest first loading the CSV into memory and then searching it:
// somewhere else... MyCsvReader or similar
public List<Product> readPreviousProducts() {
    File copyFile = new File(...);
    ...
    return br.lines().parallel()
            .map(Product::new)
            .toList();
}
// then in your class
public void createProducts(List<Product> products, List<Product> previousProducts) {
    for (Product newProduct : products) {
        Product previousProduct = findPreviousProduct(previousProducts, newProduct);
        if (previousProduct != null) {
            newProduct = previousProduct;
        }
        addToCsv(newProduct);
    }
}

private Product findPreviousProduct(List<Product> previousProducts, Product product) {
    return previousProducts.stream()
            .filter(e -> e.getName().equals(product.getName()) /* plus the comparison by code */)
            .findFirst().orElse(...);
}
Give this a try first to see if there are some performance improvements. The second optimization is to use a HashMap instead of a List. Create a key() method on Product which returns a unique string generated from name and code (basically just name + "_" + code).
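A minimal sketch of that key() method, assuming Product has name and code fields:
// Sketch: unique lookup key; assumes name and code fields exist on Product.
public String key() {
    return name + "_" + code;   // unique because name+code pairs are unique
}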
// somewhere else
public Map<String, Product> readPreviousProducts() {
    File copyFile = new File(...);
    ...
    return br.lines().parallel()
            .map(Product::new)
            .collect(Collectors.toMap(Product::key, product -> product));
}

public void createProducts(List<Product> products, Map<String, Product> previousProducts) {
    for (Product newProduct : products) {
        Product previousProduct = findPreviousProduct(previousProducts, newProduct);
        if (previousProduct != null) {
            newProduct = previousProduct;
        }
        addToCsv(newProduct);
    }
}

private Product findPreviousProduct(Map<String, Product> previousProducts, Product product) {
    return previousProducts.get(product.key());   // O(1) lookup instead of a linear scan
}
You can then compare how much faster each solution is :)

How to verify all the elements in the same page using Data Table

I would like to know how to find all the elements on the same page and assert that they are displayed, using a data table in Selenium / Java / Cucumber.
For example, I have a scenario like this:
Scenario: Verify all the elements in the xyz page
  Given I am in the abc page
  When I navigate to xyz page
  Then I can see the following fields in the xyz page
    | field 1 |
    | field 2 |
    | field 3 |
    | field 4 |
First step: construct the data table. (Tip: using a header row we can implement the data table in a much cleaner and more precise way; assume the data table looks like the one below.)
Then I can see the following fields in the xyz page
  | Field Name | Locator |
  | field 1    | Xpath1  |
  | field 2    | Xpath2  |
  | field 3    | Xpath3  |
  | field 4    | Xpath4  |
Second step: implement the step definition.
@Then
public void I_can_see_the_following_fields_in_the_xyz_page(DataTable table) throws Throwable {
    WebElement element;
    List<Map<String, String>> data = table.asMaps(String.class, String.class);
    for (Map<String, String> row : data) {
        element = driver.findElement(By.xpath(row.get("Locator")));
        Assert.assertTrue("Element: " + row.get("Field Name") + " not found", isElementPresent(element));
    }
}
Utility method to check if an element is present:
protected synchronized boolean isElementPresent(WebElement element) {
    boolean elementPresenceCheck = false;
    Wait<WebDriver> wait = null;
    try {
        wait = new FluentWait<WebDriver>((WebDriver) driver)
                .withTimeout(10, TimeUnit.SECONDS)
                .pollingEvery(1, TimeUnit.SECONDS);
        elementPresenceCheck = wait.until(ExpectedConditions.visibilityOf(element)).isDisplayed();
        return elementPresenceCheck;
    } catch (Exception e) {
        return elementPresenceCheck;
    }
}
What if you place all the values in an array:
{ field 1, field 2, field 3, field 4 }
and then, as a next step, check each value's visibility on the page?
I believe that should resolve your issue.
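A sketch of that suggestion, assuming the fields can be located by their visible text (the XPath below is an assumption, not from the original answer):
// Sketch: check visibility of each field name from an array.
String[] fields = { "field 1", "field 2", "field 3", "field 4" };
for (String field : fields) {
    // Locating by visible text is an assumption; adapt the locator to your page.
    WebElement element = driver.findElement(
            By.xpath("//*[normalize-space(text())='" + field + "']"));
    Assert.assertTrue("Element: " + field + " is not displayed", element.isDisplayed());
}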

Trying to Extract Info from a Website but some values appear as null?

I've been working and testing with JSoup for a while, and this problem has bugged me for a while.
http://fx.sauder.ubc.ca/today.html
Currently this class is supposed to extract info from a table on this website. The only things the program can pull from the table are the header row
Code | Currency | fcu/CAD | fcu/USD | Code | Currency | fcu/CAD | fcu/USD
and all of the 3-letter codes shown on the website; for all of the other information, like the values and dollar names, the program shows null. For reference, px goes up to 34 and ay goes up to 16, as that is the size of the table I am extracting from.
public String CountryHandler2(int px, int ay) throws IOException {
    String url = "http://fx.sauder.ubc.ca/today.html";
    Document doc = Jsoup.connect(url).get();
    Elements paragraphs = doc.select("body > table:nth-child(4) > tbody:nth-child(1) > tr:nth-child(" + px + ") > td:nth-child(" + ay + ") > font:nth-child(1) > b:nth-child(1)");
    System.out.println("Paragraphs " + paragraphs.text());
    if (paragraphs.hasText()) {
        return paragraphs.text();
    }
    return null;
}
I tried this and was able to extract the table data. Maybe you can try it this way:
public static void main(String[] args) throws IOException {
    Document doc = Jsoup.connect("http://fx.sauder.ubc.ca/today.html").get();
    Element table = doc.select("table").get(2); // get the 3rd table
    Elements trs = table.select("tr");          // get each row of that table
    for (Element e : trs) {
        System.out.println(e.text());
    }
}
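If you need individual values rather than whole-row text, a hedged extension of the same approach is to select the td cells inside each row, so empty cells come back as empty strings instead of null:
// Sketch: iterate cells instead of rows.
Document doc = Jsoup.connect("http://fx.sauder.ubc.ca/today.html").get();
Element table = doc.select("table").get(2);      // 3rd table on the page
for (Element row : table.select("tr")) {
    for (Element cell : row.select("td")) {
        System.out.print(cell.text() + " | ");   // cell text; may be empty
    }
    System.out.println();
}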

JPA double relation with the same Entity

I have these Entities:
@Entity
public class Content extends AbstractEntity
{
    @NotNull
    @OneToOne(optional = false)
    @JoinColumn(name = "CURRENT_CONTENT_REVISION_ID")
    private ContentRevision current;

    @OneToMany(mappedBy = "content", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<ContentRevision> revisionList = new ArrayList<>();
}

@Entity
public class ContentRevision extends AbstractEntity
{
    @NotNull
    @ManyToOne(optional = false)
    @JoinColumn(name = "CONTENT_ID")
    private Content content;

    @Column(name = "TEXT_DATA")
    private String textData;

    @Temporal(TIMESTAMP)
    @Column(name = "REG_DATE")
    private Date registrationDate;
}
and this is the db mapping:
CONTENT
+-----------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------------+--------------+------+-----+---------+----------------+
| ID | bigint(20) | NO | PRI | NULL | auto_increment |
| CURRENT_CONTENT_REVISION_ID | bigint(20) | NO | MUL | NULL | |
+-----------------------------+--------------+------+-----+---------+----------------+
CONTENT_REVISION
+-----------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------------+--------------+------+-----+---------+----------------+
| ID | bigint(20) | NO | PRI | NULL | auto_increment |
| REG_DATE | datetime | YES | | NULL | |
| TEXT_DATA | longtext | YES | | NULL | |
| CONTENT_ID | bigint(20) | NO | MUL | NULL | |
+-----------------------------+--------------+------+-----+---------+----------------+
I also have these requirements:
Content.current is always a member of Content.revisionList (think of Content.current as a "pointer").
Users can add a new ContentRevision to an existing Content.
Users can add a new Content with an initial ContentRevision (cascade persist).
Users can change Content.current (move the "pointer").
Users can modify Content.current.textData, but save Content (cascade merge).
Users can delete a ContentRevision.
Users can delete a Content (cascade remove to ContentRevision).
Now, my questions are:
Is this the best approach? Any best practice?
Is it safe to cascade merge when the same entity is referenced twice? (Content.current is also Content.revisionList[i])
Are Content.current and Content.revisionList[i] the same instance? (Content.current == Content.revisionList[i] ?)
Thanks
@jabu.10245 I'm very grateful for your effort. Thank you, really.
However, there's a problematic (missing) case in your tests: running it inside a container using CMT:
@RunWith(Arquillian.class)
public class ArquillianTest
{
    @PersistenceContext
    private EntityManager em;

    @Resource
    private UserTransaction utx;

    @Deployment
    public static WebArchive createDeployment()
    {
        // Create deploy file
        WebArchive war = ShrinkWrap.create(WebArchive.class, "test.war");
        war.addPackages(...);
        war.addAsResource("persistence-arquillian.xml", "META-INF/persistence.xml");
        war.addAsManifestResource(EmptyAsset.INSTANCE, "beans.xml");
        // Show the deploy structure
        System.out.println(war.toString(true));
        return war;
    }

    @Test
    public void testDetached() throws Exception
    {
        // find a document
        Document doc = em.find(Document.class, 1L);
        System.out.println("doc: " + doc); // Document#1342067286

        // get first content
        Content content = doc.getContentList().stream().findFirst().get();
        System.out.println("content: " + content); // Content#511063871

        // get current revision
        ContentRevision currentRevision = content.getCurrentRevision();
        System.out.println("currentRevision: " + currentRevision); // ContentRevision#1777954561

        // get last revision
        ContentRevision lastRevision = content.getRevisionList().stream().reduce((prev, curr) -> curr).get();
        System.out.println("lastRevision: " + lastRevision); // ContentRevision#430639650

        // test equality
        boolean equals = Objects.equals(currentRevision, lastRevision);
        System.out.println("1. equals? " + equals); // true

        // test identity
        boolean same = currentRevision == lastRevision;
        System.out.println("1. same? " + same); // false!!!!!!!!!!

        // since they are not the same, the rest makes little sense...

        // make it dirty
        currentRevision.setTextData("CHANGED " + System.currentTimeMillis());

        // perform merge in CMT transaction
        utx.begin();
        doc = em.merge(doc);
        utx.commit(); // --> ERROR!!!

        // get first content
        content = doc.getContentList().stream().findFirst().get();

        // get current revision
        currentRevision = content.getCurrentRevision();
        System.out.println("currentRevision: " + currentRevision);

        // get last revision
        lastRevision = content.getRevisionList().stream().reduce((prev, curr) -> curr).get();
        System.out.println("lastRevision: " + lastRevision);

        // test equality
        equals = Objects.equals(currentRevision, lastRevision);
        System.out.println("2. equals? " + equals);

        // test identity
        same = currentRevision == lastRevision;
        System.out.println("2. same? " + same);
    }
}
Since they are not the same:
if I enable cascading on both properties, an exception is thrown:
java.lang.IllegalStateException:
Multiple representations of the same entity [it.shape.edea2.jpa.ContentRevision#1] are being merged.
Detached: [ContentRevision#430639650];
Detached: [ContentRevision#1777954561]
if I disable cascading on current, the change gets lost.
The strange thing is that running this test outside the container results in successful execution.
Maybe it's lazy loading (hibernate.enable_lazy_load_no_trans=true), maybe something else, but it's definitely NOT SAFE.
I wonder if there's a way to get the same instance.
Is it safe to cascade merge when the same entity is referenced twice?
Yes. If you manage an instance of Content, then its Content.revisionList and Content.current are managed as well. Changes to any of those will be persisted when the entity manager flushes. You don't have to call EntityManager.merge(...) manually, unless you're dealing with transient objects that need to be merged.
If you create a new ContentRevision, then call persist(...) instead of merge(...) with that new instance, make sure it has a managed reference to the parent Content, and also add it to the content's list.
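A minimal sketch of that wiring, assuming plain setters on the entities shown above:
// Sketch: add a new revision to a managed Content; the setters are
// assumed from the mappings above.
ContentRevision revision = new ContentRevision();
revision.setTextData("new text");
revision.setContent(content);              // reference to the managed parent
content.getRevisionList().add(revision);   // keep both sides of the relation in sync
entityManager.persist(revision);           // persist the new (transient) instance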
Are Content.current and Content.revisionList[i] the same instance?
Yes, should be. Test it to be sure.
Content.current is always a member of Content.revisionList (think about Content.current as a "pointer").
You could enforce that in SQL with a check constraint, or in Java, although then you'd have to be sure the revisionList is fetched. By default it is fetched lazily, meaning Hibernate will run another query for that list when you access the getRevisionList() method. And for that you need a running transaction, otherwise you'll get a LazyInitializationException.
You could instead load the list eagerly, if that's what you want. Or you could define an entity graph to support both strategies in different queries.
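A sketch of the entity-graph option (JPA 2.1); the graph name "Content.withRevisions" is a made-up example:
// Sketch: named entity graph for per-query eager fetching.
@Entity
@NamedEntityGraph(name = "Content.withRevisions",
                  attributeNodes = @NamedAttributeNode("revisionList"))
public class Content extends AbstractEntity { /* ... */ }

// Usage: eager fetch for this one query, lazy everywhere else.
EntityGraph<?> graph = em.getEntityGraph("Content.withRevisions");
Content c = em.find(Content.class, 1L,
        Collections.singletonMap("javax.persistence.fetchgraph", graph));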
Users can modify Content.current.textData, but saves Content (cascade merge)
See my first paragraph above, Hibernate should save changes on any managed entity automatically.
Users can delete ContentRevision
if (content.getRevisionList().remove(revision))
    entityManager.remove(revision);
if (revision.equals(content.getCurrentRevision()))
    content.setCurrentRevision(/* to something else */);
Users can delete Content (cascade remove to ContentRevision)
Here I'd prefer to ensure that in the database schema, for instance
FOREIGN KEY (content_id) REFERENCES content (id) ON DELETE CASCADE;
UPDATE
As requested, I wrote a test. See this gist for the implementations of Content and ContentRevision I used.
I had to make one important change though: Content.current cannot really be @NotNull, especially not the DB column, because if it were, we couldn't persist a content and a revision at the same time, since both have no ID yet. Hence the field must be allowed to be NULL initially.
As a workaround I added the following method to Content:
@Transient // ignored in JPA
@AssertTrue // javax.validation
public boolean isCurrentRevisionInList() {
    return current != null && getRevisionList().contains(current);
}
Here the validator ensures that there is always a non-null current revision and that it is contained in the revision list.
Now here are my tests.
This one proves that the references are the same (Question 3) and that it is enough to persist the content when current and revisionList[0] reference the same instance (Question 2):
@Test @InSequence(0)
public void shouldCreateContentAndRevision() throws Exception {
    // create java objects, unmanaged:
    Content content = Content.create("My first test");
    assertNotNull("content should have current revision", content.getCurrent());
    assertSame("content should be same as revision's parent", content, content.getCurrent().getContent());
    assertEquals("content should have 1 revision", 1, content.getRevisionList().size());
    assertSame("the list should contain same reference", content.getCurrent(), content.getRevisionList().get(0));

    // persist the content, along with the revision:
    transaction.begin();
    entityManager.joinTransaction();
    entityManager.persist(content);
    transaction.commit();

    // verify:
    assertEquals("content should have ID 1", Long.valueOf(1), content.getId());
    assertEquals("content should have one revision", 1, content.getRevisionList().size());
    assertNotNull("content should have current revision", content.getCurrent());
    assertEquals("revision should have ID 1", Long.valueOf(1), content.getCurrent().getId());
    assertSame("current revision should be same reference", content.getCurrent(), content.getRevisionList().get(0));
}
The next ensures that it's still true after loading the entity:
@Test @InSequence(1)
public void shouldLoadContentAndRevision() throws Exception {
    Content content = entityManager.find(Content.class, Long.valueOf(1));
    assertNotNull("should have found content #1", content);

    // same checks as before:
    assertNotNull("content should have current revision", content.getCurrent());
    assertSame("content should be same as revision's parent", content, content.getCurrent().getContent());
    assertEquals("content should have 1 revision", 1, content.getRevisionList().size());
    assertSame("the list should contain same reference", content.getCurrent(), content.getRevisionList().get(0));
}
And even when updating it:
@Test @InSequence(2)
public void shouldAddAnotherRevision() throws Exception {
    transaction.begin();
    entityManager.joinTransaction();
    Content content = entityManager.find(Content.class, Long.valueOf(1));
    ContentRevision revision = content.addRevision("My second revision");
    entityManager.persist(revision);
    content.setCurrent(revision);
    transaction.commit();

    // re-load and validate:
    content = entityManager.find(Content.class, Long.valueOf(1));

    // same checks as before:
    assertNotNull("content should have current revision", content.getCurrent());
    assertSame("content should be same as revision's parent", content, content.getCurrent().getContent());
    assertEquals("content should have 2 revisions", 2, content.getRevisionList().size());
    assertSame("the list should contain same reference", content.getCurrent(), content.getRevisionList().get(1));
}
SELECT * FROM content;
id | version | current_content_revision_id
----+---------+-----------------------------
1 | 2 | 2
UPDATE 2
It was hard to reproduce that situation on my machine, but I got it to work. Here is what I've done so far:
I changed all @OneToMany relations to use lazy fetching (the default) and reran the following test case:
@Test @InSequence(3)
public void shouldChangeCurrentRevision() throws Exception {
    transaction.begin();
    entityManager.joinTransaction();
    Document document = entityManager.find(Document.class, Long.valueOf(1));
    assertNotNull(document);
    assertEquals(1, document.getContentList().size());
    Content content = document.getContentList().get(0);
    assertNotNull(content);
    ContentRevision revision = content.getCurrent();
    assertNotNull(revision);
    assertEquals(2, content.getRevisionList().size());
    assertSame(revision, content.getRevisionList().get(1));
    revision.setTextData("CHANGED");
    document = entityManager.merge(document);
    content = document.getContentList().get(0);
    revision = content.getCurrent();
    assertSame(revision, content.getRevisionList().get(1));
    assertEquals("CHANGED", revision.getTextData());
    transaction.commit();
}
The test passed with lazy fetching. Note that lazy fetching requires the code to run within a transaction.
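A hedged illustration of that note, using the same em and utx as the test above (exact behavior depends on the persistence-context configuration):
// Sketch: touching a lazy collection after the transaction has ended
// typically fails once the persistence context is closed.
utx.begin();
Content c = em.find(Content.class, 1L);
utx.commit();
// c is now effectively detached and the list was never initialized:
c.getRevisionList().size(); // may throw LazyInitializationException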
For some reason the content revision instance you're editing is not the same as the one in the one-to-many list. To reproduce that, I modified my test as follows:
@Test @InSequence(4)
public void shouldChangeCurrentRevision2() throws Exception {
    transaction.begin();
    Document document = entityManager.find(Document.class, Long.valueOf(1));
    assertNotNull(document);
    assertEquals(1, document.getContentList().size());
    Content content = document.getContentList().get(0);
    assertNotNull(content);
    ContentRevision revision = content.getCurrent();
    assertNotNull(revision);
    assertEquals(2, content.getRevisionList().size());
    assertSame(revision, content.getRevisionList().get(1));
    transaction.commit();

    // load another instance, different from the one in the list:
    revision = entityManager.find(ContentRevision.class, revision.getId());
    revision.setTextData("CHANGED2");

    // start another TX, replace the "current revision" but not the one
    // in the list:
    transaction.begin();
    document.getContentList().get(0).setCurrent(revision);
    document = entityManager.merge(document); // here's your error!!!
    transaction.commit();

    content = document.getContentList().get(0);
    revision = content.getCurrent();
    assertSame(revision, content.getRevisionList().get(1));
    assertEquals("CHANGED2", revision.getTextData());
}
And there I got exactly your error. Then I modified the cascading setting on the @OneToMany mapping:
@OneToMany(mappedBy = "content", cascade = { PERSIST, REFRESH, REMOVE }, orphanRemoval = true)
private List<ContentRevision> revisionList;
And the error disappeared :-) ... because I removed CascadeType.MERGE.
