How to join two tables using Guava

Fact table :
Id Year Month countryId Sales
1 1999 1 1 3000
2 1999 2 1 2300
3 2000 3 2 3999
4 2000 4 3 2939
Dimension table:
Id country province
1 US LA
2 US CA
3 US GA
4 EN LN
and I use a Guava Table like this:
Table<Integer, String, Object> table = Tables.newCustomTable(
    Maps.<Integer, Map<String, Object>> newLinkedHashMap(),
    new Supplier<Map<String, Object>>() {
        public Map<String, Object> get() {
            return Maps.newLinkedHashMap();
        }
    });
table.put(1, "Year", 1999);
table.put(1, "Month", 1);
table.put(1, "countryId", 1);
table.put(1, "Sales", 3000);
// ...... etc
// table1 is a second Table, built the same way, holding the dimension rows
table1.put(1, "country", "US");
table1.put(1, "province", "LA");
// ......
I want to implement a LEFT JOIN like:
1 1999 1 1 3000 US LA
2 1999 2 1 2300 US LA
3 2000 3 2 3999 US CA
4 2000 4 3 2939 US GA
What should I do?

Guava's Table is a collection, not a SQL table, and it isn't meant to be used like one. SQL tables are designed to be indexable, sortable, filterable, and so on. Guava's Table offers only a fraction of those features, and only indirectly; joins aren't among them (unless you play with transformations).
What you need to do is keep your two tables, loop through the rows of table, and look up the corresponding row in table1.
In your case, I believe you're better off with a List replacing table and a Guava Table for table1. Loop through the list and build your final objects as you go.
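The loop-and-look-up approach described above could be sketched roughly as follows. To stay free of dependencies, this sketch works on plain maps of the shape rowKey -> (column -> value), which is the shape Guava's Table.rowMap() view has. The class name LeftJoinSketch and the helpers leftJoin and row are made up for illustration, and the sketch assumes the two tables have no column names in common.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LeftJoinSketch {

    // Left join: every fact row is kept; dimension columns are appended when
    // the foreign-key value matches a dimension row key, otherwise left out.
    static Map<Integer, Map<String, Object>> leftJoin(
            Map<Integer, Map<String, Object>> fact,
            Map<Integer, Map<String, Object>> dimension,
            String foreignKeyColumn) {
        Map<Integer, Map<String, Object>> joined = new LinkedHashMap<>();
        for (Map.Entry<Integer, Map<String, Object>> factRow : fact.entrySet()) {
            Map<String, Object> out = new LinkedHashMap<>(factRow.getValue());
            Map<String, Object> match = dimension.get(factRow.getValue().get(foreignKeyColumn));
            if (match != null) {
                out.putAll(match);
            }
            joined.put(factRow.getKey(), out);
        }
        return joined;
    }

    // Hypothetical helper: builds one row from alternating column names and values.
    static Map<String, Object> row(Object... keyValues) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            m.put((String) keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, Object>> fact = new LinkedHashMap<>();
        fact.put(1, row("Year", 1999, "Month", 1, "countryId", 1, "Sales", 3000));
        fact.put(3, row("Year", 2000, "Month", 3, "countryId", 2, "Sales", 3999));

        Map<Integer, Map<String, Object>> dimension = new LinkedHashMap<>();
        dimension.put(1, row("country", "US", "province", "LA"));
        dimension.put(2, row("country", "US", "province", "CA"));

        leftJoin(fact, dimension, "countryId")
                .forEach((id, columns) -> System.out.println(id + " " + columns));
    }
}
```

With real Guava tables, the arguments would presumably be table.rowMap() and table1.rowMap(); the result could be copied back into a new Table if one is needed.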

Related

Spark dataset Combine multiple rows

This is my dataset:
Name Group Age
A [123] 10
B [123,456] 20
C [456,789] 30
D [900] 40
E [800,900] 50
F [1000] 60
Now I want to merge the rows whose Group values overlap, so that the result looks like:
Name Group Age
A,B,C [123,456,789] 10,20,30
D,E [900,800] 40,50
F [1000] 60
I tried array_contains, but that is not giving me what I want. I tried a self join too. Can anyone help with a Java solution?
Edit:
I found ReduceFunction, which can do something similar.
dataset.reduce(new ReduceFunction<Grouped>() {
    private static final long serialVersionUID = 8289076985320745158L;
    @Override
    public Grouped call(final Grouped v1, final Grouped v2) {
        // Merge v2 into v1 only when the two rows share at least one group id
        if (!Collections.disjoint(v1.getGroup(), v2.getGroup())) {
            v1.getAge().addAll(v2.getAge());
            v1.getGroup().addAll(v2.getGroup());
            v1.getName().addAll(v2.getName());
        }
        return v1;
    }
});
But how do I do this for all rows?
This only gives me the first two rows reduced to:
Name Group Age
A,B [123,456] 10,20
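One way to look at the problem is as finding connected components: two rows belong together if they share a group id, directly or through a chain of other rows. The following is a minimal sketch of that idea using a union-find structure on local Java collections. It is not Spark code; it assumes the rows have been brought into memory as three parallel lists, and the names GroupMergeSketch, Merged and merge are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GroupMergeSketch {

    // One output row: everything collected from the input rows of one component.
    static final class Merged {
        final List<String> names = new ArrayList<>();
        final Set<Integer> groups = new LinkedHashSet<>();
        final List<Integer> ages = new ArrayList<>();
    }

    // Union-find lookup with path halving.
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    static List<Merged> merge(List<String> names, List<List<Integer>> groups, List<Integer> ages) {
        int n = names.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        // Remember the first row seen for each group id; later rows with the
        // same id are united with it.
        Map<Integer, Integer> firstRowWithGroup = new HashMap<>();
        for (int i = 0; i < n; i++) {
            for (int g : groups.get(i)) {
                Integer other = firstRowWithGroup.putIfAbsent(g, i);
                if (other != null) {
                    int a = find(parent, other);
                    int b = find(parent, i);
                    // The smaller index stays root, so output follows input order.
                    if (a < b) {
                        parent[b] = a;
                    } else {
                        parent[a] = b;
                    }
                }
            }
        }
        Map<Integer, Merged> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            Merged m = byRoot.computeIfAbsent(find(parent, i), k -> new Merged());
            m.names.add(names.get(i));
            m.groups.addAll(groups.get(i));
            m.ages.add(ages.get(i));
        }
        return new ArrayList<>(byRoot.values());
    }
}
```

A pairwise reduce cannot express this on its own, because whether two rows belong together may depend on a third row that links them. For data that does not fit on one machine, a distributed connected-components computation would be needed instead of this in-memory version.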

How to solve this logically

I have a set, and a subset for each id, and I need to accumulate the total.
Example: employeeIdSet is the outer set, which holds all the employee ids.
Each employee may or may not be combined with other employees, and credits are added per employee or per combined group:
empa - credit 10
empb linked with empc, empd - credit would be 15 overall for the 3 employees.
Similarly,
empe linked with empz, emps - credit would be 7 overall for the 3 employees, and also linked with empq, where the credit is 9
Similarly,
empr linked with empo - credit would be 6 overall for the 2 employees
Now I want a list of employee ids with their respective credits, for example:
empa - 10,
empb - 15,
empc - 15,
empd - 15,
empe - 7+9,
empz - 7+9,
emps - 7+9,
empr - 6,
empo - 6
The problem: we get the employee id in the outer loop, and in the inner loop we can get the linked employees. However, adding everything up gives the wrong result.
Code:
final Set<Long> combinedEmployeeIdSet = new HashSet<>();
final Set<CombinedEmployee> combinedEmployees = employee.getCombinedEmployees();
for (final CombinedEmployee combinedEmployee : combinedEmployees) {
    combinedEmployeeIdSet.add(combinedEmployee.getId());
}
for (final OtherEmployee otherEmployee : otherEmployees) {
    if (!combinedEmployeeIdSet.contains(otherEmployee.getId())) {
        employeeCredit += otherEmployee.getCredit();
    }
}
The expectation is to get the total credits for the given employeeId: employees in the same group should be counted once, as a single unit; otherwise each credit should be added.
empe - 7+9, displays 15
empz - 7+9, displays 15
emps- 7+9, displays 15
thanks

I'm quite confused by your description.
Do you mean you have some employees, say emp-a, emp-b ... emp-x, and each one has a credit, say emp-a: 10, emp-b: 5, emp-c: 7 ... emp-x: 6? Some employees are linked to others, say emp-a (emp-b, emp-c). Now you want the credit for each employee; if an employee has links, its credit should be the sum of its own credit and the credits of all its links.
So you may get
emp-a 10+5+7
emp-b 5
emp-c 7
...
emp-x 6
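If that reading of the question is right, the totals could be computed along these lines. This is only a sketch of the interpretation above (own credit plus the credits of directly linked employees); the class LinkedCreditSketch, the method totals and the map-based data layout are assumptions made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinkedCreditSketch {

    // Total for one employee = own credit + credits of its directly linked employees.
    // Employees without an entry in links simply keep their own credit.
    static Map<String, Integer> totals(Map<String, Integer> credit, Map<String, List<String>> links) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : credit.entrySet()) {
            int sum = e.getValue();
            for (String linked : links.getOrDefault(e.getKey(), Collections.<String>emptyList())) {
                sum += credit.getOrDefault(linked, 0);
            }
            result.put(e.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> credit = new LinkedHashMap<>();
        credit.put("emp-a", 10);
        credit.put("emp-b", 5);
        credit.put("emp-c", 7);
        Map<String, List<String>> links = new LinkedHashMap<>();
        links.put("emp-a", java.util.Arrays.asList("emp-b", "emp-c"));
        System.out.println(totals(credit, links));
    }
}
```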

Managing java list object and iterating them

I have a list of Java objects like the one below.
public class TaxIdentifier {
    public String id;
    public String gender;
    public String childId;
    public String grade;
    public String isProcessed;
    // ... getters and setters ...
}
Records in the DB look like this:
id gender childId grade isProcessed
11 M 111 3 Y
12 M 121 4 Y
11 M 131 2 Y
13 M 141 5 Y
14 M 151 1 Y
15 M 161 6 Y
List<TaxIdentifier> taxIdentifierList = new ArrayList<TaxIdentifier>();
for (TaxIdentifier taxIdentifier : taxIdentifierList) {
}
While the for loop is running and I get id = 11, I have to check whether there are other records with id = 11, process them together, and do a DB operation. Then I take the next record, 12 in this case, check whether there are other records with id = 12, and so on.
One option is to take the id and query the DB for all records with id = 11, and so on.
But this is too much back and forth with the DB.
What is the best way to do this in Java? Please advise.

If you need to process all the records in the corresponding database table anyway, you should retrieve all of them in one database round trip.
After that, you can collect all your TaxIdentifier records in a dictionary data structure and process them in whatever way you want.
A brief example may look like this:
Map<String, List<TaxIdentifier>> result = repository.findAll().stream().collect(Collectors.groupingBy(TaxIdentifier::getId));
Here all the TaxIdentifier records are grouped by TaxIdentifier's id. All the records with id equal to "11" can be retrieved and processed this way:
List<TaxIdentifier> taxIdentifiersWithId11 = result.get("11");
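A self-contained version of the same grouping, runnable without a database, might look like the sketch below. The nested TaxIdentifier class is a cut-down stand-in for the one in the question, and the list passed to groupById takes the place of whatever repository.findAll() would return; both are assumptions for illustration. A LinkedHashMap is used so the groups keep the order in which their ids first appear.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByIdSketch {

    // Minimal stand-in for the TaxIdentifier class from the question.
    static final class TaxIdentifier {
        private final String id;
        private final String childId;

        TaxIdentifier(String id, String childId) {
            this.id = id;
            this.childId = childId;
        }

        String getId() { return id; }
        String getChildId() { return childId; }
    }

    // Groups all records by id in a single pass over the list.
    static Map<String, List<TaxIdentifier>> groupById(List<TaxIdentifier> all) {
        return all.stream().collect(
                Collectors.groupingBy(TaxIdentifier::getId, LinkedHashMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<TaxIdentifier> all = Arrays.asList(
                new TaxIdentifier("11", "111"),
                new TaxIdentifier("12", "121"),
                new TaxIdentifier("11", "131"));
        // Each entry holds every record for one id, ready to be processed together.
        groupById(all).forEach((id, records) ->
                System.out.println(id + " -> " + records.size() + " record(s)"));
    }
}
```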

I would leverage the power of your database. When you query, you should order the results by id in ascending order. I'm not sure which database you are using, but I would do:
SELECT * FROM MY_DATABASE WHERE IS_PROCESSED = 'N' ORDER BY ID ASC;
That moves the load of sorting off your application and onto your database. The query then returns the unprocessed records with the lowest ids on top, and records with the same id come out next to each other. Then just work through them sequentially, in order.
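Working through an id-ordered result "sequentially" could mean cutting it into runs of equal ids and handling one run at a time. Here is a small sketch of that step; the class SortedBatchSketch and the method consecutiveBatches are invented names, and the rows are represented as plain strings only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

class SortedBatchSketch {

    // Splits a list that is already sorted by key into runs of consecutive
    // elements sharing the same key. Unsorted input would yield split groups.
    static <T, K> List<List<T>> consecutiveBatches(List<T> sortedRows, Function<T, K> key) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = null;
        K currentKey = null;
        for (T row : sortedRows) {
            K k = key.apply(row);
            if (current == null || !Objects.equals(k, currentKey)) {
                current = new ArrayList<>();
                batches.add(current);
                currentKey = k;
            }
            current.add(row);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical rows in "id:childId" form, ordered by id as the query would return them.
        List<String> rows = Arrays.asList("11:111", "11:131", "12:121", "13:141");
        for (List<String> batch : consecutiveBatches(rows, r -> r.substring(0, 2))) {
            System.out.println("process together: " + batch);
        }
    }
}
```

Compared with grouping into a map, this keeps only one batch in use at a time, which may matter if the result set is large and is read row by row.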

Selenium : Converting List<WebElement> to Set<String>

I want to get the unique elements from the page. I am calculating the total number of records. Each record belongs to a user, so there are multiple records that belong to the same user. I want the total number of unique users.
List<WebElement> efirstpagecount = driver.findElements(By.xpath("//*[@id='usersList']/tbody/tr/td[3]"));
Set<WebElement> uniquecount = new HashSet<WebElement>(efirstpagecount);
System.out.println("Unique count: " + uniquecount.size());
for (WebElement u : uniquecount ) {
System.out.println(u.getText());
}
Output:
Unique count: 20
robin
Rocky
prom
jack
stone
Veronica
Veronica
Shawn
Rocky
carl
Rocky
James
Rocky
sam
bon
sam
bone
don
Shawn
don
The above code gives me the count including the duplicate values. Please advise how to get the unique values. Thanks in advance!

Assuming the td contains just the username, you can try something like this in Java 8:
Set<String> uniqueUsers = efirstpagecount.stream()
.map(WebElement::getText).map(String::trim)
.distinct().collect(Collectors.toSet());
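The reason the original HashSet still held 20 entries is that it compared the elements themselves, and each element is a different table cell even when two cells show the same text. The sketch below illustrates the difference without Selenium: Cell is a hypothetical stand-in for a located element (it deliberately does not override equals or hashCode), and uniqueTexts is an invented helper that mirrors the stream above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class UniqueTextSketch {

    // Hypothetical stand-in for a located page element. Two cells with the
    // same text are still two different objects, and so two set members.
    static final class Cell {
        private final String text;

        Cell(String text) { this.text = text; }

        String getText() { return text; }
    }

    // De-duplicates by the trimmed text rather than by the element object.
    static Set<String> uniqueTexts(List<Cell> cells) {
        return cells.stream()
                .map(Cell::getText)
                .map(String::trim)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("Rocky"), new Cell("sam"), new Cell("Rocky "), new Cell("sam"));
        System.out.println("as elements: " + new HashSet<>(cells).size());
        System.out.println("as texts: " + uniqueTexts(cells));
    }
}
```

Collecting into a Set already removes duplicates, so a separate distinct() step is not required for correctness. Note that the comparison is case-sensitive and exact, so names such as "bon" and "bone" remain separate users.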

Sqlite ExecuteQuery very slow with java (netbeans)

I use SQLite to store data. I am trying to read data from an SQLite view and fill an array of objects in Java, but query execution takes a very long time.
I only have 32 objects with 22 fields, and the SQLite table has 380 rows.
But executing statements like the one below took 17 seconds for the 32 objects.
sql = "SELECT "
+ " field1,"
+ " field2,"
....
+ " field22"
+ " from Rankedview WHERE Ranking = " + Integer.toString(RankingIndex);
try (ResultSet rs = stmt.executeQuery(sql)) {
while (rs.next()) {
a[j].field1= rs.getString("field1");
..........
a[j].field22 = rs.getInt("field22");
}
}
After I updated the sqlite-jdbc driver from 3.7.2 to 3.8.5, the time dropped from 17 seconds to 9 seconds.
How can I improve its performance?
Edit:
view definition (ATP is a table)
CREATE VIEW Ranked AS
SELECT p1.ID,
p1.field2,
...
p1.field21,
(
SELECT count() + (
SELECT count() + 1
FROM Table AS p2
WHERE p2.field21 = p1.field21 AND
p2.id > p1.id
)
FROM ATP AS p2
WHERE p2.field21 > p1.field21
)
AS Ranking
FROM ATP AS p1
ORDER BY Ranking ASC;
EXPLAIN QUERY PLAN output:
selectid order from detail
0 0 0 SCAN TABLE ATP AS p1
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0 0 SCAN TABLE ATP AS p2
1 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 2
2 0 0 SEARCH TABLE ATP AS p2 USING INTEGER PRIMARY KEY (rowid>?)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 3
3 0 0 SCAN TABLE ATP AS p2
3 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 4
4 0 0 SEARCH TABLE ATP AS p2 USING INTEGER PRIMARY KEY (rowid>?)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 5
5 0 0 SCAN TABLE ATP AS p2
5 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 6
6 0 0 SEARCH TABLE ATP AS p2 USING INTEGER PRIMARY KEY (rowid>?)
0 0 0 USE TEMP B-TREE FOR ORDER BY

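To get the row with a specific rank, you should not compute the rank by hand; use the LIMIT/OFFSET clauses instead:
SELECT ...
FROM ATP
ORDER BY field21 DESC, id DESC
LIMIT 1 OFFSET x
Here x is the rank minus one. The descending order matches the view, which gives rank 1 to the row with the largest field21 (ties broken by the largest id). This still requires sorting all table rows to determine which is the x-th, but it is much more efficient than multiple nested table scans.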
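To see why a single sort can replace the view's nested counting, the two approaches can be compared on in-memory data. The sketch below is plain Java rather than SQL, and the Row class and both method names are made up for illustration. It assumes ids are unique, and it sorts in descending order of field21 and then id, because the view as written assigns rank 1 to the largest field21.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RankOffsetSketch {

    // Hypothetical in-memory copy of the two columns the ranking depends on.
    static final class Row {
        final long id;
        final int field21;

        Row(long id, int field21) {
            this.id = id;
            this.field21 = field21;
        }
    }

    // Rank the way the view computes it: count rows with a larger field21,
    // plus ties with a larger id, plus one. One full scan per ranked row.
    static int rankByCounting(List<Row> rows, Row r) {
        int rank = 1;
        for (Row other : rows) {
            if (other.field21 > r.field21 || (other.field21 == r.field21 && other.id > r.id)) {
                rank++;
            }
        }
        return rank;
    }

    // The same row found by sorting once and skipping rank - 1 rows, which is
    // what ORDER BY ... LIMIT 1 OFFSET rank - 1 asks the database to do.
    static Row rowAtRank(List<Row> rows, int rank) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.field21)
                .thenComparingLong(r -> r.id)
                .reversed());
        return sorted.get(rank - 1);
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row(1, 50), new Row(2, 70), new Row(3, 50), new Row(4, 90));
        for (Row r : rows) {
            int rank = rankByCounting(rows, r);
            System.out.println("id " + r.id + " has rank " + rank
                    + ", sorted lookup finds id " + rowAtRank(rows, rank).id);
        }
    }
}
```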
