distinct results and pagination with many-to-many relation - java

I have a many-to-many relation going from Nomination to User, mapped on a "Nominee" table. I have the following method to encapsulate results in a paging class called "ResultPage":
protected ResultPage<T> findPageByCriteria(Criteria criteria, int page,
        int pageSize) {
    DataVerify.notNull(criteria);
    DataVerify.greaterThan(page, 0, "Invalid page number");
    DataVerify.isTrue(pageSize >= 0, "Invalid page size");
    if (logger.isDebugEnabled()) {
        logger.debug("Arguments: ");
        logger.debug("Page: " + page);
        logger.debug("Page size: " + pageSize);
    }
    int totalItems = 0;
    List<T> results = null;
    if (pageSize != 0) {
        totalItems = ((Number) criteria.setProjection(Projections.rowCount())
                .uniqueResult()).intValue();
        criteria.setProjection(null);
        criteria.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
        criteria.addOrder(Order.desc("id"));
        results = criteria.setFirstResult((page - 1) * pageSize)
                .setMaxResults(pageSize)
                .list();
    } else {
        results = criteria.setFirstResult((page - 1) * pageSize)
                .list();
        totalItems = results.size();
    }
    ResultPage<T> resultsPage = new ResultPage<T>(results, page, totalItems,
            (pageSize != 0) ? pageSize : totalItems);
    if (logger.isDebugEnabled()) {
        logger.debug("Total Results: " + resultsPage.getTotalItems());
    }
    return resultsPage;
}
Fetching itself works correctly, but my result count is inconsistent. This only happens when a Nomination has more than one user assigned to it: the row count then counts the joined users instead of the root entities, so I get totals of "1 to 22" per page instead of "1 to 25" as I specified, as if there were 22 nominations but 25 users in total.
Can I get some help with this? Let me know if I need to clarify anything.
If anything, this is the question that comes closest to my problem: how to retrieve distinct root entity row count in hibernate?

The solution I use for this problem is to run a first query that loads only the IDs of the root entities satisfying the criteria (i.e. the IDs of your 25 nominations), and then issue a second query that loads the data for those 25 IDs, along the lines of:
select n from Nomination n
[... joins and fetches]
where n.id in (:ids)
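A minimal sketch of that two-step approach, reusing the Criteria API from the question. The session variable and the "users" collection name are assumptions for illustration; the countDistinct projection at the top also fixes the pager total by counting nominations rather than joined rows:

// Distinct count for the pager (counts nominations, not joined users)
int totalItems = ((Number) criteria
        .setProjection(Projections.countDistinct("id"))
        .uniqueResult()).intValue();
criteria.setProjection(null);

// Query 1: page over distinct root-entity IDs only, so joins cannot shrink the page
List<Long> ids = criteria
        .setProjection(Projections.distinct(Projections.id()))
        .addOrder(Order.desc("id"))
        .setFirstResult((page - 1) * pageSize)
        .setMaxResults(pageSize)
        .list();

// Query 2: load the full entities (with their collections) for exactly those IDs
// ("users" is an assumed name for the many-to-many collection)
List<Nomination> results = session.createQuery(
        "select distinct n from Nomination n"
        + " left join fetch n.users"
        + " where n.id in (:ids)"
        + " order by n.id desc")
        .setParameterList("ids", ids)
        .list();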


Improve performance of loading 100,000 records from database

We created a program to make using the database easier in other programs, so the code I'm showing is used in multiple other programs.
One of those programs receives about 10,000 records from one of our clients and has to check whether they are already in our database. If not, we insert them (they can also change and then have to be updated).
To make this easy we load every entry from the whole table (currently 120,000), create an object for each entry we get, and put all of them into a HashMap.
Loading the whole table this way takes around 5 minutes, and we sometimes have to restart the program because we run into a GC overhead error, since we work on limited hardware. Do you have an idea of how we can improve the performance?
Here is the code that loads all entries (we have a global limit of 10,000 entries per query, so we use a loop):
public Map<String, IMasterDataSet> getAllInformationObjects(ISession session) throws MasterDataException {
    IQueryExpression qe;
    IQueryParameter qp;
    // our main SDP class
    Constructor<?> constructorForSDPbaseClass = getStandardConstructor();
    SimpleDateFormat itaTimestampFormat = new SimpleDateFormat("yyyyMMddHHmmssSSS");
    // search in standard time range (modification date!)
    Calendar cal = Calendar.getInstance();
    cal.set(2010, Calendar.JANUARY, 1);
    Date startDate = cal.getTime();
    Date endDate = new Date();
    Long startDateL = Long.parseLong(itaTimestampFormat.format(startDate));
    Long endDateL = Long.parseLong(itaTimestampFormat.format(endDate));
    IDescriptor modDesc = IBVRIDescriptor.ModificationDate.getDescriptor(session);
    // count once before to determine initial capacities for hash map/set
    IBVRIArchiveClass SDP_ARCHIVECLASS = getMasterDataPropertyBag().getSDP_ARCHIVECLASS();
    qe = SDP_ARCHIVECLASS.getQueryExpression(session);
    qp = session.getDocumentServer().getClassFactory()
            .getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
    qp.setExpression(qe);
    qp.setHitLimitThreshold(0);
    qp.setHitLimit(0);
    int nrOfHitsTotal = session.getDocumentServer().queryCount(session, qp, "*");
    int initialCapacity = (int) (nrOfHitsTotal / 0.75 + 1);
    // MD sets; and objects already done (here: document ID)
    HashSet<String> objDone = new HashSet<>(initialCapacity);
    HashMap<String, IMasterDataSet> objRes = new HashMap<>(initialCapacity);
    qp.close();
    // do queries until hit count is smaller than 10,000
    // use modification date
    boolean keepGoing = true;
    while (keepGoing) {
        // construct query expression
        // - basic part: modification date & class type
        // a. doc. class type
        qe = SDP_ARCHIVECLASS.getQueryExpression(session);
        // b. ID
        qe = SearchUtil.appendQueryExpressionWithANDoperator(session, qe,
                new PlainExpression(modDesc.getQueryLiteral() + " BETWEEN " + startDateL + " AND " + endDateL));
        // 2. query parameter: set database; set expression
        qp = session.getDocumentServer().getClassFactory()
                .getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
        qp.setExpression(qe);
        // order by modification date; hitlimit = 0 -> no hit limit, but the usual 10,000 max
        qp.setOrderByExpression(session.getDocumentServer().getClassFactory().getOrderByExpressionInstance(modDesc, true));
        qp.setHitLimitThreshold(0);
        qp.setHitLimit(0);
        // do not sort by modification date
        qp.setHints("+NoDefaultOrderBy");
        keepGoing = false;
        IInformationObject[] hits = null;
        IDocumentHitList hitList = null;
        hitList = session.getDocumentServer().query(qp, session);
        IDocument doc;
        if (hitList.getTotalHitCount() > 0) {
            hits = hitList.getInformationObjects();
            for (IInformationObject hit : hits) {
                String objID = hit.getID();
                if (!objDone.contains(objID)) {
                    // do something with this object and the class
                    // here: construct a new SDP subclass object and give it back via interface
                    doc = (IDocument) hit;
                    IMasterDataSet mdSet;
                    try {
                        mdSet = (IMasterDataSet) constructorForSDPbaseClass.newInstance(session, doc);
                    } catch (Exception e) {
                        // cause for this
                        String cause = (e.getCause() != null) ? e.getCause().toString() : MasterDataException.ERRMSG_PART_UNKNOWN;
                        throw new MasterDataException(MasterDataException.ERRMSG_NOINSTANCE_POSSIBLE, this.getClass().getSimpleName(), e.toString(), cause);
                    }
                    objRes.put(mdSet.getID(), mdSet);
                    objDone.add(objID);
                }
            }
            doc = (IDocument) hits[hits.length - 1];
            Date lastModDate = ((IDateValue) doc.getDescriptor(modDesc).getValues()[0]).getValue();
            startDateL = Long.parseLong(itaTimestampFormat.format(lastModDate));
            keepGoing = (hits.length >= 10000 || hitList.isResultSetTruncated());
        }
        qp.close();
    }
    return objRes;
}
Loading 120,000 rows (and more) each time will not scale well, and your solution may stop working as the table grows. Instead, let the database server handle the problem.
Your table needs a primary key or unique key based on the columns of the records. Iterate through the 10,000 incoming records, performing a JDBC SQL update to modify all field values, with a where clause that exactly matches the primary/unique key:
update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?; // ... AND PKCOL2 = ? ...
This modifies an existing row or does nothing at all, and JDBC executeUpdate() returns 0 or 1, indicating the number of rows changed. If the number of rows changed was zero, you have detected a new record that does not exist yet, so perform an insert for that record only:
insert into BLAH (COL1, COL2, ... PKCOL) values (?,?, ..., ?);
You can decide whether to run 10,000 updates followed by however many inserts are needed, or to do update + optional insert per record. Remember that JDBC batch statements and turning auto-commit off may help speed things up.
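A rough JDBC sketch of that update-then-insert pattern. The Record holder and its fields are made up for illustration; the table and column names are the placeholders from above:

conn.setAutoCommit(false);
try (PreparedStatement update = conn.prepareStatement(
            "update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?");
     PreparedStatement insert = conn.prepareStatement(
            "insert into BLAH (COL1, COL2, PKCOL) values (?, ?, ?)")) {
    for (Record r : incomingRecords) {
        update.setString(1, r.col1);
        update.setString(2, r.col2);
        update.setString(3, r.pkcol);
        if (update.executeUpdate() == 0) {
            // no existing row matched the key, so this record is new: insert it
            insert.setString(1, r.col1);
            insert.setString(2, r.col2);
            insert.setString(3, r.pkcol);
            insert.executeUpdate();
        }
    }
    conn.commit();
}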

get path after querying neo4j java

I'm trying to run a query to find all paths that match the pattern "(Order)-[ORDERS]->(Product)-[PART_OF]->(Category)" and would like to get the whole path (i.e. all 3 nodes and 2 relationships as their appropriate classes).
The method I used below only lets me read one column of data (number of orders: 2155). When I try to read another column (the 2nd for loop), the number of rows I get is 0 (number of products: 0). Is there a way to save all the results as nodes and relationships, or do I have to run the query 5 times over?
Please help!
String query = "MATCH (o:Order)-[:ORDERS]->(p:Product)-[:PART_OF]->(cate:Category) return o,p,cate";
try (Transaction tx = db.beginTx();
     Result result = db.execute(query)) {
    Iterator<Node> o_column = result.columnAs("o");
    int i = 0;
    for (Node node : Iterators.asIterable(o_column)) {
        i++;
    }
    System.out.println("number of orders: " + i);
    i = 0;
    Iterator<Node> p_column = result.columnAs("p");
    for (Node node : Iterators.asIterable(p_column)) {
        i++;
    }
    System.out.println("number of products: " + i);
    tx.success();
}
I've found a way to work around this in the code below, where I change the return values to node IDs using id() and then use GraphDatabaseService.getNodeById(long):
String query = "MATCH (o:Order)-[:ORDERS]->(p:Product)-[:PART_OF]->(cate:Category) return id(o), id(p), id(cate)";
int nodeID = Integer.parseInt(column.getValue().toString());
Node node = db.getNodeById(nodeID);
If you do this:
MATCH path=(o:Order)-[:ORDERS]->(p:Product)-[:PART_OF]->(cate:Category) return path
you can process path in your loop and unpack it. It takes a bit of exploring, but all the information is in there.
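A Result can only be iterated once, which is why the second columnAs() loop in the question sees zero rows; returning the whole path avoids re-reading the result. A rough sketch of unpacking it with the embedded API (uses org.neo4j.graphdb.Path, Node and Relationship):

String pathQuery = "MATCH path=(o:Order)-[:ORDERS]->(p:Product)-[:PART_OF]->(cate:Category) return path";
try (Transaction tx = db.beginTx();
     Result result = db.execute(pathQuery)) {
    while (result.hasNext()) {
        Map<String, Object> row = result.next();
        Path path = (Path) row.get("path");
        for (Node node : path.nodes()) {
            // the Order, Product and Category nodes, in pattern order
            System.out.println("node: " + node.getLabels());
        }
        for (Relationship rel : path.relationships()) {
            // the ORDERS and PART_OF relationships
            System.out.println("rel: " + rel.getType().name());
        }
    }
    tx.success();
}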
Hope that helps.
Regards,
Tom

How to get all data row wise from a resultset in jdbc

My code reads only one value from each row at a time, i.e. in the first iteration it gives one piece of the row's information, in the 2nd iteration the next, and so on. I want to get all the data of each row from the result set. How can I do that?
This is the structure of my table:
name   s  e  p  f
Allan  2  3  8  9
I am doing:
rsServeResource6 = st.executeQuery(sqlForIndividualMileStone);
while (rsServeResource6.next()) {
    if (rsServeResource6.getString(2) != null) {
        engageActual = Integer.parseInt(rsServeResource6.getString(2));
        System.out.println("Results :" + engageActual);
    } else if (rsServeResource6.getString(3) != null) {
        qualificationActual = Integer.parseInt(rsServeResource6.getString(3));
        System.out.println("Results :" + qualificationActual);
    } else if (rsServeResource6.getString(4) != null) {
        isSubmissionActual = Integer.parseInt(rsServeResource6.getString(4));
        System.out.println("Results :" + isSubmissionActual);
    } else if (rsServeResource6.getString(5) != null) {
        presentActual = Integer.parseInt(rsServeResource6.getString(5));
        System.out.println("Results :" + presentActual);
    } else if (rsServeResource6.getString(6) != null) {
        interviewActual = Integer.parseInt(rsServeResource6.getString(6));
        System.out.println("Results :" + interviewActual);
    }
}
How can I achieve that?
Use if instead of else if when fetching the results; an else-if chain stops at the first non-null column, so only one value per row is ever read.
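A corrected version of the loop from the question, as a sketch (same variables and column indexes):

while (rsServeResource6.next()) {
    // independent if blocks, so every non-null column of the row is read
    if (rsServeResource6.getString(2) != null) {
        engageActual = Integer.parseInt(rsServeResource6.getString(2));
        System.out.println("Results :" + engageActual);
    }
    if (rsServeResource6.getString(3) != null) {
        qualificationActual = Integer.parseInt(rsServeResource6.getString(3));
        System.out.println("Results :" + qualificationActual);
    }
    // ... and likewise for columns 4, 5 and 6 ...
}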

native sql into an unmapped object

I'm working on modifying an existing application and I've decided to combine these 2 things: a native SQL query and an unmapped object.
My unmapped object is a simple object that consists of 2 integer properties:
public class EmployeeScore {
    private int id;
    private int score;
}
and I have a DAO which does the following:
public List<EmployeeScore> findEmployeeTotals(int regionId, int periodId) {
    DataVerify.greaterThan(regionId, 0, "Invalid Region id: Region Id cannot be zero");
    DataVerify.lessThan(regionId, 4, "Invalid Region id: Region id cannot be greater than 3");
    List<EmployeeScore> results = (List<EmployeeScore>) currentSession().createSQLQuery(
            "select n.EMP_ID, SUM(DISTINCT(nom.TOTAL_POINT)) from" +
            " NOMINEE n join NOMINATION nom on nom.NOM_ID = n.NOM_ID" +
            " join EMPLOYEE e on n.EMP_ID = e.EMP_ID" +
            " join COMPANY c on c.COMPANY_CODE = e.COMPANY_CODE" +
            " join REGION r on r.REGION_ID = c.REGION_ID" +
            " where nom.PERIOD_ID = :periodId" +
            " AND nom.STATUS_ID = 2" +
            " AND e.ISACTIVE = 1" +
            " AND nom.CATEGORY_CODE != 'H'" +
            " AND r.REGION_ID = :regionId" +
            " group by n.EMP_ID")
            .setParameter("regionId", regionId)
            .setParameter("periodId", periodId)
            .list();
    return results;
}
It's a huge query, I know. I'm having problems in my tests, and I assume it's because I don't understand how to combine these two correctly.
My test goes as follows:
@Test
@Transactional(isolation = Isolation.SERIALIZABLE)
public void testEmpScore() {
    NomPeriod period = nomPeriodHibernateDAO.findById(6);
    Region region = regionHibernateDAO.findById(1);
    List<EmployeeScore> results = winnerHibernateDAO.findEmployeeTotals(region.getId(), period.getId());
    results.toString();
    Assert.assertEquals(13, results.size());
}
It should return 13 objects of type EmployeeScore, but instead it returns 0, so the test fails.
Can you point me in the right direction as to what I'm doing wrong? I know it has to be something with my object, seeing as it is not mapped, but I have no way of mapping the score value or the id value, since they come from different tables and aggregates.
Thanks.
The problem is that you are querying for two integers and trying to interpret them as EmployeeScores. Hibernate can do it, but it will take a bit more work than that.
Assuming EmployeeScore has a constructor that takes two integers, you can try
select new my.package.EmployeeScore(n.EMP_ID, SUM(DISTINCT(nom.TOTAL_POINT))) ...
You need to give it the full package path to your object. Note that this constructor-expression syntax is HQL, not native SQL, so it requires rewriting the query against your mapped entities.
Alternatively, by default, I think the query will return a List<Object[]>, so you could iterate through these and build your EmployeeScores manually:
List<Object[]> results = query.list();
List<EmployeeScore> scores = new LinkedList<EmployeeScore>();
for (Object[] arr : results) {
    // native queries return DB-specific numeric types, so go through Number
    // rather than casting straight to int
    int id = ((Number) arr[0]).intValue();
    int total = ((Number) arr[1]).intValue();
    scores.add(new EmployeeScore(id, total));
}
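Another option often used for native SQL is Hibernate's alias-to-bean result transformer. A sketch, assuming EmployeeScore gets a no-arg constructor and setters for id and score (the column aliases must match the property names; the elided joins and where clause are the same as in the original query):

String sql = "select n.EMP_ID as id, SUM(DISTINCT(nom.TOTAL_POINT)) as score from"
        + " NOMINEE n join NOMINATION nom on nom.NOM_ID = n.NOM_ID"
        // ... same joins, where clause (with :regionId and :periodId) as above ...
        + " group by n.EMP_ID";
List<EmployeeScore> results = (List<EmployeeScore>) currentSession().createSQLQuery(sql)
        .addScalar("id", StandardBasicTypes.INTEGER)
        .addScalar("score", StandardBasicTypes.INTEGER)
        .setResultTransformer(Transformers.aliasToBean(EmployeeScore.class))
        .setParameter("regionId", regionId)
        .setParameter("periodId", periodId)
        .list();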

how to get 150k followersIDs from User?

I am trying to get all the follower IDs from a Twitter account with about 150,000 followers. I later want to map their locations, but first I need all those IDs.
At the moment I am using this code:
long lCursorIDs = -1;
long[] fArray = new long[100];
do {
    fArray = twitter.getFollowersIDs(name, lCursorIDs).getIDs();
} while (twitter.getFollowersIDs(name, lCursorIDs).hasNext());
try {
    PrintWriter pr = new PrintWriter(filenameOutput);
    for (int i = 0; i < fArray.length; i++) {
        pr.println(fArray[i]);
    }
    pr.close();
    System.out.println("Follower IDs collected and saved to file: " + filenameOutput);
} catch (Exception e) {
    e.printStackTrace();
    System.out.println("No such file exists.");
}
This works for users with fewer followers, but with that many it always returns an error message: rate limit exceeded.
I was thinking about getting only a certain number of follower IDs per hour, but I am not sure how to do that without starting every hour from the beginning with the first follower. I am also not sure how many followers I can get with one request; maybe it is 100, as with the "lookupUser" method, but I am not sure. Any ideas/suggestions?
EDIT: OK, I just tried to get the follower IDs of an account with 2,700 followers and it stored them correctly in the text file. It also only "cost" one request. Then I changed the account name to an account with 15,500 followers and it crashed again with a rate limit exceeded message. I don't get why, since it's only roughly 6 times as many followers, yet all the remaining requests get spent. Any ideas on what I'm doing wrong?
The answer:
int numberOfFollowers;
numberOfFollowers = user.getFollowersCount();
// CREATE ARRAYS FOR FOLLOWER IDS
long cursor = -1;
long[] fArray = new long[numberOfFollowers];
long[] local = new long[5000];
IDs ids = twitter.getFollowersIDs(name, cursor);
int j = 0;
int x = 5000;
int durchgang = 1;
int d_anzahl = 1 + numberOfFollowers / 5000;
// STORE FOLLOWER IDS IN ARRAYS
do {
    ids = twitter.getFollowersIDs(name, cursor);
    // reuse the same response; a second getFollowersIDs call would spend another request
    local = ids.getIDs();
    System.out.println("Durchgang: " + durchgang + " / " + d_anzahl);
    System.arraycopy(local, 0, fArray, j * x, local.length);
    j++;
    durchgang++;
    cursor = ids.getNextCursor();
} while (ids.hasNext());
This builds an array with all follower IDs of any Twitter user. It calculates the number of loops needed to get all follower IDs and copies each batch of 5,000 IDs into the array that holds all IDs at the end.
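For the rate-limit part of the question: getFollowersIDs returns up to 5,000 IDs per call, and the endpoint only allows a limited number of calls per 15-minute window, so 150,000 followers cannot be fetched in one burst. A common Twitter4J pattern, as a sketch, is to catch the rate-limit exception, sleep until the window resets, and keep the cursor so nothing restarts from the first follower:

static List<Long> fetchAllFollowerIds(Twitter twitter, String name)
        throws TwitterException, InterruptedException {
    List<Long> allIds = new ArrayList<>();
    long cursor = -1;
    while (cursor != 0) {
        IDs ids;
        try {
            ids = twitter.getFollowersIDs(name, cursor);
        } catch (TwitterException e) {
            if (e.exceededRateLimitation() && e.getRateLimitStatus() != null) {
                // sleep until the window resets; the cursor is untouched,
                // so the next call continues exactly where we stopped
                Thread.sleep((e.getRateLimitStatus().getSecondsUntilReset() + 1) * 1000L);
                continue;
            }
            throw e;
        }
        for (long id : ids.getIDs()) {
            allIds.add(id);
        }
        cursor = ids.getNextCursor();
    }
    return allIds;
}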
