Sqlite relative complement on combined key - java

First, some background on my problem:
I am building a crawler and I want to monitor some highscore lists.
The highscore lists are defined by two parameters: a category and a collection (together unique).
After a successful download I create a new stats entry (category, collection, createdAt, ...)
Problem: I want to query the highscore list only once per day. So I need a query that will return category and collection that haven't been downloaded in 24h.
The stats Table should be used for this.
I have a list of all possible categories and a list of all possible collections. Every combination is valid, so they work like a cross join.
So basically I need the relative complement: the cross join minus the entries from the last 24 hours.
My idea: cross join categories and collections and subtract every pair (category, collection) from stats entries that were created during the last 24 hours.
Question 1: Is it possible to define categories and collections inside the query and cross join them, or do I have to create a table for them?
Question 2: Is my idea the correct approach? How would you do this in SQLite?
OK, I realise that this might sound confusing, so I drew an image of what I actually want.
I am interested in C.
Here is my current code in Java; maybe it helps to understand the problem:
public List<Pair<String, String>> getCollectionsToDownload() throws SQLException {
    long threshold = System.currentTimeMillis() - DAY;
    QueryBuilder<TopAppStatistics, Long> query = queryBuilder();
    List<TopAppStatistics> collectionsNotToQuery = query.where().ge(TopAppStatistics.CREATED_AT, threshold).query();
    List<Pair<String, String>> toDownload = crossJoin();
    for (TopAppStatistics stat : collectionsNotToQuery) {
        toDownload.remove(new Pair<>(stat.getCategory(), stat.getCollection()));
    }
    return toDownload;
}

private List<Pair<String, String>> crossJoin() {
    String[] categories = PlayUrls.CATEGORIES;
    String[] collections = PlayUrls.COLLECTIONS;
    List<Pair<String, String>> toDownload = new ArrayList<>();
    for (String ca : categories) {
        for (String co : collections) {
            toDownload.add(new Pair<>(ca, co));
        }
    }
    return toDownload;
}

The easiest solution to your problem is an EXCEPT. Say you have a subquery
that computes A and another one that computes B. These queries
can be very complex. The key is that both should return the same number of columns and comparable data types.
In SQLite you can then do:
<your subquery 1> EXCEPT <your subquery 2>
As simple as that.
For example:
SELECT a, b FROM T WHERE a > 10
EXCEPT
SELECT a, b FROM T WHERE b < 5;
Remember, both subqueries must return the same number of columns.
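Regarding Question 1, here is a minimal sketch of how the category and collection lists could be defined inside the query itself, assuming the SQLite build in use supports common table expressions with a VALUES list (CTEs arrived in SQLite 3.8.3, so an older bundled version may not accept this). The table name stats and the column names category, collection and createdAt are assumptions taken from the description in the question, and ComplementQuery is a hypothetical helper, not part of any library.

```java
import java.util.Collections;

final class ComplementQuery {

    // Builds "(?),(?),...,(?)": n single-column rows for a VALUES clause.
    static String valuesRows(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one value");
        }
        return String.join(",", Collections.nCopies(n, "(?)"));
    }

    // Cross join of all categories and collections, minus the pairs that
    // already have a stats row newer than the threshold.
    // Assumed schema: stats(category, collection, createdAt).
    static String build(int categoryCount, int collectionCount) {
        return "WITH cats(category) AS (VALUES " + valuesRows(categoryCount) + "), "
             + "cols(collection) AS (VALUES " + valuesRows(collectionCount) + ") "
             + "SELECT category, collection FROM cats CROSS JOIN cols "
             + "EXCEPT "
             + "SELECT category, collection FROM stats WHERE createdAt >= ?";
    }
}
```

The parameters would be bound in order: first every category, then every collection, and finally the threshold timestamp. If the lists rarely change, storing them in two small tables and cross joining those is probably the simpler option.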

Related

Hibernate ResultTransformer unable to cast a single column

I have some very complicated SQL (does some aggregation, some counts based on max value etc) so I want to use SQLQuery rather than Query. I created a very simple Pojo:
public class SqlCount {
    private String name;
    private Double count;
    // getters, setters, constructors etc
}
Then when I run my SQLQuery, I want hibernate to populate a List for me, so I do this:
Query hQuery = sqlQuery.setResultTransformer(Transformers.aliasToBean(SqlCount.class));
Now I had a problem where depending on what the values are for 'count', Hibernate will variably retrieve it as a Long, Double, BigDecimal or BigInteger. So I use the addScalar function:
sqlQuery.addScalar("count", StandardBasicTypes.DOUBLE);
Now my problem. It seems that if you don't use the addScalar function, Hibernate will populate all of your fields with all of your columns in your SQL result (ie it will try to populate both 'name' and 'count'). However if you use the addScalar function, it only maps the columns that you listed, and all other columns seem to be discarded and the fields are left as null. In this instance, it wouldn't be too bad to just list both "name" and "count", but I have some other scenarios where I need a dozen or so fields - do I really have to list them all?? Is there some way in hibernate to say "map all fields automatically, like you used to, but by the way map this field as a Double"?
Is there some way in Hibernate to say "map all fields automatically"?
No. See section 16.1.1, "Scalar queries", of the Hibernate documentation:
The most basic SQL query is to get a list of scalars (values).
sess.createSQLQuery("SELECT * FROM CATS").list();
sess.createSQLQuery("SELECT ID, NAME, BIRTHDATE FROM CATS").list();
These will return a List of Object arrays (Object[]) with scalar values for each column in the CATS table. Hibernate will use ResultSetMetadata to deduce the actual order and types of the returned scalar values.
To avoid the overhead of using ResultSetMetadata, or simply to be more explicit in what is returned, one can use addScalar():
sess.createSQLQuery("SELECT * FROM CATS")
.addScalar("ID", Hibernate.LONG)
.addScalar("NAME", Hibernate.STRING)
.addScalar("BIRTHDATE", Hibernate.DATE)
I use this solution; I hope it works for you.
With this solution you can populate whatever you select in the SQL, return it as a Map, and cast the values directly.
Since Hibernate 5.2 the method setResultTransformer() is deprecated, but it still works fine for me.
If you don't want to write an extra addScalar() call for each column in the SQL, you can implement the ResultTransformer interface and do the casting as you wish.
For example, let's say we have this query:
/*ORACLE SQL*/
SELECT
SEQ AS "code",
CARD_SERIAL AS "cardSerial",
INV_DATE AS "date",
PK.GET_SUM_INV(SEQ) AS "sumSfterDisc"
FROM INVOICE
ORDER BY "code";
Note: I use double quotes to make the column aliases case-sensitive; check this.
After creating the Hibernate session you can create the query like this:
/*Java*/
List<Map<String, Object>> list = session.createNativeQuery("SELECT\n" +
" SEQ AS \"code\",\n" +
" CARD_SERIAL AS \"cardSerial\",\n" +
" INV_DATE AS \"date\",\n" +
" PK.GET_SUM_INV(SEQ) AS \"sumSfterDisc\"\n" +
"FROM INVOICE\n" +
"ORDER BY \"code\"")
.setResultTransformer(new Trans())
.list();
Now the key part, the Trans class:
/*Java*/
public class Trans implements ResultTransformer {
    private SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);

    @Override
    public Object transformTuple(Object[] objects, String[] strings) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < strings.length; i++) {
            // Null columns are skipped, so they will be absent from the map.
            if (objects[i] == null) {
                continue;
            }
            if (objects[i] instanceof BigDecimal) {
                map.put(strings[i], ((BigDecimal) objects[i]).longValue());
            } else if (objects[i] instanceof Timestamp) {
                map.put(strings[i], dateFormat.format(((Timestamp) objects[i])));
            } else {
                map.put(strings[i], objects[i]);
            }
        }
        return map;
    }

    @Override
    public List transformList(List list) {
        return list;
    }
}
Here you have to override two methods, transformTuple and transformList. transformTuple takes two parameters: Object[] objects holds the column values of the row, and String[] strings holds the column names. Hibernate guarantees that the columns arrive in the same order as in the query.
Now the fun begins: transformTuple is invoked for each row returned by the query, so you can build the row as a Map or create a new object with fields.

What is the better approach for solving Restrictions.in with large lists?

It has been established that when you use Hibernate's Restrictions.in(String property, List list), you have to limit the size of list.
This is because the database server might not be able to handle long queries.
Aside from adjusting the configuration of the database server, here are the solutions I found:
SOLUTION 1: Split the list into smaller ones and then add the smaller lists separately into several Restrictions.in
public List<Something> findSomething(List<String> subCdList) {
    Criteria criteria = getSession().createCriteria(getEntityClass());
    // If the size of the list is greater than 1000, split it into smaller lists. See List<List<String>> cdList
    if (subCdList.size() > 1000) {
        List<List<String>> cdList = new ArrayList<List<String>>();
        List<String> tempList = new ArrayList<String>();
        Integer counter = 0;
        for (Integer i = 0; i < subCdList.size(); i++) {
            tempList.add(subCdList.get(i));
            counter++;
            if (counter == 1000) {
                counter = 0;
                cdList.add(tempList);
                tempList = new ArrayList<String>();
            }
        }
        if (tempList.size() > 0) {
            cdList.add(tempList);
        }
        Criterion criterion = null;
        // Iterate over the list of lists and add a restriction for each smaller list
        for (List<String> cds : cdList) {
            if (criterion == null) {
                criterion = Restrictions.in("subCd", cds);
            } else {
                criterion = Restrictions.or(criterion, Restrictions.in("subCd", cds));
            }
        }
        criteria.add(criterion);
    } else {
        criteria.add(Restrictions.in("subCd", subCdList));
    }
    return criteria.list();
}
This is an okay solution since you will only have one select statement. However, I think it's a bad idea to have for loops on the DAO layer because we do not want the connection to be open for a long time.
SOLUTION 2: Use DetachedCriteria. Instead of passing the list, query it on the WHERE clause.
public List<Something> findSomething() {
    Criteria criteria = getSession().createCriteria(getEntityClass());
    DetachedCriteria detached = DetachedCriteria.forClass(DifferentClass.class);
    detached.setProjection(Projections.property("cd"));
    criteria.add(Property.forName("subCd").in(detached));
    return criteria.list();
}
The problem with this solution is the technical usage of DetachedCriteria. You usually use it when you want to create a query on another class that is not connected to (or has no relationship with) your current class. In the example, Something.class has a property subCd that is a foreign key from DifferentClass. Also, this produces a subquery in the WHERE clause.
When you look at the code:
1. SOLUTION 2 is simpler and more concise.
2. But SOLUTION 1 offers a query with only one select.
Please help me decide which one is more efficient.
Thanks.
For Solution 1: instead of writing the loops inline, you can try the following.
Use a utility method to build the Criterion for the IN clause whenever the number of parameter values exceeds a fixed limit (800 in the code below).
class HibernateBuildCriteria {
    private static final int PARAMETER_LIMIT = 800;

    public static Criterion buildInCriterion(String propertyName, List<?> values) {
        Criterion criterion = null;
        int listSize = values.size();
        for (int i = 0; i < listSize; i += PARAMETER_LIMIT) {
            List<?> subList;
            if (listSize > i + PARAMETER_LIMIT) {
                subList = values.subList(i, (i + PARAMETER_LIMIT));
            } else {
                subList = values.subList(i, listSize);
            }
            if (criterion != null) {
                criterion = Restrictions.or(criterion, Restrictions.in(propertyName, subList));
            } else {
                criterion = Restrictions.in(propertyName, subList);
            }
        }
        return criterion;
    }
}
Using the method:
criteria.add(HibernateBuildCriteria.buildInCriterion(propertyName, list));
Hope this helps.
Solution 1 has one major drawback: you may end up with a lot of different prepared statements, each of which needs to be parsed and needs an execution plan calculated and cached. This process may be much more expensive than the actual execution of a query whose statement has already been cached by the database. Please see this question for more details.
The way I solve this is to use the algorithm Hibernate uses for batch fetching of lazily loaded associated entities. Basically, I use ArrayHelper.getBatchSizes to get the sublists of ids and then execute a separate query for each sublist.
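The idea of restricting sublists to a small, fixed set of sizes can be sketched without Hibernate. The class below is a hypothetical illustration and not Hibernate's actual ArrayHelper.getBatchSizes implementation: it simply assumes the allowed sizes are maxBatchSize, maxBatchSize/2, maxBatchSize/4 and so on down to 1, so that only a handful of distinct IN-clause lengths (and therefore prepared statements) ever occur.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSplitter {

    // Splits ids into consecutive chunks whose sizes are taken from the
    // assumed set {max, max/2, max/4, ..., 1}, largest first.
    static <T> List<List<T>> split(List<T> ids, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        int from = 0;
        int size = maxBatchSize;
        while (from < ids.size()) {
            int remaining = ids.size() - from;
            // Shrink to the largest allowed size that still fits.
            while (size > remaining) {
                size = Math.max(1, size / 2);
            }
            batches.add(new ArrayList<>(ids.subList(from, from + size)));
            from += size;
        }
        return batches;
    }
}
```

Each returned sublist would then be passed to its own query. With a maximum of 8, for instance, a list of 13 ids is split into chunks of 8, 4 and 1.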
Solution 2 is appropriate only if you can project ids in a subquery. But if you can't, then you can't use it. For example, the user of your app edited 20 entities on a screen and now they are saving the changes. You have to read the entities by ids to merge the changes and you cannot express it in a subquery.
However, an alternative to solution 2 could be to use temporary tables. For example, Hibernate sometimes does this for bulk operations. You can store your ids in the temporary table and then use them in the subquery. I personally consider this an unnecessary complication compared to solution 1 (for this use case of course; Hibernate's reasoning is good for their use case), but it is a valid alternative.

Is there a way to query in Hibernate without generating many SQL statements?

I need to query my database (Postgres) like this:
Entity:
class Cat {
    int id;
    String name;
}
main class:
int[] idCats = {1, 2, 7, 5, 8, 4, 9, 10, 12, 14};
for (int id : idCats) {
    Cat cat = session.load(Cat.class, id);
    // do something with cat, according to its name
}
But this approach generates too many SQL statements. Considering that I'll look up almost all ids, is there a
way to fetch all objects and search them using Criteria, without implementing it myself?
You can use Hibernate's second-level cache feature. But if your application is simple and you just want to fetch a cat by its id, then store the result of the query in a HashMap like
Map<Integer, Cat> mapCats = new HashMap<Integer, Cat>();
You can use the for loop to iterate over the list from DB.
Map<Integer, Cat> mapCats = new HashMap<Integer, Cat>();
for(Cat oneCat: listCats) {
mapCats.put(oneCat.id, oneCat);
}
Then retrieve using
mapCats.get(catid);

How to compare list of records against database in Java?

How do I compare a list of records against a database? I have more than 1000 records in a list and need to validate them against the database. How do I validate each record from the list? Should I select all the data from the database, store it in a list, and then compare the values? Please advise...
The code below lists the values to validate against the database.
private void validatepart(HttpServletRequest req, Vector<String> errors) {
    Parts bean = (Parts) req.getAttribute("partslist");
    Vector<PartInfo> partList = bean.getPartList();
    int sz = partList.size();
    for (int i = 0; i < sz; i++) {
        PartInfo part = partList.elementAt(i);
        System.out.println(part.getNumber());
        System.out.println(part.getName());
    }
}
This depends on what you mean by compare. If it's just one field, then execute a query such as select * from parts_table where part_number = ?. It's not that much of a stretch to add more fields to that query. If nothing is returned, you know the record doesn't exist.
If you need to compare and know exactly which values are different then you can try something like this
List<String> compareObjects(PartInfo filePart, PartInfo dbPart) {
List<String> different = new LinkedList<String>();
if (!filePart.getNumber().equals(dbPart.getNumber())) {
different.add("number");
}
//repeat for all your fields
return different;
}
If your list of objects that you need to validate against the database includes a primary key, then you could just build a list of those primary key values and run a query like:
SELECT <PRIMARY KEY FIELD> FROM <TABLE> WHERE <PRIMARY KEY FIELD> IN (<LIST OF PRIMARY KEYS>) ORDER BY <PRIMARY KEY FIELD> ASC;
Once you get that list back, you can compare the results. My instinct would be to put your data (and the query results too) into a Set object and then call removeAll() to get the items not in the database (reverse this for items in the database but not in your set):
yourDataSet.removeAll(queryResults);
This assumes that you have an equals() method implemented in your PartInfo object. You can see the Java API documentation for more details.
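As a minimal sketch of that set-based comparison, the hypothetical helper below works on plain String keys (for example part numbers) rather than PartInfo objects, which sidesteps the need for a custom equals() and hashCode(). The class and method names are illustrative only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeyDiff {

    // Keys that are in the input list but were not returned by the database.
    static Set<String> missingFromDb(List<String> inputKeys, List<String> dbKeys) {
        Set<String> missing = new HashSet<>(inputKeys);
        missing.removeAll(new HashSet<>(dbKeys));
        return missing;
    }

    // Keys that are in the input list and also exist in the database.
    static Set<String> presentInDb(List<String> inputKeys, List<String> dbKeys) {
        Set<String> present = new HashSet<>(inputKeys);
        present.retainAll(new HashSet<>(dbKeys));
        return present;
    }
}
```

Copying the input into a new HashSet keeps the original list untouched, since removeAll() and retainAll() modify the set they are called on.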

Shortest path problem with DB and java

I have a movie database (PostgreSQL). One of the tables contains actors and movie titles. The assignment I have to solve in Java is as follows: two actors (A and B) are connected when they appear in the same movie. Further, two actors A and B are also connected when there is a third actor C who plays with both of them in different movies (A and B don't play together!), and so on... I hope you get the idea :) Now I have to find the shortest connection (= path) between two actors.
Now to the implementation: fetching the data from the DB (prepared statements) and saving the names (as strings) in a linked list is working. As well as simple connection between actors like A -> B (= both play in the same movie). I'm hitting the wall trying to include more complicated connections (like A -> B -> C).
I am storing the actor names in a HashMap like this:
Map<String, List<String>> actorHashMap = new HashMap<String, List<String>>();
So when I load the first actor (Johnny Depp) I have his name as key, and other actors playing with him in a list referenced by the key. Checking, whether another actor played with him is easy:
List<String> connectedActors = actorHashMap.get(sourceActor);
if (connectedActors.contains(actor)) {
    found = true;
}
But... what do I do if the actor I'm looking for is not in the HashMap (i.e. when I have to go one level deeper to find him)? I assume I would have to pick the first actor's name from the connectedActors list, insert it as a new key into the HashMap, and fetch all actors he played with to insert them too. Then search in this list. But that's exactly the part I can't figure out. I already tried to store the names in graph nodes and use BFS to search for them, but I hit the same problem there: I just don't know how to go "one level down" without creating an infinite loop... Does anyone have an idea how I can solve this? I am just starting with Java as well as programming in general, so it's probably simple, but I just can't see it :/
First I would use a Set to store the actors someone played with:
Map<String, Set<String>> actorHashMap = ...
instead of a List to avoid duplicate names.
In order to find how many degrees of separation between 2 actors
I would start with one actor and generate all actors that are separated by
1, 2, 3... degrees. I would fix a maximum search depth, however that might not be necessary if the number of actors is not too large.
// Assumes this runs inside a method that returns the depth as an int, and that
// actorHashMap already holds the co-stars of every actor (load them from the DB if missing).
String actor = "Johnny Depp";
String targetActor = "John Travolta";
Map<String, Integer> connectedActorsAndDepth = new HashMap<String, Integer>();
connectedActorsAndDepth.put(actor, 0);
int depth = 1;
Set<String> actorsAddedAtPrecedingDepth = new HashSet<String>();
for (String otherActor : actorHashMap.get(actor)) {
    if (otherActor.equals(targetActor)) return depth;
    connectedActorsAndDepth.put(otherActor, depth);
    actorsAddedAtPrecedingDepth.add(otherActor);
}
int maxDepth = 10;
while (++depth < maxDepth) {
    Set<String> actorsAddedAtCurrentDepth = new HashSet<String>();
    for (String precedingActor : actorsAddedAtPrecedingDepth) {
        // Go one level down: look at everyone who played with precedingActor.
        for (String otherActor : actorHashMap.get(precedingActor)) {
            if (otherActor.equals(targetActor)) return depth;
            // Only actors not seen before are added, which prevents infinite loops.
            if (!connectedActorsAndDepth.containsKey(otherActor)) {
                actorsAddedAtCurrentDepth.add(otherActor);
                connectedActorsAndDepth.put(otherActor, depth);
            }
        }
    }
    actorsAddedAtPrecedingDepth = actorsAddedAtCurrentDepth;
}
I do not claim it is the most efficient algorithm. There might also be bugs in the code above.
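Since the question asks for the connection itself and not only its length, here is a minimal sketch of a breadth-first search that also remembers each actor's predecessor, so the path can be rebuilt at the end. It assumes the whole co-star map is already in memory; the class name ActorPath and the parameter names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class ActorPath {

    // Returns the shortest chain of actors from source to target,
    // or an empty list if the two are not connected.
    static List<String> shortestPath(Map<String, Set<String>> coStars,
                                     String source, String target) {
        // parent doubles as the "visited" set; the source has no parent.
        Map<String, String> parent = new HashMap<>();
        parent.put(source, null);
        Queue<String> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                // Walk the parent links back to the source.
                LinkedList<String> path = new LinkedList<>();
                for (String a = current; a != null; a = parent.get(a)) {
                    path.addFirst(a);
                }
                return path;
            }
            for (String next : coStars.getOrDefault(current, Collections.<String>emptySet())) {
                // Skipping already visited actors is what prevents infinite loops.
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

Because the queue processes actors in order of increasing distance, the first time the target is dequeued the reconstructed path is a shortest one. To load co-stars lazily from the database instead, the getOrDefault lookup is the place where a query could be issued for actors that are not yet in the map.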
