I have a Quarkus application where we use Hibernate ORM with Panache to build and query the database. In some situations, we want to use a List, or rather a Set, to filter rows from a table of "requests". The Request entity (which has a different name in practice) has a status property, an enum with three values: PENDING, APPROVED or DENIED. In the web front-end, we want a checkbox-style filter that sends the selected statuses as an array in the HTTP request to the Quarkus application, where we then want to pass them to Hibernate somehow, preferably as a Set to easily filter out duplicates.
I've done something extremely similar within the NodeJS/MongoDB ecosystem in the past, which looks like this, as a step of a bigger aggregate pipeline:
aggregatePipeline.push({
    $match: {
        status: {
            $ne: status // Array of strings
        }
    }
});
How would something like this be done within Hibernate? I've tried some googling, but the results are largely cluttered by people asking how to get an ArrayList out of the cursor from a standard find query.
Thanks in advance.
Edit: Trying this line
List<Publisher> publishers = Publisher.find("name", Arrays.asList("Books", "Publishing")).list();
Gives this error:
org.postgresql.util.PSQLException: ERROR: operator does not exist: character varying = record
Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
Never mind. I tried to think of another search term that wouldn't produce the flood of irrelevant results I mentioned. I tried "Hibernate find in set", which led me to this other post. That in turn led me to try this line (with the Set values hardcoded for FAAFO purposes):
List<Publisher> publishers = Publisher.find("name IN ?1", new HashSet<>(Arrays.asList("Indiana University Press", "Harvard University Press"))).list();
No errors, and returns two entries that were imported into the dev db through the import.sql file. It's the "IN ?1" part that does it, not that I used a Set for my query instead of a List this time. It works just as well with a List, as long as the "IN ?1" stays there.
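For the original status filter, the remaining step is turning the array of checkbox values into a Set before handing it to the query. The following is only a minimal sketch of that conversion in plain Java; the enum name `RequestStatus` and the helper `StatusFilter` are assumptions made for illustration, since the real entity and enum names are not given here.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical enum mirroring the three statuses described in the question.
enum RequestStatus { PENDING, APPROVED, DENIED }

final class StatusFilter {
    private StatusFilter() {}

    // Converts raw checkbox values (for example from a query parameter array)
    // into a duplicate-free set of enum constants. Values are trimmed and
    // upper-cased first; anything that is not a known status makes
    // Enum.valueOf throw IllegalArgumentException.
    static Set<RequestStatus> parse(Collection<String> rawValues) {
        Set<RequestStatus> statuses = EnumSet.noneOf(RequestStatus.class);
        for (String raw : rawValues) {
            statuses.add(RequestStatus.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        }
        return statuses;
    }
}
```

The resulting set could then presumably be passed as the positional parameter in the same way as the Publisher query above, for example `Request.find("status IN ?1", statuses).list()`, where `Request` stands in for the real entity. An empty set is probably worth handling before querying, since an empty IN list may not behave the same on every database.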
I'm making a social media app like Instagram, and I'm trying to filter out the user's own posts so they don't see them on their feed. I'm trying to do it as shown below, but it causes the app to crash. When I change it to .whereEqualTo, it works just fine and shows only posts by the user. As far as I know, the two should work exactly the same, with the obvious exception of one being "equal to" and the other "not equal to", so why does one work but the other doesn't?
Does not work
Query query = firestoreDb.collection("posts")
        .whereNotEqualTo("user.username", username);
Shows only posts by the user
Query query = firestoreDb.collection("posts")
        .whereEqualTo("user.username", username);
The error I get is
You have an inequality where filter (whereLessThan(), whereGreaterThan(), etc.) on field 'user.username' and so you must also have 'user.username' as your first orderBy() field, but your first orderBy() is currently on field 'creationTime' instead
but I don't want to order by username.
If you want to use an inequality filter (such as whereNotEqualTo) in your query, you are obliged to order the results by the field on that filter. That is a hard requirement of Firestore, due to the way it's organized. The only way to work around that is to reorder the query results in your app code - you will not be able to coerce the query to do what you want.
See also:
Firestore "Invalid query" - Am I using Indexing wrong?
Firestore query order on field with filter on a different field
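The client-side reordering mentioned in the answer above could look roughly like the sketch below. It is plain Java and does not call Firestore at all: `Post` is a hypothetical, simplified stand-in for the documents returned by the query (which would have been ordered by `user.username` first), and `creationTime` is assumed to be a numeric timestamp.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in for a post document read from Firestore.
final class Post {
    final String username;
    final long creationTime; // assumed to be e.g. epoch milliseconds

    Post(String username, long creationTime) {
        this.username = username;
        this.creationTime = creationTime;
    }
}

final class FeedSorter {
    private FeedSorter() {}

    // Returns a new list with the newest posts first; the input list,
    // in whatever order the query returned it, is left untouched.
    static List<Post> newestFirst(List<Post> fetched) {
        List<Post> sorted = new ArrayList<>(fetched);
        sorted.sort(Comparator.comparingLong((Post p) -> p.creationTime).reversed());
        return sorted;
    }
}
```

Sorting in the app only reorders what was actually fetched, so if the query is paginated or limited, the result is not the same as a server-side ordering by `creationTime`.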
I have a Salesforce app that allows me to execute REST API calls, and I need to retrieve orders (/services/data/v47.0/sobjects/Order) by status.
I've found a manual that describes similar filtering on another entity (https://developer.salesforce.com/docs/atlas.en-us.api_placeorder.meta/api_placeorder/sforce_placeorder_rest_api_standalone.htm).
However, when I execute the following request, orders of all statuses seem to be returned:
GET /services/data/v47.0/sobjects/Order?order.status='ddd'
I also tried some variations of query params. Is this functionality supported?
The /sobjects resource lets you discover dynamically which fields (standard and custom) exist on the Order table (or any other table, really), what types they are, their picklist values, and so on.
To retrieve actual data you can use the query resource. (Salesforce uses a dialect of SQL called SOQL. If you've never used it before, it will look a bit weird the moment you want to do any JOINs, so it would be nice if a Salesforce developer could fill you in.)
This might be a good start
/services/data/v47.0/query/?q=SELECT Id, Name, OrderNumber FROM Order WHERE Status = 'Draft' LIMIT 10
Never seen the API you've linked to, interesting stuff. But I don't see anything obvious that would let you filter by status there so the more generic "query anything you wish" might work better for you. Play a bit and perhaps https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/dome_query.htm will suit your needs more?
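Since the SOQL statement travels in the `q` query parameter, it has to be URL-encoded before the request is sent. The sketch below shows one way to assemble the path in plain Java; the class and method names are made up for illustration, and the status value is just the 'Draft' example from above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SoqlUrl {
    private SoqlUrl() {}

    // Builds the relative path for the REST query resource.
    // URLEncoder performs form encoding: spaces become '+', and characters
    // such as ',', '=' and the single quote are percent-encoded.
    static String queryPath(String apiVersion, String soql) {
        return "/services/data/" + apiVersion + "/query/?q="
                + URLEncoder.encode(soql, StandardCharsets.UTF_8);
    }
}
```

Calling `SoqlUrl.queryPath("v47.0", "SELECT Id, Name, OrderNumber FROM Order WHERE Status = 'Draft' LIMIT 10")` yields a path that can be appended to the instance URL. If the status comes from user input, it should be escaped for SOQL before being placed between the quotes; that step is not shown here.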
My situation is as follows. Given the following 3 methods (I use couchbase-java-client 2.2 from Scala, and the Couchbase Server version is 4.1):
def findAll() = {
  bucket.query(N1qlQuery.simple(select("*").from(i(DatabaseBucket.USER))))
    .allRows().toList
}
def findById(id: UUID) = {
  Option(bucket.get(id.toString, classOf[RawJsonDocument])).map(i => read[User](i.content()))
}
def upsert(i: User) = {
  bucket.async().upsert(RawJsonDocument.create(i.id.toString, write(i)))
}
Basically, they are find all, find one by ID, and insert. I ran an experiment:
I insert a User, then look it up with findById right after that: I get the user I inserted, correctly.
I insert and then call findAll right after that: it returns an empty result.
I insert, wait 3 seconds and then call findAll: I can find the one I inserted.
From that, I suspect that N1qlQuery only searches the cached layer rather than the "persisted" layer. So, how can I force it to search the "persisted" layer?
In Couchbase 4.0 with N1QL, there are different consistency levels you can specify when querying which correspond to different cost for updates/changes to propagate through index recalculation. These aren't tied to whether or not data is persisted, but rather it's an option when you issue the query. The default is "not bounded" and to make sure that your upsert request is taken into consideration, you'll want to issue this query as "request plus".
To get the effect you're looking for, pass a N1qlParams when you create the N1qlQuery, using the overload of the simple() method that accepts one. Build the N1qlParams with ScanConsistency.REQUEST_PLUS. You can read more about this in Couchbase's Developer Guide. There's a Java API example of this. With that change, you won't need a sleep() in there; the system will automatically service the query request once the index recalculation has reached your specified level.
Depending on how you're using this elsewhere in your application, there are times you may want either consistency level.
You need stronger scan consistency. Add a N1qlParams to the query, using consistency(ScanConsistency.REQUEST_PLUS).
I have found the jQuery DataTables plug-in extremely useful for simple, read-only applications where I'd like to give the user pagination, sorting and searching of very large sets of data (millions of rows, using server-side processing).
I have a system for reusing this code, but I end up doing the same thing over and over a lot. I'd like to write a very generalized API where I essentially just need to configure the SQL needed to retrieve the data used in the table. I am looking for a good design pattern or approach to do this. I've seen articles like this one, http://www.codeproject.com/Articles/359750/jQuery-DataTables-in-Java-Web-Applications, and have a complete understanding of how server-side processing works (I have done it in Java and ASP.NET many times). To answer, you will probably need a deep understanding of how server-side processing works in Java, but here are some issues that come up when attempting to do this:
*I generally run three separate queries: a count without the search clause, a count with the clause included, and the query for the actual data. I haven't found an efficient way to do all three at once, and doing so requires a lot of extra data to come back from the database (i.e. the counts repeated over and over). The API needs to support behavior based on these three different queries, and complex queries at that. I generally use ROW_NUMBER() OVER an indexed column so that pagination stays relatively speedy with large data.
*where clause changes dynamically (user can search over a variable number of rows).
*order by clause changes for the same reason.
Overall, each case is often pretty specific to the data we need. Is there a good way to abstract this so that I can do minimal work when I want to use the plug-in server side?
So, the steps are as follows in most projects:
*extract the params the plug-in sends to the server (a lot of the time my own are added, mostly date ranges)
*build the unfiltered count query (this is rarely dynamic).
*build the filtered count query (is dynamic)
*build the data query
*construct a model object of the table and return it as json.
A lot of the issues occur setting the prepared statements with a variable number of parameters. Dynamically generating the sql in a general way (say based on just column names) seems unlikely. I am wondering if someone else has created something they are using for this or if it sounds like a specific pattern is applicable. It has just occurred to me that creating a reusable filter may be helpful in java. Any advice would be greatly appreciated. Feel free to be language agnostic as the architecture is what I'm trying to figure out.
We have a base search criteria class, onto whose properties (fields) all request parameters relevant to DataTables are mapped, and a custom search criteria class that extends the base and contains the business-specific fields for custom search. On the server side we also have a repository class that takes the custom search criteria as an argument and queries the database.
If you are familiar with C#, you could check out custom binding code and example of usage.
You could do such custom binding in your Java code as well.
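A rough Java sketch of that arrangement is shown below, under several assumptions: the field names, the `orders` table and its columns are invented for illustration, and the repository only builds the filtered-count SQL plus its parameter list instead of executing it. The point is that the WHERE clause and the parameter list grow together, which addresses the "variable number of parameters" problem from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Base criteria: values every DataTables request carries (names are illustrative).
class DataTablesCriteria {
    int start;      // offset of the first row to return
    int length;     // page size
    String search;  // global search text, may be null
}

// Custom criteria: adds business-specific filters, here a hypothetical date range.
class OrderCriteria extends DataTablesCriteria {
    String fromDate; // inclusive lower bound, may be null
    String toDate;   // inclusive upper bound, may be null
}

// SQL text together with the values for its '?' placeholders, in order.
final class FilterQuery {
    final String sql;
    final List<Object> params;

    FilterQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }
}

final class OrderRepository {
    private OrderRepository() {}

    // Builds the filtered count query; each optional filter appends one
    // condition and one parameter, so placeholders and values stay aligned.
    static FilterQuery filteredCount(OrderCriteria c) {
        StringBuilder where = new StringBuilder(" WHERE 1=1");
        List<Object> params = new ArrayList<>();
        if (c.search != null && !c.search.isEmpty()) {
            where.append(" AND customer_name LIKE ?");
            params.add("%" + c.search + "%");
        }
        if (c.fromDate != null) {
            where.append(" AND order_date >= ?");
            params.add(c.fromDate);
        }
        if (c.toDate != null) {
            where.append(" AND order_date <= ?");
            params.add(c.toDate);
        }
        return new FilterQuery("SELECT COUNT(*) FROM orders" + where, params);
    }
}
```

The same WHERE fragment and parameter list could be reused for the data query, so only the SELECT list, ORDER BY and paging parts differ between the three queries. Binding would then be a loop calling `PreparedStatement.setObject(i + 1, params.get(i))`.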
Let's presume that you are writing an application for a retail store chain. You would design your object model with 'Store' as the core business object, plus lots of supporting objects. Let's say 'Store' looks as follows:
class Store implements Validatable {
    int storeNo;
    String storeName;
    ... etc....
}
So, your client tells you that you have to import the store schedule from an Excel sheet into the application, and you have to run a series of validations on the stores. For instance, 'StoreIsInSameCountry', 'StoreIsValid', etc. So, you would design a Rule interface for checking all business conditions. Something like this:
interface Rule<T extends Validatable> {
    public Error check(T value) throws Exception;
}
Now, here comes the question. I am uploading 2000 stores from this Excel sheet, so I would end up running each rule defined for a store that many times. With 4 rules, that means 8000 queries to the database, i.e. 16000 hits to the connection pool. For a simple check where I just have to check whether the store exists or not, the query would be:
SELECT STORE_ATTRIB1, STORE_ATTRIB2... from STORE where STORE_ID = ?
That way I would obtain my 'Store' object. When I don't get anything from the database, then that store doesn't exist. So, for such a simple check, I would have to hit the database 2000 times for 2000 stores.
Alternatively, I could just do:
SELECT STORE_ATTRIB1, STORE_ATTRIB2... from STORE where STORE_ID in (1,2,3..... )
This query would actually return much faster than doing the one above it 2000 times.
However, it doesn't go well with the design that a Rule can be run for a single store only.
I know using IN is not a suggested methodology. So, what do you think I should be doing? Should I go ahead and use IN here, because it gives better performance in this scenario? Or should I change my design?
What would you do if you were in my shoes, and what is the best practice?
"That way I would obtain my 'Store' object from the database. When I don't get anything from the database, then that store doesn't exist. So, for such a simple check, I would have to hit the database 2000 times for 2000 stores."
This is what you should not do.
Create a temporary table, fill the table with your values and JOIN this table, like this:
SELECT STORE_ATTRIB1, STORE_ATTRIB2...
FROM temptable tt
JOIN STORE s
ON s.STORE_ID = tt.id
or this:
SELECT STORE_ATTRIB1, STORE_ATTRIB2...
FROM STORE s
WHERE s.STORE_ID IN
(
SELECT id
FROM temptable tt
)
"I know using IN is not a suggested methodology. So, what do you think I should be doing? Should I go ahead and use IN here, coz it gives better performance in this scenario? Or should I change my design?"
IN filters duplicates out.
If you want each eligible row to be selected for each duplicate value in the list, use JOIN.
IN is in no way a "not suggested methodology".
In fact, there was a time when some databases did not support IN queries efficiently, which is why folk wisdom still advises against using it.
But if your store_id is indexed properly (and it most probably is, if it's a PRIMARY KEY, which it looks like), then all modern versions of the major databases (that is, Oracle, SQL Server, MySQL and PostgreSQL) will use an efficient plan to perform this query.
See this article in my blog for performance details in SQL Server:
IN vs. JOIN vs. EXISTS
Note, that in a properly designed database, validation rules are also set-based.
I. e. you implement your validation rules as queries against the temptable.
However, to support legacy rules, you can select values from temptable row-by-agonizing-row, apply the rules, and delete values which did not pass validation.
SELECT store_id FROM store WHERE store_active = 1
or even
SELECT store_id FROM store
will tell you all the active stores in a single query. You can now conduct the other tests on stores you know to exist, and you've saved yourself 1,999 hits to the database.
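As a sketch of how that single query could feed the existence check, the helper below takes the IDs read from the Excel sheet and the set of IDs returned by one `SELECT store_id FROM store` call, and reports which uploaded stores are unknown. It is plain Java with no database access; the class and method names are illustrative, and loading `knownIds` is assumed to happen elsewhere.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

final class StoreExistenceCheck {
    private StoreExistenceCheck() {}

    // knownIds is assumed to hold the result of one "SELECT store_id FROM store"
    // query. Returns the uploaded IDs that are not in that set, in upload order.
    static List<Integer> missingStores(Collection<Integer> uploadedIds, Set<Integer> knownIds) {
        List<Integer> missing = new ArrayList<>();
        for (Integer id : uploadedIds) {
            if (!knownIds.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }
}
```

With a `HashSet` for `knownIds`, each lookup is a constant-time in-memory check, so 2000 uploaded stores cost one database round trip instead of 2000. The per-store rules can then run only on the stores that passed this check.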
If you've got relatively uncontested database access, and no time constraint on how long the whole thing is going to take then you've no real need to worry about hitting the connection pool over and over again. That's what it's designed for, after all!
I think it's more of a business question, with the parameters being how often the client runs the import, how long it would take you to implement either solution, and how expensive your time is per hour.
If it's something that runs once in a while, a bit of bad performance is acceptable in my opinion, especially if you can get the job done quick using clean code.
"...a Rule can be run for a single store only."
Managing business rules along with performance is a tricky task, so there is a library ("Persistence Layer") that does exactly that. You define rules, then execute a batch of commands; the library then fetches from the DB whatever the rules require in a single query (by using temp tables rather than 'IN') and passes it to the rules.
There is an example of a validator here.