Is there any way to query GAE datastore with filter similar to SQL LIKE statement? For example, if a class has a string field, and I want to find all classes that have some specific keyword in that string, how can I do that?
It looks like JDOQL's matches() doesn't work... Am I missing something?
Any comments, links or code fragments are welcome.
As the GAE/J docs say, BigTable doesn't have such native support. You can use JDOQL String.matches for "something%" (i.e. startsWith), and that's all there is. For anything else, evaluate it in memory.
If you have a lot of items to examine, you want to avoid loading them at all. The best way would probably be to break down the inputs at write time. If you are only searching by whole words, then that is easy.
For example, "Hello world" becomes "Hello", "world" - just add both to a multi-valued property. If you have a lot of text, you want to avoid loading the multi-valued property, because you only need it for the index lookup. You can do this by creating a "Relation Index Entity" - see Brett Slatkin's Google I/O talk for details.
You may also want to break down the input into 3-character, 4-character, etc. strings, or stem the words - perhaps with a Lucene stemmer.
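A minimal sketch of that write-time breakdown, in plain Java. The class and method names are made up for illustration, and lower-casing the tokens is my own assumption (the answer above keeps the original case). The resulting set is what you would store in the multi-valued property, so that an equality filter on that property can match any single token.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: builds the token set to store alongside the entity.
public class SearchTokens {

    // Whole words only: "Hello world" -> [hello, world]
    static Set<String> words(String text) {
        Set<String> tokens = new LinkedHashSet<>();
        // Split on anything that is not a letter or a digit.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // Whole words plus every substring of at least minLen characters,
    // so that a search for "ell" can also find "hello".
    static Set<String> withNGrams(String text, int minLen) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String word : words(text)) {
            tokens.add(word);
            for (int len = minLen; len < word.length(); len++) {
                for (int start = 0; start + len <= word.length(); start++) {
                    tokens.add(word.substring(start, start + len));
                }
            }
        }
        return tokens;
    }
}
```

Keep in mind that the n-gram variant multiplies the number of index entries per entity, so it is probably only worth it for short fields.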
I have a Java (Lucene 4) based application and a set of keywords fed into the application as a search query (a term may consist of more than one word, e.g. "memory", "old house", "European Union law", etc.).
I need a way to get the list of matched keywords out of an indexed document and possibly also get the keyword positions in the document (also for the multi-word keywords).
I tried the Lucene highlight package, but I need to get only the keywords without any surrounding text. It also returns multi-word keywords as separate fragments.
I would greatly appreciate any help.
There's a similar (possibly same) question here:
Get matched terms from Lucene query
Did you see this?
The solution suggested there is to break a complicated query down into simpler queries until you get a TermQuery, and then check via searcher.explain(query, docId) (because if it matches, you know that's the term).
I think it's not very efficient, but it worked for me until I ran into SpanQueries. It might be enough for you.
I'm having trouble with a simple query to get strings from the Realm engine in Java for an Android app.
As said in the title of my topic, I want to get diacritic-insensitive results from my query.
Example:
If the user types the word "securite", I want my query to return "securite" and "sécurité".
How can I do that?
Thanks a lot in advance for your help!
Realm doesn't support that currently. Depending on how much of the data you control, you can add a "normalized" field and use it in your search. There is an approach described here: Remove diacritics from string in Java
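For what it's worth, a minimal sketch of that normalization using only java.text.Normalizer. The class name is made up, and lower-casing the result is my own assumption. You would store the result in the extra "normalized" field when saving the object, run the user's input through the same method, and query the normalized field instead of the original one.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for building the value of the "normalized" field.
public class SearchKey {

    static String normalize(String input) {
        // NFD splits "é" into "e" followed by a combining acute accent.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches the combining marks left over after decomposition.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }
}
```

Letters that have no canonical decomposition (for example "ø" or "ł") are left unchanged by this, so they would need an explicit mapping if they matter for your data.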
This is not possible in Realm at the moment. Your only option is to manage tables containing all the possibilities for each letter of the alphabet you are interested in, something like [a, á, à, å, etc.], and then for each string compute all the possible permutations and build a huge query with equalTo() and or(). It would probably take longer to build such a query than to execute it, but that's a very interesting use case! If you end up implementing it, I would love to know the results!
I'm using Sqlite with Android (Java).
I have a database that contains texts with Hebrew punctuation.
My problem is that when I do a SELECT for a certain value (without punctuation), I don't get all the results. I guess the DB is not ignoring the punctuation and is treating the punctuation marks as normal characters.
After doing a search, I found some answers which say I should register a collation for it (sqlite3_create_collation).
As I've never used collations, I would appreciate a hint on how to register one and use it to get the full result I want.
For example:
SELECT * FROM sometable WHERE punctuated_field LIKE '%re%'
I would like to get both the following:
dream
drém
Currently I'm getting just:
dream
I read this relevant answer but didn't manage to understand how to implement it within my query or the Java code.
I would be happy if someone could write the full query I need to use in my code.
Thanks in advance!
The Android API does not allow registering custom collations.
You have to make do with the built-in collations, or with Android's LOCALIZED and UNICODE collations.
Since the Android sqlite API doesn't expose anything to set up custom collations, you'll have to figure some other way to solve the problem.
One is to add another column where you store the strings normalized, i.e. with the accent marks ("punctuation" as you call it) removed. Then do your LIKE matching on this normalized column and use the original column for display purposes. The cost of this is larger data size and some extra code when inserting into the database.
I've described one such normalization approach here:
How to ignore accent in SQLite query (Android) - I have no idea how well that works with Hebrew chars though.
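A rough sketch of the normalized-column idea with Hebrew in mind. It relies on the Hebrew vowel points being Unicode nonspacing marks (category Mn), so they can be dropped after decomposition in the same way as Latin accents. The class, the method names and the column name normalized_field are invented for the example, and I have only reasoned this through for simple pointed text, so test it with your own data.

```java
import java.text.Normalizer;

// Hypothetical helper: one method for filling the normalized column,
// one for building the argument of a LIKE query against it.
public class HebrewSearchKey {

    // Value to store in the extra column when inserting a row.
    static String stripMarks(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{Mn} covers nonspacing marks, including Hebrew points.
        return decomposed.replaceAll("\\p{Mn}+", "");
    }

    // Argument for: ... WHERE normalized_field LIKE ? ESCAPE '\'
    // The user's own %, _ and \ are escaped so they match literally.
    static String likePattern(String userInput) {
        String key = stripMarks(userInput)
                .replace("\\", "\\\\")
                .replace("%", "\\%")
                .replace("_", "\\_");
        return "%" + key + "%";
    }
}
```

Passing the pattern as a bound argument rather than concatenating it into the SQL string also keeps quotes in the user's input from breaking the query.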
I just realized that in my forms I can't save a name like O'Brian (it gets saved as O only, and 'Brian is truncated).
I'm using Grails 1.2.2 with MySQL.
Is there a simple way to allow ' to be inserted into the DB, rather than modifying each form and putting in an HTML replacement for that character?
If inserting into the database is the problem, then you can use parameterized queries. This is strongly recommended anyway, since it avoids possible security risks.
Imagine if instead of entering just a quote character, the user enters "Brian'; DROP TABLE data" into your form!
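To illustrate what a parameterized query looks like at the plain JDBC level, here is a minimal sketch. The table and column names (person, last_name) and the class are invented for the example, and you still need to obtain the Connection from your own data source.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO; table and column names are placeholders.
public class PersonDao {

    // The value is represented by "?", never concatenated into the SQL text.
    static final String INSERT_SQL = "INSERT INTO person (last_name) VALUES (?)";

    static int insertLastName(Connection connection, String lastName) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            // The driver passes the value as data, so a quote in "O'Brian"
            // cannot terminate the SQL string literal.
            statement.setString(1, lastName);
            return statement.executeUpdate();
        }
    }
}
```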
Use the escape character, \
e.g. O\'Brian
See http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html
That said, most DB abstraction layers will allow you to use parameterized queries that do this for you.
Grails and its database abstraction GORM should handle that for you, unless you are saving it yourself using some lower-level APIs. See the documentation here.
You should not need to replace such characters yourself, so I suggest you have another look at your code and see if you can spot what might cause the problem. I hope you can find an easy solution, it shouldn't be hard with Grails :-)
I have a Java based web-application and a new requirement to allow Users to place variables into text fields that are replaced when a document or other output is produced. How have others gone about this?
I was thinking of having a pre-defined set of variables such as :
#BOOKING_NUMBER#
#INVOICE_NUMBER#
Then when a user enters some text they can specify a variable inline (select it from a modal or similar). For example:
"This is some text for Booking #BOOKING_NUMBER# that is needed by me"
When producing some output (eg. PDF) that uses this text, I would do a regex and find all variables and replace them with the correct value:
"This is some text for Booking 10001 that is needed by me"
My initial thought was something like Freemarker but I think that is too complex for my Users and would require them to know my DataModel (eww).
Thanks for reading!
D.
Have a look at java.text.MessageFormat - particularly the format method - as this is designed for exactly what you are looking for.
i.e.
MessageFormat.format("This is some text for booking {0} that is needed by me, for use with invoice {1}", bookingNumber, invoiceNumber);
You may even want to get the template text from a resource bundle, to allow for support of multiple languages, with the added ability to cope with the fact that {0} and {1} may appear in a different order in some languages.
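A small sketch of both points: placeholders appearing in a different order, and one thing to watch out for. By default MessageFormat formats numeric arguments with locale-specific grouping, so a booking number passed as an Integer can come out as "10,001". The {0,number,#} style below turns the grouping off. The class and method names are made up for the example.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper around java.text.MessageFormat.
public class BookingTemplateDemo {

    static String render(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] unused) {
        Object[] values = {10001, "INV-7"};
        // Same arguments, different placeholder order per template.
        System.out.println(render("Booking {0,number,#}, invoice {1}", Locale.US, values));
        System.out.println(render("Invoice {1} for booking {0,number,#}", Locale.US, values));
        // Plain {0} applies number grouping: "Booking 10,001" in the US locale.
        System.out.println(render("Booking {0}", Locale.US, values));
    }
}
```

Passing the numbers as strings avoids the grouping as well. Also note that a single quote is an escape character in MessageFormat patterns, so user-written template text containing apostrophes needs them doubled.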
UPDATE:
I just read your original post properly, and noticed the comment about the PDF.
This suggests that the template text is going to be significantly larger than a line or two.
In such cases, you may want to explore something like StringTemplate which seems better suited for this purpose - this comment is based solely on initial investigations, as I've not used it in anger.
I have used a similar replacement token system before. I personally like something like:
[MYVALUE]
As it is easy for the user to type, and then I just use replacements to swap out the tokens for the real data.
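A minimal sketch of such a token replacement with java.util.regex, assuming tokens are upper-case names in square brackets. The class name and the choice to leave unknown tokens untouched are mine. The same approach should work for #BOOKING_NUMBER# style tokens by changing the pattern.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps [TOKEN] placeholders for real values.
public class TokenReplacer {

    // Matches e.g. [BOOKING_NUMBER]; group 1 is the name without brackets.
    private static final Pattern TOKEN = Pattern.compile("\\[([A-Z_]+)\\]");

    static String render(String template, Map<String, String> values) {
        Matcher matcher = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            // Unknown tokens are left as typed so the user can spot them.
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps "$" and "\" in values from being
            // interpreted as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With a map containing BOOKING_NUMBER = "10001", the text "This is some text for Booking [BOOKING_NUMBER] that is needed by me" comes out as "This is some text for Booking 10001 that is needed by me".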