Mixing Solr range function with additional parameters

I have a range function in a Solr fq that works as expected:
{!frange l=1 u=2}sum(termfreq(tags,'twitter'),termfreq(tags,'facebook'),termfreq(tags,'pinterest'))
However, if I try to further refine it by adding an additional parameter to the end:
{!frange l=1 u=2}sum(termfreq(tags,'twitter'),termfreq(tags,'facebook'),termfreq(tags,'pinterest')) AND (region:"US")
I get the error: org.apache.solr.search.SyntaxError: Unexpected text after function: AND (region:"US")
If I try to prepend an additional parameter:
(region:"US") AND {!frange l=1 u=2}sum(termfreq(tags,'twitter'),termfreq(tags,'facebook'),termfreq(tags,'pinterest'))
I get the error: org.apache.solr.search.SyntaxError: Expected ')' at position 27 in 'sum(termfreq(tags,'twitter''
I've tried wrapping the range portion in additional parentheses, but still no luck. How can I combine a range function with additional query parameters?

OK, I solved what I needed to. I was running commands from the Solr admin dashboard. Although I wasn't able to mix the frange command above with other queries in fq, I was able to put the frange command in q and keep the other queries in fq:
q: {!frange l=1 u=2}sum(termfreq(tags,'twitter'),termfreq(tags,'facebook'),termfreq(tags,'pinterest'))
fq: (region:"US")
I answered a similar question here.
My issue is solved, but I'll leave this unanswered for a few days in case someone knows of a better way.

Please see my answer to "Nested functional range query with OR", which solves this problem using Solr's query feature.

You can also keep adding fq= parameters. Multiple fq parameters are combined as an AND (each one further restricts the result set). For example:
&fq={!frange l=1 u=2}sum(termfreq(tags,'twitter'),termfreq(tags,'facebook'),termfreq(tags,'pinterest'))&fq=region:"US"
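If you are building the request from Java rather than the admin dashboard, the same idea can be sketched with plain string handling. This is only an illustration: the class and method names are made up for this example, no Solr client library is involved, and you would still need to prepend your own host, core and handler path.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SolrFilterUrl {

    // Builds the query-string part of a select request.
    // Each filter becomes its own fq parameter; Solr intersects them (AND).
    static String buildQueryString(String q, String... filterQueries) {
        StringBuilder sb = new StringBuilder("q=").append(encode(q));
        for (String fq : filterQueries) {
            sb.append("&fq=").append(encode(fq));
        }
        return sb.toString();
    }

    // URL-encodes one parameter value (requires Java 10+ for the Charset overload).
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String frange = "{!frange l=1 u=2}sum(termfreq(tags,'twitter'),"
                + "termfreq(tags,'facebook'),termfreq(tags,'pinterest'))";
        // The frange filter and the region filter are kept in separate fq parameters.
        System.out.println(buildQueryString("*:*", frange, "region:\"US\""));
    }
}
```

Keeping the frange expression in its own fq sidesteps the parsing problem from the question, because the function parser then sees nothing after the closing parenthesis.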

Related

Subtle difference when searching multi value fields in Solr

I have a very simple question but I don't understand exactly why it happens and what the difference is.
Take a simple Solr search on a multi value field:
field_name:ABC AND DEF
field_name:(ABC AND DEF)
They return quite different results. I understand the brackets are for grouping but I don't understand the difference. It seems quite subtle.
Many thanks.

The first query isn't doing what you think it's doing.
field_name:ABC AND DEF
This is parsed as:
field_name:ABC AND <default search field>:DEF
This is different from your second example, which is parsed as:
field_name:ABC AND field_name:DEF
In the first example the second part of your query is made against whatever field is defined as the default search field in your index (or in the query itself, if you've set df).

Integrality Gap in MAXIMIZATION in CPLEX+Java | Bug?

I am trying to solve a large MIP. If it is not solved to optimality, the program should return the integrality gap (that is, the difference between the best integer solution and the best solution of the linear relaxation).
Using getMIPRelativeGap of the CPLEX Java interface, I sometimes get values in the range of 1.0E11 to 1.0E13, which does not make sense, as an integrality gap should be a value between 0 and 1. I tracked those cases down and found that I get those results when the best integer solution has a value of 0 (my inner problem is a profitable tour problem, so this happens when the best route does not visit any vertex). The integrality gap should be (bestobjective-bestinteger)/bestobjective (https://www.ibm.com/support/knowledgecenter/SSSA5P_12.6.0/ilog.odms.cplex.help/refdotnetcplex/html/M_ILOG_CPLEX_Cplex_MIPInfoCallback_GetMIPRelativeGap.htm), yet it seems to be (bestobjective-bestinteger)/bestinteger.
I also tested a couple of other values (where the integer objective is positive), and was able to confirm this in examples.
Can someone else reproduce this behavior? Does this behavior make sense to you?
Thanks :)

Indeed, the documentation for CPXgetmiprelgap in the Callable Library (C API) says the following:
For a minimization problem, this value is computed by
(bestinteger - bestobjective) / (1e-10 + |bestinteger|)
where bestinteger is the value returned by CPXXgetobjval/CPXgetobjval
and bestobjective is the value returned by
CPXXgetbestobjval/CPXgetbestobjval. For a maximization problem,
the value is computed by:
(bestobjective - bestinteger) / (1e-10 + |bestinteger|)
So, it looks like the documentation for the Java API is buggy. The Java API just calls CPXgetmiprelgap under the hood, so it should be the same. Thanks for reporting this. I'll make sure that this gets passed on to the folks who can fix it.
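To see how the quoted formula produces the huge values from the question, here is a small stand-alone sketch that just evaluates the maximization formula in plain Java. It does not call CPLEX; the class and method names are made up for illustration, and the sample numbers are arbitrary.

```java
public class MipGap {

    // Relative gap for a maximization problem, following the quoted formula:
    // (bestobjective - bestinteger) / (1e-10 + |bestinteger|)
    static double relativeGapMax(double bestObjective, double bestInteger) {
        return (bestObjective - bestInteger) / (1e-10 + Math.abs(bestInteger));
    }

    public static void main(String[] args) {
        // Ordinary case: incumbent 100, bound 110, gap of about 10%.
        System.out.println(relativeGapMax(110.0, 100.0));
        // Incumbent of 0: the denominator collapses to 1e-10,
        // so a bound of 10 yields a gap on the order of 1e11.
        System.out.println(relativeGapMax(10.0, 0.0));
    }
}
```

With an incumbent of exactly 0, any bound between 10 and 1000 lands in the 1.0E11 to 1.0E13 range reported in the question.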

"Escaped hexadecimal" to boolean

I'm working with HBase on a project and running into a seemingly simple situation that is throwing me for a loop. HBase can store table values as escaped hexadecimal. In my case, I have true/false being stored as \x00 and \xFF, respectively.
The problem is (besides being unfamiliar with Java) I need to find a way to convert these to bool, or at least to compare them in a like-bool situation. They will never be anything other than \x00 and \xFF.
Is there not an elegant way to do this?
Please help, I'm really stuck.
Edit: This is probably relevant: "Hbase shell - how to write byte value"

I suspect you could do something like... Hex ->binary->boolean.
But there might even be a toBoolean method already.
Or you could override the compare method they're using. But this could yield undesirable effects.
Can you post the API for the class you're using?

Ok, apparently there is a Bytes.toBoolean() function.
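I'm assuming this refers to org.apache.hadoop.hbase.util.Bytes. As far as I know, that helper expects a one-byte array and treats any non-zero byte as true, so \xFF reads as true and \x00 as false. The sketch below mimics that behaviour in plain Java without the HBase dependency; the class name is made up, and you should check the actual method in your HBase version.

```java
public class ByteFlag {

    // Stand-in for what Bytes.toBoolean is assumed to do:
    // a single byte, where zero means false and anything else means true.
    static boolean toBoolean(byte[] value) {
        if (value == null || value.length != 1) {
            throw new IllegalArgumentException("Expected exactly one byte");
        }
        return value[0] != (byte) 0;
    }

    public static void main(String[] args) {
        System.out.println(toBoolean(new byte[] {(byte) 0xFF})); // \xFF
        System.out.println(toBoolean(new byte[] {(byte) 0x00})); // \x00
    }
}
```

If your table really uses the opposite mapping (\x00 for true), negate the result.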

Optional parameters in cucumber throw error

I think I am copying the answer from here, but I still cannot get optional parameters to work. The two steps run independently, I just wanted to try and combine them.
Scenario:
Then(~/^set timeout(?: at (\d+) (min|hr))?$/) { int duration , String units ->
Works for
And set timeout at 30 min
But not for:
And set timeout
Which throws this error
groovy.lang.MissingMethodException: No signature of method: CucumberTestSteps$_run_closure56.doCall() is applicable for argument types: (null, null) values: [null, null]
Possible solutions: doCall(int, java.lang.String), findAll(), findAll()
I've tried several other random locations for '?:' and '?' with no luck. Several web searches all suggest that this syntax should work.
Cucumber recognizes it as a valid test because when I add
Then(~/^set timeout$/)
It recognises it as a duplicate step
cucumber.runtime.AmbiguousStepDefinitionsException: ✽.Then set timeout(test.feature:57) matches more than one step definition:
^set timeout$ in CucumberTestSteps.groovy:1128
^set timeout(?: at (\d+) (min|hr))?$ in CucumberTestSteps.groovy:1148

I know I'm too late for this answer, but I had the same issue today and was able to resolve it. Hopefully, this answer will help those looking for the solution to this problem. Apparently, in case of optional parameters, it passes null values to the parameters.
The problem happens because your method has an int instead of Integer. In my case, I changed the int to Integer and did a null check before proceeding. That solved the issue.
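The underlying behaviour can be reproduced with plain java.util.regex, without Cucumber: when the optional non-capturing group does not participate in the match, the capture groups inside it come back as null, which cannot be assigned to an int. A minimal sketch (the class and method names are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroups {

    // Same pattern as the step definition in the question.
    static final Pattern STEP =
            Pattern.compile("^set timeout(?: at (\\d+) (min|hr))?$");

    // Returns the duration, or null when the optional part is absent.
    // The return type is Integer (not int) so that null can be represented.
    static Integer parseDuration(String stepText) {
        Matcher m = STEP.matcher(stepText);
        if (!m.matches()) {
            throw new IllegalArgumentException("Step does not match: " + stepText);
        }
        String raw = m.group(1); // null if the optional group did not match
        return raw == null ? null : Integer.valueOf(raw);
    }

    public static void main(String[] args) {
        System.out.println(parseDuration("set timeout at 30 min"));
        System.out.println(parseDuration("set timeout"));
    }
}
```

In the Groovy step, the equivalent change is declaring the closure parameter as Integer and checking it for null before use.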

Lucene Solr using complex filters

I am currently having a problem with specifying filters for Lucene/Solr. Every solution I come up with breaks other solutions. Let me start with an example. Assume that we have the following 5 documents:
doc1 = [type:Car, sold:false, owner:John]
doc2 = [type:Bike, productID:1, owner:Brian]
doc3 = [type:Car, sold:true, owner:Mike]
doc4 = [type:Bike, productID:2, owner:Josh]
doc5 = [type:Car, sold:false, owner:John]
So I need to construct the following filter queries:
1. Give me all documents of type:Car that have sold:false only, and if a document has a type other than Car, include it in the result. So basically I want docs 1, 2, 4, 5; the only document I don't want is doc3, because it has sold:true. To put it more precisely:
for each document d in solr/lucene
    if d.type == Car {
        if d.sold == false, then add to result
        else ignore
    }
    else {
        add to result
    }
return result
2. Include all documents that are (type:Car and sold:false) or (type:Bike and productID:1). For this I should get 1, 2, 5.
3. Get all documents such that, if the type is Car, only those with sold:false are included; otherwise, include documents from owners John, Brian, Josh. For this query I should get 1, 2, 4, 5.
Note: You don't know all the types in the documents. Here it is obvious because of the small number of documents.
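To pin down the intended semantics independently of Solr's query parser, the three filters can be written as plain Java predicates over the five sample documents. This is only an in-memory reference for the expected results, not a Solr or Lucene query; the Doc class and all method names are made up for this sketch.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FilterSemantics {

    // Minimal stand-in for an indexed document; absent fields are null.
    static final class Doc {
        final int id; final String type; final Boolean sold;
        final Integer productId; final String owner;
        Doc(int id, String type, Boolean sold, Integer productId, String owner) {
            this.id = id; this.type = type; this.sold = sold;
            this.productId = productId; this.owner = owner;
        }
    }

    static final List<Doc> DOCS = List.of(
            new Doc(1, "Car", false, null, "John"),
            new Doc(2, "Bike", null, 1, "Brian"),
            new Doc(3, "Car", true, null, "Mike"),
            new Doc(4, "Bike", null, 2, "Josh"),
            new Doc(5, "Car", false, null, "John"));

    static boolean isUnsoldCar(Doc d) {
        return "Car".equals(d.type) && Boolean.FALSE.equals(d.sold);
    }

    // Filter 1: unsold cars, plus every document that is not a car.
    static boolean unsoldCarsOrAnyOtherType(Doc d) {
        return !"Car".equals(d.type) || isUnsoldCar(d);
    }

    // Filter 2: unsold cars, or bikes with productID 1.
    static boolean unsoldCarsOrBikeOne(Doc d) {
        return isUnsoldCar(d)
                || ("Bike".equals(d.type) && Integer.valueOf(1).equals(d.productId));
    }

    // Filter 3: cars only if unsold; non-cars only from the listed owners.
    static boolean unsoldCarsOrListedOwners(Doc d) {
        if ("Car".equals(d.type)) {
            return isUnsoldCar(d);
        }
        return Set.of("John", "Brian", "Josh").contains(d.owner);
    }

    // Ids of the sample documents accepted by a filter, in document order.
    static List<Integer> ids(Predicate<Doc> filter) {
        return DOCS.stream().filter(filter).map(d -> d.id).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(ids(FilterSemantics::unsoldCarsOrAnyOtherType));
        System.out.println(ids(FilterSemantics::unsoldCarsOrBikeOne));
        System.out.println(ids(FilterSemantics::unsoldCarsOrListedOwners));
    }
}
```

None of the predicates needs to know the full set of types in advance, which matches the note above.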
So my solutions were:
1. (-type:Car) OR ((type:Car) AND (sold:false)). This works fine and as expected.
2. ((-type:Car) OR ((type:Car) AND (sold:false))) AND ((-type:Bike) OR ((type:Bike) AND (productID:1))). This solution does not work.
3. ((owner:John) OR (owner:Brian) OR (owner:Josh)) AND ((-type:Car) OR ((type:Car) AND (sold:false))). This does not work. I can make it work if I do this: ((owner:John) OR (owner:Brian) OR (owner:Josh)) AND ((version:* OR (-type:Car)) OR ((type:Car) AND (sold:false))). I don't understand why this makes a difference: logically the original query should work, but Solr/Lucene evidently does something different.

Okay, to get anything but a sold car, you could use -(type:Car AND sold:true).
This can be incorporated into the other queries, but you'll need to be careful with lonely negative queries like this. Lucene doesn't handle them well, generally speaking, and Solr has some odd gotchas as well. Particularly, A -B reads more like "get all A but forbid B" rather than "get all A and anything but B". Similar problem with A or -B, see this question for more.
To get around that, you'll need to surround the negative with an extra set of parentheses, to ensure it is understood by Solr to be a standalone negative query, like: (-(type:Car AND sold:true))
So:
1. -(type:Car AND sold:true) (This doesn't get the result you stated, but as per my comment, I don't really understand your stated results)
2. (type:Bike AND productID:1) (-(type:Car AND sold:true)) (You actually wrote this in the description of the problem!)
3. (-(type:Car AND sold:false)) owner:(John Brian Josh)

My advice is to use programmatic Lucene (that is, directly in Java using the Java Lucene API) rather than issuing text queries which will be interpreted. This will give you much more fine-grained control.
What you're going to want to do is construct a Lucene Filter object using the QueryWrapperFilter API. A QueryWrapperFilter is a filter which takes a Lucene Query and filters out any documents which do not match that query.
In order to use QueryWrapperFilter, you'll need to construct a Query which matches the terms you're interested in. The best way to do this is to use TermQuery:
TermQuery tq = new TermQuery(new Term("fieldname", "value"));
As you might have guessed, you'll want to replace "fieldname" with the name of a field, and "value" with a desired value. For example, from your example in the OP, you might want to do something like new Term("type", "Car").
This only matches a single term. You're going to need multiple TermQueries, and a way to combine them to create a single, larger query. The best way to do this is with BooleanQuery:
BooleanQuery bq = new BooleanQuery();
bq.add(tq, BooleanClause.Occur.MUST);
You can call bq.add as many times as you want, once for each TermQuery that you have. The second argument specifies how strict the query is. It can specify that a sub-query MUST appear, SHOULD appear, or MUST_NOT appear (these are the three values of the BooleanClause.Occur enum).
After you've added each of the sub-queries, this BooleanQuery represents the full query which will match only the documents you ask for. However, it's still not a filter. We now need to feed it to QueryWrapperFilter, which will give us back a filter object:
QueryWrapperFilter qwf = new QueryWrapperFilter(bq);
That should do it. Then if you want to run queries over only the documents allowed through by that filter, you just take your new query (call it q) and your filter, and create a FilteredQuery:
FilteredQuery fq = new FilteredQuery(q, qwf);
