Solr string field search with special characters - java

I have just started to work with Solr. There is a phone field, defined in the schema as follows:
<field docValues="true" indexed="true" multiValued="true" name="phones" stored="true" type="StrField"/>
From my understanding, a string field only does exact matches, but users may type the phone number in any format, including special characters, for example (111) 111-1111. So I used ClientUtils.escapeQueryChars to add a backslash before the special characters, but the search returns no results. I have been trying to understand why. Are there special characters that cannot be escaped for a string field? I don't think the tokenizer matters, since it is a string field, and I use the edismax parser. Any ideas?

Using Solr 7.3.1, I reproduced what you've asked and can confirm that as long as you escape (, ) and the space properly, you'll get the hits you're looking for.
Schema
id: string
phones: string (multivalued, docvalues, indexed, stored)
Documents
{
"id":"doc1",
"phones":["(111) 111-1111"],
"_version_":1602190176246824960
},
{
"id":"doc2",
"phones":["111 111-1111"],
"_version_":1602190397829808128
},
{
"id":"doc3",
"phones":["111 (111)-1111"],
"_version_":1602190400002457600
}
Query
/select?q=phones:\(111\)\ 111-1111
{
"id":"doc1",
"phones":["(111) 111-1111"],
"_version_":1602190176246824960}]
}
/select?debugQuery=on&q=phones:111\ 111-1111
{
"id":"doc2",
"phones":["111 111-1111"],
"_version_":1602190397829808128}]
}
/select?debugQuery=on&q=phones:1111111111
"response":{"numFound":0,"start":0,"docs":[]}
The behavior is exactly as described - exact matches only.
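The escaped queries above can also be produced in code. The following is a minimal, library-free sketch of the kind of escaping that ClientUtils.escapeQueryChars performs. The class name and the exact character list are assumptions made for illustration, so check the SolrJ source for the authoritative set. Note that it also escapes the hyphen, which the hand-written queries above do not; that should be harmless because a backslash makes the next character literal.

```java
class SolrEscapeSketch {
    // Characters treated as query syntax in this sketch (assumed to mirror SolrJ's list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Prefix every special character and every whitespace character with a backslash.
    static String escapeQueryChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build a field query for the phones field from raw user input.
        String q = "phones:" + escapeQueryChars("(111) 111-1111");
        System.out.println(q); // phones:\(111\)\ 111\-1111
    }
}
```

When the query is sent over HTTP, the escaped string still has to be URL-encoded; SolrJ does that for you when you pass the string to a SolrQuery.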
Getting the behavior you want with PatternReplaceCharFilterFactory
Let's create a custom field type that removes anything that's not a number or letter:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type" : {
"name":"phoneStripped",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer" : {
"charFilters":[{
"class":"solr.PatternReplaceCharFilterFactory",
"replacement":"",
"pattern":"[^a-zA-Z0-9]"
}],
"tokenizer":{
"class":"solr.KeywordTokenizerFactory"
}
}
}
}' http://localhost:8983/solr/foo/schema
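To see what this char filter does to a value, here is a small plain-Java sketch that applies the same pattern and replacement. It only imitates the filter outside of Solr; the class and method names are made up for illustration. Because the field type defines a single analyzer, the same normalization is applied to indexed values and to query text, which is why differently formatted numbers end up matching.

```java
class PhoneStripSketch {
    // Same pattern and replacement as the PatternReplaceCharFilterFactory config:
    // drop everything that is not an ASCII letter or digit.
    static String strip(String phone) {
        return phone.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        String[] inputs = {"(111) 111-1111", "111 111-1111", "111 (111)-1111", "(111) 111-1112"};
        for (String in : inputs) {
            // The first three inputs print the same token; the last one differs in its final digit.
            System.out.println(in + " -> " + strip(in));
        }
    }
}
```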
Then we create a new field named phone_stripped using this new field type (you can do this in the UI), and reindex our documents - now using the new field name:
{
"id":"doc1",
"phone_stripped":"(111) 111-1111"
},
{
"id":"doc3",
"phone_stripped":"111 (111)-1111"
},
{
"id":"doc2",
"phone_stripped":"111 111-1111"
}
And then we search for just 1111111111:
"response":{"numFound":3,"start":0,"docs":[ .. all our docs ..]
Using the previous search, phone_stripped:\(111\)\ 111-1111:
"response":{"numFound":3,"start":0,"docs":[ .. all our docs ..]
And just to make sure we haven't broken things in unspeakable ways, let's search for phone_stripped:\(111\)\ 111-1112:
"response":{"numFound":0,"start":0,"docs":[]

Related

How to use replaceAll in java without removing the whitespaces?

I have json string like :
{
"type": "abc_onClick",
"selectedComponent": "xyz_Button",
"displayEventName": "On Click",
"eventCategory": "Component Events"
}
I am trying to replace "type": "abc_ with some other string. The replacement works, but only if I remove the whitespace first. This is how I am doing it:
json = json.replaceAll("\\s", "");
json = json.replaceAll("\"type\":\"abc_", "\"type\":\"newName_");
But this changes my JSON, because it removes all whitespace, including the whitespace inside values, which I need to keep. It becomes:
{
"type": "abc_onClick",
"selectedComponent": "xyz_Button",
"displayEventName": "OnClick",
"eventCategory": "ComponentEvents"
}
Is there any way I can achieve this without removing the whitespace?
Note: I cannot loop or use JsonObject; these are the constraints.
Thanks
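One possible approach, sketched here under the assumption that the key is always written as "type" followed by a colon, is to let the regular expression tolerate optional whitespace around the colon instead of stripping whitespace from the whole string. The class and method names are hypothetical. The capture group keeps the original spacing, so the rest of the JSON is left untouched.

```java
class ReplaceTypePrefixSketch {
    // Group 1 captures: "type", optional whitespace, a colon, optional whitespace, opening quote.
    // Only the abc_ prefix that follows is replaced; $1 re-inserts the captured text as it was.
    static String renameTypePrefix(String json) {
        return json.replaceAll("(\"type\"\\s*:\\s*\")abc_", "$1newName_");
    }

    public static void main(String[] args) {
        String json = "{\n"
                + "\"type\": \"abc_onClick\",\n"
                + "\"displayEventName\": \"On Click\"\n"
                + "}";
        // The type value becomes newName_onClick; "On Click" keeps its space.
        System.out.println(renameTypePrefix(json));
    }
}
```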

How can I fix this ElasticSearch Fielddata exception in Java code?

I'm working on Java code to create an index and run queries on Elasticsearch.
I keep getting this exception when I use the count or sort API:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true ......
How can I set fielddata to true?
I used BulkRequest to create the index. How can I add a mapping to the BulkRequest?
Here is the code that creates the index:
BulkRequest request = new BulkRequest();
// try-with-resources closes the reader even if reading fails
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
    String line;
    // each line of the file is one JSON document
    while ((line = br.readLine()) != null) {
        request.add(new IndexRequest(indexName, type).source(line, XContentType.JSON));
    }
    // send all collected documents in a single bulk call
    BulkResponse bulkresp = client.bulk(request);
    afterBulk(request, bulkresp);
}
catch (IOException e) {
    e.printStackTrace();
}
First of all, let's go to the source of the problem: you want to sort on a text field, which requires fielddata to be enabled.
Before you enable fielddata, consider why you are using a text field
for aggregations, sorting, or in a script. It usually doesn’t make
sense to do so.
A text field is analyzed before indexing so that a value like New York
can be found by searching for new or for york. A terms aggregation on
this field will return a new bucket and a york bucket, when you
probably want a single bucket called New York.
The same applies to sorting: how are you supposed to sort on a field that holds many separate terms per value?
Instead, you should have a text field for full text searches, and an
unanalyzed keyword field with doc_values enabled for aggregations,
as follows
{
"mappings": {
"_doc": {
"properties": {
"my_field": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
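To make the difference between the analyzed text field and the keyword sub-field concrete, here is a toy simulation in plain Java. It does not use Elasticsearch at all: analyze is a simplified stand-in for a text analyzer (lowercase, split on whitespace), and buckets counts terms roughly the way a terms aggregation would. All names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class TermsBucketsSketch {
    // Simplified stand-in for a text analyzer: lowercase and split on whitespace.
    static List<String> analyze(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Count one bucket per indexed term. With analyzed=false the whole value is a single term,
    // which is how a keyword field behaves.
    static Map<String, Integer> buckets(List<String> values, boolean analyzed) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String v : values) {
            List<String> terms = analyzed ? analyze(v) : Collections.singletonList(v);
            for (String t : terms) {
                counts.merge(t, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("New York", "New York", "New Orleans");
        System.out.println(buckets(cities, true));  // {new=3, orleans=1, york=2}
        System.out.println(buckets(cities, false)); // {New Orleans=1, New York=2}
    }
}
```

With the mapping above, you would sort or aggregate on my_field.keyword and keep my_field for full-text search.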
As for the other part of the question: take a look at CreateIndexRequest, which lets you specify mappings explicitly. Most likely you are using dynamic mappings right now, which is why fielddata is causing you problems. More information on how to use CreateIndexRequest: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-create-index.html#java-rest-high-create-index

How to save a searchable and queryable json document in Postgres?

I receive a person's profile as JSON. How can I model it so that every value in this JSON document is searchable?
The JSON document needs to be not only searchable but also queryable, e.g. "find all the persons who like Tarantino movies".
I can define this document in a relational model with one-to-many relationships, but that approach wouldn't allow free-text search from the client side. Is there a better way to handle such scenarios? The document looks like this:
{
"name":"FirstN LastN",
"photo":"nicephoto.jpg",
"location":"Boston, MA",
"contacts":[
{
"type":"phone",
"value":"701290012734"
},
{
"type":"email",
"value":"test@test.com"
}
],
"movies":[
{
"name":"The Godfather",
"director":"Francis Ford Coppola",
"releaseYear":"1972",
"favQuote":"I'm gonna make him an offer he can't refuse. Okay?"
},
{
"name":"Pulp Fiction",
"director":"Quentin Tarantino",
"releaseYear":"1994",
"favQuote":"Just because you are a character doesn't mean that you have character."
}
],
"school":null
}
"find all the persons who like Tarantino movies" can be written in SQL like this:
select persons->>'name' from jdoc,json_array_elements(jdoc.persons->'movies') movies where movies->>'director' ~ 'Tarantino';
Other selection criteria can be modeled in a similar way.
This requires Postgres 9.3 or later.
http://sqlfiddle.com/#!15/652eb/10
As for the question of how to save the document:
create table jdoc (persons json);
insert into jdoc values ('{
"name":"FirstN LastN",
"photo":"nicephoto.jpg",
"location":"Boston, MA",
"contacts":[
{
"type":"phone",
"value":"701290012734"
},
{
"type":"email",
"value":"test@test.com"
}
],
"movies":[
{
"name":"The Godfather",
"director":"Francis Ford Coppola",
"releaseYear":"1972",
"favQuote":"Im gonna make him an offer he cant refuse. Okay?"
},
{
"name":"Pulp Fiction",
"director":"Quentin Tarantino",
"releaseYear":"1994",
"favQuote":"Just because you are a character doesnt mean that you have character."
}
],
"school":null
}')
;
You might be looking for FLWOR:
for $d in doc("depts.xml")//deptno
let $e := doc("emps.xml")//employee[deptno = $d]
where count($e) >= 10
order by avg($e/salary) descending
return
<big-dept>
{ $d,
<headcount>{count($e)}</headcount>,
<avgsal>{avg($e/salary)}</avgsal>
}
</big-dept>
although it doesn't look like Postgres has plans to support XQuery.

How to custom search for text query in mongodb?

I'm new to MongoDB. I have the following data in JSON format in MongoDB. I need to search by the bookLabel or the shortLabel of a book and get back all the information about that book. For example, if I query for 'Cosmos', it should show the full description of the book: bookLabel, writer, yearPublish, url. How can I do that in Java? I need the query, please help.
"Class":"Science",
"Description":[
{
"bookLabel":"Cosmos (Mass Market Paperback)",
"shortLabel":"Cosmos",
"writer":"Carl Sagan",
"yearPublish":[
"2002"
],
"url":"https://www.goodreads.com/book/show/55030.Cosmos"
},
{
"bookLabel":"The Immortal Life of Henrietta Lacks",
"shortLabel":"Immortal Life",
"writer":"Rebecca Skloot",
"yearPublish":[
"2010, 2011"
],
"url":"https://www.goodreads.com/book/show/6493208-the-immortal-life-of-henrietta-lacks"
}
],
"Class":"History",
"Description":[
{
"bookLabel":"The Rise and Fall of the Third Reich",
"shortLabel":"Rise and Fall",
"writer":"William L. Shirer",
"yearPublish":[
"1960"
],
"url":"https://www"
}
]
}
With MongoDB Java Driver v3.2.2 you can do something like this:
FindIterable<Document> iterable = collection.find(Document.parse("{\"Description.shortLabel\": {$regex: \"Cosmos\"}}"));
This returns all documents containing Cosmos in the Description.shortLabel nested field. For an exact match, try {"Description.shortLabel": "Cosmos"}. Replace shortLabel with bookLabel to search the bookLabel field. You can then call iterable.forEach(new Block<Document>()) on the returned documents. To search both bookLabel and shortLabel, you can combine the two conditions with $or. My syntax could be wrong, so check the MongoDB manual, but this is the general idea.
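The difference between the regex filter, the exact match and the $or combination can be illustrated without a database. The sketch below uses java.util.regex as a stand-in for MongoDB's own regex engine, which is an assumption that only holds for simple literal patterns like these. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class LabelMatchSketch {
    // Mimics an unanchored {$regex: ...} filter: true if the pattern occurs anywhere in the value.
    static boolean regexMatch(String pattern, String value) {
        return Pattern.compile(pattern).matcher(value).find();
    }

    // Mimics an $or over the bookLabel and shortLabel fields.
    static boolean matchesEitherLabel(String pattern, String bookLabel, String shortLabel) {
        return regexMatch(pattern, bookLabel) || regexMatch(pattern, shortLabel);
    }

    public static void main(String[] args) {
        String bookLabel = "Cosmos (Mass Market Paperback)";
        String shortLabel = "Cosmos";
        System.out.println(regexMatch("Cosmos", bookLabel));                     // true: substring match
        System.out.println("Cosmos".equals(bookLabel));                          // false: exact match fails
        System.out.println(matchesEitherLabel("Market", bookLabel, shortLabel)); // true: found in bookLabel
    }
}
```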
For this, you can use MongoDB's text search capabilities. You'll have to create a text index on your collection first.
Create the text index on the bookLabel and shortLabel fields:
db.books.createIndex({ "Description.bookLabel" : "text", "Description.shortLabel" : "text" })
Note that this is run in the Mongo shell.
Then, in Java:
DBObject command = BasicDBObjectBuilder
.start("text", "books")
.append("search", "Cosmos")
.get();
CommandResult result = db.command(command);
BasicDBList results = (BasicDBList) result.get("results");
for (Object o : results) {
    DBObject dbo = (DBObject) ((DBObject) o).get("obj");
    // MongoDB stores the document key as "_id" (lowercase); it is usually an ObjectId, not a String
    Object id = dbo.get("_id");
    System.out.println(id);
}
I haven't tested this, but it should work; give it a try.

Problem with filling ext grid with JSON values

I'm new to Ext and I have a problem: I'm trying to fill an Ext JS grid with data:
Ext.onReady(function() {
var store = new Ext.data.JsonStore({
root: 'topics',
totalProperty: 'totalCount',
idProperty: 'threadid',
remoteSort: true,
autoLoad: true, ///
fields: [
'title', 'forumtitle', 'forumid', 'author',
{name: 'replycount', type: 'int'},
{name: 'lastpost', mapping: 'lastpost', type: 'date', dateFormat: 'timestamp'},
'lastposter', 'excerpt'
],
proxy: new Ext.data.ScriptTagProxy({
url:'http://10.10.10.101:8080/myproject/statusList/getJobs/2-10/search-jobname-/sort-asdf/filterjobname-123/filterusername-davs/filterstatus-completed/filtersubmdate-today',
method : 'GET'
})
});
//
var cm = new Ext.grid.ColumnModel([
{sortable:true, id : 'id', dataIndex:'id'},
{sortable:true, id : 'title', dataIndex:'title'},
{sortable:true, id : 'forumtitle', dataIndex:'forumtitle'},
{sortable:true, id : 'forumid', dataIndex:'forumid'},
{sortable:true, id : 'author', dataIndex:'author'}
]);
var grid = new Ext.grid.GridPanel({
id: 'mainGrid',
el:'mainPageGrid',
pageSize:10,
store:store,
// stripeRows: true,
cm:cm,
stateful: false, // skipSavingSortState
viewConfig:{
forceFit:true
},
// width:1000,
// height:700,
loadMask:true,
frame:false,
bbar: new Ext.PagingToolbar({
id : 'mainGridPaginator',
store:store,
hideRefresh : true,
plugins: new Ext.ux.Andrie.pPageSize({
beforeText: 'View: ',
afterText: '',
addAfter: '-',
variations: [10, 25, 50, 100, 1000]
//comboCfg: {
//id: '${ dispview_widgetId }_bbar_pageSize'
//}
}),
displayMsg: 'Displaying items {0} - {1} of {2}',
emptyMsg:'No data found',
displayInfo:true
})
});
grid.render();
});
and the Java part:
@GET
@Path("/getJobs/{startFrom}-{startTo}/search-{searchType}-{searchName:.*}/" +
"sort-{sortType}/filterjobname-{filterJobName:.*}/filterusername-{filterUsername:.*}/" +
"filterstatus-{filterStatus:.*}/filtersubmdate-{filterSubmittedDate:.*}")
@Produces({"application/json"})
@Encoded
public String getJobs(
@PathParam("startFrom") String startFrom,
@PathParam("startTo") String startTo,
@PathParam("searchType") String searchType,
@PathParam("searchName") String searchName,
@PathParam("sortType") String sortType,
@PathParam("filterJobName") String filterJobName,
@PathParam("filterUsername") String filterUsername,
@PathParam("filterStatus") String filterStatus,
@PathParam("filterSubmittedDate") String filterSubmittedDate) {
return "{totalCount:'3',topics:[{title:'XTemplate with in EditorGridPanel',threadid:'133690',username:'kpremco',userid:'272497',dateline:'1305604761',postid:'602876',forumtitle:'Ext 3x Help',forumid:'40',replycount:'2',lastpost:'1305857807',lastposter:'kpremco',excerpt:'Hi I have an EditiorGridPanel whose one column i am using XTemplate to render and another Column is Combo Box FieldWhen i render the EditorGri'}," +
"{title:'IFrame error _flyweights is undefined',threadid:'133571',username:'Daz',userid:'52119',dateline:'1305533577',postid:'602456',forumtitle:'Ext 3x Help',forumid:'40',replycount:'1',lastpost:'1305857313',lastposter:'Daz',excerpt:'For Ext 330 using Firefox 4 Firebug, the following error is often happening when our app loads e._flyweights is undefined Yetthis '}," +
"{title:'hellllllllllllllpwhy it doesnt fire cellclick event after I change the cell value',threadid:'133827',username:'aimer311',userid:'162000',dateline:'1305700219',postid:'603309',forumtitle:'Ext 3x Help',forumid:'40',replycount:'3',lastpost:'1305856996',lastposter:'aimer311',excerpt:'okI will discribe this problem as more detail as I canI look into this problem for a whole dayI set clicksToEdit1 to a EditorGridPanelso when I'}]}";
As a result I'm getting a JavaScript error:
Syntax error at line 1 while loading:
totalCount:'3',topics:[{title:'XTemplate
---------------------^
expected ';', got ':'
However, when I use this proxy URL:
URL: 'http://extjs.com/forum/topics-browse-remote.php',
which returns the same kind of data, I don't have any problems.
What am I doing wrong?
P.S. Following the first answer, I changed the return value to:
return "{\"totalCount\":\"3\",\"topics\":[{\"title\":\"XTemplate with in EditorGridPanel\",\"threadid\":\"133690\",\"username\":\"kpremco\",\"userid\":\"272497\",\"dateline\":\"1305604761\",\"postid\":\"602876\",\"forumtitle\":\"Ext 3x Help\",\"forumid\":\"40\",\"replycount\":\"2\",\"lastpost\":\"1305857807\",\"lastposter\":\"kpremco\",\"excerpt\":\"Hi I have an EditiorGridPanel whose one column i am using XTemplate to render and another Column is Combo Box FieldWhen i render the EditorGri\"}," +
"{\"title\":\"IFrame error _flyweights is undefined\",\"threadid\":\"133571\",\"username\":\"Daz\",\"userid\":\"52119\",\"dateline\":\"1305533577\",\"postid\":\"602456\",\"forumtitle\":\"Ext 3x Help\",\"forumid\":\"40\",\"replycount\":\"1\",\"lastpost\":\"1305857313\",\"lastposter\":\"Daz\",\"excerpt\":\"For Ext 330 using Firefox 4 Firebug, the following error is often happening when our app loads e._flyweights is undefined Yet, this \"}," +
"{\"title\":\"hellllllllllllllpwhy it doesn't fire cellclick event after I change the cell value\",\"threadid\":\"133827\",\"username\":\"aimer311\",\"userid\":\"162000\",\"dateline\":\"1305700219\",\"postid\":\"603309\",\"forumtitle\":\"Ext 3x Help\",\"forumid\":\"40\",\"replycount\":\"3\",\"lastpost\":\"1305856996\",\"lastposter\":\"aimer311\",\"excerpt\":\"okI will discribe this problem as more detail as I canI look into this problem for a whole dayI set clicksToEdit1 to a EditorGridPanelso when I\"}]}";
I've got the following error:
Syntax error at line 1 while loading:
{"totalCount":"3","topics":[{"title
-------------^
expected ';', got ':'
P.S. #2. When I added '[' to the beginning of the response string and ']' to the end, the errors disappeared, but the grid was still not filled with data.
You're not returning (valid) JSON. Refer to the JSON site for details, but for instance, all property keys must be in double quotes. (All strings must also be in double quotes; single quotes are not valid for JSON strings.)
So for instance, this is not valid JSON:
{totalCount:'3'}
...because the key is not in quotes, and the value is using single quotes. The correct JSON would be:
{"totalCount":"3"}
...if you really want the 3 to be a string, or:
{"totalCount":3}
...if the 3 should be a number.
People frequently confuse JSON and JavaScript's object literal notation, but they are different. Specifically, JSON is a subset of object literal notation. A lot of things that are valid in object literal notation are not valid in JSON. Any time you're in doubt, you can check at jsonlint.com, which provides a proper JSON validator.
I have found the root of the issue.
As I found out, Ext calls my web service with a parameter 'callback=[some_callback_name]' (e.g. callback1001). This means Ext JS expects the result not as bare JSON but wrapped in a call to that function, i.e. in the form 'callback1001(...)' with the JSON as the argument. Once I returned my data in this format, everything worked.
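The wrapping described above can be sketched in plain Java as follows. This is an illustration only: the class and method names are hypothetical, and how the callback parameter reaches the method (for example through a query parameter on the resource method) is left out. The name check is an added precaution, since the callback value comes from the request and ends up in executable script.

```java
class JsonpSketch {
    // Wrap a JSON payload in a call to the callback function named by the client.
    // If no callback was supplied, return the bare JSON.
    static String wrap(String callback, String json) {
        if (callback == null || callback.isEmpty()) {
            return json;
        }
        // Accept only simple identifiers so the request cannot inject arbitrary script.
        if (!callback.matches("[A-Za-z_$][A-Za-z0-9_$.]*")) {
            throw new IllegalArgumentException("Invalid callback name: " + callback);
        }
        return callback + "(" + json + ");";
    }

    public static void main(String[] args) {
        String json = "{\"totalCount\":3,\"topics\":[]}";
        System.out.println(wrap("callback1001", json)); // callback1001({"totalCount":3,"topics":[]});
    }
}
```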
Proof links:
http://www.sencha.com/forum/showthread.php?22990-Json-Invalid-label (response #6)
http://indiandeve.wordpress.com/2009/12/02/extjs-error-invalid-label-error-while-using-scripttagproxy-for-json-data-in-paging-grid-example/
