ElasticSearch / Java - Dynamic Templates aggregation with null values included

I'm having difficulties with aggregations over dynamic templates. I have values stored like this:
[
  {
    "country": "CZ",
    "countryName": {
      "en": "Czech Republic",
      "es": "Republica checa",
      "de": "Tschechische Republik"
    },
    "ownerName": "..."
  },
  {
    "ownerName": "..."
  }
]
The country field is a classic keyword. The country name is mapped via a dynamic template, because I want to be able to extend it with additional languages when I need to.
{
  "dynamic_templates": [
    {
      "countryName_lsi_object_template": {
        "path_match": "countryName.*",
        "mapping": {
          "type": "keyword"
        }
      }
    }
  ]
}
countryName and country are not mandatory parameters: when a document is not assigned to any country, countryName is not filled either. However, I need to run a sorted aggregation over the country names according to a chosen language key, and I also need to include buckets with null countries. Is there any way to do that?
Previously, I used TermsValuesSourceBuilder with an order on the "country" field, but I need the data sorted by name in a specific language, and that can't be done over country codes.
(I'm using Elasticsearch 7.7.1 and Java 8, and recreating the index or changing the data structure is not an option.)
I tried to use the missing bucket option, but the response does not include buckets with "countryName" missing at all.
new TermsValuesSourceBuilder("countryName").field("countryName.en").missingBucket(true);
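For context, this is roughly how that source is plugged into a composite aggregation (a minimal sketch assuming the high-level REST client; the "by_country" aggregation name and "my-index" index name are illustrative):

import java.util.ArrayList;
import java.util.List;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.TermsValuesSourceBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// One composite source per sort key; missingBucket(true) is supposed to emit
// a null-key bucket for documents that lack the field entirely.
List<CompositeValuesSourceBuilder<?>> sources = new ArrayList<>();
sources.add(new TermsValuesSourceBuilder("countryName")
        .field("countryName.en")
        .missingBucket(true));

CompositeAggregationBuilder composite =
        new CompositeAggregationBuilder("by_country", sources);

SearchRequest request = new SearchRequest("my-index") // index name is illustrative
        .source(new SearchSourceBuilder().size(0).aggregation(composite));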

Related

Sort the search result in ascending order of a multivalued field in Solr

I'm using Solr version 6.6.0. I have a schema with title (text_general), description (text_general), and id (integer). When I search for a keyword and try to list the results in ascending order of the title, my code returns the error: can not sort on multivalued field: title.
I have tried to set the sort using the following three methods:
SolrQuery query = new SolrQuery();
// 1.
query.setSort("title", SolrQuery.ORDER.asc);
// 2.
query.addSort("title", SolrQuery.ORDER.asc);
// 3.
SortClause ab = new SolrQuery.SortClause("title", SolrQuery.ORDER.asc);
query.addSort(ab);
but all of these return the same error.
I found a solution by referring to this answer. It says to use the min/max functions:
query.setSort(field("pageTitle",min), ORDER.asc);
This is what I'm trying to set as the query, but I didn't understand the arguments used here.
This is the Maven dependency that I'm using:
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>6.5.1</version>
</dependency>
Unless title actually is multiValued - can your post have multiple titles? - you should define it as multiValued="false" in your schema. However, there's a second issue: a field of the default type text_general isn't suited for sorting, as it generates multiple tokens, one for each word in the title. This is useful for searching, but will give weird and non-intuitive results when sorting.
So instead, define a title_sort field and use a field type with a KeywordTokenizer and LowerCaseFilter attached (if you want case-insensitive sort), or, if you want case-sensitive sort, use the already defined string field type for the title_sort field.
The first thing to check is whether you really need that title field to be multivalued - do your documents really have multiple titles? If not, you just need to fix the field definition by setting multiValued="false".
That said, sorting on a multivalued field doesn't make sense without determining which one of these multiple values should be used to sort on, or how to combine them into one.
Let's say we need to sort a given result set by title (alphabetically), first using a single-valued title field:
# Unsorted
"docs": [
  { "id": "1", "title": "One" },
  { "id": "2", "title": "Two" },
  { "id": "3", "title": "Three" }
]
# Sorted
"docs": [
  { "id": "1", "title": "One" },
  { "id": "3", "title": "Three" },
  { "id": "2", "title": "Two" }
]
# -> ok no problem here
Now, applying the same logic with a multi-valued field is not possible as is; you would necessarily need to determine which title to use in each document to properly sort them:
# Unsorted
"docs": [
  { "id": "1", "title": ["One", "z-One", "a-One"] },
  { "id": "2", "title": ["Two", "z-Two", "a-Two"] },
  { "id": "3", "title": ["Three", "z-Three", "a-Three"] }
]
Fortunately, Solr allows sorting results by the output of a function, meaning you can use any of Solr's function queries to "get" a single value per title field. The answer you referred to is a good example, even though it may not work for you (title would need docValues enabled - it depends on the field definition - and the max/min functions should be used only with numeric values), but just to get the idea:
# here the 2nd argument tells field() to use max, precisely to get a single value from title
sort=field(title,max) asc
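In SolrJ, that sort could be expressed roughly like this (a sketch under the same caveats as above; the match-all query string is illustrative):

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery query = new SolrQuery("*:*");
// Sort by a function query instead of the raw multi-valued field
query.setSort(SolrQuery.SortClause.asc("field(title,max)"));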

Using two different structures in one Firebase database

I'm currently working on a project where I need two different types of unrelated data structures in my Java Android app: one being users, the other being types of food.
Users are set up like this:
users
  userid
    name
    age
    gender
    weight
    height
But, I also need one that looks like this, which must be searchable:
foods
  name
    carbohydrate
    fat
    protein
Is it possible to use the same database (preferably Firebase, as I'm already using that), or do I need to add another database to the app I'm making?
Edit: I ended up exporting the JSON, rewriting it according to the good answers given here, and then importing it again. It works flawlessly. Thanks for your answers!
{
  "foods" : {
    "name" : {
      "carbohydrates" : "5",
      "fats" : "5",
      "proteins" : "5"
    }
  },
  "users" : {
    "FjtMNTcDrOP2wcaPAa0E0Cc1jRz2" : {
      "activity" : "Moderate Exercise (3–5 days/week)",
      "age" : "40",
      "gender" : "Male",
      "height" : "180",
      "name" : "Flex",
      "weight" : "86"
    }
  }
}
Yes, it is possible to model multiple entity types (such as your users and foods) in the Firebase Realtime Database. While it doesn't have the concept of a table, it's a hierarchy of JSON values and you can model any structure you want in that.
For example, you could express your data model with this JSON:
{
  "users": {
    "userid": {
      "name": "value",
      "age": 42,
      "gender": "value",
      "weight": 190,
      "height": 172
    }
  },
  "foods": {
    "name": {
      "carbohydrate": 42,
      "fat": 11,
      "protein": 8
    }
  }
}
In relational terms, the above model defines two "tables": users and foods. In Android code you can define separate references to each of these with:
DatabaseReference rootReference = FirebaseDatabase.getInstance().getReference();
DatabaseReference usersReference = rootReference.child("users");
DatabaseReference foodsReference = rootReference.child("foods");
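From there, writing a value under one of these "tables" is just a setValue() call. A minimal sketch (the "apple" key and the nutrient numbers are made up):

import java.util.HashMap;
import java.util.Map;
import com.google.firebase.database.DatabaseReference;

// Hypothetical entry under /foods
Map<String, Object> food = new HashMap<>();
food.put("carbohydrate", 14);
food.put("fat", 0);
food.put("protein", 0);

DatabaseReference foodRef = foodsReference.child("apple");
foodRef.setValue(food);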
Yes, you can use the same Firebase Realtime Database to store that data.
The RTDB can be simplified down to being just a JSON tree. So for your desired implementation, you would have two keys at the root of your database (such as "users" and "foods").
{
  "users": {
    "userid1": {
      "name": "somestring",
      "age": "somenum",
      "gender": "somestring",
      "height": "somenum",
      "weight": "somenum",
      ...
    },
    ...
  },
  "foods": {
    "food1": {
      "name": "somename",
      "carbs": "somenum",
      "fat": "somepercent",
      "protein": "somepercent",
      ...
    },
    ...
  }
}
You can also add or remove more root keys as you wish and your project takes shape.
However, as @Tamir Abutbul suggests in their answer, I would use Cloud Firestore for this project over the RTDB.
The reason is that, based on your data, you are likely going to need to filter results by a number of different values at a time in the future. Cloud Firestore supports these types of queries natively (docs), whereas you'd need to write a custom solution for the RTDB.
Getting Started with Cloud Firestore
You can use Firebase with Cloud Firestore.
Create a "users" collection with your desired data structure and another collection called "foods" with its own data structure.
The next step is just to decide when to use each of those collections (according to your app logic).
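As a rough sketch of what that could look like with the Firestore Android SDK (collection and field names follow the question; the food values are made up):

import java.util.HashMap;
import java.util.Map;
import android.util.Log;
import com.google.firebase.firestore.DocumentSnapshot;
import com.google.firebase.firestore.FirebaseFirestore;

FirebaseFirestore db = FirebaseFirestore.getInstance();

// Add one food document
Map<String, Object> food = new HashMap<>();
food.put("name", "Apple");
food.put("carbohydrate", 14);
food.put("fat", 0);
food.put("protein", 0);
db.collection("foods").add(food);

// Foods stay searchable: filter by a field value
db.collection("foods")
        .whereEqualTo("name", "Apple")
        .get()
        .addOnSuccessListener(snapshot -> {
            for (DocumentSnapshot doc : snapshot.getDocuments()) {
                Log.d("Foods", doc.getId() + " => " + doc.getData());
            }
        });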

How to obtain validator expression used when creating a MongoDB collection? [duplicate]

I'm trying to add a new field (LastLoginDate of type Date) to an existing collection. Here is my sample script:
db.createCollection("MyTestCollection", {
  "validator": {
    "$or": [
      { "username": { "$type": "string" } },
      { "notes": { "$type": "string" } }
    ]
  }
})
db.getCollectionInfos({name: "MyTestCollection"});
[
  {
    "name" : "MyTestCollection",
    "options" : {
      "validator" : {
        "$or" : [
          { "username" : { "$type" : "string" } },
          { "notes" : { "$type" : "string" } }
        ]
      }
    }
  }
]
What is the best way to add the new field LastLoginDate : { $type: "date" } to this existing collection "MyTestCollection"?
Adding a new document or updating the existing collection with the new field may create this field, but I'm not sure how to enforce the date type on it. After adding the new field, if I execute the getCollectionInfos command again, it doesn't show a type validator for the newly added field.
I "should" probably prefix this with one misconception in your question. The fact is MongoDB differs from traditional RDBMS in that it is "schemaless" and you do not in fact need to "create fields" at all. So this differs from a "table schema" where you cannot do anything until the schema changes. "Validation" however is a different thing as well as a "still" relatively new feature as of writing.
If you want to "add a validation rule" then there are methods which depend on the current state of the collection. In either case, there actually is no "add to" function, but the action instead is to "replace" all the validation rules with new ones to specify. Read on for the rules of how this works.
Existing Documents
Where the collection has existing documents, as noted in the documentation
Existing Documents
You can control how MongoDB handles existing documents using the validationLevel option.
By default, validationLevel is strict and MongoDB applies validation rules to all inserts and updates. Setting validationLevel to moderate applies validation rules to inserts and to updates to existing documents that fulfill the validation criteria. With the moderate level, updates to existing documents that do not fulfill the validation criteria are not checked for validity.
This and the following example section are basically saying that in addition to the options on .createCollection(), you may also modify an existing collection with documents, but you should be "wary" that the present documents may not meet the required rules. Therefore use "moderate" if you are unsure whether the rule will be met for all documents in the collection.
In order to apply the change, you use the .runCommand() method at present to issue the "command" which sets the validation rules, including the "validationLevel" from the passage above.
Since you have existing rules, we can use .getCollectionInfos() to retrieve them, then add the new rule and apply:
let validator = db.getCollectionInfos({name: "MyTestCollection"})[0].options.validator;
validator.$or.push({ "LastLoginDate": { "$type": "date" } });

db.runCommand({
  "collMod": "MyTestCollection",
  "validator": validator,
  "validationLevel": "moderate"
});
Of course, as noted before, if you are confident the documents all meet the conditions, then you can apply "strict" as the default instead.
Empty Collection
If instead the collection is actually "empty", with no documents at all, or you can "drop" the collection because the current data is of no consequence, then you can simply vary the above and use .createCollection() in combination with .drop():
let validator = db.getCollectionInfos({name: "MyTestCollection"})[0].options.validator;
validator.$or.push({ "LastLoginDate": { "$type": "date" } });

db.getCollection("MyTestCollection").drop();
db.createCollection("MyTestCollection", { "validator": validator });
If your validator uses $jsonSchema instead, the same "read, modify, replace" approach applies:
let previousValidator = db.getCollectionInfos({name: "collectionName"})[0].options.validator;

// push the key to the required array
previousValidator.$jsonSchema.required.push("isBloodReportAvailable")

// add the new property to the validator
let isBloodReportAvailable = { "bsonType" : "bool", "description" : "must be a bool and is optional" }
previousValidator.$jsonSchema.properties['isBloodReportAvailable'] = isBloodReportAvailable

db.runCommand({
  "collMod": "collectionName",
  "validator": previousValidator
});

Checkboxes checked into JSON Format in SpringMVC

I am working on a Spring MVC application. I have a situation where I need to check some checkboxes in the UI, save the checked values as JSON in the backend, and convert that into a string.
The picture shows more.
So I want to save something like:
[{
  Coast : 'East',
  States : [ 'NY', 'MI' ]
}, {
  Coast : 'Central',
  States : [ 'TX', 'OK' ]
}]
Please suggest how I can implement this.
Your question is quite vague, so I'm going to assume, since you've used the json tag, that you're asking for help on how to model this information in JSON and handle it within your Spring app.
You probably want to restructure your JSON schema to support extra fields being set per state. Instead of States being a list of strings, you could change it to a list of objects which has a name and selected field.
I'd also recommend you change the keys in your JSON to be lower case, this enables more fluent mapping between your JSON and model classes.
For example, NY is selected in the below JSON, whereas MI isn't:
[{
  "coast": "East",
  "states": [{
    "name": "NY",
    "selected": true
  }, {
    "name": "MI",
    "selected": false
  }]
}, {
  ...same again for West and Central
}]
You could then have some classes along the following lines, and use Jackson to map between them:
public class Region {
    String coast;
    List<State> states;
}

public class State {
    String name;
    boolean selected;
}
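For instance, deserializing the JSON above could look roughly like this (a sketch; it assumes the fields are made accessible to Jackson, e.g. public or with getters/setters, and that json holds the request body):

import java.util.List;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();

// Bind the JSON array to a list of Region objects
List<Region> regions = mapper.readValue(json, new TypeReference<List<Region>>() {});

// ...and back to a string for storage
String asString = mapper.writeValueAsString(regions);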

Flatten and De-flatten for Solr input/output

I have lots of Java objects which have parent-child relationships.
These need to be put into Solr.
To do that, we convert the Java objects into JSON as follows:
{
  "employee": {
    "name": "John",
    "address": {
      "apt": 100,
      "city": "New York",
      "country": "USA"
    },
    "vehicles": [
      {
        "name": "Hyundai",
        "color": "red"
      },
      {
        "name": "Toyota",
        "color": "black"
      }
    ]
  }
}
Now since Solr does not handle this, I am flattening it out as follows:
"employee.name": "John",
"employee.address.apt": 100,
"employee.address.city": "New York",
"employee.address.country": "USA",
"employee.vehicles_0.name": "Hyundai", // Note how arrays are being flattened
"employee.vehicles_0.color": "red",
"employee.vehicles_1.name": "Toyota",
"employee.vehicles_1.color": "black",
It is easy to flatten, but clients of my library do not want the flattened schema when they query.
So I need to de-flatten the above on return from Solr and convert it back to the original Java objects.
Does anyone know how this can be done?
I am thinking of a somewhat crude way: take the flattened output from Solr (as shown above) and write a parser to put the fields back into Java objects (see the sketch below). But this seems like a lot of work. An easy way out or an existing tool would be much appreciated.
I am using Solr 4.5.1.
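For what it's worth, the "crude" parser idea could start roughly like this (a sketch only; it assumes flatFields holds the flattened Solr fields, that Employee is the original class, and that numbered suffixes such as vehicles_0 still need separate handling to become a list):

import java.util.LinkedHashMap;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;

// Rebuild a nested map from dotted keys such as "employee.address.city"
Map<String, Object> nested = new LinkedHashMap<>();
for (Map.Entry<String, Object> entry : flatFields.entrySet()) {
    String[] path = entry.getKey().split("\\.");
    Map<String, Object> node = nested;
    for (int i = 0; i < path.length - 1; i++) {
        node = (Map<String, Object>) node.computeIfAbsent(
                path[i], k -> new LinkedHashMap<String, Object>());
    }
    node.put(path[path.length - 1], entry.getValue());
}

// Let Jackson bind the nested map back onto the original class
Employee employee = new ObjectMapper().convertValue(nested.get("employee"), Employee.class);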
Solr is designed for search, not for storing deep object graphs. You might be better off optimizing the Solr records for search and then getting the original objects from the master store by recordID or some such.
Think about what you will be trying to find. For example, will you be searching for individual vehicles? If yes, your current document level should be a vehicle, not an employee.
You can index your documents in a Parent->Child structure in the first place.
Take a look at this blog post: http://blog.griddynamics.com/2013/09/solr-block-join-support.html
