Index not updated when upserting document - java

I'm using Elasticsearch 1.4.2 and want to upsert a JSON document into it. When I insert a test document, I can find it by id, but not by searching on the logName field or any other field I tried. I'm probably missing a step in the upsert method, which is shown at the end of the question.
Here are the query results:
$ curl -XGET http://localhost:9200/pwotest/_search?q=logName:%22pwotest1%22\&pretty=true
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
Searching for the id results in a document where logName is pwotest1:
$ curl -XGET http://localhost:9200/pwotest/_search?q=\$oid:%22549954143004ba1bf99a56ba%22\&pretty=true
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 3.1972246,
"hits": [
{
"_index": "pwotest",
"_type": "User",
"_id": "549954143004ba1bf99a56ba",
"_score": 3.1972246,
"_source": {
"_id": {
"$oid": "549954143004ba1bf99a56ba"
},
"logName": "pwotest1",
"modifiedBy": "test",
"modifiedId": "549954143004ba1bf99a56ba",
"modificationDate": 1419334676507,
"internalType": "create",
"dm_Version": "0.8.2",
"creationDate": 1419334676515,
"createdId": "549954143004ba1bf99a56ba"
}
}
]
}
}
The code to upsert the document is in Java and looks like this:
/**
* See ES doc
* @param o is a representation of PWO object
* @throws PWOException
*/
public void update(JsonObject o) throws PWOException {
Preconditions.checkNotNull(index, "index must not be null for update");
getNode();
Client client = node.client();
// this needs to come from somewhere
String type = "User";
String id = GsonHelper.getId(o).get();
String json = new Gson().toJson(o);
IndexRequest indexRequest = new IndexRequest(index, type, id).
source(json);
UpdateRequest upd = new UpdateRequest(index, type, id).
doc(json).
upsert(indexRequest);
Logger.info("Update is %s [%s, %s, %s]", upd, index, type, id);
try {
client.update(upd).get();
} catch (InterruptedException | ExecutionException e) {
throw new PWOException(GsonHelper.createMessageJsonObject("Update in elastic search failed"), e);
}
}

It looks like the _id in your _source is a nested object rather than a plain value:
"_id":{"$oid":"549954143004ba1bf99a56ba"}
If you can turn it into a basic value, I'm pretty confident your query will work, e.g.:
"_id":"549954143004ba1bf99a56ba"

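One way to produce the plain value suggested in the answer is to flatten the id before serializing the document. The following is only a minimal sketch under stated assumptions: it uses plain java.util.Map objects to stand in for the JSON document (the question uses Gson's JsonObject), and IdFlattener and flattenId are hypothetical names, not part of Gson or Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Gson or the Elasticsearch client.
class IdFlattener {

    /**
     * Returns a copy of the source map in which an "_id" of the form
     * {"$oid": "<hex>"} is replaced by the plain string "<hex>".
     * All other fields are copied unchanged; the input is not modified.
     */
    static Map<String, Object> flattenId(Map<String, Object> source) {
        Map<String, Object> copy = new LinkedHashMap<>(source);
        Object id = copy.get("_id");
        if (id instanceof Map) {
            Object oid = ((Map<?, ?>) id).get("$oid");
            if (oid != null) {
                copy.put("_id", oid.toString());
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        // Rebuild the relevant part of the document from the question.
        Map<String, Object> oid = new LinkedHashMap<>();
        oid.put("$oid", "549954143004ba1bf99a56ba");
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", oid);
        doc.put("logName", "pwotest1");

        System.out.println(flattenId(doc));
    }
}
```

The same idea should carry over to a JsonObject: read the nested $oid member and write it back as a string property before the document is passed to source(...) and doc(...).
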
Related

How to get property value direct from mongodb in JAVA

Hi everyone, I have a collection of documents like the one below. I want to get "rights" from the roles array directly, given _id, groups._id and roles._id, using the Java MongoDB driver.
{
"_id": 1000002,
"groups": [
{
"_id": 1,
"roles": [
{
"rights": 3,
"_id": 1
},
{
"rights": 7,
"_id": 2
},
{
"rights": 3,
"_id": 3
}
]
}
],
"timestamp": {
"$date": {
"$numberLong": "1675267318028"
}
},
"users": [
{
"accessProviderId": 1,
"rights": 1,
"_id": 4
},
{
"accessProviderId": 1,
"rights": 3,
"_id": 5
}
]
}
I have an AccessListItem class that represents this document, and I used Bson filters to fetch it from MongoDB, but after fetching I had to extract the value in a Java function. I want to get the int value directly from the database.
Bson fileFilter = Filters.eq("_id", itemId);
Bson groupFilter = Filters.elemMatch("groups", Document.parse("{_id:"+groupId+"}"));
Bson roleFilter = Filters.elemMatch("groups.roles", Document.parse("{_id:"+role+"}"));
Bson finalFilter = Filters.and(fileFilter, Filters.and(groupFilter,roleFilter));
MongoCollection<AccessListItem> accessListItemMongoCollection = MongoUtils.getAccessCollection(type);
AccessListItem accessListItem = accessListItemMongoCollection.find(finalFilter).first();
The short answer is you can't.
MongoDB is designed for returning documents, that is, objects containing key-value pairs. There is no mechanism for a MongoDB query to return just a value, i.e. it will never return just 3 or [3].
You could use aggregation with a $project stage at the end to give you a simplified object like:
{ rights: 3}
In JavaScript that might look like:
db.collection.aggregate([
{$match: {
_id: itemId,
"groups._id": groupId,
"groups.roles._id": role
}},
{$project: {
_id: 0,
group: {
$first: {
$filter: {
input: "$groups",
cond: {$eq: ["$$this._id",groupId]}
}
}
}
}},
{$project: {
"roles": {
$first: {
$filter: {
input: "$group.roles",
cond: { $eq: [ "$$this._id",role]}
}
}
}
}},
{$project: {
rights: "$roles.rights"
}}
])
Example: Playground
I'm not familiar with spring boot, so I'm not sure what that would look like in Java.

How can I find the number of duplicates for a field using MongoDB Java?

How can I find the number of duplicates of a field across documents with the MongoDB Java driver?
I have a collection like this.
Collection example:
{
"_id": {
"$oid": "5fc8eb07d473e148192fbecd"
},
"ip_address": "192.168.0.1",
"mac_address": "00:A0:C9:14:C8:29",
"url": "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes": {
"$date": "2021-02-13T02:02:00.000Z"
}
}
{
"_id": {
"$oid": "5ff539269a10d529d88d19f4"
},
"ip_address": "192.168.0.7",
"mac_address": "00:A0:C9:14:C8:30",
"url": "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes": {
"$date": "2021-02-12T19:00:00.000Z"
}
}
{
"_id": {
"$oid": "60083d9a1cad2b613cd0c0a2"
},
"ip_address": "192.168.1.5",
"mac_address": "00:0A:05:C7:C8:31",
"url": "www.facebook.com",
"datetimes": {
"$date": "2021-01-24T17:00:00.000Z"
}
}
Example query:
BasicDBObject whereQuery = new BasicDBObject();
DBCursor cursor = table1.find(whereQuery);
while (cursor.hasNext()) {
DBObject obj = cursor.next();
String ip_address = (String) obj.get("ip_address");
String mac_address = (String) obj.get("mac_address");
Date datetimes = (Date) obj.get("datetimes");
String url = (String) obj.get("url");
// println takes a single argument, so join the values into one string
System.out.println(ip_address + ", " + mac_address + ", " + datetimes + ", " + url);
}
In Java, how can I find which "url" values are duplicated, and how many times each one occurs?
In MongoDB you can solve this with an aggregation pipeline, which you then need to implement with the MongoDB Java driver. The pipeline below returns only the duplicated values, each with its duplicate count.
db.getCollection('table1').aggregate([
{
"$group": {
// group by url and calculate count of duplicates by url
"_id": "$url",
"url": {
"$first": "$url"
},
"duplicates_count": {
"$sum": 1
},
"duplicates": {
"$push": {
"_id": "$_id",
"ip_address": "$ip_address",
"mac_address": "$mac_address",
"url": "$url",
"datetimes": "$datetimes"
}
}
}
},
{ // keep only the groups whose duplicates_count is greater than 1
"$match": {
"duplicates_count": {
"$gt": 1
}
}
},
{
"$project": {
"_id": 0
}
}
]);
Output Result:
{
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"duplicates_count" : 2.0,
"duplicates" : [
{
"_id" : ObjectId("5fc8eb07d473e148192fbecd"),
"ip_address" : "192.168.0.1",
"mac_address" : "00:A0:C9:14:C8:29",
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes" : {
"$date" : "2021-02-13T02:02:00.000Z"
}
},
{
"_id" : ObjectId("5ff539269a10d529d88d19f4"),
"ip_address" : "192.168.0.7",
"mac_address" : "00:A0:C9:14:C8:30",
"url" : "https://people.richland.edu/dkirby/141macaddress.htm",
"datetimes" : {
"$date" : "2021-02-12T19:00:00.000Z"
}
}
]
}
If I understand your question correctly, you're trying to find the number of duplicate entries for the field url. You could iterate over all your documents and add each url to a Set. A Set only stores unique values, so a value that is already in the Set will not be added again. The difference between the number of documents and the number of entries in the Set is therefore the number of duplicate entries for the given field.
If you want to know which URLs are non-unique, you can check the return value of Set.add(Object), which tells you whether the Set changed, i.e. whether the value was not already present. If it returns false, you have found a duplicate.
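The Set-based approach described above can be sketched in plain Java as follows. This is only an illustration: it works on a list of url strings that you would collect from the cursor yourself, and UrlDuplicates, countDuplicates and findDuplicated are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the Set-based duplicate check.
class UrlDuplicates {

    /** Number of entries minus number of distinct entries. */
    static int countDuplicates(List<String> urls) {
        Set<String> distinct = new HashSet<>(urls);
        return urls.size() - distinct.size();
    }

    /** URLs that occur more than once, in the order they were first repeated. */
    static Set<String> findDuplicated(List<String> urls) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicated = new LinkedHashSet<>();
        for (String url : urls) {
            // add() returns false when the value was already in the set
            if (!seen.add(url)) {
                duplicated.add(url);
            }
        }
        return duplicated;
    }

    public static void main(String[] args) {
        // The url values of the three sample documents from the question.
        List<String> urls = Arrays.asList(
                "https://people.richland.edu/dkirby/141macaddress.htm",
                "https://people.richland.edu/dkirby/141macaddress.htm",
                "www.facebook.com");

        System.out.println("duplicate entries: " + countDuplicates(urls));
        System.out.println("duplicated urls:   " + findDuplicated(urls));
    }
}
```

Note that this counts the extra occurrences only (one for the sample data), whereas duplicates_count in the aggregation counts every document in the group (two for the same data). It also loads every url into memory, so for large collections the aggregation is probably the better fit.
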

How to use MongoDB Java driver to group by dayOfYear on ISODate attributes?

How to use mongodb java driver to compare dayOfYear of two ISODate objects?
Here are my docs
{"name": "hello", "count": 4, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
{"name": "hello", "count": 5, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
{"name": "goodbye", "count": 6, "TIMESTAMP": ISODate("2017-10-01T02:00:35.098Z")}
{"name": "foo", "count": 6, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
I want to compare the day in "TIMESTAMP" to perform some aggregation:
Bson match = Aggregates.match(eq("name", "hello"));
Bson group = Aggregates.group(new Document("name", "$name"), Accumulators.sum("total", 1));
collection.aggregate(Arrays.asList(match, group));
Now I am not sure how to do this aggregation for all the records that belong to a particular day.
So my expected result for "2017-10-02" is
[{"_id": {"name":"hello"}, "total": 9}, {"_id": {"name":"foo"}, "total": 6}]
Given the following documents:
{"name": "hello", "count": 4, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
{"name": "hello", "count": 5, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
{"name": "goodbye", "count": 6, "TIMESTAMP": ISODate("2017-10-01T02:00:35.098Z")}
{"name": "foo", "count": 6, "TIMESTAMP": ISODate("2017-10-02T02:00:35.098Z")}
The following command ...
db.getCollection('dayOfYear').aggregate([
// project dayOfYear as an attribute
{ $project: { name: 1, count: 1, dayOfYear: { $dayOfYear: "$TIMESTAMP" } } },
// match documents with dayOfYear=275
{ $match: { dayOfYear: 275 } },
// sum the count attribute for the selected day and name
{ $group : { _id : { name: "$name" }, total: { $sum: "$count" } } }
])
... will return:
{
"_id" : {
"name" : "foo"
},
"total" : 6
}
{
"_id" : {
"name" : "hello"
},
"total" : 9
}
I think this meets the requirement expressed in your OP.
Here's the same command expressed using the MongoDB Java driver:
MongoCollection<Document> collection = mongoClient.getDatabase("stackoverflow").getCollection("dayOfYear");
Document project = new Document("name", 1)
.append("count", 1)
.append("dayOfYear", new Document("$dayOfYear", "$TIMESTAMP"));
Document dayOfYearMatch = new Document("dayOfYear", 275);
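// group on { name: "$name" } so that _id has the same shape as in the shell command
Document grouping = new Document("_id", new Document("name", "$name")).append("total", new Document("$sum", "$count"));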
AggregateIterable<Document> documents = collection.aggregate(Arrays.asList(
new Document("$project", project),
new Document("$match", dayOfYearMatch),
new Document("$group", grouping)
));
for (Document document : documents) {
logger.info("{}", document.toJson());
}
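The value 275 used in the match is simply the day of the year for 2017-10-02. Rather than hard-coding it, it could be derived with java.time, as in the minimal sketch below. DayOfYearExample and dayOfYear are hypothetical names; the sketch assumes the date is given as an ISO yyyy-MM-dd string and that the stored timestamps are in UTC, which is what $dayOfYear uses when no timezone is specified.

```java
import java.time.LocalDate;

// Hypothetical helper: derives the value to match against $dayOfYear.
class DayOfYearExample {

    /** Day of the year (1 to 366) for an ISO date such as "2017-10-02". */
    static int dayOfYear(String isoDate) {
        return LocalDate.parse(isoDate).getDayOfYear();
    }

    public static void main(String[] args) {
        // The resulting int can be used in place of the literal 275,
        // e.g. new Document("dayOfYear", dayOfYear("2017-10-02")).
        System.out.println(dayOfYear("2017-10-02"));
    }
}
```

Keep in mind that matching on the day of the year alone selects that day in every year, so if the collection spans several years you would probably want to match on the year as well.
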
Update based on this comment:
One of the problems with project is that it only includes the fields you specify. The above input is just an example. I have 100 fields in my doc and I can't specify every single one, so if I use project I have to specify all 100 fields in addition to the "dayOfYear" field. – user1870400
You can use the following command to return the same output but without a $project stage:
db.getCollection('dayOfYear').aggregate([
// ignore any documents which do not match dayOfYear=275
{ "$redact": {
"$cond": {
if: { $eq: [ { $dayOfYear: "$TIMESTAMP" }, 275 ] },
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
// sum the count attribute for the selected day
{ $group : { _id : { name: "$name" }, total: { $sum: "$count" } } }
])
Here's that command in its 'Java form':
MongoCollection<Document> collection = mongoClient.getDatabase("stackoverflow").getCollection("dayOfYear");
Document redact = new Document("$cond", new Document("if", new Document("$eq", Arrays.asList(new Document("$dayOfYear", "$TIMESTAMP"), 275)))
.append("then", "$$KEEP")
.append("else", "$$PRUNE"));
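// group on { name: "$name" } so that _id has the same shape as in the shell command
Document grouping = new Document("_id", new Document("name", "$name")).append("total", new Document("$sum", "$count"));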
AggregateIterable<Document> documents = collection.aggregate(Arrays.asList(
new Document("$redact", redact),
new Document("$group", grouping)
));
for (Document document : documents) {
logger.info("{}", document.toJson());
}
Note: Depending on the size of your collection, your non-functional requirements, etc., you may want to consider the performance of these solutions and either (a) add a $match stage before you start projecting/redacting or (b) extract dayOfYear into its own attribute so that you can avoid this complexity entirely.

document missing exception while updating an index in elasticsearch via java api

I am trying to update a value in my index via the Java API using UpdateRequest, which accepts three arguments:
Index
document
id
Question: I know what my index name is, but I am not sure what values should be passed for the document and id arguments.
SAMPLE DATA
{
"took": 2,
"timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 },
"hits": {
"total": 1,
"max_score": 0.94064164,
"hits": [
{
"_index": "ticketdump",
"_type": "event",
"_id": "AVefK2vFmf0chKzzBkzy",
"_score": 0.94064164,
"_source": {
"clientversion": "123465",
"queue": "test,test",
"vertical": "test",
"troubleshooting": "test",
"reason": "test",
"status": "test",
"ticketversion": "1132465",
"apuid": 1,
"golive": "2014-07-14",
"clientname": "test",
"message": "test",
"product": "test",
"clientid": 1,
"createddatetime": "2016-05-03 09:43:48",
"area": "test",
"developmentfix": "test",
"actiontaken": "test",
"categoryname": "test",
"parentcategory": "test",
"problemdef": "test",
"ticketid": 1
}
}
]
}
}
I tried to pass the _source object, but it gave a document missing error. Maybe I am missing the concept?
JAVA CODE
UpdateRequest updateRequest = new UpdateRequest(
"ticketdump",
js.getJSONObject("hits")
.getJSONArray("hits")
.getJSONObject(0)
.getJSONObject("_source")
.toString(),
"1"
).script(new Script("ctx._source.message = \"bhavik\""));
client.update(updateRequest).get();
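Actually, UpdateRequest accepts these three parameters:
Index
Type
Id
From the search response in your question you can see that:
Index = ticketdump
Type = event
Id = AVefK2vFmf0chKzzBkzy
So the second argument should be the type "event" rather than the serialized _source, and the third should be the hit's _id rather than "1". With the wrong type and id, no matching document is found, hence the document missing error.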

Elasticsearch query doesn't produce expected result

I'm having trouble creating a query that should search for any documents with a certain search term in the fields title and text, and should also match a state field against zero or more values, of which at least one must match.
Given the following query:
"bool" : {
"must" : {
"multi_match" : {
"query" : "test",
"fields" : [ "title", "text" ]
}
},
"should" : {
"terms" : {
"state" : [ "NEW" ]
}
},
"minimum_should_match" : "1"
}
Shouldn't the following document be returned as a result?
{
"_shards": {
"failed": 0,
"successful": 5,
"total": 5
},
"hits": {
"hits": [
{
"_id": "JXnEkYFDQp2feATMzp2LTA",
"_index": "tips",
"_score": 1.0,
"_source": {
"state": "NEW",
"text": "This is a test",
"title": "Test"
},
"_type": "tip"
}
],
"max_score": 1.0,
"total": 1
},
"timed_out": false,
"took": 1
}
In my test this is not the case. What am I doing wrong?
The following is the Java code that produces the query above.
SearchRequestBuilder builder = client.prepareSearch("tips").setTypes("tip");
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
if(searchTermIsNotEmpty(searchTerm)){
MultiMatchQueryBuilder qb = QueryBuilders.multiMatchQuery(
searchTerm,
"title", "text"
);
boolQuery.must(qb);
}
if(filters.size() > 0){
boolQuery.should(QueryBuilders.termsQuery("state",filters));
boolQuery.minimumNumberShouldMatch(1);
}
if(boolQuery.hasClauses()){
builder.setQuery(boolQuery);
}
logger.info(boolQuery.toString());
SearchResponse result = builder.execute().actionGet();
return result.toString();
Any help on this is greatly appreciated!
It seems I found the issue: for some reason the query returned nothing when I used the filter enum in its original form. I had to convert the enum to a string and lowercase it.
I then added the following query:
boolQuery.must(QueryBuilders.termsQuery("state", getLowerCaseEnumCollection(filters)).minimumMatch(1));
I'm new to Elasticsearch, so I don't know if this is a bug or a feature. I'm just glad I figured it out.
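This is most likely a feature rather than a bug. With the default mapping, a string field is analyzed at index time and the standard analyzer lowercases its tokens, so "NEW" is indexed as "new". A terms query does not analyze its input, so it has to be given the lowercase form. The getLowerCaseEnumCollection helper is not shown in the answer; the following is a minimal sketch of what such a helper might look like, with StateFilters, State and toLowerCaseNames as hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the helper mentioned in the answer.
class StateFilters {

    // Hypothetical enum; the real state values are not shown in the question.
    enum State { NEW, IN_PROGRESS, DONE }

    /** Converts enum constants to the lowercase names a terms query expects. */
    static List<String> toLowerCaseNames(Collection<? extends Enum<?>> values) {
        List<String> names = new ArrayList<>();
        for (Enum<?> value : values) {
            // Locale.ROOT avoids locale-specific case mappings
            names.add(value.name().toLowerCase(Locale.ROOT));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(toLowerCaseNames(Arrays.asList(State.NEW, State.IN_PROGRESS)));
    }
}
```

An alternative that avoids the conversion would be to map the state field as not analyzed, so that the stored terms keep their original case.
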
