RegEx for extracting text from a file in NiFi - java

I have a JSON response like the one below, and I want to extract only the following text from the file using the ExtractText processor in NiFi. However, NiFi reports that my expression is not a valid Java expression.
JSON Response
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1234:;5678"
}
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1234:;5678"
},
"19" : {
"columnId" : 19,
"columnName" : "HelloWorld",
"value" : "Test 1:;34130"
},
"21" : {
"columnId" : 21,
"columnName" : "Testing",
"value" : "Test"
}
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1299:;6775"
},
"19" : {
"columnId" : 19,
"columnName" : "HelloWorld",
"value" : "Test 2.:;34147"
},
"21" : {
"columnId" : 21,
"columnName" : "Testing",
"value" : "Test"
}
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1299:;6775"
},
"19" : {
"columnId" : 19,
"columnName" : "HelloWorld",
"value" : "Test.:;34147"
},
"21" : {
"columnId" : 21,
"columnName" : "globalregions",
"value" : "Test"
}
I have tried this expression:
"17" : {(.*?)\}.
It's not working.
The expected result is:
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1234:;5678"
}
"17" : {
"columnId" : 17,
"columnName" : "id",
"value" : "1299:;6775"
}

Normally a JSON object should have unique keys, but in your JSON the key "17" appears several times in the same object.
However, the following regex should work for your JSON: "17"\s*:\s*\{[^}]*\}
You can try it here: https://regex101.com/r/8RiPHu/1/
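To see the suggested regex at work outside NiFi, here is a minimal plain-Java sketch. The class and method names (Block17Extractor, extractBlocks, compiles) are made up for illustration and are not part of NiFi. It also checks the expression from the question: in java.util.regex an unescaped { after a character is read as the start of a repetition quantifier, which is most likely why that expression was rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, for illustration only.
class Block17Extractor {
    // The regex from the answer above, written as a Java string literal.
    static final Pattern BLOCK_17 = Pattern.compile("\"17\"\\s*:\\s*\\{[^}]*\\}");

    /** Returns every "17" : { ... } block found in the text, in order. */
    static List<String> extractBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK_17.matcher(text);
        while (matcher.find()) {
            blocks.add(matcher.group());
        }
        return blocks;
    }

    /** Reports whether Java accepts the given expression as a regex. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "\"17\" : {\n\"columnId\" : 17,\n\"value\" : \"1234:;5678\"\n},\n"
                + "\"19\" : {\n\"columnId\" : 19,\n\"value\" : \"Test 1:;34130\"\n}\n"
                + "\"17\" : {\n\"columnId\" : 17,\n\"value\" : \"1299:;6775\"\n}";
        // Prints only the two "17" blocks and skips the "19" block.
        for (String block : extractBlocks(sample)) {
            System.out.println(block);
        }
        // The expression from the question, with its unescaped '{'.
        System.out.println(compiles("\"17\" : {(.*?)\\}."));
    }
}
```

Because [^}]* stops at the first closing brace, this only works while the "17" object contains no nested objects, which holds for the sample in the question.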

Related

How to query MongoDB data using Java?

I need to program an OLAP cube in Java, and I have a MongoDB database. I want to query my data using this request:
"select custkey from customer where region=afriqua"
The question is: how can I use Java to get my data from the "customer" collection, which is included in the global collection "cube"?
PS: I want to get the result of the query.
I'm using NetBeans IDE 8.2 for coding in Java, and my NoSQL database is MongoDB 3.11.0.
{ "_id" : ObjectId("5cdc6510af5e07bbde44cade"),
"LineOrder" : [
{ "CustKey" : 10, "SupKey" : 2, "PartKey" : 1360, "DateKey" : 200124, "Quantity" : 201578, "Tax" : 700 },
{ "Orderkey" : 20165, "Linenumber" : 5487, "Custkey" : 12, "Partkey" : 102, "Supkey" : 3, "Orderdate" : "May14,2016", "Shippriority" : 40, "Quantity" : 105, "Extebdedprice" : 1480620, "Ordertotalprice" : 11689695, "Discount" : 21, "Revenue" : 95981, "Supplycost" : 4, "Tax" : 198754, "Commdate" : 201647, "shipmode" : "mail" }
] }
{ "_id" : ObjectId("5cdc653eaf5e07bbde44cae3"),
"Date" : [
{ "DateKey" : 201948, "Date" : "April8,2019", "Dayofweek" : "Monday", "Month" : "April", "Year" : 2019, "Yearmonthnum" : 20194, "Yearmonth" : "Apr2019", "Daynuminweek" : 2, "Daynuminmonth" : 8, "Daynuminyear" : 98, "Monthnuminyear" : 4, "Weeknuminyear" : 15, "Lastdayinweekfl" : 2, "Lastdayinmonthfl" : 3, "Sellingseason" : "winter", "Holidayfl" : 1, "Weekdayfl" : 1 },
{ "DateKey" : 201965, "Date" : "May21,2019", "Dayofweek" : "Thusdey", "Month" : "May", "Year" : 2019, "Yearmonthnum" : 20195, "Yearmonth" : "May2019", "Daynuminweek" : 3, "Daynuminmonth" : 15, "Daynuminyear" : 101, "Monthnuminyear" : 5, "Weeknuminyear" : 21, "Lastdayinweekfl" : 2, "Lastdayinmonthfl" : 3, "Sellingseason" : "spring", "Holidayfl" : 2, "Weekdayfl" : 1 }
] }
{ "_id" : ObjectId("5cdc6550af5e07bbde44cae8"),
"customer" : [
{ "CustKey" : 5, "Name" : "Aleksender Bill", "Address" : "jFKRE3kiytrdf", "citys" : [ { "city" : "Vietnam xyz", "Nation" : "Vietnam", "Region" : "Asia" } ] },
{ "CustKey" : 10, "Name" : "Mohamed Dawed", "citys" : [ { "City" : "Draria", "Nation" : "Algeria", "Region" : "Afriqua" } ] },
{ "CustKey" : 12, "Name" : "George", "Nation" : "Canada" }
] }
{ "_id" : ObjectId("5cdc6563af5e07bbde44caed"),
"part" : [
{ "PartKey" : 1360, "Name" : "khaki chocolat", "Mfgr" : "mfgr#5", "size" : 31, "Color" : "medieum", " Categories" : [ { "Category" : "mfgr#56", "type" : "standar burnishe steel", "container" : "jumbo case" } ] },
{ "PartKey" : 1400, "Name" : "zzd nnn", "Mfgr" : "fgr#8", "size" : 10, "Categories" : [ { "Category" : "fgr#10", "type" : "xxxx", "container" : "jumbo" } ] },
{ "PartKey" : 102, "Color" : "edieum", "Categories" : [ { "Category" : "fgr#56", "type" : "yyy", "container" : "jj ccc" } ] }
] }
I tried this code, but it doesn't work:
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class Mongoconn {
    public static void main(String[] args) {
        // Connect Java to MongoDB
        MongoClient mongoClient = new MongoClient("localhost", 27017);
        System.out.println("server connection successfully done");
        // Choose the database
        DB dbs = mongoClient.getDB("ssb");
        System.out.println("connected to database:" + dbs.getName());
        // Specify the collection
        DBCollection coll = dbs.getCollection("cube");
        // Print every document in the collection
        DBCursor cursor = coll.find();
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }
        System.out.println("********************************************* ");
        // Select where region = afriqua
        BasicDBObject query = new BasicDBObject("Region", "Afriqua");
        DBCursor curs = coll.find(query);
        try {
            // Print each document matched by the filter
            while (curs.hasNext()) {
                DBObject obj = curs.next();
                System.out.println(obj.get("Region") + " => " + obj.get("info"));
            }
        } finally {
            curs.close();
        }
    }
}

mongodb update not working on nested subdocument

My MongoDB records are like those in the linked question "Updating nested array inside array mongodb"; a sample record is shown below. I want to update a field in the nested "parameters" array when some conditions are satisfied (_id : "04", operations._id : "100" and operations.parameters.pid : "012"). This update query updates the wrong nested record (operations.parameters.pid : '011'). Please help me see where I am going wrong:
{
"_id" : "04",
"name" : "test service 4",
"id" : "04",
"version" : "0.0.1",
"title" : "testing",
"description" : "test",
"protocol" : "test",
"operations" : [
{
"_id" : "99",
"oName" : "test op 52222222222",
"sid" : "04",
"name" : "test op 52222222222",
"oid" : "99",
"description" : "testing",
"returntype" : "test",
"parameters" : [
{
"oName" : "Param1",
"name" : "Param1",
"pid" : "011",
"type" : "582",
"description" : "testing",
"value" : "",
"version" : 1.0
},
{
"oName" : "Param2",
"name" : "Param2",
"pid" : "012",
"type" : "58222",
"description" : "testing",
"value" : "",
"version" : 2.0
}
]
},
{
"_id" : "100",
"oName" : "test op 909090",
"sid" : "05",
"name" : "test op 90909",
"oid" : "1009",
"description" : "testing",
"returntype" : "test",
"parameters" : [
{
"oName" : "Param1",
"name" : "Param1",
"pid" : "011",
"type" : "582",
"description" : "testing",
"value" : "",
"version" : 1.0
},
{
"oName" : "Param2",
"name" : "Param2",
"pid" : "012",
"type" : "58222",
"description" : "testing",
"value" : "",
"version" : 2.0
}
]
},
{
"_id" : "101",
"oName" : "test op 52222222222",
"sid" : "04",
"name" : "test op 52222222222",
"oid" : "99",
"description" : "testing",
"returntype" : "test",
"parameters" : [
{
"oName" : "Param1",
"name" : "Param1",
"pid" : "011",
"type" : "582",
"description" : "testing",
"value" : "",
"version" : 1.0
},
{
"oName" : "Param2",
"name" : "Param2",
"pid" : "012",
"type" : "58222",
"description" : "testing",
"value" : "",
"version" : 1.0
}
]
},
{
"_id" : "102",
"oName" : "test op 909090",
"sid" : "05",
"name" : "test op 90909",
"oid" : "1009",
"description" : "testing",
"returntype" : "test",
"parameters" : [
{
"oName" : "Param1",
"name" : "Param1",
"pid" : "011",
"type" : "582",
"description" : "testing",
"value" : "",
"version" : 1.0
},
{
"oName" : "Param2",
"name" : "Param2",
"pid" : "012",
"type" : "58222",
"description" : "testing",
"value" : "",
"version" : 2.0
}
]
}
]
}
My update query is as follows :
db.foo.update(
{ $and : [{'_id':'04'},
{'operations._id':'100' },
{'operations.parameters.pid': '012'}]},
{
"$set": {
"operations.1.parameters.$.dummy": "foo"
}
}
)
I am using MongoDB 3.6.2 and referred to https://docs.mongodb.com/master/reference/operator/update/positional-filtered/
Sample record from this link:
{
"_id" : 1.0,
"grades" : [
{
"type" : "quiz",
"questions" : [
10.0,
8.0,
5.0
]
},
{
"type" : "quiz",
"questions" : [
8.0,
9.0,
6.0
]
},
{
"type" : "hw",
"questions" : [
5.0,
4.0,
3.0
]
},
{
"type" : "exam",
"questions" : [
25.0,
10.0,
23.0,
0.0
]
}
]
}
Example from this link:
db.student3.update(
{},
{ $inc: { "grades.$[t].questions.$[score]": 2 } },
{ arrayFilters: [ { "t.type": "quiz" } , { "score": { $gte: 8 } } ], multi: true}
)
Error I got from Robo 3T:
cannot use the part (grades of grades.$[t].questions.$[score]) to traverse the element ({grades: [ { type: "quiz", questions: [ 10.0, 8.0, 5.0 ] }, { type: "quiz", questions: [ 8.0, 9.0, 6.0 ] }, { type: "hw", questions: [ 5.0, 4.0, 3.0 ] }, { type: "exam", questions: [ 25.0, 10.0, 23.0, 0.0 ] } ]})
Please help;
Regards
Kris
In your update operation
"$set": {
"operations.1.parameters.$.dummy": "foo"
}
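refers to the element at index 1 of operations (indexes are zero-based), which is the element with "_id" : "100". Within its parameters array, the positional $ operator stands only for the first array element matched by the query, and it is not supported for queries that traverse more than one array, so it does not point at the parameter with pid "012".
To update all matching elements of a nested array you need MongoDB 3.6 or later, using the filtered positional operator $[<identifier>] together with arrayFilters.
One possible way to do this in version 3.4 is to fetch the required sub-document and do the matching and updating on the application side.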
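The application-side approach mentioned above could look roughly like the sketch below. It is an illustration under assumptions, not driver code: the document is modelled with plain java.util maps and lists, the names NestedParameterUpdate and setOnMatchingParameters are invented for this example, and fetching the document from MongoDB and writing it back are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the document is modelled with plain maps and lists.
class NestedParameterUpdate {

    /**
     * Sets key=value on every parameter whose "pid" equals pid, but only inside
     * operations whose "_id" equals operationId. Returns the number of parameters changed.
     */
    @SuppressWarnings("unchecked")
    static int setOnMatchingParameters(Map<String, Object> service, String operationId,
                                       String pid, String key, Object value) {
        int changed = 0;
        List<Map<String, Object>> operations =
                (List<Map<String, Object>>) service.get("operations");
        for (Map<String, Object> operation : operations) {
            if (!operationId.equals(operation.get("_id"))) {
                continue; // not the operation we are looking for
            }
            List<Map<String, Object>> parameters =
                    (List<Map<String, Object>>) operation.get("parameters");
            for (Map<String, Object> parameter : parameters) {
                if (pid.equals(parameter.get("pid"))) {
                    parameter.put(key, value);
                    changed++;
                }
            }
        }
        return changed;
    }

    /** Builds a minimal operation with one parameter per given pid. */
    static Map<String, Object> operation(String id, String... pids) {
        Map<String, Object> operation = new LinkedHashMap<>();
        operation.put("_id", id);
        List<Map<String, Object>> parameters = new ArrayList<>();
        for (String pid : pids) {
            Map<String, Object> parameter = new LinkedHashMap<>();
            parameter.put("pid", pid);
            parameters.add(parameter);
        }
        operation.put("parameters", parameters);
        return operation;
    }

    public static void main(String[] args) {
        // A cut-down version of the sample record from the question.
        Map<String, Object> service = new LinkedHashMap<>();
        service.put("_id", "04");
        List<Map<String, Object>> operations = new ArrayList<>();
        operations.add(operation("99", "011", "012"));
        operations.add(operation("100", "011", "012"));
        service.put("operations", operations);

        int changed = setOnMatchingParameters(service, "100", "012", "dummy", "foo");
        System.out.println(changed + " parameter(s) updated: " + service);
    }
}
```

After the in-memory change, the modified operations array would still have to be written back to the collection, for example with a $set on the whole array.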

mongodb lookup function returns an empty array result

I have two mongodb collections employee and department as follows
employee collection:
{
"_id" : ObjectId("58c4a35ac2e604024321788e"),
"id_employee" : 19,
"employee_name" : "Lene Vestergaard Hau",
"employee_address" : "Allerton",
"hours" : 279,
"id_department" : 1,
"projects" : [
285,
453,
499,
804,
956
],
"Children" : [
{
"id_child" : 38,
"child_name" : "Caroline Herschel"
}
]
}
department collection:
{
"_id" : ObjectId("58c49c48c2e669d6555aa5a7"),
"id_department" : 1,
"department_name" : "dept1",
"projects" : [
{
"id_project" : 285,
"project_name" : "1project 285",
"duration" : "211"
},
{
"id_project" : 453,
"project_name" : "1project453",
"duration" : "214"
},
{
"id_project" : 499,
"project_name" : "1project499",
"duration" : "224"
},
{
"id_project" : 804,
"project_name" : "1project804",
"duration" : "217"
},
{
"id_project" : 956,
"project_name" : "1project956",
"duration" : "217"
}
]
}
I am trying to perform a lookup to get all the details about the projects for the employee. The Java code I am using:
coll.aggregate(Arrays.asList(
Aggregates.match(Filters.eq("id_employee",19)),
Aggregates.unwind("$projects"),
Aggregates.lookup("department", "projects", "projects.id_project", "lookupData")
)).forEach(printBlock);
}
The code above returns lookupData as an empty list. Can anyone help, please?

Elasticsearch aggregation always returns empty buckets [] (Elasticsearch version 2.4.1)

I have the following code:
final String index = ElasticSearchUtils.getIndexNameForExecution(queryId);
SearchRequestBuilder query = client.prepareSearch(index);
query.setTypes(indexType.toString());
query.addAggregation(terms("errors").field("code").size(NUMBER_OF_HITS).order(Terms.Order.count(false)));
int pageStart = getFrom(page) * size;
SearchResponse response = query.setFrom(pageStart).setSize(getPageSize(size)).execute().actionGet();
return response.toString();
and part of response is:
{
"took" : 78,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [ {
"_index" : "index_587e1e34e4b040c63c49137f",
"_type" : "ERROR",
"_id" : "AVmspgsKa7ZIkZu1p32G",
"_score" : 1.0,
"_source" : {
"agent_id" : "{8668b249-9443-e611-87c6-005056aa41d1}",
"_v" : "1",
"host" : "RHEL65-X86-DEMO",
"created_at" : "2017-01-17T13:37:58.496Z",
"qid" : "587e1e34e4b040c63c49137f",
"errors" : [ {
"code" : 769,
"module" : "FileHashing",
"function" : "FindFiles"
} ]
}
}, {
"_index" : "index_587e1e34e4b040c63c49137f",
"_type" : "ERROR",
"_id" : "AVmspgsKa7ZIkZu1p32H",
"_score" : 1.0,
"_source" : {
"agent_id" : "{7238f027-fbfc-47cf-85b0-c69838e26a2a}",
"_v" : "1",
"host" : "W8-X64-DEMO",
"created_at" : "2017-01-17T13:37:58.501Z",
"qid" : "587e1e34e4b040c63c49137f",
"errors" : [ {
"code" : 769,
"module" : "FileHashing",
"function" : "FindFiles"
} ]
}
} ]
},
"aggregations" : {
"errors" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
}
As you can see, buckets is empty ([]): the aggregation is not working, but no exception is thrown. This occurs with Elasticsearch 2.4.1; the same code works with Elasticsearch 1.4.1.
You simply need to fix this line so that the aggregation uses the full field path errors.code instead of code:
query.addAggregation(terms("errors").field("errors.code").size(NUMBER_OF_HITS).order(Terms.Order.count(false)));

How to get object from MongoDb using criteria api in java

I want to get the results where varifiedDate inside secondaryParents is not 0. How can I achieve this?
DB Structure:
{
"_id" : ObjectId("577a151859defb33c0f8bf4a"),
"_class" : "com.nv.tracker.db.model.User",
"parentId" : NumberLong(1000004),
"userId" : NumberLong(1000005),
"firstName" : "Pranav",
"lastName" : "Rathore",
"addr" : {
"address" : "",
"landmark" : "",
"city" : "",
"state" : "",
"country" : ""
},
"email" : "ashish.dubey#newvisionsoftware.in",
"mobile" : "7879066069",
"createdDate" : ISODate("2016-07-04T07:49:44.000Z"),
"activeStatus" : false,
"imagePath" : "http://172.20.0.210:8080/nvtracker/rest/resource/image/1000005",
"deviceId" : "78 79 066069",
"isSuperUser" : false,
"dob" : "1468348200000",
"checkAndriodOrIos" : 0,
"isLogin" : false,
"otp" : 0,
"batteryLevel" : 30,
"locationInterval" : 60,
"deviceSpeed" : 45,
"primaryMobile" : "9039101994",
"secondaryMobile1" : "9039101994",
"secondaryMobile2" : "",
"primaryEmail" : "ashish.dubey#newvisionsoftware.in",
"secondaryEmail1" : "abb#hh.con",
"secondaryEmail2" : "",
"isVerified" : false,
"currentBatteryLevel" : 0,
"currentSpeed" : 0,
"subscriptionExpDate" : NumberLong(1498132426000),
"enableNotification" : 1,
"enableTakeOffAlert" : 0,
"deleted" : false,
"hasExternalDevice" : false,
"secondaryParents" : {
"1000003" : {
"secondaryparentId" : NumberLong(1000003),
"varifiedDate" : NumberLong(14676286792586),
"addedDate" : NumberLong(1467628679286),
"secreteCode" : "18716"
},
"1000004" : {
"secondaryparentId" : NumberLong(1000004),
"varifiedDate" : NumberLong(0),
"addedDate" : NumberLong(1467628679286),
"secreteCode" : "18716"
}
}
}
I want to get the user where varifiedDate inside secondaryParents is not 0.
Please suggest how I can achieve this. Also, please point me to the best tutorial for 'MongoDB using Java' with the criteria API.
