Data structure to store HashMap in Druid - java

I am a newbie with Druid. My problem is how to store and query a HashMap in Druid, using Java to interact with it.
I have a network table as follows:
Network | f1 | f2 | f3 | ... | fn
value   |  1 |  3 |  2 | ... |  2
Additionally, I have a range-time table:
time          | impression
2016-08-10-00 | 1000
2016-08-10-00 | 3000
2016-08-10-00 | 4000
2016-08-10-00 | 2000
2016-08-10-00 | 8000
In Druid, can I store the range-time table as a HashMap and query both of the tables above with a statement like:
Filter f1 = 1 and f2 = 1 and range-time between [t1, t2]
Can anyone help me? Thanks so much.

@VanThaoNguye,
Yes, you can store the hashmaps in Druid, and you can query them with bound filters.
You can read more about bound filters here: http://druid.io/docs/latest/querying/filters.html#bound-filter
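For reference, here is a minimal sketch (an assumption, not code from the question) of what such a native Druid query could look like, posted to the broker with the JDK HTTP client. The dataSource, dimension, and metric names are guesses based on the tables above. Note that in native queries the [t1, t2] time range is usually expressed via "intervals", while bound filters such as {"type": "bound", "dimension": "f1", "lower": "1", "upper": "3", "ordering": "numeric"} cover ranges over ordered dimensions.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DruidFilterSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical dataSource/dimension/metric names; adjust to your schema.
        String query = "{"
                + "\"queryType\": \"timeseries\","
                + "\"dataSource\": \"network\","
                + "\"granularity\": \"all\","
                // the [t1, t2] range goes here as an ISO-8601 interval
                + "\"intervals\": [\"2016-08-10T00:00:00/2016-08-11T00:00:00\"],"
                + "\"filter\": {\"type\": \"and\", \"fields\": ["
                + "  {\"type\": \"selector\", \"dimension\": \"f1\", \"value\": \"1\"},"
                + "  {\"type\": \"selector\", \"dimension\": \"f2\", \"value\": \"1\"}"
                + "]},"
                + "\"aggregations\": [{\"type\": \"longSum\", \"name\": \"impressions\", \"fieldName\": \"impression\"}]"
                + "}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8082/druid/v2"))  // broker endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}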

Related

Activiti HistoricProcessInstanceQuery returned with missing processVariables

I am trying to query HistoricProcessInstances from the Activiti historyService, including the processVariables. But some of the processes have missing variables in the returned list. I monitored the database to see the SQL query that Activiti had created, and it turned out the query joins 3 tables together and can only return 20 000 records. I have approximately 550 processes with 37 processVariables each, so that's going to be 20 350 records.
In the monitored SQL query there is a rnk (rank) assigned to each line in the result, and it is always between 1 and 20 000.
...from ACT_HI_PROCINST RES
left outer join ACT_HI_VARINST VAR ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null
inner join ACT_HI_VARINST A0 on RES.PROC_INST_ID_ = A0.PROC_INST_ID_
WHERE RES.END_TIME_ is not NULL and
A0.NAME_ = 'processOwner' and
A0.VAR_TYPE_ = 'string' and
A0.TEXT_ = 'user123'
) RES
) SUB WHERE SUB.rnk >= 1 AND
SUB.rnk < 20001 and
Is there any possible solution to increase this threshold, or to create a HistoricProcessInstanceQuery that includes only specific processVariables?
My code snippet for the query:
processHistories = historyService.createHistoricProcessInstanceQuery()
.processDefinitionKey(processKey).variableValueEquals(VariableNames.processOwner, username)
.includeProcessVariables().finished().orderByProcessInstanceStartTime().desc().list();
You can use a native query via HistoryService.createNativeHistoricProcessInstanceQuery and
enter your SQL (copy it from the actual historic process instance query, without the rank WHERE clause).
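As a rough sketch (untested; replace the SQL with the exact statement you captured, minus the rank wrapper), the native query could look like this:

// Returns the historic process instances matching the hand-written SQL.
// Note: a native query returns the instances only; process variables would
// still have to be loaded separately if you need them.
List<HistoricProcessInstance> processHistories = historyService
        .createNativeHistoricProcessInstanceQuery()
        .sql("SELECT DISTINCT RES.* FROM ACT_HI_PROCINST RES "
                + "INNER JOIN ACT_HI_VARINST A0 ON RES.PROC_INST_ID_ = A0.PROC_INST_ID_ "
                + "WHERE RES.END_TIME_ IS NOT NULL "
                + "AND A0.NAME_ = 'processOwner' "
                + "AND A0.VAR_TYPE_ = 'string' "
                + "AND A0.TEXT_ = 'user123' "
                + "ORDER BY RES.START_TIME_ DESC")
        .list();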
Likely this is more a restriction imposed by your database than by Activiti/Flowable.

Hazelcast SQL interface slow performance HZ 4.2.2 vs HZ 5.0.2

Situation:
- We have a product with approx. 30 attribute values (String, Enum, Double).
- We have an IMap with indexes for all attributes: IndexType.HASH for string values and IndexType.SORTED for double values (900 MB together); see the sketch after this list.
- We have 300k products in the map (approx. 500 MB).
- We use a local data grid with one member.
- JVM config: -Xms6G -Xmx8G
- For HZ 5 we enabled the Jet engine: config.getJetConfig().setEnabled(true);
- We use Java AdoptOpenJDK 11.0.8.
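A minimal sketch of how indexes like these are typically registered (the attribute names here are placeholders, not the original code):

import com.hazelcast.config.IndexConfig;
import com.hazelcast.config.IndexType;
import com.hazelcast.map.IMap;

IMap<Long, Product> productMap = hazelcastInstance.getMap("ProductScreenerRepositoryProductMap");
productMap.addIndex(new IndexConfig(IndexType.HASH, "stringValue1"));    // one HASH index per String/Enum attribute
productMap.addIndex(new IndexConfig(IndexType.SORTED, "doubleValue1"));  // SORTED indexes for range-filtered doubles
productMap.addIndex(new IndexConfig(IndexType.SORTED, "doubleValue2"));
productMap.addIndex(new IndexConfig(IndexType.SORTED, "doubleValue3"));  // used by ORDER BY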
When invoking a SQL query with pagination in HZ 4 we get a response in approx. 20-50 ms, but for the same query in Hazelcast 5 we get results in 2000-2500 ms:
...ORDER BY param1 ASC LIMIT 20 OFFSET 0...
SqlResult sqlRows = hazelcastInstance.getSql().execute(sqlBuilder.toString());
When we tried to use predicates on the same map, in both HZ 4 and HZ 5 we got the same result: about 2000-2500 ms to get a predicated page.
PagingPredicate<Long, Product> pagingPredicate = Predicates.pagingPredicate(predicate, ProductComparatorFactory.getEntryComparator(sortName), max);
pagingPredicate.setPage(from / max);
// get final list of products
List<Product> selectedPageA = new ArrayList<>(productMap.getAll(productMap.keySet(pagingPredicate)).values());
For HZ 5 we added the mapping:
hazelcastInstance.getSql().execute("CREATE MAPPING \"ProductScreenerRepositoryProductMap\" "
    + "EXTERNAL NAME \"ProductScreenerRepositoryProductMap\" "
    + "TYPE IMap "
    + "OPTIONS ("
    + "  'keyFormat' = 'java',"
    + "  'keyJavaClass' = 'java.lang.Long',"
    + "  'valueFormat' = 'java',"
    + "  'valueJavaClass' = 'com.finmason.finriver.product.Product'"
    + ")");
The SQL used:
SELECT * FROM ProductScreenerRepositoryProductMap
WHERE doubleValue1 >= -0.9624378795139998
AND doubleValue1 <= 0.9727269574354098
AND doubleValue2 >= -0.9
AND doubleValue2 <= 0.9
ORDER BY doubleValue3 ASC LIMIT 20 OFFSET 0
And Product uses simple serialization.
Please upgrade to Hazelcast 5.1 (planned for February 23 right now).
It should be fixed with https://github.com/hazelcast/hazelcast/pull/20681
Actually this case will be sped up by 3 separate PRs in 5.1:
https://github.com/hazelcast/hazelcast/pull/20681 - this one makes your query use the index
https://github.com/hazelcast/hazelcast/pull/20402 - this one does less deserialization on the cluster side
https://github.com/hazelcast/hazelcast/pull/20398 - this one makes deserialization on the client side faster for multi-column queries
There are two cases not resolved in 5.1; they are described in https://github.com/hazelcast/hazelcast/pull/20796. It should not be a problem in your case, but if someone else sees this post, it may be theirs. I hope that fix will be delivered in 5.1.1.
If you have the possibility to upgrade to the full 5.1 release, then I strongly recommend you do it.

pagination using MySQL database and Java ResultSet

To implement pagination on a list, I need to do two queries:
Get the element count from the selected table using SELECT COUNT(*)...
Get a subset of the list using LIMIT and OFFSET in a query.
Is there any way to avoid this? Is there any metadata where this is already stored?
The function resultSet.getRow() retrieves the current row index, but to use it for a total I would need to run a query that returns all rows and then take a subset, which is expensive.
I want to send only one query with limit and offset and retrieve both the selected data and the total count.
Is this possible?
Thanks in advance,
Juan
I looked into this some more, and now I have new doubts.
When a query is launched with limits, we can add SQL_CALC_FOUND_ROWS to the SELECT clause as follows:
"SELECT SQL_CALC_FOUND_ROWS * FROM ... LIMIT 0,10"
Afterwards, I run the following query:
"SELECT FOUND_ROWS()"
I understand that the first query stores the count in an internal variable whose value is returned by the second query. The second query isn't a "select count(*) ..." query, so the "SELECT FOUND_ROWS()" query should be inexpensive.
Am I right?
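For what it's worth, here is a minimal JDBC sketch of that approach (table and column names are made up; both statements must run on the same connection, since FOUND_ROWS() is per-session):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FoundRowsSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password");
             Statement stmt = conn.createStatement()) {

            // page query: SQL_CALC_FOUND_ROWS asks MySQL to remember the row count without LIMIT
            try (ResultSet page = stmt.executeQuery(
                    "SELECT SQL_CALC_FOUND_ROWS * FROM my_table ORDER BY id LIMIT 0, 10")) {
                while (page.next()) {
                    // read the current page here
                }
            }

            // second, cheap query: fetch the remembered count
            try (ResultSet total = stmt.executeQuery("SELECT FOUND_ROWS()")) {
                if (total.next()) {
                    System.out.println("total rows ignoring LIMIT: " + total.getLong(1));
                }
            }
        }
    }
}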
Some tests that I made show the following:
-- first select count(*), then select with limits --
Test 1: 194 ms
out: {"total":94607,"list":["2 - 1397199600000","2 - 1397286000000","13 - 1398150000000","13 - 1398236400000","13 - 1398322800000","13 - 1398409200000","13 - 1398495600000","14 - 1398150000000","14 - 1398236400000","14 - 1398322800000"]}
-- the new way --
Test 2: 555 ms
out: {"total":94607,"list":["2 - 1397199600000","2 - 1397286000000","13 - 1398150000000","13 - 1398236400000","13 - 1398322800000","13 - 1398409200000","13 - 1398495600000","14 - 1398150000000","14 - 1398236400000","14 - 1398322800000"]}
Why don't the tests show the expected result? Are my assumptions wrong?
Thanks, regards
I have resolved the question.
The following link has the answer:
https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/

ORMLite groupByRaw and groupBy issue on android SQLite db

I have a SQLite table content with following columns:
-----------------------------------------------
|id|book_name|chapter_nr|verse_nr|word_nr|word|
-----------------------------------------------
the sql query
select count(*) from content where book_name = 'John'
group by book_name, chapter_nr
in DB Browser returns 21 rows (which is the count of chapters)
the equivalent with ORMLite android:
long count = getHelper().getWordDao().queryBuilder()
.groupByRaw("book_name, chapter_nr")
.where()
.eq("book_name", book_name)
.countOf();
returns 828 rows (which is the count of verse numbers)
as far as I know the above code is translated to:
select count(*) from content
where book_name = 'John'
group by book_name, chapter_nr
result of this in DB Browser:
   | count(*)
   ----------
 1 | 828
 2 | 430
 3 | 653
...
21 | 542
---------
21 rows returned from: select count(*)...
so it seems to me that ORMLite returns the first row of the query as the result of countOf().
I've searched Stack Overflow and Google a lot. I found this question (and, more interestingly, this answer):
You can also count the number of rows in a custom query by calling the countOf() method on the Where or QueryBuilder object.
// count the number of lines in this custom query
int numRows = dao.queryBuilder().where().eq("name", "Joe Smith").countOf();
this is (correct me if I'm wrong) exactly what I'm doing, but somehow I just get the wrong number of rows.
So... either I'm doing something wrong here or countOf() is not working the way it is supposed to.
Note: It's the same with groupBy instead of groupByRaw (according to ORMLite documentation joining groupBy's should work)
...
.groupBy("book_name")
.groupBy("chapter_nr")
.where(...)
.countOf()
EDIT: getWordDao returns from class Word:
@DatabaseTable(tableName = "content")
public class Word { ... }
returns 828 rows (which is the count of verse numbers)
This seems to be a limitation of the QueryBuilder.countOf() mechanism. It expects a single value and does not understand the addition of GROUP BY to the count query; you can tell that it doesn't because the method returns a single long.
If you want to extract the counts for each of the groups, it looks like you will need to do a raw query; check out the docs.
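For example, a raw query along these lines (a sketch using Dao.queryRaw(); GenericRawResults lives in com.j256.ormlite.dao) returns one row per group, each carrying its own count, and the number of rows iterated is the chapter count the original query was after:

GenericRawResults<String[]> rawResults = getHelper().getWordDao().queryRaw(
        "SELECT chapter_nr, COUNT(*) FROM content "
                + "WHERE book_name = ? GROUP BY book_name, chapter_nr",
        book_name);
for (String[] row : rawResults) {
    String chapterNr = row[0];
    long rowsInChapter = Long.parseLong(row[1]);  // count(*) for this chapter
    // use the per-chapter count here
}
rawResults.close();  // close the underlying cursor when done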

Need suggestions to clarify the concept of mongoDB to store and retrieve images

I am new to MongoDB. I was told to use MongoDB for my photo management web app. I am not able to understand MongoDB's basic concept: the documents.
What is a document in MongoDB?
j = { name : "mongo" };
t = { x : 3 };
On the MongoDB website they say that the above 2 lines are 2 documents.
But until now I thought .txt, .doc, .xls, etc. were documents. (This may be funny, but I really need to understand its concepts!)
How do you represent a txt file, for example example.txt, in MongoDB?
What is a collection?
A collection of documents is known as a "collection" in MongoDB.
How many collections can I create?
Are all documents shared across all collections?
Finally I come to my part: how shall I represent images in MongoDB?
With the help of tutorials I learned to store and retrieve images from MongoDB using Java!!
But without an understanding of MongoDB's concepts I cannot move further!
The blogs and articles about MongoDB are pretty interesting, but I am still not able to understand its basic concepts!
Can anyone strike my head with MongoDB!!??
Perhaps comparing MongoDB to SQL would help you ...
In SQL, queries work against tables, columns & rows in set-based operations. There are pre-defined schemas (and hopefully indexes) to aid the query processor (as well as the querier!).
SQL Table / Rows
id | Column1 | Column2
-----------------------
1 | aaaaa | Bill
2 | bbbbb | Sally
3 | ccccc | Kyle
SQL Query
SELECT * FROM Table1 WHERE Column1 = 'aaaaa' ORDER BY Column2 DESC
This query would return all the columns in the table named Table1 where the column named Column1 has a value of aaaaa; it then orders the results by the value of Column2 and returns them to the client in descending order.
MongoDB
In MongoDB there are no tables, columns or rows ... instead there are Collections (these are like tables) and Documents inside the Collections (like rows).
MongoDB Collection / Documents
{
"_id" : ObjectId("497ce96f395f2f052a494fd4"),
"attribute1" : "aaaaa",
"attribute2" : "Bill",
"randomAttribute" : "I am different"
}
{
"_id" : ObjectId("497ce96f395f2f052a494fd5"),
"attribute1" : "bbbbb",
"attribute2" : "Sally"
}
{
"_id" : ObjectId("497ce96f395f2f052a494fd6"),
"attribute1" : "ccccc",
"attribute2" : "Kyle"
}
However, there is no predefined "table structure" or "schema" like a SQL table. For example, you can see that the first document in this collection has an attribute called randomAttribute that none of the other documents have.
This is just fine; it won't affect our queries, but it does allow for some very powerful things ...
The data is stored in a format called BSON which is very close to the Javascript JSON standard. You can find out more at http://bson.org/
MongoDB Query
SELECT * FROM Table1 WHERE Column1 = 'aaaaa' ORDER BY Column2 DESC
How would we do this same thing in MongoDB's Shell?
> db.collection1.find({attribute1:"aaaaa"}).sort({attribute2:-1});
Perhaps you can already see how similar a MongoDB query really is to SQL (while appearing quite different.) I have some posts up on http://learnmongo.com which might help you as well.
MongoDB is a document database: http://en.wikipedia.org/wiki/Document-oriented_database
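On the original question of storing images: binary files (images, .txt, .doc, etc.) are usually stored with GridFS, which splits a file into chunk documents kept in a pair of collections. Here is a minimal sketch with the MongoDB Java sync driver, using made-up database and file names:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;
import org.bson.types.ObjectId;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class ImageStoreSketch {
    public static void main(String[] args) throws Exception {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("photoApp");      // made-up database name
            GridFSBucket bucket = GridFSBuckets.create(db, "images");

            // upload: the image is split into chunk documents behind the scenes
            ObjectId imageId;
            try (InputStream in = new FileInputStream("photo.jpg")) {
                imageId = bucket.uploadFromStream("photo.jpg", in);
            }

            // download it back by id
            try (OutputStream out = new FileOutputStream("photo-copy.jpg")) {
                bucket.downloadToStream(imageId, out);
            }
        }
    }
}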
