I am trying to use the query cache for large queries in ArangoDB.
When I check whether the document cursor is cached or not, it reports that cached is true, but I see no improvement in query processing time.
However, running the same query from the ArangoDB web interface shows a large performance improvement due to caching.
Edit :
Java Driver Version: 2.7.4
ArangoDB Version: 2.8.7
My Query is:
FOR t IN MyStorage
  FILTER t.myDate > '2016-01-11' AND t.myDate < '2016-06-01'
    AND t.fraud != null AND t.fraud != ''
    AND t.currency == 'INR'
  RETURN { myID: t.myID, myDate: t.myDate, amount: t.amount, fraud: t.fraud }
We tested caching with the following test case and saw a performance improvement.
Can you post an example of your query?
public class ArangoDriverCacheTest {
private static final String COLLECTION_NAME = "unitTestCollection";
private static final String DATABASE_NAME = "unitTestDatabase";
private static ArangoConfigure configure;
private static ArangoDriver driver;
@BeforeClass
public static void setup() throws ArangoException {
configure = new ArangoConfigure();
configure.init();
driver = new ArangoDriver(configure);
// create test database
try {
driver.createDatabase(DATABASE_NAME);
} catch (final ArangoException e) {
}
driver.setDefaultDatabase(DATABASE_NAME);
// create test collection
try {
driver.createCollection(COLLECTION_NAME);
} catch (final ArangoException e) {
}
driver.truncateCollection(COLLECTION_NAME);
// create some test data
for (int i = 0; i < 1000000; i++) {
final TestEntity value = new TestEntity("user_" + (i % 10), "desc" + (i % 10), i);
driver.createDocument(COLLECTION_NAME, value);
}
}
@AfterClass
public static void shutdown() {
try {
driver.deleteDatabase(DATABASE_NAME);
} catch (final ArangoException e) {
}
configure.shutdown();
}
private AqlQueryOptions createAqlQueryOptions(
final Boolean count,
final Integer batchSize,
final Boolean fullCount,
final Boolean cache) {
return new AqlQueryOptions().setCount(count).setBatchSize(batchSize).setFullCount(fullCount).setCache(cache);
}
@Test
public void test_withoutCache() throws ArangoException {
// set cache mode off
final QueryCachePropertiesEntity properties = new QueryCachePropertiesEntity();
properties.setMode("off");
driver.setQueryCacheProperties(properties);
executeQuery(false);
}
@Test
public void test_withCache() throws ArangoException {
// set cache mode on
final QueryCachePropertiesEntity properties = new QueryCachePropertiesEntity();
properties.setMode("on");
driver.setQueryCacheProperties(properties);
// set caching to true for the query
executeQuery(true);
}
private void executeQuery(final boolean cache) throws ArangoException {
final AqlQueryOptions aqlQueryOptions = createAqlQueryOptions(true, 1000, null, cache);
final String query = "FOR t IN " + COLLECTION_NAME + " FILTER t.age >= #age SORT t.age RETURN t";
final Map<String, Object> bindVars = new MapBuilder().put("age", 90).get();
DocumentCursor<TestEntity> rs = driver.executeDocumentQuery(query, bindVars, aqlQueryOptions, TestEntity.class);
// first time, the query isn't cached
Assert.assertEquals(false, rs.isCached());
final long start = System.currentTimeMillis();
// query the cached value
rs = driver.executeDocumentQuery(query, bindVars, aqlQueryOptions, TestEntity.class);
Assert.assertEquals(cache, rs.isCached());
// load all results
rs.asEntityList();
final long time = System.currentTimeMillis() - start;
System.out.println(String.format("time with cache=%s: %sms", cache, time));
}
private static class TestEntity {
private final String user;
private final String desc;
private final Integer age;
public TestEntity(final String user, final String desc, final Integer age) {
super();
this.user = user;
this.desc = desc;
this.age = age;
}
}
}
Related
I have a query that works well on the database. However, when I try to map the results to a Java object using a RowMapper, I get an invalid column name error. I checked everything, but I don't understand why this error is happening.
My query:
SELECT TEMP.SUMALLTXN, SUM(TEMP.SUMCARD), SUM(TEMP.SUMERRORTXN), SUM(TEMP.SUMERRORTXNCARD)
FROM
(SELECT
SUM(COUNT(*)) OVER() AS SUMALLTXN,
COUNT(mdmtxn.BIN) OVER (PARTITION BY mdmtxn.BIN) AS SUMCARD,
SUM(case when mdmtxn.MDSTATUS NOT IN ('1','9', '60') then 1 else 0 end) AS SUMERRORTXN,
SUM(case when mdmtxn.MDSTATUS NOT IN ('1','9', '60') then 1 else 0 end) OVER (PARTITION BY mdmtxn.BIN) AS SUMERRORTXNCARD
FROM MDM59.MDMTRANSACTION2 mdmtxn WHERE
mdmtxn.CREATEDDATE < TO_CHAR(SYSDATE - INTERVAL ':initialMinuteParameterValue' MINUTE ,'YYYYMMDD HH24:MI:SS') AND
mdmtxn.CREATEDDATE > TO_CHAR(SYSDATE - INTERVAL ':intervalMinuteParameterValue' MINUTE ,'YYYYMMDD HH24:MI:SS')
GROUP BY mdmtxn.MDSTATUS, mdmtxn.BIN
) TEMP
GROUP BY TEMP.SUMALLTXN
My RowMapper:
@Component
public class TotalTransactionsReportRw implements RowMapper<TotalTransactionsReportDto> {
@Override
public TotalTransactionsReportDto mapRow(ResultSet rs, int rowNum) throws SQLException {
return TotalTransactionsReportDto.builder()
.totalNumbersOfTransactions(rs.getString("SUMALLTXN"))
.totalNumbersOfCard(rs.getString("SUMCARD"))
.totalNumbersOfErrorTransactions(rs.getString("SUMERRORTXN"))
.totalNumbersOfErrorCard(rs.getString("SUMERRORTXNCARD"))
.build();
}
private static class TotalTransactionsDetailRwHolder {
private static final TotalTransactionsReportRw INSTANCE = new TotalTransactionsReportRw();
}
public static TotalTransactionsReportRw getInstance() {
return TotalTransactionsReportRw.TotalTransactionsDetailRwHolder.INSTANCE;
}
}
My Dto:
@Value
@Builder
@Data
public class TotalTransactionsReportDto {
private String totalNumbersOfTransactions;
private String totalNumbersOfCard;
private String totalNumbersOfErrorTransactions;
private String totalNumbersOfErrorCard;
}
And in my tasklet class I created a list to get all data from rowmapper:
@Slf4j
@Component
@RequiredArgsConstructor
public class NotificationTasklet implements Tasklet {
private final PofPostOfficeServiceClient pofPostOfficeServiceClient;
private final SequenceSysGuid sequenceSysGuid;
private final BatchProps batchProps;
private JdbcTemplate jdbcTemplate;
private String notificationMailSql;
private String totalTransactionsSql;
private String endOfHtmlString = "</table></body></html>";
private String endOfTableString = "</table>";
private String jobName = "vpos-notification";
private String tdClose = "</td>";
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
List<VposNotificationBatchDto> notificationList = getNotificationList();
List<TotalTransactionsReportDto> totalTransactionsList = getTotalTransactionsList();
AlertMailDto alertMailDto = createAlertMailDto(notificationList,totalTransactionsList);
if (!(notificationList.isEmpty())) {
sendMail(alertMailDto);
}
return RepeatStatus.FINISHED;
}
List<TotalTransactionsReportDto> getTotalTransactionsList() {
return jdbcTemplate.query(
totalTransactionsSql,
new TotalTransactionsReportRw());
}
@Autowired
public void setTotalTransactionsSql(@Value("classpath:sql/vposnotification/select_total_transactions_data.sql")
Resource res) {
int intervalnext = batchProps.getJobProps()
.get(jobName).getAlertProps().getIntervalMinuteParameterValue();
String intervalMinutes = String.valueOf(intervalnext);
int initialMinuteParameterValue = batchProps.getJobProps()
.get(jobName).getAlertProps().getInitialMinuteParameterValue();
String initialMinutes = String.valueOf(initialMinuteParameterValue);
this.totalTransactionsSql = SqlUtils.readSql(res);
this.totalTransactionsSql = this.totalTransactionsSql.replace(":initialMinuteParameterValue", initialMinutes);
this.totalTransactionsSql = this.totalTransactionsSql.replace(":intervalMinuteParameterValue", intervalMinutes);
}
@Autowired
public void setJdbcTemplate(JdbcTemplate jdbcTemplate) {
this.jdbcTemplate = jdbcTemplate;
}
The problem is that your query doesn't actually have columns SUMCARD, SUMERRORTXN and SUMERRORTXNCARD. Although there are DBMSes that alias SUM columns with the name of the column that is summed, Oracle is not one of them. IIRC, Oracle aliases it as, for example, "SUM(SUMCARD)" or maybe "SUM(TEMP.SUMCARD)". However, that is an implementation detail you should not rely on in my opinion.
To get the names you want, you need to alias your SUM columns explicitly, e.g. SUM(TEMP.SUMCARD) AS SUMCARD, SUM(TEMP.SUMERRORTXN) AS SUMERRORTXN and SUM(TEMP.SUMERRORTXNCARD) AS SUMERRORTXNCARD.
I am trying to use DynamoDB on my local PC.
I was previously using MongoDB, and the performance of DynamoDB compared to it is very poor.
The save operation to a table takes a very long time, about 13 seconds for 100 records.
The records are pretty small; an example is below.
Here is my full example and the code I use to run it:
public class DynamoTry {
private AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
.withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("http://localhost:8000", "us-east-2"))
.build();
private DynamoDB dynamoDB = new DynamoDB(client);
private DynamoDBMapper mapper = new DynamoDBMapper(client);
public static void main(String[] args) {
DynamoTry dt = new DynamoTry();
dt.deleteTable();
dt.buildGrid();
dt.demoFill();
dt.scanTable();
}
public void buildGrid() {
System.out.println("Attempting to create table; please wait...");
String tableName = "Grid";
List<AttributeDefinition> attributeDefinitions = new ArrayList<AttributeDefinition>();
attributeDefinitions.add(new AttributeDefinition().withAttributeName("name").withAttributeType(ScalarAttributeType.S));
attributeDefinitions.add(new AttributeDefinition().withAttributeName("country").withAttributeType(ScalarAttributeType.S));
List<KeySchemaElement> keySchema = new ArrayList<KeySchemaElement>();
keySchema.add(new KeySchemaElement().withAttributeName("name").withKeyType(KeyType.HASH));
keySchema.add(new KeySchemaElement().withAttributeName("country").withKeyType(KeyType.RANGE));
CreateTableRequest request = new CreateTableRequest().withTableName(tableName).withKeySchema(keySchema)
.withAttributeDefinitions(attributeDefinitions).withProvisionedThroughput(
new ProvisionedThroughput().withReadCapacityUnits(500L).withWriteCapacityUnits(500L));
Table table = dynamoDB.createTable(request);
try {
table.waitForActive();
System.out.println("Success.");
} catch (InterruptedException e) {e.printStackTrace();}
}
public void demoFill() {
final List<GridPoint> gpl = new ArrayList<GridPoint>();
int count = 0;
while (count < 100) {
final String point = "point" + count;
gpl.add(makeGP(point, count, "continent", "country", new HashSet<Double>(Arrays.asList(22.23435, 37.89746))));
count++;
}
long startTime = System.nanoTime();
addBatch(gpl);
long endTime = System.nanoTime();
long duration = (endTime - startTime)/1000000;
System.out.println(duration + " [ms]");
}
public void addBatch(List<GridPoint> gpl) {
mapper.batchSave(gpl);
}
public GridPoint makeGP(String name, int sqNum, String continent, String country, HashSet<Double> cords) {
GridPoint item = new GridPoint();
item.setName(name);
item.setSqNum(sqNum);
item.setContinent(continent);
item.setCountry(country);
item.setCoordinates(cords);
return item;
}
public void scanTable() {
Map<String, AttributeValue> eav = new HashMap<String, AttributeValue>();
eav.put(":val", new AttributeValue().withN("0"));
DynamoDBScanExpression scanExpression = new DynamoDBScanExpression().withFilterExpression("sqNum >= :val").withExpressionAttributeValues(eav);
List<GridPoint> scanResult = mapper.scan(GridPoint.class, scanExpression);
for (GridPoint gp : scanResult) {
System.out.println(gp);
}
}
public void deleteTable() {
Table table = dynamoDB.getTable("Grid");
try {
System.out.println("Attempting to delete table 'Grid', please wait...");
table.delete();
table.waitForDelete();
System.out.print("Success.");
}
catch (Exception e) {
System.err.println("Unable to delete table: ");
System.err.println(e.getMessage());
}
}
}
Here is the code for the GridPoint class:
#DynamoDBTable(tableName = "Grid")
public class GridPoint {
private String name;
private int sqNum;
private String continent;
private String country;
private HashSet<Double> coordinates; // [longitude, latitude]
// Partition key
#DynamoDBHashKey(attributeName = "name")
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
#DynamoDBAttribute(attributeName = "sqNum")
public int getSqNum() {
return sqNum;
}
public void setSqNum(int sqNum) {
this.sqNum = sqNum;
}
#DynamoDBAttribute(attributeName = "continent")
public String getContinent() {
return continent;
}
public void setContinent(String continent) {
this.continent = continent;
}
#DynamoDBAttribute(attributeName = "country")
public String getCountry() {
return country;
}
public void setCountry(String country) {
this.country = country;
}
#DynamoDBAttribute(attributeName = "coordinates")
public HashSet<Double> getCoordinates() {
return coordinates;
}
public void setCoordinates(HashSet<Double> coordinates) {
this.coordinates = coordinates;
}
@Override
public String toString() {
return "GP {name = " + name + ", sqNum = " + sqNum + ", continent = " + continent + ", country = " + country
+ ", coordinates = " + coordinates.toString() + "}";
}
}
Why is it so slow? Is there any way to speed up the write process?
In MongoDB the same operations take less than a second.
When I ran it with about 3000 points it took several minutes to finish, which does not seem reasonable.
Is it possible to make the batch save process parallel? Would that speed things up?
I also tried setting the ProvisionedThroughput parameter to a higher value, but that did not help.
I am lost; any help would be appreciated, thank you.
It is slow because it is not DynamoDB. There is no local DynamoDB!
DynamoDB is a managed service provided by AWS, and it is really fast (milliseconds to the first bytes), highly scalable and durable. It is a really good product with a lot of performance for a small amount of money, but it is a managed service. It only works in the AWS environment. There is no way for you or anyone else to get a copy and install DynamoDB on Azure, GCP or even your local environment.
What you are using is a facade, probably developed by the AWS team to help developers test their applications. There are other DynamoDB facades not developed by the AWS team, but every one of them just implements a protocol that accepts all the API calls of the original product. As a facade, its objective is just to provide an endpoint that can receive your calls and respond like the original product. If you make a call that the original DynamoDB would answer with an OK, the facade will respond with an OK. If you make a call that the original DynamoDB would answer with a failure, the facade will send you a failure.
There is no commitment to performance or even data durability. If you need a durable database with good performance, you should go with MongoDB. DynamoDB was created to be used in the AWS environment only.
Again: there is no such thing as a local DynamoDB.
DynamoDB has predefined limits. It is possible that you are running into these limits. Consider increasing WriteCapacityUnits for the table to increase performance. You may also want to increase ReadCapacityUnits for the scan.
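If the table already exists, one way to raise its provisioned throughput with the same Java SDK v1 client used in the question is an UpdateTableRequest. This is just a sketch of the answer's suggestion; the capacity values are placeholders to experiment with:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;
import com.amazonaws.services.dynamodbv2.model.UpdateTableRequest;

public class ThroughputTuning {

    // Raises read/write capacity on the "Grid" table; the values are illustrative only.
    public static void raiseThroughput(AmazonDynamoDB client) {
        UpdateTableRequest request = new UpdateTableRequest()
                .withTableName("Grid")
                .withProvisionedThroughput(new ProvisionedThroughput()
                        .withReadCapacityUnits(1000L)
                        .withWriteCapacityUnits(1000L));
        client.updateTable(request);
    }
}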
I am writing an integration test for Elasticsearch 5.3.
public class ProtectedWordsIndexTests extends ESIntegTestCase {
private final WordDelimiterActionListener wordsListener =
WordDelimiterActionListener.getInstance();
private final static String INDEX_NAME = "protected_words";
private final static String TYPE_NAME = "word";
private final static String FILTER_NAME = "my_word_delimiter";
@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return Collections.singleton(WordDelimiterPlugin.class);
}
@Override
protected Settings nodeSettings(int nodeOrdinal) {
return builder()
.put("plugin.types", TYPE_NAME)
.put("plugin.dynamic_word_delimiter.refresh_interval", "500ms")
.put(super.nodeSettings(nodeOrdinal))
.build();
}
public void testAddWordToIndex() throws Exception {
Settings indexSettings = builder()
.put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT)
.put("index.analysis.filter.my_word_delimiter.type", "dynamic_word_delimiter")
.build();
TokenFilterFactory filterFactory = filterFactory(indexSettings, FILTER_NAME);
createIndex(INDEX_NAME);
ensureGreen();
client().prepareIndex(INDEX_NAME, TYPE_NAME, "1")
.setSource("word", "1tb")
.execute();
Thread.sleep(TimeValue.timeValueSeconds(1).getMillis());
Set<String> protectedWords = wordsListener.getProtectedWords();
assertTrue(protectedWords.size() == 1);
}
}
When I run testAddWordToIndex() I get the following error:
"java.lang.IllegalArgumentException: unknown setting
[plugin.dynamic_word_delimiter.refresh_interval] please check that any
required plugins are installed, or check the breaking changes
documentation for removed settings"
If I remove the following line (and instead wait longer than the default refresh interval), the test passes, so I simply cannot override this setting.
.put("plugin.dynamic_word_delimiter.refresh_interval", "500ms")
The default refresh interval is declared here:
public class WordDelimiterRunnable extends AbstractRunnable {
public static final TimeValue REFRESH_INTERVAL = TimeValue.timeValueSeconds(20);
public static final String INDEX_NAME = "protected_words";
public static final String INDEX_TYPE = "word";
public static final int RESULTS_SIZE = 10000;
private volatile boolean running;
private final Client client;
private final String index;
private final long interval;
private final String type;
public WordDelimiterRunnable(Client client, Settings settings) {
this.client = client;
this.index = settings.get("plugin.dynamic_word_delimiter.protected_words_index", INDEX_NAME);
this.type = settings.get("plugin.dynamic_word_delimiter.protected_words_type", INDEX_TYPE);
this.interval = settings.getAsTime("plugin.dynamic_word_delimiter.refresh_interval", REFRESH_INTERVAL).getMillis();
}
// more code here
}
You need to register the setting using the SettingsModule#registerSettings(Setting) method, as explained here:
https://www.elastic.co/guide/en/elasticsearch/reference/5.x/breaking_50_settings_changes.html#breaking_50_settings_changes
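In the 5.x plugin API, the usual way for a custom setting to be picked up by the SettingsModule is to declare it as a Setting constant and return it from the plugin's getSettings() override. A minimal sketch, assuming it is added to the WordDelimiterPlugin used in the test above (the constant name is illustrative, and the timeSetting factory and Property flags follow the 5.x Setting API):
import java.util.Collections;
import java.util.List;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.plugins.Plugin;

public class WordDelimiterPlugin extends Plugin {

    // Declaring the setting lets the node accept it instead of rejecting it as unknown.
    public static final Setting<TimeValue> REFRESH_INTERVAL_SETTING =
            Setting.timeSetting("plugin.dynamic_word_delimiter.refresh_interval",
                    TimeValue.timeValueSeconds(20), Setting.Property.NodeScope);

    @Override
    public List<Setting<?>> getSettings() {
        return Collections.singletonList(REFRESH_INTERVAL_SETTING);
    }
}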
This seems similar to the previously answered question Java 8 stream group by min and max.
However, it is not!
I have a table with three columns:
LogId, StartTime, EndTime
We have multiple entries with the same LogId but different StartTime and EndTime values.
The problem is:
All the columns I have are Strings, so how do I calculate the min or max of a column based on its values?
I need to find min(StartTime) and max(EndTime) grouped by LogId, in a single stream.
How can this be achieved with minimal code and maximal efficiency using streams in Java 8?
Attached is the sample class:
public class Log {
private static final String inputFileName = "D:\\path\\to\\Log.csv";
private static final String outputFileName = "D:\\path\\to\\Output\\Log.csv";
private static List<Log> logList = null;
private static Map<String, List<Log>> groupByLogId = new HashMap<String, List<Log>>();
private String log_Id;
private String startTime;
private String endTime;
public static Map<String, List<Log>> createLogMap() throws IOException {
Function<String, Log> mapToLog = (line) -> {
String[] p = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
Log log = new Log(p[0],p[1],
p[2]);
return log;
};
InputStream is = null;
BufferedReader br = null;
is = new FileInputStream(new File(inputFileName));
br = new BufferedReader(new InputStreamReader(is));
logList = br.lines()
.skip(1)
.map(mapToLog)
.collect(Collectors.toList());
logList.stream().forEach(System.out::println);
groupByLogId = logList.stream()
.collect(Collectors.groupingBy(Log::getLog_Id));
for (Entry<String, List<Log>> entryForLog : groupByLogId.entrySet()) {
System.out.println(" Entity Id " + entryForLog.getKey()
+ " | Value : " + entryForLog.getValue());
}
br.close();
return groupByLogId;
}
public String getLog_Id() {
return log_Id;
}
public void setLog_Id(String log_Id) {
this.log_Id = log_Id;
}
public String getStartTime() {
return startTime;
}
public void setStartTime(String startTime) {
this.startTime = startTime;
}
public String getEndTime() {
return endTime;
}
public void setEndTime(String endTime) {
this.endTime = endTime;
}
public static List<Log> getLoglist() {
return logList;
}
public Log(String log_Id, String startTime, String endTime) {
super();
this.log_Id = log_Id;
this.startTime = startTime;
this.endTime = endTime;
}
@Override
public String toString() {
return (new StringBuffer()
.append(log_Id).append(",")
.append(startTime).append(",")
.append(endTime)
).toString();
}
}
Any help is much appreciated.
Expected Output:
LogId: logid,min(StartTime),max(EndTime)
Of course, storing time as a string is not a very good idea. It would be better to use something like LocalDateTime instead. In this answer I assume that your string timestamp representations are comparable, so I can use date1.compareTo(date2).
Also, I strongly recommend removing the setters and making the Log objects immutable. They don't add any value and only make your program harder to debug when you accidentally change existing objects.
Back to your question, add a merger method like this:
class Log {
...
Log merge(Log other) {
if(!other.getLog_Id().equals(this.getLog_Id())) {
throw new IllegalStateException();
}
String start = this.getStartTime().compareTo(other.getStartTime()) < 0 ?
this.getStartTime() : other.getStartTime();
String end = this.getEndTime().compareTo(other.getEndTime()) > 0 ?
this.getEndTime() : other.getEndTime();
return new Log(this.getLog_Id(), start, end);
}
}
Now you can simply use the toMap() collector, supplying your merge function:
streamOfLogs.collect(
Collectors.toMap(Log::getLog_Id, Function.identity(), Log::merge));
This way, when two log entries with the same Log_Id appear, the merge method is called on the pair, producing the merged log entry.
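For completeness, here is a short usage sketch against the logList from the question; the variable name is just illustrative. Each value in the resulting map holds the earliest StartTime and latest EndTime for its LogId:
// assumes: import java.util.Map; import java.util.function.Function; import java.util.stream.Collectors;
Map<String, Log> minMaxByLogId = Log.getLoglist().stream()
        .collect(Collectors.toMap(Log::getLog_Id, Function.identity(), Log::merge));

// Log.toString() then prints "logId,min(StartTime),max(EndTime)" per entry
minMaxByLogId.values().forEach(System.out::println);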
public class Database {
private String ric;
private String volume;
private String _url;
private String _userId;
private String _password;
private String _dbLib;
private String _dbFile;
private Connection _conn;
private PreparedStatement _statement;
public Database(LSE item) {
ric = item.get_ric();
volume = item.get_volume();
}
public void writeToDb() throws SQLException{
//setString
}
}
I have an ItemDispatcher class:
public class ItemDispatcher implements Runnable {
private LSE lse;
public ItemDispatcher(LSE lseItem) {
this.lse= lseItem;
}
@Override
public void run() {
try {
new Database(lse).writeToDb();
} catch (SQLException e) {
e.printStackTrace();
}
}
}
The run() method in ItemDispatcher runs repeatedly. I want to create the database connection and PreparedStatement in the Database class, but doing this in the Database constructor would create the connection many times over. How can I change my design so the connection is created just once, not over and over again on every execution of run()? I am trying to keep this inside the Database class and not in any other class.
Within the scope of ItemDispatcher, declare a private field of type Database. You might initialize it in a separate method (best) or in the constructor (might be OK). Use that field instead of creating a new instance in the run() method, as in the sketch below.
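A minimal sketch of that approach, assuming Database opens its connection and prepares its statement once when it is constructed (or in a dedicated init method):
import java.sql.SQLException;

public class ItemDispatcher implements Runnable {

    private final LSE lse;
    private final Database database;   // created once, reused on every run()

    public ItemDispatcher(LSE lseItem) {
        this.lse = lseItem;
        this.database = new Database(lseItem);  // connection/statement set up here, once
    }

    @Override
    public void run() {
        try {
            database.writeToDb();   // reuses the existing connection
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}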
Do it in a static block in class Database
static {
}
But this implies that the Connection and Statement will be static and therefore shared by all instances of Database.
Just as an example from another SO post:
public static final Map<String, String> initials = new HashMap<String, String>();
static {
initials.put("AEN", "Alfred E. Newman");
// etc.
}
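Applied to the Database class from the question, a static initializer might look like the sketch below; the JDBC URL, credentials and SQL are placeholders, not values from your setup:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Database {

    // Shared by all Database instances, created exactly once when the class is loaded
    private static Connection conn;
    private static PreparedStatement statement;

    static {
        try {
            // placeholder connection details and SQL -- replace with your own
            conn = DriverManager.getConnection("jdbc:yourdb://host:1234/MYDB", "user", "password");
            statement = conn.prepareStatement("INSERT INTO MYLIB.MYFILE (RIC, VOLUME) VALUES (?, ?)");
        } catch (SQLException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
}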
Use the Singleton pattern. This will allow you to have only one instance of the Database connection.
Taking your code as an example, it would look like this:
public class Database {
private String ric;
private String volume;
private String _url;
private String _userId;
private String _password;
private String _dbLib;
private String _dbFile;
private Connection _conn;
private PreparedStatement _statement;
private static Database INSTANCE;
private Database(LSE item) {
ric = item.get_ric();
volume = item.get_volume();
}
public static final Database getInstance(LSE item) {
if (INSTANCE == null) {
INSTANCE = new Database(item);
}
return INSTANCE;
}
public void writeToDb() throws SQLException{
//setString
}
}
If your application will be using threads (concurrency), I suggest you also prepare your singleton for those situations; see this question.
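For the multi-threaded case, one common option is double-checked locking on a volatile field, sketched below against the same class; it still assumes the LSE passed on the first call is the one that should configure the single instance:
public class Database {

    private static volatile Database instance;

    private final String ric;
    private final String volume;

    private Database(LSE item) {
        this.ric = item.get_ric();
        this.volume = item.get_volume();
    }

    public static Database getInstance(LSE item) {
        if (instance == null) {                      // first check without locking
            synchronized (Database.class) {
                if (instance == null) {              // second check while holding the lock
                    instance = new Database(item);
                }
            }
        }
        return instance;
    }
}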