ElasticSearch | Cannot find inserted document after deleting index - java

I wrote a simple test to validate that duplicates do not exist, like this:
@Test
public void testSameDataNotPushedTwice() throws Exception {
    // Do some logic
    // index contains the ES index name
    // adding this line fails the test
    // deleteOldData(esPersistence.getESClient(), index);
    esPersistence.insert(cdrData);
    esPersistence.insert(cdrData);
    SearchResponse searchResponse = getDataFromElastic(esPersistence.getESClient(), index);
    assertThat(searchResponse.getHits().getHits().length).isEqualTo(1);
}
As you can see, I push data to ES and check that the hits length equals 1.
The test passes when the delete line is commented out.
Now I want to make sure there is no data from other tests, so I want to delete the index before the insert. The delete method works, but the search response returns 0 hits after the insert.
The delete index method:
public static void deleteOldData(RestHighLevelClient client, String index) throws IOException {
    GetIndexRequest request = new GetIndexRequest(index);
    boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
    if (exists) {
        DeleteIndexRequest deleteRequest = new DeleteIndexRequest(index);
        client.indices().delete(deleteRequest, RequestOptions.DEFAULT);
    }
}
Highlights:
ES 7.6.2
The data exists in ES.
Adding a sleep does not solve the problem (even for 10 seconds).
The search works (the document is found) while debugging.
Bottom line: how can I perform delete index --> insert --> search and find the documents?
EDIT:
Added an insert to ES and a GetSettingsRequest:
deleteOldData(esPersistence.getESClient(), index);
esPersistence.insert(testData);
GetSettingsRequest request = new GetSettingsRequest().indices(index);
GetSettingsResponse getSettingsResponse = esPersistence.getESClient().indices().getSettings(request, RequestOptions.DEFAULT);
esPersistence.insert(testData);
Insert methods:
public boolean insert(List<ProjectData> projDataList) {
    // Relevant lines
    BulkRequest bulkRequest = prepareBulkRequests(projDataList, esConfiguration.getCdrDataIndexName());
    return insertBulk(bulkRequest);
}
private BulkRequest prepareBulkRequests(List<ProjectData> data, String indexName) {
    BulkRequest bulkRequest = new BulkRequest();
    for (ProjectData projectData : data) {
        String json = jsonParser.parsePojo(projectData);
        bulkRequest.add(new IndexRequest(indexName)
                .id(projectData.getId())
                .source(json, XContentType.JSON));
    }
    return bulkRequest;
}
private boolean insertBulk(BulkRequest bulkRequest) {
    try {
        BulkResponse bulkResponse = rhlClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        if (bulkResponse.hasFailures()) {
            logger.error(buildCustomBulkFailedMessage(bulkResponse));
            return false;
        }
    } catch (IOException e) {
        logger.warn("Failed to insert csv fields. Error: {}", e.getMessage());
        return false;
    }
    return true;
}

With special thanks to David Pilato (from the ES forum): the index needs to be refreshed after the insert operation, like this:
client.indices().refresh(new RefreshRequest(index), RequestOptions.DEFAULT);
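For completeness, a minimal sketch of the corrected test flow, reusing the question's own helpers (deleteOldData, esPersistence, getDataFromElastic):

deleteOldData(esPersistence.getESClient(), index);
esPersistence.insert(cdrData);
esPersistence.insert(cdrData);
// Make the freshly indexed documents visible to search before querying.
esPersistence.getESClient().indices().refresh(new RefreshRequest(index), RequestOptions.DEFAULT);
SearchResponse searchResponse = getDataFromElastic(esPersistence.getESClient(), index);
assertThat(searchResponse.getHits().getHits().length).isEqualTo(1);

Alternatively, calling bulkRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE) before sending the BulkRequest should have the same effect for tests, at the cost of slower indexing.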

Related

Get all body responses using a foreach loop on an ArrayList with Unirest in Java

I have an ArrayList, i.e.
ArrayList<Integer> getStoresId = [1845, 1846, 1847]
Now I have to execute an endpoint for each of these IDs,
e.g. the endpoint is like this:
https://endpoint/stores/1845/timeEntries,
and I have to use this for every store ID,
for which I am using the following logic:
public static JSONObject getAllTimeClocks() throws IOException, UnirestException {
    ArrayList<Integer> getStoresId = new ArrayList<>(Arrays.asList(1845, 1846, 1847));
    HttpResponse<JsonNode> bodyResponse = null;
    for (Integer id : getStoresId) {
        bodyResponse = Unirest.get(url + id + "/TimeClockEntries").header("x-access-token", token).asJson();
    }
    assert bodyResponse != null;
    return bodyResponse.getBody().getObject();
}
and every time it returns 'null'; I need to know what I am doing wrong.
NOTE: Many things here are not shared, i.e. token, url, etc.
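One likely cause, offered as a hedged sketch rather than a definitive fix: bodyResponse is overwritten on every loop iteration, so at best only the last store's body survives the loop. Collecting each body into a JSONArray returns them all (the JSONArray return type is my assumption about the desired shape; url and token are the question's unshared fields; requires org.json.JSONArray and java.util.Arrays):

public static JSONArray getAllTimeClocks() throws UnirestException {
    ArrayList<Integer> getStoresId = new ArrayList<>(Arrays.asList(1845, 1846, 1847));
    JSONArray allBodies = new JSONArray();
    for (Integer id : getStoresId) {
        HttpResponse<JsonNode> bodyResponse = Unirest.get(url + id + "/TimeClockEntries")
                .header("x-access-token", token)
                .asJson();
        // Keep every store's body instead of overwriting a single variable.
        allBodies.put(bodyResponse.getBody().getObject());
    }
    return allBodies;
}

If the body is still null for every id, it is also worth checking bodyResponse.getStatus() inside the loop; a 401 or 404 from a bad token or URL would likewise produce an empty body.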

How to iterate all rows from Google Sheet

I'm working with the Google Spreadsheets API and Android Studio. I'm reading info from a sheet and showing the results in a ListView. My problem is that I can only retrieve the info from the first row, but not from the next ones. How can I do that?
I have the following code:
private class MakeRequestTask extends AsyncTask<Void, Void, List<String>> {
    private Exception mLastError = null;

    MakeRequestTask(GoogleAccountCredential credential) {
    }

    @Override
    protected List<String> doInBackground(Void... params) {
        try {
            return getDataFromApi();
        } catch (Exception e) {
            mLastError = e;
            cancel(true);
            return null;
        }
    }

    private List<String> getDataFromApi() throws IOException {
        String range = "Sheet1!A2:H";
        List<String> results = new ArrayList<String>();
        ValueRange response = mService.spreadsheets().values()
                .get(spreadsheet_id, range)
                .execute();
        List<List<Object>> values = response.getValues();
        if (values == null) {
            // It won't let me show a toast here
        } else {
            for (List row : values) {
                row.get(0);
                row.get(1);
                row.get(2);
                row.get(5);
                System.out.println("resultados:" + row.get(0) + ", " + row.get(1));
            }
        }
        return results;
    }
Try checking the sample code on reading multiple ranges.
To read multiple discontinuous ranges, use spreadsheets.values.batchGet, which lets you specify any number of ranges to retrieve.
Here is a sample in Java:
List<String> ranges = Arrays.asList(
        // Range names ...
);
BatchGetValuesResponse result = service.spreadsheets().values().batchGet(spreadsheetId)
        .setRanges(ranges).execute();
Another thing to try is replacing the for loop with:
for (List row : values) {
    // Print columns A and E, which correspond to indices 0 and 4.
    System.out.printf("%s, %s\n", row.get(0), row.get(4));
}
You can specify the indices you need; in the example, indices 0 and 4.
Hope this is what you are looking for.
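Separately, note that the question's getDataFromApi builds results but never adds anything to it, so the caller always receives an empty list. A minimal sketch, assuming the same mService and spreadsheet_id as above, that returns one string per row:

private List<String> getDataFromApi() throws IOException {
    String range = "Sheet1!A2:H";
    List<String> results = new ArrayList<String>();
    ValueRange response = mService.spreadsheets().values()
            .get(spreadsheet_id, range)
            .execute();
    List<List<Object>> values = response.getValues();
    if (values != null) {
        for (List row : values) {
            // Accumulate every row instead of only printing it.
            results.add(row.get(0) + ", " + row.get(1));
        }
    }
    return results;
}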

Issues with getting data from vector using Java and objects

I'm currently playing around with Java, JForms, and the Twitter4j library, trying to create a way to populate a vector with tweet IDs in order to perform actions on them (retweets, favorites, replies, etc.). I'm having a bit of a problem with either:
A. Not being able to add to the vector, or
B. Not being able to read from it.
Based on what I've seen from the logger output, I'm positive I'm not adding to the vector correctly. Could somebody take a look with me and see what's going on?
main.java (The code to add to the vector with the tweet ID.)
private void btn_refreshTimelineActionPerformed(java.awt.event.ActionEvent evt) {
    TwitterUtilities tu = new TwitterUtilities();
    Twitter twitter = tu.getConnect();
    // Gets tweet information.
    List<Status> statuses = null;
    try {
        statuses = twitter.getHomeTimeline();
    } catch (TwitterException ex) {
        Logger.getLogger(main.class.getName()).log(Level.SEVERE, null, ex);
    }
    // Init table.
    DefaultTableModel model = (DefaultTableModel) tbl_tweets.getModel();
    // Clear table.
    model.setRowCount(0);
    // Populate table.
    for (Status status : statuses) {
        tu.setTargetID(status.getId());
        Vector row = new Vector();
        row.add(status.getUser().getName());
        row.add(status.getText());
        model.addRow(row);
    }
}
main.java (The part where getTargetTweetID is invoked.)
private void btn_favoriteActionPerformed(java.awt.event.ActionEvent evt) {
    TwitterUtilities tu = new TwitterUtilities();
    Twitter twitter = tu.getConnect();
    int targetIndex = tbl_tweets.getSelectedRow();
    long tweetID = tu.getTargetTweetID(targetIndex);
    try {
        twitter.createFavorite(tweetID);
    } catch (TwitterException ex) {
        Logger.getLogger(main.class.getName()).log(Level.SEVERE, null, ex);
    }
}
TwitterUtilities.java
public class TwitterUtilities {
    public Vector tweetIDVector;
    // Other stuff... but here's the important stuff.

    public void setTargetID(long targetStatus) {
        this.tweetIDVector.addElement(targetStatus);
    }

    public long getTargetTweetID(int targetIndex) {
        // Get tweet ID for the selected tweet.
        Object targetTweetID = this.tweetIDVector.get(targetIndex);
        return (long) targetTweetID;
    }
}
The error message that I'm getting at runtime:
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 1
Any help would be greatly appreciated!
EDIT: After working with the suggestions, I've now updated the following code to this new setup:
TwitterUtilities.java
public class TwitterUtilities {
    public Vector tweetIDVector = new Vector<>();
    // More fun stuff...

    public void setTargetID(long targetStatus) {
        this.tweetIDVector.addElement(targetStatus);
        for (int i = 0; i <= tweetIDVector.size(); i++) {
            System.out.println(tweetIDVector.get(i));
        }
    }
}
And so now the following happens on run:
838641565769871360
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 1
EDIT 2: Changed the Vector into a List. Loading the tweet IDs is working. Also changed the <= in the for loop to <. Now it is loading everything the way it's supposed to. Still having issues with getting data from the List in getTargetTweetID.
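A hedged observation on the remaining issue: btn_refreshTimelineActionPerformed and btn_favoriteActionPerformed each construct their own TwitterUtilities, so the IDs loaded while refreshing live in a different instance than the one the favorite handler reads from. Sharing a single instance across handlers is the minimal fix, sketched here:

// One shared instance for the whole form, so the IDs stored during
// refresh are still there when the favorite button is clicked.
private final TwitterUtilities tu = new TwitterUtilities();

private void btn_favoriteActionPerformed(java.awt.event.ActionEvent evt) {
    Twitter twitter = tu.getConnect();
    int targetIndex = tbl_tweets.getSelectedRow();
    long tweetID = tu.getTargetTweetID(targetIndex);
    // ... createFavorite as before ...
}

It would also be worth clearing the ID list whenever model.setRowCount(0) clears the table, so table rows and stored IDs stay index-aligned.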

Integrating Kafka with Apache Calcite

I'm trying to integrate Calcite with Kafka; I referenced CsvStreamableTable.
Each ConsumerRecord is converted to an Object[] using the following code:
static class ArrayRowConverter extends RowConverter<Object[]> {
    private List<Schema.Field> fields;

    public ArrayRowConverter(List<Schema.Field> fields) {
        this.fields = fields;
    }

    @Override
    Object[] convertRow(ConsumerRecord<String, GenericRecord> consumerRecord) {
        Object[] objects = new Object[fields.size() + 1];
        int i = 0;
        objects[i++] = consumerRecord.timestamp();
        for (Schema.Field field : this.fields) {
            Object obj = consumerRecord.value().get(field.name());
            if (obj instanceof Utf8) {
                objects[i++] = obj.toString();
            } else {
                objects[i++] = obj;
            }
        }
        return objects;
    }
}
The Enumerator is implemented as follows: one thread constantly polls records from Kafka and puts them into a queue, and the getRecord() method polls from that queue:
public E current() {
    return current;
}

public boolean moveNext() {
    for (;;) {
        if (cancelFlag.get()) {
            return false;
        }
        ConsumerRecord<String, GenericRecord> record = getRecord();
        if (record == null) {
            try {
                Thread.sleep(200L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            continue;
        }
        current = rowConvert.convertRow(record);
        return true;
    }
}
I tested SELECT STREAM * FROM KAFKA.clicks, and it works fine.
rowtime is the first column, explicitly added, and its value is the Kafka record timestamp.
But when I tried
SELECT STREAM FLOOR(rowtime TO HOUR)
AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip
it threw this exception:
java.sql.SQLException: Error while executing SQL "SELECT STREAM FLOOR(rowtime TO HOUR) AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip": From line 1, column 85 to line 1, column 119: Streaming aggregation requires at least one monotonic expression in GROUP BY clause
at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
You need to declare that the "ROWTIME" column is monotonic. In MockCatalogReader, note how "ROWTIME" is declared monotonic in the "ORDERS" and "SHIPMENTS" streams. That's why some queries in SqlValidatorTest.testStreamGroupBy() are valid and others are not. The key method relied upon by the validator is SqlValidatorTable.getMonotonicity(String columnName).
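As a hedged fragment (not a complete SqlValidatorTable implementation), the declaration could take roughly this shape inside whatever SqlValidatorTable exposes the Kafka stream:

import org.apache.calcite.sql.validate.SqlMonotonicity;

// Inside your SqlValidatorTable implementation for KAFKA.clicks:
@Override
public SqlMonotonicity getMonotonicity(String columnName) {
    // Declaring ROWTIME monotonic lets the validator accept
    // FLOOR(rowtime TO HOUR) as the monotonic GROUP BY expression.
    return "ROWTIME".equalsIgnoreCase(columnName)
            ? SqlMonotonicity.INCREASING
            : SqlMonotonicity.NOT_MONOTONIC;
}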

how to disable page query in Spring-data-elasticsearch

I use the spring-data-elasticsearch framework to get query results from an Elasticsearch server; the Java code looks like this:
public void testQuery() {
    SearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withFields("createDate", "updateDate")
            .withQuery(matchAllQuery())
            .withPageable(new PageRequest(0, Integer.MAX_VALUE))
            .build();
    List<Entity> list = template.queryForList(searchQuery, Entity.class);
    for (Entity e : list) {
        System.out.println(e.getCreateDate());
        System.out.println(e.getUpdateDate());
    }
}
I see the raw query log on the server, like this:
{"from":0,"size":10,"query":{"match_all":{}},"fields":["createDate","updateDate"]}
As the query log shows, spring-data-elasticsearch adds a size limit to the query ("from":0, "size":10). How can I stop it from adding the size limit?
You don't want to do this. You could use the findAll functionality on a repository, which returns an Iterable, but I think the best way to obtain all items is to use the scan/scroll functionality. Maybe the following code block can point you in the right direction:
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.matchAllQuery())
        .withIndices("customer")
        .withTypes("customermodel")
        .withSearchType(SearchType.SCAN)
        .withPageable(new PageRequest(0, NUM_ITEMS_PER_SCROLL))
        .build();
String scrollId = elasticsearchTemplate.scan(searchQuery, SCROLL_TIME_IN_MILLIS, false);
boolean hasRecords = true;
while (hasRecords) {
    Page<CustomerModel> page = elasticsearchTemplate.scroll(scrollId, SCROLL_TIME_IN_MILLIS, CustomerModel.class);
    if (page != null) {
        // Do something with the records.
        hasRecords = (page.getContent().size() == NUM_ITEMS_PER_SCROLL);
    } else {
        hasRecords = false;
    }
}
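For the repository route mentioned at the start of this answer, a minimal sketch (EntityRepository is a name I'm assuming; the ID type must match your Entity's @Id field):

// findAll() returns every document without a manual size limit.
public interface EntityRepository extends ElasticsearchRepository<Entity, String> {
}

// Usage:
Iterable<Entity> all = entityRepository.findAll();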
