I am deeply sorry for this messy title, but I am completely lost as to why this happens.
I am trying to parse a JSON string using Jackson. My code is simple:
import com.fasterxml.jackson.databind.ObjectMapper;
import formatter.Tweet;
import com.fasterxml.jackson.databind.DeserializationFeature;
public class FormatterTester {
static String tweet = "{\"created_at\":\"Fri May 03 11:43:17 +0000 2019\",\"id\":1124278249620566017,\"id_str\":\"1124278249620566017\",\"text\":\"RT #entkom: '\\u0e40\\u0e0b\\u0e49\\u0e19\\u0e15\\u0e4c-\\u0e28\\u0e38\\u0e20\\u0e1e\\u0e07\\u0e29\\u0e4c' \\u0e41\\u0e08\\u0e01\\u0e04\\u0e27\\u0e32\\u0e21\\u0e19\\u0e48\\u0e32\\u0e23\\u0e31\\u0e01 \\u0e21\\u0e2d\\u0e1a\\u0e04\\u0e27\\u0e32\\u0e21\\u0e2a\\u0e38\\u0e02\\u0e43\\u0e2b\\u0e49\\u0e41\\u0e1f\\u0e19\\u0e04\\u0e25\\u0e31\\u0e1a https:\\/\\/t.co\\/hBbi5hzEH8\",\"source\":\"\\u003ca href=\\\"http:\\/\\/twitter.com\\/download\\/android\\\" rel=\\\"nofollow\\\"\\u003eTwitter for Android\\u003c\\/a\\u003e\",\"truncated\":false,\"in_reply_to_status_id\":null,\"in_reply_to_status_id_str\":null,\"in_reply_to_user_id\":null,\"in_reply_to_user_id_str\":null,\"in_reply_to_screen_name\":null,\"user\":{\"id\":1062336001941504001,\"id_str\":\"1062336001941504001\",\"name\":\"\\ud83d\\udc0a\\u26bd\\ud83d\\udc2f\\ud83c\\udfb8\\ud83d\\udc99sugajin\\/\\/\\ud83d\\udc9a\\ud83d\\udc7b\\ud83d\\udc32\\ud83d\\udc0a\",\"screen_name\":\"sugajinBTS1\",\"location\":null,\"url\":null,\"description\":\"#BTS\\u597d\\u304d\\ud83d\\udc95\\u30b8\\u30f3\\u30cb\\u30e0\\u3088\\u308a\\u306e\\uff75\\uff99\\uff8d\\uff9f\\uff9d\\n#LGBTQ\\u304c\\u3082\\u3063\\u3068\\u7406\\u89e3\\u3055\\u308c\\u3066\\u6b32\\u3057\\u3044\\n#lovebychance\\u306e\\u6cbc\\u306b\\u30cf\\u30de\\u308a\\u4e2d\\n#season2\\u3068\\u3063\\u3066\\u3082\\u671f\\u5f85\\uff01\\uff01\\n#PinSon\\u2665SonPin\\n#2wish\\ud83d\\udc99\\ud83d\\udc9a\\n#Magus\\n#TeamReal\\n#LBCForever\\n\\u7121\\u8a00\\u30d5\\u30a9\\u30ed\\u30fc\\u5931\\u793c\\u3057\\u307e\\u3059\\ud83d\\ude47\",\"translator_type\":\"none\",\"protected\":false,\"verified\":false,\"followers_count\":61,\"friends_count\":224,\"listed_count\":0,\"favourites_count\":37785,\"statuses_count\":11611,\"created_at\":\"Tue Nov 13 13:26:54 +0000 
2018\",\"utc_offset\":null,\"time_zone\":null,\"geo_enabled\":false,\"lang\":\"ja\",\"contributors_enabled\":false,\"is_translator\":false,\"profile_background_color\":\"F5F8FA\",\"profile_background_image_url\":\"\",\"profile_background_image_url_https\":\"\",\"profile_background_tile\":false,\"profile_link_color\":\"1DA1F2\",\"profile_sidebar_border_color\":\"C0DEED\",\"profile_sidebar_fill_color\":\"DDEEF6\",\"profile_text_color\":\"333333\",\"profile_use_background_image\":true,\"profile_image_url\":\"http:\\/\\/pbs.twimg.com\\/profile_images\\/1062337509701513216\\/5HFkKxoi_normal.jpg\",\"profile_image_url_https\":\"https:\\/\\/pbs.twimg.com\\/profile_images\\/1062337509701513216\\/5HFkKxoi_normal.jpg\",\"profile_banner_url\":\"https:\\/\\/pbs.twimg.com\\/profile_banners\\/1062336001941504001\\/1543643861\",\"default_profile\":true,\"default_profile_image\":false,\"following\":null,\"follow_request_sent\":null,\"notifications\":null},\"geo\":null,\"coordinates\":null,\"place\":null,\"contributors\":null,\"retweeted_status\":{\"created_at\":\"Fri May 03 01:29:52 +0000 2019\",\"id\":1124123879654301696,\"id_str\":\"1124123879654301696\",\"text\":\"'\\u0e40\\u0e0b\\u0e49\\u0e19\\u0e15\\u0e4c-\\u0e28\\u0e38\\u0e20\\u0e1e\\u0e07\\u0e29\\u0e4c' \\u0e41\\u0e08\\u0e01\\u0e04\\u0e27\\u0e32\\u0e21\\u0e19\\u0e48\\u0e32\\u0e23\\u0e31\\u0e01 \\u0e21\\u0e2d\\u0e1a\\u0e04\\u0e27\\u0e32\\u0e21\\u0e2a\\u0e38\\u0e02\\u0e43\\u0e2b\\u0e49\\u0e41\\u0e1f\\u0e19\\u0e04\\u0e25\\u0e31\\u0e1a https:\\/\\/t.co\\/hBbi5hzEH8\",\"source\":\"\\u003ca href=\\\"http:\\/\\/twitter.com\\\" rel=\\\"nofollow\\\"\\u003eTwitter Web 
Client\\u003c\\/a\\u003e\",\"truncated\":false,\"in_reply_to_status_id\":null,\"in_reply_to_status_id_str\":null,\"in_reply_to_user_id\":null,\"in_reply_to_user_id_str\":null,\"in_reply_to_screen_name\":null,\"user\":{\"id\":69565234,\"id_str\":\"69565234\",\"name\":\"ent_komchadluek\",\"screen_name\":\"entkom\",\"location\":null,\"url\":null,\"description\":null,\"translator_type\":\"none\",\"protected\":false,\"verified\":false,\"followers_count\":6684,\"friends_count\":1115,\"listed_count\":86,\"favourites_count\":14,\"statuses_count\":31813,\"created_at\":\"Fri Aug 28 11:28:17 +0000 2009\",\"utc_offset\":null,\"time_zone\":null,\"geo_enabled\":false,\"lang\":\"en\",\"contributors_enabled\":false,\"is_translator\":false,\"profile_background_color\":\"FF6699\",\"profile_background_image_url\":\"http:\\/\\/abs.twimg.com\\/images\\/themes\\/theme11\\/bg.gif\",\"profile_background_image_url_https\":\"https:\\/\\/abs.twimg.com\\/images\\/themes\\/theme11\\/bg.gif\",\"profile_background_tile\":true,\"profile_link_color\":\"B40B43\",\"profile_sidebar_border_color\":\"CC3366\",\"profile_sidebar_fill_color\":\"E5507E\",\"profile_text_color\":\"362720\",\"profile_use_background_image\":true,\"profile_image_url\":\"http:\\/\\/pbs.twimg.com\\/profile_images\\/471687167\\/ent1_normal.jpg\",\"profile_image_url_https\":\"https:\\/\\/pbs.twimg.com\\/profile_images\\/471687167\\/ent1_normal.jpg\",\"default_profile\":false,\"default_profile_image\":false,\"following\":null,\"follow_request_sent\":null,\"notifications\":null},\"geo\":null,\"coordinates\":null,\"place\":null,\"contributors\":null,\"is_quote_status\":false,\"quote_count\":9,\"reply_count\":33,\"retweet_count\":584,\"favorite_count\":505,\"entities\":{\"hashtags\":[],\"urls\":[{\"url\":\"https:\\/\\/t.co\\/hBbi5hzEH8\",\"expanded_url\":\"http:\\/\\/www.komchadluek.net\\/news\\/ent\\/370511#.XMuZj_HCjrY.twitter\",\"display_url\":\"komchadluek.net\\/news\\/ent\\/37051\\u2026\",\"indices\":[52,75]}],\"user_mentions\":[],
\"symbols\":[]},\"favorited\":false,\"retweeted\":false,\"possibly_sensitive\":false,\"filter_level\":\"low\",\"lang\":\"th\"},\"is_quote_status\":false,\"quote_count\":0,\"reply_count\":0,\"retweet_count\":0,\"favorite_count\":0,\"entities\":{\"hashtags\":[],\"urls\":[{\"url\":\"https:\\/\\/t.co\\/hBbi5hzEH8\",\"expanded_url\":\"http:\\/\\/www.komchadluek.net\\/news\\/ent\\/370511#.XMuZj_HCjrY.twitter\",\"display_url\":\"komchadluek.net\\/news\\/ent\\/37051\\u2026\",\"indices\":[64,87]}],\"user_mentions\":[{\"screen_name\":\"entkom\",\"name\":\"ent_komchadluek\",\"id\":69565234,\"id_str\":\"69565234\",\"indices\":[3,10]}],\"symbols\":[]},\"favorited\":false,\"retweeted\":false,\"possibly_sensitive\":false,\"filter_level\":\"low\",\"lang\":\"th\",\"timestamp_ms\":\"1556883797446\"}";
public static void main(String[]args) {
String valor_retorno= null;
Tweet tw;
try {
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
tw = objectMapper.readValue(tweet, Tweet.class);
System.out.println("Check 3 - El formatter retorna:\n"+tw.toString());
valor_retorno = tw.toString();
} catch (Exception e) {
e.printStackTrace();
System.out.println("\nException " + e.getClass() + ": " + e.getMessage());
} finally {
System.out.println("\nReturn: Valor_retorno = "+valor_retorno);
}
}
}
If you run the code you'll see it works fine. Where is the problem then? I have to do this same operation on an Oracle NoSQL database. It's not important to know any of the parts related to the data retrieval since they work fine, I've tested them. The code is quite similar:
String data = new String(value.toByteArray(),StandardCharsets.UTF_8);
ObjectMapper objectMapper = new ObjectMapper();
objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
objectMapper.configure(Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
tw = objectMapper.readValue(data, Tweet.class);
My objective is to obtain exactly the same result as in the first code: a String of values separated by '|', according to the attributes of my Tweet class.
However, this code is packaged in a JAR file and run internally by the database over all the recorded tweets. I can't see what happens or debug it, but it produces the following exception:
com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
I've tried escaping the string "data" with StringEscapeUtils.escapeJava(data);
which then produces the following exception:
com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\' (code 92)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
I've also tried escaping the string like this: data.replace('\'', ' '); without success.
After many tests, I can't understand why it runs well in the demo I posted first but not in the actual project, which has exactly the same dependencies.
For some reason, Jackson can't parse what I retrieve from the database. This is most likely due to an encoding or decoding problem in the CentOS image that my Docker container uses to hold the DB, where the script is invoked and executed.
In the end, using Gson for the parsing was the best option, though it would still produce errors if you don't trim() the String. Apparently, for some reason the JSON came quoted twice, that is, ""JSON text"".
The code:
package formatter;
import java.io.*;
import java.lang.String;
import java.nio.charset.StandardCharsets;
import java.util.List;
import oracle.kv.*;
import com.google.gson.Gson;
import oracle.kv.KeyValueVersion;
import oracle.kv.exttab.Formatter;
public class TweetFormatter implements Formatter {
public TweetFormatter() {
super();
}
public String toOracleLoaderFormat(final KeyValueVersion kvv, final KVStore kvStore){
String valor_retorno= null;
Tweet tw; // before, without null
BufferedWriter bf = FormatterUtils.getInstance().getWriter();
try {
final Key key = kvv.getKey();
final Value value = kvv.getValue();
Value.Format format = value.getFormat();
FormatterUtils.getInstance().writeLine(bf,"[Key: "+ key + ", Value:" +value.toByteArray()+ "]" + ". Format= "+ format.toString());
//Filter by key
List<String> major = key.getMajorPath();
FormatterUtils.getInstance().writeLine(bf,"Check 1:\n Key is: "+key + "\n Key length is: "+major.size()
+ "\n Values are: "+major.toString() + "\n contains: "+major.contains("TweeterStream"));
Boolean contains = false;
for(String x : major) {
if(x.equals("TweeterStream")||x.equals("/TweeterStream")||x.equals("/TweeterStream/")) {
contains = true;
break;
}
}
//Parse
if(contains){
String data = new String(value.toByteArray(),StandardCharsets.UTF_8);
data = data.trim();
tw = new Gson().fromJson(data,Tweet.class); //WORKS
FormatterUtils.getInstance().writeLine(bf,"Check 3 - El formatter retorna:\n"+tw.toString());
valor_retorno = tw.toString();
}else{
FormatterUtils.getInstance().writeLine(bf,"\nEstoy en else");
}
FormatterUtils.getInstance().writeLine(bf,"\nestoy fuera del if-else");
} catch (Exception e) {
e.printStackTrace();
FormatterUtils.getInstance().writeLine(bf, "\nException " + e.getClass() + ": " + e.getMessage());
} finally {
FormatterUtils.getInstance().writeLine(bf,"\nReturn: Valor_retorno = "+valor_retorno);
FormatterUtils.getInstance().generateLog(bf);
}
return valor_retorno;
}
}
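One possible explanation for why trim() helped, offered as an assumption rather than a confirmed diagnosis: String.trim() removes every leading and trailing character up to U+0020, and that range includes the NUL character (code 0) that Jackson complained about. The sketch below isolates the decode-and-trim step from the formatter. PayloadCleaner and clean are hypothetical names, not part of the Oracle NoSQL, Jackson, or Gson APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library mentioned in this thread.
class PayloadCleaner {

    // Decodes the stored bytes as UTF-8 and strips leading/trailing
    // whitespace and control characters (everything <= U+0020, NUL included).
    public static String clean(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.trim();
    }

    public static void main(String[] args) {
        // Simulated value padded with NUL bytes and trailing whitespace.
        byte[] padded = "\u0000\u0000{\"id\":1}\u0000 \n".getBytes(StandardCharsets.UTF_8);
        String json = clean(padded);
        System.out.println(json.length() + " chars: " + json);
    }
}
```

Note that trim() only touches the ends of the string, so a NUL in the middle of the payload would still reach the parser.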
I'm having trouble finding examples of what I'm trying to do...
I'd like to create a Lambda function in Java. I thought I'd always use Javascript for Lambda functions, but in this case I'll end up re-using application logic already written in Java, so it makes sense.
In the past I've written Javascript Lambda functions that are triggered by Kinesis events. Super simple: the function receives the events as a parameter, does something, voila. I'd like to do the same thing with Java. Really simple:
Kinesis Event(s) -> Trigger Function -> (Java) Receive Kinesis Events, do something with them
Anyone have experience with this kind of use case?
Here is some sample code I wrote to demonstrate the same concept internally. This code forwards events from one stream to another.
Note this code does not handle retries if there are errors in forwarding, nor is it meant to be performant in a production environment, but it does demonstrate how to handle the records from the publishing stream.
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.kinesis.AmazonKinesisClient;
import com.amazonaws.services.kinesis.model.PutRecordsRequest;
import com.amazonaws.services.kinesis.model.PutRecordsRequestEntry;
import com.amazonaws.services.kinesis.model.PutRecordsResult;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.LambdaLogger;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class KinesisToKinesis {
private LambdaLogger logger;
final private AmazonKinesisClient kinesisClient = new AmazonKinesisClient();
public PutRecordsResult eventHandler(KinesisEvent event, Context context) {
logger = context.getLogger();
if (event == null || event.getRecords() == null) {
logger.log("Event contains no data" + System.lineSeparator());
return null;
} else {
logger.log("Received " + event.getRecords().size() +
" records from " + event.getRecords().get(0).getEventSourceARN() + System.lineSeparator());
}
final Long startTime = System.currentTimeMillis();
// set up the client
Region region;
final Map<String, String> environmentVariables = System.getenv();
if (environmentVariables.containsKey("AWS_REGION")) {
region = Region.getRegion(Regions.fromName(environmentVariables.get("AWS_REGION")));
} else {
region = Region.getRegion(Regions.US_WEST_2);
logger.log("Using default region: " + region.toString() + System.lineSeparator());
}
kinesisClient.setRegion(region);
Long elapsed = System.currentTimeMillis() - startTime;
logger.log("Finished setup in " + elapsed + " ms" + System.lineSeparator());
PutRecordsRequest putRecordsRequest = new PutRecordsRequest().withStreamName("usagecounters-global");
List<PutRecordsRequestEntry> putRecordsRequestEntryList = event.getRecords().parallelStream()
.map(r -> new PutRecordsRequestEntry()
.withData(ByteBuffer.wrap(r.getKinesis().getData().array()))
.withPartitionKey(r.getKinesis().getPartitionKey()))
.collect(Collectors.toList());
putRecordsRequest.setRecords(putRecordsRequestEntryList);
elapsed = System.currentTimeMillis() - startTime;
logger.log("Processed " + putRecordsRequest.getRecords().size() +
" records in " + elapsed + " ms" + System.lineSeparator());
PutRecordsResult putRecordsResult = kinesisClient.putRecords(putRecordsRequest);
elapsed = System.currentTimeMillis() - startTime;
logger.log("Forwarded " + putRecordsRequest.getRecords().size() +
" records to Kinesis " + putRecordsRequest.getStreamName() +
" in " + elapsed + " ms" + System.lineSeparator());
return putRecordsResult;
}
}
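If you want to inspect a record's payload instead of forwarding it, the data arrives as a java.nio.ByteBuffer. The sketch below shows one cautious way to read it, assuming the payload is UTF-8 text (that is an assumption about your producer, not something Kinesis guarantees). RecordPayloads and its methods are hypothetical names. It copies only the readable bytes, since array() returns the whole backing array and is not available on read-only buffers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading a record's ByteBuffer payload.
class RecordPayloads {

    // Copies the bytes between position and limit without moving
    // the caller's buffer position (works on a duplicate view).
    public static byte[] toBytes(ByteBuffer data) {
        ByteBuffer view = data.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Assumes the payload is UTF-8 encoded text.
    public static String toUtf8(ByteBuffer data) {
        return new String(toBytes(data), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer backing = ByteBuffer.wrap("xxhello".getBytes(StandardCharsets.UTF_8));
        backing.position(2);
        ByteBuffer slice = backing.slice();
        // The slice exposes 5 readable bytes, but its backing array still has 7.
        System.out.println(toUtf8(slice));
        System.out.println(slice.array().length);
    }
}
```

Inside the handler, this would be called as RecordPayloads.toUtf8(r.getKinesis().getData()) for each record r.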
When I access a JavaScript object's member variable using Nashorn ScriptObjectMirror.get(), the returned object's type seems to be determined at runtime. For example, if the value fits in a Java int, get() seems to return a Java Integer. If the value won't fit in an int, get() seems to return a Java Long, and so on.
Right now, I use instanceof to check the type and convert the value to a long.
Is there a more convenient way of getting a member's value without loss and without checking the type in Java? Perhaps Nashorn could always give me a Java Double, throwing an error in case the member's not numeric.
I can imagine this is a rather narrow case that probably shouldn't be handled by Nashorn...
Example:
package com.tangotangolima.test.nashorn_types;
import jdk.nashorn.api.scripting.ScriptObjectMirror;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.io.StringReader;
public class Main {
public static void main(String[] args) throws ScriptException {
final ScriptEngineManager mgr = new ScriptEngineManager();
final ScriptEngine js = mgr.getEngineByName("nashorn");
final String script = "" +
"var p = 1;" +
"var q = " + (Integer.MAX_VALUE + 1L) + ";" +
"var r = {" +
"s: 1," +
"t: " + (Integer.MAX_VALUE + 1L) +
" };";
js.eval(new StringReader(script));
say(js.get("p").getClass().getName()); // -> java.lang.Integer
say(js.get("q").getClass().getName()); // -> java.lang.Long
final ScriptObjectMirror r = (ScriptObjectMirror) js.get("r");
say(r.get("s").getClass().getName()); // -> java.lang.Integer
say(r.get("t").getClass().getName()); // -> java.lang.Long
}
static void say(String s) {
System.out.println(s);
}
}
This code converts a ScriptObjectMirror (JS) into plain Java objects (it uses Guava's Lists and Maps helpers):
private static Object convertIntoJavaObject(Object scriptObj) {
if (scriptObj instanceof ScriptObjectMirror) {
ScriptObjectMirror scriptObjectMirror = (ScriptObjectMirror) scriptObj;
if (scriptObjectMirror.isArray()) {
List<Object> list = Lists.newArrayList();
for (Map.Entry<String, Object> entry : scriptObjectMirror.entrySet()) {
list.add(convertIntoJavaObject(entry.getValue()));
}
return list;
} else {
Map<String, Object> map = Maps.newHashMap();
for (Map.Entry<String, Object> entry : scriptObjectMirror.entrySet()) {
map.put(entry.getKey(), convertIntoJavaObject(entry.getValue()));
}
return map;
}
} else {
return scriptObj;
}
}
public static void main(String[] args) throws ScriptException, NoSuchMethodException {
final ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
engine.eval("function objProvider(){return {a:1, b:'2','c': true,'d': {'e':[],'f':['1',{'g':45}]}};}");
final Object scriptObj = ((Invocable) engine).invokeFunction("objProvider");
Object javaObj = convertIntoJavaObject(scriptObj);
System.out.println(javaObj);
//{a=1, b=2, c=true, d={e=[], f=[1, {g=45}]}}
}
The recommended approach is to check "instanceof java.lang.Number" from Java code if you expect a JavaScript "number" value. Once cast to Number, you can convert to int, long, or double by calling methods such as intValue(), longValue(), and so on.
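A minimal sketch of that check, using only java.lang. JsNumbers, toLong, and toDouble are hypothetical names. The values in main stand in for what ScriptObjectMirror.get() might return, so no script engine is needed to try it.

```java
// Hypothetical helper: normalizes whatever numeric wrapper comes back
// from the script engine into a single Java primitive type.
class JsNumbers {

    // Returns the value as a long; fractional values are truncated by longValue().
    public static long toLong(Object value) {
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("Not a numeric value: " + value);
        }
        return ((Number) value).longValue();
    }

    // Returns the value as a double, whichever wrapper type was used.
    public static double toDouble(Object value) {
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("Not a numeric value: " + value);
        }
        return ((Number) value).doubleValue();
    }

    public static void main(String[] args) {
        Object small = Integer.valueOf(1);                    // fits in an int
        Object big = Long.valueOf(Integer.MAX_VALUE + 1L);    // does not fit in an int
        Object frac = Double.valueOf(2.5);
        System.out.println(toLong(small));
        System.out.println(toLong(big));
        System.out.println(toDouble(frac));
    }
}
```

This removes the per-type instanceof chain, and a non-numeric member fails with an exception instead of being silently converted.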
I really like Igor's approach. Here is his convertIntoJavaObject() code as a complete program that uses only standard Java. It is renamed toJava() here, and the program also includes a toJavascript() that goes the other way.
import jdk.nashorn.api.scripting.ScriptObjectMirror;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
public class Main {
@SuppressWarnings("removal")
private static Object toJava(Object jsObj) {
if (jsObj instanceof ScriptObjectMirror) {
var jsObjectMirror = (ScriptObjectMirror) jsObj;
if (jsObjectMirror.isArray()) {
var list = new ArrayList<>();
for (Map.Entry<String, Object> entry : jsObjectMirror.entrySet()) {
list.add(toJava(entry.getValue()));
}
return list;
} else {
var map = new HashMap<String, Object>();
for (Map.Entry<String, Object> entry : jsObjectMirror.entrySet()) {
map.put(entry.getKey(), toJava(entry.getValue()));
}
return map;
}
} else {
return jsObj;
}
}
public static void main(String[] args) throws ScriptException, NoSuchMethodException {
var code = String.join("\n",
"function objProvider() {",
" return {a:1, b:'2','c': true,'d': {'e':[],'f':['1',{'g':45}]}}",
"}",
"function toJavascript (jObj) {",
" if (jObj instanceof java.util.List) {",
" var l = []; for each (var item in jObj) {",
" l.push(toJavascript(item));",
" }",
" return l;",
" }",
" if (jObj instanceof java.util.Map) {",
" var m = {}; for each (var key in jObj.keySet()) {",
" m[key] = toJavascript(jObj.get(key));",
" }",
" return m;",
" }",
" return jObj;",
"}"
);
var engine = new ScriptEngineManager().getEngineByName("nashorn");
engine.eval(code);
var jsObj = ((Invocable) engine).invokeFunction("objProvider");
var formatted = ((Invocable) engine).invokeMethod(engine.eval("JSON"), "stringify", jsObj);
System.out.println("JSON.stringify(jsObj): " + formatted);
var javaObj = toJava(jsObj);
System.out.println("javaObj: " + javaObj);
//{a=1, b=2, c=true, d={e=[], f=[1, {g=45}]}}
var newJsObj = ((Invocable) engine).invokeFunction("toJavascript", javaObj);
formatted = ((Invocable) engine).invokeMethod(engine.eval("JSON"), "stringify", newJsObj);
System.out.println("JSON.stringify(newJsObj): " + formatted);
// just to show this doesn't work without conversion
formatted = ((Invocable) engine).invokeMethod(engine.eval("JSON"), "stringify", javaObj);
System.out.println("JSON.stringify(javaObj): " + formatted);
}
}
I wrote the program below to understand how Elasticsearch can be used for full-text search. Searching for individual words works correctly, but searching for combinations of words does not.
package in.blogspot.randomcompiler.elastic_search_demo;
import in.blogspot.randomcompiler.elastic_search_impl.Event;
import java.util.Date;
import org.elasticsearch.action.count.CountRequestBuilder;
import org.elasticsearch.action.count.CountResponse;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.FilterBuilder;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import com.fasterxml.jackson.core.JsonProcessingException;
public class ElasticSearchDemo
{
public static void main( String[] args ) throws JsonProcessingException
{
Client client = new TransportClient()
.addTransportAddress(new InetSocketTransportAddress("localhost", 9301));
DeleteResponse deleteResponse1 = client.prepareDelete("chat-data", "event", "1").execute().actionGet();
DeleteResponse deleteResponse2 = client.prepareDelete("chat-data", "event", "2").execute().actionGet();
DeleteResponse deleteResponse3 = client.prepareDelete("chat-data", "event", "3").execute().actionGet();
Event e1 = new Event("LOGIN", new Date(), "Agent1 logged into chat");
String e1Json = e1.prepareJson();
System.out.println("JSON: " + e1Json);
IndexResponse indexResponse1 = client.prepareIndex("chat-data", "event", "1").setSource(e1Json).execute().actionGet();
printIndexResponse("e1", indexResponse1);
Event e2 = new Event("LOGOUT", new Date(), "Agent1 logged out of chat");
String e2Json = e2.prepareJson();
System.out.println("JSON: " + e2Json);
IndexResponse indexResponse2 = client.prepareIndex("chat-data", "event", "2").setSource(e2Json).execute().actionGet();
printIndexResponse("e2", indexResponse2);
Event e3 = new Event("BREAK", new Date(), "Agent1 went on break in the middle of a chat");
String e3Json = e3.prepareJson();
System.out.println("JSON: " + e3Json);
IndexResponse indexResponse3 = client.prepareIndex("chat-data", "event", "3").setSource(e3Json).execute().actionGet();
printIndexResponse("e3", indexResponse3);
FilterBuilder filterBuilder = FilterBuilders.termFilter("value", "break middle");
SearchRequestBuilder searchBuilder = client.prepareSearch();
searchBuilder.setPostFilter(filterBuilder);
CountRequestBuilder countBuilder = client.prepareCount();
countBuilder.setQuery(QueryBuilders.constantScoreQuery(filterBuilder));
CountResponse countResponse1 = countBuilder.execute().actionGet();
System.out.println("HITS: " + countResponse1.getCount());
SearchResponse searchResponse1 = searchBuilder.execute().actionGet();
SearchHits hits = searchResponse1.getHits();
for(int i=0; i<hits.hits().length; i++) {
SearchHit hit = hits.getAt(i);
System.out.println("[" + i + "] " + hit.getId() + " : " +hit.sourceAsString());
}
client.close();
}
private static void printIndexResponse(String description, IndexResponse response) {
System.out.println("Index response for: " + description);
System.out.println("Index name: " + response.getIndex());
System.out.println("Index type: " + response.getType());
System.out.println("Index id: " + response.getId());
System.out.println("Index version: " + response.getVersion());
}
}
The issue I am facing is that a search for "break middle" returns nothing, whereas I expect it to return the third event.
I understand that I need to configure a different analyzer, rather than the default one, so that the text is indexed appropriately.
Could someone please help me understand how to do that? A complete example would be great to have.
The problem is caused because you are using the Term filter:
FilterBuilder filterBuilder = FilterBuilders.termFilter("value", "break middle");
A Term filter doesn't analyse the data in the query string - so Elasticsearch is looking for the exact string "break middle".
However the third document will probably have been broken down by ES into individual terms as follows:
Agent1
went
on
break
in
the
middle
of
a
chat
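To make the mismatch concrete, here is a small plain-Java sketch that does not use Elasticsearch at all. It assumes a simplified analyzer (lowercase, then split on non-alphanumeric characters), which only approximates what the real one does. TermLookupDemo and its methods are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of term lookup vs. analyzed lookup; not the Elasticsearch API.
class TermLookupDemo {

    // Simplified stand-in for an analyzer: lowercase and split into terms.
    public static Set<String> analyze(String text) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Term-style lookup: the query string is used as-is, as one term.
    public static boolean termMatches(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    // Match-style lookup: the query is analyzed first, and any shared term counts.
    public static boolean matchMatches(Set<String> indexedTerms, String query) {
        for (String term : analyze(query)) {
            if (indexedTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("Agent1 went on break in the middle of a chat");
        System.out.println(termMatches(indexed, "break middle"));   // no single term equals the whole phrase
        System.out.println(termMatches(indexed, "break"));
        System.out.println(matchMatches(indexed, "break middle"));
    }
}
```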
To fix the issue, use a filter or query that analyses the string you're passing, for example a Query String query or a Match query.
For example:
QueryBuilder qb = QueryBuilders.matchQuery("value", "break middle");
or:
QueryBuilder qb = QueryBuilders.queryString("break middle");
See the Java API documentation for Elasticsearch for more info.
-What I want to do
I would like to get data from a Google Spreadsheet using the Google Spreadsheet API Java library without authentication.
The Google Spreadsheet is published publicly.
I would like to use the following class:
com.google.gdata.data.spreadsheet.CustomElementCollection
-Issue
CustomElementCollection returns the correct data with authentication.
But CustomElementCollection returns null values without authentication.
Since listEntry.getPlainTextContent() shows the data, I think there should be some way to get it.
-Source code attached
With authentication: Auth.java
import java.net.URL;
import java.util.List;
import com.google.gdata.client.spreadsheet.ListQuery;
import com.google.gdata.client.spreadsheet.SpreadsheetService;
import com.google.gdata.data.spreadsheet.CustomElementCollection;
import com.google.gdata.data.spreadsheet.ListEntry;
import com.google.gdata.data.spreadsheet.ListFeed;
import com.google.gdata.data.spreadsheet.SpreadsheetEntry;
import com.google.gdata.data.spreadsheet.WorksheetEntry;
public class Auth {
public static void main(String[] args) throws Exception{
String applicationName = "AppName";
String user = args[0];
String pass = args[1];
String key = args[2];
String query = args[3];
SpreadsheetService service = new SpreadsheetService(applicationName);
service.setUserCredentials(user, pass); //set client auth
URL entryUrl = new URL("http://spreadsheets.google.com/feeds/spreadsheets/" + key);
SpreadsheetEntry spreadsheetEntry = service.getEntry(entryUrl, SpreadsheetEntry.class);
WorksheetEntry worksheetEntry = spreadsheetEntry.getDefaultWorksheet();
ListQuery listQuery = new ListQuery(worksheetEntry.getListFeedUrl());
listQuery.setSpreadsheetQuery( query );
ListFeed listFeed = service.query(listQuery, ListFeed.class);
List<ListEntry> list = listFeed.getEntries();
for( ListEntry listEntry : list )
{
System.out.println( "content=[" + listEntry.getPlainTextContent() + "]");
CustomElementCollection elements = listEntry.getCustomElements();
System.out.println(
" name=" + elements.getValue("name") +
" age=" + elements.getValue("age") );
}
}
}
Without authentication: NoAuth.java
import java.net.URL;
import java.util.List;
import com.google.gdata.client.spreadsheet.FeedURLFactory;
import com.google.gdata.client.spreadsheet.ListQuery;
import com.google.gdata.client.spreadsheet.SpreadsheetService;
import com.google.gdata.data.spreadsheet.CustomElementCollection;
import com.google.gdata.data.spreadsheet.ListEntry;
import com.google.gdata.data.spreadsheet.ListFeed;
import com.google.gdata.data.spreadsheet.WorksheetEntry;
import com.google.gdata.data.spreadsheet.WorksheetFeed;
public class NoAuth {
public static void main(String[] args) throws Exception{
String applicationName = "AppName";
String key = args[0];
String query = args[1];
SpreadsheetService service = new SpreadsheetService(applicationName);
URL url = FeedURLFactory.getDefault().getWorksheetFeedUrl(key, "public", "basic");
WorksheetFeed feed = service.getFeed(url, WorksheetFeed.class);
List<WorksheetEntry> worksheetList = feed.getEntries();
WorksheetEntry worksheetEntry = worksheetList.get(0);
ListQuery listQuery = new ListQuery(worksheetEntry.getListFeedUrl());
listQuery.setSpreadsheetQuery( query );
ListFeed listFeed = service.query( listQuery, ListFeed.class );
List<ListEntry> list = listFeed.getEntries();
for( ListEntry listEntry : list )
{
System.out.println( "content=[" + listEntry.getPlainTextContent() + "]");
CustomElementCollection elements = listEntry.getCustomElements();
System.out.println(
" name=" + elements.getValue("name") +
" age=" + elements.getValue("age") );
}
}
}
Google Spreadsheet:
https://docs.google.com/spreadsheet/pub?key=0Ajawooo6A9OldHV0VHYzVVhTZlB6SHRjbGc5MG1CakE&output=html
-Result
Without authentication
content=[age: 23]
name=null age=null
With authentication
content=[age: 23]
name=Taro age=23
Please let me know any useful information for avoiding this issue.
I don't know why it works like that, but when the request is made without credentials, you are not able to retrieve cells via:
CustomElementCollection elements = listEntry.getCustomElements();
System.out.println(" name=" + elements.getValue("name") + " age=" + elements.getValue("age") );
I've tested it, and this is the only way I have found to retrieve the data:
List<ListEntry> list = listFeed.getEntries();
for (ListEntry row : list) {
System.out.println(row.getTitle().getPlainText() + "\t"
+ row.getPlainTextContent());
}
It prints:
Taro age: 23
Hanako age: 16
As you can see, you have to parse the text and extract the age from the raw String.
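A minimal sketch of that parsing step, assuming the plain-text content is a comma-separated list of "column: value" pairs, as in the output above. RowContentParser is a hypothetical name, and the approach breaks if a cell value itself contains a comma.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits "age: 23, city: Tokyo" into column/value pairs.
class RowContentParser {

    public static Map<String, String> parse(String content) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : content.split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                continue; // skip fragments that are not "column: value"
            }
            String column = pair.substring(0, colon).trim();
            String value = pair.substring(colon + 1).trim();
            fields.put(column, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> row = parse("age: 23");
        System.out.println("age=" + row.get("age"));
    }
}
```

In the loop above, this would be called as RowContentParser.parse(row.getPlainTextContent()), with the name taken from row.getTitle().getPlainText().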
I believe the problem is that you are using the "basic" projection for your spreadsheet. If you use the "values" projection, everything should work as expected.
I was wondering about this as well. I looked at the incoming feed (just paste the sheet's URL into Chrome), and it seems that there is no XML markup for the individual values; they all come in under the <content> tag. So it makes sense that the parser lumps it all into the text content of the BaseEntry (instead of making a ListEntry).