Parallelstream.ForEach() double item

I have a piece of software that generates SOAP-requests based on an excel-file, and then emails the results.
Due to the potential size of the requests, I do the soap-request-handling in parallel. The following code handles the above mentioned.
public void HandleData() {
List<NodeAnalysisReply> replies = Collections.synchronizedList(new ArrayList<>());
new Thread(() -> {
List<NodeAnalysisRequest> requests;
SOAPMessageFactory factory = new SOAPMessageFactory();
SOAPResponseParser parser = new SOAPResponseParser();
try {
requests = new ExcelParser().parseData(file);
requests.parallelStream().forEach((request) -> {
try {
SOAPMessage message = factory.createNodeRequestMessage(
new RequestObject(requestInfoFactory.makeInfo(trackingID), request));
SOAPMessage response = new SoapConnector(server.getUrl()).executeRequest(message);
ByteArrayOutputStream out = new ByteArrayOutputStream();
response.writeTo(out);
NodeAnalysisReply curReply = parser.ParseXMLResponse(out.toString(), request);
synchronized (replies) {
System.out.println("Adding: " + curReply.getRequest().toString());
replies.add(curReply);
}
} catch (UnsupportedOperationException | SOAPException | IOException e) {
handleSoap(e.getMessage());
}
});
} catch (IOException e) {
handleBadParse();
}
try {
for(NodeAnalysisReply reply : replies){
System.out.println("Data: " + reply.getRequest().toString());
}
mailer.SendEmail("Done", email, replies);
} catch (MessagingException e) {
e.printStackTrace();
}
}).start();
}
When I run the code with two pieces of data, the following happens:
Adding: Søndergade 52 6920 // OK
Adding: Ternevej 1 6920 // OK
Data: Ternevej 1 6920 // What
Data: Ternevej 1 6920 // WHAT..
are equal? true
So even though both items are added to the list, the last one seems to occupy both places. Why does this happen, and how do I solve it? I really do miss Parallel.ForEach() from C#!
EDIT: As requested, the code for NodeAnalysisReply.
public class NodeAnalysisReply {
public ReplyInfo getReplyInfo() {
return replyInfo;
}
public void setReplyInfo(ReplyInfo replyInfo) {
this.replyInfo = replyInfo;
}
public List < nodeAnalysisListDetails > getNodeAnalysisListDetails() {
return nodeAnalysisListDetails;
}
public void setNodeAnalysisListDetails(List < nodeAnalysisListDetails > nodeAnalysisListDetails) {
this.nodeAnalysisListDetails = nodeAnalysisListDetails;
}
public void addNodeAnalysisListDetail(nodeAnalysisListDetails nodeAnalysisListDetails) {
this.nodeAnalysisListDetails.add(nodeAnalysisListDetails);
}
ReplyInfo replyInfo;
public String getFormattedXML() {
return formattedXML;
}
public void setFormattedXML(String formattedXML) {
this.formattedXML = formattedXML;
}
String formattedXML;
public NodeAnalysisRequest getRequest() {
return request;
}
public void setRequest(NodeAnalysisRequest request) {
this.request = request;
}
NodeAnalysisRequest request;
List < nodeAnalysisListDetails > nodeAnalysisListDetails = new ArrayList < > ();
}

synchronized (replies) {
System.out.println("Adding: " + curReply.getRequest().toString());
replies.add(curReply);
}
Modifying shared state from inside a stream lambda, as the code above does, is a side effect and is discouraged.
Instead, you should do something like the following:
replies.addAll(requests.parallelStream().map((request) -> {
try {
SOAPMessage message = factory.createNodeRequestMessage(
new RequestObject(requestInfoFactory.makeInfo(trackingID), request));
SOAPMessage response = new SoapConnector(server.getUrl()).executeRequest(message);
ByteArrayOutputStream out = new ByteArrayOutputStream();
response.writeTo(out);
NodeAnalysisReply curReply = parser.ParseXMLResponse(out.toString(), request);
return curReply;
} catch (UnsupportedOperationException | SOAPException | IOException e) {
handleSoap(e.getMessage());
return null;
}
})
.filter(curReply -> curReply != null)
.collect(Collectors.toList())
);
In the code above, each request is first mapped to a NodeAnalysisReply, the null values are filtered out, and the results are collected into a list, which is then added in full to your replies list.
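The map, filter and collect pattern described above can be tried in isolation with a minimal sketch. The class name CollectRepliesSketch and the method callService are hypothetical stand-ins for the SOAP round trip and are not part of the original code; plain strings take the place of NodeAnalysisRequest and NodeAnalysisReply.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class CollectRepliesSketch {

    // Hypothetical stand-in for the SOAP round trip: returns null on failure,
    // mirroring the "return null" in the catch block of the answer.
    static String callService(String request) {
        if (request.isEmpty()) {
            return null; // simulated failure
        }
        return "reply:" + request;
    }

    // Map each request to a reply in parallel, drop the failures,
    // and let the collector build the list instead of a shared variable.
    static List<String> collectReplies(List<String> requests) {
        return requests.parallelStream()
                .map(CollectRepliesSketch::callService)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> replies = collectReplies(Arrays.asList("a", "", "b"));
        System.out.println(replies);
    }
}
```

Because the collector assembles the result, no synchronized block or synchronized list is needed, and for an ordered source such as a List the collected replies keep the order of the requests.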

Related

Encountering this error while running test cases -ERROR N/A (Null Pointer Exception)

@Test
public void testBatchFailClientBatchSyncCallIllegalArgumentExceptions() throws Exception {
Map<String, String> singletonMap = Collections.singletonMap(ACCEPT_STRING_ID, defaultLocalizationMap.get(ACCEPT_STRING_ID));
StringRequest[] requests = stringRequestFactory.createRequests(singletonMap);
when(lmsClient.batchSyncCall(requests)).thenThrow(new IllegalArgumentException());
List<Backend.Response> responses = callLms(new StringRequest[] {requests[0]});
Assert.assertNotNull(responses);
assertEquals(EntityDescriptors.ERROR_V1, responses.get(0).entityDescriptor());
assertEquals(Http.Status.SERVICE_UNAVAILABLE, responses.get(0).status());
}
@Test
public void testBatchFailClientBatchSyncCallIOException() throws Exception {
Map<String, String> singletonMap = Collections.singletonMap(ACCEPT_STRING_ID, defaultLocalizationMap.get(ACCEPT_STRING_ID));
StringRequest[] requests = stringRequestFactory.createRequests(singletonMap);
when(lmsClient.batchSyncCall(requests)).thenThrow(new IOException());
List<Backend.Response> responses = callLms(new StringRequest[] {requests[0]});
Assert.assertNotNull(responses);
assertEquals(EntityDescriptors.ERROR_V1, responses.get(0).entityDescriptor());
assertEquals(Http.Status.SERVICE_UNAVAILABLE, responses.get(0).status());
}
Source Code -
@Override
public List<Backend.Response> handleRequests(BackendRequestContext context, List<Backend.Request> requests, Metrics metrics) {
StringRequest[] stringRequests = new StringRequest[requests.size()];
final String language = context.locale().toLanguageTag().replace("-", "_");
for (int i = 0; i < requests.size(); i++) {
final Backend.Request request = requests.get(i);
final String id = request.requiredPathParam(STRING_ID_PATH_PARAM);
final Optional<String> marketplaceDisplayName = request.queryParam(MARKETPLACE_NAME_QUERY_PARAM);
final Optional<String> stage = request.queryParam(STAGE_QUERY_PARAM);
final StringRequest stringRequest = new StringRequest(id);
stringRequest.setLanguage(language);
marketplaceDisplayName.ifPresent(stringRequest::setMarketplaceName);
stage.map(Stage::getStage).ifPresent(stringRequest::setStage);
stringRequests[i] = stringRequest;
}
StringResultBatch batchResult = invokeBatchSync(stringRequests);
return IntStream.of(requests.size()).mapToObj(i -> {
final Backend.Request request = requests.get(i);
try {
return transform(request, batchResult.get(i), language);
} catch (IOException e) {
LOGGER.error("", e);
return Backend.Response.builder()
.withRequest(request)
.withEntityDescriptor(EntityDescriptors.ERROR_V1)
.withStatus(Http.Status.SERVICE_UNAVAILABLE)
.withBody(ErrorResponses.ServerError.serviceUnavailable(ErrorResponse.InternalInfo.builder()
.withMessage("Error retrieving ["
+ request.requiredPathParam(STRING_ID_PATH_PARAM)
+ "]")
.build())
.tokens())
.build();
}
}
).collect(Collectors.toList());
}
private StringResultBatch invokeBatchSync(StringRequest[] stringRequests) {
try {
// LMS Client has an async batch call,
// but it returns a proprietary class (StringResultBatchFuture) which eventually wraps a BSFFutureReply.
// Neither of which provide access to anything like a Java-standard Future.
return client.batchSyncCall(stringRequests);
} catch (IllegalArgumentException | IOException e) {
//
return null;
}
}
I have two test cases here for the source file, and both fail with "ERROR N/A" and a NullPointerException. Could someone please review this and help me? It would be really appreciated. Thank you in advance.
P.S. The source file takes a request as a string, performs a string translation, and returns the translated string.
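Judging only from the code posted, two things look like plausible causes. First, invokeBatchSync returns null when the client throws, and handleRequests then calls batchResult.get(i) on that null. Second, IntStream.of(requests.size()) produces a single element equal to the size rather than the indices 0 to size - 1. The sketch below illustrates both points with plain lists; BatchIndexSketch, respond and the "SERVICE_UNAVAILABLE:" marker are hypothetical names used for illustration only, not the project's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class BatchIndexSketch {

    // IntStream.of(n) is a one-element stream containing n itself.
    static List<Integer> indicesViaOf(int size) {
        return IntStream.of(size).boxed().collect(Collectors.toList());
    }

    // IntStream.range(0, n) yields the indices 0 .. n-1.
    static List<Integer> indicesViaRange(int size) {
        return IntStream.range(0, size).boxed().collect(Collectors.toList());
    }

    // Hypothetical stand-in for handleRequests: batchResult is null when the
    // batch call failed, so it is checked before being dereferenced.
    static List<String> respond(List<String> requests, List<String> batchResult) {
        return IntStream.range(0, requests.size())
                .mapToObj(i -> batchResult == null
                        ? "SERVICE_UNAVAILABLE:" + requests.get(i)
                        : batchResult.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(indicesViaOf(3));
        System.out.println(indicesViaRange(3));
        System.out.println(respond(Arrays.asList("a", "b"), null));
    }
}
```

Whether the null check belongs in handleRequests or invokeBatchSync should instead rethrow is a design choice for the real code; the sketch only shows where the NullPointerException can originate.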

TupleTag not found in DoFn

I have a DoFn that is supposed to split input into two separate PCollections. The pipeline builds and runs up until it is time to output in the DoFn, and then I get the following exception:
"java.lang.IllegalArgumentException: Unknown output tag Tag<edu.mayo.mcc.cdh.pipeline.PubsubToAvro$PubsubMessageToArchiveDoFn$2.<init>:219#2587af97b4865538>
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:216)...
If I declare the TupleTags I'm using in the ParDo, I get that error, but if I declare them outside of the ParDo I get a syntax error saying the OutputReceiver can't find the tags. Below is the apply and the ParDo/DoFn:
PCollectionTuple results = (messages.apply("Map to Archive", ParDo.of(new PubsubMessageToArchiveDoFn()).withOutputTags(noTag, TupleTagList.of(medaPcollection))));
PCollection<AvroPubsubMessageRecord> medaPcollectionTransformed = results.get(medaPcollection);
PCollection<AvroPubsubMessageRecord> noTagPcollectionTransformed = results.get(noTag);
static class PubsubMessageToArchiveDoFn extends DoFn<PubsubMessage, AvroPubsubMessageRecord> {
final TupleTag<AvroPubsubMessageRecord> medaPcollection = new TupleTag<AvroPubsubMessageRecord>(){};
final TupleTag<AvroPubsubMessageRecord> noTag = new TupleTag<AvroPubsubMessageRecord>(){};
@ProcessElement
public void processElement(ProcessContext context, MultiOutputReceiver out) {
String appCode;
PubsubMessage message = context.element();
String msgStr = new String(message.getPayload(), StandardCharsets.UTF_8);
try {
JSONObject jsonObject = new JSONObject(msgStr);
LOGGER.info("json: {}", jsonObject);
appCode = jsonObject.getString("app_code");
LOGGER.info(appCode);
if(appCode == "MEDA"){
LOGGER.info("Made it to MEDA tag");
out.get(medaPcollection).output(new AvroPubsubMessageRecord(
message.getPayload(), message.getAttributeMap(), context.timestamp().getMillis()));
} else {
LOGGER.info("Made it to default tag");
out.get(noTag).output(new AvroPubsubMessageRecord(
message.getPayload(), message.getAttributeMap(), context.timestamp().getMillis()));
}
} catch (Exception e) {
LOGGER.info("Error Processing Message: {}\n{}", msgStr, e);
}
}
}

Can you try without the MultiOutputReceiver out parameter in the processElement method?
Outputs are then emitted with context.output, passing the corresponding TupleTag and the element.
Here is your example using only the context:
static class PubsubMessageToArchiveDoFn extends DoFn<PubsubMessage, AvroPubsubMessageRecord> {
final TupleTag<AvroPubsubMessageRecord> medaPcollection = new TupleTag<AvroPubsubMessageRecord>(){};
final TupleTag<AvroPubsubMessageRecord> noTag = new TupleTag<AvroPubsubMessageRecord>(){};
@ProcessElement
public void processElement(ProcessContext context) {
String appCode;
PubsubMessage message = context.element();
String msgStr = new String(message.getPayload(), StandardCharsets.UTF_8);
try {
JSONObject jsonObject = new JSONObject(msgStr);
LOGGER.info("json: {}", jsonObject);
appCode = jsonObject.getString("app_code");
LOGGER.info(appCode);
if ("MEDA".equals(appCode)) { // compare string contents; == would compare object references
LOGGER.info("Made it to MEDA tag");
context.output(medaPcollection, new AvroPubsubMessageRecord(
message.getPayload(), message.getAttributeMap(), context.timestamp().getMillis()));
} else {
LOGGER.info("Made it to default tag");
context.output(noTag, new AvroPubsubMessageRecord(
message.getPayload(), message.getAttributeMap(), context.timestamp().getMillis()));
}
} catch (Exception e) {
LOGGER.info("Error Processing Message: {}\n{}", msgStr, e);
}
}
}
Here is also an example that works for me:
public class WordCountFn extends DoFn<String, Integer> {
private final TupleTag<Integer> outputTag = new TupleTag<Integer>() {};
private final TupleTag<Failure> failuresTag = new TupleTag<Failure>() {};
@ProcessElement
public void processElement(ProcessContext ctx) {
try {
// Could throw ArithmeticException.
final String word = ctx.element();
ctx.output(1 / word.length());
} catch (Throwable throwable) {
final Failure failure = Failure.from("step", ctx.element(), throwable);
ctx.output(failuresTag, failure);
}
}
public TupleTag<Integer> getOutputTag() {
return outputTag;
}
public TupleTag<Failure> getFailuresTag() {
return failuresTag;
}
}
For my first output (the good case), there is no need to pass a TupleTag: ctx.output(1 / word.length());
For my second output (the failure case), I pass the failures tag with the corresponding element.

I was able to get around this by making my ParDo an anonymous function instead of a class. I put the whole function inline and had no problem finding the output tags after I did that. Thanks for the suggestions!

ElasticSearch index exists not working / reliable

I am writing a simple Java wrapper around ElasticSearch's admin client. To test it I have a main method that first checks if an index exists (IndicesExistsRequest), if so deletes it (DeleteIndexRequest), and creates the index again. See code below. Yet I consistently get an IndexAlreadyExistsException.
By the way I am trying to get a client for the node that you start from the command prompt (by simply typing "elastic search"). I have tried every combination of methods on nodeBuilder's fluent interface, but I can't seem to get one.
public static void main(String[] args) {
ElasticSearchJavaClient esjc = new ElasticSearchJavaClient("nda");
if (esjc.indexExists()) {
esjc.deleteIndex();
}
esjc.createIndex();
URL url = SchemaCreator.class.getResource("/elasticsearch/specimen.type.json");
String mappings = FileUtil.getContents(url);
esjc.createType("specimen", mappings);
}
final Client esClient;
final IndicesAdminClient adminClient;
final String indexName;
public ElasticSearchJavaClient(String indexName) {
this.indexName = indexName;
esClient = nodeBuilder().clusterName("elasticsearch").client(true).node().client();
adminClient = esClient.admin().indices();
}
public boolean deleteIndex() {
logger.info("Deleting index " + indexName);
DeleteIndexRequest request = new DeleteIndexRequest(indexName);
try {
DeleteIndexResponse response = adminClient.delete(request).actionGet();
if (!response.isAcknowledged()) {
throw new Exception("Failed to delete index " + indexName);
}
logger.info("Index deleted");
return true;
} catch (IndexMissingException e) {
logger.info("No such index: " + indexName);
return false;
}
}
public boolean indexExists() {
logger.info(String.format("Verifying existence of index \"%s\"", indexName));
IndicesExistsRequest request = new IndicesExistsRequest(indexName);
IndicesExistsResponse response = adminClient.exists(request).actionGet();
if (response.isExists()) {
logger.info("Index exists");
return true;
}
logger.info("No such index");
return false;
}
public void createIndex() {
logger.info("Creating index " + indexName);
CreateIndexRequest request = new CreateIndexRequest(indexName);
IndicesAdminClient iac = esClient.admin().indices();
CreateIndexResponse response = iac.create(request).actionGet();
if (!response.isAcknowledged()) {
throw new Exception("Failed to create index " + indexName);
}
logger.info("Index created");
}

You can also execute a synchronous request like this:
boolean exists = client.admin().indices()
.prepareExists(INDEX_NAME)
.execute().actionGet().isExists();

Here is my solution when using RestHighLevelClient.
A code snippet:
public boolean checkIfIndexExists(String indexName) throws IOException {
Response response = client.getLowLevelClient().performRequest("HEAD", "/" + indexName);
int statusCode = response.getStatusLine().getStatusCode();
return (statusCode != 404);
}
Hopefully this helps someone else!

skgemini's answer is fine if you want to check whether an index is available by its actual name or by any of its aliases.
If, however, you want to check by index name only, here is how:
public boolean checkIfIndexExists(String index) {
IndexMetaData indexMetaData = client.admin().cluster()
.state(Requests.clusterStateRequest())
.actionGet()
.getState()
.getMetaData()
.index(index);
return (indexMetaData != null);
}

OK, I figured out a solution. Since the java client's calls are done asynchronously you have to use the variant which takes an action listener. The solution still gets a bit contrived though:
// Inner class because it's just used to be thrown out of
// the action listener implementation to signal that the
// index exists
private class ExistsException extends RuntimeException {
}
public boolean exists() {
logger.info(String.format("Verifying existence of index \"%s\"", indexName));
IndicesExistsRequest request = new IndicesExistsRequest(indexName);
try {
adminClient.exists(request, new ActionListener<IndicesExistsResponse>() {
public void onResponse(IndicesExistsResponse response) {
if (response.isExists()) {
throw new ExistsException();
}
}
public void onFailure(Throwable e) {
ExceptionUtil.smash(e);
}
});
}
catch (ExistsException e) {
return true;
}
return false;
}

I had the same issue, but I didn't like the solution that uses an ActionListener. ElasticSearch also offers a Future variant (at least as of version 6.1.0).
Here is a code snippet:
public boolean doesIndexExists(String indexName, TransportClient client) {
IndicesExistsRequest request = new IndicesExistsRequest(indexName);
ActionFuture<IndicesExistsResponse> future = client.admin().indices().exists(request);
try {
IndicesExistsResponse response = future.get();
boolean result = response.isExists();
logger.info("Existence of index '" + indexName + "' result is " + result);
return result;
} catch (InterruptedException | ExecutionException e) {
logger.error("Exception at waiting for IndicesExistsResponse", e);
return false;//do some clever exception handling
}
}
Maybe this helps someone else too. Cheers!

This works on Elasticsearch 7.x:
public boolean indexExists(String indexName) throws IOException {
return client.indices().exists(new org.elasticsearch.client.indices.GetIndexRequest(indexName), RequestOptions.DEFAULT);
}

Hashtable remains empty after put()

I have trouble using Hashtable in this class:
public class HttpBuilder {
...
private int ret;
public Hashtable headers;
private String content;
HttpBuilder(int majorv, int minorv, int ret){
ver[0] = majorv;
ver[1] = minorv;
this.ret = ret;
headers = new Hashtable();
}
...
public void addHeader(String header, String value){
headers.put(header, value);
}
...
}
This class builds a string from multiple input parameters. I use it in multiple threads. Something like this:
HttpBuilder Get(HttpParser request) {
HttpBuilder response;
String doc;
if (request.getRequestURL().equals("/")) {
try {
doc = LoadDoc("main.html");
} catch (IOException e) {
response = new HttpBuilder(1, 1, 500);
return response;
}
response = new HttpBuilder(1, 1, 200);
response.addHeader("content-type", "text/html");
response.setContent(doc);
} else {
response = new HttpBuilder(1, 1, 404);
}
return response;
}
After addHeader is called, the Hashtable is empty.
Consume data:
public String toString() {
String result;
int len = 0;
result = "HTTP/"+Integer.toString(ver[0])+"."+Integer.toString(ver[1])+
" "+getHttpReply(ret)+"\n";
if(content!=null){
len = content.length();
if(len!=0){
headers.put("content-length", Integer.toString(len));
}
}
Iterator it = headers.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry) it.next();
result += pairs.getKey() + ": " + pairs.getValue() + "\n";
it.remove();
}
if(len!=0){
result+="\n"+content;
}
return result;
}
The thread class where I use HttpBuilder:
class ClientThread implements Runnable {
private Socket socket;
private ServerData data;
static public final String NotImplemented = "HTTP/1.1 501 Not Implemented";
static public final String NotFound = "HTTP/1.1 404 Not Found";
ClientThread(Socket socket, ServerData data) {
this.socket = socket;
this.data = data;
}
@Override
public void run() {
try {
HttpParser request = new HttpParser(socket.getInputStream());
HttpBuilder response;
if (request.parseRequest() != 200) {
response = new HttpBuilder(1, 1, 501);
} else {
if (request.getMethod().equals("GET")) {
response = Get(request);
} else if (request.getMethod().equals("POST")) {
response = Post(request);
} else {
response = new HttpBuilder(1, 1, 400);
}
}
} catch (IOException e) {
Server.log.log(Level.SEVERE, e.getLocalizedMessage());
} finally {
try {
socket.close();
Server.log.log(Level.INFO, "Close connection");
} catch (IOException e) {
Server.log.log(Level.SEVERE, e.getLocalizedMessage());
}
}
}
void send(String response) throws IOException {
PrintWriter out;
out = new PrintWriter(socket.getOutputStream(), true);
out.print(response);
}
String LoadDoc(String doc) throws IOException {
final String Folder = "web" + File.separator;
String result = null;
doc = Folder + doc;
long len;
File f = new File(doc);
FileReader fr = new FileReader(f);
len = f.length();
char[] buffer = new char[(int) len];
fr.read(buffer);
result = new String(buffer);
fr.close();
return result;
}
HttpBuilder Get(HttpParser request) {
HttpBuilder response;
String doc;
if (request.getRequestURL().equals("/")) {
try {
doc = LoadDoc("main.html");
} catch (IOException e) {
response = new HttpBuilder(1, 1, 500);
return response;
}
response = new HttpBuilder(1, 1, 200);
response.addHeader("content-type", "text/html");
response.setContent(doc);
} else {
response = new HttpBuilder(1, 1, 404);
}
return response;
}
HttpBuilder Post(HttpParser request) {
HttpBuilder response;
String str;
if(request.getRequestURL().equals("/")){
response = new HttpBuilder(1,1, 200);
str = request.getContentParam("user");
response.setContent(str+" added to the base.");
}else {
response = new HttpBuilder(1, 1, 404);
}
return response;
}
}

It seems a bad idea to modify your object in toString().
The purpose of toString() is to return a String representation of your object. Multiple subsequent calls to toString() should return the same result.
When you iterate over the headers in toString(), you also remove them:
Iterator it = headers.entrySet().iterator();
while (it.hasNext()) {
Map.Entry pairs = (Map.Entry) it.next();
result += pairs.getKey() + ": " + pairs.getValue() + "\n";
it.remove();
}
If that's a desired behavior, I suggest you use a method with a different name for this logic.
Since toString() overrides a method of Object, it's possible that it's called somewhere where you're not expecting it to be called, and empties your headers map.
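A minimal sketch of a toString() that only reads the headers, so that repeated calls return the same result. HeaderRenderSketch is a hypothetical, cut-down stand-in for HttpBuilder (status line and content are left out), and headerCount is a helper added purely for illustration.

```java
import java.util.Hashtable;
import java.util.Map;

final class HeaderRenderSketch {

    private final Map<String, String> headers = new Hashtable<>();

    void addHeader(String name, String value) {
        headers.put(name, value);
    }

    // Illustration helper: lets a caller confirm the map is left untouched.
    int headerCount() {
        return headers.size();
    }

    @Override
    public String toString() {
        StringBuilder result = new StringBuilder();
        // Read-only iteration: no it.remove(), so the headers survive the call.
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            result.append(entry.getKey()).append(": ")
                  .append(entry.getValue()).append("\n");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        HeaderRenderSketch response = new HeaderRenderSketch();
        response.addHeader("content-type", "text/html");
        System.out.print(response);
        System.out.print(response); // same output the second time
    }
}
```

With this shape, a debugger or logger that calls toString() implicitly can no longer empty the header table.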

Your debugger calls toString on your HttpBuilder in order to display it, which is why you see the values. But calling this method also removes the values, so viewing the object in the debugger actually empties the table. This is a bad idea: your toString method should not modify the object.
Also, your HttpBuilder as a whole isn't thread safe. Hashtable synchronizes each individual call, but toString() performs a multi-step iterate-and-remove sequence that is not atomic. Luckily, you don't call it from multiple threads, at least not in the code you have posted.

using dbpedia spotlight in java or scala

Does anyone know where to find a little how to on using dbpedia spotlight in java or scala? Or could anyone explain how it's done? I can't find any information on this...

The DBpedia Spotlight wiki pages are a good place to start.
I believe the installation page lists the most popular ways to use the application (using a jar, or setting up a web service).
It includes instructions on using the Java/Scala API with your own installation, or on calling the web service.
Running your own server with the full service requires downloading some additional data, so it is a good time to make yourself a coffee.

You need to download DBpedia Spotlight (the jar file). After that you can use the following two classes (author: pablomendes); I only made some changes.
public class db extends AnnotationClient {
//private final static String API_URL = "http://jodaiber.dyndns.org:2222/";
private static String API_URL = "http://spotlight.dbpedia.org:80/";
private static double CONFIDENCE = 0.0;
private static int SUPPORT = 0;
private static String powered_by ="non";
private static String spotter ="CoOccurrenceBasedSelector";//"LingPipeSpotter"=Annotate all spots
//AtLeastOneNounSelector"=No verbs and adjs.
//"CoOccurrenceBasedSelector" =No 'common words'
//"NESpotter"=Only Per.,Org.,Loc.
private static String disambiguator ="Default";//Default ;Occurrences=Occurrence-centric;Document=Document-centric
private static String showScores ="yes";
@SuppressWarnings("static-access")
public void configiration(double CONFIDENCE,int SUPPORT,
String powered_by,String spotter,String disambiguator,String showScores){
this.CONFIDENCE=CONFIDENCE;
this.SUPPORT=SUPPORT;
this.powered_by=powered_by;
this.spotter=spotter;
this.disambiguator=disambiguator;
this.showScores=showScores;
}
public List<DBpediaResource> extract(Text text) throws AnnotationException {
LOG.info("Querying API.");
String spotlightResponse;
try {
String Query=API_URL + "rest/annotate/?" +
"confidence=" + CONFIDENCE
+ "&support=" + SUPPORT
+ "&spotter=" + spotter
+ "&disambiguator=" + disambiguator
+ "&showScores=" + showScores
+ "&powered_by=" + powered_by
+ "&text=" + URLEncoder.encode(text.text(), "utf-8");
LOG.info(Query);
GetMethod getMethod = new GetMethod(Query);
getMethod.addRequestHeader(new Header("Accept", "application/json"));
spotlightResponse = request(getMethod);
} catch (UnsupportedEncodingException e) {
throw new AnnotationException("Could not encode text.", e);
}
assert spotlightResponse != null;
JSONObject resultJSON = null;
JSONArray entities = null;
try {
resultJSON = new JSONObject(spotlightResponse);
entities = resultJSON.getJSONArray("Resources");
} catch (JSONException e) {
//throw new AnnotationException("Received invalid response from DBpedia Spotlight API.");
}
LinkedList<DBpediaResource> resources = new LinkedList<DBpediaResource>();
if(entities!=null)
for(int i = 0; i < entities.length(); i++) {
try {
JSONObject entity = entities.getJSONObject(i);
resources.add(
new DBpediaResource(entity.getString("@URI"),
Integer.parseInt(entity.getString("@support"))));
} catch (JSONException e) {
LOG.error("JSON exception "+e);
}
}
return resources;
}
}
Second class:
/**
* @author pablomendes
*/
public abstract class AnnotationClient {
public Logger LOG = Logger.getLogger(this.getClass());
private List<String> RES = new ArrayList<String>();
// Create an instance of HttpClient.
private static HttpClient client = new HttpClient();
public List<String> getResu(){
return RES;
}
public String request(HttpMethod method) throws AnnotationException {
String response = null;
// Provide custom retry handler is necessary
method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
new DefaultHttpMethodRetryHandler(3, false));
try {
// Execute the method.
int statusCode = client.executeMethod(method);
if (statusCode != HttpStatus.SC_OK) {
LOG.error("Method failed: " + method.getStatusLine());
}
// Read the response body.
byte[] responseBody = method.getResponseBody(); //TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
// Deal with the response.
// Use caution: ensure correct character encoding and is not binary data
response = new String(responseBody);
} catch (HttpException e) {
LOG.error("Fatal protocol violation: " + e.getMessage());
throw new AnnotationException("Protocol error executing HTTP request.",e);
} catch (IOException e) {
LOG.error("Fatal transport error: " + e.getMessage());
LOG.error(method.getQueryString());
throw new AnnotationException("Transport error executing HTTP request.",e);
} finally {
// Release the connection.
method.releaseConnection();
}
return response;
}
protected static String readFileAsString(String filePath) throws java.io.IOException{
return readFileAsString(new File(filePath));
}
protected static String readFileAsString(File file) throws IOException {
byte[] buffer = new byte[(int) file.length()];
@SuppressWarnings("resource")
BufferedInputStream f = new BufferedInputStream(new FileInputStream(file));
f.read(buffer);
return new String(buffer);
}
static abstract class LineParser {
public abstract String parse(String s) throws ParseException;
static class ManualDatasetLineParser extends LineParser {
public String parse(String s) throws ParseException {
return s.trim();
}
}
static class OccTSVLineParser extends LineParser {
public String parse(String s) throws ParseException {
String result = s;
try {
result = s.trim().split("\t")[3];
} catch (ArrayIndexOutOfBoundsException e) {
throw new ParseException(e.getMessage(), 3);
}
return result;
}
}
}
public void saveExtractedEntitiesSet(String Question, LineParser parser, int restartFrom) throws Exception {
String text = Question;
int i=0;
//int correct =0 ; int error = 0;int sum = 0;
for (String snippet: text.split("\n")) {
String s = parser.parse(snippet);
if (s!= null && !s.equals("")) {
i++;
if (i<restartFrom) continue;
List<DBpediaResource> entities = new ArrayList<DBpediaResource>();
try {
entities = extract(new Text(snippet.replaceAll("\\s+"," ")));
System.out.println(entities.get(0).getFullUri());
} catch (AnnotationException e) {
// error++;
LOG.error(e);
e.printStackTrace();
}
for (DBpediaResource e: entities) {
RES.add(e.uri());
}
}
}
}
public abstract List<DBpediaResource> extract(Text text) throws AnnotationException;
public void evaluate(String Question) throws Exception {
evaluateManual(Question,0);
}
public void evaluateManual(String Question, int restartFrom) throws Exception {
saveExtractedEntitiesSet(Question,new LineParser.ManualDatasetLineParser(), restartFrom);
}
}
main()
public static void main(String[] args) throws Exception {
String Question ="Is the Amazon river longer than the Nile River?";
db c = new db ();
c.configiration(0.0, 0, "non", "CoOccurrenceBasedSelector", "Default", "yes");
System.out.println("resource : "+c.getResu());
}

I would just add one small fix to your answer.
Your code runs if you add the evaluate method call:
public static void main(String[] args) throws Exception {
String question = "Is the Amazon river longer than the Nile River?";
db c = new db ();
c.configiration(0.0, 0, "non", "CoOccurrenceBasedSelector", "Default", "yes");
c.evaluate(question);
System.out.println("resource : "+c.getResu());
}
Lamine

In the request method of the second class (AnnotationClient) in Adel's answer, the author Pablo Mendes left an unfinished
TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
which produces an annoying warning. It can be removed by replacing
byte[] responseBody = method.getResponseBody(); //TODO Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
// Deal with the response.
// Use caution: ensure correct character encoding and is not binary data
response = new String(responseBody);
with
Reader in = new InputStreamReader(method.getResponseBodyAsStream(), "UTF-8");
StringWriter writer = new StringWriter();
org.apache.commons.io.IOUtils.copy(in, writer);
response = writer.toString();
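If adding Apache Commons IO is not desirable, the same stream-to-string step can be sketched with the standard library alone. StreamToStringSketch and readUtf8 are hypothetical names, and a ByteArrayInputStream stands in for method.getResponseBodyAsStream(); the sketch assumes the response is UTF-8 text, as in the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

final class StreamToStringSketch {

    // Reads the whole stream as UTF-8 text in fixed-size chunks and closes it.
    static String readUtf8(InputStream in) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            StringWriter writer = new StringWriter();
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
            return writer.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HTTP response body stream.
        byte[] body = "{\"Resources\": []}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(body)));
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which new String(responseBody) in the original code relies on.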
