How to process the rdf version of a DBpedia page with Jena? - java

On every DBpedia page, e.g.
http://dbpedia.org/page/Ireland
there's a link to an RDF file.
In my application I need to analyse the RDF and run some logic on it.
I could rely on the DBpedia SPARQL endpoint, but I prefer to download the RDF locally and parse it, to have full control over it.
I installed Jena and I'm trying to parse the data and extract, for example, a property called "geo:geometry".
I'm trying with:
StringReader sr = new StringReader( node.rdfCode );
Model model = ModelFactory.createDefaultModel();
model.read( sr, null );
How can I query the model to get the info I need?
For example, if I wanted to get the statement:
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<geo:geometry xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" rdf:datatype="http://www.openlinksw.com/schemas/virtrdf#Geometry">POINT(-7 53)</geo:geometry>
</rdf:Description>
Or
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<dbpprop:countryLargestCity xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Dublin</dbpprop:countryLargestCity>
</rdf:Description>
What is the right filter?
Many thanks!
Mulone

Once you have the file parsed into a Jena model you can iterate and filter with something like:
// Property to filter the model
Property geoProperty =
    model.createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#", "geometry");
// Iterator based on a SimpleSelector
StmtIterator iter =
    model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode) null));
// Loop to traverse the statements that match the SimpleSelector
while (iter.hasNext()) {
    Statement stmt = iter.nextStatement();
    System.out.print(stmt.getSubject().toString());
    System.out.print(stmt.getPredicate().toString());
    System.out.println(stmt.getObject().toString());
}
The SimpleSelector lets you pass any (subject, predicate, object) pattern to match statements in the model. In your case, since you only care about a specific predicate, the first and third parameters of the constructor are null.
Filtering on two different properties
To allow more complex filtering you can override the selects method of SimpleSelector, like this:
Property geoProperty = /* like before */;
Property countryLargestCityProperty =
    model.createProperty("http://dbpedia.org/property/", "countryLargestCity");
SimpleSelector selector = new SimpleSelector(null, null, (RDFNode) null) {
    public boolean selects(Statement s) {
        return s.getPredicate().equals(geoProperty) ||
               s.getPredicate().equals(countryLargestCityProperty);
    }
};
StmtIterator iter = model.listStatements(selector);
while (iter.hasNext()) {
    /* same as in the previous example */
}
Edit: including a full example
Here is a full example that works for me.
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.SimpleSelector;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.rdf.model.Statement;

public class TestJena {
    public static void main(String[] args) {
        FileManager fManager = FileManager.get();
        fManager.addLocatorURL();
        Model model = fManager.loadModel("http://dbpedia.org/data/Ireland.rdf");
        Property geoProperty =
            model.createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#", "geometry");
        StmtIterator iter =
            model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode) null));
        // Loop to traverse the statements that match the SimpleSelector
        while (iter.hasNext()) {
            Statement stmt = iter.nextStatement();
            if (stmt.getObject().isLiteral()) {
                Literal obj = (Literal) stmt.getObject();
                System.out.println("The geometry predicate value is " + obj.getString());
            }
        }
    }
}
This full example prints out:
The geometry predicate value is POINT(-7 53)
Notes on Linked Data
http://dbpedia.org/page/Ireland is the HTML document version of the resource http://dbpedia.org/resource/Ireland.
In order to get the RDF you should resolve:
http://dbpedia.org/data/Ireland.rdf
or
http://dbpedia.org/resource/Ireland with Accept: application/rdf+xml in the HTTP header.
With curl it'd be something like:
curl -L -H 'Accept: application/rdf+xml' http://dbpedia.org/resource/Ireland
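With Java's built-in HTTP client (JDK 11+), the same content negotiation can be sketched roughly as below; the class name FetchRdf and the helper method are just illustrative, not part of any library:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class FetchRdf {
    // Build a request that asks DBpedia for RDF/XML instead of the HTML page
    static HttpRequest buildRequest(String resourceUri) {
        return HttpRequest.newBuilder()
                .uri(URI.create(resourceUri))
                .header("Accept", "application/rdf+xml")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest("http://dbpedia.org/resource/Ireland");
        // The Accept header is what drives DBpedia's content negotiation
        System.out.println(req.headers().firstValue("Accept").orElse(""));
    }
}
```

To actually send the request, mirror curl's -L flag by building the client with HttpClient.newBuilder().followRedirects(HttpClient.Redirect.NORMAL), since DBpedia answers content-negotiated requests with a 303 redirect.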

Related

How to pass “values” attribute as array of strings in payload object in case of Googles AUTOML TABLE?

We are using the Google Cloud service AutoML Tables for online prediction.
We have created, trained and deployed the model. The model is giving predictions via the Google console. We are trying to integrate this model into our Java code.
We are not able to pass the "values" attribute as an array of strings in the payload object in Java code. We haven't found anything about this in the documentation.
Please find the links we are using for this:
https://cloud.google.com/automl-tables/docs/samples/automl-tables-predict
Please find the json object in the screenshot.
Please let us know how to pass the "values" attribute as an array of strings in the payload object.
Thanks.
Based on the reference you are following, to populate "values" you need to define it in main(). You can refer to the Value.Builder class if you need to set numbers, nulls, etc.
List<Value> values = new ArrayList<>();
values.add(Value.newBuilder().setStringValue("This is test data.").build());
// add more elements in values as needed
This values list will be used in a Row, which accepts an iterable of protobuf values. See Row.newBuilder().addAllValues().
Row row = Row.newBuilder().addAllValues(values).build();
Using these, the payload is complete and a prediction request can be built:
ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();
PredictRequest request =
    PredictRequest.newBuilder()
        .setName(name.toString())
        .setPayload(payload)
        .putParams("feature_importance", "true")
        .build();
PredictResponse response = client.predict(request);
Your full prediction code should look like this:
import com.google.cloud.automl.v1beta1.AnnotationPayload;
import com.google.cloud.automl.v1beta1.ExamplePayload;
import com.google.cloud.automl.v1beta1.ModelName;
import com.google.cloud.automl.v1beta1.PredictRequest;
import com.google.cloud.automl.v1beta1.PredictResponse;
import com.google.cloud.automl.v1beta1.PredictionServiceClient;
import com.google.cloud.automl.v1beta1.Row;
import com.google.cloud.automl.v1beta1.TablesAnnotation;
import com.google.protobuf.Value;
import com.google.protobuf.NullValue;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
class TablesPredict {
    public static void main(String[] args) throws IOException {
        // TODO(developer): Replace these variables before running the sample.
        String projectId = "your-project-id";
        String modelId = "TBL9999999999";
        // Values should match the input expected by your model.
        List<Value> values = new ArrayList<>();
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("blue-colar").build());
        values.add(Value.newBuilder().setStringValue("married").build());
        values.add(Value.newBuilder().setStringValue("primary").build());
        values.add(Value.newBuilder().setStringValue("no").build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("yes").build());
        values.add(Value.newBuilder().setStringValue("yes").build());
        values.add(Value.newBuilder().setStringValue("cellular").build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setNullValue(NullValue.NULL_VALUE).build());
        values.add(Value.newBuilder().setStringValue("unknown").build());
        predict(projectId, modelId, values);
    }

    static void predict(String projectId, String modelId, List<Value> values) throws IOException {
        // Initialize client that will be used to send requests. This client only needs to be created
        // once, and can be reused for multiple requests. After completing all of your requests, call
        // the "close" method on the client to safely clean up any remaining background resources.
        try (PredictionServiceClient client = PredictionServiceClient.create()) {
            // Get the full path of the model.
            ModelName name = ModelName.of(projectId, "us-central1", modelId);
            Row row = Row.newBuilder().addAllValues(values).build();
            ExamplePayload payload = ExamplePayload.newBuilder().setRow(row).build();
            // Feature importance gives you visibility into how the features in a specific prediction
            // request informed the resulting prediction. For more info, see:
            // https://cloud.google.com/automl-tables/docs/features#local
            PredictRequest request =
                PredictRequest.newBuilder()
                    .setName(name.toString())
                    .setPayload(payload)
                    .putParams("feature_importance", "true")
                    .build();
            PredictResponse response = client.predict(request);
            System.out.println("Prediction results:");
            for (AnnotationPayload annotationPayload : response.getPayloadList()) {
                TablesAnnotation tablesAnnotation = annotationPayload.getTables();
                System.out.format(
                    "Classification label: %s%n", tablesAnnotation.getValue().getStringValue());
                System.out.format("Classification score: %.3f%n", tablesAnnotation.getScore());
                // Get features of top importance
                tablesAnnotation
                    .getTablesModelColumnInfoList()
                    .forEach(
                        info ->
                            System.out.format(
                                "\tColumn: %s - Importance: %.2f%n",
                                info.getColumnDisplayName(), info.getFeatureImportance()));
            }
        }
    }
}
For testing purposes I used Google's test dataset (gs://cloud-ml-tables-data/bank-marketing.csv) and used the code above to send a prediction request.
See test prediction:

Nashorn Abstract Syntax Tree Traversal

I am attempting to parse this Javascript via Nashorn:
function someFunction() { return b + 1 };
and navigate to all of the statements, including statements inside the function.
The code below just prints:
"function {U%}someFunction = [] function {U%}someFunction()"
How do I "get inside" the function node to its body "return b + 1"? I presume I need to traverse the tree with a visitor and get the child node?
I have been following the second answer to the following question:
Javascript parser for Java
import jdk.nashorn.internal.ir.Block;
import jdk.nashorn.internal.ir.FunctionNode;
import jdk.nashorn.internal.ir.Statement;
import jdk.nashorn.internal.parser.Parser;
import jdk.nashorn.internal.runtime.Context;
import jdk.nashorn.internal.runtime.ErrorManager;
import jdk.nashorn.internal.runtime.Source;
import jdk.nashorn.internal.runtime.options.Options;
import java.util.List;
public class Main {
    public static void main(String[] args) {
        Options options = new Options("nashorn");
        options.set("anon.functions", true);
        options.set("parse.only", true);
        options.set("scripting", true);
        ErrorManager errors = new ErrorManager();
        Context context = new Context(options, errors, Thread.currentThread().getContextClassLoader());
        Source source = Source.sourceFor("test", "function someFunction() { return b + 1; } ");
        Parser parser = new Parser(context.getEnv(), source, errors);
        FunctionNode functionNode = parser.parse();
        Block block = functionNode.getBody();
        List<Statement> statements = block.getStatements();
        for (Statement statement : statements) {
            System.out.println(statement);
        }
    }
}
Using private/internal implementation classes of the Nashorn engine is not a good idea. With a security manager on, you'll get an access exception. With JDK 9 and beyond, you'll get a module access error with or without a security manager (the jdk.nashorn.internal.* packages are not exported from the nashorn module).
You have two options to parse JavaScript using Nashorn:
Nashorn parser API: https://docs.oracle.com/javase/9/docs/api/jdk/nashorn/api/tree/Parser.html
To use the parser API, you need JDK 9+.
For JDK 8, you can use parser.js:
load("nashorn:parser.js");
and call the "parse" function from script. This function returns a JSON object that represents the AST of the parsed script.
See this sample: http://hg.openjdk.java.net/jdk8u/jdk8u-dev/nashorn/file/a6d0aec77286/samples/astviewer.js

Regex for SPARQL

I have downloaded dbpedia_quotationsbook.zip from dbpedia which contains dbpedia_quotationsbook.nt triplestore.
In this triplestore:
the subject is the author name
the predicate is "sameAs"
the object is the author code
I have tried querying this triplestore using Jena, and simple queries run fine.
Now I want all author codes whose author name partially matches a given string.
So I tried the following query:
select ?code
where
{
FILTER regex(?name, "^Rob") <http://www.w3.org/2002/07/owl#sameAs> ?code.
}
The above query should return all author codes whose author name matches "Rob", but I am getting the following exception:
Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "." ". "" at line 5, column 74.
Was expecting one of:
<IRIref> ...
<PNAME_NS> ...
<PNAME_LN> ...
<BLANK_NODE_LABEL> ...
<VAR1> ...
<VAR2> ...
"true" ...
"false" ...
<INTEGER> ...
<DECIMAL> ...
<DOUBLE> ...
<INTEGER_POSITIVE> ...
<DECIMAL_POSITIVE> ...
<DOUBLE_POSITIVE> ...
<INTEGER_NEGATIVE> ...
<DECIMAL_NEGATIVE> ...
<DOUBLE_NEGATIVE> ...
<STRING_LITERAL1> ...
<STRING_LITERAL2> ...
<STRING_LITERAL_LONG1> ...
<STRING_LITERAL_LONG2> ...
"(" ...
<NIL> ...
"[" ...
<ANON> ...
at com.hp.hpl.jena.sparql.lang.ParserSPARQL11.perform(ParserSPARQL11.java:102)
at com.hp.hpl.jena.sparql.lang.ParserSPARQL11.parse$(ParserSPARQL11.java:53)
at com.hp.hpl.jena.sparql.lang.SPARQLParser.parse(SPARQLParser.java:34)
at com.hp.hpl.jena.query.QueryFactory.parse(QueryFactory.java:148)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:80)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:53)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:41)
at rdfcreate.NewClass.query(NewClass.java:55)
at rdfcreate.NewClass.main(NewClass.java:97)
Jena Code
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.util.FileManager;

/**
 * @author swapnil
 */
public class NewClass {
    String read() {
        final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels";
        String dataPath = "F:\\Swapnil Drive\\Project Stuff\\Project data 2015 16\\Freelancer\\SPARQL\\dbpedia_quotationsbook.nt";
        Model tdbModel = TDBFactory.createModel(tdbDirectory);
        /* Incrementally read data into the Model, once per run, RAM > 6 GB */
        FileManager.get().readModel(tdbModel, dataPath, "N-TRIPLES");
        tdbModel.close();
        return tdbDirectory;
    }

    void query(String tdbDirectory, String query1) {
        Dataset dataset = TDBFactory.createDataset(tdbDirectory);
        Model tdb = dataset.getDefaultModel();
        Query query = QueryFactory.create(query1);
        QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
        /* Execute the Query */
        ResultSet results = qexec.execSelect();
        System.out.println(results.getRowNumber());
        while (results.hasNext()) {
            // Do something important
            QuerySolution qs = results.nextSolution();
            System.out.println("sol " + qs);
        }
        qexec.close();
        tdb.close();
    }

    public static void main(String[] args) {
        NewClass nc = new NewClass();
        String tdbd = nc.read();
        nc.query(tdbd, "select ?code\n" +
            "WHERE\n" +
            "{\n" +
            "<http://dbpedia.org/resource/Robert_H._Schuller> <http://www.w3.org/2002/07/owl#sameAs> ?code.\n" +
            "}");
    }
}
Result
sol ( ?code = http://quotationsbook.com/author/6523 )
The above query gives me the code of the given author.
Please help me with this.
You cannot mix patterns and filters. You must first bind (i.e. select) the ?name using a triple pattern and then filter the results. Jena complains because your SPARQL has invalid syntax.
Now, you could run the query below, but your data only contains mappings between DBpedia URIs and quotationsbook URIs.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
select ?code
where
{
?author <name> ?name .
?author owl:sameAs ?code .
FILTER regex(?name, "^Rob")
}
The above means
Get names of authors
Get codes of authors
Include only authors whose name matches regex
Select their codes
Again, this would only work for data available locally. The problem is that you do not have the actual names. Of course you could change your query to run the regex over the entire DBpedia identifier, but that's not perfect.
FILTER regex(?author, "Rob")
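Incidentally, SPARQL's regex follows the same find-anywhere-unless-anchored semantics as Java's java.util.regex, which is why "^Rob" means "starts with Rob" while a bare "Rob" matches anywhere. A quick standalone sketch with made-up names:

```java
import java.util.List;
import java.util.regex.Pattern;

public class RegexSemantics {
    public static void main(String[] args) {
        // "^Rob" matches only strings that start with Rob;
        // unanchored "Rob" matches anywhere in the string
        Pattern anchored = Pattern.compile("^Rob");
        Pattern anywhere = Pattern.compile("Rob");
        for (String name : List.of("Robert H. Schuller", "Mr. Robinson")) {
            System.out.println(name
                    + " ^Rob=" + anchored.matcher(name).find()
                    + " Rob=" + anywhere.matcher(name).find());
        }
    }
}
```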
What you can do, because DBpedia resources are dereferenceable, is wrap the name triple pattern in a GRAPH pattern:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
select ?author ?code
where
{
GRAPH <file://path/to/dbpedia_quotationsbook.nt>
{
?author owl:sameAs ?code .
}
GRAPH ?author
{
?author <http://www.w3.org/2000/01/rdf-schema#label> ?name .
FILTER regex(?name, "^Rob")
}
}
Here's what's happening
Get ?authors and ?codes from the import file (SPARQL GUI imports into a graph)
Treat ?author as graph name, so that it can be downloaded from the web
Get ?author's ?names
Filter ?names which start with Rob
There are two important bits to make this work, depending on your SPARQL processor (I'm using SPARQL GUI from dotNetRDF toolkit).
Here's a screenshot of the results I got. Notice the highlighted settings and Fiddler log of dbpedia requests.
Bottom line is I've just given you an example of federated SPARQL query.

RDF/XML Jena getValue

I need help getting some information out of RDF with the Jena framework.
I have RDF content like this:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ts="http://www.test.com/testModel.owl#">
<ts:Entity rdf:ID="1234_test">
  <....>
</ts:Entity>
</rdf:RDF>
Now I want to get the ID out of ts:Entity. This is my code:
Model model = ModelFactory.createDefaultModel();
InputStream requestBody = new ByteArrayInputStream(request.getBytes());
String BASE = "http://www.test.com/testModel.owl#";
model.read(requestBody, BASE);
requestBody.close();
StmtIterator iter = model.listStatements();
while (iter.hasNext()) {
    Statement stmt = iter.nextStatement();    // get next statement
    Resource subject = stmt.getSubject();     // get the subject
    Property predicate = stmt.getPredicate(); // get the predicate
    RDFNode object = stmt.getObject();        // get the object
    System.out.println(subject + " | " + predicate);
}
The only problem in that case is that I have to iterate over all the statements. Is there a way to get the ID from ts:Entity directly? Maybe by first getting the resource and then the value of the ID related to that resource.
Thanks in advance for helping.
Sorry, I am here again because I have a similar question. If, for example, I have this RDF:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ts="http://www.test.com/testModel.owl#">
<ts:Entity rdf:ID="1234_test">
  <ts:Resource>
    <ts:testProp rdf:datatype="http://www.w3.org/2001/XMLSchema#string">test_ID_test</ts:testProp>
  </ts:Resource>
</ts:Entity>
</rdf:RDF>
How can I extract the value test_ID_test? And if I want to use SPARQL, how can I do that with Jena?
You should be using SPARQL to query your model rather than iterating over all the statements. Jena provides a good tutorial on how to use SPARQL with their API.
How about:
Resource ENTITY_TYPE = model.getResource("http://www.test.com/testModel.owl#Entity");
StmtIterator iter = model.listStatements(null, RDF.type, ENTITY_TYPE);
while (iter.hasNext()) {
    String entityID = iter.next().getSubject().getURI();
    System.out.println(entityID);
}
That will get the URI of each entity.
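As for the SPARQL part of the follow-up question, a query along these lines, reusing the ts namespace from the question, should select the testProp values; with Jena you would pass it to QueryFactory.create and run it via QueryExecutionFactory against your model (this is a sketch of the query shape, not tied to any particular entity layout):

```sparql
PREFIX ts: <http://www.test.com/testModel.owl#>

SELECT ?s ?value
WHERE {
  ?s ts:testProp ?value .
}
```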

Extracting Values From an XML File Either using XPath, SAX or DOM for this Specific Scenario

I am currently working on an academic project, developing in Java and XML. The actual task is to parse XML, collecting the required values, preferably into a HashMap for further processing. Here is a short snippet of the actual XML.
<root>
<BugReport ID = "1">
<Title>"(495584) Firefox - search suggestions passes wrong previous result to form history"</Title>
<Turn>
<Date>'2009-06-14 18:55:25'</Date>
<From>'Justin Dolske'</From>
<Text>
<Sentence ID = "3.1"> Created an attachment (id=383211) [details] Patch v.2</Sentence>
<Sentence ID = "3.2"> Ah. So, there's a ._formHistoryResult in the....</Sentence>
<Sentence ID = "3.3"> The simple fix it to just discard the service's form history result.</Sentence>
<Sentence ID = "3.4"> Otherwise it's trying to use a old form history result that no longer applies for the search string.</Sentence>
</Text>
</Turn>
<Turn>
<Date>'2009-06-19 12:07:34'</Date>
<From>'Gavin Sharp'</From>
<Text>
<Sentence ID = "4.1"> (From update of attachment 383211 [details])</Sentence>
<Sentence ID = "4.2"> Perhaps we should rename one of them to _fhResult just to reduce confusion?</Sentence>
</Text>
</Turn>
<Turn>
<Date>'2009-06-19 13:17:56'</Date>
<From>'Justin Dolske'</From>
<Text>
<Sentence ID = "5.1"> (In reply to comment #3)</Sentence>
<Sentence ID = "5.2"> &gt; (From update of attachment 383211 [details] [details])</Sentence>
<Sentence ID = "5.3"> &gt; Perhaps we should rename one of them to _fhResult just to reduce confusion?</Sentence>
<Sentence ID = "5.4"> Good point.</Sentence>
<Sentence ID = "5.5"> I renamed the one in the wrapper to _formHistResult. </Sentence>
<Sentence ID = "5.6"> fhResult seemed maybe a bit too short.</Sentence>
</Text>
</Turn>
.....
and so on
</BugReport>
There are many commenters like 'Justin Dolske' who have commented on this report, and what I am actually looking for is the list of commenters and all the sentences they have written in the whole XML file. Something like: if (from == "Justin Dolske") getAllHisSentences(). Similarly for all other commenters. I have tried many different ways to get the sentences only for 'Justin Dolske' or other commenters, even in a generic form for all, using XPath, SAX and DOM, but failed. I am quite new to these technologies, including Java, and don't know how to achieve it.
Can anyone guide me specifically on how I could do this with any of the above technologies, or is there a better strategy?
(Note: Later I want to put it in a HashMap(key, value), where the key is the name of the commenter (e.g. 'Justin Dolske') and the value is all of their sentences.)
Urgent help will be highly appreciated.
There are several ways to achieve your requirement.
One way would be to use JAXB. There are several tutorials on this available on the web, so feel free to refer to them.
You can also create a DOM, extract the data from it, and then put it into your HashMap.
One reference implementation would be something like this:
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
public class XMLReader {
    private HashMap<String, ArrayList<String>> namesSentencesMap;

    public XMLReader() {
        namesSentencesMap = new HashMap<String, ArrayList<String>>();
    }

    private Document getDocument(String fileName) {
        Document document = null;
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File(fileName));
        } catch (Exception exe) {
            // handle exception
        }
        return document;
    }

    private void buildNamesSentencesMap(Document document) {
        if (document == null) {
            return;
        }
        // Get each Turn block
        NodeList turnList = document.getElementsByTagName("Turn");
        String fromName = null;
        NodeList sentenceNodeList = null;
        for (int turnIndex = 0; turnIndex < turnList.getLength(); turnIndex++) {
            Element turnElement = (Element) turnList.item(turnIndex);
            // Assumption: <From> element
            Element fromElement = (Element) turnElement.getElementsByTagName("From").item(0);
            fromName = fromElement.getTextContent();
            // Extracting sentences - first check whether the map contains
            // an ArrayList corresponding to the name. If yes, then use that,
            // else create a new one
            ArrayList<String> sentenceList = namesSentencesMap.get(fromName);
            if (sentenceList == null) {
                sentenceList = new ArrayList<String>();
            }
            // Extract sentences from the Turn node
            try {
                sentenceNodeList = turnElement.getElementsByTagName("Sentence");
                for (int sentenceIndex = 0; sentenceIndex < sentenceNodeList.getLength(); sentenceIndex++) {
                    sentenceList.add(((Element) sentenceNodeList.item(sentenceIndex)).getTextContent());
                }
            } finally {
                sentenceNodeList = null;
            }
            // Put the list back in the map
            namesSentencesMap.put(fromName, sentenceList);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = new XMLReader();
        reader.buildNamesSentencesMap(reader.getDocument("<your_xml_file>"));
        for (String names : reader.namesSentencesMap.keySet()) {
            System.out.println("Name: " + names + "\tTotal Sentences: " + reader.namesSentencesMap.get(names).size());
        }
    }
}
Note: This is just a demonstration and you would need to modify it to suit your need. I've created it based on your XML to show one way of doing it.
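Since the question title also mentions XPath, here is a minimal, self-contained sketch using the JDK's built-in javax.xml.xpath; the sample XML string is condensed from the question, and the class and method names are just illustrative:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {
    // Condensed sample based on the XML in the question
    static final String SAMPLE =
        "<root><BugReport ID=\"1\">"
        + "<Turn><From>'Justin Dolske'</From>"
        + "<Text><Sentence ID=\"3.1\">Good point.</Sentence></Text></Turn>"
        + "<Turn><From>'Gavin Sharp'</From>"
        + "<Text><Sentence ID=\"4.1\">Perhaps rename one of them?</Sentence></Text></Turn>"
        + "</BugReport></root>";

    // Return the text of every Sentence in Turns whose From matches the commenter
    static List<String> sentencesBy(String xml, String commenter) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath string literals may be double-quoted, so the single quotes
        // that are part of the data need no escaping
        String expr = "//Turn[From=\"" + commenter + "\"]/Text/Sentence";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        List<String> sentences = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            sentences.add(nodes.item(i).getTextContent());
        }
        return sentences;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sentencesBy(SAMPLE, "'Justin Dolske'"));
    }
}
```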
I suggest using JAXB to create a data model reflecting your XML structure.
Once done, you can load the XML into Java instances.
Put each Turn into a Map< String, List< Turn >>, using Turn.From as the key.
Then you can write:
List< Turn > justinsTurn = allTurns.get( "'Justin Dolske'" );
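The grouping step above can be sketched in plain Java; Turn here is a stand-in for whatever JAXB-generated class you end up with, not a real library type:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a JAXB-generated class with a From field and Sentence texts
class Turn {
    final String from;
    final List<String> sentences;

    Turn(String from, List<String> sentences) {
        this.from = from;
        this.sentences = sentences;
    }
}

public class GroupTurns {
    // Group turns by commenter name, so allTurns.get(name) yields that
    // commenter's turns (and through them, all their sentences)
    static Map<String, List<Turn>> byAuthor(List<Turn> turns) {
        Map<String, List<Turn>> map = new HashMap<>();
        for (Turn t : turns) {
            map.computeIfAbsent(t.from, k -> new ArrayList<>()).add(t);
        }
        return map;
    }

    public static void main(String[] args) {
        List<Turn> turns = List.of(
            new Turn("'Justin Dolske'", List.of("Good point.")),
            new Turn("'Gavin Sharp'", List.of("Perhaps we should rename one of them?")),
            new Turn("'Justin Dolske'", List.of("I renamed the one in the wrapper.")));
        System.out.println(byAuthor(turns).get("'Justin Dolske'").size());
    }
}
```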
