Regex for SPARQL - java

I have downloaded dbpedia_quotationsbook.zip from dbpedia which contains dbpedia_quotationsbook.nt triplestore.
In this triplestore
subject is authorname
predicate is "sameas"
object is authorcode
I have tried querying this triplestore using Jena, and simple queries run fine.
Now I want all authorcodes whose authorname partially matches a given string.
So I tried the following query:
select ?code
where
{
FILTER regex(?name, "^Rob") <http://www.w3.org/2002/07/owl#sameAs> ?code.
}
The above query should return all authorcodes whose authorname starts with "Rob".
I am getting the following exception:
Exception in thread "main" com.hp.hpl.jena.query.QueryParseException: Encountered " "." ". "" at line 5, column 74.
Was expecting one of:
<IRIref> ...
<PNAME_NS> ...
<PNAME_LN> ...
<BLANK_NODE_LABEL> ...
<VAR1> ...
<VAR2> ...
"true" ...
"false" ...
<INTEGER> ...
<DECIMAL> ...
<DOUBLE> ...
<INTEGER_POSITIVE> ...
<DECIMAL_POSITIVE> ...
<DOUBLE_POSITIVE> ...
<INTEGER_NEGATIVE> ...
<DECIMAL_NEGATIVE> ...
<DOUBLE_NEGATIVE> ...
<STRING_LITERAL1> ...
<STRING_LITERAL2> ...
<STRING_LITERAL_LONG1> ...
<STRING_LITERAL_LONG2> ...
"(" ...
<NIL> ...
"[" ...
<ANON> ...
at com.hp.hpl.jena.sparql.lang.ParserSPARQL11.perform(ParserSPARQL11.java:102)
at com.hp.hpl.jena.sparql.lang.ParserSPARQL11.parse$(ParserSPARQL11.java:53)
at com.hp.hpl.jena.sparql.lang.SPARQLParser.parse(SPARQLParser.java:34)
at com.hp.hpl.jena.query.QueryFactory.parse(QueryFactory.java:148)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:80)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:53)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:41)
at rdfcreate.NewClass.query(NewClass.java:55)
at rdfcreate.NewClass.main(NewClass.java:97)
Jena Code
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.util.FileManager;
/**
*
* @author swapnil
*/
public class NewClass {
String read()
{
final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels";
String dataPath = "F:\\Swapnil Drive\\Project Stuff\\Project data 2015 16\\Freelancer\\SPARQL\\dbpedia_quotationsbook.nt";
Model tdbModel = TDBFactory.createModel(tdbDirectory);
/*Incrementally read data to the Model, once per run , RAM > 6 GB*/
FileManager.get().readModel( tdbModel, dataPath, "N-TRIPLES");
tdbModel.close();
return tdbDirectory;
}
void query(String tdbDirectory, String query1)
{
Dataset dataset = TDBFactory.createDataset(tdbDirectory);
Model tdb = dataset.getDefaultModel();
Query query = QueryFactory.create(query1);
QueryExecution qexec = QueryExecutionFactory.create(query, tdb);
/*Execute the Query*/
ResultSet results = qexec.execSelect();
System.out.println(results.getRowNumber());
while (results.hasNext()) {
// Do something important
QuerySolution qs = results.nextSolution();
qs.toString();
System.out.println("sol "+qs);
}
qexec.close();
tdb.close() ;
}
public static void main(String[] args) {
NewClass nc = new NewClass();
String tdbd= nc.read();
nc.query(tdbd, "select ?code\n" +
"WHERE\n" +
"{\n" +
"<http://dbpedia.org/resource/Robert_H._Schuller> <http://www.w3.org/2002/07/owl#sameAs> ?code.\n" +
"}");
}
}
Result
sol ( ?code = http://quotationsbook.com/author/6523 )
Above query gives me code of the given author.
Please help me on this

You cannot mix patterns and filters. You must first bind (i.e. select) ?name using a triple pattern and then filter the results. Jena complains simply because your SPARQL has invalid syntax.
Now, you could run the query below, but your data only contains a mapping between DBpedia URIs and quotationsbook URIs, so there are no name triples for it to match.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
select ?code
where
{
?author <name> ?name .
?author owl:sameAs ?code .
FILTER regex(?name, "^Rob")
}
The above means
Get names of authors
Get codes of authors
Include only authors whose name matches regex
Select their codes
Again, this would only work for data available locally, and the problem is that you do not have the actual names. Of course you could change your query to apply the regex to the entire DBpedia identifier, but that's not perfect. Note that regex expects a string, so the IRI has to be converted with str() first:
FILTER regex(str(?author), "Rob")
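To see why matching on the whole identifier is "not perfect", here is a small plain-Java sketch (no Jena involved). It assumes the subject is a full DBpedia IRI like the one in the question; the class and method names are made up for illustration. java.util.regex is not exactly the same dialect as SPARQL's regex, but for patterns this simple the behaviour should be the same.

```java
import java.util.regex.Pattern;

public class RegexOnIdentifier {
    // Mimics SPARQL regex(): true if the pattern matches anywhere in the text.
    static boolean matches(String text, String pattern) {
        return Pattern.compile(pattern).matcher(text).find();
    }

    public static void main(String[] args) {
        String iri = "http://dbpedia.org/resource/Robert_H._Schuller";

        // Anchored at the start of the string: the IRI starts with "http", not "Rob".
        System.out.println(matches(iri, "^Rob"));

        // Unanchored: matches, but would also match "Rob" in the middle of any name.
        System.out.println(matches(iri, "Rob"));

        // Anchoring on the last path segment is closer to "name starts with Rob".
        System.out.println(matches(iri, "/resource/Rob"));
    }
}
```

So when filtering on the identifier, "^Rob" finds nothing, and a pattern such as "/resource/Rob" is probably the closest substitute for a starts-with test on the name.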
What you can do, because DBpedia resources are dereferenceable, is wrap the name triple pattern in a GRAPH pattern:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
select ?author ?code
where
{
GRAPH <file://path/to/dbpedia_quotationsbook.nt>
{
?author owl:sameAs ?code .
}
GRAPH ?author
{
?author <http://www.w3.org/2000/01/rdf-schema#label> ?name .
FILTER regex(?name, "^Rob")
}
}
Here's what's happening
Get ?authors and ?codes from the import file (SPARQL GUI imports into a graph)
Treat ?author as graph name, so that it can be downloaded from the web
Get ?author's ?names
Filter ?names which start with Rob
There are two important bits to make this work, depending on your SPARQL processor (I'm using SPARQL GUI from dotNetRDF toolkit).
Here's a screenshot of the results I got. Notice the highlighted settings and Fiddler log of dbpedia requests.
Bottom line: I've just given you an example of a federated SPARQL query.

Related

Double data type not working with RDF star query

Apache Jena is not able to query RDF Star Triples that have a double data type in them. Here is the code for reproduction of the issue with Jena 3.17 (it can be reproduced on other versions too).
Dataset dataset = TDB2Factory.createDataset();
Model tempModel = ModelFactory.createDefaultModel();
StringReader reader = new StringReader("@prefix : <http://ex#> . "
+ "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> . "
+ ":rk :val \"1.0\"^^xsd:double . "
+ "<<:rk :val \"1.0\"^^xsd:double>> :p_key 1 .");
RDFDataMgr.read(tempModel, reader, null, Lang.TURTLE);
dataset.begin(TxnType.WRITE);
Graph repositoryGraph = dataset.getNamedModel("RAW_MODEL").getGraph();
StmtIterator it = tempModel.listStatements();
while(it.hasNext()) {
repositoryGraph.add(it.nextStatement().asTriple());
}
dataset.commit();
dataset.end();
Now during query time, I am using the following code.
dataset.begin(TxnType.READ);
Query query = QueryFactory.create("SELECT ?s ?o ?id WHERE {"
+ "<<?s <http://ex#val> ?o>> <http://ex#p_key> ?id"
+ "}");
try (QueryExecution exec = QueryExecutionFactory.create(query, dataset.getUnionModel())) {
ResultSet result = exec.execSelect();
while (result.hasNext()) {
System.out.println(result.next().toString());
}
}
dataset.end();
The above query fails to fetch any result. However, if I just replace xsd:double with xsd:float or xsd:decimal the results are fetched. Hence, I am looking for help to understand what is causing this issue with xsd:double?
Note: You might think that I am not using the most optimal way to make insertions. However, this was due to other requirements in the code and reproduction of issue is possible through this route.
It works in Jena 4.0.0.
In 3.17.0, SPARQL was more like the original RDF* in its use of indexing.
As a consequence, a non-canonical term such as "1.0"^^xsd:double causes a problem.
Try the lexical form "1.0e0"^^xsd:double, or upgrade to 4.x.x.
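As a rough illustration of the difference between a term and its value, here is a plain-Java sketch (no Jena). "1.0" and "1.0e0" are different lexical forms of the same double value, so anything that compares terms by their text will treat them as different. The class name is only for illustration, and this does not reproduce Jena's internal indexing.

```java
public class DoubleLexicalForms {
    // True if both strings denote the same double value.
    static boolean sameValue(String a, String b) {
        return Double.parseDouble(a) == Double.parseDouble(b);
    }

    // True only if the two strings are character-for-character identical.
    static boolean sameLexicalForm(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameValue("1.0", "1.0e0"));       // compares values
        System.out.println(sameLexicalForm("1.0", "1.0e0")); // compares text
    }
}
```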

How to set a property path in Jena's Sparql API?

I would like to avoid passing SPARQL queries around as Strings. Therefore I use Jena's API for creating my queries. Now I need a PropertyPath in my query, but I can't find any Java class supporting this. Can you give me a hint?
Here's some example code where I would like to insert this (Jena 3.0.1):
private Query buildQuery(final String propertyPath) {
ElementTriplesBlock triplesBlock = new ElementTriplesBlock();
triplesBlock.addTriple(
new Triple(NodeFactory.createURI(this.titleUri.toString()),
//How can I set a property path as predicate here?
NodeFactory.???,
NodeFactory.createVariable("o"))
);
final Query query = buildSelectQuery(triplesBlock);
return query;
}
private Query buildSelectQuery(final ElementTriplesBlock queryBlock) {
final Query query = new Query();
query.setQuerySelectType();
query.setQueryResultStar(true);
query.setDistinct(true);
query.setQueryPattern(queryBlock);
return query;
}
You can use PathFactory to create property paths
Consider the graph below:
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.com/> .
ex:Manager ex:homeOffice ex:HomeOffice .
ex:HomeOffice dc:title "Home Office Title" .
Suppose you want to create a pattern like:
?x ex:homeOffice/dc:title ?title
The code below achieves it:
//create the path
Path exhomeOffice = PathFactory.pathLink(NodeFactory.createURI("http://example.com/homeOffice"));
Path dcTitle = PathFactory.pathLink(NodeFactory.createURI("http://purl.org/dc/elements/1.1/title"));
Path fullPath = PathFactory.pathSeq(exhomeOffice,dcTitle);
TriplePath t = new TriplePath(Var.alloc("x"),fullPath,Var.alloc("title"));
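To get from the TriplePath to a Query, one option is to use an ElementPathBlock in place of the ElementTriplesBlock from the question, since it accepts TriplePath entries. The following is a minimal sketch under that assumption, written against the org.apache.jena packages of Jena 3.x; the class name PropertyPathQuery is made up, and I have wrapped the block in an ElementGroup to be on the safe side when the query is serialized.

```java
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.query.Query;
import org.apache.jena.sparql.core.TriplePath;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.path.Path;
import org.apache.jena.sparql.path.PathFactory;
import org.apache.jena.sparql.syntax.ElementGroup;
import org.apache.jena.sparql.syntax.ElementPathBlock;

public class PropertyPathQuery {
    static Query buildQuery() {
        // Build the path ex:homeOffice/dc:title
        Path homeOffice = PathFactory.pathLink(
                NodeFactory.createURI("http://example.com/homeOffice"));
        Path title = PathFactory.pathLink(
                NodeFactory.createURI("http://purl.org/dc/elements/1.1/title"));
        Path fullPath = PathFactory.pathSeq(homeOffice, title);
        TriplePath triplePath =
                new TriplePath(Var.alloc("x"), fullPath, Var.alloc("title"));

        // ElementPathBlock holds triple paths; wrap it in a group for the WHERE clause.
        ElementPathBlock block = new ElementPathBlock();
        block.addTriplePath(triplePath);
        ElementGroup where = new ElementGroup();
        where.addElement(block);

        // Same query set-up as buildSelectQuery in the question.
        Query query = new Query();
        query.setQuerySelectType();
        query.setQueryResultStar(true);
        query.setDistinct(true);
        query.setQueryPattern(where);
        return query;
    }

    public static void main(String[] args) {
        // Prints the generated SPARQL text so the path can be checked by eye.
        System.out.println(buildQuery());
    }
}
```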

How do I build a SPARQL list input using jena querybuilder?

I have a bunch of code that uses the Apache Jena querybuilder API (SelectBuilder class). I am trying to add a term like this to my existing SPARQL query:
(?a ?b ?c) :hasMagicProperty ?this .
I have verified that this query works in TopBraid, but I can't figure out how to represent (?a, ?b, ?c) in the Jena API. What do I have to do to convert this list of Vars into a valid Jena resource node?
I am willing to explore alternate SPARQL-building frameworks, if they have robust support for typed literals, IRIs, and filters, as well as this list construct. I have skimmed over several other frameworks for building up SPARQL queries, but none of them seem to have a list construct.
Edit
My query building code (in Groovy) looks something like this:
def selectBuilder = new SelectBuilder()
selectBuilder.addPrefixes(...)
def thisVar = Var.alloc('this')
selectBuilder.addOptional(thisVar, 'rdf:type', ':MyEntity')
def aVar = Var.alloc('a')
def bVar = Var.alloc('b')
def cVar = Var.alloc('c')
List<Var> abc = [aVar, bVar, cVar]
//this doesn't work!!!
selectBuilder.addWhere(abc, ':hasMagicProperty', thisVar)
selectBuilder.addWhere(aVar, ':hasACode', 'code A')
selectBuilder.addWhere(bVar, ':hasBCode', 'code B')
selectBuilder.addWhere(cVar, ':hasCCode', 'code C')
def sparqlQuery = selectBuilder.buildString()
I have spent a couple of hours trying to work with the RDFList class, and I haven't figured it out. I'll keep trying, and see if I can grok it. In the meantime, any help would be appreciated. :)
Edit
Here is an unsuccessful attempt to use RDFList:
//this code does not work!
def varNode = NodeFactory.createVariable('a')
def model = ModelFactory.createDefaultModel()
def rdfNode = model.asRDFNode(varNode)
def rdfList = new RDFListImpl(model.createResource().asNode(), model)
//this line throws an exception!!
rdfList.add(rdfNode)
selectBuilder.addWhere(rdfList, ':hasMagicProperty', thisVar)
//com.hp.hpl.jena.shared.PropertyNotFoundException: http://www.w3.org/1999/02/22-rdf-syntax-ns#rest
The following method is a workaround, using multiple triples to recursively build up the RDF list:
/*
* Jena querybuilder does not yet support RDF lists. See:
* http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#collections
*/
private Node buildRdfCollection(SelectBuilder queryBuilder, List<?> itemList) {
if (itemList.isEmpty()) {
return RDF.nil.asNode()
}
def head = itemList.first()
def rest = buildRdfCollection(queryBuilder, itemList.subList(1, itemList.size()))
def listNode = NodeFactory.createAnon()
queryBuilder.addWhere(listNode, RDF.first, head)
queryBuilder.addWhere(listNode, RDF.rest, rest)
return listNode
}
...
def listNode = buildRdfCollection(queryBuilder, abc)
queryBuilder.addWhere(listNode, ':hasMagicProperty', thisVar)
The generated SPARQL code looks like this:
_:b0 rdf:first ?c ;
rdf:rest rdf:nil .
_:b1 rdf:first ?b ;
rdf:rest _:b0 .
_:b2 rdf:first ?a ;
rdf:rest _:b1 ;
:hasMagicProperty ?this .
This is a long-winded equivalent to:
(?a ?b ?c) :hasMagicProperty ?this .
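For readers who want to see the expansion itself without Jena, here is a plain-Java sketch that turns a list of terms into the rdf:first / rdf:rest patterns shown above. It only produces strings; the class name, the method name and the _:bN labelling scheme are assumptions chosen to mirror the generated SPARQL, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

public class RdfCollectionExpansion {
    // Expands (item1 item2 ...) into rdf:first/rdf:rest patterns.
    // The subject of the last returned pattern is the head of the list.
    static List<String> expand(List<String> items) {
        List<String> patterns = new ArrayList<>();
        String rest = "rdf:nil";
        // Walk from the last item to the first, so each cell can point at the cell after it.
        for (int i = items.size() - 1; i >= 0; i--) {
            String cell = "_:b" + (items.size() - 1 - i);
            patterns.add(cell + " rdf:first " + items.get(i)
                    + " ; rdf:rest " + rest + " .");
            rest = cell;
        }
        return patterns;
    }

    public static void main(String[] args) {
        for (String pattern : expand(List.of("?a", "?b", "?c"))) {
            System.out.println(pattern);
        }
    }
}
```

An empty list yields no patterns at all, which corresponds to using rdf:nil directly as the list node.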
I wrote the QueryBuilder and I don't think that, in its current state, it will do what you want. QueryBuilder is based on (but does not yet fully implement) the W3C SPARQL 1.1 recommendation:
http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#rQuery
However, I think you can create your query using the Jena QueryFactory
String queryString = "SELECT * WHERE { "+
" OPTIONAL { ?this a :MyEntity } ."+
" (?a ?b ?c) :hasMagicProperty ?result . "+
" ?a :hasACode 'code A' . "+
" ?b :hasBCode 'code B' . "+
" ?c :hasCCode 'code C' ."+
" }";
Query query = QueryFactory.create(queryString);
Unfortunately, I don't think this is what you really want. Notice that ?this is not bound to any of the other statements and so will produce a cross product of all :MyEntity type subjects with the ?a, ?b, ?c and ?result bindings.
If you can create the query with QueryFactory, I can ensure that QueryBuilder will support it.
UPDATE
I have updated QueryBuilder (the next Snapshot should contain the changes). You should now be able to do the following:
Var aVar = Var.alloc('a')
Var bVar = Var.alloc('b')
Var cVar = Var.alloc('c')
selectBuilder.addWhere(selectBuilder.list(aVar, bVar, cVar), ':hasMagicProperty', thisVar)
selectBuilder.addWhere(aVar, ':hasACode', 'code A')
selectBuilder.addWhere(bVar, ':hasBCode', 'code B')
selectBuilder.addWhere(cVar, ':hasCCode', 'code C')
You can also simply pass the standard text versions of the values as the list parameters, like:
selectBuilder.list( "<a>", "?b", "'c'" )

NoSuchMethod when trying to create a SPARQL query with jena

I am trying to make some SPARQL queries using vc-db-1.rdf and q1.rq from ARQ examples. Here is my java code:
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.query.* ;
import com.hp.hpl.jena.query.ARQ;
import com.hp.hpl.jena.iri.*;
import java.io.*;
public class querier extends Object
{
static final String inputFileName = "vc-db-1.rdf";
public static void main (String args[])
{
// Create an empty in-memory model
Model model = ModelFactory.createDefaultModel();
// use the FileManager to open the bloggers RDF graph from the filesystem
InputStream in = FileManager.get().open(inputFileName);
if (in == null)
{
throw new IllegalArgumentException( "File: " + inputFileName + " not found");
}
// read the RDF/XML file
model.read( in, "");
// Create a new query
String queryString = "PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#> SELECT ?y ?givenName WHERE { ?y vcard:Family \"Smith\" . ?y vcard:Given ?givenName . }";
QueryFactory.create(queryString);
}
}
Compilation passes just fine.
The problem is that the query is never even executed: I get an error while creating it, at the line
QueryFactory.create(queryString);
with the following explanation:
C:\Wallet\projects\java\ARQ_queries>java querier
Exception in thread "main" java.lang.NoSuchMethodError: com.hp.hpl.jena.iri.IRI.resolve(Ljava/lang/String;)Lcom/hp/hpl/jena/iri/IRI;
at com.hp.hpl.jena.n3.IRIResolver.resolveGlobal(IRIResolver.java:191)
at com.hp.hpl.jena.sparql.mgt.SystemInfo.createIRI(SystemInfo.java:31)
at com.hp.hpl.jena.sparql.mgt.SystemInfo.<init>(SystemInfo.java:23)
at com.hp.hpl.jena.query.ARQ.init(ARQ.java:373)
at com.hp.hpl.jena.query.ARQ.<clinit>(ARQ.java:385)
at com.hp.hpl.jena.query.Query.<clinit>(Query.java:53)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:68)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:40)
at com.hp.hpl.jena.query.QueryFactory.create(QueryFactory.java:28)
at querier.main(querier.java:24)
How can I solve this? Thank you.
It looks like you're missing the IRI library on the classpath (the IRI library is separate from the main Jena JAR). Jena has runtime dependencies on several other libraries which are included in the lib directory of the Jena distribution. All of these need to be on your classpath at runtime (but not necessarily at compile time).
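If you are not sure what is actually being picked up at runtime, a quick check is to ask the JVM where it loads a class from. The sketch below uses only the JDK; the class name WhichJar and its messages are made up for illustration. Keep in mind that a NoSuchMethodError usually means the class was found but in a version that lacks the method, so an old or mismatched IRI jar on the classpath is a possibility too.

```java
import java.security.CodeSource;

public class WhichJar {
    // Reports where a class is loaded from, without running its static initialisers.
    static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "loaded by the JVM itself (no jar)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on the classpath";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in the question.
        System.out.println(locate("com.hp.hpl.jena.iri.IRI"));
        System.out.println(locate("com.hp.hpl.jena.query.QueryFactory"));
    }
}
```

Run it with the same classpath you use for querier; if the IRI class resolves to an unexpected jar, or to none, that is the one to fix.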

How to process the rdf version of a DBpedia page with Jena?

In all dbpedia pages, e.g.
http://dbpedia.org/page/Ireland
there's a link to a RDF file.
In my application I need to analyse the rdf code and run some logic on it.
I could rely on the dbpedia SPARQL endpoint, but I prefer to download the rdf code locally and parse it, to have full control over it.
I installed JENA and I'm trying to parse the code and extract for example a property called: "geo:geometry".
I'm trying with:
StringReader sr = new StringReader( node.rdfCode )
Model model = ModelFactory.createDefaultModel()
model.read( sr, null )
How can I query the model to get the info I need?
For example, if I wanted to get the statement:
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<geo:geometry xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" rdf:datatype="http://www.openlinksw.com/schemas/virtrdf#Geometry">POINT(-7 53)</geo:geometry>
</rdf:Description>
Or
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<dbpprop:countryLargestCity xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Dublin</dbpprop:countryLargestCity>
</rdf:Description>
What is the right filter?
Many thanks!
Mulone
Once you have the file parsed in a Jena model you can iterate and filter with something like:
//Property to filter the model
Property geoProperty =
model. createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
"geometry");
//Iterator based on a Simple selector
StmtIterator iter =
model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode)null));
//Loop to traverse the statements that match the SimpleSelector
while (iter.hasNext()) {
Statement stmt = iter.nextStatement();
System.out.print(stmt.getSubject().toString());
System.out.print(stmt.getPredicate().toString());
System.out.println(stmt.getObject().toString());
}
The SimpleSelector allows you to pass any (subject,predicate,object) pattern to match statements in the model. In your case if you only care about a specific predicate then first and third parameters of the constructor are null.
Filtering on two different properties
For more complex filtering, you can override the selects method of
SimpleSelector, like this:
final Property geoProperty = /* like before */;
final Property countryLargestCityProperty =
model.createProperty("http://dbpedia.org/property/",
"countryLargestCity");
SimpleSelector selector = new SimpleSelector(null, null, (RDFNode) null) {
@Override
public boolean selects(Statement s) {
return s.getPredicate().equals(geoProperty) ||
s.getPredicate().equals(countryLargestCityProperty);
}
};
StmtIterator iter = model.listStatements(selector);
while (iter.hasNext()) {
/* same as in the previous example */
}
Edit: including a full example
This code includes a full example that works for me.
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.SimpleSelector;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.rdf.model.Statement;
public class TestJena {
public static void main(String[] args) {
FileManager fManager = FileManager.get();
fManager.addLocatorURL();
Model model = fManager.loadModel("http://dbpedia.org/data/Ireland.rdf");
Property geoProperty =
model. createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
"geometry");
StmtIterator iter =
model.listStatements(new SimpleSelector(null, geoProperty,(RDFNode) null));
//Loop to traverse the statements that match the SimpleSelector
while (iter.hasNext()) {
Statement stmt = iter.nextStatement();
if (stmt.getObject().isLiteral()) {
Literal obj = (Literal) stmt.getObject();
System.out.println("The geometry predicate value is " +
obj.getString());
}
}
}
}
This full example prints out:
The geometry predicate value is POINT(-7 53)
Notes on Linked Data
http://dbpedia.org/page/Ireland is the HTML document version of the resource http://dbpedia.org/resource/Ireland
To get the RDF, you should request either:
http://dbpedia.org/data/Ireland.rdf
or
http://dbpedia.org/resource/Ireland with Accept: application/rdf+xml in the HTTP header.
With curl it'd be something like:
curl -L -H 'Accept: application/rdf+xml' http://dbpedia.org/resource/Ireland
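The same request can be made from Java. This is a sketch using the java.net.http client that ships with Java 11 and later; the class name FetchRdf and the helper buildRequest are made up for illustration, and I have not checked which redirects DBpedia currently sends.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FetchRdf {
    // Builds a GET request that asks for RDF/XML through content negotiation.
    static HttpRequest buildRequest(String resourceUrl) {
        return HttpRequest.newBuilder(URI.create(resourceUrl))
                .header("Accept", "application/rdf+xml")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Follow redirects, like curl -L does.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response = client.send(
                buildRequest("http://dbpedia.org/resource/Ireland"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

The response body can then be handed to model.read through a StringReader, as in the question.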
