How to select subjects with specific properties from RDF with Jena?

This is a follow-up to my last question, since I am still struggling with this topic. I need to select some subjects from my model that meet specific requirements.
If I list my statements (this is only short part of the output), I get something like this:
WorkOrder2 hasType Workorder .
WorkOrder2 hasResult Fuselage22 .
WorkOrder2 type NamedIndividual .
Now, I would like to select and iterate through all subjects that have hasType Workorder. My idea was something like this:
public static ArrayList<String> listAllWorkorders(Model model) {
ArrayList<String> workorders = new ArrayList<String>();
// list of all work orders associated with given fuselage and work
// station
ResIterator it = model.listSubjectsWithProperty(
ResourceFactory.createProperty(ArumCorePrefix + "hasType"), ArumCorePrefix + "Workorder");
while (it.hasNext()) {
Resource r = it.next();
String workorder = trimPrefix(r.toString());
workorders.add(workorder);
}
// sort the result alphabetically
Collections.sort(workorders);
return workorders;
}
However, it does not return anything. If I use listSubjectsWithProperty without the second argument (the String), it works, but it returns not only Workorders but also other subjects with a hasType property, which I do not want. What is wrong with my code? Can I use something like this and make it work?
Don't worry about the static use of this function (I will clean up this inelegant approach as soon as I understand what's wrong).
Also, I would like to implement some more complex filtering, for example selecting subjects with multiple properties that all have to match in order to be returned, like hasType Workorder, hasResult someResult, inStation station, etc. Does Jena support something like this? If not, what is the common approach?
Thanks for any tips!
And a follow-up: how do I check whether some statement is present in my model? I know that there is a model.contains(Statement s) method, but do I have to create the statement in order to call this method? Isn't there some more elegant way, like model.contains(Resource r, Property p, Resource o)?

There are a number of ways you can do this in Jena, but they mostly come down to calling
StmtIterator Model.listStatements(Resource,Property,RDFNode)
with the resource as the first argument, and null as the second and third arguments, as a wildcard. The other methods that do similar things are really just special cases of this. For instance,
listObjectsOfProperty(Property p) — listStatements(null,p,null) and take the object from each statement.
listObjectsOfProperty(Resource s, Property p) — listStatements(s,p,null) and take the object from each statement.
listResourcesWithProperty(Property p) — listStatements(null,p,null) and take the subject from each statement
listResourcesWithProperty(Property p, RDFNode o) — listStatements(null,p,o) and take the subject from each statement
For convenience, you might prefer to use the method
StmtIterator Resource.listProperties()
which returns an iterator over all the statements in the resource's model with the given resource as a subject.
Here's some example code that includes your model and uses each of these methods:
import java.io.ByteArrayInputStream;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.StmtIterator;
public class ResourcePropertiesExample {
final static String NS = "http://example.org/";
final static String modelText = "" +
"@prefix : <"+NS+"> .\n" +
":WorkOrder2 :hasType :Workorder .\n" +
":WorkOrder2 :hasResult :Fuselage22 .\n" +
":WorkOrder2 :type :NamedIndividual .\n" +
"";
public static void main(String[] args) {
final Model model = ModelFactory.createDefaultModel();
model.read( new ByteArrayInputStream( modelText.getBytes()), null, "TTL" );
final Resource workOrder2 = model.getResource( NS+"WorkOrder2" );
System.out.println( "Using Model.listStatements()" );
StmtIterator stmts = model.listStatements( workOrder2, null, (RDFNode) null );
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
System.out.println( "Using Resource.listProperties()" );
stmts = workOrder2.listProperties();
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
}
}
The output is:
Using Model.listStatements()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
Using Resource.listProperties()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
As for checking whether a model contains certain statements, you can, as you noted, use Model.contains, and I don't think there's anything the matter with that. You can also use the various Resource has* methods, such as
hasLiteral( Property p, [various types of literal] )
hasProperty( Property p, RDFNode / String / String, String )
Continuing the example above, and assuming you'd defined the property hasResult and the resources fuselage21 and fuselage22, you could write:
workOrder2.hasProperty( hasResult, fuselage21 ); // false
workOrder2.hasProperty( hasResult, fuselage22 ); // true
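As for why the code in the question returns nothing: as far as I can tell, the String overload of listSubjectsWithProperty looks for a plain literal with that text, while the object in the data is a resource. Below is a minimal sketch of passing a Resource instead and then narrowing by a second property with hasProperty. The namespace, the class name WorkorderFilterSketch, and the sample data are assumptions made up for illustration, and the imports assume Jena 3 or later (with older releases the packages start with com.hp.hpl.jena).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.ResIterator;
import org.apache.jena.rdf.model.Resource;

public class WorkorderFilterSketch {
    // Hypothetical namespace standing in for the question's ArumCorePrefix.
    static final String NS = "http://example.org/";

    /** Local names of subjects that are (hasType Workorder) AND (hasResult result), sorted. */
    static List<String> workordersWithResult(Model model, Resource result) {
        Property hasType = model.createProperty(NS, "hasType");
        Property hasResult = model.createProperty(NS, "hasResult");
        // The object is a Resource, not a String, so it matches resource-valued statements.
        Resource workorder = model.createResource(NS + "Workorder");

        List<String> names = new ArrayList<>();
        ResIterator it = model.listSubjectsWithProperty(hasType, workorder);
        while (it.hasNext()) {
            Resource r = it.next();
            // Second condition: keep only subjects that also have the wanted result.
            if (r.hasProperty(hasResult, result)) {
                names.add(r.getLocalName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Builds made-up sample data and runs the filter for Fuselage22. */
    static List<String> demo() {
        Model model = ModelFactory.createDefaultModel();
        Property hasType = model.createProperty(NS, "hasType");
        Property hasResult = model.createProperty(NS, "hasResult");
        Resource workorder = model.createResource(NS + "Workorder");
        Resource fuselage21 = model.createResource(NS + "Fuselage21");
        Resource fuselage22 = model.createResource(NS + "Fuselage22");

        model.add(model.createResource(NS + "WorkOrder1"), hasType, workorder);
        model.add(model.createResource(NS + "WorkOrder1"), hasResult, fuselage21);
        model.add(model.createResource(NS + "WorkOrder2"), hasType, workorder);
        model.add(model.createResource(NS + "WorkOrder2"), hasResult, fuselage22);
        model.add(model.createResource(NS + "Station1"), hasType, model.createResource(NS + "Station"));

        return workordersWithResult(model, fuselage22);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

For a single existence check, model.contains(Resource, Property, RDFNode) takes the three parts directly, so there is no need to build a Statement first. Once the conditions get more involved than two or three properties, a SPARQL query is probably the more common approach.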

Related

Giving array as parameter to a Jena built-in

I need to create a new built-in for Jena. With this one I would like to be able to extract the minimum date from where it is.
I am just wondering whether it is possible to pass a whole class of data to a built-in instead of just one parameter.
Here is the bodyCall of my function :
@Override
public boolean bodyCall(Node[] args, int length, RuleContext context) {
System.out.println("Entra");
checkArgs(length, context);
BindingEnvironment env = context.getEnv();
Node n1 = getArg(0, args, context);
Node n2 = getArg(1, args, context);
//int count = 0;
//do{
//System.out.println("RULE"+context.getEnv().getGroundVersion(n2).getLiteralLexicalForm()); count ++;}while(count <2);
System.out.println("Date 1: " + n1 + " and Date 2: " + n2);
if (n1.isLiteral() && n2.isLiteral()) {
Object v1 = n1.getLiteralValue();
Object v2 = n2.getLiteralValue();
Node max = null;
if (v1 instanceof XSDDateTime && v2 instanceof XSDDateTime) {
XSDDateTime nv1 = (XSDDateTime) v1;
XSDDateTime nv2 = (XSDDateTime) v2;
Calendar data1 = new GregorianCalendar (nv1.getYears(), nv1.getMonths(), nv1.getDays());
Calendar data2 = new GregorianCalendar (nv2.getYears(), nv2.getMonths(), nv2.getDays());
SimpleDateFormat df = new SimpleDateFormat();
df.applyPattern("yyyy-dd-MM");
if (data1.compareTo(data2) > 0)
{
System.out.println("the later date is DATE1: " +df.format(data1.getTime()));
max = args[0];
}
else
{
max = args[1];
System.out.println("the later date is DATE2: " +df.format(data2.getTime()));
}
return env.bind(args[2], max);
}
}
// Doesn't (yet) handle partially bound cases
return false;
}
});
This is my simple rule:
@prefix ex: <http://www.semanticweb.org/prova_rules_M#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
[maxDate:
(?p rdf:type ex:Persona)
(?p http://www.semanticweb.org/prova_rules_M/persona#data_nascita ?c)
(?p http://www.semanticweb.org/prova_rules_M/persona#data_nascita ?d)
maxDate(?c,?d,?x)
-> print(?x)
]
I give the built-in three parameters: two for input and one for output.
My idea is to use two variables, ?c and ?d, each holding a birth date. I would like to get the first record in ?c and the next record in ?d. But it looks like Jena takes the first record each time.
Is it possible, from Java, to say that I want the second record and to scroll through the results?
For example, my ontology contains two dates:
1)1992-04-13T00:00:00.0;
2)1988-04-25T00:00:00.0
I want to have 1) in ?c and 2) in ?d, and then write an algorithm to get the minimum of the two.
PS: The bodyCall above is my attempt to get the maximum of two dates that I pass to the rule. It works fine for this purpose.
Thank you all.

When you implement bodyCall(Node[], int, RuleContext) or headAction(Node[], int, RuleContext) as part of implementing a Builtin, you are given an array that represents the arguments to the builtin. In a rule, you can hand any number of variables to the builtin (not only one).
It seems (and you can correct me if I am misinterpreting your question) that you are looking to work over some class expression in order to get the data that you need. If your overall goal is to operate on 'a class of data', then there are multiple ways to achieve this.
(easiest) Formulate your class expression as statements within the body of the rule. This will ensure that your builtin is passed only individuals of the appropriate class. Chaining together multiple preconditions can allow you to only operate on certain individuals (a 'class of data').
(potentially nontrivial) If you intend to have your builtin operate on a class, use the RuleContext passed to your bodyCall(...) or headAction(...) in order to find individuals that satisfy your class expression (by calling RuleContext#find(...) or some other method).
As an example, let's say that we wanted to act on each member of the class urn:ex:Question. In the first solution, we'd formulate a rule similar to the following:
[eachIndividual: (?x rdf:type urn:ex:Question) -> builtin(?x)]
This would ensure that we'd operate on every single instance of urn:ex:Question. An example of the second solution would be to pass the class expression to your builtin directly. Your question does not indicate how you would identify the class in question, so I will arbitrarily assume that you are interested in classes which are rdfs:subClassOf urn:ex:Question.
[eachSubclass: (?x rdfs:subClassOf urn:ex:Question) -> builtin(?x)]
In this case, you would need to somehow operate on your 'class of data' within your builtin. As mentioned previously, you could potentially use the RuleContext to do so.
EDIT
Let us assume that you have 40 individuals of type urn:ex:Question, and each individual has a property urn:ex:dateSubmitted that indicates when it was submitted. Finding the earliest one can be solved rather trivially using a SPARQL query:
SELECT ?post WHERE {
?post a <urn:ex:Question> .
?post <urn:ex:dateSubmitted> ?date .
}
ORDER BY ?date
LIMIT 1
Edit 2
Based on the new information in your update, you can probably just modify your body call to look like the following:
@Override
public boolean bodyCall( final Node[] args, final int length, final RuleContext context )
{
checkArgs(length, context);
final Node n1 = getArg(0, args, context);
final Node n2 = getArg(1, args, context);
if (n1.isLiteral() && n2.isLiteral()) {
final Node max = Util.compareTypedLiterals(n1, n2) < 0 ? n2 : n1;
return context.getEnv().bind(args[2], max);
}
return false;
}
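Since the stated goal was the minimum rather than the maximum, here is a small sketch in plain Java, with no Jena involved, of picking the earliest value from any number of date-times given in the lexical form shown in the question. The class name MinDateSketch and the method earliest are made up for illustration; inside a builtin the same comparison would be applied to the literal values instead.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;

public class MinDateSketch {
    /** Returns the earliest date-time among the given ISO-8601 lexical forms. */
    static LocalDateTime earliest(List<String> lexicalForms) {
        if (lexicalForms.isEmpty()) {
            throw new IllegalArgumentException("need at least one date");
        }
        LocalDateTime min = null;
        for (String s : lexicalForms) {
            // Parses forms such as 1992-04-13T00:00:00.0
            LocalDateTime d = LocalDateTime.parse(s);
            if (min == null || d.isBefore(min)) {
                min = d;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        // The two dates from the question.
        List<String> dates = Arrays.asList("1992-04-13T00:00:00.0", "1988-04-25T00:00:00.0");
        System.out.println("Earliest: " + earliest(dates));
    }
}
```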

How to convert Hibernate Result Object to String?

I implemented a hibernate query and would like to assign the result to one of my class variables.
The problem is that the results of Hibernate queries seem to be objects or something, as the printed output of a result looks very strange:
[exercise.java.basics.storage.WarehouseProduct@77f6d2e3]
This is the method executing the query:
public void updateStock() {
Session session = getSessionFactory().getCurrentSession();
Criteria criteriaNail = session.createCriteria( WarehouseProduct.class );
criteriaNail.add( Restrictions.like( "productName", String.valueOf( Product.NAIL ) ) );
List nailCountResult = criteriaNail.list();
System.out.println( nailCountResult.toString() );
}
The database table has only 2 columns, and the value I need is in the second.
What I would like to do is something like this:
this.nailCount = nailCountResult.[XYZ --> Get the value from the second column];
Is something like this possible? How can I cast these result objects to something readable?
best regards
daZza

First of all, I suggest changing the line to
List<WarehouseProduct> nailCountResult = criteriaNail.list();
Now it is clear that this is not a ResultSet; it's a list of WarehouseProduct objects.
You can access each object by index.
You can also loop over the result list and print the values, like this:
for( WarehouseProduct wp : nailCountResult ) {
System.out.println( wp.nailCount);
}
As a side note, you are breaking encapsulation here. Please look into it.

All you need to do is this:
String value=nailCountResult.get(0).getXXXX();
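The odd output in the question is just the default Object.toString() (class name, then @, then a hash code). Here is a rough, Hibernate-free sketch of reading the value through a getter instead. The WarehouseProduct class below is a hypothetical stand-in for the mapped entity, and the field names productName and count are assumptions, since the real mapping is not shown in the question.

```java
import java.util.Arrays;
import java.util.List;

public class WarehouseProductSketch {
    // Hypothetical stand-in for the mapped entity; the real field names may differ.
    static class WarehouseProduct {
        private final String productName;
        private final int count;

        WarehouseProduct(String productName, int count) {
            this.productName = productName;
            this.count = count;
        }

        String getProductName() { return productName; }
        int getCount() { return count; }

        // Overriding toString() makes printed results readable.
        @Override
        public String toString() {
            return productName + "=" + count;
        }
    }

    /** Reads the second-column value from the first row of a query result. */
    static int firstCount(List<WarehouseProduct> result) {
        return result.get(0).getCount();
    }

    public static void main(String[] args) {
        // Stands in for what criteriaNail.list() would return.
        List<WarehouseProduct> result = Arrays.asList(new WarehouseProduct("NAIL", 42));
        System.out.println(result);
        System.out.println(firstCount(result));
    }
}
```

In real code it is worth checking that the list is not empty before calling get(0).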

Java Properties - int becomes null

Whilst I've seen similar looking questions asked before, the accepted answers have seemingly provided an answer to a different question (IMO).
I have just joined a company and before I make any changes/fixes, I want to ensure that all the tests pass. I've fixed all but one, which I've discovered is due to some (to me) unexpected behavior in Java.
If I insert a key/value pair into a Properties object where the value is an int, I expected autoboxing to come into play and getProperty to return a string. However, that's not what's occurring (JDK 1.6): I get a null. I have written a test class below:
import java.util.*;
public class hacking
{
public static void main(String[] args)
{
Properties p = new Properties();
p.put("key 1", 1);
p.put("key 2", "1");
String s;
s = p.getProperty("key 1");
System.err.println("First key: " + s);
s = p.getProperty("key 2");
System.err.println("Second key: " + s);
}
}
And the output of this is:
C:\Development\hacking>java hacking
First key: null
Second key: 1
Looking in the Properties source code, I see this:
public String getProperty(String key) {
Object oval = super.get(key);
String sval = (oval instanceof String) ? (String)oval : null;
return ((sval == null) && (defaults != null)) ? defaults.getProperty(key) : sval;
}
The offending line is the second line - if it's not a String, it uses null.
I can't see any reason why this behavior would be desired/expected. The code was almost certainly written by someone more capable than I am, so I assume there is a good reason for it. Could anyone explain? If I've done something dumb, save time and just tell me that! :-)
Many thanks

This is from the docs:
"Because Properties inherits from Hashtable, the put and putAll methods can be applied to a Properties object. Their use is strongly discouraged as they allow the caller to insert entries whose keys or values are not Strings. The setProperty method should be used instead. If the store or save method is called on a "compromised" Properties object that contains a non-String key or value, the call will fail. Similarly, the call to the propertyNames or list method will fail if it is called on a "compromised" Properties object that contains a non-String key."

I modified your code to use the setProperty method as per the docs, and it produces a compilation error, because setProperty only accepts String arguments:
package com.stackoverflow.framework;
import java.util.*;
public class hacking
{
public static void main(String[] args)
{
Properties p = new Properties();
p.setProperty("key 1", 1);
p.setProperty("key 2", "1");
String s;
s = p.getProperty("key 1");
System.err.println("First key: " + s);
s = p.getProperty("key 2");
System.err.println("Second key: " + s);
}
}
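One way to make this compile, sketched below, is to convert the int explicitly with String.valueOf before storing it. The class name PropertiesSketch and the extra "raw" key are made up for illustration; the put call is only there to show the null result from the question again.

```java
import java.util.Properties;

public class PropertiesSketch {
    static Properties build() {
        Properties p = new Properties();
        // setProperty only takes Strings, so convert the int explicitly.
        p.setProperty("key 1", String.valueOf(1));
        p.setProperty("key 2", "1");
        // put() compiles with an int (it is boxed to Integer), but getProperty
        // only returns values that are Strings, so this key reads back as null.
        p.put("raw", 1);
        return p;
    }

    public static void main(String[] args) {
        Properties p = build();
        System.err.println("First key: " + p.getProperty("key 1"));
        System.err.println("Second key: " + p.getProperty("key 2"));
        System.err.println("Raw key: " + p.getProperty("raw"));
    }
}
```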

Custom functions in SPARQL with the Jena API

First-time poster here. I was hoping someone could help me make custom SPARQL functions for use within the Jena (ARQ) API. I need SPARQL to do some aggregation, and I know that it already implements avg, count, min, max, and sum, but I need to be able to do standard deviation and median as well (I also need range, but that can be done using just min and max).
I was hoping the query could be similar to what you use for the already implemented functions:
PREFIX example: <http://www.examples.com/functions#>
PREFIX core: <http://www.core.com/values#>
SELECT (stddev(?price) as ?stddev)
WHERE {
?s core:hasPrice ?price
}
I don't know if that is possible or not, but if I need to use it like other custom functions that would be fine too, as long as it still gets the standard deviation of the results.
All I know is that the functions would be written in Java, which I already know pretty well. So, I was wondering if anyone knew of a good way to go about this or where to start looking for some guidance. I've tried looking for documentation on it, but there doesn't seem to be anything. Any help would be greatly appreciated.
Thanks in advance.

I am not sure you can do what you want without actually changing the grammar.
SUM(...) for example is a keyword defined by the SPARQL grammar:
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/Grammar/sparql_11.jj
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/src/main/java/com/hp/hpl/jena/sparql/expr/aggregate/AggSum.java
A filter function or a property function is probably not what you are looking for.
By the way, you do not get STDDEV in SQL either. Is it because two passes over the data are necessary?

Aggregate functions are a special case of SPARQL (and hence of ARQ) functions.
I think that in ARQ it is not easy to extend the set of aggregate functions, while it is easy (and documented) to extend the set of filter functions and property functions.
You could still calculate the standard deviation with something like this:
PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>
PREFIX core: <http://www.core.com/values#>
SELECT ( afn:sqrt( sum( (?price - ?avg) * (?price - ?avg) ) / (?count - 1) ) as ?stddev )
WHERE {
?s core:hasPrice ?price .
{
SELECT (avg(?price) as ?avg) (count(*) as ?count)
WHERE {
?s core:hasPrice ?price
}
}
}
Note that I'm forced to use afn:sqrt, which is an ARQ "proprietary" function that is not in the SPARQL 1.1 draft, so this query wouldn't work on frameworks other than Jena.
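If doing the arithmetic on the Java side is an option, the same formula (square root of the summed squared deviations divided by count minus one) can be sketched in a few lines. This is plain Java applied to values you have already pulled out of the result set, not an ARQ extension; the class name SampleStdDevSketch and the sample prices are made up for illustration.

```java
public class SampleStdDevSketch {
    /** Sample standard deviation: sqrt( sum((x - avg)^2) / (n - 1) ). */
    static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        // First pass: the average.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double avg = sum / values.length;
        // Second pass: the summed squared deviations from the average.
        double squares = 0.0;
        for (double v : values) {
            squares += (v - avg) * (v - avg);
        }
        return Math.sqrt(squares / (values.length - 1));
    }

    public static void main(String[] args) {
        // Made-up prices, standing in for the ?price bindings of the query.
        double[] prices = {13, 15, 20, 30, 32, 11, 16};
        System.out.println(sampleStdDev(prices));
    }
}
```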

Yes, ARQ is extensible in a variety of ways. The ARQ extensions page would be the best place to start.

ARQ allows you to add your own aggregate functions by registering them in the AggregateRegistry. The example code shows how this is done. This can be used to add the custom standard deviation aggregate function requested in the question. In the example below, Commons Math is used to do the calculation.
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;
import org.apache.jena.graph.Graph;
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.sparql.engine.binding.Binding;
import org.apache.jena.sparql.expr.ExprList;
import org.apache.jena.sparql.expr.NodeValue;
import org.apache.jena.sparql.expr.aggregate.Accumulator;
import org.apache.jena.sparql.expr.aggregate.AccumulatorFactory;
import org.apache.jena.sparql.expr.aggregate.AggCustom;
import org.apache.jena.sparql.expr.aggregate.AggregateRegistry;
import org.apache.jena.sparql.function.FunctionEnv;
import org.apache.jena.sparql.graph.NodeConst;
import org.apache.jena.sparql.sse.SSE;
public class StandardDeviationAggregate {
/**
* Custom aggregates use accumulators. One accumulator is created for each group in a query execution.
*/
public static AccumulatorFactory factory = (agg, distinct) -> new StatsAccumulator(agg);
private static class StatsAccumulator implements Accumulator {
private AggCustom agg;
private SummaryStatistics summaryStatistics = new SummaryStatistics();
StatsAccumulator(AggCustom agg) { this.agg = agg; }
@Override
public void accumulate(Binding binding, FunctionEnv functionEnv) {
// Add values to summaryStatistics
final ExprList exprList = agg.getExprList();
final NodeValue value = exprList.get(0).eval(binding, functionEnv) ;
summaryStatistics.addValue(value.getDouble());
}
@Override
public NodeValue getValue() {
// Get the standard deviation
return NodeValue.makeNodeDouble(summaryStatistics.getStandardDeviation());
}
}
public static void main(String[] args) {
// Register the aggregate function
AggregateRegistry.register("http://example/stddev", factory, NodeConst.nodeMinusOne);
// Add data
Graph g = SSE.parseGraph("(graph " +
"(:item1 :hasPrice 13) " +
"(:item2 :hasPrice 15) " +
"(:item3 :hasPrice 20) " +
"(:item4 :hasPrice 30) " +
"(:item5 :hasPrice 32) " +
"(:item6 :hasPrice 11) " +
"(:item7 :hasPrice 16))");
Model m = ModelFactory.createModelForGraph(g);
String qs = "PREFIX : <http://example/> " +
"SELECT (:stddev(?price) AS ?stddev) " +
"WHERE { ?item :hasPrice ?price }";
// Execute query and print results
Query q = QueryFactory.create(qs) ;
QueryExecution qexec = QueryExecutionFactory.create(q, m);
ResultSet rs = qexec.execSelect() ;
ResultSetFormatter.out(rs);
}
}
I hope this example helps someone at least, even though the question was posted a few years back.

Reformatting code with Regular Expressions

We have an ArrayList of items in several classes which gives me trouble every time I'd like to insert a new item into the list. It was a mistake on my part to have designed the classes the way I did, but changing the design now would be more headache than it's worth (bureaucratic waterfall model). I should have anticipated format changes to the documents the customer was supplying us, waterfall be damned.
I'd like to write a simple script in python which goes into a class, adds the item to the list, and then increments all retrievals for the following items. That doesn't sound very explanatory:
Foo extends Bar{
public Foo(){
m_Tags.add("Jane");
m_Tags.add("Bob");
m_Tags.add("Jim");
}
public String GetJane() { return m_ParsedValue.get( m_Tags.get(1) ); }
public String GetBob() { return m_ParsedValue.get( m_Tags.get(2) ); }
public String GetJim() { return m_ParsedValue.get( m_Tags.get(3) ); }
}
You see, if I want to add a value between "Jane" and "Bob", I then have to increment the integers in the Get* functions. I just want to write a simple script in Python that does the work for me. Someone I very much respect suggested regex.
Edit:
Yes, LinkedHashMap. So simple, so easy and so not in the design specs now. I hate waterfall. Hate it with a passion. This whole bit was a "small" and "easy" part that "shouldn't take much time to design." I made mistakes. It's stuck in stone now.

You want your regular expression to be as flexible as the compiler will be with respect to whitespace between tokens. Doing so and mimicking whitespace usage makes the pattern pretty messy. The code below (sorry: Perl, not Python) edits your source files in-place.
#! /usr/bin/perl -i.bak
use warnings;
use strict;
my $template =
'^( public
String
Get)(\w+)( \( \) { return
m_ParsedValue . get \( m_Tags . get \( )(\d+)( \) \) ; } )$';
$template =~ s/ +/\\s*/g;
$template =~ s/(\r?\n)+/\\s+/g;
my $getter = qr/$template/x;
die "Usage: $0 after new-name source ..\n" unless @ARGV >= 3;
my $after = shift;
my $add = shift;
my $index;
while (<>) {
unless (/$getter/) {
print;
next;
}
my($abc,$name,$lmno,$i,$xyz) = ($1,$2,$3,$4,$5);
if (defined $index) {
print join "" => $abc, $name, $lmno, ++$index, $xyz;
}
else {
if ($name eq $after) {
$index = $i;
print; print join "" => $abc, $add, $lmno, ++$index, $xyz;
}
else { print; }
}
}
For example,
$ ./add-after Jane Foo code.java
$ cat code.java
Foo extends Bar{
public Foo(){
m_Tags.add("Jane");
m_Tags.add("Bob");
m_Tags.add("Jim");
}
public String GetJane() { return m_ParsedValue.get( m_Tags.get(1) ); }
public String GetFoo() { return m_ParsedValue.get( m_Tags.get(2) ); }
public String GetBob() { return m_ParsedValue.get( m_Tags.get(3) ); }
public String GetJim() { return m_ParsedValue.get( m_Tags.get(4) ); }
}

Don't do this with regexp. Create symbolic constants (using for example an enum) that map the names to numbers.
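A rough sketch of what that could look like: the getters refer to an enum constant rather than a list position, so adding a new constant between two existing ones does not force any renumbering. The names TagEnumSketch, Tag, put and get are made up for illustration and are not taken from the original classes.

```java
import java.util.EnumMap;
import java.util.Map;

public class TagEnumSketch {
    // A new tag can be inserted anywhere in this list without touching the getters.
    enum Tag { JANE, BOB, JIM }

    // Parsed values keyed by tag instead of by numeric position.
    private final Map<Tag, String> parsedValues = new EnumMap<>(Tag.class);

    void put(Tag tag, String value) {
        parsedValues.put(tag, value);
    }

    String get(Tag tag) {
        return parsedValues.get(tag);
    }

    // Getters name the tag they want, so there is no index to keep in sync.
    public String getJane() { return get(Tag.JANE); }
    public String getBob()  { return get(Tag.BOB); }
    public String getJim()  { return get(Tag.JIM); }

    public static void main(String[] args) {
        TagEnumSketch foo = new TagEnumSketch();
        foo.put(Tag.BOB, "value parsed for Bob");
        System.out.println(foo.getBob());
    }
}
```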

Comments about bad practices aside, here is the code you asked for, in the language you asked for.
The best thing, if you are keeping the system this way, would probably be to have these Java files generated automatically by the build process itself; you'd just keep a list of names in a .txt file in the directory. This script is suitable for that.
(It won't modify your files in place; it generates new ones based on the template you posted here.)
import re, sys

template = """Foo extends Bar{
public Foo(){
%s
}
%s
}
"""
tag_templ = """ m_Tags.add("%s");"""
# The getter name is filled in from the tag name, the index from its position
getter_templ = """ public String Get%s() { return m_ParsedValue.get( m_Tags.get(%d) ); }"""

def parse_names(filename):
    # Collect the tag names in the order they are added
    data = open(filename).read()
    names = re.findall(r'm_Tags\.add\("(.*?)"', data)
    return names

def create_file(filename, names):
    tag_lines = [tag_templ % name for name in names]
    # Indexes are 1-based, as in the posted getters
    getter_lines = [getter_templ % (name, i + 1) for i, name in enumerate(names)]
    code = template % ("\n".join(tag_lines), "\n".join(getter_lines))
    file = open(filename, "wt")
    file.write(code)
    file.close()

def insert_name(after, new_name, names):
    names.insert(names.index(after) + 1, new_name)

if __name__ == "__main__":
    if len(sys.argv) != 4:
        sys.stderr.write("Usage: changer.py <filename> <name-before-insertion> <new-name>\n")
        sys.exit(1)
    filename, name_before, new_name = sys.argv[1:]
    names = parse_names(filename)
    insert_name(name_before, new_name, names)
    create_file(filename, names)

I'm doing this (well, something very similar) right now, but using Excel and VBA macros. All of the Business Values are organized and ordered in a spreadsheet. I just have to click a button to generate the appropriate code for the selected cells, then copy-paste to the IDE. Better yet, I have several "code columns" for each row. Some of them generate queries, some XSL transformations, and some procedures. For one row of business data, I can get all three types of generated code very easily.
I found this (re-generate) to be MUCH easier than re-formatting the existing code I had.
