Retrieve Data Properties of An Individual using OWL API in eclipse - java

I want to retrieve all data properties set for an individual of any class using the OWL API. The code I have used is:
OWLNamedIndividual inputNoun = df.getOWLNamedIndividual(IRI.create(prefix + "Cow"));
for (OWLDataProperty prop : inputNoun.getDataPropertiesInSignature())
{
    System.out.println("the properties for Cow are " + prop); //line 1
}
This code compiles successfully, but line 1 prints nothing at all. What is the correct syntax? I have googled thoroughly and couldn't find anything useful.

OWLNamedIndividual::getDataPropertiesInSignature() does not return the properties for which the individual has an asserted value; it returns the data properties that appear in the object's own signature, which for an individual is usually empty. The method is defined on the OWLObject interface, which also covers things like class and property expressions and ontologies, for which it has a more useful output.
If you want the data properties with an actual filler for an individual, use OWLOntology::getDataPropertyAssertionAxioms(OWLIndividual), like this:
OWLNamedIndividual input = ...
Set<OWLDataPropertyAssertionAxiom> properties = ontology.getDataPropertyAssertionAxioms(input);
for (OWLDataPropertyAssertionAxiom ax : properties) {
    System.out.println(ax.getProperty());
}
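If you also need the asserted value of each assertion, not just the property, OWLDataPropertyAssertionAxiom exposes it via getObject(), which returns an OWLLiteral. A minimal fragment extending the loop above (`ontology` and `input` as before):

```java
for (OWLDataPropertyAssertionAxiom ax : ontology.getDataPropertyAssertionAxioms(input)) {
    // getProperty() is the data property, getObject() the asserted literal value
    System.out.println(ax.getProperty() + " -> " + ax.getObject().getLiteral());
}
```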

Related

Parse a single POJO from multiple YAML documents representing different classes

I want to use a single YAML file which contains several different objects - for different applications. I need to fetch one object to get an instance of MyClass1, ignoring the rest of the documents for MyClass2, MyClass3, etc. Some sort of selective de-serialization: now this class, then that one... The structure of MyClass2 and MyClass3 is totally unknown to the application working with MyClass1. The file is always valid YAML, of course.
The YAML may be of any structure we need to implement such a multi-class container. The preferred parsing tool is SnakeYAML.
Is it sensible? How can I ignore all but one object?
UPD: replaced all "document" with "object". I think we have to speak about a single YAML document containing several objects of different structure. Moreover, the parser knows exactly one structure and wants to ignore the rest.
UPD2: I think it is impossible with SnakeYAML. We have to read all objects anyway - and select the needed one later. But maybe I'm wrong.
UPD3: sample config file
---
-
  exportConfiguration781:
    attachmentFieldName: "name"
    baseSftpInboxPath: /home/user/somedir/
    somebool: false
    days: 9999
    expected:
    - ABC w/o quotes
    - "Cat ABC"
    - "Some string"
    dateFormat: yyyy-MMdd-HHmm
    user: someuser
-
  anotherConfiguration:
    k1: v1
    k2:
    - v21
    - v22
This is definitely possible with SnakeYAML, albeit not trivial. Here's a general rundown of what you need to do:
First, let's have a look what loading with SnakeYAML does. Here's the important part of the YAML class:
private Object loadFromReader(StreamReader sreader, Class<?> type) {
    Composer composer = new Composer(new ParserImpl(sreader), resolver, loadingConfig);
    constructor.setComposer(composer);
    return constructor.getSingleData(type);
}
The composer parses YAML input into Nodes. To do that, it doesn't need any knowledge about the structure of your classes, since every node is either a ScalarNode, a SequenceNode or a MappingNode and they just represent the YAML structure.
The constructor takes a root node generated by the composer and generates native POJOs from it. So what you want to do is to throw away parts of the node graph before they reach the constructor.
The easiest way to do that is probably to derive from Composer and override two methods like this:
public class MyComposer extends Composer {
    private final int objIndex;

    public MyComposer(Parser parser, Resolver resolver, int objIndex) {
        super(parser, resolver);
        this.objIndex = objIndex;
    }

    public MyComposer(Parser parser, Resolver resolver, LoaderOptions loadingConfig, int objIndex) {
        super(parser, resolver, loadingConfig);
        this.objIndex = objIndex;
    }

    @Override
    public Node getNode() {
        return strip(super.getNode());
    }

    private Node strip(Node input) {
        return ((SequenceNode) input).getValue().get(objIndex);
    }
}
The strip implementation is just an example. In this case, I assumed your YAML looks like this (object content is arbitrary):
- {first: obj}
- {second: obj}
- {third: obj}
And you simply select the object you actually want to deserialize by its index in the sequence. But you can also have something more complex like a searching algorithm.
Now that you have your own composer, you can do
Constructor constructor = new Constructor();
// assuming we want to get the object at index 1 (i.e. the second object)
Composer composer = new MyComposer(new ParserImpl(sreader), new Resolver(), 1);
constructor.setComposer(composer);
MyObject result = (MyObject) constructor.getSingleData(MyObject.class);
The answer of @flyx was very helpful for me, opening the way to work around the library's (in our case, SnakeYAML's) limitations by overriding some methods. Thanks a lot! It's quite possible there is a final solution in it - but not now. Besides, the simple solution below is robust and should be considered even if we had found the complete library-intruding solution.
I've decided to solve the task by double distilling, sorry, processing the configuration file. Imagine the latter consisting of several parts, where every part is marked by a unique token-delimiter. For the sake of keeping the YAML-likeness, it may be
---
#this is a unique key for the configuration A
<some YAML document>
---
#this is another key for the configuration B
<some YAML document>
The first pass is pre-processing. For the given String fileString and String key (and, for example, DELIMITER = "\n---\n") we select the substring with the key-defined configuration:
int begIndex;
do {
    begIndex = fileString.indexOf(DELIMITER);
    if (begIndex == -1) {
        break;
    }
    if (fileString.startsWith(DELIMITER + key, begIndex)) {
        fileString = fileString.substring(begIndex + DELIMITER.length() + key.length());
        break;
    }
    // spoil alien delimiter and repeat search
    fileString = fileString.replaceFirst(DELIMITER, " ");
} while (true);
int endIndex = fileString.indexOf(DELIMITER);
if (endIndex != -1) {
    fileString = fileString.substring(0, endIndex);
}
Now we feed the fileString to plain YAML parsing:
ExportConfiguration configuration = new Yaml(new Constructor(ExportConfiguration.class))
        .loadAs(fileString, ExportConfiguration.class);
This time we have a single document that must correspond to the ExportConfiguration class.
Note 1: The structure and even the very content of the rest of the configuration file plays absolutely no role. This was the main idea: to get independent configurations in a single file.
Note 2: The rest of the configurations may be JSON or XML or whatever. We have a preprocessor method that returns a String configuration - and the next processor parses it properly.
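The pre-processing pass above can be packaged as a small self-contained helper. A sketch under the same assumptions (ConfigSplitter and extractSection are illustrative names, not from any library):

```java
public class ConfigSplitter {
    static final String DELIMITER = "\n---\n";

    // Returns the part of fileString between "DELIMITER + key" and the next
    // delimiter (or end of input). Sketch only: if the key is absent, the
    // input is returned with its delimiters spoiled.
    static String extractSection(String fileString, String key) {
        int begIndex;
        do {
            begIndex = fileString.indexOf(DELIMITER);
            if (begIndex == -1) {
                break;
            }
            if (fileString.startsWith(DELIMITER + key, begIndex)) {
                fileString = fileString.substring(begIndex + DELIMITER.length() + key.length());
                break;
            }
            // spoil the alien delimiter and repeat the search
            fileString = fileString.replaceFirst(DELIMITER, " ");
        } while (true);
        int endIndex = fileString.indexOf(DELIMITER);
        if (endIndex != -1) {
            fileString = fileString.substring(0, endIndex);
        }
        return fileString;
    }
}
```

The returned string can then be handed to Yaml.loadAs as shown above.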

Stanford Core NLP: Entity type non deterministic

I have built a Java parser using Stanford CoreNLP. I am finding an issue in getting consistent results with the CoreNLP object: I am getting different entity types for the same input text. It seems like a bug in CoreNLP to me. I am wondering if any StanfordNLP users have encountered this issue and found a workaround. This is the service class which I am instantiating and reusing.
class StanfordNLPService {
    //private static final Logger logger = LogConfiguration.getInstance().getLogger(StanfordNLPServer.class.getName());
    private StanfordCoreNLP nerPipeline;

    /*
    Initialize the nlp instances for ner and sentiments.
    */
    public void init() {
        Properties nerAnnotators = new Properties();
        nerAnnotators.put("annotators", "tokenize,ssplit,pos,lemma,ner");
        nerPipeline = new StanfordCoreNLP(nerAnnotators);
    }

    /**
     * @param text Text from which entities are to be extracted.
     */
    public void printEntities(String text) {
        // boolean tracking = PerformanceMonitor.start("StanfordNLPServer.getEntities");
        try {
            // Properties nerAnnotators = new Properties();
            // nerAnnotators.put("annotators", "tokenize,ssplit,pos,lemma,ner");
            // nerPipeline = new StanfordCoreNLP(nerAnnotators);
            Annotation document = nerPipeline.process(text);
            // a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
            List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
            for (CoreMap sentence : sentences) {
                for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                    // Get the entity type and offset information needed.
                    String currEntityType = token.get(CoreAnnotations.NamedEntityTagAnnotation.class); // NER type
                    int currStart = token.get(CoreAnnotations.CharacterOffsetBeginAnnotation.class); // token offset_start
                    int currEnd = token.get(CoreAnnotations.CharacterOffsetEndAnnotation.class); // token offset_end
                    String currPos = token.get(CoreAnnotations.PartOfSpeechAnnotation.class); // POS type
                    System.out.println("(Type:value:offset)\t" + currEntityType + ":\t" + text.substring(currStart, currEnd) + "\t" + currStart);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Discrepancy in the results: the type changed from MISC to O compared to the initial run.
Iteration 1:
(Type:value:offset) MISC: Appropriate 100
(Type:value:offset) MISC: Time 112
Iteration 2:
(Type:value:offset) O: Appropriate 100
(Type:value:offset) O: Time 112
Here is the answer from the NER FAQ:
http://nlp.stanford.edu/software/crf-faq.shtml
Is the NER deterministic? Why do the results change for the same data?
Yes, the underlying CRF is deterministic. If you apply the NER to the same sentence more than once, though, it is possible to get different answers the second time. The reason for this is the NER remembers whether it has seen a word in lowercase form before.
The exact way this is used as a feature is in the word shape feature, which treats words such as "Brown" differently if it has or has not seen "brown" as a lowercase word before. If it has, the word shape will be "Initial upper, have seen all lowercase", and if it has not, the word shape will be "Initial upper, have not seen all lowercase".
This feature can be turned off in recent versions with the flag -useKnownLCWords false
I've looked over the code some, and here is a possible way to resolve this:
What you could do to solve this is load each of the 3 serialized CRF's with useKnownLCWords set to false, and serialize them again. Then supply the new serialized CRF's to your StanfordCoreNLP.
Here is a command for loading a serialized CRF with useKnownLCWords set to false, and then dumping it again:
java -mx600m -cp "*:." edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -useKnownLCWords false -serializeTo classifiers/new.english.all.3class.distsim.crf.ser.gz
Put whatever names you want to obviously! This command assumes you are in stanford-corenlp-full-2015-04-20/ and have a directory classifiers with the serialized CRF's. Change as appropriate for your set up.
This command should load the serialized CRF, override with the useKnownLCWords set to false, and then re-dump the CRF to new.english.all.3class.distsim.crf.ser.gz
Then in your original code:
nerAnnotators.put("ner.model","comma-separated-list-of-paths-to-new-serialized-crfs");
Please let me know if this works or if it's not working, and I can look more deeply into this!
After doing some research, I found the issue is in the ClassifierCombiner.classify() method. One of the baseClassifiers loaded by default, edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz, is returning a different type on some occasions. I am trying to load only the first model to resolve this issue.
The problem is the following area of the code
CRFClassifier.classifyMaxEnt()
int[] bestSequence = tagInference.bestSequence(model); Line 1249
ExactBestSequenceFinder.bestSequence() is returning a different sequence for the above model for the same input when called multiple times.
Not sure if this needs code fix or some configuration changes to the model. Any additional insight is appreciated.

Parse a YAML file

This is the first time I am working with YAML files, so the first thing I looked for was a library that could help me parse the file.
I have found two libraries, YamlBean and SnakeYAML. I am not sure which one I am going to use.
Here is an example of the file that I am trying to parse:
users:
  user1:
    groups:
    - Premium
  user2:
    groups:
    - Mod
  user3:
    groups:
    - default
groups:
  Mod:
    permissions:
      test: true
    inheritance:
    - Premium
  default:
    permissions:
      test.test: true
    inheritance:
    - Mod
  Admin:
    permissions:
      test.test.test: true
    inheritance:
    - Mod
The file will change dynamically, so I don't know how many users or groups the file will contain.
The information I would like to fetch from this is the user name and the group like this:
user1 Premium
user2 Mod
user3 default
And from the groups only the group names, like this:
Mod
default
Admin
Could anyone get me started here? And what is the best library to use for this:
YamlBean or SnakeYAML?
I guess I need to save the information in something that I could easily iterate over.
You could also use Jackson's YAML module.
In order to use it, you'll need a few classes: the model classes which will carry the content of your file, and a class that takes care of reading the YAML file.
The root model class could look like this:
public class MyYamlFile {
    @JsonProperty
    private List<User> users;
    @JsonProperty
    private List<Group> groups;
    // getter methods omitted
}
The User(*) class:
public class User {
    @JsonProperty
    private String name;
    @JsonProperty
    private List<GroupType> groups;
    // getter methods omitted
}
The GroupType could be an Enum containing all possible group types:
public enum GroupType {
    Premium, Mod, Default
}
Don't forget that the enum entries are case sensitive. So "premium" won't work.
You can build all your model classes that way. Every sub-entry should get its own model class.
Now to the part where you can read that YAML file:
public MyYamlFile readYaml(final File file) throws IOException {
    final ObjectMapper mapper = new ObjectMapper(new YAMLFactory()); // jackson-databind
    return mapper.readValue(file, MyYamlFile.class);
}
As you can see, this part is really neat, because you don't need much. The file instance contains your YAML file. You can create one like this:
File file = new File("path/to/my/yaml/usersAndGroups.yaml");
Instead of File, the readValue method also supports InputStream, java.io.Reader, String (with the whole content), java.net.URL and byte arrays.
You should find something that suits you.
(*) You should consider changing the structure of your YAML file, because I don't think it is possible to use dynamic keys with Jackson (maybe someone knows more about that):
users:
- name: user1
  groups:
  - Premium
- name: user2
  groups:
  - Mod
- name: user3
  groups:
  - Default
groups:
  ....
I ended up using SnakeYAML and some string splitting to solve my issue.
I loaded the YAML file into an Object and then into a Map, then split the result from the Map into a String[], and in a for loop I read the names out of the String[]. I did the same with the groups.
I know there are better solutions out there, but this is good enough for this project.
Thanks all for the replies.
Found this helpful link that will parse the input without touching Java code, should you ever need to change the config:
https://stackabuse.com/reading-and-writing-yaml-files-in-java-with-snakeyaml
InputStream inputStream = new FileInputStream(new File("src/main/resources/customer.yaml"));
Yaml yaml = new Yaml();
Map<String, Object> data = yaml.load(inputStream);
System.out.println(data);
YamlBean is included in the DMelt Java numeric computation environment (http://jwork.org/dmelt/). You can create YAML files using the jhplot.io.HFileYAML class, which creates a key-value map and saves it as a YAML file.

Java Parsing nested conf file

I have a config file:
company=My Company
num_users=3
user={
name="John"
age=24
}
user={
name="Anna"
age=27
}
user={
name="Jack"
age=22
}
I'm trying to parse this config file using Java. I tried java.util.Properties, but I don't know how to get the data of each individual user.
I can still get the company property value using the getProperty method.
Please help me with this.
Thanks!
There is nothing wrong with having the same property name with different values, but getProperty(String key) cannot differentiate between them: it will simply return the first value.
Secondly, you cannot access a nested property directly. Here getProperty will return the whole string, including the braces, as the value, because that is what your value contains. You could take that value and perform some string operations on it to fetch the inner values. A properties file supports only the key=value format: the left-hand side is the key and the right-hand side is the value. That's it.
If you want to store structured values as in your example, you should go for the JSON format; then you can store the whole JSON data in a file and read it back when you want to use it.
If you use the java.util.Properties class to load the config file, you will get the following result:
{company=My Company, age=22, user={, name="Jack", }=, num_users=3}
For the reason, refer to the Javadoc for the public void load(Reader reader) method of the Properties class.
Since you don't describe the detailed syntax of your config file, based on your example input the following sample code retrieves the name=value pairs correctly:
String reg = "(\\w+)\\s*=\\s*((?>\\{[^\\{\\}]*\\})|(?>.*$))";
Pattern pMod = Pattern.compile(reg, Pattern.MULTILINE);
Matcher mMod = pMod.matcher(line);
while (mMod.find()) {
    System.out.println(mMod.group(1));
    System.out.println(mMod.group(2));
}
The output is:
company
My Company
num_users
3
user
{
name="John"
age=24
}
user
{
name="Anna"
age=27
}
user
{
name="Jack"
age=22
}
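To also pull the name and age out of each user={...} block, a second regex pass over each matched block works. A sketch (NestedConf and parseUsers are illustrative names; the field regex assumes the exact name/age layout from the question):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedConf {
    // Extracts "name,age" pairs from each user={...} block.
    // A sketch, not a general parser: it assumes no nested braces.
    static List<String> parseUsers(String conf) {
        List<String> users = new ArrayList<>();
        Matcher block = Pattern.compile("user=\\{([^}]*)\\}").matcher(conf);
        Pattern field = Pattern.compile("name=\"(\\w+)\"\\s*age=(\\d+)");
        while (block.find()) {
            Matcher f = field.matcher(block.group(1));
            if (f.find()) {
                users.add(f.group(1) + "," + f.group(2));
            }
        }
        return users;
    }
}
```

For the config file in the question this yields one entry per user block, e.g. "John,24".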

Apache commons configuration reading properties in the format a.<no>.b

I have a properties file that says
window.1.height=100
window.1.width=80
window.2.height=50
window.2.width=30
window.3.height=150
window.3.width=100
I am using the PropertiesConfiguration class to read the properties.
How can I know the count of windows in the properties? Is there a pattern search?
I usually use something like
int i = 0;
String val;
for (;;) {
    val = props.getString("foo" + i);
    if (null == val) {
        break;
    }
    // process val
    i++;
}
This places the constraint that the counter values must be contiguous.
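If the indices are not guaranteed to be contiguous, you can instead count the distinct indices that actually occur in the key names. A sketch using plain java.util.Properties (WindowCounter is an illustrative name; PropertiesConfiguration exposes its keys similarly via getKeys()):

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WindowCounter {
    // Counts distinct window indices by scanning the key names themselves,
    // so window.1, window.2, window.5 still count as three windows.
    static int countWindows(Properties props) {
        Pattern p = Pattern.compile("window\\.(\\d+)\\..*");
        Set<String> indices = new HashSet<>();
        for (String key : props.stringPropertyNames()) {
            Matcher m = p.matcher(key);
            if (m.matches()) {
                indices.add(m.group(1));
            }
        }
        return indices.size();
    }
}
```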
There are a couple of things you can do if you have any control over the properties file itself. If you are locked into that format, I don't believe there is anything you can do.
However, if you are not locked into that format, here are a couple of solutions:
XML Configuration
Change from a properties file to an XML file format. Something like this:
<windows>
<window>
<width>80</width>
<height>100</height>
</window>
<window>
<width>30</width>
<height>50</height>
</window>
<window>
<width>100</width>
<height>150</height>
</window>
</windows>
Then use XMLConfiguration instead of PropertiesConfiguration. You could then call
config.getList("windows").size()
to get the count of windows.
Properties Configuration
Your other option, which still involves a properties file, is a little bit more contrived. Your properties file would change to look like this:
window.height=100
window.width=80
window.height=50
window.width=30
window.height=150
window.width=100
Then to get the number of windows you would call
config.getList("window.height").size();
However, using this method, you would have to change how you retrieve the values. For example, in order to get the width and height of the second window, you would use this:
config.getInt("window.width(1)");
config.getInt("window.height(1)");
Using parentheses, you can access an individual element of a list, using zero-based indices. It is a little more difficult to understand, but it works.
The API already has this on board; see Configuration#subset.

Categories

Resources