Failed to parse some ontologies - java

When parsing a set of ontologies, some of the files give me the following error while others work well (Note that I am using OWL API 5.1.6):
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1033)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:933)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadImports(OWLOntologyManagerImpl.java:1630)
....
Could not parse JSONLD org.eclipse.rdf4j.rio.jsonld.JSONLDParser.parse(JSONLDParser.java:110)
org.semanticweb.owlapi.rio.RioParserImpl.parseDocumentSource(RioParserImpl.java:172)
org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:125)
....
Stack trace:
org.eclipse.rdf4j.rio.RDFParseException: unqualified attribute 'class' not allowed [line 3, column 65]
org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:138)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:193)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:1071)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1033)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:933)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadImports(OWLOntologyManagerImpl.java:1630)
....
and many more errors like these.
Any idea how to fix these problems?
Update:
The snippet that loads the ontology is:
File file = new File("C:\\vocabs\\" + Ontofile.getName());
OWLOntologyManager m = OWLManager.createOWLOntologyManager();
OWLOntology o;
o = m.loadOntologyFromOntologyDocument(file);
OWLDocumentFormat format = m.getOntologyFormat(o);
OWLOntologyXMLNamespaceManager nsManager = new OWLOntologyXMLNamespaceManager(o, format);

This error says that one of the ontologies you're parsing is not in valid JSON-LD format.
To fix this, you have to do two things:
Ensure the format being used is the one you expect: if no format is specified, the OWL API will try every available parser until one of them successfully parses the ontology.
Fix the input data if the format is correct: in this case, for JSON-LD, the error is on line 3.
If the format used is not the one it should be, specify the format explicitly in your code - a rough sketch of one way to do that with your snippet follows below.
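The format class below is an assumption (RDFXMLDocumentFormat is only an example - use whichever format your ontologies are actually supposed to be in); wrapping the file in a FileDocumentSource that carries an explicit format means only the matching parser is tried:
// FileDocumentSource lives in org.semanticweb.owlapi.io, RDFXMLDocumentFormat in org.semanticweb.owlapi.formats.
File file = new File("C:\\vocabs\\" + Ontofile.getName());
OWLOntologyManager m = OWLManager.createOWLOntologyManager();
OWLOntology o = m.loadOntologyFromOntologyDocument(
        new FileDocumentSource(file, new RDFXMLDocumentFormat()));
If the declared format does not match the file contents, the load now fails immediately with a single, clearer parse error instead of a cascade of errors from every parser.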

Related

Use option p of weka filter (RemoveType) in java code

I am using the Weka API in my Java code and have a dataset with a string ID attribute to keep track of instances. Weka mentions on this page that there is a p option that can print the ID of each instance in the prediction output even after the attribute has been removed. But how can this be done in Java code, since none of the options listed for the RemoveType filter is p?
Thank you
The p option, on the Weka page you mentioned, is a parameter that you can set through some of the classes available in the package weka.classifiers.evaluation.output.prediction.
With these classes you can control what goes into the output prediction file, e.g. the output distribution, AttributeIndices (the attribute indices you want to have in the output file, which is what p does), the number of decimal places in prediction probabilities, etc.
You can use any of the below classes depending on the output file format you want.
PlainText
HTML
XML
CSV
Setting the parameters through code:
Evaluation eval = new Evaluation(data);
StringBuffer forPredictionsPrinting = new StringBuffer();
PlainText classifierOutput = new PlainText();
classifierOutput.setBuffer(forPredictionsPrinting);  // predictions are written into this buffer
classifierOutput.setOutputDistribution(true);        // include the class distribution in the output
You can find detailed usage of this class at
https://www.programcreek.com/java-api-examples/?api=weka.classifiers.evaluation.output.prediction.PlainText
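Putting this together, here is a rough sketch that mimics the command-line p option; the ARFF file name, the attribute index "1" (assumed to be the string ID), J48 and the 10-fold setup are all assumptions, so adjust them to your data. The string ID is hidden from the classifier by wrapping it in a FilteredClassifier with RemoveType, but it is still appended to every prediction line:
Instances data = new Instances(new BufferedReader(new FileReader("dataset.arff"))); // hypothetical file
data.setClassIndex(data.numAttributes() - 1);              // assume the class is the last attribute
RemoveType removeStrings = new RemoveType();               // drops string attributes by default
FilteredClassifier classifier = new FilteredClassifier();  // so the base classifier never sees the ID
classifier.setFilter(removeStrings);
classifier.setClassifier(new J48());                       // J48 is only an example
StringBuffer predictions = new StringBuffer();
PlainText output = new PlainText();
output.setBuffer(predictions);
output.setHeader(data);
output.setAttributes("1");                                 // append attribute 1 (the string ID), like -p 1
Evaluation eval = new Evaluation(data);
eval.crossValidateModel(classifier, data, 10, new Random(1), output);
System.out.println(predictions);
Each line of the printed predictions then carries the value of attribute 1, so instances can be traced back by their ID.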

OWL Parsing From EFO

I have been trying endlessly to parse the Experimental Factor Ontology (EFO) file, without success. The file I have opens fine in Protege, but I cannot seem to get it to load in Java. I have looked at a few sets of example code, and I am copying them seemingly exactly, but I do not understand why parsing fails. Here is my code:
System.setProperty("entityExpansionLimit","100000000");
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
URI uri = URI.create("file:~/efo.owl");
IRI iri = IRI.create(uri);
OWLOntology ontology = manager.loadOntologyFromOntologyDocument(iri);
And here are the errors I get:
Could not load ontology: Problem parsing
file:/~/efo.owl
Could not parse ontology. Either a suitable parser could not be found, or
parsing failed. See parser logs below for explanation.
The following parsers were tried:
Thank you, I know some similar posts have been made, but I have been unable to figure it out and am quite desperate! I can provide the stack trace if necessary, but it is quite long as there is a trace for each parser.
File URIs need to be absolute for the OWL API to parse them, but as you have a local file you can just create a File instance and pass it to IRI.create().
Alternatively, pass the File instance to OWLOntologyManager::loadOntologyFromOntologyDocument().
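A minimal sketch of both variants, assuming your copy of efo.owl sits at an absolute path such as /home/you/efo.owl (the path is hypothetical, adjust it to wherever the file actually lives):
File file = new File("/home/you/efo.owl");   // hypothetical absolute path
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
// Either build an absolute file: IRI from the File ...
OWLOntology ontology = manager.loadOntologyFromOntologyDocument(IRI.create(file));
// ... or pass the File directly (equivalent - pick one):
// OWLOntology ontology = manager.loadOntologyFromOntologyDocument(file);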
There must be something wrong with the local, downloaded file. Loading the ontology directly from the ontology IRI worked.

java: Protocol message tag had invalid wire type error when reading .pb file

I am trying to read a .pb file.
Specifically, I would like to read this dataset (in .tgz).
I write the following code:
Path path = Paths.get(filename);
byte[] data = Files.readAllBytes(path);
Document document = Document.parseFrom(data);
But then I received the following error.
com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
The last line of the code caused this error, but I do not know how to solve it.
Your files are actually in "delimited" format: each one contains multiple messages, each with a length prefix.
InputStream stream = new FileInputStream(filename);
Document document = Document.parseDelimitedFrom(stream);
Keep calling parseDelimitedFrom(stream) to read more messages until it returns null (end of file).
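A small sketch of that loop (Document follows the question's code; as noted below, swap in Relation if that is what your file actually contains):
InputStream stream = new FileInputStream(filename);
Document document;
while ((document = Document.parseDelimitedFrom(stream)) != null) {  // null means end of file
    // process each message as it is read
    System.out.println(document);
}
stream.close();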
Also note that the file I looked at -- testNegative.pb in heldout_relations.tgz -- appeared to contain instances of Relation, not Document. Make sure you are parsing the correct type, because the protobuf implementation can't tell the difference -- you'll get garbage if you parse the wrong type.

Read Quads into Jena Model

I am using the following steps to read quads into a Jena Model:
InputStream in = FileManager.get().open(fn); // fn -- filename
Model md = ModelFactory.createDefaultModel();
md.read(in, null, "TTL");
The quads in the file are:
@prefix dbpedia: <http://dbpedia.org/resource/> .
dbpedia:53b56e90c8a15fcd48eb5001 dbpedia:type dbpedia:willtest dbpedia:1 .
dbpedia:53b56e90c8a15fcd48eb5001 dbpedia:end dbpedia:1404394351023 dbpedia:1 .
dbpedia:53b56e90c8a15fcd48eb5001 dbpedia:room dbpedia:Room202cen dbpedia:1 .
dbpedia:53debf266ad34658725225ed dbpedia:reading dbpedia:0 dbpedia:2 .
dbpedia:53debf206ad34658725225e5 dbpedia:begining dbpedia:1407106678270 dbpedia:3 .
But on running I get the following error:
Exception in thread "main" com.hp.hpl.jena.n3.turtle.TurtleParseException: Encountered " <DECIMAL> "1. "" at line 2, column 60.
Was expecting one of:
";" ...
"," ...
"." ...
The error is generated only for the quad file; a triples file is read without problems. Is there any other method to read quads into a Jena Model?
UPDATE#1
I did as Christian mentioned in the answer, but now I get the following error:
Exception in thread "main" com.hp.hpl.jena.shared.JenaException:
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1;
Content is not allowed in prolog.
The same data file can be found at the link.
It looks like you are reading a Turtle file, and according to http://www.w3.org/TR/turtle/#abstract , Turtle is not compatible with N-Quads:
This document defines a textual syntax for RDF called Turtle that
allows an RDF graph to be completely written in a compact and natural
text form, with abbreviations for common usage patterns and datatypes.
Turtle provides levels of compatibility with the N-Triples [N-TRIPLES]
format as well as the triple pattern syntax of the SPARQL W3C
Recommendation.
What you are basically doing is telling the parser to parse a triples-syntax file while passing it a quads-syntax file.
Change your file extension to .nq and use md.read(in, null); instead. This should then automatically detect that it is quad syntax. And of course also make sure that your file follows the N-Quads syntax, as defined here: http://www.w3.org/TR/n-quads/
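If the automatic detection still falls back on the wrong parser (the RDF/XML error in the update above suggests it did), one alternative sketch, assuming a Jena version that ships the RIOT reader (RDFDataMgr in org.apache.jena.riot) and a hypothetical file name data.nq, is to load the quads into a Dataset and take a Model from it:
Dataset ds = RDFDataMgr.loadDataset("data.nq");   // hypothetical file name
Model md = ds.getDefaultModel();                  // or ds.getNamedModel(...) for a specific graph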

Validate an XSD file

I want to validate an XSD file (not an XML instance document). The approach I am using is to treat the XSD as any other XML file and use www.w3.org/2001/XMLSchema.xsd as the schema.
I am using the following code:
String schemaLang = "http://www.w3.org/2001/XMLSchema";
SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
Schema schema = factory.newSchema(new StreamSource("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource("shiporder.xsd"));
But I am getting the following error:
Failed to read schema document 'XMLSchema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
Not sure what the error is, as the file path is correct.
Please tell me the correct approach to validate an XSD file.
You need to have two additional files right beside XMLSchema.xsd. These are:
XMLSchema.dtd
datatypes.dtd
XMLSchema.xsd references these two files.
Right beside, so if XMLSchema.xsd is located at C:/XMLSchema.xsd then you have to have C:/XMLSchema.dtd and C:/datatypes.dtd.
SchemaFactory instances use by default (see SchemaFactory.setResourceResolver(LSResourceResolver)) an internal class called XMLCatalogResolver, which implements LSResourceResolver. That resolver (I assume) looks for referenced files beside the referring document.
If you look really hard, the cause of your SAXParseException is a FileNotFoundException that says the system couldn't find the XMLSchema.dtd file.
Other than this, your code is OK (and your schema too).
According to the javadoc for the StreamSource class, if you use the constructor that takes a String, that string needs to be a valid URI. For example, if you are trying to reference a local file, you may need to prefix the path with file:/. Alternatively, you can pass a File object to the constructor:
Schema schema = factory.newSchema(new StreamSource(new File("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd")));
In summary, it would be beneficial in this case to do some simple testing to rule out problems caused by your program not finding the necessary files, for example
File schemaFile1 = new File("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd");
File schemaFile2 = new File("shiporder.xsd");
assert schemaFile1.exists();
assert schemaFile2.exists();
I wonder what you are trying to achieve? If factory.newSchema(X) throws no exception, then X must be a valid schema(*). That seems a much more straightforward thing to do than validating against the schema for schema documents.
(*) the reverse isn't necessarily true of course: X might be valid against the schema for schema documents, but be invalid for other reasons, such as violating a UPA constraint.
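A minimal sketch of that simpler check (the try/catch and the file name are only illustrative; point it at whichever schema document you want to test):
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
try {
    // If newSchema() succeeds, shiporder.xsd is a structurally valid schema document.
    factory.newSchema(new File("shiporder.xsd"));
    System.out.println("shiporder.xsd is a valid XML Schema");
} catch (org.xml.sax.SAXException e) {
    System.out.println("Not a valid schema: " + e.getMessage());
}
Keep the UPA caveat above in mind: passing this check does not guarantee the schema is free of every possible problem.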
