OWL Parsing From EFO - java

I have been trying endlessly to parse the Experimental Factor Ontology (EFO) file, without success. The file opens fine in Protege, but I cannot get it to load in Java. I have looked at a few sets of example code and am copying them seemingly exactly, but I do not understand why parsing fails. Here is my code:
System.setProperty("entityExpansionLimit","100000000");
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
URI uri = URI.create("file:~/efo.owl");
IRI iri = IRI.create(uri);
OWLOntology ontology = manager.loadOntologyFromOntologyDocument(iri);
And here are the errors I get:
Could not load ontology: Problem parsing
file:/~/efo.owl
Could not parse ontology. Either a suitable parser could not be found, or
parsing failed. See parser logs below for explanation.
The following parsers were tried:
Thank you, I know some similar posts have been made, but I have been unable to figure it out and am quite desperate! I can provide the stack trace if necessary, but it is quite long as there is a trace for each parser.

File URIs need to be absolute for OWLAPI to parse them, but since you have a local file you can just create a File instance and pass that to IRI.create().
Alternatively, pass the File instance to OWLOntologyManager::loadOntologyFromOntologyDocument().
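To illustrate why the original URI is a problem, here is a minimal sketch using only the standard library. Java does not expand "~" (that is a shell feature), so "file:~/efo.owl" is not an absolute file path. The class name FileUriDemo and the helper homeFileUri are made up for this example, and it assumes the file lives in the user's home directory.

```java
import java.io.File;
import java.net.URI;

class FileUriDemo {
    /** Builds an absolute file: URI for a file in the user's home directory. */
    static URI homeFileUri(String fileName) {
        // "user.home" is a standard system property; no shell expansion is involved.
        File file = new File(System.getProperty("user.home"), fileName);
        return file.toURI();
    }

    public static void main(String[] args) {
        // The URI from the question: the part after "file:" does not start with "/",
        // so java.net.URI treats it as opaque and it has no path component.
        URI tilde = URI.create("file:~/efo.owl");
        System.out.println("opaque: " + tilde.isOpaque() + ", path: " + tilde.getPath());

        // An absolute, hierarchical file URI built from a File instance.
        URI absolute = homeFileUri("efo.owl");
        System.out.println("opaque: " + absolute.isOpaque() + ", uri: " + absolute);
    }
}
```

The File behind the second URI is what the answer suggests handing to IRI.create() or directly to loadOntologyFromOntologyDocument().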

There must be something wrong with the local, downloaded file. Loading the ontology directly from the ontology IRI worked.

Related

FileNotFound while File is there

I am using getClassLoader().getResource to find the path for Jsoup to parse.
String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
Document document = Jsoup.parse(new File(path), "utf-8");
Elements names = document.getElementsByTag("name");
System.out.println(names.size());
My student.xml is placed under the src folder of my module "day11_xml", and this code snippet comes from the class JsoupDemo1 in the package cn.itcast.xml.jsoup in the same module. The error message reads as follows:
java.io.FileNotFoundException: /Users/dingshun/Downloads/New%20Java%20Projects/demo/out/production/day11_xml/student.xml (No such file or directory)
I don't get it, as I can find the exact file in the given path. I'm confused, but could you guys help me out? Also, I'm new to both Java programming and this forum and if this question sounds silly or my question format is not right, please let me know.
What you're doing looks good. Maybe use the stream version of Jsoup.parse.
URL url = JsoupDemo1.class.getClassLoader().getResource("student.xml");
InputStream stream = JsoupDemo1.class.getClassLoader().getResourceAsStream("student.xml");
Document document = Jsoup.parse(stream, "utf-8", url.toURI().toString());
The linked documentation seems to imply it will work with HTML, not XML, so maybe you need to use the other overload, which takes a parser argument?
Actually, it turned out that Jsoup could not find my file because the directory name "New Java Projects" contains spaces, which show up as %20 in the path returned by getPath(). When I moved the file to a folder with no spaces in its name, it worked just fine. So Jsoup can parse XML with the parse(File in, String charsetName) method; the problem was the spaces in the path name.
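The %20 in the error message comes from URL.getPath(), which returns the path in its URL-encoded form. As a hedged sketch (standard library only; the class ResourcePathDemo, its helpers, and the /tmp path are invented for the example), converting the URL to a URI and then to a File decodes the spaces, so the folder would not have to be renamed:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class ResourcePathDemo {
    /** Converts a file: URL to a File, decoding escapes such as %20. */
    static File toFile(URL url) {
        try {
            return new File(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    /** Small helper so the demo does not need checked exceptions. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical location, shaped like the path in the question.
        URL resource = url("file:/tmp/New%20Java%20Projects/student.xml");
        System.out.println(resource.getPath());          // still contains %20
        System.out.println(toFile(resource).getPath());  // contains real spaces
    }
}
```

With this helper, the call from the question would become Jsoup.parse(ResourcePathDemo.toFile(url), "utf-8"), where url is the result of getResource("student.xml").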

Failed to parse some ontologies

When parsing a set of ontologies, some of the files give me the following error while others work well (Note that I am using OWL API 5.1.6):
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1033)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:933)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadImports(OWLOntologyManagerImpl.java:1630)
....
Could not parse JSONLD org.eclipse.rdf4j.rio.jsonld.JSONLDParser.parse(JSONLDParser.java:110)
org.semanticweb.owlapi.rio.RioParserImpl.parseDocumentSource(RioParserImpl.java:172)
org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:125)
....
Stack trace:
org.eclipse.rdf4j.rio.RDFParseException: unqualified attribute 'class' not allowed [line 3, column 65]
org.semanticweb.owlapi.rio.RioParserImpl.parse(RioParserImpl.java:138)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:193)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:1071)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:1033)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:933)
uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadImports(OWLOntologyManagerImpl.java:1630)
....
and many errors like those.
Any idea how to fix these problems?
update:
The snippet that loads the ontology is:
File file = new File("C:\\vocabs\\" + Ontofile.getName());
OWLOntologyManager m = OWLManager.createOWLOntologyManager();
OWLOntology o;
o = m.loadOntologyFromOntologyDocument(file);
OWLDocumentFormat format = m.getOntologyFormat(o);
OWLOntologyXMLNamespaceManager nsManager = new OWLOntologyXMLNamespaceManager(o, format);
This error says that one of the ontologies you're parsing is not valid JSON-LD.
To fix this, you have to do two things:
Ensure the format being used is the one you expect: if no format is specified, OWLAPI will try every available parser until one of them successfully parses the ontology.
Fix the input data if the format is correct: in this case, for JSON-LD, the error is on line 3.
If the format used is not what it should be, you need to specify a format in your code. To help with that, please add to the question a snippet of the code you're using to parse your files.
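As a rough way to check which syntax a file is actually in before worrying about a particular parser's message, here is a small standard-library sketch. It is only a heuristic that looks at the first meaningful character; the class SyntaxSniffer, the method guessSyntax, and the labels it returns are all hypothetical and not part of OWLAPI.

```java
class SyntaxSniffer {
    /**
     * Guesses the syntax family from the start of a document.
     * Returns "xml-like", "json-like", "other", or "empty".
     */
    static String guessSyntax(CharSequence head) {
        for (int i = 0; i < head.length(); i++) {
            char c = head.charAt(i);
            // Skip leading whitespace and a possible byte-order mark.
            if (Character.isWhitespace(c) || c == '\uFEFF') {
                continue;
            }
            if (c == '<') {
                return "xml-like";   // e.g. RDF/XML or OWL/XML (Turtle can also start with '<')
            }
            if (c == '{' || c == '[') {
                return "json-like";  // e.g. JSON-LD
            }
            return "other";          // e.g. Turtle prefixes, Manchester or functional syntax
        }
        return "empty";
    }

    public static void main(String[] args) {
        System.out.println(guessSyntax("<?xml version=\"1.0\"?><rdf:RDF/>"));
        System.out.println(guessSyntax("  {\"@context\": {}}"));
        System.out.println(guessSyntax("@prefix owl: <http://www.w3.org/2002/07/owl#> ."));
    }
}
```

In practice one would pass the first few hundred characters of each file. A file that is not "json-like" was never JSON-LD in the first place, so the JSON-LD parser's complaint about it can be ignored in favour of the message from the parser that matches the real format.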

Validating XSD itself

Could anyone please tell me how to validate an XSD file itself (not XML against XSD)? I have checked many forums and sites (including SO), and most of them point to one online validator or another. But this is not a one-time check for us. Our application involves an XSL transformation using an XSD, so we need to determine whether the XSD to be used is itself valid: that all the tags match, with an opening and a closing one, that certain tags aren't used as child tags where they are not allowed, and so on. That's why we need proper Java code to do this.
Any help would be highly appreciated.
You can validate an XSD file against the W3C schema for schemas, which can be found at http://www.w3.org/2001/XMLSchema.xsd.
Use the same techniques you would use to validate any other XML file against an XSD file; only the source document is your XSD file.
You can use xmllint for that:
xmllint --noout --dtdvalid http://www.w3.org/2001/XMLSchema.dtd my-schema.xsd
You can try the javax.xml.validation package:
SchemaFactory f = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema s = f.newSchema(new File("1.xsd"));
SchemaFactory.newSchema(File) API:
Parses the specified File as a schema and returns it as a Schema.
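Building on the newSchema() approach, here is a self-contained sketch that treats "the schema compiles" as the validity check. The class name XsdSelfCheck, the method isValidSchema, and the two sample schemas are invented for the example; it reads the XSD from a string only so that it runs without any files on disk.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdSelfCheck {
    /** Returns true if the given XSD text can be compiled into a Schema. */
    static boolean isValidSchema(String xsdText) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // With no ErrorHandler set, errors and fatal errors are thrown as SAXException.
            factory.newSchema(new StreamSource(new StringReader(xsdText)));
            return true;
        } catch (SAXException e) {
            System.err.println("Invalid schema: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                    + "<xs:element name=\"note\" type=\"xs:string\"/>"
                    + "</xs:schema>";
        // Same schema, but the element is never closed, so it is not well-formed XML.
        String bad = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                   + "<xs:element name=\"note\" type=\"xs:string\">"
                   + "</xs:schema>";
        System.out.println(isValidSchema(good));
        System.out.println(isValidSchema(bad));
    }
}
```

For a schema stored on disk, the same check works with factory.newSchema(new File(...)) as in the snippet above.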
You can validate your XSD online here.
Just copy and paste your XSD there and click on Validate Schema; it will give you the result.

Validate a XSD file

I want to validate an XSD file (not XML). The approach I am using is to treat the XSD as any other XML file and use www.w3.org/2001/XMLSchema.xsd as the schema.
I am using the following code:
String schemaLang = "http://www.w3.org/2001/XMLSchema";
SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
Schema schema = factory.newSchema(new StreamSource("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd"));
Validator validator = schema.newValidator();
validator.validate(new StreamSource("shiporder.xsd"));
But I am getting the following error:
Failed to read schema document 'XMLSchema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
Not sure what the error is as the file path is correct.
Please tell me the correct approach to validate an XSD file.
You need to have two additional files right beside XMLSchema.xsd. These are:
XMLSchema.dtd
datatypes.dtd
XMLSchema.xsd references these two files.
They have to sit right beside it: if XMLSchema.xsd is located at C:/XMLSchema.xsd, then you need C:/XMLSchema.dtd and C:/datatypes.dtd.
By default, SchemaFactory instances use an internal class called XMLCatalogResolver, which implements LSResourceResolver (see SchemaFactory.setResourceResolver(LSResourceResolver)). That resolver (I assume) looks for referenced files beside the referring document.
If you look really hard, the cause of your SAXParseException is a FileNotFoundException saying that the system couldn't find the XMLSchema.dtd file.
Other than this, your code is OK (and your schema too).
According to the javadoc for the StreamSource class, if you use the constructor method that takes a String, that string needs to be a valid URI. For example, if you are trying to reference a local file, you may need to prefix the path with file:/. Alternatively, you can pass a File object to the constructor:
Schema schema = factory.newSchema(new StreamSource(new File("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd")));
In summary, it would be beneficial in this case to do some simple testing to rule out problems caused by your program not finding the necessary files, for example
File schemaFile1 = new File("C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd");
File schemaFile2 = new File("shiporder.xsd");
assert schemaFile1.exists();
assert schemaFile2.exists();
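To make the difference between the two StreamSource constructors concrete, here is a small sketch that prints the system identifier each one ends up with. The class name SystemIdDemo and its helper methods are made up for the example; the paths are the ones from the question and do not need to exist for it to run.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;

class SystemIdDemo {
    /** System ID when the source is built from a plain string (stored as given). */
    static String fromString(String path) {
        return new StreamSource(path).getSystemId();
    }

    /** System ID when the source is built from a File (converted to a file: URI). */
    static String fromFile(String path) {
        return new StreamSource(new File(path)).getSystemId();
    }

    public static void main(String[] args) {
        String windowsPath = "C:\\Users\\aprasad\\Desktop\\XMLSchema.xsd";
        // The raw string is kept unchanged, backslashes and all, so it is not a valid URI.
        System.out.println(fromString(windowsPath));
        // The File-based constructor produces an absolute file: URI.
        System.out.println(fromFile("shiporder.xsd"));
    }
}
```

Passing a File (or a string that already starts with file:/) leaves no room for the parser to misread a Windows path as a URI.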
I wonder what you are trying to achieve? If factory.newSchema(X) throws no exception, then X must be a valid schema(*). That seems a much more straightforward thing to do than validating against the schema for schema documents.
(*) the reverse isn't necessarily true of course: X might be valid against the schema for schema documents, but be invalid for other reasons, such as violating a UPA constraint.

Creating XML Schema from URL works but from Local File fails?

I need to validate XML Schema Instance (XSD) documents which are programmatically generated so I'm using the following Java snippet, which works fine:
SchemaFactory factory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema xsdSchema = factory.newSchema( // Reads URL every time...
new URL("http://www.w3.org/2001/XMLSchema.xsd"));
Validator xsdValidator = xsdSchema.newValidator();
xsdValidator.validate(new StreamSource(schemaInstanceStream));
However, when I save the XML Schema definition file locally and refer to it this way:
Schema schema = factory.newSchema(
new File("test/xsd/XMLSchema.xsd"));
It fails with the following exception:
org.xml.sax.SAXParseException: schema_reference.4: Failed to read schema document 'file:/Users/foo/bar/test/xsd/XMLSchema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
I've ensured that the file exists and is readable by doing exists() and canRead() assertions on the File object. I've also downloaded the file with a couple different utilities (web browser, wget) to ensure that there is no corruption.
Any idea why I can validate XSD instance documents when I generate the schema from the HTTP URL but I get the above exception when trying to generate from a local file with the same contents?
[Edit]
To elaborate, I've tried multiple forms of factory.newSchema(...) using Readers and InputStreams (instead of the File directly) and still get exactly the same error. Moreover, I've dumped the file contents before using it or the various input streams to ensure it's the right one. Quite vexing.
Full Answer
It turns out that there are three additional files referenced by XML Schema which must be also stored locally and XMLSchema.xsd contains an import statement whose schemaLocation attribute must be changed. Here are the files that must be saved in the same directory:
XMLSchema.xsd - change schemaLocation to "xml.xsd" in the "import" element for XML Namespace.
XMLSchema.dtd - as is.
datatypes.dtd - as is.
xml.xsd - as is.
Thanks to @Blaise Doughan and @Tomasz Nurkiewicz for their hints.
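The "same directory" requirement follows from how relative references are resolved: a relative schemaLocation is looked up relative to the schema that contains it. The sketch below demonstrates this with two tiny made-up schemas (main.xsd and types.xsd) in a temporary directory rather than with the real XMLSchema.xsd; the class SiblingSchemaDemo and everything in it is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SiblingSchemaDemo {
    // main.xsd refers to types.xsd by a relative location and uses a type defined there.
    static final String MAIN_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:include schemaLocation=\"types.xsd\"/>"
        + "<xs:element name=\"order\" type=\"OrderId\"/>"
        + "</xs:schema>";
    static final String TYPES_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:simpleType name=\"OrderId\"><xs:restriction base=\"xs:string\"/></xs:simpleType>"
        + "</xs:schema>";

    /** Returns true if the schema file (and everything it references) compiles. */
    static boolean compiles(Path xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(xsd.toFile());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    /** Compiles main.xsd first without, then with, types.xsd beside it. */
    static boolean[] run() {
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Path main = dir.resolve("main.xsd");
            Files.write(main, MAIN_XSD.getBytes(StandardCharsets.UTF_8));
            boolean withoutSibling = compiles(main);
            Files.write(dir.resolve("types.xsd"), TYPES_XSD.getBytes(StandardCharsets.UTF_8));
            boolean withSibling = compiles(main);
            return new boolean[] {withoutSibling, withSibling};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("without types.xsd: " + result[0]);
        System.out.println("with types.xsd:    " + result[1]);
    }
}
```

The same reasoning applies to the DTDs and xml.xsd listed above: once they sit next to XMLSchema.xsd and the import points at the local xml.xsd, every reference can be resolved from the local directory.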
I assume you are trying to load XMLSchema.xsd. Please also download XMLSchema.dtd and datatypes.dtd and put them in the same directory. This should push you a little bit further.
UPDATE
Is XMLSchema.xsd importing any other schemas by relative paths that are not on the local file system?
Your relative path may not be correct with respect to your working directory. Try entering a fully qualified path to eliminate the possibility that the file cannot be found.
org.xml.sax.SAXParseException: schema_reference.4: Failed to read schema document 'file:/Users/foo/bar/test/xsd/XMLSchema.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.
