How to parse a big rdf file in rdf4j

I want to parse a huge RDF file with RDF4J using the following code, but I get an exception caused by a parser limit:
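import java.io.*;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.model.impl.LinkedHashModel;
import org.eclipse.rdf4j.rio.*;
import org.eclipse.rdf4j.rio.helpers.StatementCollector;

public class ConvertOntology {
    public static void main(String[] args) throws RDFParseException, RDFHandlerException, IOException {
        String file = "swetodblp_april_2008.rdf";
        File initialFile = new File(file);
        InputStream input = new FileInputStream(initialFile);
        RDFParser parser = Rio.createParser(RDFFormat.RDFXML);
        parser.setPreserveBNodeIDs(true);
        // Collect every parsed statement in an in-memory model
        Model model = new LinkedHashModel();
        parser.setRDFHandler(new StatementCollector(model));
        parser.parse(input, initialFile.getAbsolutePath());
        // Note: the output is written as Turtle even though the file extension is .nt
        FileOutputStream out = new FileOutputStream("swetodblp_april_2008.nt");
        RDFWriter writer = Rio.createWriter(RDFFormat.TURTLE, out);
        try {
            writer.startRDF();
            for (Statement st : model) {
                writer.handleStatement(st);
            }
            writer.endRDF();
        } catch (RDFHandlerException e) {
            e.printStackTrace(); // report write errors instead of swallowing them
        } finally {
            out.close();
        }
    }
}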
The parser has encountered more than "100,000" entity expansions in this document; this is the limit imposed by the application.
I run my code as follows, setting the two parameters suggested on the RDF4J website:
mvn -Djdk.xml.totalEntitySizeLimit=0 -DentityExpansionLimit=0 exec:java
Any help, please?

The error comes from the Apache Xerces XML parser being used instead of the default JDK XML parser.
Delete the Xerces folder from your .m2 repository and the code works fine.
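
One way to confirm which XML parser is actually in use is to ask JAXP for its factory implementation. The sketch below is only a diagnostic. The class name XmlParserCheck and its helper methods are hypothetical, and it assumes the JDK's bundled parser lives under the com.sun.org.apache. package prefix while a standalone Xerces jar reports a class under org.apache.xerces.

```java
import javax.xml.parsers.SAXParserFactory;

final class XmlParserCheck {

    // Returns the class name of the SAX parser factory that JAXP resolves at runtime.
    static String activeSaxFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Assumption: the JDK's bundled parser classes live under com.sun.org.apache.
    static boolean isJdkBuiltIn(String factoryClassName) {
        return factoryClassName.startsWith("com.sun.org.apache.");
    }

    public static void main(String[] args) {
        String factory = activeSaxFactory();
        System.out.println("SAX factory in use: " + factory);
        System.out.println(isJdkBuiltIn(factory)
                ? "JDK built-in parser"
                : "Third-party parser on the classpath (possibly Xerces)");
    }
}
```

If a third-party parser is reported, mvn dependency:tree shows which dependency pulls it in. Excluding it there may be more durable than deleting the folder from .m2, since Maven will normally download a missing artifact again.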

Related

Unable to attach file to issue in jira via rest api Java

I want to attach multiple files to an issue. I can create the issue successfully, but attaching documents afterwards fails. I followed the post "SOLVED: attach a file using REST from scriptrunner".
I get a 404 error even though the issue exists and the user has all the permissions.
File fileToUpload = new File("D:\\dummy.txt");
InputStream in = null;
try {
    in = new FileInputStream(fileToUpload);
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
HttpResponse<String> response3 = Unirest
        .post("https://.../rest/api/2/issue/test-85/attachments")
        .basicAuth(username, password)
        .field("file", in, "dummy.txt")
        .asString();
System.out.println(response3.getStatus());
Here test-85 is the issue key.
I am using open-unirest-java-3.3.06.jar. Is this the correct way to attach documents?

I am not sure how open-unirest handles its fields; it may be sending them as JSON fields rather than as multipart POST content.
I've been using Rcarz's Jira client. It is a little outdated, but it still works.
Looking at its code may help, or you can use it directly.
The Issue class:
public JSON addAttachment(File file) throws JiraException {
    try {
        return restclient.post(getRestUri(key) + "/attachments", file);
    } catch (Exception ex) {
        throw new JiraException("Failed add attachment to issue " + key, ex);
    }
}
And in the RestClient class:
import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.FileBody;

public JSON post(String path, File file) throws RestException, IOException, URISyntaxException {
    return request(new HttpPost(buildURI(path)), file);
}

private JSON request(HttpEntityEnclosingRequestBase req, File file) throws RestException, IOException {
    if (file != null) {
        File fileUpload = file;
        // Jira rejects attachment uploads that lack this header
        req.setHeader("X-Atlassian-Token", "nocheck");
        // Send the file as a multipart part named "file"
        MultipartEntity ent = new MultipartEntity();
        ent.addPart("file", new FileBody(fileUpload));
        req.setEntity(ent);
    }
    return request(req);
}
So I'm not sure why you're getting a 404. Jira is sometimes vague about its errors, so try printing the full response body or checking Jira's log if you can. It may simply be the missing "X-Atlassian-Token", "nocheck" header; try adding it.
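
For comparison, here is a minimal sketch of the same multipart upload built with only the JDK's java.net.http API (Java 11 or later), so the shape of the request is visible without any client library. The class and method names are hypothetical, the base URL and credentials are supplied by the caller, and the header value "no-check" is an assumption based on Atlassian's REST documentation (the code above uses "nocheck"); adjust it for your Jira version.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class JiraAttachmentSketch {

    // Builds a multipart/form-data body with a single part named "file".
    static byte[] multipartBody(String boundary, String fileName, byte[] content) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        body.writeBytes(head.getBytes(StandardCharsets.UTF_8));
        body.writeBytes(content);
        body.writeBytes(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return body.toByteArray();
    }

    // Creates (but does not send) the POST request for the attachments endpoint.
    static HttpRequest attachmentRequest(String baseUrl, String issueKey, String authorizationHeader,
                                         String fileName, byte[] content) {
        String boundary = "sketch-boundary-" + System.nanoTime();
        URI uri = URI.create(baseUrl + "/rest/api/2/issue/" + issueKey + "/attachments");
        return HttpRequest.newBuilder(uri)
                .header("X-Atlassian-Token", "no-check") // assumption: value per Atlassian docs
                .header("Authorization", authorizationHeader)
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(multipartBody(boundary, fileName, content)))
                .build();
    }
}
```

Sending the request with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and printing the response body, not just the status code, should make the cause of a 404 easier to see.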

Can not read file when run within jar file

I have an Akka HTTP service that simply returns the API documentation for a GET request. The documentation is an HTML file.
It all works fine when run from the IDE. When I package it as a jar, I get the error 'resource not found'. I am not sure why it cannot read the HTML file when running from a jar but works fine in the IDE.
Here is the code for the route.
private Route topLevelRoute() {
    return pathEndOrSingleSlash(() -> getFromResource("asciidoc/html/api.html"));
}
The files are located in the resources path.

I have this working now. Here is what I am doing:
private Route topLevelRoute() {
    StringBuilder strBuild = new StringBuilder();
    // getResourceAsStream resolves against the classpath, so it also works inside a jar
    try (BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(
            getClass().getResourceAsStream("/asciidoc/html/api.html"), StandardCharsets.UTF_8))) {
        // Read the stream into the string builder, keeping the line breaks
        bufferedReader.lines().forEach(s -> strBuild.append(s).append('\n'));
        // Pass the string builder as a string with the content type set to HTML
        return complete(HttpEntities.create(ContentTypes.TEXT_HTML_UTF8, strBuild.toString()));
    } catch (Exception ex) {
        // Report the failure instead of swallowing it
        return complete(StatusCodes.INTERNAL_SERVER_ERROR);
    }
}
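
The resource lookup itself can be checked outside Akka. The following is a minimal sketch with a hypothetical helper class, using only the JDK (Java 9 or later for readAllBytes). It relies on the fact that getResourceAsStream returns null when the resource is missing, and turns that into an empty Optional instead of a NullPointerException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class ClasspathResourceSketch {

    // Reads a classpath resource (leading "/" means "from the classpath root") as UTF-8 text.
    static Optional<String> readResource(Class<?> anchor, String absolutePath) {
        try (InputStream in = anchor.getResourceAsStream(absolutePath)) {
            if (in == null) {
                return Optional.empty(); // resource is not on the classpath
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/asciidoc/html/api.html";
        System.out.println(readResource(ClasspathResourceSketch.class, path)
                .map(html -> "Found " + html.length() + " characters")
                .orElse("Resource not found on classpath: " + path));
    }
}
```

Running this from the packaged jar shows whether the HTML file was actually included in the build.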

Docx4j gives error when I try to read content of existing docx file

I am trying to read the content of a docx file on my system using docx4j. I have searched for an answer but could not find one.
Below is the error I get when I run my code.
java.io.FileNotFoundException: G:\WorkSpaces\111.docx (The system cannot find the file specified)
PS: The file path is correct and no jar file is missing. I checked everything before asking.
Can someone please tell me where I am going wrong?
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public class doc4jcodegeeks {
    public static void main(String[] args) throws FileNotFoundException {
        try {
            doc4jcodegeeks dcf = new doc4jcodegeeks();
            dcf.getTemplate();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Loads the document from a fixed path on the G: drive
    private WordprocessingMLPackage getTemplate() throws Docx4JException, FileNotFoundException {
        WordprocessingMLPackage template = WordprocessingMLPackage.load(new FileInputStream(
                new File("G:\\WorkSpaces\\111.docx")));
        return template;
    }
}

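It looks like G: is a network drive. If the JVM runs under a different Windows account than the one that mapped the drive (for example the System account, when started as a service), it cannot see that mapping. You can try the following:
Run your program under a different user;
Specify the full network path ( \\share\filename.docx );
As a last resort, copy the file to a local disk.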
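
Before changing anything, it may help to check what the JVM itself sees. This is a minimal sketch with a hypothetical class name, using only java.nio.file. It prints the account the JVM runs under and whether the path exists and is readable from that account.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathCheck {

    // Describes how the given path looks from inside the running JVM.
    static String describe(Path path) {
        if (!Files.exists(path)) {
            return "missing or not visible";
        }
        if (!Files.isRegularFile(path)) {
            return "not a regular file";
        }
        if (!Files.isReadable(path)) {
            return "not readable";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // The default path is the one from the question; pass another path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "G:\\WorkSpaces\\111.docx");
        System.out.println("user.name = " + System.getProperty("user.name"));
        System.out.println(path.toAbsolutePath() + " -> " + describe(path));
    }
}
```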

Thanks for your answer, Ken Bekov. After some time I found a solution and printed the document's contents to the output window as follows:
private WordprocessingMLPackage getTemplate() throws Docx4JException, FileNotFoundException {
    // Load directly from a File instead of wrapping it in a FileInputStream
    WordprocessingMLPackage template = WordprocessingMLPackage.load(new java.io.File("G:\\WorkSpaces\\111.docx"));
    MainDocumentPart documentPart = template.getMainDocumentPart();
    List<Object> listObj = documentPart.getContent();
    String str = listObj.toString();
    System.out.println(str);
    return template;
}

SyndFeedInput().build in Java: Cannot access org.jdom.Document class file for org.jdom.Document not found

I'm using NetBeans IDE 7.0.1.
I'm testing a Java program that uses ROME to parse an RSS feed (XML).
public class RSSNew {
    public static void main(String[] args) throws Exception {
        URL url = new URL("RSS URL");
        XmlReader reader = null;
        try {
            reader = new XmlReader(url);
            SyndFeed feed = new SyndFeedInput().build(reader); /* HERE */
        } finally {
            if (reader != null)
                reader.close();
        }
    }
}
The error is:
cannot access org.jdom.Document
class file for org.jdom.Document not found
SyndFeed feed = new SyndFeedInput().build(reader);
Note: C:\Users\User PC\Documents\NetBeansProjects\RSS\src\rss\RSS.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error
Have you experienced an error like this?
Thanks in advance.
PS: I have added the following jar files to my project:
feed4j.jar
rome-1.0.jar
rome-1.0-javadoc.jar

I just had the same error message. ROME uses JDOM, so you need to add jdom.jar to your project classpath as well. You can get it from here: http://www.jdom.org/dist/binary/
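
Once the jar has been added, a quick runtime check can confirm that the class is really on the classpath. This is a minimal sketch with a hypothetical helper class. It only probes for the class by name and does not use ROME or JDOM itself.

```java
final class ClasspathProbe {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isAvailable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.jdom.Document";
        System.out.println(name + (isAvailable(name) ? " is on the classpath" : " is NOT on the classpath"));
    }
}
```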

Java Servlet - write data to file

I have a servlet that uses a data file. The relative path to this file is configured in web.xml.
The following code reads the data from the file:
public class LoginServlet extends HttpServlet {
    private Map<String, UserData> users;

    public void init() throws ServletException {
        super.init();
        String userFilePath = getServletContext().getInitParameter("user.access.file");
        InputStream userFile = this.getClass().getResourceAsStream(userFilePath);
        try {
            users = readUsersFile(userFile);
        } catch (IOException e) {
            e.printStackTrace();
            throw new ServletException(e);
        }
        ....
        ....
    }

    private Map<String, UserData> readUsersFile(InputStream is) throws IOException {
        BufferedReader fileReader = new BufferedReader(new InputStreamReader(is));
        Map<String, UserData> result = new HashMap<String, UserData>();
        ....
        ....
        ....
        return result;
    }
}
Because this is a servlet and it will not run only on my PC, I can't use an absolute path.
Does anyone know how I can write data to the file in a similar way?

If the resource URL is resolvable to an absolute local disk file system path and that path is writable, then you can use
URL url = this.getClass().getResource(userFilePath);
File file = new File(url.toURI().getPath());
OutputStream output = new FileOutputStream(file);
// ...
This is, however, not guaranteed to work in all environments (for example, when the application is deployed as an unexpanded WAR, the resource is not a plain file on disk).
Your best bet is really to have a fixed and absolute local disk file system path. The normal practice, however, is to store structured data (usernames/passwords) in a database and not a file.
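
The fixed-path approach could look like the sketch below. It is an illustration only. The class and method names are hypothetical, and it assumes the directory comes from configuration (for instance a context init parameter or system property that you define yourself), with the user's home directory as a fallback.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class UserFileStore {

    // Resolves the data file inside the configured directory, or inside user.home if none is set.
    static Path resolveDataFile(String configuredDir, String fileName) {
        String dir = (configuredDir == null || configuredDir.trim().isEmpty())
                ? System.getProperty("user.home")
                : configuredDir;
        return Paths.get(dir, fileName).toAbsolutePath();
    }

    // Writes the lines as UTF-8, creating the parent directory if needed and replacing any existing file.
    static void save(Path target, List<String> lines) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }
}
```

Because the path lives outside the deployed application, the data also survives a redeploy, which a file inside the web application's own folders may not.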
