How to configure RDF4J Rio writer to write IRIs with special characters?

I want to write an rdf4j.model.Model with the rdf/turtle format. The model should contain IRIs with the characters {}.
When I try to write the RDF model with rdf4j.rio.Rio, the {} characters are written as %7B%7D. Is there a way to overcome this? e.g. create an rdf4j.model.IRI with path and query variables or configure the writer to preserve the {} characters?
I am using org.eclipse.rdf4j:rdf4j-runtime:3.6.2.
An example snippet:
import org.eclipse.rdf4j.model.BNode;
import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.impl.SimpleValueFactory;
import org.eclipse.rdf4j.model.util.ModelBuilder;
import org.eclipse.rdf4j.rio.*;
import org.eclipse.rdf4j.rio.helpers.BasicWriterSettings;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;
public class ExamplePathVariable {
private final static Logger LOG = Logger.getLogger(ExamplePathVariable.class.getCanonicalName());
public static void main(String[] args) {
SimpleValueFactory rdf = SimpleValueFactory.getInstance();
ModelBuilder modelBuilder = new ModelBuilder();
BNode subject = rdf.createBNode();
IRI predicate = rdf.createIRI("http://example.org/onto#hasURI");
// IRI with special characters !
IRI object = rdf.createIRI("http://example.org/{token}");
modelBuilder.add(subject, predicate, object);
String turtleStr = writeToString(RDFFormat.TURTLE, modelBuilder.build());
LOG.log(Level.INFO, turtleStr);
}
static String writeToString(RDFFormat format, Model model) {
OutputStream out = new ByteArrayOutputStream();
try {
Rio.write(model, out, format,
new WriterConfig().set(BasicWriterSettings.INLINE_BLANK_NODES, true));
} finally {
try {
out.close();
} catch (IOException e) {
LOG.log(Level.WARNING, e.getMessage());
}
}
return out.toString();
}
}
This is what I get:
INFO:
[] <http://example.org/onto#hasURI> <http://example.org/%7Btoken%7D> .

There is no easy way to do what you want, because that would result in a syntactically invalid URI representation in Turtle.
The characters '{' and '}', even though they are not actually reserved characters in URIs, are not allowed to exist in un-encoded form in a URI (see https://datatracker.ietf.org/doc/html/rfc3987). The only way to serialize them legally is by percent-encoding them.
As an aside, the only reason this bit of code:
IRI object = rdf.createIRI("http://example.org/{token}");
succeeds is that the SimpleValueFactory you are using does not do character validation (for performance reasons). If you instead use the recommended approach (since RDF4J 3.5) of using the Values static factory:
IRI object = Values.iri("http://example.org/{token}");
...you would immediately have gotten a validation error.
If you need to accept a string without knowing in advance whether it contains any invalid characters, and you want a best-effort conversion to a legal IRI, you can use ParsedIRI.create:
IRI object = Values.iri(ParsedIRI.create("http://example.org/{token}").toString());
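The strict-versus-lenient behaviour described above can also be observed without RDF4J. The following is a minimal sketch that uses the JDK's java.net.URI as a stand-in; it is not RDF4J's implementation, and the class and method names (BraceEncodingDemo, parsesStrictly, encodePath) are made up for illustration. The single-argument URI constructor rejects the raw braces, while the multi-argument constructor percent-encodes them.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; java.net.URI is used only as a stand-in for IRI validation.
class BraceEncodingDemo {

    // Returns true if the string is accepted as-is by the strict single-argument constructor.
    static boolean parsesStrictly(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds a URI from components; the multi-argument constructor quotes illegal characters.
    static String encodePath(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The raw braces are rejected by the strict parser.
        System.out.println(parsesStrictly("http://example.org/{token}"));
        // The component-based constructor percent-encodes them instead.
        System.out.println(encodePath("http", "example.org", "/{token}"));
    }
}
```

Either way, the serialized form contains %7B and %7D rather than the literal braces, which matches what the Turtle writer produces.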

Related

Save and read multiple types of setting in a text file

I want to have a settings system in which I can read, write, and use variables that are stored in a file.
To summarize: there is a class, and inside that class is a list of settings.
When I make a setting I want to add it to the list so that I can write it to the text file later.
I also want to be able to get the setting value without casting it, which would use generics.
So for boolSetting I would only need to call boolSetting.get() or boolSetting.value, etc.
To start with the code I have already written: I have the code to read and write the file. This works perfectly (I think). I just need help with the setting part. Here is the code that reads and writes the file.
package winter.settings;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import net.minecraft.src.Config;
import winter.Client;
public class WinterSettings {
public static File WinterSetting;
public static void readSettings() {
try {
File WinterSetting = new File(System.getProperty("user.dir"), "WinterSettings.txt");
BufferedReader bufferedreader = new BufferedReader(new InputStreamReader(new FileInputStream(WinterSetting), "UTF-8"));
String s = "";
while ((s = bufferedreader.readLine()) != null)
{
System.out.println(s);
String[] astring = s.split(":");
Client.modules.forEach(m ->{
if(m.name==astring[0]) {
m.settings.forEach(setting ->{
if(setting.name==astring[1]) {
setting.value=astring[2];
}
});
}
});
}
bufferedreader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
public static void writeSettings() {
try {
File WinterSetting = new File(System.getProperty("user.dir"), "WinterSettings.txt");
PrintWriter printwriter = new PrintWriter(new FileWriter(WinterSetting));
Client.modules.forEach(m ->{
m.settings.forEach(setting ->{
printwriter.println(m.name+":"+setting.name+":"+setting.value);
});
});
printwriter.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Pretty much how this works is I have a setting in a Module which just stores some information.
The setting has a name and a value.
To write it, I just write the module name, the setting name, and the setting value, for example: ModuleName:SettingName:false
This works fine, but leads to the problem that I just don't know enough about generics. I can't find an approach that works for writing, reading, and setting / getting. The setting should have a name and a value. Some code I wrote is below; I just don't know how to continue it.
public class Setting<T> {
public String name;
public T value;
public Setting(String name, T value) {
this.name = name;
this.value = value;
}
public T getValue() {
return value;
}
}
From here I have subclasses for each type of setting. I am not sure whether this is good programming or not.
Now I can set / get / write, but when I read, the value isn't updated correctly.
Right now I make a new setting like this:
private final BooleanSetting toggleSprint = new BooleanSetting("ToggleSprint", true);
There is one problem with this from what I can tell: when I try to add it to a list when initializing, I get an error.
Type mismatch: cannot convert from boolean to BooleanSetting.
In short: I need to be able to read, write, get, and set a value in a setting object. This can be boolean / int / etc.
Above is some of my code to read / write the text file, the Setting class, and what I have for making a new setting.
My 2 problems are that I can't read the settings correctly and that, when making them, I can't add them to a list.

Use the Boolean.TRUE static constant:
new BooleanSetting("ToggleSprint", Boolean.TRUE);
or
Boolean.valueOf(true)
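For the reading side, here is a minimal sketch of one way a generic setting could parse its own value from a line in the ModuleName:SettingName:value format. All names here (TypedSetting, SettingsFile, applyLine, toLine) are hypothetical and not taken from the question's code base. Note that it compares names with equals(); the read loop in the question compares strings with ==, which checks object identity in Java and may be why the values are not updated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical generic setting: the parser turns the stored text back into a T.
class TypedSetting<T> {
    final String name;
    private T value;
    private final Function<String, T> parser;

    TypedSetting(String name, T defaultValue, Function<String, T> parser) {
        this.name = name;
        this.value = defaultValue;
        this.parser = parser;
    }

    T get() { return value; }

    void set(T newValue) { value = newValue; }

    // Parses raw text from the file into the typed value.
    void load(String raw) { value = parser.apply(raw); }

    // Produces one line in the ModuleName:SettingName:value format.
    String toLine(String moduleName) { return moduleName + ":" + name + ":" + value; }
}

class SettingsFile {
    // Applies one line to the matching setting of the given module, if any.
    static void applyLine(String moduleName, List<TypedSetting<?>> settings, String line) {
        String[] parts = line.split(":", 3);
        if (parts.length < 3 || !parts[0].equals(moduleName)) {
            return; // malformed line or a different module
        }
        for (TypedSetting<?> setting : settings) {
            if (setting.name.equals(parts[1])) { // equals(), not ==
                setting.load(parts[2]);
            }
        }
    }

    public static void main(String[] args) {
        TypedSetting<Boolean> sprint =
                new TypedSetting<>("ToggleSprint", Boolean.TRUE, Boolean::parseBoolean);
        List<TypedSetting<?>> settings = new ArrayList<>();
        settings.add(sprint);
        applyLine("Movement", settings, "Movement:ToggleSprint:false");
        boolean enabled = sprint.get(); // no cast needed
        System.out.println(enabled);
    }
}
```

Because the list is declared as List<TypedSetting<?>>, settings of different value types can share one list, while each field keeps its concrete type for cast-free access.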

Not able to parse new york times article using boilerpipe

I am trying to extract a news article from a New York Times URL, but it gives no output, whereas any other newspaper I try works. I want to know whether something is wrong with my code or whether boilerpipe is not able to fetch it. Also, sometimes the output is not readable English but appears as Unicode characters, mainly for 'Daily News'; I would like to know the reason for that as well.
import java.io.InputStream;
import java.net.URL;
import org.xml.sax.InputSource;
import de.l3s.boilerpipe.document.TextDocument;
import de.l3s.boilerpipe.extractors.ArticleExtractor;
import de.l3s.boilerpipe.extractors.DefaultExtractor;
import de.l3s.boilerpipe.sax.BoilerpipeSAXInput;
class ExtractData
{
public static void main(final String[] args) throws Exception
{
URL url;
url = new URL(
"http://www.nytimes.com/2013/03/02/nyregion/us-judges-offer-addicts-a-way-to-avoid-prison.html?hp&_r=0");
// NOTE We ignore HTTP-based character encoding in this demo...
final InputStream urlStream = url.openStream();
final InputSource is = new InputSource(urlStream);
final BoilerpipeSAXInput in = new BoilerpipeSAXInput(is);
final TextDocument doc = in.getTextDocument();
urlStream.close();
// You have the choice between different Extractors
//System.out.println(DefaultExtractor.INSTANCE.getText(doc));
System.out.println(ArticleExtractor.INSTANCE.getText(doc));
}
}

Nytimes.com has a paywall and returns HTTP 303 for your request. You could try handling the redirect and cookies; trying other user-agent strings might also work.

Parsing nested JSON nodes to POJOs using Google Http Java Client

For example I have a response with the following JSON:
{
response: {
id: 20,
name: Stas
}
}
And I want to parse it to the following object:
class User {
private int id;
private String name;
}
How can I skip the response node?
I use Google Http Java Client, so it would be good if someone could answer how to do it there.
How would these lines need to change?
request.setParser(new JacksonFactory().createJsonObjectParser());
return request.execute().parseAs(getResultType());

You can now implement this in one step:
new JsonObjectParser.Builder(jsonFactory)
.setWrapperKeys(Arrays.asList("response"))
.build()
http://javadoc.google-http-java-client.googlecode.com/hg/1.15.0-rc/index.html

I do not know the Google Http Java Client, but if you can access the Jackson ObjectMapper you could do the following:
1.) Enable root unwrapping:
objectMapper.enable(DeserializationFeature.UNWRAP_ROOT_VALUE);
2.) Add annotation to User.class:
@JsonRootName("response")
class User {
…
}
I hope you can use this approach.
Edit: I dug through the google-http-java-client API and have come to the conclusion that you cannot access the ObjectMapper directly. In order to use the full power of Jackson you would have to write your own implementation of JsonObjectParser to wrap a 'real' Jackson parser. Sorry about that, maybe someone else could come up with a better solution.

I didn't find a native way (for this library) to solve my task. As a result, I solved this problem by extending the functionality of JsonObjectParser. That would normally mean extending JacksonFactory as well, but it's a final class, so I used aggregation.
I wrote the following classes:
JacksonFilteringFactory
import com.google.api.client.json.JsonObjectParser;
import com.google.api.client.json.jackson2.JacksonFactory;
public class JacksonFilteringFactory {
private final JacksonFactory factory = new JacksonFactory();
public JsonObjectParser createJsonObjectParser(String filteringNode) {
return new FilteringJsonObjectParser(factory, filteringNode);
}
}
FilteringJsonObjectParser
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Type;
import java.nio.charset.Charset;
import org.json.JSONException;
import org.json.JSONObject;
import org.json.JSONTokener;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.JsonObjectParser;
import com.vkredmessenger.AppController;
import com.vkredmessenger.util.StringUtils;
public class FilteringJsonObjectParser extends JsonObjectParser {
private String mFilteringNode;
public FilteringJsonObjectParser(JsonFactory jsonFactory,
String filteringNode) {
super(jsonFactory);
mFilteringNode = filteringNode;
}
@Override
public Object parseAndClose(InputStream in,
Charset charset, Type dataType)
throws IOException {
String originalResponse =
StringUtils.convertStreamToString(in, charset);
String response = null;
try {
JSONTokener tokener = new JSONTokener(originalResponse);
JSONObject originalResponseObject =
(JSONObject) tokener.nextValue();
JSONObject responseObject =
originalResponseObject.getJSONObject(mFilteringNode);
response = responseObject.toString();
} catch (JSONException e) {
e.printStackTrace();
}
InputStream filteredIn =
new ByteArrayInputStream(response.getBytes(charset));
return super.parseAndClose(filteredIn, charset, dataType);
}
}
So, for the example from my question, the resulting parsing code is the following:
request.setParser(new JacksonFilteringFactory().createJsonObjectParser("response"));
return request.execute().parseAs(getResultType());

Recursive method to return different objects stored as JSON files

I found this answer and I think it applies, but with a little twist.
How to return multiple objects from a Java method?
I have two JSON formatted files using the YML subset, one of which is a list of devices, and the other is a file that lists the attributes of a particular type of device.
The choice of dividing up the list of Device instances into one file and the attributes in another file is to allow the Device manufacturer to change the attributes without having to go back and rewrite/recompile hard coded attributes.
In any case, I could use two different calls to the JSONParser and then add the list of attributes to the Device object in the end, but that solution seems like a waste of code since, except for the inner part of the while loop where the values are set, they do exactly the same thing.
I thought something like a Ruby-ish Yield might do the trick in the inner loop, but not sure if this exists in Java.
So, without much further ado, here is the code:
// Import the json simple parser, used to read configuration files
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.JSONValue;
import org.json.simple.parser.*;
public class LoadJson {
private String path = "";
private String fileType = "";
//public LoadJson(String path, String fileType){
//this.path = path;
//this.fileType = fileType;
//}
public static Device openJSON(String fileType, String deviceName) {
JSONParser parser = new JSONParser();
Device myDevice = new Device();
ContainerFactory containerFactory = new ContainerFactory(){
public List creatArrayContainer() {
return new LinkedList();
}
public Map createObjectContainer() {
return new LinkedHashMap();
}
};
try {
File appBase = new File("."); //current directory
String path = appBase.getAbsolutePath();
System.out.println(path);
Map obj = (Map)parser.parse(new FileReader(path+fileType),containerFactory);
Iterator iter = obj.entrySet().iterator();
//Iterator iterInner = new Iterator();
while(iter.hasNext()){
//LinkedList entry = (LinkedList)iter.next();
LinkedList myList = new LinkedList();
Map.Entry entry = (Map.Entry)iter.next();
myList = (LinkedList) (entry.getValue());
Iterator iterate = myList.iterator();
while (iterate.hasNext())
{
LinkedHashMap entry2 = (LinkedHashMap)iterate.next();
if("mav2opc".equals(fileType)) // compare string contents, not object references
{
String deviceName1 = entry2.get("DeviceName").toString();
String userName = entry2.get("UserName").toString();
String password = entry2.get("Password").toString();
String deviceType = entry2.get("DeviceType").toString();
String ipAddress = entry2.get("IPAddress").toString();
myDevice = new Device(deviceName1, userName, password, deviceType,ipAddress);
openJSON(deviceType,deviceName1);
System.out.println(myDevice);
} else
{
//Add a tag
String tagName = entry2.get("tagName").toString();
String tagType = entry2.get("tagType").toString();
String tagXPath = entry2.get("tagXPath").toString();
String tagWritable = entry2.get("tagWritable").toString();
}
}
}
//System.out.println("==toJSONString()==");
//System.out.println(JSONValue.toJSONString(json));
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch(ParseException pe){
System.out.println(pe);
}
return myDevice;
}
}

OK, so you have two files: one with a list of devices, and a per-device file that holds the attributes of the device. The structure of the two files is exactly the same; I am guessing something like this.
The devices file:
[{"DeviceName":"d1","UserName":"u1","Password":"p1","DeviceType":"t1","IPAddress":"i1"},
{"DeviceName":"d2","UserName":"u2","Password":"p2","DeviceType":"t2","IPAddress":"ir"}]
And the per-device file:
[{"tagName":"n1","tagType":"t1","tagXPath":"X1","tagWritable":true}]
In the per-device file there is only one map, though it is inside a list. The processing logic is to open the file, create the parser, read the JSON, and process the map for each entry in the list.
The logic for processing the map is the only difference between the two. Note that right now you are discarding what you read from the per-device file; you will have to store it somewhere in myDevice.
One way to do this is using a callback: create an interface MapHandler that has a method process. openJSON takes a parameter of this type and calls process on it for each map.
The device-level handler can be constructed with the myDevice being processed and set the fields.
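The callback idea can be sketched roughly as follows. This is an illustration under assumptions rather than a drop-in replacement: the already-parsed JSON is modelled as plain java.util collections instead of json-simple objects, and apart from MapHandler and process, every name (CallbackDemo, forEachEntry, collect) is made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// The callback interface suggested above: one method, called once per map.
interface MapHandler {
    void process(Map<String, Object> entry);
}

class CallbackDemo {

    // Shared traversal: the only part that differs per file type is the handler.
    static void forEachEntry(Map<String, List<Map<String, Object>>> parsed, MapHandler handler) {
        for (List<Map<String, Object>> entries : parsed.values()) {
            for (Map<String, Object> entry : entries) {
                handler.process(entry);
            }
        }
    }

    // Example handler usage: collect one field from every map.
    static List<String> collect(Map<String, List<Map<String, Object>>> parsed, String key) {
        List<String> values = new ArrayList<>();
        forEachEntry(parsed, entry -> values.add(String.valueOf(entry.get(key))));
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> device = new LinkedHashMap<>();
        device.put("DeviceName", "d1");
        List<Map<String, Object>> devices = new ArrayList<>();
        devices.add(device);
        Map<String, List<Map<String, Object>>> parsed = new LinkedHashMap<>();
        parsed.put("devices", devices);

        // A device-level handler would set fields on a Device here instead of printing.
        forEachEntry(parsed, entry -> System.out.println(entry.get("DeviceName")));
    }
}
```

A tag-level handler would implement the same interface and add each tag to the device it was constructed with, so the file-opening and looping code is written only once.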

Identifying file type in Java

Please help me find out the type of a file that is being uploaded.
I want to distinguish between Excel and CSV files.
The MIME type returned is the same for both of these files. Please help.

I use Apache Tika which identifies the filetype using magic byte patterns and globbing hints (the file extension) to detect the MIME type. It also supports additional parsing of file contents (which I don't really use).
Here is a quick and dirty example on how Tika can be used to detect the file type without performing any additional parsing on the file:
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashMap;
import org.apache.tika.metadata.HttpHeaders;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaMetadataKeys;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.xml.sax.helpers.DefaultHandler;
public class Detector {
public static void main(String[] args) throws Exception {
File file = new File("/path/to/file.xls");
AutoDetectParser parser = new AutoDetectParser();
parser.setParsers(new HashMap<MediaType, Parser>());
Metadata metadata = new Metadata();
metadata.add(TikaMetadataKeys.RESOURCE_NAME_KEY, file.getName());
InputStream stream = new FileInputStream(file);
parser.parse(stream, new DefaultHandler(), metadata, new ParseContext());
stream.close();
String mimeType = metadata.get(HttpHeaders.CONTENT_TYPE);
System.out.println(mimeType);
}
}

I hope this will help. It is taken from an example, not my own code:
import javax.activation.MimetypesFileTypeMap;
import java.io.File;
class GetMimeType {
public static void main(String args[]) {
File f = new File("test.gif");
System.out.println("Mime Type of " + f.getName() + " is " +
new MimetypesFileTypeMap().getContentType(f));
// expected output :
// "Mime Type of test.gif is image/gif"
}
}
The same may work for Excel and CSV types. Not tested.

I figured out a cheaper way of doing this with java.nio.file.Files
public String getContentType(File file) throws IOException {
return Files.probeContentType(file.toPath());
}
- or -
public String getContentType(Path filePath) throws IOException {
return Files.probeContentType(filePath);
}
Hope that helps.
Cheers.

A better way without using javax.activation.*:
URLConnection.guessContentTypeFromName(f.getAbsolutePath());

If you are already using Spring this works for csv and excel:
import org.springframework.mail.javamail.ConfigurableMimeFileTypeMap;
import javax.activation.FileTypeMap;
import java.io.IOException;
public class ContentTypeResolver {
private FileTypeMap fileTypeMap;
public ContentTypeResolver() {
fileTypeMap = new ConfigurableMimeFileTypeMap();
}
public String getContentType(String fileName) throws IOException {
if (fileName == null) {
return null;
}
return fileTypeMap.getContentType(fileName.toLowerCase());
}
}
or with javax.activation you can update the mime.types file.

A CSV file will start with text, whereas an Excel file is most likely binary.
However, the simplest approach is to try to load the file as an Excel document using POI. If this fails, try to load the file as a CSV; if that also fails, it's possibly neither type.
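The text-versus-binary observation can be turned into a cheap first check on the leading bytes of the upload. The sketch below relies on two well-known signatures: legacy .xls files are OLE2 compound documents starting with D0 CF 11 E0 A1 B1 1A E1, and .xlsx files are ZIP archives starting with "PK" 03 04. The class and method names (SpreadsheetSniffer, sniff) and the returned labels are invented for illustration. Other formats share these containers (for example .doc and .docx), so a match only narrows things down; it does not prove the file is a spreadsheet.

```java
// Hypothetical helper that classifies a file by its first few bytes.
class SpreadsheetSniffer {
    // OLE2 compound document signature (used by legacy .xls, among others).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature (used by .xlsx, among others).
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns "ole2", "zip", or "other" for the given leading bytes.
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2)) {
            return "ole2";   // candidate .xls
        }
        if (startsWith(head, ZIP)) {
            return "zip";    // candidate .xlsx
        }
        return "other";      // possibly CSV or another text format
    }

    public static void main(String[] args) {
        System.out.println(sniff("id,name\n1,Stas\n".getBytes()));
    }
}
```

In practice the first eight bytes of the uploaded stream would be read and passed to sniff, and files classified as "other" could then be tried with a CSV parser.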
