Writing foreign characters to .json file, eclipse vs jar


I am using the Eclipse IDE to test writing data containing a few foreign (Chinese, Hindi) characters to .json and .txt files in Java. Writing to the .txt file works, whereas the .json file shows garbled, ASCII-looking characters.
Code Snippet:
try(BufferedWriter br = new BufferedWriter(new FileWriter(new File("test.json")))) {
JSONObject obj = new JSONObject();
obj.put("key", "Hello, ओ ो ु ऋ 样品");
String str = obj.toJSONString();
System.out.println(str);
br.write(str);
br.close();
} catch (Exception e) {
e.printStackTrace();
}
Output of .txt: {"key":"Hello, ओ ो ु ऋ 样品"}
Output of .json: {"key":"Hello, à¤“ à¥‹ à¥� à¤‹ æ ·å“�"}
I have also tried writing the data with DataOutputStream, but the result is the same.
When decoding, the text was decoded back into the same foreign characters and looked fine.
After building a jar and running it, I don't get the same results: both the written and the read text show up garbled. I understand that in Eclipse the file is saved as UTF-8, which helped it compile. By the way, I'm using Maven to build the jar.
Please help me find a solution. Thanks.
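A likely cause, assuming a JDK older than 18: FileWriter(File) encodes with the platform default charset. Eclipse typically launches the program with the project encoding (UTF-8), while a plain java -jar run falls back to the OS default. A minimal sketch that names the charset explicitly follows. The class name Utf8JsonWriter and its helper methods are made up for illustration, and the string literal stands in for obj.toJSONString(). For the Maven build, it may also be worth checking that project.build.sourceEncoding is set to UTF-8 so the literals in the source are compiled correctly.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8JsonWriter {

    // Writes text as UTF-8 bytes, independent of the platform default charset.
    public static void writeUtf8(Path target, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        } // try-with-resources flushes and closes the writer
    }

    // Reads the whole file back, decoding the bytes as UTF-8.
    public static String readUtf8(Path source) throws IOException {
        return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for obj.toJSONString(); the escapes spell out one Hindi and two
        // Chinese characters so this source file stays pure ASCII.
        String json = "{\"key\":\"Hello, \u0913 \u6837\u54C1\"}";
        Path file = Files.createTempFile("test", ".json");
        writeUtf8(file, json);
        // Compare in memory instead of printing, since the console may use another charset.
        System.out.println(readUtf8(file).equals(json));
    }
}
```

The same idea applies to reading: pass StandardCharsets.UTF_8 to the reader as well, so that the jar and the IDE behave identically.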

Related

Vscode doesn't recognize Umlaute (äöü) when reading and writing files with Java

I have a Java project which reads from a .txt file and counts the frequency of every word and saves every word along with its frequency in a .stat file. The way I do this is by reading the file with a BufferedReader, using replaceAll to replace all special characters with spaces and then iterating through the words and finally writing into a .stat with a PrintWriter.
This program works fine if I run it in Eclipse.
However, if I run it in VSCode, the Umlaute (äöü) get recognized as Special characters and are removed from the words.
If I don't use a replaceAll and leave all the special characters in the text, they will get recognized and displayed normally in the .stat.
If I use replaceAll("[^\\p{IsAlphabetic}+]"), the Umlaute will get replaced by all kinds of weird Unicode characters (for Example Ăbermut instead of Übermut).
If I use replaceAll("[^a-zA-ZäöüÄÖÜß]"), the Umlaute will just get replaced by spaces. The same happens if I mention the Umlaute via their Unicode.
This has to be a problem with the encoding in VSCode or perhaps Powershell, as it works fine in other IDEs.
I already checked if Eclipse and VSCode use the same Jdk version, which they did. It's 17.0.5 and the only one installed on my machine.
I also tried out all the different encoding settings in VSCode and I recreated the project from scratch after changing the settings, to no avail.
Here's the code of the minimal reproducible example:
import java.io.*;
public class App {
static String s;
public static void main(String[] args) {
Reader reader = new Reader();
reader.readFile();
}
}
public class Reader {
public void readFile() {
String s = null;
File file = new File("./src/textfile.txt");
try (FileReader fileReader = new FileReader(file);
BufferedReader bufferedReader = new BufferedReader(fileReader);) {
s = bufferedReader.readLine();
} catch (FileNotFoundException ex) {
// TODO: handle exception
} catch (IOException ex) {
System.out.println("IOException");
}
System.out.println(s);
System.out.println(s.replaceAll("[a-zA-ZäöüÄÖÜß]", " "));
}
}
My textfile.txt contains the line "abcABCäöüÄÖÜß".
The above program outputs
ï»¿abcABCÃ¤Ã¶Ã¼Ã?Ã?Ã?Ã?
ï»¿ Ã¤Ã¶Ã¼Ã?Ã?Ã?Ã?
This shows that the problem is presumably in the Reader, as the gibberish Unicode symbols don't get picked up by the replaceAll.

I solved it by explicitly turning all java files and all .txt files into UTF-8 encoding (in the bottom bar in VSCode), setting UTF-8 as the standard encoding in the VSCode settings and modifying both the FileReader and FileWriter to work with the UTF-8 encoding like this:
FileReader fileReader = new FileReader(file, Charset.forName("UTF-8"));
FileWriter fileWriter = new FileWriter(file, Charset.forName("UTF-8"));
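The leading ï»¿ in the output above looks like a UTF-8 byte order mark (BOM) that was decoded with the wrong charset. Java does not skip a BOM on its own, so even when reading as UTF-8 it stays in the first line as the character U+FEFF. A small sketch of reading with an explicit charset and dropping the BOM follows. The class name Utf8LineReader is invented for this example, and main writes its own temporary file rather than using the textfile.txt from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8LineReader {

    // The byte order mark as it appears after UTF-8 decoding.
    private static final String BOM = "\uFEFF";

    // Reads the first line of a UTF-8 file and removes a leading BOM if present.
    public static String readFirstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line = reader.readLine();
            if (line != null && line.startsWith(BOM)) {
                line = line.substring(1);
            }
            return line;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small UTF-8 file that starts with a BOM, like many Windows editors produce.
        Path file = Files.createTempFile("textfile", ".txt");
        String text = BOM + "abcABC\u00E4\u00F6\u00FC";
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));

        String line = readFirstLine(file);
        System.out.println(line.length()); // 9: six ASCII letters plus three umlauts, BOM removed
    }
}
```

Note that Files.newBufferedReader throws a MalformedInputException when the file is not valid UTF-8, which makes a wrong file encoding visible instead of silently producing garbage.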

File/ontology (turtle, N3, JSON, RDF-XML) etc to .ttl file conversions in java

I wanted to ask if there is a way in Java to read in basically any RDF file format (N3, JSON, RDF/XML, etc.) and convert it into Turtle (.ttl). I have searched on Google for ideas, but the results mainly explain how to handle specific file types and how to convert a file type to RDF, whereas I want it the other way around.
EDIT (following the code example given in the answer):
if(FilePath.getText().equals("")){
FilePath.setText("Cannot be empty");
}else{
try {
// get the input file from the file chooser (FilePath is the variable name of the
// text field that was set earlier to the path of the file selected
// in the file chooser)
FileInputStream fis = new FileInputStream(FilePath.getText());
// guess the format of the input file (default set to RDF/XML)
// when clicking on the error I get taken to this line.
// note: the file name is passed here, because fis.toString() does not contain it
RDFFormat inputFormat = Rio.getParserFormatForFileName(FilePath.getText()).orElse(RDFFormat.RDFXML);
//create a parser for the input file and a writer for Turtle
RDFParser rdfParser = Rio.createParser(inputFormat);
RDFWriter rdfWriter = Rio.createWriter(RDFFormat.TURTLE,
new FileOutputStream("./" + fileName + ".ttl"));
//link parser to the writer
rdfParser.setRDFHandler(rdfWriter);
//start the conversion
InputStream inputStream = fis;
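// the second argument is the base URI; fis.toString() would not be a valid one
rdfParser.parse(inputStream, new File(FilePath.getText()).toURI().toString());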
//exception handling
} catch (FileNotFoundException ex) {
Logger.getLogger(FileConverter.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(FileConverter.class.getName()).log(Level.SEVERE, null, ex);
}
}
I have added "eclipse-rdf4j-3.0.3-onejar.jar" to the Libraries folder in NetBeans and now when I run the program I keep getting this error:
Exception in thread "AWT-EventQueue-0" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
at org.eclipse.rdf4j.common.lang.service.ServiceRegistry.(ServiceRegistry.java:31)
Any help or advice would be highly appreciated. Thank you.

Yes, this is possible. One option is to use Eclipse RDF4J for this purpose, or more specifically, its Rio parser/writer toolkit.
Here's a code example using RDF4J Rio. It detects the syntax format of the input file based on the file extension, and directly writes the data to a new file, in Turtle syntax:
// the input file
java.net.URL url = new URL("http://example.org/example.rdf");
// guess the format of the input file (default to RDF/XML)
RDFFormat inputFormat = Rio.getParserFormatForFileName(url.toString()).orElse(RDFFormat.RDFXML);
// create a parser for the input file and a writer for Turtle format
RDFParser rdfParser = Rio.createParser(inputFormat);
RDFWriter rdfWriter = Rio.createWriter(RDFFormat.TURTLE,
new FileOutputStream("/path/to/example-output.ttl"));
// link the parser to the writer
rdfParser.setRDFHandler(rdfWriter);
// start the conversion
try(InputStream inputStream = url.openStream()) {
rdfParser.parse(inputStream, url.toString());
}
catch (IOException | RDFParseException | RDFHandlerException e) { ... }
For more examples, see the RDF4J documentation.
Edit: Regarding your NoClassDefFoundError: you're missing a necessary third-party library on your classpath (in this particular case, a logging library).
Instead of using the onejar, it's probably better to use Maven (or Gradle) to set up your project. See the development environment setup notes, or for a more step by step guide, see this tutorial (the tutorial uses Eclipse rather than Netbeans, but the points about how to set up your maven project will be very similar in Netbeans).
If you really don't want to use Maven, what you can also do is just download the RDF4J SDK, which is a ZIP file. Unpack it, and just add all jar files in the lib/ directory to Netbeans.

Another option would be to use Apache Jena.

Reading input/output from Chatting program[exe file]

What I tried:
try {
File fileDir = new File("B:\\Palringo\\palringo.exe");
BufferedReader in = new BufferedReader(
new InputStreamReader(
new FileInputStream("B:\\Palringo\\palringo.exe"), "UTF8"));
String str;
while ((str = in.readLine()) != null) {
System.out.println(str);
}
in.close();
}
catch (IOException e) // also covers FileNotFoundException and UnsupportedEncodingException
{
System.out.println(e.getMessage());
}
Output:
unreadable strings
What I want:
I want to control palringo.exe so I can make a bot for it.
What is palringo.exe?
It's a chat program; you can download it or use the web version (palringo.im).
Am I doing something wrong by opening an exe file? Should I connect to the website using Java's connection classes instead? If so, how can I connect to it?

This doesn't work. You cannot read an exe file as text.
You need the source code or a library (API) to integrate that software into your code. You cannot simply read an exe file and extract code from it, because an exe file contains compiled machine code, not readable text.
But you can use Runtime.exec() (or ProcessBuilder) to run that exe file.
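As a rough sketch of running a program from Java and reading what it prints: this only helps for programs that write to standard output, so a GUI chat client will most likely print nothing useful. The class name ProcessOutputReader is made up for this example. Decoding the output as UTF-8 is an assumption, because the child process may use a different charset. The demo runs java -version from the current JVM's own installation instead of palringo.exe.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ProcessOutputReader {

    // Starts the command, collects everything it prints (stdout and stderr), and waits for it to exit.
    public static List<String> runAndCapture(String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader is enough
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        // Assumption: the child process prints UTF-8 (or plain ASCII) text.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Use the java launcher of the running JVM as a harmless demo command.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        for (String line : runAndCapture(javaBin, "-version")) {
            System.out.println(line);
        }
    }
}
```

To actually control a chat service, a network API like the one mentioned in the other answer is usually the better route than driving the executable.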

I know this is a very late answer, but if you are looking to connect to and manipulate Palringo, there are a couple of APIs available.
https://github.com/calico-crusade/PalringoApi
This specific one can also be found on NuGet, though it is for C#. You could port the majority of the connection code to Java if you wish.

java output html code to file

I have a chunk of HTML code that should be written out as a .html file in Java. The pre-written HTML code is the header and table for a page, and I need to generate some data and output it to the same .html file. Is there an easier way to print the HTML code than calling println() line by line? Thanks

You can look at some Java libraries for parsing HTML code. A quick Google search turns up a few. Read in the HTML, use their queries to manipulate the DOM as needed, and then write it back out.
e.g. http://jsoup.org/

Try using a templating engine, MVEL2 or FreeMarker, for example. Both can be used by standalone applications outside of a web framework. You lose time upfront but it will save you time in the long run.

JSP (Java Server Pages) allows you to write HTML files which have some Java code easily embedded within them. For example
<html><head><title>Hi!</title></head><body>
<% some java code here that outputs some stuff %>
</body></html>
That does require a Java application server (a servlet container such as Tomcat) to be installed, though. But if this is on a web server, that might not be unreasonable to have.
If you want to do it in normal Java, that depends. I don't fully understand which part you meant you will be outputting line by line. Did you mean you are going to do something like
System.out.println("<html>");
System.out.println("<head><title>Hi!</title></head>");
System.out.println("<body>");
// etc
Like that? If that's what you meant, then don't do that. You can just read in the data from the template file and output all the data at once. You could read it into a multiline text string, or you could read the data in from the template and output it directly to the new file. Something like
while( (strInput = templateFileReader.readLine()) != null)
newFileOutput.println(strInput);
Again, I'm not sure exactly what you mean by that part.

HTML is simply a way of marking up text, so to write a HTML file, you are simply writing the HTML as text to a file with the .html extension.
There's plenty of tutorials out there for reading and writing from files, as well as getting a list of files from a directory. (Google 'java read file', 'java write file', 'java list directory' - that is basically everything you need.) The important thing is the use of BufferedReader/BufferedWriter for pulling and pushing the text in to the files and realising that there is no particular code science involved in writing HTML to a file.
I'll reiterate; HTML is nothing more than <b>text with tags</b>.
Here's a really crude example that writes two files into a single file, wrapping them in an <html></html> tag.
import java.io.*;
public class HtmlMerger {
    static BufferedReader getReaderForFile(String filename) throws IOException {
        FileInputStream in = new FileInputStream(filename);
        return new BufferedReader(new InputStreamReader(in));
    }
    // Copy every line from the reader to the writer
    static void copyLines(BufferedReader in, BufferedWriter out) throws IOException {
        for (String line = in.readLine(); line != null; line = in.readLine()) {
            out.write(line);
            out.newLine(); // readLine() strips the line terminator
        }
    }
    public static void main(String[] args) throws IOException {
        // Open the files; try-with-resources closes them and flushes the output
        try (BufferedReader myheader = getReaderForFile("myheader.txt");
             BufferedReader contents = getReaderForFile("contentfile.txt");
             BufferedWriter out = new BufferedWriter(new FileWriter("mypage.html"))) {
            out.write("<html>");
            out.newLine();
            copyLines(myheader, out);
            copyLines(contents, out);
            out.write("</html>");
        }
    }
}

To build a simple HTML text file, you don't have to read your input file line by line.
File theFile = new File("file.html");
byte[] content = new byte[(int) theFile.length()];
You can use "RandomAccessFile.readFully" to read files entirely as a byte array:
// Read file function:
RandomAccessFile file = null;
try {
file = new RandomAccessFile(theFile, "r");
file.readFully(content);
} finally {
if(file != null) {
file.close();
}
}
Make your modifications on the text content:
String text = new String(content);
text = text.replace("<!-- placeholder -->", "generated data");
content = text.getBytes();
Writing is also easy:
// Write file content:
RandomAccessFile file = null;
try {
file = new RandomAccessFile(theFile, "rw");
file.setLength(0); // truncate first, otherwise stale bytes remain when the new content is shorter
file.write(content);
} finally {
if(file != null) {
file.close();
}
}
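For comparison, here is a shorter sketch of the same read, replace, write cycle using java.nio.file.Files, with the charset stated explicitly. The class name HtmlTemplateFiller and the temporary file names are invented for this example. The placeholder comment and the "generated data" text are the ones from the snippet above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HtmlTemplateFiller {

    // Reads the template, replaces every occurrence of the placeholder, and writes the result.
    public static void fillFile(Path template, Path output, String placeholder, String data)
            throws IOException {
        String html = new String(Files.readAllBytes(template), StandardCharsets.UTF_8);
        String filled = html.replace(placeholder, data);
        // Files.write creates the file or replaces its previous content by default.
        Files.write(output, filled.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A tiny stand-in template; a real one would be the pre-written header and table.
        Path template = Files.createTempFile("template", ".html");
        Files.write(template,
                "<html><body><!-- placeholder --></body></html>".getBytes(StandardCharsets.UTF_8));

        Path output = Files.createTempFile("page", ".html");
        fillFile(template, output, "<!-- placeholder -->", "generated data");
        System.out.println(new String(Files.readAllBytes(output), StandardCharsets.UTF_8));
    }
}
```

Reading the whole file into memory is fine for a page-sized template; for very large files the line-by-line approach from the other answers scales better.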

How to print the content of a tar.gz file with Java?

I have to implement an application that permits printing the content of all files within a tar.gz file.
For Example:
if I have three files like this in a folder called testx:
A.txt contains the words "God Save The queen"
B.txt contains the words "Ubi maior, minor cessat"
C.txt.gz is a file compressed with gzip that contain the file c.txt with the words "Hello America!!"
So I compress testx, obtain the compressed tar file: testx.tar.gz.
So with my Java application I would like to print in the console:
"God Save The queen"
"Ubi maior, minor cessat"
"Hello America!!"
I have implemented the ZIP version and it works well, but using the Apache tar library (http://commons.apache.org/compress/), I noticed that it is not as easy as the Java ZIP utilities.
Could someone help me?
I have started looking on the net to understand how to accomplish my aim, so I have the following code:
GZIPInputStream gzipInputStream=null;
gzipInputStream = new GZIPInputStream( new FileInputStream(fileName));
TarInputStream is = new TarInputStream(gzipInputStream);
TarEntry entryx = null;
while((entryx = is.getNextEntry()) != null) {
if (entryx.isDirectory()) continue;
else {
System.out.println(entryx.getName());
if ( entryx.getName().endsWith("txt.gz")){
is.copyEntryContents(out);
// out is a OutputStream!!
}
}
}
So in the line is.copyEntryContents(out), it is possible to save the stream to a file by passing an OutputStream, but I don't want that! In the ZIP version, after getting the first entry (a ZipEntry), we can extract its stream from the compressed root folder, testx.tar.gz, and then create a new ZipInputStream on it to read the content.
Is it possible to do this with the tar.gz file?
Thanks.

Surfing the net, I came across an interesting idea at: http://hype-free.blogspot.com/2009/10/using-tarinputstream-from-java.html.
After converting our TarEntry to a stream, we can adopt the same idea used with ZIP files, like:
// StreamingTarEntry is the helper class from the blog post above
InputStream tmpIn = new StreamingTarEntry(is, entryx.getSize());
// use BufferedReader to get one line at a time
BufferedReader gzipReader = new BufferedReader(
new InputStreamReader(
new GZIPInputStream(
tmpIn)));
String line;
// readLine() returns null at end of stream; ready() is not a reliable end-of-data check
while ((line = gzipReader.readLine()) != null) { System.out.println(line); }
gzipReader.close();
So with this code you can print the content of the file testx.tar.gz ^_^

To not have to write to a File you should use a ByteArrayOutputStream and use the public String toString(String charsetName)
with the correct encoding.
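A minimal sketch of that idea using only the JDK: the bytes of the nested .gz entry are assumed to have been collected already, for example by passing a ByteArrayOutputStream to copyEntryContents, and are then decompressed in memory and turned into a String. The class name GzipToString and both helper methods are made up for illustration. The gzip helper exists only so the demo has some compressed input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipToString {

    // Decompresses gzip data held in memory and decodes it with the given charset.
    public static String gunzipToString(byte[] gzBytes, Charset charset) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzBytes))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return buffer.toString(charset.name());
        }
    }

    // Demo helper: compresses text so there is something to decompress.
    public static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(text.getBytes(charset));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes of the C.txt.gz entry read from the tar stream.
        byte[] entryBytes = gzip("Hello America!!", StandardCharsets.UTF_8);
        System.out.println(gunzipToString(entryBytes, StandardCharsets.UTF_8));
    }
}
```

Working on a byte array also avoids closing the surrounding tar stream by accident, which can happen when a GZIPInputStream wrapped directly around it is closed.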
