BufferedReader method readLine() is converting en-dash ("\u2013") to hyphen ("\u002D")

I have folders in an SVN repository whose names contain an en-dash ("\u2013").
I first call "svn list" (on Windows 7, with UTF-8 encoding) to get the directory listing.
I then read the text of that listing with BufferedReader.readLine().
The folder names that come back contain a hyphen ("\u002D") instead of the en-dash ("\u2013").
Is there a known limitation here?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
class Test {
    public static void main(String args[]) {
        BufferedReader br = null;
        try {
            String sCurrentLine;
            // FileReader decodes with the platform default encoding
            br = new BufferedReader(new FileReader("C:\\test–ing.xml"));
            System.out.println(br.readLine());
            while ((sCurrentLine = br.readLine()) != null) {
                System.out.println(sCurrentLine);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (br != null)
                    br.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    } // end main
} // end class Test

This is probably the problem:
br = new BufferedReader(new FileReader("C:\\test–ing.xml"));
That will use the platform default encoding. You've said that the file is UTF-8-encoded - so you need to specify that you want UTF-8, which means avoiding FileReader's broken API:
br = new BufferedReader(new InputStreamReader(
new FileInputStream("C:\\test–ing.xml"), "UTF-8"));
That's assuming the file really is valid UTF-8 containing the expected character. You should check that before doing anything else.
Alternatively, given that this is XML, I assume in your real code you're going to use it as XML? If so, I would just load it straight from an input stream, and let the XML parser handle the encoding.
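To illustrate that last suggestion, here is a minimal sketch, assuming the JAXP parser that ships with the JDK. The class name XmlEncodingSketch, the helper rootText and the sample <name> element are made up for illustration. The point is only that the parser receives raw bytes and works out the encoding itself, so no Reader (and no default charset) is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlEncodingSketch {

    // Parses XML straight from a byte stream. The parser reads the encoding
    // from the XML declaration, so the platform default charset never applies.
    static String rootText(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document containing an en-dash (U+2013).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>test\u2013ing</name>";
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);

        String text = rootText(new ByteArrayInputStream(utf8));
        // Check that the en-dash survived parsing.
        System.out.println(text.indexOf('\u2013') >= 0);
    }
}
```

For a real file you would pass a FileInputStream instead of the in-memory stream used here.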

Related

How to display ðŸ”´ in JLabel or java

I am trying to figure out how to display ðŸ”´ in Java. I can already display a character when I have its Unicode escape, so if I could convert ðŸ”´ to Unicode I would be set. I thought about building a big lookup table, but I figure checking against it would add a lot of overhead.
I am trying to show 🔴, but the API call gives me ðŸ”´. My question is: how do I turn ðŸ”´ into the red circle, other than by looking up the Unicode string for it?
This is what the symbol looks like: 🔴
Code to format the JSON:
private static String getStringFromInputStream(InputStream is) {
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
String line;
try {
br = new BufferedReader(new InputStreamReader(is));
while ((line = br.readLine()) != null) {
sb.append(line);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return sb.toString();
}

It looks like you're viewing the characters in an editor that isn't set to the correct character encoding.
To use that symbol in a String in Java, try:
String text = "\uD83D\uDD34";
JLabel label = new JLabel(text);
Source of the Unicode escapes: https://www.htmlsymbols.xyz/unicode/U+1F534
(Your font may still lack the character, in which case it will probably show up as a question mark. That is still better than four strange accented characters.)

One problem remains: new InputStreamReader(is) uses the default encoding of the platform the program runs on. I would pass the stream's actual encoding explicitly: new InputStreamReader(is, "UTF-8")
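To see where the four strange characters come from, here is a small sketch. It assumes the wrong decoder was windows-1252 (a common Windows default; the question does not say which charset was actually in use). The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeSketch {

    // Assumption: the bytes were decoded as windows-1252 instead of UTF-8.
    static final Charset WRONG = Charset.forName("windows-1252");

    // Encodes as UTF-8, then decodes with the wrong charset: reproduces the garbling.
    static String garble(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), WRONG);
    }

    // Reverses the garbling. This only works if every byte survived the wrong decode.
    static String repair(String garbled) {
        return new String(garbled.getBytes(WRONG), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String circle = "\uD83D\uDD34"; // U+1F534 written as a surrogate pair
        String garbled = garble(circle);

        // One char per UTF-8 byte of the original character.
        System.out.println(garbled.length());
        System.out.println(repair(garbled).equals(circle));
    }
}
```

Repairing after the fact is fragile, so decoding with the right charset at the point where the stream is read remains the better fix.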

BufferedReader creating odd characters, which character encoding do I use?

I am trying to take multiple text files and merge them all into one new file. However, looking at the new file that was created, some weird characters have replaced the quotation marks, and I can't figure out why or how this is occurring. I tried specifying the encoding, but it did not solve the problem. Am I using the wrong character encoding?
Reader reader = new InputStreamReader(new FileInputStream(fileName), "utf-8");
Here is the issue:
Original file contains:
|3_PatFemale("X")|3_PatSex (”M” or “F”)|
New file contains
|3_PatFemale("X")|3_PatSex (�M� or �F�)|
code:
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(exportFile),"UTF-8"));
for (File f : files) {
FileInputStream fis;
try {
fis = new FileInputStream(f);
BufferedReader in = new BufferedReader(new InputStreamReader(fis));
String aLine;
while ((aLine = in.readLine()) != null) {
out.write(aLine);
out.newLine();
}
in.close();
} catch (IOException e) {
e.printStackTrace();
}
}

Change the contents of the file from this
|3_PatFemale("X")|3_PatSex (”M” or “F”)|
to
|3_PatFemale("X")|3_PatSex ("M" or "F")|
The quotation mark you are using,
”
(a right double quotation mark, U+201D)
is different from the plain ASCII quotation mark
"
For further reference: https://askleo.com/why_do_i_get_odd_characters_instead_of_quotes_in_my_documents/
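If editing the source files is not an option, another approach is to decode each input with the charset it was actually saved in. The sketch below assumes that charset is windows-1252, which is only a guess based on the curly quotes turning into replacement characters; the question does not confirm it. MergeSketch and merge are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class MergeSketch {

    // Copies every line of each input into one UTF-8 output file.
    // sourceCharset must match how the inputs were really saved.
    static void merge(List<Path> inputs, Charset sourceCharset, Path output) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                try (BufferedReader in = Files.newBufferedReader(input, sourceCharset)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        out.write(line);
                        out.newLine();
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Charset cp1252 = Charset.forName("windows-1252"); // assumed source encoding
        Path part = Files.createTempFile("part", ".txt");
        Path merged = Files.createTempFile("merged", ".txt");

        // Sample line with curly quotes, saved the way the inputs are assumed to be.
        Files.write(part, "|3_PatSex (\u201DM\u201D or \u201CF\u201D)|".getBytes(cp1252));

        merge(Arrays.asList(part), cp1252, merged);
        System.out.println(new String(Files.readAllBytes(merged), StandardCharsets.UTF_8));
    }
}
```

Unlike InputStreamReader, Files.newBufferedReader throws an exception on bytes that are invalid for the given charset instead of silently substituting a replacement character, which makes a wrong guess easier to notice.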

Java fileinput (read and write) to a file looping thru each line

I have this method that accesses an existing file, loops through each line, and replaces a certain line (string for string) if a condition is met:
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStreamReader;
private void UpdateConfig() {
try {
FileInputStream fstream = new FileInputStream("c:\\user\\config.properties");
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
while ((strLine = br.readLine()) != null) {
if (strLine.contains("FTPDate=2014/07/01 00:59:00")) {
System.out.println("FILE " + strLine);
strLine.replace("FTPDate=2014/07/01 00:59:00", "FTPDate=2014/09/10 00:00:00");
//strLine.replace("((19|20)\\d\\d/(0?[1-9]|1[012])/(0?[1-9]|[12][0-9]|3[01])) ([2][0-3]|[0-1][0-9]|[1-9]):[0-5][0-9]:([0-5][0-9]|[6][0])", "2014/09/10 00:00:00");
System.out.println("FILE " + strLine);
}
}
in.close();
} catch (Exception e) {
}
}
In the sysout it seems it's being replaced:
FILE FTPDateTejas=2014/07/01 00:59:00
FILE FTPDateTejas=2014/09/10 00:00:00
But when I check the file, the date stays the same. Am I missing something? Does anyone know what I missed? Thank you.

When you do:
strLine = br.readLine()
it loads the next line from the BufferedReader into memory. You now have your data both on disk and in memory, and the two copies are not linked to each other in any way. To modify strLine, I believe what you meant to have in your code is:
strLine = strLine.replace("FTPDate=2014/07/01 00:59:00", "FTPDate=2014/09/10 00:00:00");
because replace doesn't modify the object on which it is called; it returns a new String object (Strings are immutable). Even then, this only creates a new object in memory and does not modify your on-disk data (as I said, the two are no longer linked).
You might think, "OK, then how do I link the two and overwrite the file in place?" Java does provide random file access (RandomAccessFile), as described in the docs, but all you can do with it is overwrite data at a given position; you cannot insert anything in the middle. So you would have to read the rest of the file, make your modification, and then write that rest back, shifting it whenever the replacement string is shorter or longer than the text it replaces.
That's why an easier solution is to:
open a new file to write to
write to it line by line (the strings after the replace)
delete the old file and rename the new file
Without the final step of swapping the files, the code would look something like this:
private void UpdateConfig() {
    File fstream = new File("c:\\user\\config.properties");
    File file = new File("c:\\user\\config.properties-new");
    // FileWriter creates the new file if it does not exist yet
    try (BufferedReader br = new BufferedReader(new FileReader(fstream));
         BufferedWriter bw = new BufferedWriter(new FileWriter(file))) {
        String strLine;
        while ((strLine = br.readLine()) != null) {
            if (strLine.contains("FTPDate=2014/07/01 00:59:00")) {
                System.out.println("FILE " + strLine);
                strLine = strLine.replace("FTPDate=2014/07/01 00:59:00",
                        "FTPDate=2014/09/10 00:00:00");
                //strLine.replace("((19|20)\\d\\d/(0?[1-9]|1[012])/(0?[1-9]|[12][0-9]|3[01])) ([2][0-3]|[0-1][0-9]|[1-9]):[0-5][0-9]:([0-5][0-9]|[6][0])", "2014/09/10 00:00:00");
                System.out.println("FILE " + strLine);
            }
            // write every line, changed or not, so the rest of the file is kept
            bw.write(strLine);
            bw.newLine();
        }
    } catch (IOException e) {
        // handle
    }
    // copy files here (both streams are closed at this point)
}
There might be some logical/syntactic problems, as I wrote this in a plain text editor. I modified the code a bit to use Java 7's try-with-resources, which is a cleaner way of closing resources than what you were doing: in your code, if an exception were thrown, the stream might not have been closed.
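For completeness, here is one way the whole "write a new file, then swap it in" approach could look, including the swap step. Treat it as a sketch under assumptions: the names ReplaceInFileSketch and replaceInFile are made up, and ISO-8859-1 is used because it is the traditional encoding of .properties files and maps every byte to a character, so nothing is lost on the round trip.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ReplaceInFileSketch {

    // Rewrites 'file' with every occurrence of 'target' replaced by 'replacement'.
    static void replaceInFile(Path file, String target, String replacement) throws IOException {
        // Temp file in the same directory, so the final move stays on one file system.
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "rewrite", ".tmp");

        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = in.readLine()) != null) {
                // replace() returns a new String; write that result, not the original.
                out.write(line.replace(target, replacement));
                out.newLine();
            }
        }

        // Both streams are closed here, so the old file can be replaced.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path config = Files.createTempFile("config", ".properties");
        Files.write(config, "FTPDate=2014/07/01 00:59:00\n".getBytes(StandardCharsets.ISO_8859_1));

        replaceInFile(config, "FTPDate=2014/07/01 00:59:00", "FTPDate=2014/09/10 00:00:00");
        System.out.println(Files.readAllLines(config, StandardCharsets.ISO_8859_1));
    }
}
```

Note that newLine() writes the platform line separator, so line endings may change if the file came from another operating system.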

Parsed strings from a .csv file are invalid tokens in a KML file. How can I solve this?

I have code that parses strings from a CSV file (with Twitter data) and writes them to a new KML file. When I parse the comments from the Twitter data, there are of course unknown tokens such as ðŸš¨. When I open the new KML file in Google Earth, I get an error because of these unknown tokens.
Question:
When I parse the strings, can I tell Java to throw out all unknown tokens so that none of them end up in my KML?
Thank you
Code below:
String csvFile = "twitter.csv";
BufferedReader br = null;
String line = "";
String cvsSplitBy = ";";
String[] twitter = null;
int row_desired = 0;
int row_counter = 0;
String[] placemarks = new String[1165];
// read the CSV from here on
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
if (row_counter++ == row_desired) {
twitter = line.split(cvsSplitBy);
placemarks[row_counter] =
"<Placemark>\n"+
"<name>User ID: "+twitter[7]+"</name>\n"+
"<description>This User wrote: "+twitter[5]+" at the: "+twitter[6]+"</description>\n"+
"<Point>\n"+
"<coordinates>"+twitter[1]+","+twitter[2]+"</coordinates>\n"+
"</Point>\n"+
"</Placemark>\n";
row_desired++;
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
for(int i = 2; i <= 1164;i++){
String kml2 = kml.concat(""+placemarks[i]+"");
kml=kml2;
}
kml = kml.concat("</Document></kml>");
FileWriter fileWriter = new FileWriter(filepath);
fileWriter.write(kml);
fileWriter.close();
Runtime.getRuntime().exec(googlefilepath + filepath);
}

Text files are not all built equal: you must always consider which character encoding is in use. I'm not sure about Twitter's data specifically, but I would guess they do what the rest of the world does and use UTF-8.
Basically, avoid FileReader and instead use the constructor of InputStreamReader which lets you specify the Charset.
Tip: if you're using Java 7+, try this:
for (String line : Files.readAllLines(file.toPath(), Charset.forName("UTF-8"))) { ...
More Info
The javadoc of FileReader states: "The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate."
You should avoid this class, always, or at least for any data that might ever be transferred between computers. Even a program running on Windows "using the default charset" may assume UTF-8 when run from inside Eclipse, or ISO_8859_1 when run outside Eclipse! Such non-determinism from a class is not good.
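The same reasoning applies on the output side: FileWriter also uses the default encoding. Below is a rough sketch of a CSV-to-KML conversion that fixes UTF-8 at both ends. The column layout (column 0 as the name, column 1 as the tweet text) and the names CsvToKmlSketch, convert and escapeXml are hypothetical and are not taken from the question's code. The XML escaping is an extra precaution of mine, since tweet text containing & or < would also make the KML invalid.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CsvToKmlSketch {

    // Escapes the characters that are special inside XML text content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Reads a semicolon-separated CSV as UTF-8 and writes one Placemark per row as UTF-8.
    // Assumed layout (hypothetical): column 0 = name, column 1 = tweet text.
    static void convert(Path csv, Path kml) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(kml, StandardCharsets.UTF_8)) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write("<kml xmlns=\"http://www.opengis.net/kml/2.2\"><Document>\n");
            for (String line : Files.readAllLines(csv, StandardCharsets.UTF_8)) {
                String[] cols = line.split(";");
                if (cols.length < 2) {
                    continue; // skip rows that do not have both columns
                }
                out.write("<Placemark><name>" + escapeXml(cols[0]) + "</name>"
                        + "<description>" + escapeXml(cols[1]) + "</description></Placemark>\n");
            }
            out.write("</Document></kml>\n");
        }
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("twitter", ".csv");
        Path kml = Files.createTempFile("twitter", ".kml");
        // \uD83D\uDEA8 is U+1F6A8, the character that showed up garbled in the question.
        Files.write(csv, "user1;Alert \uD83D\uDEA8 & more\n".getBytes(StandardCharsets.UTF_8));

        convert(csv, kml);
        System.out.println(new String(Files.readAllBytes(kml), StandardCharsets.UTF_8));
    }
}
```

With the encoding fixed on both sides, the emoji is written as a normal character rather than as several stray ones, so there should be nothing left to throw out.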

Reading and displaying data from a .txt file

How do you read and display data from .txt files?

BufferedReader in = new BufferedReader(new FileReader("<Filename>"));
Then, you can use in.readLine(); to read a single line at a time. To read until the end, write a while loop as such:
String line;
while((line = in.readLine()) != null)
{
System.out.println(line);
}
in.close();

If your file is strictly text, I prefer to use the java.util.Scanner class.
You can create a Scanner out of a file by:
Scanner fileIn = new Scanner(new File(thePathToYourFile));
Then, you can read text from the file using the methods:
fileIn.nextLine(); // Reads one line from the file
fileIn.next(); // Reads one word from the file
And, you can check if there is any more text left with:
fileIn.hasNext(); // Returns true if there is another word in the file
fileIn.hasNextLine(); // Returns true if there is another line to read from the file
Once you have read the text, and saved it into a String, you can print the string to the command line with:
System.out.print(aString);
System.out.println(aString);
The posted link contains the full specification for the Scanner class. It will help with whatever else you may want to do.
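One caveat: new Scanner(File) decodes with the platform default charset. Scanner also has a constructor that takes a charset name, so a sketch that pins the encoding could look like this (UTF-8 is an assumption about the file; ScannerSketch and readLines are made-up names):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerSketch {

    // Reads all lines of a text file, decoding it as UTF-8 (assumed encoding).
    static List<String> readLines(File file) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the Scanner and the underlying file.
        try (Scanner fileIn = new Scanner(file, "UTF-8")) {
            while (fileIn.hasNextLine()) {
                lines.add(fileIn.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // The path comes from the command line; there is no default.
        for (String line : readLines(new File(args[0]))) {
            System.out.println(line);
        }
    }
}
```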

In general:
Create a FileInputStream for the file.
Create an InputStreamReader wrapping the input stream, specifying the correct encoding
Optionally create a BufferedReader around the InputStreamReader, which makes it simpler to read a line at a time.
Read until there's no more data (e.g. readLine returns null)
Display data as you go or buffer it up for later.
If you need more help than that, please be more specific in your question.
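The steps above can be sketched as follows, assuming the file is UTF-8 (substitute the real encoding if it differs). ReadStepsSketch and readAll are illustrative names only.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReadStepsSketch {

    // Reads the whole file into one String, normalizing line ends to '\n'.
    static String readAll(String fileName) throws IOException {
        StringBuilder buffered = new StringBuilder();
        try (InputStream raw = new FileInputStream(fileName);                        // step 1: bytes
             Reader decoded = new InputStreamReader(raw, StandardCharsets.UTF_8);    // step 2: bytes to chars
             BufferedReader lines = new BufferedReader(decoded)) {                   // step 3: line access
            String line;
            while ((line = lines.readLine()) != null) {                              // step 4: until null
                buffered.append(line).append('\n');                                  // step 5: buffer for later
            }
        }
        return buffered.toString();
    }

    public static void main(String[] args) throws IOException {
        // The path comes from the command line; there is no default.
        System.out.print(readAll(args[0]));
    }
}
```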

I love this piece of code, use it to load a file into one String:
File file = new File("/my/location");
String contents = new Scanner(file).useDelimiter("\\Z").next();

Below is code you can try for reading a file and displaying it in Java. It uses the Scanner class to read the file name from the user, then prints the contents of that file with a BufferedReader (this works for plain-text files such as those written in Notepad or Vim).
import java.io.*;
import java.util.Scanner;
public class TestRead
{
public static void main(String[] input)
{
String fname;
Scanner scan = new Scanner(System.in);
/* enter filename with extension to open and read its content */
System.out.print("Enter File Name to Open (with extension like file.txt) : ");
fname = scan.nextLine();
/* this will reference only one line at a time */
String line = null;
try
{
/* FileReader reads text files in the default encoding */
FileReader fileReader = new FileReader(fname);
/* always wrap the FileReader in BufferedReader */
BufferedReader bufferedReader = new BufferedReader(fileReader);
while((line = bufferedReader.readLine()) != null)
{
System.out.println(line);
}
/* always close the file after use */
bufferedReader.close();
}
catch(IOException ex)
{
System.out.println("Error reading file named '" + fname + "'");
}
}
}

If you want to take some shortcuts you can use Apache Commons IO:
import org.apache.commons.io.FileUtils;
String data = FileUtils.readFileToString(new File("..."), "UTF-8");
System.out.println(data);
:-)

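import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
public class PassdataintoFile {
public static void main(String[] args) throws IOException {
try {
// open a single writer with an explicit encoding, and close it when done
PrintWriter pw = new PrintWriter("C:/new/hello.txt", "UTF-8");
pw.println("Hi chinni");
pw.print("you successfully entered text into file");
pw.close();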
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
BufferedReader br = new BufferedReader(new FileReader("C:/new/hello.txt"));
String line;
while((line = br.readLine())!= null)
{
System.out.println(line);
}
br.close();
}
}

In Java 7 and later, you can read a whole file simply with:
public String read(String file) throws IOException {
return new String(Files.readAllBytes(Paths.get(file)));
}
or, if it's a resource (this uses Guava's Resources and Charsets classes):
public String read(String file) throws IOException {
URL url = Resources.getResource(file);
return Resources.toString(url, Charsets.UTF_8);
}
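Note that new String(byte[]) decodes with the platform default charset. A variant that names the encoding explicitly might look like this (a sketch assuming the file is UTF-8; ReadWholeFileSketch is a made-up class name):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReadWholeFileSketch {

    // Reads the whole file and decodes it as UTF-8 rather than the platform default.
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // The path comes from the command line; there is no default.
        System.out.print(read(Paths.get(args[0])));
    }
}
```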

You most likely will want to use the FileInputStream class:
int character;
StringBuffer buffer = new StringBuffer("");
FileInputStream inputStream = new FileInputStream(new File("/home/jessy/file.txt"));
while( (character = inputStream.read()) != -1)
buffer.append((char) character);
inputStream.close();
System.out.println(buffer);
You will also want to catch some of the exceptions thrown by the read() method and FileInputStream constructor, but those are implementation details specific to your project.
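One caveat with that loop: FileInputStream.read() returns a byte, not a character, so casting it to char only works for single-byte text such as ASCII. The sketch below (hypothetical names, UTF-8 assumed) contrasts the cast with decoding through an InputStreamReader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVsCharSketch {

    // Mirrors the loop above: every byte is cast straight to a char.
    static String castBytes(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8 first, so multi-byte characters stay intact.
    static String decodeUtf8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "caf\u00E9".getBytes(StandardCharsets.UTF_8); // the last letter takes two bytes

        System.out.println(castBytes(new ByteArrayInputStream(utf8)).length());  // one char per byte
        System.out.println(decodeUtf8(new ByteArrayInputStream(utf8)).length()); // one char per character
    }
}
```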
