Some parts of String missing - eclipse bug - java

I am trying to read all the words as a String from the url - http://www.puzzlers.org/pub/wordlists/unixdict.txt
But the output String has parts missing wherever there is a '
Why am I getting this?
How can I avoid it?
I get the same error when using StringBuilder instead of concatenation.
public static String getUrlContents(String theUrl) throws IOException {
URL url = new URL(theUrl);
URLConnection urlConnection = url.openConnection();
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
String line;
String text = "";
while ((line = bufferedReader.readLine()) != null) {
text += line + " ";
}
bufferedReader.close();
return text;
}
Output:
After ain' there is a huge blank, and then the text continues from 'd anyhow
and it continues after
So it's eating up the text between two consecutive ' characters.
It looks like the text is there: when I search for antony, which falls inside the blank, Eclipse highlights the word as seen below, but it's not visible on my screen :O

As Naruto already answered, the console has a buffer size, and content beyond that size is not visible. To check whether your string is correct, select the whole console content (Ctrl+A), copy it, and paste it into a Notepad file; you should see the complete content there.
So it's not a bug but the predefined size of the console, which you can change under Window > Preferences > Run/Debug > Console.
Another way is to enable the Fixed Width Console option there and set the maximum width, for example to 1000; you will then be able to see the content in the console.
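If the console is the culprit, as this answer suggests, the string can also be checked in code without relying on how the console renders it. The following is only a minimal sketch: the class name ConsoleWrapSketch, the helper wrap, the sample text, and the width of 10 are illustrative assumptions, not part of the original program.

```java
final class ConsoleWrapSketch {

    // Breaks text into lines of at most `width` characters so that no single
    // console line becomes extremely long.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < text.length(); start += width) {
            int end = Math.min(start + width, text.length());
            out.append(text, start, end).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the text returned by getUrlContents.
        String text = "ain't anthem antony any anyhow ";
        // These checks do not depend on how the console renders the text.
        System.out.println("length = " + text.length());
        System.out.println("contains antony = " + text.contains("antony"));
        System.out.print(wrap(text, 10));
    }
}
```

If the length and the contains check look right, the data is intact and only the display is at fault.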

Related

characters not appearing when I print when I import a file?

I'm importing a file into my code and trying to print it. the file contains
i don't like cake.
pizza is good.
i don’t like "cookies" to.
17.
29.
The second "don’t" has a right single quotation mark, and when I print it the output is
don�t
The question mark is printed as a blank square. Is there a way to convert it to a regular apostrophe?
EDIT:
public class Somethingsomething {
public static void main(String[] args) throws FileNotFoundException,
IOException {
ArrayList<String> list = new ArrayList<String>();
File file = new File("D:\\project1Test.txt");//D:\\project1Test.txt
if(file.exists()){//checks if file exist
FileInputStream fileStream = new FileInputStream(file);
InputStreamReader input = new InputStreamReader(fileStream);
BufferedReader reader = new BufferedReader(input);
String line;
while( (line = reader.readLine()) != null) {
list.add(line);
}
for(int i = 0; i < list.size(); i ++){
System.out.println(list.get(i));
}
}
}}
it should print as normal but the second "don't" has a white block on the apostrophe
this is the file I'm using https://www.mediafire.com/file/8rk7nwilpj7rn7s/project1Test.txt
edit: if it helps, the full document where the character comes from is here
https://www.nytimes.com/2018/03/25/business/economy/labor-professionals.html
It's all about character encoding. The same character can be represented by different bytes depending on the encoding, so text decoded with the wrong one gets misinterpreted.
Characters are stored as numbers, and which numbers depends on the encoding standard (there are many of them). For example, "a" is 97 (hex 61) in both ASCII and UTF-8, but the right single quotation mark ’ is the three bytes E2 80 99 in UTF-8 and the single byte 92 (hex) in windows-1252.
Now when you see funny characters such as the question mark in this case (it is called the replacement character), it usually means that text in one encoding is being interpreted as another; the replacement character stands in for the bytes that could not be decoded.
To fix your problem you need to tell your reader to read your file using a specific character encoding, say SOME-CHARSET.
Replace this:
InputStreamReader input = new InputStreamReader(fileStream);
with this:
InputStreamReader input = new InputStreamReader(fileStream, "SOME-CHARSET");
A list of charsets is available here. Unfortunately, you might have to try them one by one. A short list of the most common ones can be found here.
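As a rough illustration of the fix above, the sketch below decodes the same bytes with two charsets and then converts the typographic apostrophe to a plain one, which is what the question asked for. The class and method names are made up for this example, an in-memory byte array stands in for the file, and UTF-8 is only an assumption about how the file was saved.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetReadSketch {

    // Reads the first line of the stream, decoding bytes with the given charset.
    static String readFirstLine(InputStream in, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        }
    }

    // Replaces the right single quotation mark (U+2019) with a plain ASCII apostrophe.
    static String plainApostrophes(String s) {
        return s.replace('\u2019', '\'');
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file: one line containing U+2019, encoded as UTF-8.
        byte[] fileBytes = "i don\u2019t like cake.".getBytes(StandardCharsets.UTF_8);

        String right = readFirstLine(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8);
        String wrong = readFirstLine(new ByteArrayInputStream(fileBytes), StandardCharsets.US_ASCII);

        // Decoded with the matching charset, the apostrophe survives and can be normalized.
        System.out.println(plainApostrophes(right));
        // Decoded with the wrong charset, the bytes turn into replacement characters.
        System.out.println(wrong.contains("\uFFFD"));
    }
}
```

With a real file, the ByteArrayInputStream would be replaced by the FileInputStream from the question.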
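Your problem is almost certainly the encoding scheme you are using. You can read a file in almost any encoding scheme you want. Just tell Java how your input was encoded. UTF-8 is common on Linux. On Windows the native encoding depends on the locale; for Western European locales it is windows-1252 (CP-1252), while CP-1250 is the Central European one.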
This is the sort of problem you have all the time if you are processing files created on a different OS.
See here and Here
I'll give you a different approach...
Use the appropriate means for reading plain text files. Try this:
public static String getTxtContent(String path)
{
try(BufferedReader br = new BufferedReader(new FileReader(path)))
{
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
return sb.toString();
}catch(IOException fex){ return null; }
}

How to neglect blank spaces in a string search in html file using java?

I have a html file which I have to search line by line and look for a particular string and then take some actions accordingly.
The problem is that the string is compared against the entire line, for each line of the html file.
So if there are some spaces before the actual string in a given line, the match turns out to be false, even though it should be positive.
package read_txt;
import java.io.*;
class FileRead
{
public static void main(String args[])
{
try{
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream("textfile.html");
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
//String a = "media query";
switch (strLine) {
case "#media query" :
System.out.println("media query found");
System.out.println("html file responsive");
break;
// default :
// System.out.println("html file unresponsive");
//break;
}
}
//Close the input stream
in.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
In my code above, I am searching for a String "media query". Now suppose this is the html file being searched :
The code works fine for this html file, but now suppose we have this html file :
The string match does not work although a media query string is present, but if I change the matched string to " media query" instead of "media query", it works again.
Any idea how I can ignore the blank spaces occurring before any text in a line?
In this case, I would think that using "switch" is not the right way to go.
You might use
if (strLine.contains("media query"))
but that will fail if the line has "media  query" (two spaces instead of one).
So, your best bet might be to use a regular expression.
You could use endsWith, e.g.
if (strLine.endsWith("media query")) { ...
In cases, where the searched string could be somewhere in the middle of line you could use indexOf, e.g.
if (strLine.indexOf("media query") >= 0) { ...
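The suggestions in these answers can be sketched as follows: trimming the line before an exact comparison handles leading spaces, and a regular expression tolerates extra whitespace between the two words. The class and method names are invented for this example, and the pattern is just one possible choice.

```java
import java.util.regex.Pattern;

final class LineMatchSketch {

    // "media", one or more whitespace characters, then "query".
    private static final Pattern MEDIA_QUERY = Pattern.compile("media\\s+query");

    // Exact match that ignores leading and trailing whitespace.
    static boolean equalsIgnoringIndent(String line, String expected) {
        return line.trim().equals(expected);
    }

    // Match anywhere in the line, tolerating any run of whitespace between the words.
    static boolean containsMediaQuery(String line) {
        return MEDIA_QUERY.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoringIndent("    media query", "media query")); // true
        System.out.println(containsMediaQuery("<p>media   query</p>"));            // true
        System.out.println(containsMediaQuery("<p>mediaquery</p>"));               // false
    }
}
```

In the original loop, switch (strLine.trim()) would be the smallest change that keeps the switch statement.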

Java print a word from text file when user enters word number

I'm trying to write a Java application that reads a text file. Suppose I have a text file beg.txt which contains text:
I am a beginner
When the user enters word number 4, the program has to print word 'beginner'.
How can I do this in Java, please?
First give a try before asking this.
Just to help you, try the following steps (this is not the only way):
1. Read your file
2. Split the string into a string array using space
3. Print array[your choice - 1]
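BufferedReader br = null;
String[] str = new String[0]; // stays empty if the file cannot be read
try {
    String sCurrentLine;
    StringBuilder sb = new StringBuilder();
    br = new BufferedReader(new FileReader("C:\\testing.txt"));
    while ((sCurrentLine = br.readLine()) != null) {
        // add a space so the last word of one line does not merge with the next line
        sb.append(sCurrentLine).append(" ");
    }
    br.close();
    // split on any run of whitespace
    str = sb.toString().trim().split("\\s+");
} catch (IOException e) {
    e.printStackTrace();
}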
If the user enters 4, you can use the array 'str' like this:
String result = str[userEnteredValue - 1];
Note: the above code works only when the file contains space-delimited words.
File read=new File("D:\\Test.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(read),Charset.forName("UTF-8")));
String news = reader.readLine();
String[] records = news.split(" ");
If your input is 4, take records[3], because array indices start at 0.
Well, the basic process will be something like the following:
Load the text file
Get user input
Process text file with parameters from user
Step 1 will depend on which version of Java you're using. If Java 7, I'd look at nio2. Java 6 has other options. Or you could use Guava or Apache Commons. Since the processing required is minimal, I would store the output of this step as a simple String.
Getting the user input can be done in a number of ways, but one option is to use a Scanner.
Finally, processing the file can be done by using String.split() with a simple regex and then picking the correct element from the resulting array.
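The three steps above might look roughly like the sketch below. The names WordPickerSketch, wordAt, and load are hypothetical, UTF-8 is an assumed file encoding, and main uses a fixed sample instead of real user input so the example runs without a file or a console prompt.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WordPickerSketch {

    // Returns the n-th word (1-based) of text, or null if n is out of range.
    static String wordAt(String text, int n) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        String[] words = trimmed.split("\\s+"); // split on any run of whitespace
        return (n >= 1 && n <= words.length) ? words[n - 1] : null;
    }

    // Step 1: load the whole file as a single String (UTF-8 is an assumption).
    static String load(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Steps 2 and 3 with a fixed sample in place of the file and the user's number.
        System.out.println(wordAt("I am a beginner", 4)); // beginner
    }
}
```

In a real program the number would come from something like new Scanner(System.in).nextInt(), and the text from load with the path to beg.txt.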

Java Loses International Characters in Stream

I am having trouble reading international characters in Java.
The default character set being used is UTF-8 and my Eclipse workspace is also set to this.
I am reading the title of a video from the Internet (Gangnam Style in fact ;) ) which contains Korean characters. I am doing this as follows:
BufferedReader stdInput = new BufferedReader(new InputStreamReader(shellCommand.getInputStream()));
String fileName = null, output = null;
while ((output = stdInput.readLine()) != null) {
if (output.indexOf("Destination") > 0) {
System.out.println(output);
I know that the title it will read is: "PSY - GANGNAM STYLE (강남스타일) M/V", but the console displays the following instead: "PSY - GANGNAM STYLE () M V" which causes errors further along in my program.
It seems like the InputStream Reader isn't reading these characters correctly.
Does anyone have any ideas? I've spent the last hour scouring the Internet and haven't found any answers. Thanks in advance everyone.
You wrote: "The default character set being used is UTF-8".
The default where? In Java itself, or in the video? It would be much clearer if you specified this explicitly. You should check that it's correct for the video data too.
You also wrote: "It seems like the InputStream Reader isn't reading these characters correctly."
Well, all we know is that the text isn't showing properly on the console. Either it isn't being read correctly, or it's not being displayed correctly. You should print out each character's Unicode value so you can check the exact content of the string. For example:
static void logCharacters(String text) {
for (int i = 0; i < text.length(); i++) {
char c = text.charAt(i);
System.out.println(c + " " + Integer.toHexString(c));
}
}
You need to check the default charset using Charset.defaultCharset().name(); if it is not UTF-8, pass the charset explicitly:
InputStreamReader in = new InputStreamReader(shellCommand.getInputStream(), "UTF-8");
I tried a sample program and it prints correctly in Eclipse. It might be a problem with the Windows console, as AlexR has pointed out.
byte[] bytes = "PSY - GANGNAM STYLE (강남스타일) M/V".getBytes();
InputStreamReader reader = new InputStreamReader(new ByteArrayInputStream(bytes));
BufferedReader bufferedReader = new BufferedReader(reader);
String str = bufferedReader.readLine();
System.out.println(str);
Output:
PSY - GANGNAM STYLE (강남스타일) M/V
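Note that the sample above encodes and decodes with the same default charset, so it round-trips by construction. A small sketch with both charsets made explicit shows what happens when they disagree; the class and method names are invented for illustration, and ISO-8859-1 is just one example of a mismatched charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchSketch {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when a reader assumes the wrong encoding.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // The Korean part of the title is written with Unicode escapes so the
        // example does not depend on the encoding of the source file.
        String title = "PSY - GANGNAM STYLE (\uAC15\uB0A8\uC2A4\uD0C0\uC77C) M/V";

        String same = roundTrip(title, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = roundTrip(title, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(same.equals(title));  // true: matching charsets preserve the text
        System.out.println(mixed.equals(title)); // false: mismatched charsets garble it
    }
}
```

Combined with logCharacters from the earlier answer, this helps tell a decoding problem apart from a console display problem.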

How to replace multiple occurrences of a string in a text file with a variable entered by the user and save all to a new file?

public static void main(String args[])
{
try
{
File file = new File("input.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "000000", oldtext = "414141";
while((line = reader.readLine()) != null)
{
oldtext += line + "\r\n";
}
reader.close();
// replace a word in a file
//String newtext = oldtext.replaceAll("drink", "Love");
//To replace a line in a file
String newtext = oldtext.replaceAll("This is test string 20000", "blah blah blah");
FileWriter writer = new FileWriter("input.txt");
writer.write(newtext);writer.close();
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
}
}
A couple suggestions on your sample code:
Have the user pass in old and new on the command line (i.e., args[0] and args[1]).
If it's sufficient to do this a line at a time, it's going to be much more efficient to read a line, replace old -> new, then stream it out.
Also check out StringUtils and IOUtils, which may make your life easier in this case.
Easiest is the String.replace(oldstring, newstring), or String.replaceAll(regex, newString) function, you can just read the one file and write the replacement into a new file (or do it line by line if you're concerned about file size).
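A line-by-line version of this idea might look like the sketch below. The class and method names and the sample strings are hypothetical, UTF-8 is an assumed encoding, and main works on temporary files so that it can be run as is. It uses String.replace, which treats the search text literally, whereas String.replaceAll interprets its first argument as a regular expression.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReplaceInFileSketch {

    // Copies source to target line by line, replacing every literal occurrence
    // of oldText with newText.
    static void replaceInFile(Path source, Path target, String oldText, String newText)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(oldText, newText));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for input.txt and the new output file.
        Path source = Files.createTempFile("input", ".txt");
        Path target = Files.createTempFile("output", ".txt");
        Files.write(source, "a red b\nred red\n".getBytes(StandardCharsets.UTF_8));

        // In a real program the two strings would come from the user, e.g. args[0] and args[1].
        replaceInFile(source, target, "red", "#ff0000");
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Writing to a separate target file avoids truncating the input while it is still being read.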
After reading your last comment - that's a totally different story... the preferred solution would be to parse the css file into an object model (like DOM), apply the changes there and serialize the model to css afterwards. It's much easier to find all color attributes in DOM and change them compared to doing the same with search and replace.
I've found some CSS parsers on the web, but none of them looked capable of writing CSS files.
If you wanted to replace the color names with search and replace, you'd search for 'color:<colorname>' and replace it with 'color:<yourHexColorValue>'. You may have to do the same for 'color:"<colorname>"', because the color name can be set in double quotes (another argument for using a CSS parser).
String.replaceAll() is the easiest way to do it. Just read the complete CSS file into one String, replace all as suggested above and write the new String to the same (or a temporary) file (first).
