There are a lot of posts on this, and every one of them suggests changing the content of the text file.
My requirement is different: I am parsing a C++ source file, and during parsing I may need to merge multiple lines together when I find a backslash at the end of a line.
Example:
char line[100]="hello join the multiple lines.\
Oh, dont ask me to edit CPP source file.";
How do I read this text from the xyz.cpp file and detect that a line has a backslash at the end?
I used FileInputReader to read line by line, but the backslash is missing when I get the line in Java.
I hope you will not suggest that I change my C++ source code to replace \ with \\.
Thanks in advance.
The backslash is an escape character in Java string and character literals. So if you want to match a real backslash (\), you have to write \\ in your source code.
You can use contains() or indexOf() for string literals.
Or read character by character and check the condition:
if (c == '\\')
Hope this helps!
At the simplest level, you can split the file data by newlines (data.split("\n"), where data is a String) and then check whether each line ends in a backslash (line.endsWith("\\"), where line is a String).
The following code loads the file and prints each line:
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
public class Test {
public static void main(String args[]) throws Exception{
Scanner scan = new Scanner(new File("file.txt"));
scan.useDelimiter("\\Z");
String content = scan.next();
String[] lines = content.split("\n");
for (String value : lines) {
System.out.println(value);
}
}
}
I created a file "file.txt" containing the following lines
line 1
line 2\
line 2cnt
line 3
The code will output
line 1
line 2\
line 2cnt
line 3
To join the lines you can run the following code
public static void main(String args[]) throws Exception{
Scanner scan = new Scanner(new File("file.txt"));
scan.useDelimiter("\\Z");
String content = scan.next();
String[] lines = content.split("\n");
for (String value : lines) {
if (value.endsWith("\\")) {
value = value.substring(0, value.length()-1);
System.out.print(value);
} else {
System.out.println(value);
}
}
}
which will output
line 1
line 2line 2cnt
line 3
Edited as per your comment:
public static void main(String args[]) throws Exception {
FileInputStream fileInputStream = new FileInputStream("file.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
}
}
outputs
line 1
line 2\
line 2cnt
line 3
and
public static void main(String args[]) throws Exception {
FileInputStream fileInputStream = new FileInputStream("file.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream));
String line;
while ((line = reader.readLine()) != null) {
if (line.endsWith("\\")) {
line= line.substring(0, line.length()-1);
System.out.print(line);
} else {
System.out.println(line);
}
}
}
outputs
line 1
line 2line 2cnt
line 3
Not sure why you don't see the backslash on your machine. Can you post the complete code and the file content? What platform are you running it on? What is the encoding of the file? Maybe you need to pass the encoding to the InputStreamReader like this:
BufferedReader reader = new BufferedReader(new InputStreamReader(fileInputStream, "UTF-8"));
It was a compilation issue in the Eclipse IDE. I restarted Eclipse and did a clean compile, and now everything works as expected. Thank you all for your time.
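For the original parsing requirement, printing is usually not enough; you want the merged lines back as data. Here is a minimal sketch of that, assuming the physical lines have already been read (for example with readLine() as above). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ContinuationJoiner {
    // Merges every physical line that ends in a backslash with the line that follows it.
    static List<String> joinContinuations(List<String> physicalLines) {
        List<String> logicalLines = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        for (String line : physicalLines) {
            if (line.endsWith("\\")) {
                // Drop the trailing backslash and keep accumulating.
                pending.append(line, 0, line.length() - 1);
            } else {
                pending.append(line);
                logicalLines.add(pending.toString());
                pending.setLength(0);
            }
        }
        // A backslash on the very last line leaves text pending; keep it.
        if (pending.length() > 0) {
            logicalLines.add(pending.toString());
        }
        return logicalLines;
    }
}
```

Note that if you split on "\n" yourself and the file has Windows line endings, each line ends in "\r" and endsWith("\\") will not match; readLine() strips the line terminator for you.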
Using command line, I am supposed to enter a file name that contains text and search for a specific word.
foobar file.txt
I started writing the following code:
import java.util.*;
import java.io.*;
class Find {
public static void main (String [] args) throws FileNotFoundException {
String word = args[0];
Scanner input = new Scanner (new File (args[1]) );
while (input.hasNext()) {
String x = input.nextLine();
}
}
}
My program is supposed to find word and then print the whole line that contains it.
Please be specific since I am new to java.
You are already reading in each line of the file, so using the String.contains() method will be your best solution
if (x.contains(word)) ...
The contains() method simply returns true if the given String contains the character sequence (or String) you pass to it.
Note: This check is case sensitive, so if you want to check if the word exists with any mix of capitalization, just convert the strings to the same case first:
if (x.toLowerCase().contains(word.toLowerCase())) ...
So now here is a complete example:
public static void main(String[] args) throws FileNotFoundException {
String word = args[0];
Scanner input = new Scanner(new File(args[1]));
// Let's loop through each line of the file
while (input.hasNextLine()) {
String line = input.nextLine();
// Now, check if this line contains our keyword. If it does, print the line
if (line.contains(word)) {
System.out.println(line);
}
}
}
First you have to open the file, then read it line by line and check whether the word is in each line. See the code below.
class Find {
public static void main (String [] args) throws IOException {
String word = args[0]; // the word you want to find
try (BufferedReader br = new BufferedReader(new FileReader(args[1]))) { // open the file named on the command line
String line;
while ((line = br.readLine()) != null) { //read file line by line in a loop
if(line.contains(word)) { // check if line contain that word then prints the line
System.out.println(line);
}
}
}
}
}
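One thing to be aware of: contains() also matches inside longer words, so searching for "foobar" will print a line containing "foobarbaz". If you only want whole words, a regex word boundary is one option. This is just a sketch; WordMatcher and containsWord are names I made up, and it assumes the search word starts and ends with a letter or digit (otherwise \b does not behave the way you would expect).

```java
import java.util.regex.Pattern;

class WordMatcher {
    // Returns true only if 'word' appears in 'line' as a whole word.
    static boolean containsWord(String line, String word) {
        // Pattern.quote keeps regex metacharacters in the word from being interpreted.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return pattern.matcher(line).find();
    }
}
```

In either of the loops above you would then call WordMatcher.containsWord(line, word) in place of line.contains(word).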
I thought this was only an issue with Python 2, but I have now run into a similar issue with Java (Windows 10, JDK8).
My searches have led to little resolution so far.
I read from 'stdin' input stream this value: Viļāni. When I print it to console I get this: Vi????ni.
Relevant code snippets are as follows:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));
ArrayList<String> corpus = new ArrayList<String>();
String inputString = null;
while ((inputString = in.readLine()) != null) {
corpus.add(inputString);
}
String[] allCorpus = new String[corpus.size()];
allCorpus = corpus.toArray(allCorpus);
for (String line : allCorpus) {
System.out.println(line);
}
Further expansion on my problem as follows:
I read a file containing the following 2 lines:
を
Sōten_Kōro
When I read this from disk and output to a second file I get the following output:
ã‚’
S�ten_K�ro
When I read the file from stdin using cat testinput.txt | java UTF8Tester I get the following output:
???
S??ten_K??ro
Both are obviously wrong. I need to be able to print the correct characters to console and file. My sample code is as follows:
public class UTF8Tester {
public static void main(String args[]) throws Exception {
BufferedReader stdinReader = new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));
String[] stdinData = readLines(stdinReader);
printToFile(stdinData, "stdin_out.txt");
BufferedReader fileReader = new BufferedReader(new FileReader("testinput.txt"));
String[] fileData = readLines(fileReader);
printToFile(fileData, "file_out.txt");
}
private static void printToFile(String[] data, String fileName)
throws FileNotFoundException, UnsupportedEncodingException {
PrintWriter writer = new PrintWriter(fileName, "UTF-8");
for (String line : data) {
writer.println(line);
}
writer.close();
}
private static String[] readLines(BufferedReader reader) throws IOException {
ArrayList<String> corpus = new ArrayList<String>();
String inputString = null;
while ((inputString = reader.readLine()) != null) {
corpus.add(inputString);
}
String[] allCorpus = new String[corpus.size()];
return corpus.toArray(allCorpus);
}
}
Really stuck here and help would really be appreciated! Thanks in advance. Paul
System.in/out will use the default Windows character set.
Java String will use Unicode internally.
FileReader/FileWriter are old utility classes that use the default character set, hence they are for non-portable local files only.
The error you saw was a special character stored as a two-byte UTF-8 sequence (ō, for example), where each byte was interpreted in the default single-byte encoding. Those byte values have no character in that encoding, hence two ? substitutions.
This requires that the character can be entered on System.in in the default charset.
The String is then converted from the default charset.
Writing it to a file in UTF-8 requires specifying UTF-8 explicitly.
Hence:
BufferedReader stdinReader = new BufferedReader(new InputStreamReader(System.in));
String[] stdinData = readLines(stdinReader);
printToFile(stdinData, "stdin_out.txt");
Path utf8Path = Paths.get("testinput-utf8.txt");
List<String> utf8Lines = Files.readAllLines(utf8Path); // Here the default is UTF-8!
Path latin1Path = Paths.get("testinput-winlatin1.txt");
List<String> latin1Lines = Files.readAllLines(latin1Path, Charset.forName("Windows-1252"));
Files.write(Paths.get("file_out.txt"), utf8Lines, StandardCharsets.UTF_8); // the Path comes first, then the lines
To check whether your current computer system handles Japanese:
System.out.println("Hiragana letter Wo '\u3092'."); // Either を or ?.
If you see ?, the conversion to the default system encoding could not represent the character.
を is U+3092, written in plain ASCII with the Unicode escape \u3092.
To create a UTF-8 text file under Windows:
Files.write(Paths.get("out-utf8.txt"),
"\uFEFFHiragana letter Wo '\u3092'.".getBytes(StandardCharsets.UTF_8));
Here I use an ugly (generally unneeded) BOM character \uFEFF (a zero-width no-break space) that lets Windows Notepad recognize the text as UTF-8.
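To see the effect in isolation, here is a small sketch (the class and method names are mine, not from the question). It writes lines as UTF-8 and reads them back as UTF-8, which round-trips cleanly, and it decodes UTF-8 bytes with a single-byte charset to reproduce the garbling. I use ISO-8859-1 for the wrong charset only because it is guaranteed to exist on every JVM; the Windows default would typically be Windows-1252.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class CharsetRoundTrip {
    // Writes the lines to a temporary file as UTF-8 and reads them back as UTF-8.
    static List<String> roundTripUtf8(List<String> lines) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(file, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decodes UTF-8 bytes with a single-byte charset: one wrong character per byte.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }
}
```

を takes three bytes in UTF-8 and ō takes two, so misdecode returns three and two characters for them respectively. That matches the ??? and ?? counts in the console output from the question.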
I have a program that is supposed to take a text file specified in the run arguments and print it one word at a time on separate lines. It is supposed to omit any special characters except for dashes (-) and apostrophes (').
I have basically finished the program, except that I can only get it to print the first line of text in the file.
Here is what is in the text file:
This is the first line of the input file. It has
more than one line!
Here are the run arguments I am using:
java A1 A1.txt
Here is my code:
import java.io.*;
import java.util.*;
public class A1
{
public static void main (String [] args) throws IOException
{
if (args.length > 0)
{
String file = args[0]; // the file name is the first (and only) run argument
try
{
FileReader fr = new FileReader (file);
BufferedReader br = new BufferedReader(fr);
String s = br.readLine();
int i = 1;
StringTokenizer st = new StringTokenizer(s);
while (st.hasMoreTokens())
{
System.out.println(st.nextToken());
}
br.close();
} catch (IOException e)
{
System.out.println ("The following error occurred " + e);
}
}
}
}
You are only calling readLine() once! So you are only reading and parsing through the first line of the input file. The program then ends.
What you want to do is throw that in a while loop and read every line of the file, until you reach the end, like so:
while((s = br.readLine()) != null) {
StringTokenizer st = new StringTokenizer(s);
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
}
Basically, what this means is "while there is a next line to be read, do the following with that line".
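For the other part of the requirement (omitting special characters except dashes and apostrophes), here is one possible sketch. WordSplitter is a hypothetical helper, and I am assuming letters and digits should be kept; adjust the character class if your assignment defines "special" differently.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {
    // Splits a line on whitespace and strips every character that is not
    // a letter, a digit, a dash or an apostrophe.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            String cleaned = token.replaceAll("[^\\p{L}\\p{N}'-]", "");
            // Tokens made only of punctuation become empty; skip them.
            if (!cleaned.isEmpty()) {
                result.add(cleaned);
            }
        }
        return result;
    }
}
```

Inside the while loop above you could then print each element of WordSplitter.words(s) instead of using the StringTokenizer directly.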
I am using a scanner to read a file which is structured as follows:
ali nader sepahi
simon nadel
rahim nadeem merse
shahid nadeem
Each line has multiple strings that represent the full name of a person. How do I put a "+" in place of the spaces in each name, so that I get something like "ali+nader+sepahi" as one String?
public class dataScanner
{
public dataScanner() throws IOException
{
Scanner file = new Scanner(new File("info.txt"));
while(file.hasNext())
{
String s = file.next().trim();
System.out.println(s+"+");
}
}
}
Use Scanner.nextLine to read the whole line, then replace the spaces with +
For this kind of need, a Scanner is not really suitable; you should use a BufferedReader and String.replace(char, char) as follows:
try (BufferedReader reader = new BufferedReader(new FileReader("info.txt"))) {
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line.replace(' ', '+'));
}
}
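If the names might be separated by more than one space, or have leading or trailing spaces, replace(' ', '+') gives you doubled or stray plus signs. A small sketch that avoids this, assuming any run of whitespace counts as one separator (NameJoiner is just an illustrative name):

```java
class NameJoiner {
    // Joins the whitespace-separated parts of a line with '+'.
    static String joinWithPlus(String line) {
        // trim() first so that leading whitespace does not produce an empty first part.
        return String.join("+", line.trim().split("\\s+"));
    }
}
```

String.join needs Java 8 or later.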
What is the fastest way I can read line by line with each line containing two Strings.
An example input file would be:
Fastest, Way
To, Read
One, File
Line, By Line
.... can be a large file
There are always two sets of strings on each line that I need even if there are spaces between the String e.g. "By Line"
Currently I am using
FileReader a = new FileReader(file);
BufferedReader br = new BufferedReader(a);
String line;
line = br.readLine();
long b = System.currentTimeMillis();
while(line != null){
Is that efficient enough, or is there a more efficient way using the standard Java API (no outside libraries, please)? Any help is appreciated. Thanks!
It depends on what you mean by "efficient". From a performance point of view it is OK. If you are asking about code style and size, I personally do almost what you do, with a small correction:
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while((line = br.readLine()) != null) {
// do something with line.
}
For reading from STDIN, Java 6 offers yet another way: use the Console class and its methods
readLine()
and
readLine(fmt, Object... args)
import java.util.*;
import java.io.*;
public class Netik {
/* File text is
* this, is
* a, test,
* of, the
* scanner, I
* wrote, for
* Netik, on
* Stack, Overflow
*/
public static void main(String[] args) throws Exception {
Scanner sc = new Scanner(new File("test.txt"));
sc.useDelimiter("(\\s|,)"); // this means whitespace or comma
while(sc.hasNext()) {
String next = sc.next();
if(next.length() > 0)
System.out.println(next);
}
}
}
The result:
C:\Documents and Settings\glowcoder\My Documents>java Netik
this
is
a
test
of
the
scanner
I
wrote
for
Netik
on
Stack
Overflow
C:\Documents and Settings\glowcoder\My Documents>
If you want to separate the two sets of Strings, you can do it this way:
BufferedReader in = new BufferedReader(new FileReader(file));
String str;
while ((str = in.readLine()) != null) {
String[] strArr = str.split(",");
System.out.println(strArr[0] + " " + strArr[1]);
}
in.close();
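With input like "Line, By Line", a plain split(",") leaves a leading space on the second field, and it would produce more than two parts if the second field ever contained a comma. A hedged sketch that handles both, assuming the first comma is always the separator (PairParser is a made-up name):

```java
class PairParser {
    // Splits a line at the first comma into exactly two trimmed fields.
    static String[] parsePair(String line) {
        // A limit of 2 means only the first comma is treated as a separator.
        String[] parts = line.split(",", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No comma in line: " + line);
        }
        return new String[] { parts[0].trim(), parts[1].trim() };
    }
}
```

You would call PairParser.parsePair(str) inside the readLine() loop in place of str.split(",").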