using Scanner to read a file - java

I found the following useful in the past for reading in text files:
new Scanner(file).useDelimiter("\\Z").next();
However, I came across a file today that was only partially read with this syntax. I'm not sure what makes this file special; it's just a .jsp.
The code below worked in this instance, but I'd like to know why the previous method didn't.
Scanner in = new Scanner(new FileReader(file));
String text = in.useDelimiter("\\Z").next();

Save the JSP file as .txt and try to read it using your first method. If that works, I suspect the file size may be the issue.
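One way to investigate a partial read is to ask the Scanner whether it hit an I/O error. Scanner treats an IOException from its source as end of input and keeps the exception for retrieval through ioException(). A commonly cited explanation, not confirmed in this thread, is that new Scanner(File) stops on bytes that are invalid in the assumed charset while FileReader substitutes a replacement character. The sketch below assumes the caller knows the charset name; the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

class WholeFileScan {
    // Reads the whole file with the "\\Z" idiom and surfaces any I/O error
    // that Scanner would otherwise swallow and treat as end of input.
    static String read(File file, String charsetName) throws IOException {
        try (Scanner in = new Scanner(file, charsetName)) {
            in.useDelimiter("\\Z");
            // An empty file has no token at all, so guard the call to next().
            String text = in.hasNext() ? in.next() : "";
            IOException hidden = in.ioException();
            if (hidden != null) {
                throw hidden; // e.g. a decoding error part-way through the file
            }
            return text;
        }
    }
}
```

Calling WholeFileScan.read(file, "UTF-8") either returns the text or throws the exception that cut the read short, which is usually more informative than a silently truncated string.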

Related

Is there a way to access a specific line in a file (without looping)? Java - LibGDX

Hi, I am using LibGDX to program for Android.
Is there a straightforward way to access a specific line in a file without going through all the lines and reading them until I reach the desired one? (Can I just say I want to read line number so-and-so?)
I know there are methods such as these:
FileHandle file = Gdx.files.internal("list.txt");
BufferedReader reader = new BufferedReader(file.reader());
reader.readLine(); // but this reads only the first line!
// or using Scanner
scanner1 = new Scanner(new File("list.txt"));
scanner1.nextLine(); // also reads only the first line
Can I do it without looping through unnecessary lines? Any solutions or workarounds are welcome. Thanks.
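A plain text file stores no index of where each line starts, so unless the lines have a fixed length or an index is built beforehand, the earlier lines have to be read and discarded. A minimal sketch that hides that loop in a helper is shown here; the class and method names are hypothetical, and it takes any Reader so that the result of file.reader() from the question could be passed in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

class LineAccess {
    // Returns the line with the given 1-based number, or null when the
    // number is less than 1 or the input has fewer lines than requested.
    static String readLine(Reader source, int lineNumber) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line = null;
            for (int i = 0; i < lineNumber; i++) {
                line = reader.readLine();
                if (line == null) {
                    return null; // ran out of lines before reaching the target
                }
            }
            return line;
        }
    }
}
```

If the same file is queried many times, reading all lines into a list once and indexing into it is likely cheaper than rescanning on every call.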

Using multiple files from a directory with a scanner (Java)

I'm trying to read from multiple .txt files in a directory using a scanner in Java.
So far, I have
File directory = new File("textanalyzer/Shakespeare");
File[] filenames = directory.listFiles();
Scanner scanner = new Scanner(new File(filenames)).useDelimiter("[^a-zA-Z<]+");
The rest of my program uses the text from these files. I have the rest of the program written but I'm stuck on this one thing.
I've been looking around for a solution, but I can't really find anything. I know that what I have isn't very good but I don't know enough Java to be able to improve it. I've also tried using Apache imports but I can't figure out how to make them work (FileIterator, in particular).
Finally, I would really like to use the Scanner class so that I can use the Delimiter. It is super helpful for what I'm trying to do.
Not quite sure what your goal is, but this basic example might help.
File[] fileArray = new File("textanalyzer/Shakespeare").listFiles();
for (File f : fileArray) // loop through all files in the directory
{
    if (f.getName().endsWith(".txt")) // only handle the .txt files
    {
        Scanner s = new Scanner(f); // one Scanner per file; can throw FileNotFoundException
    }
}
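Building on the loop above, the following sketch applies the delimiter from the question to every .txt file and collects the tokens in one list. It is only one possible arrangement: the class name, the method name, and the choice to return a List are assumptions, and the files are read with the platform default charset.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DirectoryTokens {
    // Collects the tokens of every .txt file directly inside the directory.
    static List<String> readTokens(File directory) throws FileNotFoundException {
        List<String> tokens = new ArrayList<>();
        File[] files = directory.listFiles();
        if (files == null) {
            return tokens; // not a directory, or it could not be listed
        }
        for (File f : files) {
            if (!f.isFile() || !f.getName().endsWith(".txt")) {
                continue; // skip subdirectories and other file types
            }
            // try-with-resources closes each Scanner before the next file is opened
            try (Scanner scanner = new Scanner(f)) {
                scanner.useDelimiter("[^a-zA-Z<]+");
                while (scanner.hasNext()) {
                    tokens.add(scanner.next());
                }
            }
        }
        return tokens;
    }
}
```

Note that listFiles() makes no promise about ordering, so the files may need sorting if token order across files matters.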

Read zip or jar file without unzipping it first

I'm not looking for any answers that involve opening the zip file in a zip input or output stream. My question is: is it possible in Java to simply open a jar file like any other file (using a buffered reader/writer), read its contents, and write them somewhere else? For example:
import java.io.*;
public class zipReader {
public static void main(String[] args){
BufferedReader br = new BufferedReader(new FileReader((System.getProperty("user.home").replaceAll("\\\\", "/") + "/Desktop/foo.zip")));
BufferedWriter bw = new BufferedWriter(new FileWriter((System.getProperty("user.home").replaceAll("\\\\", "/") + "/Desktop/baf.zip")));
char[] ch = new char[180000];
while(br.read(ch) > 0){
bw.write(ch);
bw.flush();
}
br.close();
bw.close();
}
}
This works on some small zip/jar files, but most of the time it just corrupts them, making it impossible to unzip or execute them. I have found that setting the size of the char[] to 1 does not corrupt the file itself, only everything in it: I can open the file in an archive program, but all its entries are corrupted and unusable. Does anyone know how to write the above code so it won't corrupt the file? Here is a line from a jar file I tested this on that became corrupted:
nèñà?G¾Þ§V¨ö—‚?‰9³’?ÀM·p›a0„èwåÕüaEܵp‡aæOùR‰(JºJ´êgžè*?”6ftöãÝÈ—ê#qïc3âi,áž…¹¿Êð)V¢cã>Ê”G˜(†®9öCçM?€ÔÙÆC†ÑÝ×ok?ý—¥úûFs.‡
vs the original:
nèñàG¾Þ§V¨ö—‚‰9³’ÀM·p›a0„èwåÕüaEܵp‡aæOùR‰(JºJ´êgžè*?”6ftöãÝÈ—ê#qïc3âi,áž…¹¿Êð)V¢cã>Ê”G˜(†®9öCçM€ÔÙÆC†ÑÝ×oký—¥úûFs.‡
As you can see either the reader or writer adds ?'s into the files and I can't figure out why. Again I don't want any answers telling me to open it entry by entry, I already know how to do that, if anyone knows the answer to my question please share it.
Why would you want to convert binary data to chars? I think it would be much better to use InputStream/OutputStream with byte arrays. See http://www.javapractices.com/topic/TopicAction.do?Id=245 for examples.
bw.write(ch) will write the entire array. read() will only fill in some of it and return a number telling you how much. This has nothing to do with zip files, just with how I/O works.
You need to change your code to look more like this:
int charsRead = br.read(buffer);
if (charsRead >= 0) {
bw.write(buffer, 0, charsRead);
} else {
// whatever I do at the end.
}
However, this is only half of your problem. You are also converting bytes to characters and back again, which will corrupt the data in other ways. Stick to streams.
See the ZipInputStream and ZipOutputStream classes.
Edit: use plain FileInputStream and FileOutputStream. I suspect there may be some issues when the reader interprets the bytes as characters.
See also: Standard concise way to copy a file in Java? Since you want to copy the whole file, there is nothing special about it being a zip file.
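Putting the two pieces of advice together (write only what was read, and stay with bytes), a byte-for-byte copy might look like the sketch below. The class name and the 8192-byte buffer size are arbitrary choices for illustration, not something the thread prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BinaryCopy {
    // Copies a file as raw bytes, so no charset decoding can alter the data.
    static void copy(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int bytesRead;
            // read() returns how many bytes it placed in the buffer, or -1 at end of file
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead); // write only the filled part
            }
        }
    }
}
```

When no per-chunk processing is needed, java.nio.file.Files.copy(source, target) does the same job in a single call.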

Java: reading text from a file results with strange formatting

Usually, when I read text files, I do it like this:
File file = new File("some_text_file.txt");
Scanner scanner = new Scanner(new FileInputStream(file));
StringBuilder builder = new StringBuilder();
while(scanner.hasNextLine()) {
builder.append(scanner.nextLine());
builder.append('\n');
}
scanner.close();
String text = builder.toString();
There may be better ways, but this method has always worked for me perfectly.
For what I am working on right now, I need to read a large text file (over 700 kilobytes in size). Here is a sample of the text when opened in Notepad (the one that comes standard with any Windows operating system):
"lang"
{
"Language" "English"
"Tokens"
{
"DOTA_WearableType_Daggers" "Daggers"
"DOTA_WearableType_Glaive" "Glaive"
"DOTA_WearableType_Weapon" "Weapon"
"DOTA_WearableType_Armor" "Armor"
However, when I read the text from the file using the method that I provided above, the output is:
I could not paste the output for some reason. I have also tried to read the file like so:
File file = new File("some_text_file.txt");
Path path = file.toPath();
String text = new String(Files.readAllBytes(path));
... with no change in result.
How come the output is not as expected? I also tried reading a text file that I wrote and it worked perfectly fine.
It looks like an encoding problem. Open the file with a tool that can detect the encoding (like Notepad++) and find out how it is encoded. Then use the other constructor for Scanner:
Scanner scanner = new Scanner(new FileInputStream(file), encoding);
Or you can simply experiment with it, trying different encodings. It looks like UTF-16 to me.
final Scanner scanner = new Scanner(new FileInputStream(file), "UTF-16");
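The same idea applies to the Files.readAllBytes attempt from the question: new String(bytes) decodes with the platform default charset, so the charset has to be passed explicitly. A small sketch follows, assuming the file really is UTF-16 as guessed above; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class EncodedRead {
    // Decodes the whole file with the given charset instead of the platform default.
    static String read(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }
}
```

A call such as EncodedRead.read(file.toPath(), StandardCharsets.UTF_16) would then replace the two-line readAllBytes version in the question.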

Java Scanner unable to read file

I'm doing a very simple text-parsing program, using files given to me by a friend.
However, when I open the file using a Scanner like so,
Scanner scan = new Scanner(new File(path));
System.err.println(scan.hasNext());
while(scan.hasNextLine())
System.err.println(scan.nextLine());
System.err.println(scan.next());
result:
false
Exception in thread "main" java.util.NoSuchElementException
at java.util.Scanner.throwFor(Scanner.java:855)
at java.util.Scanner.next(Scanner.java:1364)
at Test.main(Test.java:18)
the scanner treats the file(which is some 1400 lines long) as empty.
Can anyone think of any reason a scanner might not be able to see a file? I suspect the fact that the file was imported from a Windows machine to a Linux machine may have something to do with it, but my mind is open to other possibilities.
edited for formatting and code errors
I resolved it using new Scanner(new BufferedReader(new FileReader(fileName))) instead of new Scanner(new File(fileName)).
Found the problem:
I looked at the file byte by byte and found an EOF character in the first byte.
Java was ignoring the rest of the file.
EDIT: First guess was wrong.
The file might have 1400 lines full of whitespace.
It may have happened for one of these reasons:
1. The file may not have been created.
2. The file is in use by another program.
3. The path is wrong.
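To check the file byte by byte as described above, the first few bytes can be printed in hex before any character decoding takes place. This is only an illustrative sketch: the class and method names are made up, and it shows the inspection step rather than a fix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class FirstBytes {
    // Returns up to `count` leading bytes of the file as space-separated hex pairs.
    static String hexOfFirstBytes(Path path, int count) throws IOException {
        byte[] head = new byte[count];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            // read() may return fewer bytes than requested, so loop until done
            while (filled < count && (n = in.read(head, filled, count - filled)) != -1) {
                filled += n;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < filled; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", head[i] & 0xFF));
        }
        return sb.toString();
    }
}
```

Unexpected leading bytes, such as a control character or a byte order mark, show up immediately in this output, and an empty result means the file itself has no content at that path.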
