Reading from InflaterInputStream and parsing the result

I am quite new to Java; I just started yesterday. Since I am a big fan of learning by doing, I am making a small project with it, but I am stuck on this part. I have written a file using this function:
public static boolean writeZippedFile(File destFile, byte[] input) {
try {
// create file if doesn't exist part was here
try (OutputStream out = new DeflaterOutputStream(new FileOutputStream(destFile))) {
out.write(input);
}
return true;
} catch (IOException e) {
// error handling was here
return false;
}
}
Now that I have successfully written a compressed file using the method above, I want to read it back and print it to the console. First, I need to read the decompressed content and write a string representation of it to the console. Second, I don't want to print the characters up to and including the first \0 (null) character. Here is how I attempt to read the compressed file:
try (InputStream is = new InflaterInputStream(new FileInputStream(destFile))) {
}
and I am completely stuck here. The question is how to discard the first few characters up to '\0' and then write the rest of the decompressed file to the console.

I understand that your data contains text, since you want to print a string representation. I further assume that the text contains Unicode characters. If so, your console must also support Unicode for the characters to be displayed correctly.
So you should first read the data byte by byte until you encounter the \0 character, and then you can use a BufferedReader to print the rest of the data as lines of text.
try (InputStream is = new InflaterInputStream(new FileInputStream(destFile))) {
// read the stream a single byte each time until we encounter '\0'
int aByte = 0;
while ((aByte = is.read()) != -1) {
if (aByte == '\0') {
break;
}
}
// from now on we want to print the data
BufferedReader b = new BufferedReader(new InputStreamReader(is, "UTF-8"));
String line = null;
while ((line = b.readLine()) != null) {
System.out.println(line);
}
b.close();
} catch (IOException e) {
// handle
}

Skip the first few characters using InputStream#read()
while (is.read() != '\0');
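One caveat with the one-line loop: read() returns -1 at end of stream, so if the data contains no '\0' at all, the loop never terminates. The following is only a sketch of a guarded version. The names NulSkipDemo, skipPastNul, readAfterNul and deflate are made up for illustration, and the data is round-tripped in memory rather than through a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class NulSkipDemo {

    /** Consumes bytes up to and including the first 0 byte; returns false if the stream ended first. */
    static boolean skipPastNul(InputStream in) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0) {
                return true;
            }
        }
        return false;
    }

    /** Inflates the data, drops everything up to the first NUL, and decodes the rest as UTF-8 (assumed). */
    static String readAfterNul(byte[] deflated) throws IOException {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(deflated))) {
            if (!skipPastNul(in)) {
                return ""; // no NUL found, so there is nothing after the header
            }
            ByteArrayOutputStream rest = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                rest.write(buf, 0, n);
            }
            return new String(rest.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Demo helper: deflates a byte array in memory, mirroring writeZippedFile without the file. */
    static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(raw);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = deflate("header\0payload line 1\nline 2".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAfterNul(packed));
    }
}
```

For a real file, the ByteArrayInputStream would be replaced by new FileInputStream(destFile).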

Related

Scanner unable to read foreign characters in file

I'm currently creating a tool that can extract and search for data stored on a smartwatch for a University project.
I have been able to extract a file called "node.db" from my smartwatch, which contains the Bluetooth MAC address of the mobile phone the smartwatch is connected to. I am now trying to create a scanner that will scan this "node.db" file and print out the MAC address.
This is the code I currently have:
// Identify the location of the node.txt file
File file = new File("C:\\WatchData\\node.txt");
// Notify the user that Bluetooth extraction has initialized
Txt_Results.append("Pulling bluetooth data...");
Scanner in = null;
try {
in = new Scanner(file);
while(in.hasNext())
{ // Scan till the end of the file
String line=in.nextLine();
// Scan the file for this string
if(line.contains("settings.bluetooth"))
// Print the MAC Address string out for the user
System.out.println(line);
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
A previous function converted the file to .txt.
The code searches each line and looks for "settings.bluetooth" and should print out this line which contains the MAC Address if it is found. However, I believe the format of the node.db file is stopping the scanner from finding this string. I believe that some of the data in the file is encoded. An example of how the data is presented is shown below. I believe it is the black characters it doesn't recognize:
When I run the code on the file, the program will simply hang and provide no error message. I have left the program to run for over 20 minutes and still no success.
The exact line I am trying to print out from the file is shown below:
I have tested this code on a text file without these encoded characters and can conclude that the code does work. So my question is the following:
Is there a way that I can get the scanner to skip the characters it doesn't recognize in the file so it can continue scanning the file?
Thanks in advance.

Since you didn't provide the files, I can't write code to test against them. It looks like your file uses a different encoding from the one Java is using to decode it.
So you need to try different encoding settings for your input stream.
Usually, you specify the encoding by:
String encoding = "UTF-8"; // try "UTF-8" first and also change to other encodings to see the results
Reader reader = new InputStreamReader(new FileInputStream("your_file_name"), encoding);
Refer to this post for more information. This post also talks about how to write code to detect the encoding of your file.
BTW, the decoded characters shown in your file with a dark background are some control characters in ASCII.
I would also suggest you try changing the decoding method of your text viewer application to see if you can actually make the text display correctly in a particular encoding method.
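To make the encoding suggestion concrete: ISO-8859-1 maps every byte value to exactly one character, so decoding with it never hits a malformed sequence. That makes it convenient for hunting an ASCII marker inside a mostly binary file. This is a sketch only. The names MarkerSearch and linesContaining are hypothetical, the sample bytes and MAC address are invented, and it assumes the marker is stored as plain ASCII in node.db.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class MarkerSearch {

    /** Returns every line of the stream that contains the marker, decoding bytes as ISO-8859-1. */
    static List<String> linesContaining(InputStream in, String marker) throws IOException {
        List<String> hits = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    hits.add(line);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Invented sample: binary noise around one ASCII line (not real watch data).
        ByteArrayOutputStream sample = new ByteArrayOutputStream();
        sample.write(new byte[] {(byte) 0xFF, 0x01, (byte) 0x80, '\n'});
        sample.write("settings.bluetooth=00:11:22:33:44:55\n".getBytes(StandardCharsets.US_ASCII));
        sample.write(new byte[] {(byte) 0xC3, 0x00});

        InputStream in = new ByteArrayInputStream(sample.toByteArray());
        for (String hit : linesContaining(in, "settings.bluetooth")) {
            System.out.println(hit);
        }
    }
}
```

For the real file, the stream would come from new FileInputStream("C:\\WatchData\\node.txt") instead of the in-memory sample.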
UPDATE
It looks like Scanner doesn't work here, while other IO classes work fine.
StringBuilder sb = new StringBuilder();
try (BufferedReader reader = new BufferedReader(new FileReader("node.txt"))) {
String line;
while ((line = reader.readLine()) != null) {
sb.append(line);
}
} catch (Exception e) {
// TODO: handle exception
}
int index = sb.indexOf("settings.bluetooth");
if (index != -1)
System.out.println(sb.substring(index, index + 18));
UPDATE
It looks like the problem only occurs when you create the Scanner directly from a File: an exception is then raised in one of Scanner's internal methods while reading the file. Using an input stream as below works, even when it is wrapped in a Scanner.
try (Scanner s = new Scanner(new FileInputStream("node.txt"))) {
while(s.hasNext()) {
System.out.println(s.next());
}
} catch (Exception e) {
e.printStackTrace();
}
UPDATE
This solution just eliminates all the illegal characters from your file.
public static void main(String args[]) {
String encoding = "UTF-8"; // try "UTF-8" first and also change to other encodings to see the results
StringBuilder sb = new StringBuilder();
try(Reader reader = new InputStreamReader(new FileInputStream("node.txt"), encoding)) {
int c = -1;
while ((c = reader.read()) != -1) {
if (eligible(c)) {
sb.append((char)c);
}
}
} catch (Exception e){
e.printStackTrace();
}
int index = sb.indexOf("settings.bluetooth");
if (index >= 0) {
System.out.println(sb.substring(index));
}
}
public static boolean eligible(int c) {
return (c >= 'a' && c <= 'z' || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '.');
}

GZIP eats newlines

I have the following code for compressing and decompressing string.
public static byte[] compress(String str)
{
try
{
ByteArrayOutputStream obj = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(obj);
gzip.write(str.getBytes("UTF-8"));
gzip.close();
return obj.toByteArray();
}
catch (IOException e)
{
e.printStackTrace();
}
return null;
}
public static String decompress(byte[] bytes)
{
try
{
GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
StringBuilder outStr = new StringBuilder();
String line;
while ((line = bf.readLine()) != null)
{
outStr.append(line);
}
return outStr.toString();
}
catch (IOException e)
{
return e.getMessage();
}
}
I compress into a byte array on Windows, then send the byte array through a socket to a Linux machine and decompress it there. However, after decompression it seems that all my newline characters are gone.
So I thought the problem was in the Windows-to-Linux transfer. However, I tried writing a simple program on Windows that uses the same code, and found that the newlines are still gone.
Can anyone shed any light on what causes this? I can't figure out any explanation.

I think the problem is here:
while ((line = bf.readLine()) != null)
{
outStr.append(line);
}
readLine() sees the newline character but doesn't include it in the value returned for line.
The problem is worse than you think, perhaps.
readLine() gets all the characters up to, but not including, a newline (or some variety of returns and linefeed characters) OR the end of file. So you don't know if the last line you get had a newline on the end or not.
This might not matter, and if so, you can just add this following the other append:
outStr.append('\n');
Some files might end up with an extra line ending at the end of file.
If it does matter, you will need to use read() and then output all the characters you receive. In that case, you might end up with the infamous "What's at the end of the line?" problem you allude to between Windows, Linux and the MacOS and the way they use different combinations of return and new-line characters to end lines.

It is not GZIP that is "eating" newlines.
It is this code:
while ((line = bf.readLine()) != null)
{
outStr.append(line);
}
The readLine() method reads a line (up to and including a line termination sequence) and then returns it without a newline. You then append it to outStr ... without replacing the line termination that was stripped.
But even if you replaced the line termination, you can't guarantee to preserve the actual line termination sequence that was used ... if you do it that way.
I recommend that you replace the readLine() calls with read() calls; i.e. read and then buffer the data one character at a time. It solves two problems at once. It may even be faster, because you are avoiding the unnecessary overhead of assembling line Strings.
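As a sketch of the read()-based approach, the version below reads characters in chunks with read(char[]) rather than one at a time, which is a variation on the suggestion above. Every character, including whatever line terminators were used, is appended unchanged. The class name GzipText is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipText {

    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    static String decompress(byte[] bytes) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(bytes)), StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            // read(char[]) hands back raw characters, so "\r\n" and "\n" both survive as-is
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "first\r\nsecond\nthird";
        System.out.println(original.equals(decompress(compress(original)))); // prints "true"
    }
}
```

Unlike the code in the question, this sketch lets IOException propagate instead of returning null or the exception message.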

How to return value of FileInputStream in Android [duplicate]

This question already has answers here:
How do I read / convert an InputStream into a String in Java?
(62 answers)
Closed 9 years ago.
I have a function based on FileInputStream that should return a string it reads from the file. But why is the return value empty?
The function's return value should be displayed in MainActivity.
This function is part of the service in my app.
public static String UserLoginFile;
public static String UserPassFile;
public String LoadUserInfopassFromFilePostData() {
String FILENAME = "TeleportSAASPass.txt";
FileInputStream inputStream = null;
if(UserLoginFile != null){
try {
inputStream = openFileInput(FILENAME);
byte[] reader = new byte[inputStream.available()];
while (inputStream.read(reader) != -1) {}
UserPassFile = new String(reader);
// Toast.makeText(getApplicationContext(), "GPSTracker " + UserPassFile, Toast.LENGTH_LONG).show();
} catch(IOException e) {
} finally {
if (inputStream != null) {
try {
inputStream.close();
} catch (IOException e) {
}
}
}
}
return UserPassFile;
}
Please tell me how to fix my function so that it returns the string it read from the file.

I would suggest using an input stream reader. Check out
http://developer.android.com/reference/java/io/InputStreamReader.html
Your code will look something like:
inputStream = openFileInput(FILENAME);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"), 8);
StringBuilder response = new StringBuilder();
String line;
//Read the response from input stream line-wise
while((line = reader.readLine()) != null){
response.append(line).append('\n');
}
//Create response string
result = response.toString();
//Close input stream
reader.close();

Consider the difference between your code:
while (inputStream.read(reader) != -1) {}
UserPassFile = new String(reader);
And this alternative:
while (inputStream.read(reader) != -1) {
UserPassFile = new String(reader);
}
In your code, UserPassFile is assigned only once, after the loop has finished, which means after the final read() call has failed. In the alternative, UserPassFile is assigned only after a read that succeeds, so it should still contain a non-empty string when the end of the method is reached.
However, you seem to be doing one other odd thing. You assume that you will be able to read the entire .available() amount of data in a single read, which is probably true (at least if you were able to allocate a byte[] that large). However, after reading this data, you try again and see if you fail, which seems unnecessary. It would seem that if you are going to make the assumption that you can read the data in a single attempt, then you should simply close the stream and return once you have done so - there doesn't seem to be a compelling reason to try yet another read which would be expected to fail. Otherwise, if there's a possibility that you can't get all the data in a single read(), you should keep track of how much you have gotten and keep appending until you get a failed read or otherwise decide you are done.
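The "keep appending until a read fails" option can be sketched as follows. It does not rely on available() and collects however many bytes each read() call delivers. The names StreamText and readAll are hypothetical, and decoding as UTF-8 is an assumption about the file's contents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamText {

    /** Reads the stream to the end and decodes the collected bytes as UTF-8 (assumed encoding). */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            collected.write(buf, 0, n); // keep only the n bytes this call actually delivered
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] fileContents = "secret-password".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(fileContents)) {
            System.out.println(readAll(in)); // prints "secret-password"
        }
    }
}
```

In the Android service from the question, the stream passed to readAll would be the one returned by openFileInput(FILENAME).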

How can I recognize a special delimiter string when reading from a file of strings?

I want to read strings from a file. When a certain string (><) is found, I want to start reading integers instead, and convert them to binary strings.
My program is reading the strings in and saving them in an ArrayList successfully, but it does not recognise the >< symbol and therefore the reading of the binary strings is not successful.
The Code
try {
FileInputStream fstream = new FileInputStream(fc.getSelectedFile().getPath());
// Get the object of DataInputStream
DataInputStream ino = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(ino));
String ln;
String str, next;
int line, c =0;
while ((ln = br.readLine()) != null) {
character = ln;
System.out.println(character);
iname.add(ln); // arraylist that holds the strings
if (iname.get(c).equals("><")) {
break; // break and moves
// on with the following while loop to start reading binary strings instead.
}
c++;
}
String s = "";
// System.out.println("SEQUENCE of bytes");
while ((line = ino.read()) != -1) {
String temp = Integer.toString(line, 2);
arrayl.add(temp);
System.out.println("telise? oxii");
System.out.println(line);
}
ino.close();
} catch (Exception exc) { }
The file I'm trying to read is for example:
T
E
a
v
X
L
A
.
x
"><"
sequence of bytes.
The last part is saved as bytes and appears like that in the text file. No worries, this bit works. Each string is saved on a new line.

>< is two characters and iname.get(c) is only one character.
What you should do is test whether ln equals > and then test whether the next character equals <. If both tests pass, break out of the loop.
You will have to be careful.

Use a Scanner. It allows you to specify a delimiter, and has methods for reading input tokens as String or int.

Could you not do something like:
while ((ln = br.readLine()) != null){
character=ln;
System.out.println(character);
//
// Look for magic characters >< and stop reading if found
//
if (character.indexOf("><") >= 0) {
break;
}
iname.add(ln);
}
This would work if you didn't want to add the magic symbol to your ArrayList. Your code sample is incomplete - if you're still having trouble you'd need to post the whole class.
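One more thing to watch in the question's code: BufferedReader reads ahead into its own buffer, so by the time the second loop calls ino.read(), the bytes after the delimiter may already have been consumed by the reader. A possible way around this, sketched below, is to read both the text lines and the binary tail from one and the same InputStream. The names DelimitedFile, readLine and readBinaryStrings are hypothetical, and the sketch assumes one byte per character and '\n' or "\r\n" line endings.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class DelimitedFile {

    /** Reads one line byte by byte (no read-ahead); returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            line.append((char) b); // assumption: one byte is one character
        }
        if (b == -1 && line.length() == 0) {
            return null; // end of stream and nothing was read
        }
        int last = line.length() - 1;
        if (last >= 0 && line.charAt(last) == '\r') {
            line.setLength(last); // tolerate Windows line endings
        }
        return line.toString();
    }

    /** Collects header lines until the delimiter, then returns the remaining bytes as binary strings. */
    static List<String> readBinaryStrings(InputStream in, String delimiter, List<String> header)
            throws IOException {
        String line;
        while ((line = readLine(in)) != null && !line.equals(delimiter)) {
            header.add(line);
        }
        List<String> bits = new ArrayList<>();
        int b;
        while ((b = in.read()) != -1) {
            bits.add(Integer.toString(b, 2));
        }
        return bits;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'T', '\n', 'E', '\n', '>', '<', '\n', 5, (byte) 0xFF};
        List<String> header = new ArrayList<>();
        List<String> bits = readBinaryStrings(new ByteArrayInputStream(data), "><", header);
        System.out.println(header); // prints "[T, E]"
        System.out.println(bits);   // prints "[101, 11111111]"
    }
}
```

When reading from a file, wrapping the FileInputStream in a BufferedInputStream keeps the byte-by-byte reads cheap, and since that wrapper is the only object reading from the file, no data is lost between the two phases.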

println(char), characters turn into Chinese?

Please help me to troubleshoot this problem.
I have an input file 'Trial.txt' with content "Thanh Le".
Here is the function I used in an attempt to read from the file:
public char[] importSeq(){
File file = new File("G:\\trial.txt");
char temp_seq[] = new char[100];
try{
FileInputStream fis = new FileInputStream(file);
BufferedInputStream bis = new BufferedInputStream(fis);
DataInputStream dis = new DataInputStream(bis);
int i = 0;
//Try to read all character till the end of file
while(dis.available() != 0){
temp_seq[i]=dis.readChar();
i++;
}
System.out.println(" imported");
} catch (FileNotFoundException e){
e.printStackTrace();
} catch (IOException e){
e.printStackTrace();
}
return temp_seq;
}
And the main function:
public static void main(String[] args) {
Sequence s1 = new Sequence();
char result[];
result = s1.importSeq();
int i = 0;
while(result[i] != 0){
System.out.println(result[i]);
i++;
}
}
And this is the output.
run:
imported
瑨
慮
栠
汥
BUILD SUCCESSFUL (total time: 0 seconds)

Honestly, that's a pretty clumsy way to read a text file into a char[].
Here's a better example, assuming that the text file contains only ASCII characters.
File file = new File("G:/trial.txt");
char[] content = new char[(int) file.length()];
Reader reader = null;
try {
reader = new FileReader(file);
reader.read(content);
} finally {
if (reader != null) try { reader.close(); } catch (IOException ignore) {}
}
return content;
And then to print the char[], just do:
System.out.println(content);
Note that InputStream#available() doesn't necessarily do what you're expecting.
See also:
Java IO tutorial

Because a char in Java is two bytes wide, readChar() reads the letters in pairs and composes each pair into a single Unicode character.
You can avoid this by using readByte() instead.

Here is some code to demonstrate exactly what is happening. A char in Java consists of two bytes and holds one UTF-16 code unit, which for most characters is the whole character you see on the screen. DataInputStream#readChar() reads two bytes from the stream and combines them, high byte first, into one such char. Your file uses one byte per character, probably ASCII, so each readChar() call consumes two ASCII characters from your file and fuses them into a single char.
The following code tries to explain how the single ASCII bytes 't' and 'h' become one Chinese UTF-16 character.
public class Main {
public static void main(String[] args) {
System.out.println((int)'t'); // 116 == x74 (116 is 74 in Hex)
System.out.println((int)'h'); // 104 == x68
System.out.println((int)'瑨'); // 29800 == x7468
// System.out.println('\u0074'); // t
// System.out.println('\u0068'); // h
// System.out.println('\u7468'); // 瑨
char th = (('t' << 8) + 'h'); //x74 x68
System.out.println(th); //瑨 == 29800 == '\u7468'
}
}
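The same effect can be reproduced end to end, next to a version that decodes the bytes through a Reader with an explicit charset. This is a sketch with made-up names (ReadCharDemo, viaReadChar, viaReader). It assumes the file holds the ASCII bytes of lowercase "thanh le", which is consistent with the first output character 瑨 (0x74 0x68) analysed above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReadCharDemo {

    /** Mimics the question: every readChar() call fuses two bytes into one char. */
    static String viaReadChar(byte[] bytes) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            for (int i = 0; i + 1 < bytes.length; i += 2) {
                sb.append(in.readChar());
            }
        }
        return sb.toString();
    }

    /** Decodes the same bytes with an explicit charset: one byte becomes one char. */
    static String viaReader(byte[] bytes) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.US_ASCII)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "thanh le".getBytes(StandardCharsets.US_ASCII); // assumed file contents
        System.out.println(viaReadChar(file)); // four fused characters
        System.out.println(viaReader(file));   // prints "thanh le"
    }
}
```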

