I have a file that contains a string followed by bytes that contain binary numbers encoded in them.
Thisisastring. �J
In my code I try to ignore the string and focus on decoding the bytes that are separated by a space. When I run the code the outcome seems to be correct except the first binary number is off by a lot.
StringBuffer buffer = new StringBuffer();
File file = new File(arg);
FileInputStream in = new FileInputStream(file);
InputStreamReader isr = new InputStreamReader(in, "UTF8");
Reader inn = new BufferedReader(isr);
int ch;
while ((ch = inn.read()) > -1){
buffer.append((char)ch);
}
inn.close();
String content = buffer.toString();
String temp = new String();
for(int i=0; i<content.length(); i++){
temp += content.charAt(i);
if(content.charAt(i) == ' '){
while(i != content.length()-1){
i++;
byte b = (byte) content.charAt(i);
String x = Integer.toString(b & 0xFF, 2);
System.out.println(x);
}
}
}
Results:
11111101 <- Why is only this one incorrect?
11000
1001010
1011
What is expected:
10010101
00011000
01001010
1011
You should not use Readers or Strings for binary data.
StringBuffer buffer = new StringBuffer();
File file = new File(arg);
FileInputStream in = new FileInputStream(file);
DataInputStream din = new DataInputStream(new BufferedInputStream(in));
int ch;
while ((ch = din.read()) > -1){
buffer.append((char)ch);
if (ch == ' ')
{
// next byte is a binary value
byte b = din.readByte();
String x = Integer.toString(b & 0xFF, 2);
System.out.println(x);
}
}
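The stray 11111101 has a concrete cause: a lone byte 0x95 is not valid UTF-8, so the decoder substitutes U+FFFD, and casting that char to a byte keeps only its low 8 bits, 0xFD. Here is a minimal sketch of the effect, assuming the four bytes after the space are 0x95 0x18 0x4A 0x0B (inferred from the expected output in the question). The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Formats the low 8 bits of a value as exactly 8 binary digits.
    static String toBits(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Lossy path: decode the bytes as UTF-8 text, then cast each char back to a byte.
    static String viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) out.append(' ');
            out.append(toBits((byte) text.charAt(i)));
        }
        return out.toString();
    }

    // Correct path: work on the raw bytes directly.
    static String direct(byte[] raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) out.append(' ');
            out.append(toBits(raw[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x95, 0x18, 0x4A, 0x0B}; // assumed file content after the space
        System.out.println(viaString(raw)); // 11111101 00011000 01001010 00001011
        System.out.println(direct(raw));    // 10010101 00011000 01001010 00001011
    }
}
```

Only the first byte differs because the other three are plain ASCII, which UTF-8 decodes unchanged.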
I'm sending 2 String values into an OutputStream from the Client.java as follows :
outputStream.write(username.getText().getBytes());
outputStream.write(password.getText().getBytes());
In Server.java, I'm trying to get each value separately when I read the inputStream:
inputStream = s.getInputStream();
byte[]username = new byte[20];
inputStream.read(username);
String user = new String(username);
System.out.println("username = "+user);
Logically, I get usernamepassword as the console output.
What I want to do is:
String usr = new String(user);
String pass = new String(password);
Is there a better way to do it than adding some delimiter in the outputStream String ?
You need to delimit the two string values so the reader knows where one string ends and the next string begins. What that delimiter actually consists of is up to you to decide based on your particular needs.
You could write out a string's byte length using a fixed-width integer before then writing out the actual bytes. The reader can then read the length first before then reading the specified number of bytes that follow:
DataOutputStream dos = new DataOutputStream(outputStream);
byte[] bytes;
int len;
bytes = username.getText().getBytes(StandardCharsets.UTF_8);
len = bytes.length;
dos.writeInt(len);
dos.write(bytes, 0, len);
bytes = password.getText().getBytes(StandardCharsets.UTF_8);
len = bytes.length;
dos.writeInt(len);
dos.write(bytes, 0, len);
inputStream = s.getInputStream();
DataInputStream dis = new DataInputStream(inputStream);
byte[] bytes;
int len;
len = dis.readInt();
bytes = new byte[len];
dis.readFully(bytes);
String username = new String(bytes, StandardCharsets.UTF_8);
len = dis.readInt();
bytes = new byte[len];
dis.readFully(bytes);
String password = new String(bytes, StandardCharsets.UTF_8);
Alternatively, DataOutputStream and DataInputStream can write/read String values directly, handling the above logic internally for you (using a short instead of an int for the length value):
DataOutputStream dos = new DataOutputStream(outputStream);
dos.writeUTF(username.getText());
dos.writeUTF(password.getText());
inputStream = s.getInputStream();
DataInputStream dis = new DataInputStream(inputStream);
String username = dis.readUTF();
String password = dis.readUTF();
Alternatively, you could write out a unique character sequence that will never appear in the string values themselves, such as a line break or control character (even a null terminator). The reader can then read bytes until it encounters that sequence:
OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
String s;
s = username.getText();
writer.write(s, 0, s.length());
writer.write(10);
s = password.getText();
writer.write(s, 0, s.length());
writer.write(10);
inputStream = s.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8));
String username = reader.readLine();
String password = reader.readLine();
Alternatively:
OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
String s;
s = username.getText();
writer.write(s, 0, s.length());
writer.write(0);
s = password.getText();
writer.write(s, 0, s.length());
writer.write(0);
inputStream = s.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8));
StringBuilder sb = new StringBuilder();
int ch;
do
{
ch = reader.read();
if (ch <= 0) break;
sb.append((char)ch);
}
while (true);
String username = sb.toString();
sb.setLength(0);
do
{
ch = reader.read();
if (ch <= 0) break;
sb.append((char)ch);
}
while (true);
String password = sb.toString();
I would go for DataOutputStream for writing and DataInputStream for reading. With those, you can write an integer before each String giving the length of its encoded bytes, something like this:
DataOutputStream dos = new DataOutputStream(outputStream);
byte[] usernameBytes = username.getText().getBytes(StandardCharsets.UTF_8);
dos.writeInt(usernameBytes.length); // byte count, which can differ from String.length()
dos.write(usernameBytes);
byte[] passwordBytes = password.getText().getBytes(StandardCharsets.UTF_8);
dos.writeInt(passwordBytes.length);
dos.write(passwordBytes);
And then, on the server side:
DataInputStream dis = new DataInputStream(inputStream);
int length; // will be used to store the byte length of each text
length = dis.readInt(); // Read the length of the first text
byte[] usernameBuffer = new byte[length];
dis.readFully(usernameBuffer); // read() may return before the buffer is full
String username = new String(usernameBuffer, StandardCharsets.UTF_8);
// Now reading the other text
length = dis.readInt(); // Read the length of the second text
byte[] passwordBuffer = new byte[length];
dis.readFully(passwordBuffer);
String password = new String(passwordBuffer, StandardCharsets.UTF_8);
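One detail is easy to miss with length prefixes: the prefix has to count encoded bytes, not characters, and both sides have to use the same charset. The sketch below runs a round trip through an in-memory buffer instead of a socket. The class name, the helper names, and the sample values are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FramingDemo {
    // Writes a 4-byte length prefix followed by the UTF-8 bytes of the text.
    static void writeFrame(DataOutputStream out, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length); // byte count, not text.length()
        out.write(bytes);
    }

    // Reads one frame; readFully blocks until the whole payload has arrived.
    static String readFrame(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Sends two strings through an in-memory buffer and reads them back.
    static String[] roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeFrame(out, first);
            writeFrame(out, second);
            out.flush();
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            String a = readFrame(in);
            String b = readFrame(in);
            return new String[] {a, b};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "jos\u00e9" is 4 chars but 5 bytes in UTF-8, so a char-count prefix would be wrong.
        String[] result = roundTrip("jos\u00e9", "s3cret");
        System.out.println(result[0].length() + " chars, then: " + result[1]);
    }
}
```

With a character count as the prefix, the reader would stop one byte short on the first value and misread everything after it.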
I've been researching a Java problem with no success. I've read a whole bunch of similar questions here on StackOverflow, but the solutions just don't seem to work as expected.
I'm trying to read a binary file byte by byte.
I've used:
while ((data = inputStream.read()) != -1)
loops...
for (int i = 0; i < bFile.length; i++) {
loops...
But I only get empty or blank output. The actual content of the file I'm trying to read is as follows:
¬í sr assignment6.PetI¿Z8kyQŸ I ageD weightL namet Ljava/lang/String;xp > #4 t andysq ~ #bÀ t simbasq ~ #I t wolletjiesq ~
#$ t rakker
I'm merely trying to read it byte for byte and feed it to a character array with the following line:
char[] charArray = Character.toChars(byteValue);
byteValue here is an int holding the value of the byte being read.
What is going wrong where?
Since Java 7 there is no need to read byte by byte; there are two utility functions in Files:
Path path = Paths.get("C:/temp/test.txt");
// Load as binary:
byte[] bytes = Files.readAllBytes(path);
String asText = new String(bytes, StandardCharsets.ISO_8859_1);
// Load as text, with some Charset:
List<String> lines = Files.readAllLines(path, StandardCharsets.ISO_8859_1);
As you want to read binary data, one would use readAllBytes.
String and char are for text. Unlike in many other programming languages, text here means Unicode, so all scripts of the world may be combined. A char is 16 bits, as opposed to the 8-bit byte.
For pure ASCII, the 7 bit subset of Unicode / UTF-8, byte and char values are identical.
Then you might have done the following (low-quality code):
int fileLength = (int) Files.size(path);
char[] chars = new char[fileLength];
int i = 0;
int data;
while ((data = inputStream.read()) != -1) {
chars[i] = (char) data; // data actually being a byte
++i;
}
inputStream.close();
String text = new String(chars);
System.out.println(Arrays.toString(chars));
The problem you had probably concerned the unwieldy fixed-size arrays in Java, and the fact that a char[] is still not a String.
For binary usage, as you seem to be reading serialized data, you might like to dump the file:
int i = 0;
int data;
while ((data = inputStream.read()) != -1) {
char ch = 32 <= data && data < 127 ? (char) data : ' ';
System.out.printf("[%06d] %02x %c%n", i, data, ch);
++i;
}
Dumping file position, hex value and char value.
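The dump loop above can be tried without a file. The following sketch assumes a hand-written sample of six bytes: 0xAC 0xED 0x00 0x05 is the header of a Java serialization stream, which matches the "¬í" at the start of the file in the question, and the "sr" that follows is taken from the question's excerpt. The class and method names are illustrative.

```java
class HexDump {
    // One dump line: decimal offset, hex value, printable ASCII char (or a blank).
    static String dumpLine(int offset, int value) {
        char ch = (32 <= value && value < 127) ? (char) value : ' ';
        return String.format("[%06d] %02x %c", offset, value, ch);
    }

    // Dumps a whole byte array, one line per byte.
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            out.append(dumpLine(i, data[i] & 0xFF)).append('\n'); // mask to get 0..255
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05, 's', 'r'};
        System.out.print(dump(sample));
    }
}
```

Masking with 0xFF matters here because Java bytes are signed, so 0xAC would otherwise print as a negative value.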
Here is a simple example:
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
If you want to read text (characters), use Readers; if you want to read bytes, use Streams.
Why not use Apache Commons IO:
byte[] bytes = IOUtils.toByteArray(inputStream);
Then you can convert it to chars:
String str = new String(bytes);
char[] chars = str.toCharArray();
Or, as you did, one value at a time:
char[] charArray = Character.toChars(byteValue);
To deserialize objects:
List<Object> results = new ArrayList<Object>();
FileInputStream fis = new FileInputStream("your_file.dat");
ObjectInputStream ois = new ObjectInputStream(fis);
try {
while (true) {
results.add(ois.readObject());
}
} catch (EOFException e) {
// end of stream reached: every object has been read
} finally {
ois.close();
}
Edit:
Use file.length() for the array size, and make a byte array. Then call inputStream.read(b).
Edit again: if you want characters, use new InputStreamReader(new FileInputStream(file), charset); it decodes the bytes with the charset you pass in.
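The file in the question looks like a Java serialization stream, so reading objects back until the stream ends is probably the intended use. Below is a self-contained sketch of that loop. It serializes plain Strings into memory as a stand-in, since the Pet class from the question is not available here. The class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class ReadAllObjects {
    // Serializes the given values one after another into a byte array.
    static byte[] serialize(Object... values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            for (Object value : values) {
                oos.writeObject(value);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    // Reads objects until the stream is exhausted; EOFException marks the normal end.
    static List<Object> readAll(byte[] serialized) {
        List<Object> results = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            while (true) {
                results.add(ois.readObject());
            }
        } catch (EOFException e) {
            // no more objects in the stream
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(readAll(serialize("andy", "simba", "rakker")));
    }
}
```

For the real file, the class that was serialized (assignment6.Pet in the question) has to be on the classpath, otherwise readObject fails with ClassNotFoundException.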
I need some code that will allow me to read one page at a time from a UTF-8 file.
I've used the code;
File fileDir = new File("DIRECTORY OF FILE");
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(fileDir), "UTF8"));
String str;
while ((str = in.readLine()) != null) {
System.out.println(str);
}
in.close();
}
After surrounding it with a try catch block it runs but outputs the entire file!
Is there a way to amend this code to just display ONE PAGE of text at a time?
The file is in UTF-8 format, and after viewing it in Notepad++ I can see that it contains FF characters to denote the next page.
You will need to look for the form feed character by comparing to 0x0C.
For example:
int c = in.read(); // int, not char, so that -1 (end of stream) can be detected
while ( c != -1 ) {
if ( c == 0x0C ) {
// form feed
} else {
// handle displayable character
}
c = in.read();
}
EDIT added an example of using a Scanner, as suggested by Boris
Scanner s = new Scanner(new File("a.txt")).useDelimiter("\u000C");
while ( s.hasNext() ) {
String str = s.next();
System.out.println( str );
}
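The Scanner approach can be checked against an in-memory string before pointing it at the real file. This is a small sketch that splits text into pages on the form feed character. The class name, the method name, and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class PageSplitter {
    // Splits the text into pages, using the form feed (U+000C) as the delimiter.
    static List<String> pages(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter("\f")) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "page one\nsecond line\fpage two\fpage three";
        List<String> all = pages(text);
        System.out.println(all.size() + " pages, first page:");
        System.out.println(all.get(0));
    }
}
```

To show one page at a time from a file, the same loop could pause after each scanner.next() instead of collecting everything into a list.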
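If the file is valid UTF-8 and the pages are split by U+00FF, aka (char) 0xFF, aka "\u00FF", 'ÿ', then a buffered reader will do. If the separator is a raw byte 0xFF there is a problem, as that byte never occurs in valid UTF-8 and the decoder will not pass it through as U+00FF. (If the FF shown in Notepad++ is the form feed control character, search for '\f', U+000C, instead.)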
int soughtPageno = ...; // Counted from 0
int currentPageno = 0;
try (BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream(fileDir), StandardCharsets.UTF_8))) {
String str;
while ((str = in.readLine()) != null && currentPageno <= soughtPageno) {
for (int pos = str.indexOf('\u00FF'); pos >= 0; pos = str.indexOf('\u00FF')) {
if (currentPageno == soughtPageno) {
System.out.println(str.substring(0, pos));
++currentPageno;
break;
}
++currentPageno;
str = str.substring(pos + 1);
}
if (currentPageno == soughtPageno) {
System.out.println(str);
}
}
}
For a byte 0xFF (wrong, hacked UTF-8) use a wrapping InputStream between FileInputStream and the reader:
class PageInputStream extends InputStream {
    InputStream in;
    int pageno = 0;
    boolean eof = false;
    PageInputStream(InputStream in, int pageno) {
        this.in = in;
        this.pageno = pageno;
    }
    @Override
    public int read() throws IOException {
        if (eof) {
            return -1;
        }
        // Skip bytes until the requested page starts
        while (pageno > 0) {
            int c = in.read();
            if (c == 0xFF) {
                --pageno;
            } else if (c == -1) {
                eof = true;
                in.close();
                return -1;
            }
        }
        int c = in.read();
        if (c == 0xFF) {
            // The next separator ends the requested page
            c = -1;
            eof = true;
            in.close();
        }
        return c;
    }
}
Take this as an example, a bit more work is to be done.
You can use a Regex to detect form-feed (page break) characters. Try something like this:
File fileDir = new File("DIRECTORY OF FILE");
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(fileDir), "UTF8"));
String str;
Pattern pageBreak = Pattern.compile("(^.*)(\f)(.*$)");
while ((str = in.readLine()) != null) {
    Matcher match = pageBreak.matcher(str);
    boolean pageBreakFound = match.matches();
    if (pageBreakFound) {
        String textBeforePageBreak = match.group(1);
        // group(2) will contain the form feed character
        // group(3) will contain the text after the form feed character
        // Do whatever logic you want now that you know you hit a page boundary
    }
    System.out.println(str);
}
in.close();
The parentheses around portions of the regex denote capture groups, which get recorded in the Matcher object. The \f matches the form feed character.
Edited: the code uses Pattern and Matcher from java.util.regex. Here's the Regex documentation for Java: http://docs.oracle.com/javase/tutorial/essential/regex/
I have a file that I send from a server to a client through a socket. However, when I try to read the first 159 bytes in the client, the result is smaller than when the server reads the same amount from the original file. The lengths printed on both sides are the same, yet one output is almost 2/3 of the other! What could be the problem? I already applied replaceAll("(\\r|\\n|\\s)","") to strip any whitespace, but nothing changed.
Any suggestions?
Here is the code where I write the file:
FileOutputStream writer = new FileOutputStream("Splits.txt");
String output= null;
StringBuilder sb2 = new StringBuilder();
for (int i =0; i < MainClass.NUM_OF_SPLITS ; i++){
StringBuilder sb1 = new StringBuilder();
for (String s : MainClass.allSplits.get(i).blocks)
{sb2.append(s);}
sb1.append(sb2);}
output = sb2.toString().replaceAll("(\\r|\\n|\\s)", "");
writer.write(output.getBytes(Charset.forName("ISO-8859-1")));
writer.close();
And here where I read the file:
FileInputStream fis = new FileInputStream("Splits.txt");
InputStreamReader reader = new InputStreamReader(fis,Charset.forName("ISO-8859-1"));
for(int i = 0; i < splitsNum; i++) {
char[] buf = new char[159]; //param
int count = reader.read(buf);
String h=String.valueOf(buf, 0, count).replaceAll("(\\r|\\n||\\s)","");
System.out.println( h);
}
You need to loop until you've read all the data you need:
char[] buf = new char[159];
int charsRead = 0;
while (charsRead < buf.length) {
int count = reader.read(buf, charsRead, buf.length - charsRead);
if (count < 0) {
throw new EOFException();
}
charsRead += count;
}
// Right, now you know you've actually read 159 characters...
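The loop above is easiest to trust after watching it cope with short reads. The sketch below wraps it in a helper and feeds it from a deliberately stingy Reader that returns at most a few chars per call, standing in for a socket stream. The trickle reader and all names are hypothetical, introduced only for this demonstration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadFullyDemo {
    // Keeps calling read() until exactly `length` chars have arrived.
    static char[] readFully(Reader reader, int length) throws IOException {
        char[] buf = new char[length];
        int charsRead = 0;
        while (charsRead < length) {
            int count = reader.read(buf, charsRead, length - charsRead);
            if (count < 0) {
                throw new EOFException("wanted " + length + " chars, got " + charsRead);
            }
            charsRead += count;
        }
        return buf;
    }

    // Hypothetical reader that hands out at most `chunk` chars per call.
    static Reader trickle(String text, int chunk) {
        StringReader source = new StringReader(text);
        return new Reader() {
            @Override
            public int read(char[] cbuf, int off, int len) throws IOException {
                return source.read(cbuf, off, Math.min(len, chunk));
            }

            @Override
            public void close() {
                source.close();
            }
        };
    }

    // Convenience wrapper used by main: reads `length` chars through the trickle reader.
    static String readVia(String text, int chunk, int length) {
        try {
            return new String(readFully(trickle(text, chunk), length));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A single read() call would return only 3 chars here; the loop collects all 8.
        System.out.println(readVia("abcdefghij", 3, 8));
    }
}
```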
I have a .txt file consisting of 1's and 0's like so;
11111100000001010000000101110010
11111100000001100000000101110010
00000000101001100010000000100000
I would like to be able to read 8 (1's and 0's) and put each 'byte' into a byte array. So a line would be 4 bytes;
11111100 00000101 00000001 01110010 --> 4 bytes, line 1
11111100 00000110 00000001 01110010 --> 8 bytes, line 2
00000000 10100110 00100000 00100000 --> total 12 bytes, line 3
...
and so on.
I believe I need to store the data in a binary file but I'm not sure how to do this. Any help is greatly appreciated.
Edit 1:
I would like to put 8 1's and 0's (11111100, 00000101) into a byte and store in a byte array so 11111100 would be the first byte in the array, 00000101 the second and so on. I hope this is clearer.
Edit 2:
fileopen = new JFileChooser(System.getProperty("user.dir") + "/Example programs"); // open file from current directory
filter = new FileNameExtensionFilter(".txt", "txt");
fileopen.addChoosableFileFilter(filter);
if (fileopen.showOpenDialog(null)== JFileChooser.APPROVE_OPTION)
{
try
{
file = fileopen.getSelectedFile();
//create FileInputStream object
FileInputStream fin = new FileInputStream(file);
byte[] fileContent = new byte[(int)file.length()];
fin.read(fileContent);
for(int i = 0; i < fileContent.length; i++)
{
System.out.println("bit " + i + "= " + fileContent[i]);
}
//create string from byte array
String strFileContent = new String(fileContent);
System.out.println("File content : ");
System.out.println(strFileContent);
}
catch(FileNotFoundException e){}
catch(IOException e){}
}
Here's one way, with comments in the code:
import java.lang.*;
import java.io.*;
import java.util.*;
public class Mkt {
public static void main(String[] args) throws Exception {
BufferedReader br = new BufferedReader(new FileReader("in.txt"));
List<Byte> bytesList = new ArrayList<Byte>();
// Read line by line
for(String line = br.readLine(); line != null; line = br.readLine()) {
// 4 byte representations per line
for(int i = 0; i < 4; i++) {
// Get each of the 4 bytes (i.e. 8 characters representing the byte)
String part = line.substring(i * 8, (i + 1) * 8);
// Parse that into the binary representation
// Integer.parseInt is used as byte in Java is signed (-128 to 127)
byte currByte = (byte)Integer.parseInt(part, 2);
bytesList.add(currByte);
}
}
Byte[] byteArray = bytesList.toArray(new Byte[]{});
// Just print for test
for(byte currByte: byteArray) {
System.out.println(currByte);
}
}
}
Input is read from file named in.txt. Here's a sample run:
$ javac Mkt.java && java Mkt
-4
5
1
114
-4
6
1
114
0
-90
32
32
Hope this helps to get you started, you can tweak to your needs.
Use BufferedReader to read in the txt file.
BufferedReader in = new BufferedReader(...);
List<Byte> bytes = new ArrayList<Byte>();
StringBuilder buffer = new StringBuilder(); // collects up to 8 '0'/'1' characters
int c;
while ((c = in.read()) >= 0) {
    if (c == '1' || c == '0') buffer.append((char) c);
    if (buffer.length() == 8) {
        // parse the 8 collected digits as one base-2 value
        bytes.add((byte) Integer.parseInt(buffer.toString(), 2));
        buffer.setLength(0);
    }
}
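If the goal is a primitive byte[] that can go straight into a binary file, the same idea can be packaged as one method. This is a sketch under the assumption that anything other than '0' and '1' (line breaks, spaces) should simply be skipped. The class and method names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class BitParser {
    // Converts text made of '0'/'1' characters into bytes, 8 digits per byte.
    // Other characters are ignored; trailing digits that do not fill a byte are dropped.
    static byte[] parseBits(String text) {
        StringBuilder bits = new StringBuilder();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (char c : text.toCharArray()) {
            if (c != '0' && c != '1') {
                continue;
            }
            bits.append(c);
            if (bits.length() == 8) {
                out.write(Integer.parseInt(bits.toString(), 2)); // writes the low 8 bits
                bits.setLength(0);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = parseBits("11111100000001010000000101110010\n");
        System.out.println(Arrays.toString(bytes)); // [-4, 5, 1, 114]
    }
}
```

The resulting array could then be stored with Files.write(Paths.get("out.bin"), bytes), where out.bin is a placeholder file name.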