Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I need to develop a parser for a binary message exchange format, i.e., a message parser that parses a binary message into a Java object representation. What patterns would help me implement such a parser in the most flexible way? Could anybody describe this in a nutshell or point to resources to read?
Since you're trying to read binary data and transform it into Java objects, there are many approaches, but first things first: you must know the structure/protocol of your binary data.
The pattern I show below is the style that I would use for this scenario.
Make sure you have an input stream that delivers your binary data. If what you have is a byte array, wrap it in a ByteArrayInputStream.
In your object graph, each node/object should implement something like a parseIn(InputStream s) method.
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Date;

public class Parent extends ArrayList<Child> {
    int age;
    // ... more code here
    public void parseIn(InputStream is) throws IOException {
        // .. logic to read the stream into this instance.
        // DataInputStream does not buffer, so wrapping the same stream repeatedly loses no bytes.
        DataInputStream dis = new DataInputStream(is);
        this.age = dis.readInt();
        // .. if necessary
        Child c = new Child();
        c.parseIn(is); // pass the stream itself, not a parameter declaration
        this.add(c);
    }
    // ... more code here
}
public class Child {
    int height;
    short weight;
    Date birthdate;
    public void parseIn(InputStream is) throws IOException {
        // .. logic to read the stream into this instance.
        DataInputStream dis = new DataInputStream(is);
        height = dis.readInt();
        weight = dis.readShort();
        birthdate = new Date(dis.readLong());
    }
}
So, when you obtain your stream, you simply call:
InputStream stream = this.getInputStream();
Parent p = new Parent();
p.parseIn(stream);
And so on and so forth.
Sometimes you need to read a hint from the underlying stream before you know how to read on. One example is string data in a binary stream: either you keep reading byte by byte until you find a terminator byte (as in C-style 0-terminated strings), or the string length is given in the first byte and you then read a byte array of that length.
I hope you get the idea, and I hope it helps.
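The two string conventions mentioned above could be sketched roughly as follows. This is only an illustration: the class name StringFields, the one-byte length prefix, and the UTF-8 encoding are assumptions, not part of any particular protocol.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the name and the wire layout are assumptions for illustration.
final class StringFields {

    // Reads bytes up to (not including) a 0 terminator, C style.
    static String readNullTerminated(DataInputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != 0) {
            if (b == -1) {
                throw new EOFException("stream ended before the 0 terminator");
            }
            collected.write(b);
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    // Reads a one-byte unsigned length (0..255), then exactly that many bytes.
    static String readLengthPrefixed(DataInputStream in) throws IOException {
        int length = in.readUnsignedByte();
        byte[] raw = new byte[length];
        in.readFully(raw); // keeps reading until all bytes arrive, or throws EOFException
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

A length prefix is usually easier to handle than a terminator, because the reader knows up front how much to allocate and the payload may contain any byte value, including 0.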
Consider competitive programming, where I have to read 2*10^5 (or even more) numbers from the console. I use a BufferedReader or, for even faster input, a custom reader class that uses DataInputStream under the hood.
A quick Internet search gave me this.
We can use java.io for streaming smaller amounts of data, and java.nio for large amounts.
So I want to try console input with java.nio and test it against the performance of java.io.
Is it possible to read console input using java.nio?
Can I read data from System.in using java.nio?
Will it be faster than the input methods that I currently have?
Any relevant information will be appreciated.
Thanks ✌️
You can open a channel to stdin like this:
FileInputStream stdin = new FileInputStream(FileDescriptor.in);
FileChannel stdinChannel = stdin.getChannel();
When stdin has been redirected to a file, operations like querying the size, performing fast transfers to other channels and even memory mapping may work. But when the input is a real console or a pipe or you are reading character data, the performance is unlikely to differ significantly.
The performance depends on the way you read it, not the class you are using.
An example of code directly operating on a channel, to process white-space separated decimal numbers, is
CharsetDecoder cs = Charset.defaultCharset().newDecoder();
ByteBuffer bb = ByteBuffer.allocate(1024);
CharBuffer cb = CharBuffer.allocate(1024);
while(stdinChannel.read(bb) >= 0) {
    bb.flip();
    cs.decode(bb, cb, false);
    bb.compact();
    cb.flip();
    extractDoubles(cb);
    cb.compact();
}
bb.flip();
cs.decode(bb, cb, true);
if(cb.position() > 0) {
    cb.flip();
    extractDoubles(cb);
}
private static void extractDoubles(CharBuffer cb) {
    doubles: for(int p = cb.position(); p < cb.limit(); ) {
        while(p < cb.limit() && Character.isWhitespace(cb.get(p))) p++;
        cb.position(p);
        if(cb.hasRemaining()) {
            for(; p < cb.limit(); p++) {
                if(Character.isWhitespace(cb.get(p))) {
                    int oldLimit = cb.limit();
                    double d = Double.parseDouble(cb.limit(p).toString());
                    cb.limit(oldLimit);
                    processDouble(d);
                    continue doubles;
                }
            }
        }
    }
}
This is more complicated than using java.util.Scanner or a BufferedReader's readLine() followed by split("\\s"), but has the advantage of avoiding the complexity of the regex engine, as well as not creating String objects for the lines. When there is more than one number per line, or there are empty lines, i.e. the line strings would not match the number strings, this can save the copying overhead intrinsic to string construction.
This code still handles arbitrary charsets. When you know the expected charset and it is ASCII based, using a lightweight transformation instead of the CharsetDecoder, as shown in this answer, can gain an additional performance increase.
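As a rough sketch of what such a lightweight ASCII transformation might look like, the reader below parses whitespace-separated integers directly from bytes, without a CharsetDecoder or String objects. It is not the code from the linked answer: the class AsciiLongReader is hypothetical, it handles whole numbers rather than doubles, and it assumes ASCII input with values that fit in a long (overflow is not checked).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader; assumes ASCII input and values that fit in a long.
final class AsciiLongReader {
    private final InputStream in;
    private final byte[] buf = new byte[1 << 16];
    private int pos = 0;
    private int len = 0;

    AsciiLongReader(InputStream in) {
        this.in = in;
    }

    // Returns the next byte as 0..255, or -1 at end of input; refills the buffer in bulk.
    private int next() throws IOException {
        if (pos == len) {
            len = in.read(buf, 0, buf.length);
            pos = 0;
            if (len <= 0) {
                len = 0;
                return -1;
            }
        }
        return buf[pos++] & 0xFF;
    }

    // Parses the next whitespace-separated decimal integer; returns null at end of input.
    // The single byte that follows the digits is consumed as the delimiter.
    Long nextLong() throws IOException {
        int c = next();
        while (c == ' ' || c == '\n' || c == '\r' || c == '\t') {
            c = next();
        }
        if (c == -1) {
            return null;
        }
        boolean negative = (c == '-');
        if (negative) {
            c = next();
        }
        if (c < '0' || c > '9') {
            throw new NumberFormatException("expected a digit");
        }
        long value = 0;
        while (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            c = next();
        }
        return negative ? -value : value;
    }
}
```

It could be used as new AsciiLongReader(System.in), or on top of new FileInputStream(FileDescriptor.in). Whether it beats a BufferedReader in practice depends on the input and the JVM, so it is worth measuring.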
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
How can I compare two files of any type in Java?
I'm able to compare two text files, but I need to compare files of any type (like xls, doc, jpp, etc.).
I just need a Boolean result (for any type of file) telling whether the files are the same or not.
You can first compare the file lengths; if they match, compare the file contents byte by byte and return false as soon as a difference is found.
public static boolean sameContent(File f1, File f2) throws IOException {
    if (f1.length() != f2.length()) return false;
    // try-with-resources closes both streams, even if opening the second one fails
    try (FileInputStream fis1 = new FileInputStream(f1);
         FileInputStream fis2 = new FileInputStream(f2)) {
        int byte1;
        while ((byte1 = fis1.read()) != -1) {
            int byte2 = fis2.read();
            if (byte1 != byte2) return false;
        }
    }
    return true;
}
A note about MD5 comparison (suggested in the comments):
Comparing the MD5 hashes of the files is not fully reliable, because two different files can have the same MD5 hash (if you are unlucky).
Computing MD5 requires reading both files completely (plus running the hashing algorithm), so it is less efficient than a comparison that can stop at the first difference.
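Reading one byte at a time from an unbuffered FileInputStream can be slow for large files. A possible variant of the same idea compares the files block by block instead. This is a sketch under assumptions: the class name FileCompare and the 8192-byte block size are arbitrary choices.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; name and block size are arbitrary.
final class FileCompare {

    // Fills buf as far as possible; returns fewer than buf.length bytes only at end of stream.
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        return total;
    }

    static boolean sameContent(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false; // different lengths can never be equal
        }
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            byte[] bufA = new byte[8192];
            byte[] bufB = new byte[8192];
            while (true) {
                int nA = fill(inA, bufA);
                int nB = fill(inB, bufB);
                if (nA != nB) {
                    return false;
                }
                if (nA == 0) {
                    return true; // both streams ended without a difference
                }
                for (int i = 0; i < nA; i++) {
                    if (bufA[i] != bufB[i]) {
                        return false; // stop at the first differing byte
                    }
                }
            }
        }
    }
}
```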
This question already has answers here:
How do I convert a large binary String to byte array java?
(3 answers)
Closed 6 years ago.
I want to store some 0s and 1s into memory
I do not know how to explain this clearly but I will try my best to do so.
Let's say I have an IMAGE file of around 420 bytes.
I want to visualize its binary code meaning I want to see the 0s and 1s. I run this piece of code to do that and this works just fine...
import java.util.Scanner;
import java.io.BufferedInputStream;
import java.io.FileInputStream;

public class fileToBin {
    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        Scanner ana = new Scanner(System.in);
        System.out.println("File?");
        String fileName = ana.nextLine();
        try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(fileName))) {
            for (int b; (b = is.read()) != -1;) {
                String s = "0000000" + Integer.toBinaryString(b);
                s = s.substring(s.length() - 8);
                sb.append(s);
            }
        }
        System.out.println(sb);
    }
}
I send FF0000.png as input and got the following as output...
10001001010100000100111001000111000011010000101000011010000010100000000000000000000000000000110101001001010010000100010001010010000000000000000000000000100000000000000000000000000000001000000000001000000001100000000000000000000000001100001100111110011000011100101100000000000000000000000000000001011100110101001001000111010000100000000010101110110011100001110011101001000000000000000000000000000001000110011101000001010011010100000100000000000000001011000110001111000010111111110001100001000001010000000000000000000000000000100101110000010010000101100101110011000000000000000000001110110000110000000000000000000011101100001100000001110001110110111110101000011001000000000000000000000000010011100101001001010001000100000101010100011110000101111011101101110100100011000100000001000000000000000000001100110000111010000011111010001101111011110100001001000010010000011100001110110110000110110101000111100101110000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001000000101100010000001001000100000010011101000000100111000000000001110001
0000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010010001000000100111010000001001110000000000011100010000001011000100000010011001000010110110011110110101011011010001100001110111001011110010011001011111011001101010000000000000000000000000000000000100100101000101010011100100010010101110010000100110000010000010
I understand that this is the memory orientation (please correct me if I am wrong about any of these terms) of this particular file.
Now, let's say I do not have any image file and did not retrieve the binary code of any image file. The only thing I have is this sequence of 0s and 1s, and I do not know whether it actually represents a file or not. I have no idea what it represents.
I want to insert/load these 0s and 1s into computer memory. How can I do that?
This can be called the reverse process of my earlier action where I retrieved binary code from a file. Now, I want to insert some 0s and 1s into memory and save it as a file. That does not need to be an IMAGE file, any file extension can be okay. Because I assumed that I am not aware of the presence of any image file.
So, my main task is I have some 0s and 1s and I want to load it to memory and save as a file. Is it possible to do that? How can I do this with Java or any other programming language? How does this memory and binary representation work?
Sorry for my noobness and thank you for your patience :)
Given a String of binary digits called str and some sort of OutputStream (e.g. a FileOutputStream) called out:
For every 8 characters in str, get the byte's numerical value with Integer.parseInt, and write it to out.
String str = ...;
OutputStream out = ...;
for (int i = 0; i < str.length(); i += 8) {
    String byteStr = str.substring(i, i + 8);
    int byteVal = Integer.parseInt(byteStr, 2);
    out.write(byteVal);
}
Note that this will cause an IndexOutOfBoundsException if str.length() isn't a multiple of 8.
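To avoid that exception and to reject stray characters, the conversion could validate its input first. The following is a minimal sketch; the class BitsToFile and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the names are illustrative only.
final class BitsToFile {

    // Converts a string of '0' and '1' characters into bytes, 8 characters per byte.
    static byte[] toBytes(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 8, got " + bits.length());
        }
        byte[] out = new byte[bits.length() / 8];
        for (int i = 0; i < out.length; i++) {
            int value = 0;
            for (int j = 0; j < 8; j++) {
                char c = bits.charAt(i * 8 + j);
                if (c != '0' && c != '1') {
                    throw new IllegalArgumentException("not a binary digit: '" + c + "'");
                }
                value = (value << 1) | (c - '0'); // most significant bit first
            }
            out[i] = (byte) value;
        }
        return out;
    }

    // Writes the decoded bytes to a file; the extension does not matter to the file system.
    static void writeBits(String bits, Path target) throws IOException {
        Files.write(target, toBytes(bits));
    }
}
```

For example, BitsToFile.writeBits(bits, Paths.get("out.bin")) would recreate the original file byte for byte if bits is the complete output of the program in the question.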
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 7 years ago.
I hope the question is suitable for here. I've designed an application layer protocol, say similar to HTTP. How can I define, and set the header fields in Java? Overall, I just want to write a simple client-server program that transfers "Hello World" string, but using my own protocol.
Assume header fields are similar to the following. So the "Hello World!" data comes after this header.
When you write to a socket, you're writing a stream of bytes. It's common, as in the table you included in your question, to start that stream with a standard series of bytes that gives the information needed to make sense of the remaining stream.
For example, if you simply want to send a string, the minimum you'd need to add would be the string length, like this:
|message length|data|
Which could be written like this:
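String data = "Hello, world!";
// Encode once: the length prefix must count bytes, not characters.
byte[] dataBytes = data.getBytes(StandardCharsets.UTF_8);
ByteBuffer buffer = ByteBuffer.allocate(dataBytes.length + Integer.BYTES);
buffer.putInt(dataBytes.length);
buffer.put(dataBytes);
buffer.flip();
channel.write(buffer);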
Adding additional header information is no different; you just need to define a format for it in the stream.
You might, for example, use a format like this:
|message length|header count|header size|header type|header data|data|
Which could be written like this:
Map<Integer, String> headers = ...
String data = "Hello, world - look at my headers!";
byte[] dataBytes = data.getBytes(StandardCharsets.UTF_8);
int headerBuffersLength = 0;
List<ByteBuffer> headerBuffers = new ArrayList<>();
for (Map.Entry<Integer, String> header : headers.entrySet())
{
    byte[] headerBytes = header.getValue().getBytes(StandardCharsets.UTF_8);
    ByteBuffer headerBuffer = ByteBuffer.allocate(headerBytes.length + Integer.BYTES + Integer.BYTES);
    headerBuffer.putInt(headerBytes.length); // header size
    headerBuffer.putInt(header.getKey());    // header type
    headerBuffer.put(headerBytes);           // header data
    headerBuffer.flip();
    headerBuffers.add(headerBuffer);
    headerBuffersLength += headerBuffer.limit();
}
ByteBuffer buffer = ByteBuffer.allocate(dataBytes.length + headerBuffersLength + Integer.BYTES + Integer.BYTES);
buffer.putInt(dataBytes.length + headerBuffersLength); // message length
buffer.putInt(headerBuffers.size());                   // header count
for (ByteBuffer headerBuffer : headerBuffers)
{
    buffer.put(headerBuffer);
}
buffer.put(dataBytes);
buffer.flip();
channel.write(buffer);
Those are the basics. The code is simple to write, but you might want to look at Google Protocol Buffers if you're doing anything more complicated.
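The receiving side has to undo the same layout. As a minimal sketch for the simple |message length|data| format, assuming the whole message is already in one ByteBuffer (the class SimpleFrame and its methods are hypothetical names):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for the |message length|data| layout described above.
final class SimpleFrame {

    // Builds a buffer holding a 4-byte length followed by the UTF-8 bytes of data.
    static ByteBuffer encode(String data) {
        byte[] payload = data.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buffer.putInt(payload.length);
        buffer.put(payload);
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    // Reads the length, checks it against what is actually present, then decodes the payload.
    static String decode(ByteBuffer buffer) {
        int length = buffer.getInt();
        if (length < 0 || length > buffer.remaining()) {
            throw new IllegalArgumentException("declared length " + length
                    + " does not fit the " + buffer.remaining() + " bytes available");
        }
        byte[] payload = new byte[length];
        buffer.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }
}
```

Checking the declared length before allocating protects the receiver from a corrupt or hostile length field. On a real socket the message may arrive in several pieces, so the receiver would have to keep reading until the full length is available before calling decode.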
There are many ways. It is quite common for this kind of specification to come with an XSD. If that is the case, you can use JAXB2 to parse it and create a set of Java classes.
If that is not the case, maybe the specification is available in a form that lets you use text processing (grep, sed, and so on) to extract attributes and types and automate the construction of the Java classes.
If, finally, you have to build the Java classes on your own, I would do something like the following:
package my.pkg.ams; // "package" is a reserved word and cannot be part of a package name
public class ASMHeader {
    private Integer version = null;
    private Integer msgType = null;
    private Integer priority = null;
    // ...
    public String getVersionString() {
        return String.format("%02d", (version != null) ? version : 0);
    }
    public Integer getVersion() {
        return version;
    }
    public void setVersion(Integer version) {
        if (version >= 0 && version < 100) {
            this.version = version;
        }
    }
    public Integer getMsgType() {
        return msgType;
    }
    public void setMsgType(Integer msgType) {
        if (msgType >= 0 && msgType < 5) {
            this.msgType = msgType;
        }
    }
    // And so on
    // ....
}
Finally, the processing rules are not shown in your picture, but you will have to understand and implement them.
Consider throwing an exception when version, msgType, or other variables do not meet the rules expressed in the document.
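A setter that throws instead of silently ignoring an invalid value might look like the sketch below. The class name CheckedHeader is hypothetical, and the ranges (0 to 99 for version, 0 to 4 for msgType) are simply taken from the checks in the code above, not from any specification.

```java
// Hypothetical variant of the header class; ranges mirror the checks shown above.
final class CheckedHeader {
    private int version;
    private int msgType;

    void setVersion(int version) {
        if (version < 0 || version >= 100) {
            throw new IllegalArgumentException("version must be in 0..99, got " + version);
        }
        this.version = version;
    }

    void setMsgType(int msgType) {
        if (msgType < 0 || msgType >= 5) {
            throw new IllegalArgumentException("msgType must be in 0..4, got " + msgType);
        }
        this.msgType = msgType;
    }

    int getVersion() {
        return version;
    }

    int getMsgType() {
        return msgType;
    }

    // Two-digit, zero-padded representation of the version.
    String getVersionString() {
        return String.format("%02d", version);
    }
}
```

Failing loudly makes a malformed message visible at the point where it is parsed, rather than leaving a field unset and causing a confusing error later.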
Hope it helps!
I am having this very strange problem: I have a small program that reads bytes off a socket.
Whenever I am debugging, the program runs fine; but every time I run it normally (like straight up run it), I get an ArrayIndexOutOfBoundsException. What gives? Am I reading too fast for the socket? Am I missing something?
Here is the main():
public static void main(String[] args){
TParser p = new TParser();
p.init();
p.readPacket();
p.sendResponse();
p.readPacket();
p.sendResponse();
p.shutdown();
}
The method init is where I create the sockets for reading and writing.
The next method (readPacket) is where problems start to arise. I read the entire buffer into a private byte array so I can manipulate the data freely; for instance, depending on some bytes in the data I set some properties:
public void readPacket(){
System.out.println("readPacket");
readInternalPacket();
setPacketInfo();
}
private void readInternalPacket(){
System.out.println("readInternalPacket");
try {
int available=dataIN.available();
packet= new byte[available];
dataIN.read(packet,0,available);
dataPacketSize=available;
}
catch (Exception e) {
e.printStackTrace();
}
}
private void setPacketInfo() {
System.out.println("setPacketInfo");
System.out.println("packetLen: " +dataPacketSize);
byte[] pkt= new byte[2];
pkt[0]= packet[0];
pkt[1]= packet[1];
String type= toHex(pkt);
System.out.println("packet type: "+type);
if(type.equalsIgnoreCase("000F")){
recordCount=0;
packetIterator=0;
packetType=Constants.PacketType.ACKPacket;
readIMEI();
validateDevice();
}
}
The line where it breaks is the line
pkt[1]= packet[1]; (setPacketInfo)
meaning it only has 1 byte at that time... but how can that be, if it runs perfectly when I debug it? Is there some sanity check I must do on the socket? (dataIN is of type DataInputStream.)
Should I put the methods on separate threads? I've gone over this again and again, and even replaced my memory modules (when I started having weird ideas about this).
...please help me.
I don't know the surrounding code, especially the class of dataIN, but I think your code does this:
int available=dataIN.available(); does not wait for data at all; it just reports that there are 0 bytes available,
so your array is of size 0 and you then do:
pkt[0]= packet[0]; pkt[1]= packet[1]; which is out of bounds.
I would recommend that you at least loop until available() returns the 2 you expect, but I cannot be sure that this is the correct (*) or right (**) way to do it, because I don't know dataIN's class implementation.
Notes: (*) it is not correct if it is possible for the 2 bytes to arrive separately. (**) it is not the right way to do it if dataIN itself provides methods that wait.
Can it be that reading the data from the socket is an asynchronous process and the setPacketInfo() is called before your packet[] is completely filled? If this is the case, it's possible it runs great when debugging, but terrible when it really uses sockets on different machines.
You can add some code to the setPacketInfo() method to check the length of the packet[] variable.
byte[] pkt= new byte[packet.length];
for(int x = 0; x < packet.length; x++)
{
pkt[x]= packet[x];
}
Not really sure, though, why you even copy the packet[] variable into pkt[].
You are using a packet-oriented protocol on a stream-oriented layer without transmitting the real packet length. Because of fragmentation, the size of the received data can be smaller than the packet you sent.
Therefore I strongly recommend sending the data packet size before sending the actual packet. On the receiver side you can use a DataInputStream and a blocking read to detect an incoming packet:
private void readInternalPacket() {
    System.out.println("readInternalPacket");
    try {
        int packetSize = dataIN.readInt();
        packet = new byte[packetSize];
        // readFully blocks until the whole packet has arrived; a plain read() may return fewer bytes
        dataIN.readFully(packet, 0, packetSize);
        dataPacketSize = packetSize;
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Of course you have to modify the sender side as well, so that it sends the packet size before the packet.
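Both sides of such a length-prefixed exchange could be sketched as follows. The class PacketFraming, its method names, and the 1 MiB size limit are assumptions made for illustration; only writeInt, readInt, and readFully are the actual java.io calls the approach relies on.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper; the size limit is an arbitrary safety bound.
final class PacketFraming {
    static final int MAX_PACKET_SIZE = 1 << 20;

    // Sender: 4-byte big-endian length, then the packet bytes.
    static void writePacket(DataOutputStream out, byte[] packet) throws IOException {
        out.writeInt(packet.length);
        out.write(packet);
        out.flush();
    }

    // Receiver: blocks until the length and then the complete packet have arrived.
    static byte[] readPacket(DataInputStream in) throws IOException {
        int size = in.readInt();
        if (size < 0 || size > MAX_PACKET_SIZE) {
            throw new IOException("implausible packet size: " + size);
        }
        byte[] packet = new byte[size];
        in.readFully(packet); // throws EOFException if the stream ends early
        return packet;
    }
}
```

Because readFully keeps reading until the array is full, it does not matter whether the network delivers the packet in one piece or in several fragments, which is exactly the difference between the debugger run and the normal run in the question.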
To add to the answer from @eznme: you need to read from your underlying stream until there is no more pending data. This may require one or more reads. Note that available() returning 0 only means that nothing is buffered right now; the end of the stream is indicated by read() returning -1. I would recommend using Apache IOUtils to 'copy' the input stream to a ByteArrayOutputStream, then getting the byte[] array from that.
In your setPacketInfo method you should do a check on your data buffer length before getting your protocol header bytes:
byte[] pkt= new byte[2];
if((packet != null) && (packet.length >= 2)) {
pkt[0]= packet[0];
pkt[1]= packet[1];
// ...
}
That will get rid of the out of bound exceptions you are getting when you read zero-length data buffers from your protocol.
You should never rely on dataIN.available(). Also, dataIN.read(packet,0,available) returns an integer that says how many bytes you actually received. That is not always the same value as what available() reported, and it can also be less than the size of the buffer.
This is how you should read:
byte[] packet = new byte[1024];
dataPacketSize = dataIN.read(packet,0,packet.length);
You should also wrap your DataInputStream in a BufferedInputStream, and take care of the case where you get less than 2 bytes, so that you don't try to process bytes that you haven't received.