CRC32 (Ethernet) Calculation vs .ZIP/.PNG in Java?

I can only calculate the CRC32 values of .ZIP/.PNG strings, but not Ethernet-related ones. The Java CRC32 class only seems to allow for one type of calculation.
String str = textField.getText();
Checksum checksum = new CRC32();
byte[] bytes = null;
try {
    bytes = str.getBytes("ASCII");
} catch (UnsupportedEncodingException ex) {
    Logger.getLogger(GUI.class.getName()).log(Level.SEVERE, null, ex);
}
checksum.update(bytes, 0, bytes.length);
long lngChecksum = checksum.getValue();
crc32bField.setText(Long.toHexString(lngChecksum));
This is the code I've written to calculate my CRC. Could anyone help me get the same values as the ones calculated on this website?
http://hash.online-convert.com/crc32-generator
Just as an example,
"hello world" =
7813f744 (website)
D4A1185 (My Code)
Thanks :)

It seems that the algorithm used on the website you provided is CRC32, whereas the one you are using is CRC32B. They are different algorithms, which is why you are getting different values.
Try the CRC32B algorithm on the same website; its result is in line with what the Checksum class is giving you.
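If you need to reproduce the website's "CRC32" value from Java, one option is to compute that variant yourself. The sketch below is my own illustration, not part of the answer above: it implements the non-reflected CRC-32/BZIP2 variant (polynomial 0x04C11DB7, init and final XOR 0xFFFFFFFF, no bit reflection), which is what PHP's hash('crc32') computes and what some online generators label plain "CRC32". If the website uses that variant, the output for "hello world" should match its value.
import java.nio.charset.StandardCharsets;

public class Crc32Bzip2 {
    // Bit-by-bit CRC-32/BZIP2: poly 0x04C11DB7, init 0xFFFFFFFF,
    // no input/output reflection, final XOR 0xFFFFFFFF.
    static long crc32Bzip2(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 24;  // feed the next byte into the top bits
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x80000000) != 0 ? (crc << 1) ^ 0x04C11DB7 : crc << 1;
            }
        }
        return (~crc) & 0xFFFFFFFFL;  // final XOR, widened to an unsigned value
    }

    public static void main(String[] args) {
        byte[] input = "hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32Bzip2(input)));
    }
}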

Related

What is the fastest way to load the MD5 of a file?

I want to load the MD5 of many different files. I am following this answer to do that, but the main problem is that the time taken to load the MD5 of the files (there may be hundreds of them) is far too long.
Is there any way to find the MD5 of a file without consuming much time?
Note: the files may be large (up to 300 MB).
This is the code which I am using -
import java.io.*;
import java.security.MessageDigest;

public class MD5Checksum {

    public static byte[] createChecksum(String filename) throws Exception {
        InputStream fis = new FileInputStream(filename);
        byte[] buffer = new byte[1024];
        MessageDigest complete = MessageDigest.getInstance("MD5");
        int numRead;
        do {
            numRead = fis.read(buffer);
            if (numRead > 0) {
                complete.update(buffer, 0, numRead);
            }
        } while (numRead != -1);
        fis.close();
        return complete.digest();
    }

    // see this How-to for a faster way to convert
    // a byte array to a HEX string
    public static String getMD5Checksum(String filename) throws Exception {
        byte[] b = createChecksum(filename);
        String result = "";
        for (int i = 0; i < b.length; i++) {
            result += Integer.toString((b[i] & 0xff) + 0x100, 16).substring(1);
        }
        return result;
    }

    public static void main(String args[]) {
        try {
            System.out.println(getMD5Checksum("apache-tomcat-5.5.17.exe"));
            // output :
            // 0bb2827c5eacf570b6064e24e0e6653b
            // ref :
            // http://www.apache.org/dist/
            // tomcat/tomcat-5/v5.5.17/bin
            // /apache-tomcat-5.5.17.exe.MD5
            // 0bb2827c5eacf570b6064e24e0e6653b *apache-tomcat-5.5.17.exe
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
You cannot use hashes to determine any similarity of content.
For instance, generating the MD5 of hellostackoverflow1 and hellostackoverflow2 calculates two hashes where none of the characters of the string representation match (7c35[...]85fa vs b283[...]3d19). That's because a hash is calculated based on the binary data of the file, thus two different formats of the same thing - e.g. .txt and a .docx of the same text - have different hashes.
But as already noted, some speed might be gained by using native code, i.e. the NDK. Additionally, if you still want to compare files for exact matches, first compare their sizes in bytes; only then use a hashing algorithm with enough speed and a low risk of collisions. As stated, CRC32 is fine.
Hash/CRC calculation takes some time, as the file has to be read completely.
The createChecksum code you presented is nearly optimal. The only part that can be tweaked is the read buffer size (I would use a buffer of 2048 bytes or larger), but that may get you at most a 1-2% speed improvement.
If this is still too slow, the only option left is to implement the hashing in C/C++ and call it as a native method. Beyond that there is nothing you can do.
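For what it's worth, here is a minimal sketch of that buffer-size tweak; the 64 KB buffer, the class name and the use of DigestInputStream are my own choices, not part of the answer above:
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Sketch {
    public static byte[] md5Of(String filename) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // DigestInputStream updates the digest as a side effect of read(),
        // and try-with-resources closes the file even if an exception is thrown.
        try (InputStream in = new DigestInputStream(
                new BufferedInputStream(new FileInputStream(filename), 64 * 1024), md)) {
            byte[] buffer = new byte[64 * 1024];  // larger reads than the 1 KB buffer above
            while (in.read(buffer) != -1) {
                // nothing to do here; the stream feeds the digest
            }
        }
        return md.digest();
    }
}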

Consume a c# base64 encoded file in java

I want to transfer a file from C# to a Java web service which accepts Base64 strings. The problem is that when I encode the file using the C# Convert class, it produces a string based on a little-endian unsigned byte[].
In Java, byte[] is signed / big-endian. When I decode the delivered string, I get a different byte[] and therefore the file is corrupt.
How can I encode a byte[] in C# to a Base64 string that is equal to the byte[] decoded in Java from the same string?
C# side:
byte[] attachment = File.ReadAllBytes(@"c:\temp\test.pdf");
String attachmentBase64 = Convert.ToBase64String(attachment, Base64FormattingOptions.None);
Java side:
@POST
@Path("/localpdf")
@Consumes("text/xml")
@Produces("text/xml")
public String getExtractedDataFromEncodedPdf(@FormParam("file") String base64String) {
    if (base64String == null) return null;
    byte[] data = Base64.decodeBase64(base64String.getBytes());
    FileOutputStream ms;
    try {
        ms = new FileOutputStream(new File("C:\\Temp\\test1234.pdf"));
        ms.write(data);
        ms.close();
    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
File test1234.pdf is corrupt.
"Signed" and "big-endian" are very different things, and I believe you're confusing yourself.
Yes, bytes are signed in Java and unsigned in C# - but I strongly suspect you're getting the same actual bits in both cases... it's just that a bit pattern of (say) 11111111 represents 255 in C# and -1 in Java. Unless you're viewing the bytes as numbers (which is rarely useful) it won't matter - it certainly doesn't matter if you just use the bytes to write out a file on the Java side.
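A tiny sketch (my own illustration, using java.util.Base64 rather than the Commons class from the question) of why the signedness difference does not corrupt anything: the decoded bit patterns are identical, only their numeric interpretation differs.
import java.util.Base64;

public class SignednessDemo {
    public static void main(String[] args) {
        // "/w==" is the Base64 encoding of a single 0xFF byte.
        byte[] decoded = Base64.getDecoder().decode("/w==");
        System.out.println(decoded[0]);         // prints -1: Java's signed view of the byte
        System.out.println(decoded[0] & 0xFF);  // prints 255: the same bit pattern, read as unsigned
        // Writing 'decoded' to a file reproduces the original bytes exactly.
    }
}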

Decompressing PHP's gzcompress in Java

I'm trying to decompress a json object in Java that was initially compressed in PHP. Here's how it gets compressed into PHP:
function zip_json_encode(&$arr) {
    $uncompressed = json_encode($arr);
    return pack('L', strlen($uncompressed)) . gzcompress($uncompressed);
}
and decoded (again in PHP):
function unzip_json_decode(&$data) {
    $uncompressed = @gzuncompress(substr($data, 4));
    return json_decode($uncompressed, $array_instead_of_object);
}
That gets put into MySQL and now it must be pulled out of the db by Java. We pull it out from the ResultSet like this:
String field = rs.getString("field");
I then pass that string to a method to decompress it. This is where it falls apart.
private String decompressHistory(String historyString) throws SQLException {
    StringBuffer buffer = new StringBuffer();
    try {
        byte[] historyBytes = historyString.substring(4).getBytes();
        ByteArrayInputStream bin = new ByteArrayInputStream(historyBytes);
        InflaterInputStream in = new InflaterInputStream(bin, new Inflater(true));
        int len;
        byte[] buf = new byte[1024];
        while ((len = in.read(buf)) != -1) {
            // buf should be decoded, right?
        }
    } catch (IOException e) {
        e.getStackTrace();
    }
    return buffer.toString();
}
Not quite sure what's going wrong here, but any pointers would be appreciated!
You need to get rid of the true in Inflater(true). Use just Inflater(). The true makes it expect raw deflate data. Without the true, it is expecting zlib-wrapped deflate data. PHP's gzcompress() produces zlib-wrapped deflate data.
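A minimal sketch of that fix (my own code, not Mark's): it assumes the compressed bytes are already available as a byte[] with PHP's 4-byte pack('L', ...) length prefix stripped, and that the original JSON was UTF-8.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.InflaterInputStream;

public class ZlibDecode {
    static String inflateZlib(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The default Inflater() expects zlib-wrapped data, which is what gzcompress() produces.
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);  // accumulate the inflated bytes
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}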
The compressed data is binary, a byte[]. Going through String (Unicode text) not only needs a conversion, it is faulty.
For instance this involves a conversion:
byte[] historyBytes = historyString.substring(4).getBytes();
byte[] historyBytes = historyString.substring(4).getBytes("ISO-8859-1");
The first version uses the default platform encoding, making the application non-portable.
The first thing to do is to store the data in the database as binary: VARBINARY or BLOB.
InputStream field = rs.getBinaryStream("field");
try (InputStream in = new GZIPInputStream(field)) {
    ...
}
Or so. Mind the other answer.
In the end, neither of the above solutions worked for us, but both have merit. When we pulled the data out of MySQL and cast it to bytes, we had a number of missing character bytes (67), which made it impossible to decompress on the Java side. As for the answers above: Mark is correct that gzcompress() uses zlib, and therefore you should use the Inflater() class in Java.
Joop is correct that the data conversion is faulty. Our table was too large to convert to VARBINARY or BLOB; that may have solved the problem, but it didn't work for us. We ended up having Java make a request to our PHP app and simply unpacking the compressed data on the PHP side. This worked well. Hopefully this is helpful to anyone else who stumbles across it.

Invalid info_hash (Java BitTorrent client)

According to the specification: http://wiki.theory.org/BitTorrentSpecification
info_hash: urlencoded 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value will be a bencoded dictionary, given the definition of the info key above.
torrentMap is my dictionary; I get the info key (which is another dictionary), calculate the hash and URL-encode it.
But I always get an "invalid info_hash" message when I try to send it to the tracker.
This is my code:
public String GetInfo_hash() {
    String info_hash = "";
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutput out = null;
    try {
        out = new ObjectOutputStream(bos);
        out.writeObject(torrentMap.get("info"));
        byte[] bytes = bos.toByteArray(); // Map => byte[]
        MessageDigest md = MessageDigest.getInstance("SHA1");
        info_hash = urlencode(md.digest(bytes)); // Hashing and URLEncoding
        out.close();
        bos.close();
    } catch (Exception ex) { }
    return info_hash;
}

private String urlencode(byte[] bs) {
    StringBuffer sb = new StringBuffer(bs.length * 3);
    for (int i = 0; i < bs.length; i++) {
        int c = bs[i] & 0xFF;
        sb.append('%');
        if (c < 16) {
            sb.append('0');
        }
        sb.append(Integer.toHexString(c));
    }
    return sb.toString();
}
This is almost certainly the problem:
out = new ObjectOutputStream(bos);
out.writeObject(torrentMap.get("info"));
What you're going to be hashing is the Java binary serialization format of the value of torrentMap.get("info"). I find it very hard to believe that all BitTorrent programs are meant to know about that.
It's not immediately clear to me from the specification what the value of the "info" key is meant to be, but you need to work out some other way of turning it into a byte array. If it's a string, I'd expect some well-specified encoding (e.g. UTF-8). If it's already binary data, then use that byte array directly.
EDIT: Actually, it sounds like the value will be a "bencoded dictionary" as per your quote, which looks like it will be a string. Quite how you're meant to encode that string (which sounds like it may include values which aren't in ASCII, for example) before hashing it is up for grabs. If your sample strings are all ASCII, then using "ASCII" and "UTF-8" as the encoding names for String.getBytes(...) will give the same result anyway, of course...
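As a hedged sketch of that alternative: keep the exact bencoded byte span of the "info" dictionary as it appears in the .torrent file (how you obtain it depends on your bencode parser; infoBytes below is a hypothetical name) and SHA-1 those bytes directly instead of Java-serializing the Map. The urlencode method from the question can then be applied to the 20 bytes this returns.
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class InfoHash {
    // 'infoBytes' is assumed to be the raw bencoded "info" dictionary,
    // copied byte-for-byte from the .torrent file (not a re-serialized Map).
    static byte[] infoHash(byte[] infoBytes) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-1").digest(infoBytes);
    }
}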

HTML5 Websocket Server handshake (v.76) (Java)

I'm trying to build a Java-based HTML5 websocket server (v76) and have problems with the handshake. There are a few opensource Java solutions that supposedly support v76 but none of them seem to work.
I am certain my handshake response is correct (at least in calculating the two keys' responses). My question: is Java big-endian by default? Since the handshake answer is the concatenation of the two key answers plus the response bytes, I'm having to do multiple type conversions (string to int, concatenating two ints into a string, then converting to bytes and concatenating with the response bytes, then MD5 hashing). Is there something in particular I need to be looking out for? My response always seems accurate in Wireshark (number of bytes), but since the clients have no debug information it's hard to tell why my handshakes are failing.
Any supporting answers or working code would be EXTREMELY valuable to me.
Hey, this is a working example of the handshake producer for websockets version 76. If you use the example from the spec (http://tools.ietf.org/pdf/draft-hixie-thewebsocketprotocol-76.pdf) and print the output as a String, it produces the correct answer.
public byte[] getHandshake(String firstKey, String secondKey, byte[] last8)
{
    byte[] toReturn = null;
    // Strip out numbers
    int firstNum = Integer.parseInt(firstKey.replaceAll("\\D", ""));
    int secondNum = Integer.parseInt(secondKey.replaceAll("\\D", ""));
    // Count spaces
    int firstDiv = firstKey.replaceAll("\\S", "").length();
    int secondDiv = secondKey.replaceAll("\\S", "").length();
    // Do the division
    int firstShake = firstNum / firstDiv;
    int secondShake = secondNum / secondDiv;
    // Prepare 128 bit byte array
    byte[] toMD5 = new byte[16];
    byte[] firstByte = ByteBuffer.allocate(4).putInt(firstShake).array();
    byte[] secondByte = ByteBuffer.allocate(4).putInt(secondShake).array();
    // Copy the bytes of the numbers you made into your md5 byte array
    System.arraycopy(firstByte, 0, toMD5, 0, 4);
    System.arraycopy(secondByte, 0, toMD5, 4, 4);
    System.arraycopy(last8, 0, toMD5, 8, 8);
    try
    {
        // MD5 everything together
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        toReturn = md5.digest(toMD5);
    }
    catch (NoSuchAlgorithmException e)
    {
        e.printStackTrace();
    }
    return toReturn;
}
I wrote this, so feel free to use it wherever.
EDIT: Some other problems I ran into - You MUST write the 'answer' to the handshake as bytes. If you try to write it back to the stream as a String it will fail (must be something to do with char conversion?). Also, make sure you're writing the rest of the response to the handshake exactly as it shows in the spec.
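To illustrate that EDIT, here is a sketch of sending the response (the socket, headers and helper names are assumptions, not part of the answer): the header block goes out as ASCII text, but the 16 MD5 bytes must be written raw, never converted to a String.
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HandshakeWriter {
    static void sendHandshake(Socket socket, String headers, byte[] challengeResponse) throws Exception {
        OutputStream out = socket.getOutputStream();
        out.write(headers.getBytes(StandardCharsets.US_ASCII));  // status line + headers + blank line
        out.write(challengeResponse);                            // the 16 MD5 bytes, written raw
        out.flush();
    }
}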
Jetty 7 supports web sockets, and is open source. You might find inspiration (but I would suggest you just embed Jetty in your application and be done with it).
http://blogs.webtide.com/gregw/entry/jetty_websocket_server
You can try my implementation:
https://github.com/TooTallNate/Java-WebSocket
It supports draft 75 and 76 currently. Verified with current versions of Chrome and Safari. Good luck!
