The result I'm getting is that files of the same type return the same MD5 hash value. For example, two different JPGs give me the same result. However, a JPG and an APK give different results.
Here is my code...
public static String checkHashURL(String input) {
try {
MessageDigest md = MessageDigest.getInstance("MD5");
InputStream is = new URL(input).openStream();
try {
is = new DigestInputStream(is, md);
int b;
while ((b = is.read()) > 0) {
;
}
} finally {
is.close();
}
byte[] digest = md.digest();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < digest.length; i++) {
sb.append(
Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(
1));
}
return sb.toString();
} catch (Exception ex) {
throw new RuntimeException(ex);
}
}
This is broken:
while ((b = is.read()) > 0)
Your code stops at the first byte in the stream whose value is 0. If two files have identical content up to their first 0 byte, they produce the same hash. If you really want to call the byte-at-a-time version of read, you want:
while (is.read() != -1) {}
The parameterless InputStream.read() method returns -1 when it reaches the end of the stream.
(There's no need to assign a value to b, as you're not using it.)
Better would be to read a buffer at a time:
byte[] ignoredBuffer = new byte[8 * 1024]; // Up to 8K per read
while (is.read(ignoredBuffer) > 0) {}
This time the condition is valid, because InputStream.read(byte[]) would only ever return 0 if you pass in an empty buffer. Otherwise, it will try to read at least one byte, returning the length of data read or -1 if the end of the stream has been reached.
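Putting the pieces above together, a minimal sketch of the corrected approach might look like this. The class name StreamDigest and the method name md5Hex are made up for illustration, and a ByteArrayInputStream stands in for the URL stream so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the original question's code.
final class StreamDigest {

    /** Reads the stream to its end and returns the MD5 digest as lowercase hex. */
    static String md5Hex(InputStream source) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
        try (InputStream in = new DigestInputStream(source, md)) {
            byte[] ignoredBuffer = new byte[8 * 1024]; // up to 8K per read
            while (in.read(ignoredBuffer) != -1) {
                // Nothing to do: DigestInputStream updates md as bytes pass through.
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The two inputs only differ after a 0 byte, which the original loop never reached.
        byte[] first = {1, 0, 2};
        byte[] second = {1, 0, 3};
        System.out.println(md5Hex(new ByteArrayInputStream(first)));
        System.out.println(md5Hex(new ByteArrayInputStream(second)));
    }
}
```

For the original use case you would pass new URL(input).openStream() instead of the in-memory stream.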
I am trying to establish communication between an Arduino and Android over UART. While reading the buffer on the Android side, I am not getting the data in the chunks I expect.
if (uartDevice != null) {
    // Loop until there is no more data in the RX buffer.
    try {
        byte[] buffer = new byte[CHUNK_SIZE];
        int read;
        while ((read = uartDevice.read(buffer, buffer.length)) > 0) {
            data = new String(buffer, StandardCharsets.UTF_8).substring(0, read);
            System.out.println(String.format("%020x", new BigInteger(1, data.getBytes(/*YOUR_CHARSET?*/))));
        }
    } catch (IOException e) {
        Log.w(TAG, "Unable to transfer data over UART", e);
    }
}
Expected output is:
2a3619010101001a0708403031301010011214084030313010100112140845
Instead I am receiving:
2a361a010101001a070840303130101001121408403031
8403031301010011214084030313010100112140845
3031301010011214084030313010100112140845
If you want to write code that only prints the bytes that you get I would try the following:
if (uartDevice != null) {
// Loop until there is no more data in the RX buffer.
try {
byte[] buffer = new byte[CHUNK_SIZE];
int read;
while ((read = uartDevice.read(buffer, buffer.length)) > 0) {
for (int i = 0; i < read; i++) {
System.out.printf("%02x", buffer[i]);
}
}
} catch (IOException e) {
Log.w(TAG, "Unable to transfer data over UART", e);
}
System.out.println(); // Adds a newline after all bytes
}
The following is a method that takes a UartDevice as a parameter, reads from it until the end and returns a single byte array with the whole content. You don't need to guess at a buffer size large enough to hold everything: the returned array is exactly as big as it needs to be, and only a small read buffer is used for performance. Errors are simply propagated to the caller.
This assumes that the data fits into memory.
byte[] readFromDevice(UartDevice uartDevice) throws IOException {
    byte[] buffer = new byte[CHUNK_SIZE];
    int read;
    ByteArrayOutputStream data = new ByteArrayOutputStream();
    while ((read = uartDevice.read(buffer, buffer.length)) > 0) {
        data.write(buffer, 0, read);
    }
    return data.toByteArray();
}
The method returns when all data has been read and you can process the returned array at your leisure.
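As a rough, self-contained illustration of the same idea, the sketch below accumulates chunks first and converts to hex only once at the end, working on the raw bytes rather than going through a String. A plain InputStream stands in for the UartDevice here, which is an assumption made so the example runs anywhere; ChunkAssembler, readAll and toHex are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; an InputStream stands in for the UART device.
final class ChunkAssembler {

    /** Collects every chunk the source delivers into one byte array. */
    static byte[] readAll(InputStream in, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        int read;
        while ((read = in.read(chunk, 0, chunk.length)) > 0) {
            all.write(chunk, 0, read); // copy only the bytes actually read
        }
        return all.toByteArray();
    }

    /** Formats raw bytes as hex without a lossy String round trip. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] message = {0x2a, 0x36, 0x19, 0x01, (byte) 0x84, 0x45};
        // A chunk size of 4 forces the message to arrive in two pieces.
        byte[] received = readAll(new ByteArrayInputStream(message), 4);
        System.out.println(toHex(received));
    }
}
```

Decoding the buffer as UTF-8 and re-encoding it, as in the question, can alter bytes that are not valid UTF-8 (0x84 on its own, for instance), so hex formatting is done directly on the byte values.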
Here is a section of code in my Java program (in the main thread) :
int bufSize = 4096; // 4k buffer.
byte[] buffer = new byte[bufSize];
int bytesAvailable, bytesRead;
try {
while ( true ) {
bytesAvailable = System.in.available(); // seems may throw IOException if InputStream is closed.
if ( bytesAvailable > 0 ) {
if ( bytesAvailable > bufSize ) {
bytesAvailable = bufSize;
}
// The following statement seems should not block.
bytesRead = System.in.read(buffer, 0, bytesAvailable);
if ( bytesRead == -1 ) {
myOutputStream.close();
System.out.println("NonBlockingReadTest : return by EOF.");
return;
}
myOutputStream.write(buffer, 0, bytesRead);
myOutputStream.flush();
} else if ( bytesAvailable == 0 ) {
// Nothing to do, just loop over.
} else {
// Error ! Should not be here !
}
}
} catch (IOException e) {
.....
The program runs OK. It reads some input from stdin (in a non-blocking way) and then writes it to a file. I have used System.in.available() so that the System.in.read(buffer, 0, bytesAvailable) statement does not block.
However, I cannot terminate the program by typing Ctrl-D on the keyboard. I can only terminate it with Ctrl-C, which I don't think is a graceful way. It seems that the bytesRead == -1 condition is never true.
Is there any modification I could make so that the program can be terminated with Ctrl-D? Any ideas are welcome, thanks.
Based on the question "Read Input until control+d":
You can use ByteBuffer to read ints from your buffer.
Then you can compare each int to 4. If it matches, it is a CTRL+D.
If you look at Aleadam's response there, you can see how to check for CTRL+D:
public class InputTest {
public static void main(String[] args) throws IOException {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
StringBuilder out = new StringBuilder();
while (true) {
try {
int c = in.read();
if (c == -1 || c == 4) // -1 is end of stream; ASCII 4 is EOT (end of transmission), ctrl D. I may be wrong here
break;
out.append((char) c); // cast so the character, not its numeric code, is appended
} catch (IOException e) {
System.err.println ("Error reading input");
}
}
System.out.println(out.toString());
}
}
If you look at "Convert a byte array to integer in java and vice versa" and Jeff Mercado's response there, you can do the following:
byte[] arr = { 0x00, 0x01 };
ByteBuffer wrapped = ByteBuffer.wrap(arr); // big-endian by default
short num = wrapped.getShort(); // 1
ByteBuffer dbuf = ByteBuffer.allocate(2);
dbuf.putShort(num);
byte[] bytes = dbuf.array(); // { 0, 1 }
Give this code a try. It worked for me:
try {
    while ((bytesAvailable = System.in.read(buffer)) != -1) {
        myOutputStream.write(buffer, 0, bytesAvailable);
        myOutputStream.flush();
    }
} catch (IOException e) {
    e.printStackTrace(); // don't silently swallow I/O errors
}
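Ctrl-D causes the terminal to make the buffered input available to the program. If it is typed on an otherwise empty line, there is nothing to deliver, so System.in sees end of file and read returns -1. The original loop never sees this because it only calls read when available() reports pending bytes.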
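The blocking-read loop can be sketched as a small reusable method. The names CopyUntilEof and copy are invented for this example, and in-memory streams replace System.in and the output file so the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper illustrating a read loop that ends on end of stream.
final class CopyUntilEof {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        // read blocks until data arrives and returns -1 only at end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // A byte with value 4 in the data is ordinary content, not an end marker.
        byte[] input = {'H', 4, 'i'};
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(input), sink) + " bytes copied");
    }
}
```

In the original program the call would be copy(System.in, myOutputStream). The trade-off is that the loop blocks while waiting for input, which is what allows it to notice end of file.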
Hi, I need to calculate the entropy of order m of a file, where m is the number of bits (m <= 16).
So:
$$H_m(X) = -\sum_{i=0}^{2^m-1} p_{i,m} \log_2 p_{i,m}$$
So, I thought I would create an input stream to read the file and then calculate the probability of each sequence of m bits.
For m = 8 it's easy because I can consider one byte at a time.
Since m <= 16, I thought of using short as the primitive type, saving each short of the file in a short[] array and then manipulating bits with bitwise operators to obtain all the m-bit sequences in the file.
Is this a good idea?
Anyway, I'm not able to create a stream of shorts. This is what I've done:
public static void main(String[] args) {
readFile(FILE_NAME_INPUT);
}
public static void readFile(String filename) {
short[] buffer = null;
File a_file = new File(filename);
try {
File file = new File(filename);
FileInputStream fis = new FileInputStream(filename);
DataInputStream dis = new DataInputStream(fis);
int length = (int)file.length() / 2;
buffer = new short[length];
int count = 0;
while(dis.available() > 0 && count < length) {
buffer[count] = dis.readShort();
count++;
}
System.out.println("length=" + length);
System.out.println("count=" + count);
for(int i = 0; i < buffer.length; i++) {
System.out.println("buffer[" + i + "]: " + buffer[i]);
}
fis.close();
}
catch(EOFException eof) {
System.out.println("EOFException: " + eof);
}
catch(FileNotFoundException fe) {
System.out.println("FileNotFoundException: " + fe);
}
catch(IOException ioe) {
System.out.println("IOException: " + ioe);
}
}
But I lose a byte, and I don't think this is the best way to proceed.
This is what I am thinking of doing with bitwise operators:
int[] list = new int[l];
foreach n in buffer {
for(int i = 16 - m; i > 0; i-m) {
list.add( (n >> i) & 2^m-1 );
}
}
In this case I'm assuming I use shorts.
If I use bytes, how can I write a loop like that for m > 8?
That loop doesn't work because I would have to concatenate multiple bytes, varying the number of bits to join each time.
Any ideas?
Thanks
I think you just need to have a byte array:
public static byte[] readFile(String filename) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    try (FileInputStream fis = new FileInputStream(filename)) {
        int b; // read() returns an int so that -1 can signal end of stream
        while ((b = fis.read()) != -1) {
            outputStream.write(b);
        }
    } catch (IOException ioe) {
        System.out.println("IOException: " + ioe);
    }
    byte[] byteData = outputStream.toByteArray();
    return byteData;
}
Then you can manipulate byteData (the returned array) with your bitwise operations.
--
If you want to work with shorts, you can combine the bytes this way:
short[] buffer = new short[(byteData.length + 1) / 2]; // one extra slot when the length is odd
int j = 0;
for (int i = 0; i < byteData.length - 1; i += 2) {
    // Mask with 0xff so a negative low byte does not sign-extend over the high byte.
    buffer[j] = (short) (((byteData[i] & 0xff) << 8) | (byteData[i + 1] & 0xff));
    j++;
}
To handle an odd number of bytes, do this:
if ((byteData.length % 2) == 1) buffer[j] = (short) (byteData[byteData.length - 1] & 0xff);
When the length is odd, the loop leaves the final byte unpaired and exits with j == buffer.length - 1, so that last slot is free and the leftover byte goes there, in the low 8 bits of the short.
Then manipulate buffer.
The second approach, working directly with bytes, is more involved and deserves a question of its own, so try the above first.
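For reference, one possible way to handle any m from 1 to 16 directly on bytes is a bit accumulator. The sketch below is an illustration, not the only reading of the formula: it assumes symbols are consecutive, non-overlapping m-bit groups taken most significant bit first, and that trailing bits which do not fill a whole symbol are dropped. BlockEntropy and entropy are made-up names.

```java
// Hypothetical helper: entropy of consecutive, non-overlapping m-bit symbols.
final class BlockEntropy {

    /** Returns the entropy in bits per m-bit symbol; leftover trailing bits are ignored. */
    static double entropy(byte[] data, int m) {
        if (m < 1 || m > 16) {
            throw new IllegalArgumentException("m must be in 1..16");
        }
        int[] counts = new int[1 << m];
        int mask = (1 << m) - 1;
        long acc = 0;   // bit accumulator, newest bits on the right
        int bits = 0;   // number of unread bits currently in acc
        long total = 0; // number of symbols counted
        for (byte b : data) {
            acc = (acc << 8) | (b & 0xff);
            bits += 8;
            while (bits >= m) {
                int symbol = (int) ((acc >>> (bits - m)) & mask);
                counts[symbol]++;
                total++;
                bits -= m;
            }
            acc &= (1L << bits) - 1; // keep only the unread bits
        }
        double h = 0.0;
        for (int c : counts) {
            if (c == 0) {
                continue; // p * log(p) tends to 0 as p tends to 0
            }
            double p = (double) c / total;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, (byte) 0xff, 0x0f, (byte) 0xf0};
        for (int m : new int[] {1, 4, 8, 12, 16}) {
            System.out.println("m=" + m + " H=" + entropy(sample, m));
        }
    }
}
```

Because the accumulator never holds more than m - 1 + 8 bits, a long is more than wide enough, and no short[] conversion or odd-byte handling is needed.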
I'm creating an MD5 hash generator. I first tested it with an original file, then altered the file to see whether the MD5 hash changes. The hash did not change even after I altered the file. What is the problem?
public class MD5CheckSum {
public byte [] createChecksum (String filename) throws Exception {
InputStream fis = new FileInputStream(filename);
byte[] buffer = new byte[1024];
MessageDigest complete = MessageDigest.getInstance("MD5");
int numRead;
do {
numRead = fis.read(buffer);
if (numRead > 0){
complete.update(buffer,0,numRead);
}
}while (numRead !=1);
fis.close();
return complete.digest();
}
public String getMD5Checksum(String filename) throws Exception {
/*byte[] b = createChecksum(filename);
String result = "";
for (int i=0; i < b.length; i++){
result += Integer.toString(( b[i] & 0xff) + 0x100, 16).substring( 1 );
}
return result;*/
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] messageDigest = md.digest(filename.getBytes());
BigInteger number = new BigInteger(1, messageDigest);
String hashtext = number.toString(16);
// Now we need to zero pad it if you actually want the full 32 chars.
while (hashtext.length() < 32) {
hashtext = "0" + hashtext;
}
return hashtext;
}
public MD5CheckSum() throws Exception{
String path = "C:/Users/user/Downloads/Documents/ECOMM SUMMER BLOSSOM.docx";
System.out.println("MD5 Hash Succeed");
System.out.println(getMD5Checksum(path));
}
EDITED: I changed some code
public static String getMD5Checksum(String filename) throws Exception {
byte[] b = createChecksum(filename);
String result = "";
for (int i=0; i < b.length; i++) {
result += Integer.toString( ( b[i] & 0xff ) + 0x100, 16).substring( 1 );
}
return result;
}
public static void main(String args[]) {
try {
System.out.println("Start hashing....");
System.out.println(getMD5Checksum("C:/Users/user/Downloads/Documents/21.pdf"));
System.out.println("Done hashing....");
}
catch (Exception e) {
e.printStackTrace();
}
}
But it takes too long to generate the hash; so far it still has not been generated.
filename.getBytes() gets bytes of the filename, not the file contents.
I could tell you how to load the entire file into a byte array, but that would be bad, because it could take up huge amounts of memory when it just isn't necessary to keep the entire file in memory while the hash is calculated.
Instead you should open a stream and get the hash of that. See this answer for that: https://stackoverflow.com/a/304350/360211
You seem to be calculating the MD5 sum of the filename, not of the file's content. To catch this kind of mistake, use a file with a known MD5 sum (for example, by running md5sum on it) and check that your code yields the same result.
Also, I can't help noting that your createChecksum seems to be a better candidate, as it actually works on the content of the file.
Just verifying that you get different values for different inputs may show that you have a candidate checksum, but it is a poor check that the correct algorithm is actually being used.
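A streaming version that hashes the file's content could be sketched as follows; FileMd5 and md5Hex are hypothetical names. Note that the loop ends on -1: the do/while in the question tests numRead != 1, which stays true after read starts returning -1 at end of file, and that would explain why the edited version never finishes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: hashes file content, not the file name.
final class FileMd5 {

    /** Streams the file through MD5 and returns the digest as lowercase hex. */
    static String md5Hex(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int numRead;
            while ((numRead = in.read(buffer)) != -1) { // -1, not 1, marks end of file
                md.update(buffer, 0, numRead);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Check against a known value: the MD5 of the three bytes "abc".
        Path tmp = Files.createTempFile("md5-demo", ".txt");
        Files.write(tmp, "abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(md5Hex(tmp)); // 900150983cd24fb0d6963f7d28e17f72
        Files.delete(tmp);
    }
}
```

Only one buffer's worth of the file is held in memory at a time, so this works for large files too.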
I've been researching a Java problem with no success. I've read a whole bunch of similar questions here on StackOverflow, but the solutions just don't seem to work as expected.
I'm trying to read a binary file byte by byte.
I've used:
while ((data = inputStream.read()) != -1)
loops...
for (int i = 0; i < bFile.length; i++) {
loops...
But I only get empty or blank output. The actual content of the file I'm trying to read is as follows:
¬í sr assignment6.PetI¿Z8kyQŸ I ageD weightL namet Ljava/lang/String;xp > #4 t andysq ~ #bÀ t simbasq ~ #I t wolletjiesq ~
#$ t rakker
I'm merely trying to read it byte for byte and feed it to a character array with the following line:
char[] charArray = Character.toChars(byteValue);
byteValue here is an int holding the value of the byte being read.
What is going wrong where?
Since Java 7 there is no need to read byte by byte; there are two utility functions in Files:
Path path = Paths.get("C:/temp/test.txt");
// Load as binary:
byte[] bytes = Files.readAllBytes(path);
String asText = new String(bytes, StandardCharsets.ISO_8859_1);
// Load as text, with some Charset:
List<String> lines = Files.readAllLines(path, StandardCharsets.ISO_8859_1);
As you want to read binary data, you would use readAllBytes.
String and char are for text. Unlike in many other programming languages, this means Unicode, so all scripts of the world may be combined. A char is 16 bits wide, as opposed to the 8-bit byte.
For pure ASCII, the 7-bit subset of Unicode / UTF-8, byte and char values are identical.
Then you might have done the following (low-quality code):
int fileLength = (int) Files.size(path);
char[] chars = new char[fileLength];
int i = 0;
int data;
while ((data = inputStream.read()) != -1) {
chars[i] = (char) data; // data actually being a byte
++i;
}
inputStream.close();
String text = new String(chars);
System.out.println(Arrays.toString(chars));
The problem you had probably concerned the unwieldy fixed-size arrays in Java, and the fact that a char[] is still not a String.
For binary usage, as you seem to be reading serialized data, you might like to dump the file:
int i = 0;
int data;
while ((data = inputStream.read()) != -1) {
char ch = 32 <= data && data < 127 ? (char) data : ' ';
System.out.printf("[%06d] %02x %c%n", i, data, ch);
++i;
}
Dumping file position, hex value and char value.
Here is a simple example:
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
If you want to read text (characters), use Readers; if you want to read bytes, use Streams.
Why not use Apache Commons IO:
byte[] bytes = IOUtils.toByteArray(inputStream);
Then you can convert it to chars:
String str = new String(bytes);
char[] chars = str.toCharArray();
Or, as you did, one value at a time (Character.toChars takes a single int code point, not a byte array):
char[] charArray = Character.toChars(byteValue);
To deserialize objects:
List<Object> results = new ArrayList<Object>();
FileInputStream fis = new FileInputStream("your_file.dat");
ObjectInputStream ois = new ObjectInputStream(fis);
try {
while (true) {
results.add(ois.readObject());
}
} catch (EOFException e) {
// End of the stream reached: every object has been read.
} catch (OptionalDataException e) {
if (!e.eof) throw e;
} finally {
ois.close();
}
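To see the read-until-end loop work without a file on disk, here is a small round-trip sketch. It assumes the stream contains only objects written with writeObject, in which case readObject signals the end with an EOFException. ReadAllObjects, writeAll and readAll are invented names, and strings stand in for the Pet objects from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: serializes a list of objects and reads them all back.
final class ReadAllObjects {

    /** Writes each object to an in-memory serialization stream. */
    static byte[] writeAll(List<?> objects) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                oos.writeObject(o);
            }
        }
        return bytes.toByteArray();
    }

    /** Reads objects until the stream is exhausted. */
    static List<Object> readAll(InputStream source) throws IOException, ClassNotFoundException {
        List<Object> results = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(source)) {
            while (true) {
                results.add(ois.readObject());
            }
        } catch (EOFException endOfStream) {
            // Normal termination: there are no more objects to read.
        }
        return results;
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] serialized = writeAll(Arrays.asList("andy", "simba", "rakker"));
        System.out.println(readAll(new ByteArrayInputStream(serialized)));
    }
}
```

For the file in the question, readAll(new FileInputStream(...)) would be the equivalent call, provided the assignment6.Pet class is on the classpath.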
Edit:
Use file.length() for the array size, and make a byte array. Then call inputStream.read(b).
Edit again: if you want characters, use new InputStreamReader(new FileInputStream(file), charset), which lets you specify the charset.
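The difference between reading bytes and reading characters can be sketched like this. TextVersusBytes and readText are hypothetical names, and an in-memory stream replaces the FileInputStream so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes a byte stream into text with an explicit charset.
final class TextVersusBytes {

    /** Reads the whole stream as text using the given charset. */
    static String readText(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "h\u00e9llo" contains one non-ASCII character, which UTF-8 encodes as two bytes.
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        String text = readText(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(utf8.length + " bytes, " + text.length() + " chars");
    }
}
```

Casting each byte to a char, by contrast, treats every byte as its own character, which only gives the right text for ASCII or ISO-8859-1 data.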