According to the specification (http://wiki.theory.org/BitTorrentSpecification):
info_hash: urlencoded 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value will be a bencoded dictionary, given the definition of the info key above.
torrentMap is my dictionary; I get the info key, which is another dictionary, calculate the hash and URL-encode it.
But I always get an invalid info_hash message when I try to send it to the tracker.
This is my code:
public String GetInfo_hash() {
    String info_hash = "";
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutput out = null;
    try {
        out = new ObjectOutputStream(bos);
        out.writeObject(torrentMap.get("info"));
        byte[] bytes = bos.toByteArray(); // Map => byte[]
        MessageDigest md = MessageDigest.getInstance("SHA1");
        info_hash = urlencode(md.digest(bytes)); // hashing and URL-encoding
        out.close();
        bos.close();
    } catch (Exception ex) { }
    return info_hash;
}
private String urlencode(byte[] bs) {
    StringBuffer sb = new StringBuffer(bs.length * 3);
    for (int i = 0; i < bs.length; i++) {
        int c = bs[i] & 0xFF;
        sb.append('%');
        if (c < 16) {
            sb.append('0');
        }
        sb.append(Integer.toHexString(c));
    }
    return sb.toString();
}
This is almost certainly the problem:
out = new ObjectOutputStream(bos);
out.writeObject(torrentMap.get("info"));
What you're going to be hashing is the Java binary serialization format of the value of torrentMap.get("info"). I find it very hard to believe that all BitTorrent programs are meant to know about that.
It's not immediately clear to me from the specification what the value of the "info" key is meant to be, but you need to work out some other way of turning it into a byte array. If it's a string, I'd expect some well-specified encoding (e.g. UTF-8). If it's already binary data, then use that byte array directly.
EDIT: Actually, it sounds like the value will be a "bencoded dictionary" as per your quote, which looks like it will be a string. Quite how you're meant to encode that string (which sounds like it may include values which aren't in ASCII, for example) before hashing it is up for grabs. If your sample strings are all ASCII, then using "ASCII" and "UTF-8" as the encoding names for String.getBytes(...) will give the same result anyway, of course...
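As a hedged sketch of the fix: hash the exact bencoded bytes of the info dictionary as they appear in the .torrent file, rather than any Java re-serialization of it. Here infoBytes is a hypothetical byte[] holding that raw slice (e.g. obtained from your bencode parser or by slicing the file):

// infoBytes: the raw bencoded bytes spanning the value of the "info" key (hypothetical)
MessageDigest md = MessageDigest.getInstance("SHA-1");
String info_hash = urlencode(md.digest(infoBytes)); // reuse the urlencode() method above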
Related
I'm trying to decompress a json object in Java that was initially compressed in PHP. Here's how it gets compressed into PHP:
function zip_json_encode(&$arr) {
    $uncompressed = json_encode($arr);
    return pack('L', strlen($uncompressed)) . gzcompress($uncompressed);
}
and decoded (again in PHP):
function unzip_json_decode(&$data) {
    $uncompressed = @gzuncompress(substr($data, 4));
    return json_decode($uncompressed, $array_instead_of_object);
}
That gets put into MySQL and now it must be pulled out of the db by Java. We pull it out from the ResultSet like this:
String field = rs.getString("field");
I then pass that string to a method to decompress it. This is where it falls apart.
private String decompressHistory(String historyString) throws SQLException {
    StringBuffer buffer = new StringBuffer();
    try {
        byte[] historyBytes = historyString.substring(4).getBytes();
        ByteArrayInputStream bin = new ByteArrayInputStream(historyBytes);
        InflaterInputStream in = new InflaterInputStream(bin, new Inflater(true));
        int len;
        byte[] buf = new byte[1024];
        while ((len = in.read(buf)) != -1) {
            // buf should be decoded, right?
        }
    } catch (IOException e) {
        e.getStackTrace();
    }
    return buffer.toString();
}
Not quite sure what's going wrong here, but any pointers would be appreciated!
You need to get rid of the true in Inflater(true). Use just Inflater(). The true makes it expect raw deflate data. Without the true, it is expecting zlib-wrapped deflate data. PHP's gzcompress() produces zlib-wrapped deflate data.
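A hedged sketch of the corrected loop, assuming historyBytes already holds the bytes after the 4-byte pack('L', ...) length prefix and that the original JSON was UTF-8:

ByteArrayInputStream bin = new ByteArrayInputStream(historyBytes);
InflaterInputStream in = new InflaterInputStream(bin); // default Inflater() expects zlib-wrapped data
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
int len;
while ((len = in.read(buf)) != -1) {
    out.write(buf, 0, len); // collect the decompressed bytes
}
String json = out.toString("UTF-8"); // assumes the JSON was UTF-8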
Compressed data is binary, a byte[]. Going through String (Unicode text) not only requires a conversion, it is faulty.
For instance, compare these two conversions:
byte[] historyBytes = historyString.substring(4).getBytes();
byte[] historyBytes = historyString.substring(4).getBytes("ISO-8859-1");
The first version uses the default platform encoding, making the application non-portable.
The first to-do is to use binary data in the database as VARBINARY or BLOB.
InputStream field = rs.getBinaryStream("field");
field.skip(4); // skip the 4-byte length prefix written by pack('L', ...)
try (InputStream in = new InflaterInputStream(field)) { // zlib-wrapped, per the other answer
    ...
}
Or so. Mind the other answer.
In the end, neither of the above solutions worked, but both have merits. When we pulled the data out of MySQL and cast it to bytes, a number of character bytes (67 of them) were missing, which made it impossible to decompress on the Java side. As for the answers above: Mark is correct that gzcompress() uses zlib, and therefore you should use the Inflater() class in Java.
Joop is correct that the data conversion is faulty. Our table was too large to convert to VARBINARY or BLOB; that may have solved the problem, but it didn't work for us. We ended up having Java make a request to our PHP app and simply unpacking the compressed data on the PHP side. This worked well. Hopefully this is helpful to anyone else who stumbles across it.
I have a legacy system that uses a Hibernate interceptor to encrypt (and encode) and decrypt (and decode) some fields on some database tables. It makes use of the onSave, onLoad and onFlushDirty methods. This code turns out to be buggy: data read from this system, when transferred to another application, still has some of the records encrypted and encoded (some encrypted multiple times). The odd part is that I can perform the decryption and decoding (as many times as necessary) when the receiving application is on a Windows machine, but I get a BadPaddingException when I try the same thing when the receiving application is a Linux VM.
Any help/suggestions will be greatly appreciated
Here is a snippet of the Hibernate interceptor:
public boolean onLoad(Object entity, Serializable arg1, Object[] state, String[] propertyNames, Type[] arg4) throws CallbackException {
    if (key != null) {
        try {
            if (entity instanceof BasicData) {
                for (int i = 0; i < state.length; i++) {
                    if (state[i] instanceof String) {
                        String cipherText = (String) state[i];
                        byte[] cipherTextBytes = Base64Coder.decode(cipherText);
                        byte[] plainTextBytes = dCipher.doFinal(cipherTextBytes);
                        state[i] = new String(plainTextBytes, "UTF8");
                    }
                }
                return true;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    return false;
}
I'd have to guess here, but if you mean this Base64Coder, the problem might be the following:
It is unclear how the base64 string was created, i.e. which encoding was used.
If you use UTF-8 to get the bytes of a string and create a base64 from those bytes you'll get a different result than if you'd use ISO Latin-1, for example.
Afterwards you create a string from those bytes using UTF-8, but if the base64 string had not been created using UTF-8, you'll get wrong results.
Just a quote from the linked source (if this is the correct one):
public static String encodeString(String s) {
    return new String(encode(s.getBytes()));
}
Here, s.getBytes() will use the system's/jvm's default encoding, so you should really ensure it is UTF-8!
If you control both sides (encode and decode), a better way is to use DatatypeConverter:
String buffer = DatatypeConverter.printBase64Binary( symKey );
byte[] supposedSymKey = DatatypeConverter.parseBase64Binary( buffer );
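On Java 8+, the built-in java.util.Base64 works as well; the following is only a sketch, with cipherBytes and dCipher standing in for your own objects:

String cipherTextB64 = java.util.Base64.getEncoder().encodeToString(cipherBytes);
byte[] decoded = java.util.Base64.getDecoder().decode(cipherTextB64);
// decrypt, then decode the plaintext bytes with an explicit charset
String plainText = new String(dCipher.doFinal(decoded), java.nio.charset.StandardCharsets.UTF_8);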
I'm searching for a way to parse large files (about 5-10 GB) and find the positions (in bytes) of some recurring strings, as fast as possible.
I've tried to use a RandomAccessFile reader by doing something like below:
RandomAccessFile lecteurFichier = new RandomAccessFile(<MyFile>, "r");
while (currentPointeurPosition < lecteurFichier.length()) {
    char currentFileChar = (char) lecteurFichier.readByte();
    // Test each char for a match against my string (by appending chars until the string is found)
    // and keep a record of every found string's position
}
The problem is that this code is too slow (maybe because I read byte by byte?).
I also tried the solution below, which is perfect in terms of speed, but I can't get my strings' positions.
FileInputStream is = new FileInputStream(fichier.getFile());
FileChannel f = is.getChannel();
ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
Charset charset = Charset.forName("ISO-8859-1");
CharsetDecoder decoder = charset.newDecoder();
long len = 0;
while ((len = f.read(buf)) != -1) {
    buf.flip();
    String data = "";
    try {
        int old_position = buf.position();
        data = decoder.decode(buf).toString();
        // reset the buffer's position to its original value so it is not altered:
        buf.position(old_position);
    } catch (Exception e) {
        e.printStackTrace();
    }
    buf.clear();
}
f.close();
Does anyone have a better solution to propose?
Thank you in advance (and sorry for my spelling, I'm French).
Since your input data is encoded in an 8-bit encoding*, you can speed up the search by encoding the search string rather than decoding the file:
byte[] encoded = searchString.getBytes("ISO-8859-1");
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
int b;
long pos = -1;
while ((b = bis.read()) != -1) {
    pos++;
    if ((encoded[0] & 0xFF) == b) { // mask so the signed byte compares correctly with the 0-255 int
        // see if the rest of the string matches
    }
}
A BufferedInputStream should be pretty fast. Using ByteBuffer might be faster, but it will make the search logic more complicated because of the possibility of a string match that spans a buffer boundary.
Then there are various clever ways to optimize string searches that could be adapted to this situation ... where you are searching a stream of bytes / characters rather than an array of bytes / characters. The Wikipedia page on String Searching is a good place to start.
Note that since we are reading and matching in a byte-wise fashion, the position is just the count of bytes read (or skipped), so there is no need to use a random access file.
* In fact this trick will work with many multibyte encodings too.
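To make the matching concrete, here is a minimal, naive sketch of the byte-wise loop; a real implementation should use one of the string-search algorithms referenced above, since the naive fallback below can miss matches whose prefix overlaps a failed partial match:

byte[] needle = searchString.getBytes("ISO-8859-1");
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
int matched = 0; // number of needle bytes matched so far
long pos = -1;   // byte position of the current input byte
int b;
while ((b = bis.read()) != -1) {
    pos++;
    if (b == (needle[matched] & 0xFF)) {
        matched++;
        if (matched == needle.length) {
            System.out.println("found at byte " + (pos - needle.length + 1));
            matched = 0; // naive restart
        }
    } else {
        // naive fallback; KMP handles overlapping prefixes correctly
        matched = (b == (needle[0] & 0xFF)) ? 1 : 0;
    }
}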
Searching for a 'needle' in a 'haystack' is a well-studied problem. Here's a related link on StackOverflow itself. I am sure Java implementations of the algorithms discussed are available too. Why not try some of them to see if they fit the job?
Is it possible to convert a string to a byte array and then convert it back to the original string in Java or Android?
My objective is to send some strings to a microcontroller (Arduino) and store them in EEPROM (which is only 1 KB). I tried to use an MD5 hash, but it seems it's only one-way encryption. What can I do to deal with this issue?
I would suggest using the members of String, but with an explicit encoding:
byte[] bytes = text.getBytes("UTF-8");
String text = new String(bytes, "UTF-8");
By using an explicit encoding (and one which supports all of Unicode) you avoid the problems of just calling text.getBytes() etc:
You're explicitly using a specific encoding, so you know which encoding to use later, rather than relying on the platform default.
You know it will support all of Unicode (as opposed to, say, ISO-Latin-1).
EDIT: Even though UTF-8 is the default encoding on Android, I'd definitely be explicit about this. For example, this question only says "in Java or Android" - so it's entirely possible that the code will end up being used on other platforms.
Basically given that the normal Java platform can have different default encodings, I think it's best to be absolutely explicit. I've seen way too many people using the default encoding and losing data to take that risk.
EDIT: In my haste I forgot to mention that you don't have to use the encoding's name - you can use a Charset instead. Using Guava I'd really use:
byte[] bytes = text.getBytes(Charsets.UTF_8);
String text = new String(bytes, Charsets.UTF_8);
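Since Java 7 you don't even need Guava for this; the JDK ships the same constants in java.nio.charset.StandardCharsets:

byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
String text = new String(bytes, StandardCharsets.UTF_8);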
You can do it like this.
String to byte array
String stringToConvert = "This String is 76 characters long and will be converted to an array of bytes";
byte[] theByteArray = stringToConvert.getBytes();
http://www.javadb.com/convert-string-to-byte-array
Byte array to String
byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
String value = new String(byteArray);
http://www.javadb.com/convert-byte-array-to-string
Use String.getBytes() to convert to bytes and use the String(byte[] data) constructor to convert back to a string.
byte[] pdfBytes = Base64.decode(myPdfBase64String, Base64.DEFAULT);
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;
import org.apache.commons.codec.binary.Hex; // from commons-codec

public class FileHashStream
{
    // Reads everything from the input stream, feeding it into the given
    // MessageDigest, and returns the final hash bytes.
    public static byte[] read(InputStream is, MessageDigest md) throws Exception
    {
        // we need a byte buffer; we will use 16 kilobytes
        byte[] buf = new byte[1024 * 16];
        int len = 0;
        // use the buffer to update our MessageDigest instance
        while (true)
        {
            len = is.read(buf);
            if (len < 0) break;
            md.update(buf, 0, len);
        }
        // close the input stream
        is.close();
        // call the digest() method to obtain the final hash result
        return md.digest();
    }

    public static void main(String[] args) throws Exception
    {
        String path = /* type in the absolute path for the 'commons-codec-1.10-bin.zip' */;
        // SHA-256 here is just an example; pick the algorithm you need
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] ret = read(new FileInputStream(path), md);
        System.out.println("Length of Hash: " + ret.length);
        for (byte b : ret)
        {
            System.out.print(b + ", ");
        }
        String compare = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
        String verification = Hex.encodeHexString(ret);
        System.out.println();
        System.out.println("===");
        System.out.println(verification);
        System.out.println("Equals? " + verification.equals(compare));
    }
}
What I am trying to do is read from a text file where each line has the path to a file, then a space as separator, and the hash that accompanies it. So I call checkVersion(), and loadStrings(File f_) returns a String[] with one entry per line. When I try to check the hashes, however, I end up with something that isn't even hex and is twice as long as it should be; it's probably something obvious that my eyes are just overlooking. The idea behind this is an auto-update for my game to save bandwidth. Thanks for your time.
The code is fixed, here is the final version if anyone else has this issue, thanks a lot everyone.
void checkVersion() {
    String[] v = loadStrings("version.txt");
    for (int i = 0; i < v.length; i++) {
        String[] piece = split(v[i], " "); // BREAKS INTO FILENAME, HASH
        println("Checking " + piece[0] + "..." + piece[1]);
        if (checkHash(piece[0], piece[1])) {
            println("ok!");
        } else {
            println("NOT OKAY!");
            // CONTINUE TO DOWNLOAD FILE AND THEN CALL CHECKVERSION AGAIN
        }
    }
}
boolean checkHash(String path_, String hash_) {
    return createHash(path_).equals(hash_);
}

byte[] messageDigest(String message, String algorithm) {
    try {
        java.security.MessageDigest md = java.security.MessageDigest.getInstance(algorithm);
        md.update(message.getBytes());
        return md.digest();
    } catch (java.security.NoSuchAlgorithmException e) {
        println(e.getMessage());
        return null;
    }
}

String createHash(String path_) {
    byte[] md5hash = messageDigest(new String(loadBytes(path_)), "MD5");
    BigInteger bigInt = new BigInteger(1, md5hash);
    return bigInt.toString(16);
}
The String.getBytes() method returns the bytes that represent the character encoding of the string. It doesn't parse the string into bytes that represent a number in some radix. For example, "AA".getBytes() would yield 0x41 0x41, not 10101010b, which appears to be what you were expecting. To get that you could use, for example, (byte) Integer.parseInt("AA", 16) (note that Byte.parseByte("AA", 16) would throw, since Java bytes are signed and capped at 0x7F).
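A minimal sketch of that idea, as a hypothetical helper that parses a whole hex string into its bytes:

// Parses a hex string two characters at a time ("4141" -> {0x41, 0x41}).
// (byte) Integer.parseInt(...) is used because Byte.parseByte overflows above 0x7F.
static byte[] hexToBytes(String hex) {
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
        out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
    }
    return out;
}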
The library you're using to create hashes probably has a method for parsing its own string representation back in. How to convert back depends on the representation, which you didn't give us.
Use the following code to convert hash bytes to a string:
//byte[] md5sum = digest.digest();
BigInteger bigInt = new BigInteger(1, md5sum);
String output = bigInt.toString(16);
System.out.println("MD5: " + output);