org.apache.commons.codec.DecoderException: Odd number of characters - java

I am sending a hex string as a URL parameter and trying to convert it back into a string on the server side.
The user's input string is converted with the following JavaScript encoding function:
function encode(string) {
var number = "";
var length = string.trim().length;
string = string.trim();
for (var i = 0; i < length; i++) {
number += string.charCodeAt(i).toString(16);
}
return number;
}
Now I'm trying to parse the hex string 419, which stands for the Russian character Й, in Java as follows:
byte[] bytes = "".getBytes();
try {
bytes = Hex.decodeHex(hex.toCharArray());
sb.append(new String(bytes,"UTF-8"));
} catch (DecoderException e) {
e.printStackTrace(); // Here it gives error 'Odd number of characters'
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
but it gives the following error:
"org.apache.commons.codec.DecoderException: Odd number of characters."
How can this be resolved? Many Russian characters have a 3-digit hex code, so the hex string has an odd number of characters and cannot be decoded.

Use Base64 instead
val aes = KeyGenerator.getInstance("AES")
aes.init(128)
val secretKeySpec = aes.generateKey()
val base64 = Base64.encodeToString(secretKeySpec.encoded, 0)
val bytes = Base64.decode(base64, 0)
SecretKeySpec(bytes, 0, bytes.size, "AES") == secretKeySpec

In the case you mentioned, Й is U+0419, and most Cyrillic characters have a leading 0. This suggests that prepending a 0 to odd-length hex strings before decoding would help.
Testing the JavaScript shows that this would be safe only for strings one character long: Ѓ (U+0403) returned 403 and Ѕ (U+0405) returned 405, but ЃЅ returned 403405 instead of 04030405 or 4030405. That is even worse, because the length is even, so it would not trigger the exception and could decode to something completely different.
This question about padding with leading zeros may help with the JavaScript part.
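The fixed-width idea above can be sketched in Java as follows. This is only an illustration, not the asker's code: the class name FixedWidthHex and both method names are made up, and it assumes the JavaScript side is changed to pad every character code to exactly four hex digits. Each group is then read as one UTF-16 code unit rather than being passed through Hex.decodeHex and a UTF-8 decoder.

```java
final class FixedWidthHex {
    // Encodes each UTF-16 code unit as exactly four hex digits (U+0419 becomes "0419").
    static String encode(String text) {
        StringBuilder sb = new StringBuilder(text.length() * 4);
        for (int i = 0; i < text.length(); i++) {
            sb.append(String.format("%04x", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    // Decodes a string produced by encode(); rejects lengths that are not a multiple of four.
    static String decode(String hex) {
        if (hex.length() % 4 != 0) {
            throw new IllegalArgumentException("Expected 4 hex digits per character: " + hex);
        }
        StringBuilder sb = new StringBuilder(hex.length() / 4);
        for (int i = 0; i < hex.length(); i += 4) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String hex = encode("\u0403\u0405"); // the two-letter example from the answer above
        System.out.println(hex + " -> " + decode(hex));
    }
}
```

With a fixed width, the boundaries between characters are unambiguous, so the two-letter case no longer collapses into a shorter string that happens to have an even length.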

Instead of
sb.append(new String(bytes,"UTF-8"));
Try this
sb.append(new String(bytes,"Windows-1251"));

Related

Java - What is the proper way to convert a UTF-8 String to binary?

I'm using this code to convert a UTF-8 String to binary:
public String toBinary(String str) {
byte[] buf = str.getBytes(StandardCharsets.UTF_8);
StringBuilder result = new StringBuilder();
for (int i = 0; i < buf.length; i++) {
int ch = (int) buf[i];
String binary = Integer.toBinaryString(ch);
result.append(("00000000" + binary).substring(binary.length()));
result.append(' ');
}
return result.toString().trim();
}
Before I was using this code:
private String toBinary2(String str) {
StringBuilder result = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
int ch = (int) str.charAt(i);
String binary = Integer.toBinaryString(ch);
if (ch<256)
result.append(("00000000" + binary).substring(binary.length()));
else {
binary = ("0000000000000000" + binary).substring(binary.length());
result.append(binary.substring(0, 8));
result.append(' ');
result.append(binary.substring(8));
}
result.append(' ');
}
return result.toString().trim();
}
These two method can return different results; for example:
toBinary("è") = "11000011 10101000"
toBinary2("è") = "11101000"
I think that is because the bytes of è are negative while the corresponding char is not (because char is a 2-byte unsigned integer).
What I want to know is: which of the two approaches is the correct one and why?
Thanks in advance.
Whenever you want to convert text into binary data (or into text representing binary data, as you do here) you have to use some encoding.
Your toBinary uses UTF-8 for that encoding.
Your toBinary2 uses something that's not a standard encoding: it encodes every UTF-16 code unit * below 256 in a single byte and all others in 2 bytes. Unfortunately that is not a useful encoding, since when decoding you would have to know whether a single byte stands alone or is part of a 2-byte sequence (UTF-8/UTF-16 do that by indicating with the highest bits which one it is).
tl;dr toBinary seems correct, toBinary2 will produce output that can't uniquely be decoded back to the original string.
* You might be wondering where the mention of UTF-16 comes from: that's because all String objects in Java are implicitly encoded in UTF-16. So if you use charAt you get UTF-16 code units (which just so happen to be equal to the Unicode code point for all characters that fit into the Basic Multilingual Plane).
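To make the point about decodability concrete, here is a minimal sketch of a UTF-8 based conversion together with its inverse. The class name Utf8Bits and the method fromBinary are hypothetical names introduced for illustration; toBinary is a variant of the question's method that masks each byte with 0xFF instead of relying on substring arithmetic over a sign-extended value.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Bits {
    // Renders the UTF-8 bytes of a string as space-separated 8-bit groups.
    static String toBinary(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            String bits = Integer.toBinaryString(b & 0xFF); // mask avoids sign extension
            sb.append("00000000".substring(bits.length())).append(bits).append(' ');
        }
        return sb.toString().trim();
    }

    // Parses the 8-bit groups back into bytes and decodes them as UTF-8.
    static String fromBinary(String binary) {
        if (binary.isEmpty()) {
            return "";
        }
        String[] groups = binary.split(" ");
        byte[] bytes = new byte[groups.length];
        for (int i = 0; i < groups.length; i++) {
            bytes[i] = (byte) Integer.parseInt(groups[i], 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Because every group is one byte of a standard encoding, fromBinary(toBinary(s)) returns the original string for any valid input, which is exactly what the ad hoc scheme in toBinary2 cannot guarantee.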
This code snippet might help.
String s = "Some String";
byte[] bytes = s.getBytes(StandardCharsets.UTF_8); // requires java.nio.charset.StandardCharsets
StringBuilder binary = new StringBuilder();
for (byte b : bytes) {
    int val = b;
    // Emit the 8 bits of this byte, most significant bit first.
    for (int i = 0; i < 8; i++) {
        binary.append((val & 128) == 0 ? 0 : 1);
        val <<= 1;
    }
}
System.out.println(s + " to binary: " + binary);

JAVA: failing to get encrypted data in string using xor

I am trying to print encrypted text as a string, but I seem to be going wrong somewhere. I am doing a simple XOR on a plain text. I then put the resulting encrypted string into a C program and do the same XOR again to get the plain text back.
But in between, I am not able to get a proper string of the encrypted text to pass to the C program.
String xorencrypt(byte[] passwd,int pass_len){
char[] st = new char[pass_len];
byte[] crypted = new byte[pass_len];
for(int i = 0; i<pass_len;i++){
crypted[i] = (byte) (passwd[i]^(i+1));
st[i] = (char)crypted[i];
System.out.println((char)passwd[i]+" "+passwd[i] +"= " + (char)crypted[i]+" "+crypted[i]);/* characters are printed fine but problem is when i am convering it in to string */
}
return st.toString();
}
I don't know whether some kind of encoding is also needed, because if I applied one, how would I decode and decrypt the text in the C program?
For example, suppose passwd = bond007;
then the Java program should return akkb78>,
and the C program should decrypt akkb78> to bond007 again.
Use
return new String(crypted);
in that case you don't need st[] array at all.
By the way, the encoded value for bond007 is cmm`560 and not what you posted.
EDIT
While the solution above would most likely work in most Java environments, to be safe about encoding,
as suggested by Alex, pass an encoding parameter to the String constructor.
For example, if you want your string to carry 8-bit bytes:
return new String(crypted, "ISO-8859-1");
You would need the same parameter when getting bytes from your string :
byte[] bytes = myString.getBytes("ISO-8859-1");
Alternatively, use solution provided by Alex :
return new String(st);
But, convert bytes to chars properly :
st[i] = (char) (crypted[i] & 0xff);
Otherwise, all negative bytes (crypted[i] < 0) will be sign-extended when cast to char, and you get surprising results.
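Putting the pieces of this answer together, a minimal sketch might look like the following. The class name XorCodec and its method names are hypothetical; the XOR rule (each byte XORed with its 1-based position) is taken from the question, and ISO-8859-1 is used on both sides so that every byte value maps to exactly one char and back.

```java
import java.nio.charset.StandardCharsets;

final class XorCodec {
    // XORs each byte with its 1-based position; applying it twice restores the input.
    static byte[] xor(byte[] input) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = (byte) (input[i] ^ (i + 1));
        }
        return out;
    }

    // Converts through ISO-8859-1 so no byte is lost or altered by charset translation.
    static String encrypt(String plain) {
        byte[] crypted = xor(plain.getBytes(StandardCharsets.ISO_8859_1));
        return new String(crypted, StandardCharsets.ISO_8859_1);
    }

    // Decryption is the same operation, because XOR is its own inverse.
    static String decrypt(String crypted) {
        return encrypt(crypted);
    }

    public static void main(String[] args) {
        String crypted = encrypt("bond007");
        System.out.println(crypted + " -> " + decrypt(crypted));
    }
}
```

The C program would then need to read the same raw bytes, so how the string is written out (file, socket, argument) and in which charset still has to match on both sides.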
Change this line:
return st.toString();
with this
return new String(st);

Decoding a String Containing Percentage(%)

I am trying to decode a String that contains percent signs (%), and it throws an exception:
Exception:URLDecoder: Illegal hex characters in escape (%) pattern - For input string: "%&"
My Code:
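import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeCbcMsg {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String msg = "Hello%%&&$$";
        String strTMsg = URLDecoder.decode(msg, "UTF-8"); // throws IllegalArgumentException for this input
        System.out.println(strTMsg);
    }
}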
It doesn't look like your string is encoded correctly...
Maybe you should ensure it is properly encoded first?
For example, the encoded character representation for % is %25...
So please try decoding Hello%25%25%26%26%24%24 instead, and see what you get :)
Your msg is not a valid URL-encoded string, so it cannot be decoded.
It is just like trying to decode an invalid Base64 string.
PS:
From the URLDecoder source code:
case '%':
/*
* Starting with this instance of %, process all
* consecutive substrings of the form %xy. Each
* substring %xy will yield a byte. Convert all
* consecutive bytes obtained this way to whatever
* character(s) they represent in the provided
* encoding.
*/
try {
// (numChars-i)/3 is an upper bound for the number
// of remaining bytes
if (bytes == null)
bytes = new byte[(numChars-i)/3];
int pos = 0;
while ( ((i+2) < numChars) &&
(c=='%')) {
int v = Integer.parseInt(s.substring(i+1,i+3),16);
if (v < 0)
throw new IllegalArgumentException("URLDecoder: Illegal hex characters in escape (%) pattern - negative value");
bytes[pos++] = (byte) v;
i+= 3;
if (i < numChars)
c = s.charAt(i);
}
// A trailing, incomplete byte encoding such as
// "%x" will cause an exception to be thrown
if ((i < numChars) && (c=='%'))
throw new IllegalArgumentException(
"URLDecoder: Incomplete trailing escape (%) pattern");
sb.append(new String(bytes, 0, pos, enc));
} catch (NumberFormatException e) {
throw new IllegalArgumentException(
"URLDecoder: Illegal hex characters in escape (%) pattern - "
+ e.getMessage());
}
So it tries to parse the string %& as a hexadecimal integer, which throws the exception.
In order to decode a URL-encoded string, the string first really needs to be URL-encoded. In a properly encoded URL, a % sign is followed by two hex digits (0-9, A-F), so the URLDecoder considers your %% illegal. The message is quite clear. Make sure you encode your URL properly: use URLEncoder first to encode your msg String.
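The encode-then-decode order described in these answers can be sketched as below. The class name UrlRoundTrip and its two helper methods are hypothetical; they only wrap the standard URLEncoder.encode and URLDecoder.decode calls and turn the checked exception into an unchecked one, since "UTF-8" is always available.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class UrlRoundTrip {
    // Percent-encodes a raw string so that it can safely be decoded later.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required to be supported", e);
        }
    }

    // Decodes a string that was produced by encode().
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required to be supported", e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("Hello%%&&$$");
        System.out.println(encoded);         // the form that is safe to decode
        System.out.println(decode(encoded)); // the original message again
    }
}
```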

Invalid info_hash (Java BitTorrent client)

According to the specification: http://wiki.theory.org/BitTorrentSpecification
info_hash: urlencoded 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value will be a bencoded dictionary, given the definition of the info key above.
torrentMap is my dictionary, I get the info key which is another dictionary, I calculate the hash and I URLencode it.
But I always get an invalid info_hash message when I try to send it to the tracker.
This is my code:
public String GetInfo_hash() {
String info_hash = "";
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutput out = null;
try {
out = new ObjectOutputStream(bos);
out.writeObject(torrentMap.get("info"));
byte[] bytes = bos.toByteArray(); //Map => byte[]
MessageDigest md = MessageDigest.getInstance("SHA1");
info_hash = urlencode(md.digest(bytes)); //Hashing and URLEncoding
out.close();
bos.close();
} catch (Exception ex) { }
return info_hash;
}
private String urlencode(byte[] bs) {
StringBuffer sb = new StringBuffer(bs.length * 3);
for (int i = 0; i < bs.length; i++) {
int c = bs[i] & 0xFF;
sb.append('%');
if (c < 16) {
sb.append('0');
}
sb.append(Integer.toHexString(c));
}
return sb.toString();
}
This is almost certainly the problem:
out = new ObjectOutputStream(bos);
out.writeObject(torrentMap.get("info"));
What you're going to be hashing is the Java binary serialization format of the value of torrentMap.get("info"). I find it very hard to believe that all BitTorrent programs are meant to know about that.
It's not immediately clear to me from the specification what the value of the "info" key is meant to be, but you need to work out some other way of turning it into a byte array. If it's a string, I'd expect some well-specified encoding (e.g. UTF-8). If it's already binary data, then use that byte array directly.
EDIT: Actually, it sounds like the value will be a "bencoded dictionary" as per your quote, which looks like it will be a string. Quite how you're meant to encode that string (which sounds like it may include values which aren't in ASCII, for example) before hashing it is up for grabs. If your sample strings are all ASCII, then using "ASCII" and "UTF-8" as the encoding names for String.getBytes(...) will give the same result anyway, of course...
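As a rough sketch of the direction suggested here, the hash can be computed over a plain byte array instead of Java's serialization output. The class name InfoHash and the parameter bencodedInfo are hypothetical, and the sketch assumes the caller already has the bencoded bytes of the info dictionary; how those bytes are obtained is not shown. It percent-encodes every byte of the digest, as the question's urlencode method does.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class InfoHash {
    // Hashes the given bytes with SHA-1 and percent-encodes each of the 20 digest bytes.
    static String infoHashParam(byte[] bencodedInfo) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("SHA-1").digest(bencodedInfo);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is required to be supported", e);
        }
        StringBuilder sb = new StringBuilder(digest.length * 3);
        for (byte b : digest) {
            sb.append(String.format("%%%02x", b & 0xFF)); // e.g. byte 0x0a becomes "%0a"
        }
        return sb.toString();
    }
}
```

Nothing Java-specific enters the hash this way, so the result depends only on the bytes passed in.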

Comparing hash from string against hash of local file

What I am trying to do is read from a text file where each line has the path to a file and then space for a separator and a hash that accompanies it. So I call checkVersion() and loadStrings(File f_) returns a String[], one place for each line. When I try to check the hashes however I end up with something that isn't even hex and is twice as long as it should be, it's probably something obvious that my eyes are just overlooking. The idea behind this is an auto-update for my game to save bandwidth, thanks for your time.
The code is fixed, here is the final version if anyone else has this issue, thanks a lot everyone.
void checkVersion() {
String[] v = loadStrings("version.txt");
for(int i=0; i<v.length; i++) {
String[] piece = split(v[i], " "); //BREAKS INTO FILENAME, HASH
println("Checking "+piece[0]+"..."+piece[1]);
if(checkHash(piece[0], piece[1])) {
println("ok!");
} else {
println("NOT OKAY!");
//CONTINUE TO DOWNLOAD FILE AND THEN CALL CHECKVERSION AGAIN
}
}
}
boolean checkHash(String path_, String hash_) {
return createHash(path_).equals(hash_);
}
byte[] messageDigest(String message, String algorithm) {
try {
java.security.MessageDigest md = java.security.MessageDigest.getInstance(algorithm);
md.update(message.getBytes());
return md.digest();
} catch(java.security.NoSuchAlgorithmException e) {
println(e.getMessage());
return null;
}
}
String createHash(String path_) {
byte[] md5hash = messageDigest(new String(loadBytes(path_)),"MD5");
BigInteger bigInt = new BigInteger(1, md5hash);
return bigInt.toString(16);
}
The String.getBytes() method returns the bytes that encode the string's characters in the platform's default charset. It doesn't parse the text as a number in some radix. For example, "AA".getBytes() would yield 0x41 0x41 on Windows, not 10101010b, which is what you appear to be expecting. To get that you could use, for example, (byte) Integer.parseInt("AA", 16); note that Byte.parseByte("AA", 16) throws a NumberFormatException because 0xAA (170) is outside the signed byte range.
The library you're using to create hashes probably has a method for taking back in its own string representation. How to convert back depends on the representation, which you didn't give us.
Use the following code to convert the hash bytes to a string:
//byte[] md5sum = digest.digest();
BigInteger bigInt = new BigInteger(1, md5sum);
String output = bigInt.toString(16);
System.out.println("MD5: " + output);
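One caveat with BigInteger.toString(16) is that it drops leading zeros, so a digest whose first hex digit is 0 comes out shorter than 32 characters and will not match a stored hash. A minimal sketch that pads the result and hashes the file's raw bytes directly (instead of going through a String) is shown below; the class name Md5Hex and its method names are hypothetical, and it uses plain java.nio file reading rather than Processing's loadBytes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5Hex {
    // Returns the MD5 digest of the data as exactly 32 lowercase hex digits.
    static String md5Hex(byte[] data) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required to be supported", e);
        }
        // %032x pads with leading zeros, which BigInteger.toString(16) would omit.
        return String.format("%032x", new BigInteger(1, digest));
    }

    // Hashes the raw bytes of a file without converting them to a String first.
    static String md5HexOfFile(Path file) {
        try {
            return md5Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Comparing the stored hash with equalsIgnoreCase may also be worth considering if the version file can contain uppercase hex.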
