Thales HSM - Windows cp1252 success / Linux UTF-8 fail - Java

I am working on a TCP/IP application with HSM module integration.
My Java code was working fine on Windows 32-bit / JRE 32-bit / IBM WebSphere 7.
After upgrading to Red Hat Linux 64-bit / JRE 64-bit / IBM WebSphere 8, sending a string with a length below 127 works fine, but for more than 127 it does not return the correct response. I have also tried some encoding techniques, but I face the same problem. Please guide me.
If commandLength is less than 127 it works fine, but when it is greater than 127 the UTF-8 encoding fails.
So for lengths greater than 127 I am using extended ASCII, but that does not work under UTF-8, while it works fine under windows-1252.
//hsmMessage.insert(0, (char)commandLength);
char[] extended_ascii = new char[1];
byte cp437bytes[]= new byte[1];
cp437bytes[0] = (byte) commandLength;
extended_ascii = new String(cp437bytes).toCharArray(); //extended_ascii = new String(cp437bytes, "CP437").toCharArray();
hsmMessage.insert(0, extended_ascii);
Thanks

Never use String objects to hold arbitrary binary data - use byte arrays or wrappers thereof.
The reason is that when converting from a byte array to a String, the default (or specified) charset is used to turn the bytes into characters, which in many circumstances leaves the String not holding the exact bytes you think it should, especially for byte values >= 128.
I had a very similar problem many years ago in the RADIUS server I had written. It would work fine for the vast majority of passwords, but if a user password had a £ symbol in it the differences between US-ASCII and UK-ASCII caused the underlying byte value to get mangled, resulting in mis-calculated encrypted passwords, and failed logins.
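A minimal sketch of the byte-oriented approach, assuming the HSM expects a single length byte followed by the command body (buildHsmFrame is a hypothetical helper; the ISO-8859-1 conversion is only there because the original code builds the command as a String, and ISO-8859-1 maps characters 0x00-0xFF one-to-one onto bytes):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class HsmFrameBuilder {
    // Build the HSM frame entirely as bytes so length values >= 0x80 survive untouched.
    public static byte[] buildHsmFrame(String command) throws IOException {
        byte[] body = command.getBytes(StandardCharsets.ISO_8859_1); // assumption: 1:1 char-to-byte mapping
        if (body.length > 0xFF) {
            throw new IllegalArgumentException("command too long for a one-byte length prefix");
        }
        ByteArrayOutputStream frame = new ByteArrayOutputStream(body.length + 1);
        frame.write(body.length); // single length byte, 0-255, no charset involved
        frame.write(body);
        return frame.toByteArray();
    }
}

Write the returned byte[] straight to the socket's OutputStream; no String, and therefore no charset, is ever applied to the length byte.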

Related

Base64 encode gives different result on linux CentOS terminal and in Java

I am trying to generate a random password on Linux CentOS and store it in a database as Base64. The password is 'KQ3h3dEN', and when I convert it with 'echo KQ3h3dEN | base64' the result is 'S1EzaDNkRU4K'.
I have function in java:
public static String encode64Base(String stringToEncode)
{
    byte[] encodedBytes = Base64.getEncoder().encode(stringToEncode.getBytes());
    String encodedString = new String(encodedBytes, "UTF-8");
    return encodedString;
}
And the result of encode64Base("KQ3h3dEN") is 'S1EzaDNkRU4='.
So it is producing 'K' instead of '=' in this example. How can I ensure that I always get the same result from base64 on Linux and from Base64 encoding in Java?
UPDATE: Updated the question as I hadn't noticed the 'K' at the end of the Linux-encoded string. Here are a few more examples:
'echo KQ3h3dENa | base64' => result 'S1EzaDNkRU5hCg==', but it should be 'S1EzaDNkRU5h'
'echo KQ3h3dENaa | base64' => result 'S1EzaDNkRU5hYQo=', but it should be 'S1EzaDNkRU5hYQ=='
Found the solution after a few hours of experimenting. It turns out a newline was being appended to the string I wanted to encode. The solution is:
echo -n KQ3h3dEN | base64
The result is then the same as with the Java Base64 encoder.
Padding
The '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes.
In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits. In some implementations, the padding character is mandatory, while for others it is not used.
So it depends on the tools and libraries you use. If Base64 with padding is treated the same as Base64 without padding by them, there is no problem. As insurance, you can use a Linux tool that generates Base64 with padding.
Use withoutPadding() of the Base64.Encoder class to get a Base64.Encoder instance that encodes without adding any padding characters at the end.
See the documentation:
https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html#withoutPadding
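As a quick illustration of the two encoder variants, here is a minimal sketch using java.util.Base64 (the expected outputs are noted in the comments):

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64PaddingDemo {
    public static void main(String[] args) {
        byte[] data = "KQ3h3dEN".getBytes(StandardCharsets.UTF_8);

        // The standard encoder keeps the trailing '=' padding.
        String padded = Base64.getEncoder().encodeToString(data);
        // withoutPadding() returns an encoder that omits the trailing '='.
        String unpadded = Base64.getEncoder().withoutPadding().encodeToString(data);

        System.out.println(padded);   // S1EzaDNkRU4=
        System.out.println(unpadded); // S1EzaDNkRU4
    }
}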

NSData bytes not matching bytes from Java WS with same base64 string

I am using protocol buffers in an iOS application. The app consumes a web service written in Java, which spits back a base64 encoded string.
The base64 string is the same on both ends.
In the app, however, whenever I try to convert the string to NSData, the number of bytes may or may not be the same on both ends. The result is a possible "invalid protocol buffer" exception (invalid end tag).
For example:
Source (bytes) | NSData | Diff
93             | 93     | 0
6739           | 6735   | -4
5745           | 5739   | -6
The bytes are equal in the trivial case of an empty protocol buffer.
Here is the Java source:
import org.apache.commons.codec.binary.Base64;
....
public static String bytesToBase64(byte[] bytes) {
    return Base64.encodeBase64String(bytes);
}
On the iOS side, I have tried various algorithms from similar questions which all agree in byte size and content.
What could be causing this?
On closer inspection, the issue was my assumption that Base64 is Base64. I was using the URL-safe variant in the web service while the app's decoder was expecting the standard version.
I had noticed underscores in the Base64 string, which I thought odd.
The Base64 article (http://en.wikipedia.org/wiki/Base64) shows no underscores in its value/character map, but later covers variants which do use underscores.
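A small sketch of the two alphabets using the same commons-codec library as the web service (the input bytes here are chosen purely so the difference is visible):

import org.apache.commons.codec.binary.Base64;

public class Base64VariantDemo {
    public static void main(String[] args) {
        // 0xFB 0xFF 0xFE encode to "+//+" with the standard alphabet
        // and to "-__-" with the URL-safe alphabet.
        byte[] bytes = {(byte) 0xFB, (byte) 0xFF, (byte) 0xFE};

        System.out.println(Base64.encodeBase64String(bytes));        // standard: '+' and '/'
        System.out.println(Base64.encodeBase64URLSafeString(bytes)); // URL-safe: '-' and '_'
    }
}

Whichever alphabet the service emits, the decoder on the iOS side has to match it.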

Java and c++ encrypted results not matching

I have existing C++ code which encrypts a string. Now I have done the same encryption in Java.
Some of the encrypted strings match; some differ in one or two characters.
I am unable to figure out why this happens. I ran both programs in debug mode up to the point where they call their crypto libraries; both have the same key, salt, IV and string to be encrypted.
I know that even a single byte of padding change will alter the encrypted string drastically. But here I am seeing only a one or two character difference. Here is a sample (the characters between asterisks are the mismatching part):
java:
U2FsdGVkX18xMjM0NTY3OGEL9nxFlHrWvodMqar82NT53krNkqat0rrgeV5FAJFs1vBsZIJPZ08DJVrQ*Pw*yV15HEoyECBeAZ6MTeN+ZYHRitKanY5jiRU2J0KP0Fzola
C++:
U2FsdGVkX18xMjM0NTY3OGEL9nxFlHrWvodMqar82NT53krNkqat0rrgeV5FAJFs1vBsZIJPZ08DJVrQ*jQ*yV15HEoyECBeAZ6MTeN+ZYHRitKanY5jiRU2J0KP0Fzola
I am using AES encryption; the provider is SunJCE version 1.6. I tried changing the provider to Bouncy Castle, but the result is the same.
Added One More sample:
C++:
U2FsdGVkX18xMjM0NTY3O*I*/BMu11HkHgnkx+dLPDU1lbfRwb+aCRrwkk7e9dy++MK+/94dKLPXaZDDlWlA3gdUNyh/Fxv*oF*STgl3QgpS0XU=
java:
U2FsdGVkX18xMjM0NTY3O*D*/BMu11HkHgnkx+dLPDU1lbfRwb+aCRrwkk7e9dy++MK+/94dKLPXaZDDlWlA3gdUNyh/Fxv*j9*STgl3QgpS0XU=
UPDATE:
As per the comments, I suspect the Base64 encoding is the culprit. I am using the Latin-1 charset in both places. Is there anything else I can check?
Sigh!!
The problem almost certainly is that after you encrypt the data and receive the encrypted data as a byte string, you are doing some sort of character conversion on the data before sending it through Base-64 conversion.
Note that if you encrypt the strings "ABC_D_EFG" and "ABC_G_EFG", the encrypted output will be completely different starting with the 4th character, and continuing to the end. In other words, the Base-64 outputs would be something like (using made-up values):
U2FsdGVkX18xMj
and
U2FsdGXt91mJpz
The fact that, in the above examples, only two isolated Base-64 characters (one byte) are messed up in each case pretty much proves that the corruption occurs AFTER encryption.
The output of an encryption process is a byte sequence, not a character sequence. The corruption observed is consistent with erroneously interpreting the bytes as characters and attempting to perform a code page conversion on them, prior to feeding them into the Base-64 converter. The output from the encryptor should be fed directly into the Base-64 converter without any conversions.
You say you are using the "Latin-1 char set in both places", a clear sign that you are doing some conversion you should not be doing -- there should be no need to muck with char sets.
First a bit of code:
import javax.xml.bind.DatatypeConverter;
...
public static void main(String[] args) {
    String s1j = "U2FsdGVkX18xMjM0NTY3OGEL9nxFlHrWvodMqar82NT53krNkqat0rrgeV5FAJFs1vBsZIJPZ08DJVrQPwyV15HEoyECBeAZ6MTeN+ZYHRitKanY5jiRU2J0KP0Fzola";
    String s1c = "U2FsdGVkX18xMjM0NTY3OGEL9nxFlHrWvodMqar82NT53krNkqat0rrgeV5FAJFs1vBsZIJPZ08DJVrQjQyV15HEoyECBeAZ6MTeN+ZYHRitKanY5jiRU2J0KP0Fzola";
    byte[] bytesj = DatatypeConverter.parseBase64Binary(s1j);
    byte[] bytesc = DatatypeConverter.parseBase64Binary(s1c);
    int nmax = Math.max(bytesj.length, bytesc.length);
    int nmin = Math.min(bytesj.length, bytesc.length);
    for (int i = 0; i < nmax; ++i) {
        if (i >= nmin) {
            boolean isj = i < bytesj.length;
            byte b = isj ? bytesj[i] : bytesc[i];
            System.out.printf("%s [%d] %x%n", (isj ? "J" : "C++"), i, (int) b & 0xFF);
        } else {
            byte bj = bytesj[i];
            byte bc = bytesc[i];
            if (bj != bc) {
                System.out.printf("[%d] J %x != C++ %x%n", i, (int) bj & 0xFF, (int) bc & 0xFF);
            }
        }
    }
}
This delivers
[60] J 3f != C++ 8d
Now 0x3f is the code of the question mark.
The error is that 0x80 - 0xBF are control characters in Latin-1, officially ISO-8859-1.
Windows Latin-1, officially Windows-1252, uses these codes for other, printable characters.
Hence you should use "Windows-1252" or "Cp1252" (the code page name) in Java.
Bluntly:
After encryption, original bytes in the range 0x80 - 0xBF were replaced with a question mark because of a conversion to byte[] via ISO-8859-1 instead of Windows-1252.
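A minimal sketch of the byte-safe path described above. The cipher transformation and key/IV handling here are illustrative assumptions; the point is that the ciphertext bytes go straight into the Base64 encoder with no byte-to-String charset conversion in between:

import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EncryptToBase64 {
    // Hypothetical helper: encrypt and Base64-encode without any intermediate String.
    static String encryptToBase64(byte[] key, byte[] iv, String plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding"); // illustrative transformation
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        // Feed the raw ciphertext bytes directly to the Base64 encoder.
        return Base64.getEncoder().encodeToString(ciphertext);
    }
}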

Java: Commons-codec and base64 decoding not working on server

I am sitting with a weird problem. I use Apache's commons-codec (version 1.4). The following code snippet works correctly on my PC (Java 1.6, GlassFish 2.1) both in a standalone app and in an EJB, but on my server it fails in an EJB (while in a standalone app it works correctly).
...
org.apache.commons.codec.binary.Base64 b64 = new org.apache.commons.codec.binary.Base64();
byte[] bytes = b64.decode(makeSignedBytes(strB64.getBytes("UTF-8")));
...
private byte[] makeSignedBytes(byte[] ubytes)
{
    byte[] sbytes = new byte[ubytes.length];
    for (int i = 0; i < ubytes.length; i++)
    {
        sbytes[i] = (byte) (0x000000FF & ((int) ubytes[i]));
    }
    return sbytes;
}
The input string is:
4-sDHXi_2Tu2a8k8NPs1FBT3t7UvN7CksUV6gfSE_Ks0aiCPbdeGM8qLdC58b2_hFH7lEp8m9cyPYQOTo4E0t66ZYP8n8tRhT87c8iD34pCd80qvP9vIXsNsodRaGzK5
The output byte array should look like this (I've hex printed it):
|E3|EB|03|1D|78|BF|D9|3B|B6|6B|C9|3C|34|FB|35|14|14|F7|B7|B5|2F|37|B0|A4|B1|45|7A|81|F4|84|FC|AB|34|6A|20|8F|6D|D7|86|33|CA|8B|74|2E|7C|6F|6F|E1|14|7E|E5|12|9F|26|F5|CC|8F|61|03|93|A3|81|34|B7|AE|99|60|FF|27|F2|D4|61|4F|CE|DC|F2|20|F7|E2|90|9D|F3|4A|AF|3F|DB|C8|5E|C3|6C|A1|D4|5A|1B|32|B9|
It is 96 bytes long; when the server gets it wrong it is only 93 bytes and looks like this:
|E2|C0|C7|5E|2D|93|BB|66|BC|93|C3|4F|B3|51|41|4F|7B|7B|52|F3|7B|0A|4B|14|57|A8|1F|48|42|AC|D1|A8|82|3D|B7|5E|18|CF|2A|2D|D0|B9|F1|BD|A1|14|7E|E5|12|9F|26|F5|CC|8F|61|03|93|A3|81|34|B7|AE|99|60|FF|27|F2|D4|61|4F|CE|DC|F2|20|F7|E2|90|9D|F3|4A|AF|3F|DB|C8|5E|C3|6C|A1|D4|5A|1B|32|B9|
I have no idea why it works on my pc and not on the server :(
The wrong result is caused by the second character of the input string (U+002D HYPHEN-MINUS) being replaced with U+2010 HYPHEN, and by the underscores being replaced with spaces. Perhaps this is the result of the input string passing through some "smart" text editor. So it actually looks like you are passing in the wrong input string.
Other flaws in your code:
The makeSignedBytes() method makes no sense and isn't needed.
strB64.getBytes("UTF-8") is semantically wrong; it should be strB64.getBytes("ASCII").
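With those suggestions applied, the decode collapses to a one-liner. A sketch (strB64 is the Base64 string from the question):

import org.apache.commons.codec.binary.Base64;

public class DecodeDemo {
    // commons-codec accepts the Base64 String directly and also tolerates the
    // URL-safe '-'/'_' alphabet when decoding, so neither makeSignedBytes()
    // nor an explicit charset round-trip is needed.
    static byte[] decode(String strB64) {
        return Base64.decodeBase64(strB64);
    }
}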

Perl Client to Java Server

I'm trying to write a Perl client program to connect to a Java server application (JDuplicate). I see that the Java server uses the DataInput.readUTF and DataOutput.writeUTF methods, which the JDuplicate website describes as "Java's modified UTF-8 protocol".
My test program is pretty simple: I'm trying to send client-type data, which should invoke a response from the server, but it just times out:
#!/usr/bin/perl
use strict;
use Encode;
use IO::Socket;

my $remote = IO::Socket::INET->new(
    Proto    => 'tcp',
    PeerAddr => 'localhost',
    PeerPort => '10421'
) or die "Cannot connect to server\n";

$|++;

$remote->send(encode_utf8("CLIENTTYPE|JDSC#0.5.9#0.2"));

while (<$remote>) {
    print $_, "\n";
}

close($remote);
exit(0);
I've tried $remote->send(pack("U","..."));, I've tried "use utf8;", I've tried binmode($remote, ":utf8"), and I've tried sending just plain ASCII text, nothing ever gets responded to.
I can see the data being sent with tcpdump, all in one packet, but the server itself does nothing with it (other then ack the packet).
Is there something additional i need to do to satisfy the "modified" utf implementation of Java?
Thanks.
You have to implement the protocol correctly:
First, the total number of bytes needed to represent all the characters of s is calculated. If this number is larger than 65535, then a UTFDataFormatException is thrown. Otherwise, this length is written to the output stream in exactly the manner of the writeShort method; after this, the one-, two-, or three-byte representation of each character in the string s is written.
As indicated in the docs for writeShort, it sends a 16-bit quantity in network order.
In Perl, that resembles
sub sendmsg {
    my ($s, $msg) = @_;
    die "message too long" if length($msg) > 0xffff;
    my $sent = $s->send(
        pack(n => (length($msg) & 0xffff)) .
        $msg
    );
    die "send: $!"    unless defined $sent;
    die "short write" unless $sent == length($msg) + 2;
}

sub readmsg {
    my ($s) = @_;
    my $buf;
    my $nread;
    $nread = $s->read($buf, 2);
    die "read: $!"   unless defined $nread;
    die "short read" unless $nread == 2;
    my $len = unpack n => $buf;
    $nread = $s->read($buf, $len);
    die "read: $!"   unless defined $nread;
    die "short read" unless $nread == $len;
    $buf;
}
Although the code above doesn't perform modified UTF encoding, it elicits a response:
my $remote = IO::Socket::INET->new(
    Proto    => 'tcp',
    PeerAddr => 'localhost',
    PeerPort => '10421'
) or die "Cannot connect to server: $@\n";

my $msg = "CLIENTTYPE|JDSC#0.5.9#0.2";
sendmsg $remote, $msg;

my $buf = readmsg $remote;
print "[$buf]\n";
Output:
[SERVERTYPE|JDuplicate#0.5.9 beta (build 584)#0.2]
This is unrelated to the main part of your question, but I thought I would explain what the "Java's modified UTF-8" that the API expects is; it's UTF-8, except with UTF-16 surrogate pairs encoded as their own codepoints, instead of having the characters represented by the pairs encoded directly in UTF-8. For instance, take the character U+1D11E MUSICAL SYMBOL G CLEF.
In UTF-8 it's encoded as the four bytes F0 9D 84 9E.
In UTF-16, because it's beyond U+FFFF, it's encoded using the surrogate pair 0xD834 0xDD1E.
In "modified UTF-8", it's given the UTF-8 encoding of the surrogate pair codepoints: that is, you encode "\uD834\uDD1E" into UTF-8, giving ED A0 B4 ED B4 9E, which happens to be fully six bytes long.
When using this format, Java will also encode any embedded nulls using the illegal overlong form C0 80 instead of encoding them as nulls, ensuring that there are never any embedded nulls in a "modified UTF-8" string.
If you're not sending any characters outside of the BMP or any nulls, though, there's no difference from the real thing ;)
Here's some documentation courtesy of Sun.
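For comparison, a minimal Java sketch of the same exchange (host, port and message taken from the question); DataOutputStream.writeUTF / DataInputStream.readUTF take care of the two-byte length prefix and the modified UTF-8 encoding automatically:

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

public class JduplicateClientSketch {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("localhost", 10421);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream());
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            out.writeUTF("CLIENTTYPE|JDSC#0.5.9#0.2"); // length-prefixed, modified UTF-8
            System.out.println(in.readUTF());          // expected: SERVERTYPE|JDuplicate#...
        }
    }
}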
