Why does Java not respect the given array length? - java

I ran into a problem with this piece of code:
byte[] buf = new byte[6];
buf = "abcdef".getBytes();
System.out.println(buf.length);
The array was created for 6 bytes. If I get the bytes of a string of length 6, I expect to get many more bytes. So how do all these bytes fit into this array? Yet this works. Moreover, buf.length reports the length as if it were an array of chars, not of those bytes.
Afterwards, I realized that in
byte[] buf = new byte[6];
6 does not mean much, i.e. I can put 0, 1, 2, and so on there and the code still works (with buf.length showing the length of the given string, not of the array, which I see as a second problem or discrepancy).
This question differs from "Why does Java's String.getBytes() uses “ISO-8859-1”" because it has at least one more aspect: the variable assignment oversight (getBytes() returns a new array), so that question does not fully address mine.

That is not how variable assignments work
Thinking that assigning a 6-byte array to a variable will limit the length of any other array assigned to the same variable shows a fundamental misunderstanding of what variables are and how they work.
Really think about why assigning a fixed-length array to a variable would limit the length of a different array assigned to that variable later.
Strings are Unicode in Java
Strings in Java are Unicode and are represented as UTF-16, which means each character (code point) takes 2 or 4 bytes in memory.
When they are converted to a byte array the number of bytes that represents the string is determined by what encoding is used when converting to the byte[].
Always specify an appropriate character encoding when converting Strings to arrays to get what you expect.
But even then, UTF-8 does not guarantee a single byte per character, and ASCII cannot represent non-ASCII Unicode characters.
Character encoding is tricky
The ubiquitous internet encoding standard is UTF-8. It will be correct in nearly all cases, and in the cases where it isn't, converting from UTF-8 to the correct encoding is trivial because UTF-8 is so well supported in every toolchain.
Learn to make everything final and you will have a much easier time and less confusion.
import com.google.common.base.Charsets;
import javax.annotation.Nonnull;
import java.util.Arrays;
public class Scratch
{
public static void main(final String[] args)
{
printWithEncodings("Hello World!");
printWithEncodings("こんにちは世界!");
}
private static void printWithEncodings(@Nonnull final String s)
{
System.out.println("s = " + s);
final byte[] defaultEncoding = s.getBytes(); // never do this, you do not know what you will get!
// for ASCII characters the first three will all be the same single byte representations
final byte[] iso88591Encoding = s.getBytes(Charsets.ISO_8859_1);
final byte[] asciiEncoding = s.getBytes(Charsets.US_ASCII);
final byte[] utf8Encoding = s.getBytes(Charsets.UTF_8);
final byte[] utf16Encoding = s.getBytes(Charsets.UTF_16);
System.out.println("Arrays.toString(defaultEncoding) = " + Arrays.toString(defaultEncoding));
System.out.println("Arrays.toString(iso88591) = " + Arrays.toString(iso88591Encoding));
System.out.println("Arrays.toString(asciiEncoding) = " + Arrays.toString(asciiEncoding));
System.out.println("Arrays.toString(utf8Encoding) = " + Arrays.toString(utf8Encoding));
System.out.println("Arrays.toString(utf16Encoding) = " + Arrays.toString(utf16Encoding));
}
}
results in
s = Hello World!
Arrays.toString(defaultEncoding) = [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
Arrays.toString(iso88591) = [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
Arrays.toString(asciiEncoding) = [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
Arrays.toString(utf8Encoding) = [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33]
Arrays.toString(utf16Encoding) = [-2, -1, 0, 72, 0, 101, 0, 108, 0, 108, 0, 111, 0, 32, 0, 87, 0, 111, 0, 114, 0, 108, 0, 100, 0, 33]
s = こんにちは世界!
Arrays.toString(defaultEncoding) = [-29, -127, -109, -29, -126, -109, -29, -127, -85, -29, -127, -95, -29, -127, -81, -28, -72, -106, -25, -107, -116, 33]
Arrays.toString(iso88591) = [63, 63, 63, 63, 63, 63, 63, 33]
Arrays.toString(asciiEncoding) = [63, 63, 63, 63, 63, 63, 63, 33]
Arrays.toString(utf8Encoding) = [-29, -127, -109, -29, -126, -109, -29, -127, -85, -29, -127, -95, -29, -127, -81, -28, -72, -106, -25, -107, -116, 33]
Arrays.toString(utf16Encoding) = [-2, -1, 48, 83, 48, -109, 48, 107, 48, 97, 48, 111, 78, 22, 117, 76, 0, 33]
Always specify the Charset encoding!
.getBytes(Charset) is always the correct way to convert a String to bytes. Use whatever encoding you need.
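If Guava is not on the classpath, the JDK's own java.nio.charset.StandardCharsets (available since Java 7) offers the same constants. The following is a minimal sketch; the class and method names are made up for illustration. The Japanese string is written with Unicode escapes only so that the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class CharsetLengths {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded with Java's "UTF-16" charset,
    // which writes a 2-byte byte order mark before the data.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16).length;
    }

    public static void main(String[] args) {
        String ascii = "abcdef";
        // Same text as the second sample string in the answer above.
        String japanese = "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c!";

        System.out.println(utf8Length(ascii));     // 6
        System.out.println(utf16Length(ascii));    // 14
        System.out.println(utf8Length(japanese));  // 22
        System.out.println(utf16Length(japanese)); // 18
    }
}
```

The same 8-character string therefore needs 22 bytes in one encoding and 18 in another, which is why the charset should always be stated explicitly.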
Internally supported encodings for JDK7

new byte[6] has no lasting effect, because the reference buf is immediately overwritten with a reference to the array returned by "abcdef".getBytes().

That's because String.getBytes() returns an entirely different array object which is then assigned to buf. You could have just as easily done this:
byte[] buf = "abcdef".getBytes();
System.out.println(buf.length);
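To see the reassignment concretely, here is a minimal sketch (the class name ReassignDemo and the helper copyInto are hypothetical, introduced only for illustration). It keeps a second reference to the originally allocated array to show that getBytes() never touched it, and it uses System.arraycopy for the case where you really do want to fill a pre-allocated buffer. A charset is passed explicitly so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ReassignDemo {
    // Hypothetical helper: copies as many bytes of s as fit into dest
    // and returns how many bytes were copied.
    static int copyInto(String s, byte[] dest) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(src.length, dest.length);
        System.arraycopy(src, 0, dest, 0, n);
        return n;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[2];
        byte[] original = buf; // second reference to the 2-element array

        // getBytes() allocates a new array; buf now points to that one.
        buf = "abcdef".getBytes(StandardCharsets.UTF_8);

        System.out.println(buf.length);                // 6
        System.out.println(original.length);           // 2
        System.out.println(buf == original);           // false
        System.out.println(Arrays.toString(original)); // [0, 0]

        // Filling the existing array instead of replacing the reference:
        int copied = copyInto("abcdef", original);
        System.out.println(copied + " " + Arrays.toString(original)); // 2 [97, 98]
    }
}
```

Only the copy variant is limited by the size given in new byte[...]; plain assignment just swaps which array the variable refers to.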

Related

Phoenix framework: How to decode Phoenix Session Cookie with Java

I am trying two different ways to decode Phoenix Session Cookie.
The first one uses Elixir's interactive shell, and the second one uses Java.
Please see the following examples:
IEx
iex(1)> set_cookie = "SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"
"SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"
iex(2)> [_, payload, _] = String.split(set_cookie, ".", parts: 3)
["SFMyNTY",
"g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP",
"l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M"]
iex(3)> {:ok, encoded_term } = Base.url_decode64(payload, padding: false)
{:ok,
<<131, 116, 0, 0, 0, 1, 109, 0, 0, 0, 11, 95, 99, 115, 114, 102, 95, 116, 111,
107, 101, 110, 109, 0, 0, 0, 24, 100, 84, 110, 53, 75, 80, 50, 66, 121, 97,
107, 74, 79, 82, 103, 89, 75, 66, 121, 120, 102, ...>>}
iex(4)> :erlang.binary_to_term(encoded_term)
%{"_csrf_token" => "dTn5KP2ByakJORgYKByxf6gO"}
Java
public static String decodePhoenixSessionCookie(String sessionCookie) {
String payload = sessionCookie.split("\\.")[1];
byte[] encoded_term = Base64.getUrlDecoder().decode(payload.getBytes());
return new String(encoded_term);
}
Java Output
�tm_csrf_tokenmdTn5KP2ByakJORgYKByxf6gO
What I wonder is: with the Java approach I can recover the field name and its value, but some gibberish characters come with them.
Do you know the reason for this?
Is there a way to get clean output in Java, as in Elixir?
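The IEx session above suggests where the extra characters come from: the decoded payload is an Erlang external term, not plain text, so it starts with tag bytes (131, then 116 for a map, then 109 before each binary) and 4-byte big-endian lengths. Below is a minimal sketch, under the assumption that the session is a flat map whose keys and values are all binaries, as in this example. The class and method names are hypothetical, other term types are not handled, and the cookie signature is not verified.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

class PhoenixCookieSketch {
    private static final int VERSION_TAG = 131; // first byte of an external term
    private static final int MAP_TAG = 116;     // map: 4-byte pair count follows
    private static final int BINARY_TAG = 109;  // binary: 4-byte length follows

    // Decodes the middle part of the cookie into a map of strings.
    static Map<String, String> decode(String sessionCookie) {
        String payload = sessionCookie.split("\\.")[1];
        // ByteBuffer reads multi-byte integers big-endian by default.
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(payload));

        if ((buf.get() & 0xFF) != VERSION_TAG) {
            throw new IllegalArgumentException("not an Erlang external term");
        }
        if ((buf.get() & 0xFF) != MAP_TAG) {
            throw new IllegalArgumentException("this sketch only handles maps");
        }
        int pairs = buf.getInt();
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < pairs; i++) {
            String key = readBinary(buf);
            String value = readBinary(buf);
            result.put(key, value);
        }
        return result;
    }

    // Reads one binary (tag, length, data) and interprets it as UTF-8 text.
    private static String readBinary(ByteBuffer buf) {
        if ((buf.get() & 0xFF) != BINARY_TAG) {
            throw new IllegalArgumentException("this sketch only handles binaries");
        }
        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cookie = "SFMyNTY.g3QAAAABbQAAAAtfY3NyZl90b2tlbm0AAAAYZFRuNUtQMkJ5YWtKT1JnWUtCeXhmNmdP.l0T3G-i8I5dMwz7lEZnQAeK_WeqEZTxcDeyNY2poz_M";
        System.out.println(decode(cookie)); // {_csrf_token=dTn5KP2ByakJORgYKByxf6gO}
    }
}
```

For sessions that hold other term types (atoms, integers, nested maps), a full external term format decoder would be needed instead of this hand-written reader.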

Casting a String-ified byte array into byte[] [duplicate]

I'm trying to understand a byte[] to string, string representation of byte[] to byte[] conversion... I convert my byte[] to a string to send, I then expect my web service (written in python) to echo the data straight back to the client.
When I send the data from my Java application...
Arrays.toString(data.toByteArray())
Bytes to send..
[B@405217f8
Send (This is the result of Arrays.toString() which should be a string representation of my byte data, this data will be sent across the wire):
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
On the Python side, the Python server returns a string to the caller (which I can see is the same as the string I sent to the server):
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
The server should return this data to the client, where it can be verified.
The response my client receives (as a string) looks like
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
I can't seem to figure out how to get the received string back into a
byte[]
Whatever I seem to try I end up getting a byte array which looks as follows...
[91, 45, 52, 55, 44, 32, 49, 44, 32, 49, 54, 44, 32, 56, 52, 44, 32, 50, 44, 32, 49, 48, 49, 44, 32, 49, 49, 48, 44, 32, 56, 51, 44, 32, 49, 49, 49, 44, 32, 49, 48, 57, 44, 32, 49, 48, 49, 44, 32, 51, 50, 44, 32, 55, 56, 44, 32, 55, 48, 44, 32, 54, 55, 44, 32, 51, 50, 44, 32, 54, 56, 44, 32, 57, 55, 44, 32, 49, 49, 54, 44, 32, 57, 55, 93]
or I can get a byte representation which is as follows:
[B@2a80d889
Both of these are different from my sent data... I'm sure I'm missing something truly simple....
Any help?!
You can't just take the returned string and construct a string from it... it's not a byte[] data type anymore, it's already a string; you need to parse it. For example :
String response = "[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]"; // response from the Python script
String[] byteValues = response.substring(1, response.length() - 1).split(",");
byte[] bytes = new byte[byteValues.length];
for (int i=0, len=bytes.length; i<len; i++) {
bytes[i] = Byte.parseByte(byteValues[i].trim());
}
String str = new String(bytes);
** EDIT **
You get a hint of your problem in your question, where you say "Whatever I seem to try I end up getting a byte array which looks as follows... [91, 45, ...": 91 is the byte value for [, so [91, 45, ... is the byte array of the string "[-47, 1, 16, ...".
The method Arrays.toString() returns a String representation of the specified array, meaning that the returned value is not an array anymore. For example:
byte[] b1 = new byte[] {97, 98, 99};
String s1 = Arrays.toString(b1);
String s2 = new String(b1);
System.out.println(s1); // -> "[97, 98, 99]"
System.out.println(s2); // -> "abc";
As you can see, s1 holds the string representation of the array b1, while s2 holds the string representation of the bytes contained in b1.
Now, in your problem, your server returns a string similar to s1, therefore to get the array representation back, you need the opposite constructor method. If s2.getBytes() is the opposite of new String(b1), you need to find the opposite of Arrays.toString(b1), thus the code I pasted in the first snippet of this answer.
String coolString = "cool string";
byte[] byteArray = coolString.getBytes();
String reconstitutedString = new String(byteArray);
System.out.println(reconstitutedString);
That outputs "cool string" to the console.
It's pretty darn easy.
What I did:
return to clients:
byte[] result = ****encrypted data****;
String str = Base64.encodeBase64String(result);
return str;
receive from clients:
byte[] bytes = Base64.decodeBase64(str);
your data will be transferred in this format:
OpfyN9paAouZ2Pw+gDgGsDWzjIphmaZbUyFx5oRIN1kkQ1tDbgoi84dRfklf1OZVdpAV7TonlTDHBOr93EXIEBoY1vuQnKXaG+CJyIfrCWbEENJ0gOVBr9W3OlFcGsZW5Cf9uirSmx/JLLxTrejZzbgq3lpToYc3vkyPy5Y/oFWYljy/3OcC/S458uZFOc/FfDqWGtT9pTUdxLDOwQ6EMe0oJBlMXm8J2tGnRja4F/aVHfQddha2nUMi6zlvAm8i9KnsWmQG//ok25EHDbrFBP2Ia/6Bx/SGS4skk/0couKwcPVXtTq8qpNh/aYK1mclg7TBKHfF+DHppwd30VULpA==
What Arrays.toString() does is create a string representation of each individual byte in your byteArray.
Please check the API documentation
Arrays API
To convert your response string back to the original byte array, you have to use split(",") or something and convert it into a collection and then convert each individual item in there to a byte to recreate your byte array.
It's simple to convert a byte array to a string and a string back to a byte array in Java. We just need to know when to use 'new'.
It can be done as follows:
byte array to string conversion:
byte[] bytes = initializeByteArray();
String str = new String(bytes);
String to byte array conversion:
String str = "Hello";
byte[] bytes = str.getBytes();
For more details, look at:
http://evverythingatonce.blogspot.in/2014/01/tech-talkbyte-array-and-string.html
The kind of output you are seeing from your byte array ([B@405217f8) is also an output for a zero length byte array (ie new byte[0]). It looks like this string is a reference to the array rather than a description of the contents of the array like we might expect from a regular collection's toString() method.
As with other respondents, I would point you to the String constructors that accept a byte[] parameter to construct a string from the contents of a byte array. You should be able to read raw bytes from a socket's InputStream if you want to obtain bytes from a TCP connection.
If you have already read those bytes as a String (using an InputStreamReader), then, the string can be converted to bytes using the getBytes() function. Be sure to pass in your desired character set to both the String constructor and getBytes() functions, and this will only work if the byte data can be converted to characters by the InputStreamReader.
If you want to deal with raw bytes you should really avoid using this stream reader layer.
Can you not just send the bytes as bytes, or convert each byte to a character and send as a string? Doing it like you are will take up a minimum of 85 characters in the string, when you only have 11 bytes to send. You could create a string representation of the bytes, so it'd be "[B@405217f8", which can easily be converted to a bytes or bytearray object in Python. Failing that, you could represent them as a series of hexadecimal digits ("5b42403430353231376638") taking up 22 characters, which could be easily decoded on the Python side using binascii.unhexlify().
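The hexadecimal suggestion can be sketched on the Java side as follows. This is a minimal illustration using only the standard library; the class and method names are hypothetical. Each byte becomes exactly two hex digits, so the 20-byte payload from the question would travel as 40 characters, and Python's binascii.unhexlify() can turn that text back into bytes.

```java
class HexSketch {
    // Encodes each byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask with 0xFF so negative bytes map to 80..ff instead of ffffff80..ffffffff.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes a hex string of even length back into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {-47, 1, 16};
        String hex = toHex(data);
        System.out.println(hex); // d10110
        System.out.println(java.util.Arrays.toString(fromHex(hex))); // [-47, 1, 16]
    }
}
```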
[JDK8]
import java.util.Base64;
To string:
String str = Base64.getEncoder().encodeToString(new byte[]{ -47, 1, 16, ... });
To byte array:
byte[] bytes = Base64.getDecoder().decode("JVBERi0xLjQKMyAwIG9iago8P...");
If you want to convert the string back into a byte array you will need to use String.getBytes() (or an equivalent Python function), and this will allow you to print out the original byte array.
Use the API call below to convert a Base64 string to a byte array.
byte[] byteArray = DatatypeConverter.parseBase64Binary("JVBERi0xLjQKMyAwIG9iago8P...");
[JAVA 8]
import java.util.Base64;
String dummy= "dummy string";
byte[] byteArray = dummy.getBytes();
byte[] salt = new byte[]{ -47, 1, 16, ... };
String encoded = Base64.getEncoder().encodeToString(salt);
You can do the following to convert byte array to string and then convert that string to byte array:
// 1. convert byte array to string and then string to byte array
// convert byte array to string
byte[] by_original = {0, 1, -2, 3, -4, -5, 6};
String str1 = Arrays.toString(by_original);
System.out.println(str1); // output: [0, 1, -2, 3, -4, -5, 6]
// convert string to byte array
String newString = str1.substring(1, str1.length()-1);
String[] stringArray = newString.split(", ");
byte[] by_new = new byte[stringArray.length];
for(int i=0; i<stringArray.length; i++) {
by_new[i] = (byte) Integer.parseInt(stringArray[i]);
}
System.out.println(Arrays.toString(by_new)); // output: [0, 1, -2, 3, -4, -5, 6]
But to convert the string to byte array and then convert that byte array to string, below approach can be used:
// 2. convert string to byte array and then byte array to string
// convert string to byte array
String str2 = "[0, 1, -2, 3, -4, -5, 6]";
byte[] byteStr2 = str2.getBytes(StandardCharsets.UTF_8);
// Now byteStr2 is [91, 48, 44, 32, 49, 44, 32, 45, 50, 44, 32, 51, 44, 32, 45, 52, 44, 32, 45, 53, 44, 32, 54, 93]
// convert byte array to string
System.out.println(new String(byteStr2, StandardCharsets.UTF_8)); // output: [0, 1, -2, 3, -4, -5, 6]
I have also answered the same in the following question:
https://stackoverflow.com/a/70486387/17364272

Java Base64 encode gives different results than C base64 encode

I try to encode this byte array:
[237, 217, 204, 218, 109, 227, 157, 145, 35, 152, 85, 142, 182, 180, 120, 8]
Using Java library org.apache.commons.codec.binary.Base64.encodeBase64 and org.bouncycastle.util.encoders.Base64.encode this is the results:
[55, 100, 110, 77, 50, 109, 51, 106, 110, 90, 69, 106, 109, 70, 87, 79, 116, 114, 82, 52, 67, 65, 61, 61]
(note the double '=' padding character at the end)
Using base64.c Copyright (c) 1995-2001 Kungliga Tekniska Högskolan (Royal Institute of Technology, Stockholm, Sweden) this is the output:
[55, 100, 110, 77, 50, 109, 51, 106, 110, 90, 69, 106, 109, 70, 87, 79, 116, 114, 82, 52, 67, 66, 72, 114]
Can anyone explain why? How can I make the Java/C library works the same way?
Every Base64 ASCII character holds 6 bits of information (2^6 = 64), so 4 Base64 characters hold 3 bytes of information. You have 16 bytes, so one byte remains at the end, needing 2 Base64 characters, and to make the group up to 4 chars, two padding =s are added.
Mind: Java SE 8 introduced a class Base64 to replace the several older classes that were around.
Base64 has several fields of application, with various little changes: padding can be left out, line breaks added to limit line length, and so on. Java 8's Base64 even has an option for a non-compatible URL and file name safe version, where + and / are replaced.
Base64 works on blocks of 3 bytes, and the = padding is there to bring the output size up to a multiple of 4 characters. This padding is optional, and if it is not there you can simply add it manually by checking the array length before trying to decode using the Java code.
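The padding arithmetic can be checked with the JDK's own java.util.Base64 (Java 8+). The sketch below uses the 16-byte input from the question; the class name and the toBytes helper are hypothetical. It only demonstrates the Java side and makes no claim about what the C library does with the trailing group.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64PaddingSketch {
    // The 16 input bytes from the question, written as unsigned values.
    static final byte[] INPUT = toBytes(
            237, 217, 204, 218, 109, 227, 157, 145, 35, 152, 85, 142, 182, 180, 120, 8);

    // Hypothetical helper: narrows unsigned int values (0..255) to Java's signed bytes.
    static byte[] toBytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // 16 bytes = 5 full groups of 3 plus 1 leftover byte, so two '=' are appended.
        String padded = Base64.getEncoder().encodeToString(INPUT);
        String unpadded = Base64.getEncoder().withoutPadding().encodeToString(INPUT);

        System.out.println(padded);   // 7dnM2m3jnZEjmFWOtrR4CA==
        System.out.println(unpadded); // 7dnM2m3jnZEjmFWOtrR4CA

        // The JDK decoder accepts both forms and yields the same bytes.
        System.out.println(Arrays.equals(INPUT, Base64.getDecoder().decode(padded)));   // true
        System.out.println(Arrays.equals(INPUT, Base64.getDecoder().decode(unpadded))); // true
    }
}
```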

Getting different encrypted password and salt while storing and retrieving from MySQL

I am trying to encrypt the password using a salt and storing it into the MySQL database.
I referred to this Stack Overflow question.
My code is similar to this:
private byte[] encrypt(String passwordToSave, byte[] salt)
throws UnsupportedEncodingException
{
int seedBytes = 20;
int hashBytes = 20;
int iterations = 1000;
if(null == salt)
{
SecureRandom rng = new SecureRandom();
salt = rng.generateSeed(seedBytes);
}
PKCS5S2ParametersGenerator kdf = new PKCS5S2ParametersGenerator();
kdf.init(passwordToSave.getBytes("UTF-8"), salt, iterations);
byte[] hash =
((KeyParameter) kdf.generateDerivedMacParameters(8*hashBytes)).getKey();
return hash;
}
I altered the function slightly so that it serves both purposes:
Encrypt the password while creating the user account and store it with the salt.
Encrypt the user's password with the stored salt from the database when they try to log in, and check it against the stored password value.
The issue with this is,
I am not getting back what I stored.
I used a lot of different things,
I used Base64 for encoding and stored into DB and decoded using the same while getting it back.
I tried to use VARBINARY and BLOB to save the byte[] data but no luck.
Then I used VARCHAR and just stored the byte[] by creating a new String from it using "UTF-8" encoding type.
I am new to cryptography so if I am wrong, please point it out.
Thanks in advance. :)
EDIT:
The output when I ran the encrypt twice:
Salt : [34, 17, -80, -59, 93, -90, 37, -25, -11, -43, 44, 1, 10, 7, -66, -108, 97, 36, 95, -116]
First Attempt: [-76, -3, 114, -69, 78, 21, -59, 23, 127, -15, 114, -106, -52, 23, 34, 91, 123, 6, 76, -115]
Second Attempt: [-76, -3, 114, -69, 78, 21, -59, 23, 127, -15, 114, -106, -52, 23, 34, 91, 123, 6, 76, -115]
Salt : [34, 17, -80, -59, 93, -90, 37, -25, -11, -43, 44, 1, 10, 7, -66, -108, 97, 36, 95, -116]
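The question does not show the storage code, so the cause is not confirmed. One thing that can be checked in isolation is the VARCHAR attempt: arbitrary bytes are usually not valid UTF-8, so turning them into a String and back changes them, whereas Base64 text round-trips exactly. The sketch below uses the salt printed above; the class and method names are hypothetical, and it does not cover database-side issues such as column length or collation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class SaltStorageSketch {
    // Mimics storing raw bytes in a text column via new String(bytes, "UTF-8").
    static byte[] viaUtf8String(byte[] raw) {
        String stored = new String(raw, StandardCharsets.UTF_8);
        return stored.getBytes(StandardCharsets.UTF_8);
    }

    // Mimics storing the bytes as Base64 text and decoding them on the way back.
    static byte[] viaBase64(byte[] raw) {
        String stored = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] salt = {34, 17, -80, -59, 93, -90, 37, -25, -11, -43,
                       44, 1, 10, 7, -66, -108, 97, 36, 95, -116};

        // Invalid UTF-8 sequences are replaced during decoding, so the bytes change.
        System.out.println(Arrays.equals(salt, viaUtf8String(salt))); // false
        // Base64 is a lossless text representation of the bytes.
        System.out.println(Arrays.equals(salt, viaBase64(salt)));     // true
    }
}
```

If the Base64 variant also returned different data from the database, the difference is probably introduced elsewhere, for example by a column that is too short for the encoded text.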

Store a list of Strings as human-readable path

I need to store a list of Strings into a single field in Java. The order is important, and I would prefer it to be stored in a human-readable format.
The perfect solution would be storing it like an XPath, but I only know libraries for compiling complex XML files to XPath, not lists of Strings.
My own written solutions easily get too complex because I want to support Strings containing any character, including the one I use as delimiter.
I currently use serialization this way:
String[] items = new String[3];
items[0] = item1;
items[1] = item2;
items[2] = item3;
byte[] bytes = SerializationUtils.serialize(items);
System.out.println("Serialized:\n"+Arrays.toString(bytes));
String[] read = (String[]) SerializationUtils.deserialize(bytes);
System.out.println("Read:");
for(String s : read) {
System.out.println(s);
}
Output:
[-84, -19, 0, 5, 117, 114, 0, 19, 91, 76, 106, 97, 118, 97, 46, 108, 97, 110, 103, 46, 83, 116, 114, 105, 110, 103, 59, -83, -46, 86, -25, -23, 29, 123, 71, 2, 0, 0, 120, 112, 0, 0, 0, 3, 116, 0, 7, 110, 117, 109, 98, 101, 114, 49, 116, 0, 8, 110, 117, 109, 98, 101, 114, 47, 50, 116, 0, 8, 110, 117, 109, 98, 101, 114, 92, 51]
This works, but apart from generating a very long String, it also generates a non-human readable string.
How can I best store this path, in a human readable way, and as little complication in my code as possible?
Solution
This is my solution using the OstermillerUtils as suggested by ct_ (thanks!).
String item1="number1";
String item2="number/2";
String item3="number\\3";
String item4="//number/4\\";
String item5=",num\"ber5,";
String item6="number,6";
String[] items = new String[6];
items[0] = item1;
items[1] = item2;
items[2] = item3;
items[3] = item4;
items[4] = item5;
items[5] = item6;
System.out.println("Test values");
for(String s : items) {
System.out.println(s);
}
StringWriter writer = new StringWriter();
CSVPrinter printer = new CSVPrinter(writer);
printer.changeDelimiter('/');
printer.write(items);
System.out.println("Persisted:\n\t"+writer.toString());
String[][] results = CSVParser.parse(writer.toString(), '/');
for (int j=0; j<results[0].length; j++){
System.out.println(results[0][j]);
}
So you want to serialize and deserialize a String array to a string and back? Have a look at http://ostermiller.org/utils/CSV.html - it can serialize and deserialize arrays using an arbitrary delimiter.
JAXB only using annotations would work. Best when you have one container class with a list field. You then get XML.
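For comparison, a dependency-free variant of the path idea from the question can be sketched with backslash escaping. This is not the OstermillerUtils approach shown above; the class and method names are hypothetical. Items are joined with '/', and any '/' or '\' inside an item is prefixed with '\' so the delimiter stays unambiguous. One known limitation: an empty list and a list holding one empty string both produce the empty path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PathListSketch {
    // Joins the items with '/', escaping '/' and '\' inside each item with '\'.
    static String join(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            for (char c : items.get(i).toCharArray()) {
                if (c == '/' || c == '\\') {
                    sb.append('\\');
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Splits on unescaped '/' and removes the escaping again.
    static List<String> split(String path) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '\\' && i + 1 < path.length()) {
                current.append(path.charAt(++i)); // take the escaped character literally
            } else if (c == '/') {
                out.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        out.add(current.toString());
        return out;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                "number1", "number/2", "number\\3", "//number/4\\", ",num\"ber5,", "number,6");
        String path = join(items);
        System.out.println(path);
        System.out.println(split(path).equals(items)); // true
    }
}
```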
