From this post on how to achieve password-based encryption, it is clear that I need to save the salt, IV and ciphertext in order to decrypt it later.
From this, the IV and salt can be stored along with the ciphertext.
I am storing the hex values in this format:
DatatypeConverter.printHexBinary(salt) + DatatypeConverter.printHexBinary(iv) + DatatypeConverter.printHexBinary(ciphertext);
Do I need to store the values in binary format?
DatatypeConverter.printBase64Binary(salt) + DatatypeConverter.printBase64Binary(iv) + DatatypeConverter.printBase64Binary(ciphertext);
The output clearly indicates where the salt and the IV end, which is awful:
lIvyAA/PZg4=fE4gTZUCPTrKQpUKo+Z1SA==4/gAdiOqyPOAzXR69i0wlC7YFn9/KOGitZqpOW2y3ms=
Will storing in hex format cause any data loss?
Is the length of the IV constant? In my case it is always 32 characters (hexadecimal).
Or do I need to store the length of the IV as well? The salt length is fixed initially to 8 bytes (16 hexadecimal characters).
(I am using PBKDF2WithHmacSHA1 algorithm for key generation and AES/CBC/PKCS5Padding for cipher)
I think it is worth emphasizing again what the accepted answer above mentioned in passing.
That is, it is unnecessary and unwarranted to make any attempt to hide the salt or the IV. The security of your cryptography is entirely dependent on the secrecy of the secret key, and that of the secret key alone. The IV and the salt can be handed out in clear text along with the ciphertext, and as long as the secret key remains a secret, the ciphertext remains secure.
It's important to understand and accept that, or you will wind yourself about an axle trying to obfuscate things that don't matter. There is no security in obscurity.
It is important to note, however, that the salt should be generated by a cryptographically strong pseudorandom number generator. A new salt should be generated for each new plaintext that is being encrypted. Likewise, the IV should be randomly generated for each new ciphertext.
Those parameters need to be independent and unpredictable but need not be secret.
So you can store them in separate fields or delimit them in a single field, or use fixed lengths for the first two of three fields. For maximum flexibility and future proofing, though, I suggest delimited fields, and include all parameters needed to deal with the data. If you are using PBE, I would include the algorithm name and the iteration count, too, rather than rely on default values.
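As a rough sketch of the delimited-field idea, the following packs the KDF name, the iteration count, the salt, the IV and the ciphertext into one colon-separated string. The class name PbeEnvelope, the field order and the 16-byte salt are assumptions made for illustration, not a standard format.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: the name and the field layout are assumptions.
final class PbeEnvelope {
    private static final String KDF = "PBKDF2WithHmacSHA1";
    private static final String CIPHER = "AES/CBC/PKCS5Padding";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Layout: kdf:iterations:salt:iv:ciphertext (binary fields are Base64).
    static String encrypt(char[] password, byte[] plaintext, int iterations)
            throws GeneralSecurityException {
        byte[] salt = new byte[16];
        byte[] iv = new byte[16];   // AES block size
        RANDOM.nextBytes(salt);     // fresh salt for every message
        RANDOM.nextBytes(iv);       // fresh IV for every message
        Cipher cipher = Cipher.getInstance(CIPHER);
        cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt, iterations),
                new IvParameterSpec(iv));
        Base64.Encoder b64 = Base64.getEncoder();
        return String.join(":", KDF, Integer.toString(iterations),
                b64.encodeToString(salt), b64.encodeToString(iv),
                b64.encodeToString(cipher.doFinal(plaintext)));
    }

    static byte[] decrypt(char[] password, String envelope)
            throws GeneralSecurityException {
        String[] f = envelope.split(":");
        if (f.length != 5 || !KDF.equals(f[0])) {
            throw new GeneralSecurityException("unsupported envelope");
        }
        Base64.Decoder b64 = Base64.getDecoder();
        int iterations = Integer.parseInt(f[1]);  // read back, not assumed
        byte[] salt = b64.decode(f[2]);
        byte[] iv = b64.decode(f[3]);
        Cipher cipher = Cipher.getInstance(CIPHER);
        cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt, iterations),
                new IvParameterSpec(iv));
        return cipher.doFinal(b64.decode(f[4]));
    }

    private static SecretKeySpec deriveKey(char[] password, byte[] salt, int iterations)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 128);
        byte[] key = SecretKeyFactory.getInstance(KDF).generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }
}
```

Because the iteration count travels with the data, it can be raised later without breaking old ciphertexts.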
Base64 encodes chunks of 3 bytes into 4 Base64 characters. If the number of bytes to be encoded isn't a multiple of 3, the last block is padded with one or two = characters to indicate that this block isn't a full 3 bytes.
As neither the salt nor the IV needs to be kept secret, there really isn't any problem with being able to detect where they start or stop. The Base64 padding character = isn't a problem, but you ought to have a way to separate the three encoded strings. You could, e.g., simply separate the parts with a :.
The size of the IV is the same as the block size of your encryption algorithm. In this case you use AES, which has a block size of 128 bits, i.e. 16 bytes. This gives 32 characters if hex encoded, or 24 characters if Base64 encoded. The salt doesn't really have a fixed length; it depends on your implementation.
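A minimal sketch of the colon-separated layout, assuming the standard java.util.Base64 codec; the class name SaltIvCiphertext is made up for this example. The main method only prints the encoded length of a 16-byte IV and one joined sample.

```java
import java.util.Base64;

// Hypothetical helper; the name and the ':' layout are assumptions.
final class SaltIvCiphertext {
    // Join the three Base64 fields with ':', which never occurs in Base64 output.
    static String join(byte[] salt, byte[] iv, byte[] ciphertext) {
        Base64.Encoder b64 = Base64.getEncoder();
        return b64.encodeToString(salt) + ":" + b64.encodeToString(iv) + ":"
                + b64.encodeToString(ciphertext);
    }

    // Returns {salt, iv, ciphertext}; rejects input that is not three fields.
    static byte[][] split(String joined) {
        String[] parts = joined.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected salt:iv:ciphertext");
        }
        Base64.Decoder b64 = Base64.getDecoder();
        return new byte[][] {b64.decode(parts[0]), b64.decode(parts[1]), b64.decode(parts[2])};
    }

    public static void main(String[] args) {
        byte[] iv = new byte[16]; // one AES block
        System.out.println(Base64.getEncoder().encodeToString(iv).length()); // 24
        System.out.println(join(new byte[8], iv, new byte[32]));
    }
}
```

With the separator in place, neither the salt length nor the IV length has to be hard-coded in the reader.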
What if the input key is less than 16 bytes? I found a solution, but it's not best practice.
// Hashing here acts as padding: for any input it produces a fixed 20-byte output.
MessageDigest sha = MessageDigest.getInstance("SHA-1");
key = sha.digest(key);
// Trim the hash to only 16 bytes.
key = Arrays.copyOf(key, 16);
I'm not planning to use salt because it is not necessary in my project. Is there any better way?
There are three approaches:
1. Pad the key out to 16 bytes. You can use any value(s) you want to as padding, just so long as you do it consistently.
2. Your scheme of using a SHA-1 hash is OK. It would be better if you could use all of the bits in the hash as the key, but 128 bits should be enough.
3. Tell the user that the key needs to be at least N characters. A key that is too short may be susceptible to a password guessing attack. (A 15 character key is probably too long to be guessed, but 8 characters is tractable.) In fact, you probably should do some other password quality checks.
My recommendation is to combine 1. or 2. with 3 ... and password quality checks.
I'm not convinced that seeding the hash will make much difference. (I am assuming that the bad guy would be able to inspect your file encryption app and work out how you turn passwords into keys.) Seeding means that the bad guy cannot pre-generate a set of candidate keys for common / weak passwords, but he still needs to try each of the generated keys in turn.
But the flip-side is that using a crypto hash doesn't help if the passwords you start with are weak.
Don't confuse keys and passwords. Keys are randomly generated and may consist of any possible byte value. Passwords on the other hand need to be typable by a human and usually rememberable. If the key is too short then either emit an error to the user or treat it as a password.
A key should then only be entered in encoded format such as hex or Base64. Only check the length when you successfully decode it.
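For instance, a key supplied as Base64 could be handled roughly like this: decode first, and only then check the length of the decoded bytes. The class and method names are hypothetical.

```java
import java.util.Base64;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for keys that arrive in encoded form.
final class EncodedKeyParser {
    static SecretKeySpec parseBase64AesKey(String encoded) {
        // Base64.Decoder.decode throws IllegalArgumentException on malformed input.
        byte[] raw = Base64.getDecoder().decode(encoded.trim());
        // Validate the length of the decoded bytes, not of the text.
        if (raw.length != 16 && raw.length != 24 && raw.length != 32) {
            throw new IllegalArgumentException(
                    "AES keys must be 16, 24 or 32 bytes, got " + raw.length);
        }
        return new SecretKeySpec(raw, "AES");
    }
}
```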
A password has all kinds of issues that makes it brute forceable such as short length or low complexity. There you would need to use a password-based key derivation function such as PBKDF2 and a sufficiently large work factor (iterations) in order to make a single key derivation attempt so slow that an attacker would need much more time to check the whole input space.
You should combine that with some message to the user to give some hints that the password is too short or doesn't include some character classes and is therefore not recommended.
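A sketch of what that could look like with the JDK's PBKDF2 implementation. Note that PBKDF2 needs a salt, so this assumes one is generated and stored next to the ciphertext. The class name, the minimum length of 12 and the crude character-class check are assumptions for illustration; the iteration count is left to the caller.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; policy values are assumptions, not recommendations.
final class PasswordKeys {
    static final int MIN_LENGTH = 12;

    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives a 128-bit AES key; a higher iteration count makes each guess slower.
    static SecretKeySpec derive(char[] password, byte[] salt, int iterations)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 128);
        try {
            byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
            return new SecretKeySpec(key, "AES");
        } finally {
            spec.clearPassword(); // wipe the internal copy of the password
        }
    }

    // Crude hint only: too short, or fewer than two character classes.
    static boolean looksWeak(char[] password) {
        boolean lower = false, upper = false, digit = false, other = false;
        for (char c : password) {
            if (Character.isLowerCase(c)) {
                lower = true;
            } else if (Character.isUpperCase(c)) {
                upper = true;
            } else if (Character.isDigit(c)) {
                digit = true;
            } else {
                other = true;
            }
        }
        int classes = (lower ? 1 : 0) + (upper ? 1 : 0) + (digit ? 1 : 0) + (other ? 1 : 0);
        return password.length < MIN_LENGTH || classes < 2;
    }
}
```

A caller might warn the user when looksWeak returns true and still proceed with derive, rather than silently rejecting the password.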
Given an arbitrary Java byte array, for example a 1024-byte array, I would like to derive an AES-256 key. The array is generated from ECDH via javax.crypto.KeyAgreement using byte[] secret = keyAgreement.generateSecret()
My current solution is to treat the input byte array as a password: I use the PBKDF2 key derivation function with the input array as both the password and the salt, as shown below.
UPDATE: I have set UTF-8 as the encoding to address issues pointed out in comments and answers.
private byte[] deriveAes256bitKey(byte[] secret)
        throws NoSuchAlgorithmException, InvalidKeySpecException {
    var secretKeyFactory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
    var password = new String(secret, UTF_8).toCharArray();
    var keySpec = new PBEKeySpec(password, secret, 1024, 256);
    return secretKeyFactory.generateSecret(keySpec).getEncoded();
}
Is there a better way to take a byte array in Java and turn it into an AES-256 bit key?
I would be wary of using new String(input).toCharArray() to create the password. It's not portable (it uses the platform default encoding), and its behaviour is undefined if there are invalid character sequences in the input.
Consider this:
System.out.println(new String(new byte[] {(byte) 0xf0, (byte) 0x82, (byte) 0x82, (byte) 0xac}, StandardCharsets.UTF_8));
f08282ac is an overlong encoding of the Euro sign (€). It's decoded to the replacement character (�; 0xfffd) because it's an illegal sequence. All illegal UTF-8 sequences will end up as the replacement char, which is not what you want.
You could avoid decoding problems by serialising the byte array before passing it to the SecretKeyFactory (base64 encode it, or simply new BigInteger(input).toString(Character.MAX_RADIX)). However, this can be avoided if you don't use the SecretKeyFactory. It's unnecessary.
PBKDF2 (Password-Based Key Derivation Function 2) is designed to make brute force attacks against user supplied passwords harder by being computationally expensive and adding salt.
You don't need that here (your input is large and random; nobody will be mounting dictionary attacks against it). Your problem is just that the input length doesn't match the required key length.
You can just hash the input to the correct length:
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] keyBytes = md.digest(input);
What is required here is a KBKDF, or Key-Based Key Derivation Function. A KBKDF converts a secret value that contains enough entropy into a different key of a specific size. A PBKDF is used when you need to turn a passphrase with potentially too little entropy into a key using key strengthening (using the salt and work factor or iteration count). The work factor / iteration count doesn't need to be used if the input value is already strong enough not to be guessed / brute forced.
SHA-256 in general suffices if you only want a resulting 128 bit value. However, using a key derivation function may still offer benefits. First of all, it is a function that is explicitly defined for this purpose, so it is easier to prove that it is secure. Furthermore, it is generally possible to add additional data to the key derivation function so that you can e.g. derive more keys or a key and an IV. Or you can expand the configurable output size to output enough data for different keys or key / IV.
That said, most cryptographers won't frown too much if you use SHA-256 (or SHA-512 in case you require more bits for key / IV). The output is still supposed to be randomized using all possible bits from the input, and it is infeasible to invert the function.
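HKDF (RFC 5869) is one well-known function of this kind. The following is a hand-written sketch of its extract-and-expand steps on top of HmacSHA256 from the JDK, to show how one secret can yield a key and an IV via different info labels. The class name and the labels are assumptions; for production use a vetted library implementation would be preferable.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative HKDF-SHA256 sketch; not a reviewed implementation.
final class HkdfSketch {
    private static final String HMAC = "HmacSHA256";
    private static final int HASH_LEN = 32;

    // info must be non-null; salt may be null or empty.
    static byte[] derive(byte[] secret, byte[] salt, byte[] info, int length)
            throws GeneralSecurityException {
        if (length < 1 || length > 255 * HASH_LEN) {
            throw new IllegalArgumentException("bad output length");
        }
        Mac mac = Mac.getInstance(HMAC);

        // Extract: PRK = HMAC(salt, secret); a missing salt becomes HASH_LEN zero bytes.
        byte[] saltOrZeros = (salt == null || salt.length == 0) ? new byte[HASH_LEN] : salt;
        mac.init(new SecretKeySpec(saltOrZeros, HMAC));
        byte[] prk = mac.doFinal(secret);

        // Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated until long enough.
        mac.init(new SecretKeySpec(prk, HMAC));
        byte[] out = new byte[length];
        byte[] block = new byte[0];
        int pos = 0;
        for (int counter = 1; pos < length; counter++) {
            mac.update(block);
            mac.update(info);
            mac.update((byte) counter);
            block = mac.doFinal();
            int n = Math.min(block.length, length - pos);
            System.arraycopy(block, 0, out, pos, n);
            pos += n;
        }
        return out;
    }
}
```

For example, derive(secret, null, "aes-key".getBytes(StandardCharsets.UTF_8), 32) and derive(secret, null, "iv".getBytes(StandardCharsets.UTF_8), 16) give independent-looking outputs from the same agreed secret.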
I generate a random IV every time I encrypt with AES/CBC.
private static IvParameterSpec getRandomIvParameterSpec() {
    byte[] iv = new byte[16];
    new SecureRandom().nextBytes(iv);
    return new IvParameterSpec(iv);
}
And I concatenate the IV to the ciphertext bytes every time I encrypt.
Is there any security improvement if I hash (SHA-256) the IV before concatenating it to the ciphertext bytes?
SHA-256 is deterministic: give it the same input and it will give you the same output. It is not injective, however. If m1 and m2 both hash to h, you cannot conclude that m1 = m2, even if you know that |m1| = |m2| (both messages are of the same length).
Therefore, applying SHA-256 (or any deterministic function) cannot increase the entropy of your data. At best, it won't decrease it. In other words: If your data is 16 purely random bytes, it won't be “more than purely random” after you hash it. And if your data was not purely random to begin with, then hashing it won't help making it random. You have to use a better entropy source in the first place.
Another problem that you didn't mention is that you currently have 16 random bytes but if you put them into your SHA-256 hash function, you'll get 32 bytes out. Which ones are you going to use? If you only use every second byte, then, because the hash is a fixed deterministic function of your 16 input bytes, you won't get all possible bit patterns even if your input was perfectly random and the hash function was flawless. (If you did, then this would, by the pigeonhole principle, mean that the other half of the bytes would always be a function of the bytes you did choose. Only a really crappy hash function, which SHA-256 of course is not, would have such a property.) If you try to be clever and combine the bytes in some “smart” way, chances are that you'll make things even worse.
So the short answer is: just don't do it. Generate as many random bytes as you need using the strongest non-deterministic entropy source you have available and use them directly.
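In other words, something along these lines is enough: the random IV is used as-is and simply prepended to the ciphertext, and the decrypting side reads it back from the first 16 bytes. The class name is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: output layout is IV (16 bytes) followed by ciphertext.
final class CbcWithPrependedIv {
    private static final String CIPHER = "AES/CBC/PKCS5Padding";
    private static final int IV_LEN = 16;
    private static final SecureRandom RANDOM = new SecureRandom();

    static byte[] encrypt(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LEN];
        RANDOM.nextBytes(iv); // used directly, no hashing
        Cipher cipher = Cipher.getInstance(CIPHER);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] out = new byte[IV_LEN + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ciphertext, 0, out, IV_LEN, ciphertext.length);
        return out;
    }

    static byte[] decrypt(SecretKey key, byte[] ivAndCiphertext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(CIPHER);
        // The IV is public; read it from the front of the message.
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(ivAndCiphertext, 0, IV_LEN));
        return cipher.doFinal(ivAndCiphertext, IV_LEN, ivAndCiphertext.length - IV_LEN);
    }
}
```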
Can anyone explain the PHP code and give me hints on how to port the code in Java?
Here is the PHP code:
function decode_string($encoded_string, $key) {
    $decoded = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($encoded_string), MCRYPT_MODE_CBC, md5(md5($key))), "\0");
    return $decoded;
}
OK, I'll bite, but I'll let you do the coding:
rtrim(x, "\0"): removes the braindead zero padding (0..31 zero bytes for a 32-byte block) that PHP employs to make the plaintext a multiple of the block size, as required for CBC. You'll have to program this yourself as it is not present in Bouncy Castle, so don't use any padding mode. Just remove the zero-valued bytes at the right of the decrypted plaintext.
mcrypt_decrypt(MCRYPT_RIJNDAEL_256): probably somebody thought that this means AES-256, which it isn't. It is Rijndael with a block size of 256 bits. You need the Bouncy Castle libs in Java to decrypt that non-standardized part of the cipher
md5($key): somebody needed 256 bits of key material and thought that the hex encoding of the MD5 value over a password was good enough. It isn't, as it only provides half of the entropy (2 hex chars per byte). That and the fact that MD5 is not a password hashing function makes this disingenuous at best
base64_decode($encoded_string): well, expect base 64 encoding, which is alright if the ciphertext needed to be present as ASCII compatible text
MCRYPT_MODE_CBC: that's OK, but as PHP is mainly used as a web language, I expect the message to be vulnerable to padding oracle / plain text oracle attacks, and you should of course expect any alteration of the ciphertext to be undetectable
md5(md5($key)): applying MD5 twice does not make this any more safe than a zero IV and don't forget the hexadecimal conversion performed by each of these functions; fortunately that does mean that the IV is at least 256 bits instead of 128 bits
So you need to use:
new BufferedBlockCipher(new CBCBlockCipher(new RijndaelEngine(256)))
in the lightweight API of Bouncy Castle.
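The parts around the cipher need nothing beyond the JDK. As a rough sketch, the helpers below reproduce PHP's hex-string md5() for the key and IV and the rtrim of zero bytes; the Rijndael-256 decryption itself is deliberately left out. The class name is made up, and treating $key as UTF-8 is an assumption, since PHP strings are raw bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helpers for the non-cipher steps of the PHP function.
final class PhpCompat {
    // PHP's md5() returns 32 lowercase hex characters, not 16 raw bytes.
    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8)); // assumed encoding
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required on every Java platform", e);
        }
    }

    // md5($key): the 32 hex characters are used as the 32-byte key.
    static byte[] keyBytes(String key) {
        return md5Hex(key).getBytes(StandardCharsets.US_ASCII);
    }

    // md5(md5($key)): again 32 hex characters, used as the 32-byte IV.
    static byte[] ivBytes(String key) {
        return md5Hex(md5Hex(key)).getBytes(StandardCharsets.US_ASCII);
    }

    // Equivalent of rtrim($x, "\0") on the decrypted bytes.
    static byte[] stripTrailingZeros(byte[] data) {
        int end = data.length;
        while (end > 0 && data[end - 1] == 0) {
            end--;
        }
        return Arrays.copyOf(data, end);
    }
}
```

The ciphertext itself would come from java.util.Base64.getDecoder().decode(encodedString) before being handed to the cipher.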
Happy coding, you're good in Java, so this should be a breeze. Upgrade away from this utter crap ASAP.
Is it possible to guarantee output to be of certain length regardless of the input?
For example, I'd like to pass in a String and guarantee that its encrypted equivalent will contain 45 characters. Those 45 characters must be there regardless of whether the input is 1 character or Alice in Wonderland.
Note: 45 is obviously an example, the point is that number of output characters should be controlled in some way (exact number, or divisible by 5, or even)
No - it is not possible to specify a fixed result length. If the data is long enough, then it cannot be encrypted to a fixed short arbitrary length (that would be amazing compression). It would be possible to devise a hash of that nature possibly. But a hash is different (it is one way; you cannot extract the original data from the hash).
It would be possible to control the length by using padding, though.
If you set your limit "high enough", yes, you can easily do what you want, using padding plus a stream-cipher.
For instance, take a look at the CTR (counter) mode of operation of block-ciphers: http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation#Counter_.28CTR.29
Using AES-128 in CTR mode, if you use a random IV and insert it at the beginning of the cipher text, you know that the size of the cipher text will be exactly 16 bytes + size of the plain text. Therefore, if you fix your cipher-text length at 100 bytes for example, you could encrypt plain texts of up to 84 bytes. You'd have to pad shorter plain texts. For instance, if you are encrypting ASCII texts, you could use the byte 0x00 as a marker of the end of the string (just as the "null-terminated strings" from C), and then just pad with random garbage until you get 84 bytes.
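The scheme just described could be sketched as follows with the JDK's AES/CTR/NoPadding. The class name and constants are assumptions, and the input is assumed to be ASCII text without 0x00 bytes. Note that CTR on its own gives no integrity protection.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: every output is exactly TOTAL bytes long.
final class FixedLengthCtr {
    static final int TOTAL = 100;
    static final int IV_LEN = 16;
    static final int MAX_PLAIN = TOTAL - IV_LEN; // 84 bytes
    private static final SecureRandom RANDOM = new SecureRandom();

    static byte[] encrypt(SecretKey key, String asciiText) throws GeneralSecurityException {
        byte[] text = asciiText.getBytes(StandardCharsets.US_ASCII);
        if (text.length > MAX_PLAIN || asciiText.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("text too long or contains 0x00");
        }
        byte[] padded = new byte[MAX_PLAIN];
        RANDOM.nextBytes(padded);                 // random garbage as filler
        System.arraycopy(text, 0, padded, 0, text.length);
        if (text.length < MAX_PLAIN) {
            padded[text.length] = 0;              // end-of-string marker
        }
        byte[] out = new byte[TOTAL];
        RANDOM.nextBytes(out);                    // first 16 bytes become the IV
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(out, 0, IV_LEN));
        cipher.doFinal(padded, 0, padded.length, out, IV_LEN);
        return out;
    }

    static String decrypt(SecretKey key, byte[] data) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(data, 0, IV_LEN));
        byte[] padded = cipher.doFinal(data, IV_LEN, data.length - IV_LEN);
        int end = 0;
        while (end < padded.length && padded[end] != 0) {
            end++;                                // stop at the first 0x00 marker
        }
        return new String(padded, 0, end, StandardCharsets.US_ASCII);
    }
}
```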
There are many, many other common padding schemes you could use: http://en.wikipedia.org/wiki/Padding_(cryptography)
I just thought about another possibility: you could use some kind of authenticated encryption, such as Galois/Counter Mode (GCM). You concatenate the random IV with the cipher text, and this with random bytes to pad it to the desired size. Then, to decrypt, you just try every substring of the ciphertext: if you got the correct substring, the decryption algorithm will output the plain text; otherwise, it will output "error". Just be aware that, using this, you could introduce some timing attacks on your scheme, and you might also do lots and lots of computations to decrypt the cipher text if the plain texts vary a lot in size.
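A rough sketch of that trial-decryption idea using AES/GCM/NoPadding from the JDK. Since the IV sits at the front, only prefixes of the remaining bytes are tried rather than arbitrary substrings. The class name, the 12-byte IV and the 128-bit tag are assumptions, and the timing caveat above applies to this loop.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: layout is IV || ciphertext+tag || random filler.
final class GcmTrialDecrypt {
    private static final String CIPHER = "AES/GCM/NoPadding";
    private static final int IV_LEN = 12;
    private static final int TAG_LEN = 16;
    private static final SecureRandom RANDOM = new SecureRandom();

    static byte[] encrypt(SecretKey key, byte[] plaintext, int totalLength)
            throws GeneralSecurityException {
        if (IV_LEN + plaintext.length + TAG_LEN > totalLength) {
            throw new IllegalArgumentException("plaintext too long for target length");
        }
        byte[] out = new byte[totalLength];
        RANDOM.nextBytes(out); // random IV at the front, random filler at the back
        Cipher cipher = Cipher.getInstance(CIPHER);
        cipher.init(Cipher.ENCRYPT_MODE, key,
                new GCMParameterSpec(TAG_LEN * 8, out, 0, IV_LEN));
        cipher.doFinal(plaintext, 0, plaintext.length, out, IV_LEN);
        return out;
    }

    static byte[] decrypt(SecretKey key, byte[] data) throws GeneralSecurityException {
        GCMParameterSpec spec = new GCMParameterSpec(TAG_LEN * 8, data, 0, IV_LEN);
        // Try every candidate length; only the real one authenticates.
        for (int len = TAG_LEN; len <= data.length - IV_LEN; len++) {
            Cipher cipher = Cipher.getInstance(CIPHER);
            cipher.init(Cipher.DECRYPT_MODE, key, spec);
            try {
                return cipher.doFinal(data, IV_LEN, len);
            } catch (AEADBadTagException e) {
                // Wrong length: the tag did not verify, try the next one.
            }
        }
        throw new AEADBadTagException("no candidate length authenticated");
    }
}
```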
In any case, be sure to have your scheme reviewed by an expert on cryptography (for instance, after you devise your scheme, ask about it at https://crypto.stackexchange.com/), because it is very easy to overlook some attack possibilities.