I am working on a Java client/server application. Every user must be able to create and modify files (containing some sensitive data) either through the client application (which marks them with a digital signature) or manually (in which case the signature is almost certainly wrong). The signature does not use the client's identity, only the content of the file, which means two distant clients creating the exact same file would end up with two files bearing the exact same signature.
After weighing the pros and cons, I ended up thinking about using obfuscation to protect against malicious users that would use reverse engineering to find the algorithm that produces the digital signature for a given file.
But if I've understood it correctly, obfuscation makes code harder for humans to read and understand, while my goal is more about hiding the algorithm behind the digital signature. Any ideas on how to make it:
Hard to read?
Hard to find?
At the moment my ideas are:
Using very random names and some useless operations
Putting it in a random class at a random place and using stuff from random places
Remove comments
Randomize
Also, I'm not sure I understand how compilation and reverse engineering work.
I always thought that when code is compiled, variables were renamed in the "method area", and that reverse engineering would give us back code with variables named a, b, c, etc. But it appears not to be the case, and it makes sense now that I think about it, since reflection is possible in Java. Am I right on that last part?
To conclude, I'm not sure I understand how this would prevent a user from reversing my code (except for the variable-names part).
I ended up thinking about using obfuscation to protect against malicious users that would use reverse engineering to find the algorithm that produces the digital signature for a given file.
I think this is misguided for the following reasons.
There are a few well-known cryptographic hash functions that are understood to be sufficiently secure against being reversed, given the current "state of the art" in cryptography. You can read about some of the common ones here:
https://en.wikipedia.org/wiki/Cryptographic_hash_function
You can combine a cryptographic hash function with public key encryption to provide digital signatures that are (should be) secure enough for your use-case. For example:
https://en.wikipedia.org/wiki/Digital_Signature_Algorithm
There are solid implementations of these technologies available for Java. There is no need to implement your own.
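For instance, here is a minimal sketch of signing and verifying with the JDK's built-in java.security.Signature class. Key management is deliberately simplified: a real system would provision a persistent key pair rather than generating one per run.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignatureDemo {
    // Sign some data with a freshly generated RSA key pair,
    // then verify the signature with the matching public key.
    public static boolean signAndVerify(byte[] data) throws GeneralSecurityException {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(kp.getPrivate());
        signer.update(data);
        byte[] sig = signer.sign();

        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(kp.getPublic());
        verifier.update(data);
        return verifier.verify(sig);
    }
}
```

Note that the security here rests entirely on the private key, not on anyone's ignorance of the algorithm, which is the property obfuscation cannot give you.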
Designing and implementing your own digital signature algorithm is unwise. Unless you are an expert in the mathematics of cryptography, your algorithm is likely to have weaknesses that you are unaware of. And if you were an expert, you would fully understand the difficulty of creating a strong system.
Obfuscation is not an adequate protection against reverse engineering to extract secrets (such as an algorithm) from code. Indeed, in the case of Java it is little more than a "speed bump" for a skilled hacker.
OK, I'm just struggling to understand how my app will be able to determine that the signature of "a" equals some word, while a user can't find the same algorithm on the internet, do exactly the same, and obtain the same signature.
You have a point. If the "text" that you are creating a hash for is known to be very short and/or easy to "guess", then it will be feasible to brute-force its hash, assuming that the algorithm is known. (For example, Gravatar's approach of using hashes of email addresses for privacy is flawed, because it is not hard to assemble a list of known email addresses, generate their hashes and store them in a database that can be queried.)
However, once you have gotten beyond a few tens of random bytes of data, or a few tens of words of text, brute-force (and rainbow table) attacks become impractical. So you can start with your document, add an "envelope" with a timestamp, other identifying information, and (if necessary) some random junk to pad out the source text. Then hash the lot. The other end merely needs to repeat the process and see if they get the same hash.
(There is more stuff you need to do to create a full digital signature ... but read the link above.)
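A sketch of the envelope idea above, using the JDK's MessageDigest. The envelope layout (timestamp, then a separator byte, then the document) is made up purely for illustration; both ends just have to agree on it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class EnvelopeHash {
    // Hash a document together with an "envelope" of identifying metadata.
    public static String hash(String timestamp, String document) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(timestamp.getBytes(StandardCharsets.UTF_8));
        md.update((byte) 0); // separator so ("ab","c") and ("a","bc") differ
        md.update(document.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}
```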
Let's clarify your misconceptions about obfuscation:
You don't do it on your source code. In the Java world, if you obfuscate at all, you obfuscate the binary delivery, in other words: your class files. Or to be precise: it is mostly about class-file obfuscation, though there are commercial tools for source-code obfuscation.
Obfuscation is still used in the Android realm, but in "pure" Java shops it is rarely used these days.
And most importantly: "security by obscurity" rarely works.
If you are really serious about running your code at the client side (where you have no control over it), it might be better to do that part in native code, and to deliver machine compiled binaries for that part.
Related
I want to check user's balance of a couple of ERC20 compliant tokens using web3j.
Is there a generic way of doing that (generic for every ERC20 contract) or should I get ABI for each of the contracts and generate java classes from it?
I have never used web3j, but I have used web3js quite a bit. I will link you to relevant information.
Here is an interface that is already created in the tests of the web3j library, so the best place to start.
Extra notes (which might well be basic for you)
Checking the balance is something that you don't want to generate a transaction for (since it doesn't change the state of the blockchain) and so you should use a 'call', as explained here.
Also, it may be useful to understand how Ethereum creates the ABI in the first place. Every transaction or call can carry data with it, and the network uses this data to determine which function is being called and its parameters. The function is identified by the first 4 bytes of the Keccak hash of the function's name and parameter types (some info), which is one reason why it is so important that this hash is collision-free (imagine 2 different functions hashing to the same selector). But the take-home is that all ERC20 tokens (if they follow the standard) have common ABIs for those functions.
PS. For next time I think this question is better suited for Ethereum Stackexchange.
I know enough about cryptology to make life difficult for a novice programmer and get laughed at by security experts. So with that in mind, I ask: how secure is javax.crypto.Cipher? I realise that anything can be cracked by someone with a will and a way, but I still would like to know relative details.
The reason I ask is that I would like to store account names and passwords that will be sent through my Cryptor class, which will encrypt them, and would like to know if this will do the job. If anyone has any literature that I could read, that would be greatly appreciated.
Thanks ~Aedon
Cipher is a generic class to apply an encryption/decryption algorithm. Its security depends on the actual encryption algorithm used (DES, triple-DES, AES, etc.), on the size of its key, and on the block cipher mode of operation that you choose.
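For example, a minimal round trip through Cipher with AES in GCM mode (an authenticated mode, so tampering is detected on decryption). Key and IV handling are simplified for the sketch; in real code the key would come from a keystore or a key-derivation function.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;

public class AesGcmDemo {
    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        return kg.generateKey();
    }

    public static byte[] newIv() {
        byte[] iv = new byte[12]; // 96-bit IV, the recommended size for GCM; must be fresh per encryption
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    public static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return c.doFinal(plaintext);
    }

    public static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return c.doFinal(ciphertext);
    }
}
```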
If you intend to store passwords securely, then your requirements are quite different from simply "communicating securely/privately". A Cipher on its own is not enough to protect you. You need to use one of these
bcrypt
scrypt
PBKDF2 from PKCS#5
in that circumstance. Here are some arguments and links concerning password security.
The punchline is that "normal" encryption (or hashing, too) is just way too fast to hold off serious attackers. You want to artificially slow down the entire process to make it as hard as possible for somebody systematically attacking your application. A single user won't notice the difference between 1 and 500 milliseconds when entering a password, but for an attacker this means that in order to break your scheme it will take them 500 times as long on average - so if it would have taken roughly 1 month to find a valid password before, now it will take 500 months.
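Of the three options above, PBKDF2 is the one available in the standard Java framework via SecretKeyFactory. A minimal sketch follows; the iteration count here is illustrative only - pick it to hit your latency budget (e.g. a few hundred milliseconds on your hardware).

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.SecureRandom;

public class Pbkdf2Demo {
    public static byte[] newSalt() {
        byte[] salt = new byte[16]; // a per-user random salt defeats precomputed rainbow tables
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derive a 256-bit hash from the password; store salt, iteration count and hash together.
    public static byte[] hashPassword(char[] password, byte[] salt, int iterations) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
    }
}
```

To verify a login attempt, re-derive the hash with the stored salt and iteration count and compare.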
Since NullCipher is a Cipher - not secure at all.
Looking at the effort some organizations put into obfuscating Java bytecode to prevent others from decompiling it and extracting secret information from the code (and taking into account the limitations of this practice):
Wouldn't it be feasible to use asymmetric encryption to face this problem? I mean, wouldn't it be possible for Oracle to equip the JVM with a certificate and a ClassLoader capable of decrypt encrypted class files using the private key of this certificate?
Of course, the classes would have been encrypted using the public key of this "unique oracle certificate".
The private key would be inside the JVM.
I suppose that maybe it is not mathematically possible to protect this private key inside the JVM (encrypting it in turn...), and that it would be eventually hacked... is this the case???
If the private key is inside the JVM, it will take hackers and crackers literally minutes to extract it using reverse engineering.
Besides, the classloader would probably be very slow if it had to apply asymmetric decryption every time it loaded a class.
I suppose that maybe it is not mathematically possible to protect this private key inside the JVM (encrypting it in turn...), and that it would be eventually hacked... is this the case???
Essentially, yes.
If you use symmetric algorithms and store the key in the JVM, it will be trivial to reverse engineer and find those. If you employ obfuscation to hide them, it becomes less trivial, but it can still be done.
With public key crypto, the key doing the decrypting needs to be stored somewhere again. This is essentially a key-storage problem, and the only solutions that are genuinely difficult to reverse engineer are in hardware; even then, they get broken.
My answer on IT security concerning effective DRM protection methods covers this in a little more detail.
In any case, it's impossible to protect a private certificate on the client machine (e.g. in the JVM). How do you imagine it? If it were a plain text file, obviously it could be extracted. If it were encrypted, the "second level" key would have to be on the client machine as well, so that the JVM could use the private cert for code protection. So you would be able to extract that key, and consequently the private cert, as well.
For the signing scheme to be hard to break, the key needs to be inaccessible.
If you have the complete program, the key is not hard for a programmer to extract. Any platform which can actually do this keeps the key out of programmers' reach.
You might find this story about getting the private key from an Airport Express interesting: http://mafipulation.org/blagoblig/2011/04/08#shairport
Don't forget that the Oracle JVM isn't the only JVM around. Every JVM must adhere to a standard (the Java Virtual Machine Specification) to ensure a very basic principle of Java: "write once, run anywhere". A private key like this would cause the Oracle JVM to behave differently from all other implementations.
If code is encrypted it must be decrypted at some point. It is a simple tautology. Obfuscation however is in many cases irreversible.
I am using the ISO 19794-2 fingerprint data format; all my data are in that format. I have more than a hundred thousand fingerprints, and I wish to search them efficiently for a match. Is it possible to construct a binary-tree-like structure to perform an efficient (fastest) search for a match? Or can you suggest a better way to find the match, and also an open-source Java API for fingerprint matching? Thanks.
Do you have a background in fingerprint matching? It is not a simple problem and you'll need a bit of theory to tackle such a problem. Have a look at this introduction to fingerprint matching by Bologna University's BioLab (a leading research lab in this field).
Let's now answer your question, that is, how to make the search more efficient.
Fingerprints can be classified into 5 main classes, according to the type of macro-singularity that they exhibit.
There are three types of macro-singularities:
whorl (a sort of circle)
loop (a U inversion)
delta (a sort of three-way crossing)
According to the position of those macro-singularities, you can classify the fingerprint in those classes:
arch
tented arch
right loop
left loop
whorl
Once you have narrowed the search to the correct class, you can perform your matches. From your question it looks like you have to do an identification task, so I'm afraid that you'll have to do all the comparisons, or else add some layers of pre-processing (like the classification I wrote about) to further narrow the search field.
You can find lots of information about fingerprint matching in the book Handbook of Fingerprint Recognition, by Maltoni, Maio, Jain and Prabhakar - leading researchers in this field.
In order to read ISO 19794-2 format, you could use some utilities developed by NIST called BiomDI, Software Tools supporting Standard Biometric Data Interchange Formats. You could try to interface it with open source matching algorithms like the one found in this biometrics SDK. It would however need a lot of work, including the conversion from one format to another and the fine-tuning of algorithms.
My opinion (as a Ph.D. student working in biometrics) is that in this field you can easily write code that does the 60% of what you need in no time, but the remaining 40% will be:
hard to write (20%); and
really hard to write without money and time (20%).
Hope that helps!
Edit: added info about NIST BiomDI
Edit 2: since people sometimes email me asking for a copy of the standard, I unfortunately don't have one to share. All I have is a link to the ISO page that sells the standard.
The ISO format specifies useful mechanisms for matching and decision parameters. Decide on the mechanism you wish to employ to identify a match, and on the relevant decision parameters. When you have determined these, examine them to see which are capable of being put into an order, with a fairly high degree of differentiation between individual values, since you want to avoid multiple collisions on the data.

When you have identified a small number of data items (preferably one) with this property, calculate that property for each fingerprint - preferably as it is added to the database, though a bulk load can be done initially. The search for a match is then done on the calculated characteristic, and can use a binary tree, a red-black tree, or a variety of other search structures. I cannot recommend a particular search strategy without knowing what form and degree of differentiation of values you have in your database.

Such a search strategy should, however, be capable of delivering a (small) range of possible matches, which can then be tested individually against your match mechanism and parameters before deciding on a specific match.
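A minimal sketch of that idea, assuming a single orderable scalar feature has been extracted from each print. The feature itself is hypothetical here; real ones would come from the ISO 19794-2 record, and a tolerance window around the query value yields the small candidate set to pass to the full matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class FingerprintIndex {
    // Maps a scalar feature value to the ids of prints exhibiting it.
    // TreeMap gives O(log n) ordered lookups, standing in for the tree search described above.
    private final TreeMap<Integer, List<String>> byFeature = new TreeMap<>();

    public void add(int feature, String printId) {
        byFeature.computeIfAbsent(feature, k -> new ArrayList<>()).add(printId);
    }

    // Return candidate ids whose feature lies within +/- tolerance of the query value.
    public List<String> candidates(int feature, int tolerance) {
        List<String> out = new ArrayList<>();
        for (List<String> ids : byFeature.subMap(feature - tolerance, true, feature + tolerance, true).values())
            out.addAll(ids);
        return out;
    }
}
```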
I want to write a very simple implementation of an onion router in Java (but including Chaum mixes). A lot of the public/private key encryption seems pretty straightforward, but I'm struggling to understand how the last router would know that the final onion skin has been 'peeled'.
I was thinking of having some sort of checksum also encoded, so that each router tries a decryption with its private key and, if the checksum works, forwards the newly peeled onion to the next router.
Only this way (assuming that some bits of the checksum are stripped every time a successful decryption occurs), there would be a way, looking at the checksum, to estimate how close the onion is to full decryption. Is this a major vulnerability? Is the checksum methodology an appropriate simplification?
Irrespective of the problem you mention, it's generally good practice to include some integrity check whenever you encrypt/decrypt data. However, checksums aren't really suitable for this. Have a look at Secure Hash algorithms such as SHA-256 (there are implementations built into the standard Java cryptography framework).
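As a small sketch of that point: a SHA-256 digest from the standard framework, with a constant-time comparison for verification (MessageDigest.isEqual avoids timing leaks that a naive byte-by-byte compare can introduce).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class IntegrityCheck {
    public static byte[] digest(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    // Constant-time comparison of the recomputed digest against the expected one.
    public static boolean verify(byte[] data, byte[] expectedDigest) throws NoSuchAlgorithmException {
        return MessageDigest.isEqual(digest(data), expectedDigest);
    }
}
```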
Now, coming back to your original question... To each node of the onion, you're going to pass an encrypted "packet", but that packet won't just include the actual data to pass on-- it'll include details of the next node, your hash code, and whatever else... including whatever flag/indication to say whether the next "node" is an onion router or the actual end host. Indeed the data for the last node will have to have some special information, namely the details of the actual end host to communicate with. In other words, the last node knows the onion has been peeled because you encode this fact in the data it ends up receiving.
Or at least, I think that's how I'd do it... ;-)
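Here is a toy sketch of that layering, using symmetric AES/GCM per hop so the example stays self-contained (real onion routing negotiates per-hop keys via public key crypto). The "END|" marker and the fixed IV are inventions of this sketch: the marker encodes "you are the last hop" in the data itself, and GCM's authentication tag plays the role of the per-layer integrity check.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class OnionDemo {
    // Fixed IV for the sketch only: each key encrypts exactly one message here.
    // Real code must use a fresh IV per encryption (see the note on initialisation vectors).
    private static final byte[] IV = new byte[12];
    private static final byte[] END = "END|".getBytes(StandardCharsets.UTF_8);

    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    private static byte[] crypt(int mode, SecretKey key, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(mode, key, new GCMParameterSpec(128, IV));
        return c.doFinal(data);
    }

    // Wrap the payload in one layer per node key; the innermost layer carries the "END|" marker.
    public static byte[] buildOnion(String payload, List<SecretKey> nodeKeys) throws Exception {
        byte[] onion = ("END|" + payload).getBytes(StandardCharsets.UTF_8);
        for (int i = nodeKeys.size() - 1; i >= 0; i--)
            onion = crypt(Cipher.ENCRYPT_MODE, nodeKeys.get(i), onion);
        return onion;
    }

    // Each node peels exactly one layer; GCM's tag check rejects anything not meant for this key.
    public static byte[] peel(SecretKey key, byte[] onion) throws Exception {
        return crypt(Cipher.DECRYPT_MODE, key, onion);
    }

    // True when the peeled layer is the innermost one, i.e. this node is the last hop.
    public static boolean isFinal(byte[] layer) {
        if (layer.length < END.length) return false;
        for (int i = 0; i < END.length; i++)
            if (layer[i] != END[i]) return false;
        return true;
    }

    public static String payload(byte[] finalLayer) {
        return new String(finalLayer, StandardCharsets.UTF_8).substring(4);
    }
}
```

Intermediate nodes never see the marker, only another opaque ciphertext, so the structure does not leak how many layers remain.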
N.B. The encryption per se isn't that complicated, I don't think, but there may be one or two subtleties to be careful of. For example, in a normal single client-server conversation, one subtlety you have to be careful of is to never encrypt the same block of data twice with the same key (or at least, that's what it boils down to-- research "block modes" and "initialisation vectors" if you're not familiar with these concepts). In a single client-server conversation the client and server can dictate parts of the initialisation vector. In an onion router, some other solution will have to be found (at worst, using strongly-generated random numbers generated by the client alone, I suppose).
You could hide the number of checksums by storing them in a cyclic array, whose initial offset is chosen at random when the onion is constructed. Equivalently, you could cyclically shift that array after every decryption.