Persistent file validation in java

In my Java application, files created by the application should not be modified by the user; if a file has been modified, the application needs to detect it.
My approach: I store the last-modified time of each file in a HashMap and validate each file against it.
Problem: this works within a single session, but to persist the information I have to write the last-modified times to another file, which the user can also modify. For now I am not using a database.
So I am asking for an alternative: how can I validate the files, and is my approach the best one?

Use a decent hashing algorithm to take a hash of the file contents. To test whether the user modified the file, run the same hash procedure again (at the time of the test) and compare the result to the original hash. If the hashes differ, the file contents have changed. I would suggest SHA-1 as the hashing algorithm, though someone else might have a better one to suggest; SHA-256 is the usual stronger choice, since practical collision attacks on SHA-1 are known.
You can see this SO answer for information about how to compute a SHA-1 hash from a byte array. You can see this SO answer for information about reading a File into a byte array.
For storing the hash of each file, I recommend a database, but you don't have to use one. You could use an ordinary file with a simple format for the hash information, one line per file in the form file name=hash. For example:
myfile.txt=489892945720524750
otheruserfile.txt=390495940542905490
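
The approach described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name FileHashManifest and its method names are hypothetical, SHA-256 is used in place of the SHA-1 suggested in the answer, and the manifest is the plain name=hash text file described above. Note that a plain manifest file can still be edited by the user, which is the concern raised in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records "fileName=hash" lines and re-checks files against them.
final class FileHashManifest {

    // Hex-encoded SHA-256 digest of the file's contents (reads the whole file into memory).
    static String hashOf(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Writes one "fileName=hash" line per tracked file.
    static void save(Path manifest, List<Path> files) throws IOException, NoSuchAlgorithmException {
        List<String> lines = new ArrayList<>();
        for (Path file : files) {
            lines.add(file.getFileName() + "=" + hashOf(file));
        }
        Files.write(manifest, lines, StandardCharsets.UTF_8);
    }

    // Returns true only if the file has an entry and its current hash matches it.
    static boolean isUnmodified(Path manifest, Path file) throws IOException, NoSuchAlgorithmException {
        String prefix = file.getFileName() + "=";
        for (String line : Files.readAllLines(manifest, StandardCharsets.UTF_8)) {
            if (line.startsWith(prefix)) {
                return line.substring(prefix.length()).equals(hashOf(file));
            }
        }
        return false; // no entry recorded for this file
    }
}
```

Calling save once after the application writes its files, and isUnmodified before each later use, gives the check described in the answer.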

Related

Best way to password protect a Java console application

I am looking for a best-practice or standard method to password protecting a console application. I have researched various methods and would like some feedback on my approach.
I have decided to hash my password using Argon2 so that I only have to store the one-way hash. Everything is working as expected. The question I have is where do I store the hash? Should it be hard-coded? Should I store it in a separate file and read it in? What is the most secure way to approach this? At the end of the day I am writing this application to learn and would very much like to learn to do it the correct way. Links to any relevant reading material would also be appreciated. I continue to google...
EDIT: So what would the potential drawbacks be if I stored the program password as a hash in a file? The user would have to know the password to use the application. Could the password that the hash protects then serve as the encryption key for the sensitive information? Even if the source code and/or hash file is manipulated, the sensitive data would not be readable, since the correct password is used as the key... what am I missing?

First of all, Argon2 is a fine key derivation function for turning passwords into encryption keys.
However, if you are using the Argon2 hash as an encryption key, then don't store it on disk, obviously. If you store the encryption key next to the encrypted data you might as well not encrypt at all. One could even argue that it's worse, because it gives a false sense of security.
Properly encrypted data is useless without the key, so you don't have to protect the application itself. Just ask for the password if and when you need to encrypt or decrypt something. You can consider keeping the hash in memory for a while so you don't have to ask for it repeatedly, but don't persist it.
This is exactly how GPG works, for instance. It doesn't store any password hashes anywhere. Instead it stores private keys encrypted and just asks for the passphrase if it needs to decrypt a private key.
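
As a rough sketch of "derive the key from the password and never persist it": the example below uses PBKDF2 from the JDK instead of Argon2, because Argon2 is not part of the standard library; with an Argon2 library the derivation step would be swapped out. The class name PasswordCrypto, the iteration count, and the salt/IV layout are assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: encrypts data under a key derived from a password.
final class PasswordCrypto {
    private static final int SALT_LEN = 16;
    private static final int IV_LEN = 12;
    private static final int ITERATIONS = 100_000; // illustrative value, tune for your hardware

    // Derives a 256-bit AES key from the password; the key only ever lives in memory.
    private static SecretKeySpec deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(keyBytes, "AES");
    }

    // Output layout: salt || iv || ciphertext with GCM tag. Salt and IV are not secret.
    static byte[] encrypt(char[] password, byte[] plaintext) throws GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[SALT_LEN];
        byte[] iv = new byte[IV_LEN];
        random.nextBytes(salt);
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(128, iv));
        byte[] ct = cipher.doFinal(plaintext);
        byte[] out = new byte[SALT_LEN + IV_LEN + ct.length];
        System.arraycopy(salt, 0, out, 0, SALT_LEN);
        System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
        System.arraycopy(ct, 0, out, SALT_LEN + IV_LEN, ct.length);
        return out;
    }

    // Throws a GeneralSecurityException if the password is wrong or the data was tampered with.
    static byte[] decrypt(char[] password, byte[] blob) throws GeneralSecurityException {
        byte[] salt = Arrays.copyOfRange(blob, 0, SALT_LEN);
        byte[] iv = Arrays.copyOfRange(blob, SALT_LEN, SALT_LEN + IV_LEN);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(128, iv));
        return cipher.doFinal(blob, SALT_LEN + IV_LEN, blob.length - SALT_LEN - IV_LEN);
    }
}
```

Because GCM authenticates the ciphertext, a wrong password is rejected at decryption time, so no separate password hash needs to be stored to check it.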

Scan duplicate document with md5

For some reason I can't use MessageDigest.getInstance("MD5"), so I must write the algorithm by hand. My project scans for duplicate documents (*.doc, *.txt, *.pdf) on an Android device. My question is: what must I write, before the algorithm itself, to scan for duplicate documents in the root directory of the device? There is no directory selection; when I press the scan button, the process should begin and the ListView should show the results. Can anyone help me? My project deadline is approaching. Thank you so much.
public class MD5 {
//What must I write here, so I allow to scan for duplicate document on Android root with MD5 Hash
//MD5 MANUAL ALGORITHM CODE
}

WHOLE PROCESS:
Your goal is to detect (and perhaps store information about) duplicate files.
1 First, iterate through the directories and files,
see this:
list all files from directories and subdirectories in Java
2 For each file, load it into a byte array,
see this:
Reading a binary input stream into a single byte array in Java
3 Then compute its MD5 (this is your project)
4 And store this information
You can use a Set to detect duplicates (a Set holds unique elements).
Set<String> fileHashes = new HashSet<>(); // each String is the hex representation of an MD5 digest
if (fileHashes.contains(myMd5)) { /* you know you have it already */ }
or a
Map<String, String> fileToHash = new HashMap<>(); // maps file => hash
// to know whether you have a hash already, iterate over the values (or call containsValue), or also keep a Set
ANSWER for MD5:
Read the description of the algorithm:
https://en.wikipedia.org/wiki/MD5
RFC: https://www.ietf.org/rfc/rfc1321.txt
After some googling, this step-by-step presentation:
http://infohost.nmt.edu/~sfs/Students/HarleyKozushko/Presentations/MD5.pdf
Or try porting an existing C (or Java) implementation.
OVERALL STRATEGY
To save time and make the process faster, you must also think about how the function will be used:
If you use it once, for a single file, reduce the work by first filtering the other files by size (files of different sizes cannot be duplicates).
If you use it regularly (and want it to be fast), scan new files regularly in the background to keep a hash database up to date. Detecting new files is straightforward.
If you want to find all duplicated files, it is better to scan everything, again using the Set strategy.
Hope this helps.

You'll want to recursively scan for files, then, for each file found, calculate its MD5 or whatever and store that hash value, either in a Set<...> if you only want to know if a file is a dupe, or in a Map<..., File> if you want to be able to tell which file the current file is a duplicate of.
For each file's hash, you look into the collection of already known hashes to check if that particular hash value is in it; if it is, you (most likely) have a duplicate file; if it is not, you add the new hash value to the collection and proceed with the next file.
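
A minimal sketch of that loop, under some assumptions: it uses MessageDigest for MD5 (the asker, who cannot use it, would substitute the hand-written implementation in md5Hex), the class name DuplicateFinder is hypothetical, and Files.walk is used for the recursive scan, which on Android needs a recent API level (a recursive File.listFiles would be the older alternative).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: finds files under a root directory that share the same content hash.
final class DuplicateFinder {

    // Streams the file through the digest so large files are not loaded into memory at once.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Maps each duplicate file to the first file (in sorted path order) with the same hash.
    static Map<Path, Path> findDuplicates(Path root) throws IOException, NoSuchAlgorithmException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        Map<String, Path> firstSeen = new HashMap<>();
        Map<Path, Path> duplicates = new HashMap<>();
        for (Path file : files) {
            // putIfAbsent returns the earlier file if this hash is already known
            Path original = firstSeen.putIfAbsent(md5Hex(file), file);
            if (original != null) {
                duplicates.put(file, original);
            }
        }
        return duplicates;
    }
}
```

The Map<String, Path> plays the role of the Map<..., File> mentioned above; if only a yes/no answer is needed, a Set of hashes is enough.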

Java, compare incoming attachment to a list of images in database to see if identical

In our system we fetch emails automatically and save their attachments in a database.
Now the customer wants to be able to skip certain images, such as banners, that get saved over and over again.
I need a way to create a "blacklist" of images in the database and compare these images to the incoming attachments.
This is how the attachments are saved to the database:
....
InputStream is = new BufferedInputStream(new FileInputStream(attachment));
preparedStatement.setBinaryStream(5,is,(int)filesize);
....
pstmt.executeUpdate();
In the database they are stored as the image type and look like 0xFFD8FFE000104A46494600010100000100010000....
What would be an easy way to read a few such images from database and see if any of them are identical to the incoming attachment?
Note that this is a rather complex system that I will not be able to rebuild at this time. So any advice about storing images in folders instead of in database or something similar will not be helpful to me right now.

I would recommend an image hasher such as LIRE. With this library you can compute a hash for each image and then compare the hashes (Euclidean distance). By taking similarity between images into account, you can also discard images that are not identical but very similar.
Here is the link with the explanation:
https://blog.mayflower.de/1755-Image-similarity-search-with-LIRE.html
And here is the link with the code:
https://github.com/aoldemeier/ImageSimilarityWithLIRE

Do not compare the images directly; compare hash codes. If you use a hashing function such as SHA-2 (http://de.wikipedia.org/wiki/SHA-2), you can be very confident (*) that there are no collisions and that you will blacklist the right images.
The basic idea is: while reading the image, also compute its hash code using MessageDigest:
MessageDigest digest = MessageDigest.getInstance("SHA-256");
// call digest.update(byte[]) for all the chunks of the file
byte[] hash = digest.digest();
You can then compare the hash. If you convert it to a Base64 String before storing it to the database, you can use a normal String comparison in your SQL statement or in your Java code:
import org.apache.commons.codec.binary.Base64;
byte[] encodedBytes = Base64.encodeBase64(hash);
System.out.println("encodedBytes " + new String(encodedBytes));
Note: Your blacklist will probably still not work the way you intend. Someone only has to change a single pixel of the picture slightly and it will no longer match your blacklist. You would probably need to compare images for similarity instead, which is a lot harder and more time-consuming.
See also:
How to hash some string with sha256 in Java?
Base64 Encoding in Java
Getting a File's MD5 Checksum in Java
(*) As in, the chances of a false positive are so low, don't even bother to think about it.
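
Putting the pieces of this answer together, a sketch might look like the following. It uses java.util.Base64 from the JDK rather than the Apache Commons Codec class shown above, and an in-memory Set stands in for the database column of blacklisted hashes; the class name AttachmentBlacklist and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: exact-match blacklist keyed by the SHA-256 of the image bytes.
final class AttachmentBlacklist {
    // Stand-in for a database column holding the Base64 hashes of blacklisted images.
    private final Set<String> blockedHashes = new HashSet<>();

    // Reads the stream to the end (and closes it), returning the Base64-encoded SHA-256.
    static String hashOf(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (DigestInputStream din = new DigestInputStream(in, digest)) {
            byte[] buffer = new byte[8192];
            while (din.read(buffer) != -1) {
                // each read feeds the bytes into the digest as a side effect
            }
        }
        return Base64.getEncoder().encodeToString(digest.digest());
    }

    // Adds an image to the blacklist.
    void block(InputStream image) throws IOException, NoSuchAlgorithmException {
        blockedHashes.add(hashOf(image));
    }

    // True if a byte-identical image was blacklisted earlier.
    boolean isBlocked(InputStream attachment) throws IOException, NoSuchAlgorithmException {
        return blockedHashes.contains(hashOf(attachment));
    }
}
```

With a real database, the Set lookup would become a query such as a string comparison on the stored hash column, as described above. Only byte-identical images match.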

Since the Image data type holds large binary values, IMO the easiest way to compare Image fields is to compare hashes. So you need to store a hash of the Photo column in your table.
Images are stored in the database in binary form, so if you want to build this blacklist comparison, the best way is to compare hashes. Basically, you store the hashes of all blacklisted images in a column, against which you can compare the hash of any incoming image.
Comparing by name wouldn't be reliable, as names might change.

Changing encryption algorithm

Does anyone know of good tutorials on changing the PBEWithMD5AndDES encryption algorithm to AES in a Java application? In particular, I want to know what precautions I should take when changing to a more secure algorithm, and whether there are important test cases to check before and after the change. Another question: since I have used PBEWithMD5AndDES, most of the user passwords are encrypted with that algorithm. If I change to AES, how do I make sure that those passwords are still decrypted with the old algorithm while new encryption uses the new one?

Normally you wouldn't encrypt a user's password; you'd just hash it with a salt instead.
Migrating from one encryption system to another is going to be a bit of a pain. As I see it, you have two options:
During the upgrade process, decrypt and then re-encrypt all the passwords.
Add a flag indicating the encryption method used. All existing passwords will obviously be flagged with the current method. New users will be set to whatever method you choose, and you can migrate other users when they change their password.

If you've already got data encrypted in format a, and you want to start using another encryption scheme, b, I can think of two ways to accomplish this:
Decrypt all of your data and re-encrypt it using `b`. This approach would be good when you can take your data store offline and "fix everything at once."
For each item you attempt to decrypt, try to decrypt it using `b` first. If that fails, decrypt it using `a`. The next time you try to encrypt something, make sure you use `b`. This approach could be used when you can't take your data store offline, but you want to encrypt all of your data using another algorithm. All of your data will eventually be encrypted using the other algorithm.

There's really no problem changing algorithms. What you need to do is decrypt the cipher text and then encrypt the resulting plain text with the new algorithm. That's straightforward. If you are going to perform this transition over time, I would suggest creating a new database table that keeps track of whether a particular entity (based on unique id) has been transferred to the new algorithm. If it has, you simply use the new algorithm to decrypt it; if not, you use the old algorithm. Regardless, all new encryption should be performed with the new algorithm.
Now, there's a second issue here. Why are you bothering to decrypt passwords? Just save the hash of the password and forget about it. If you are able to decrypt passwords, you introduce a potential vulnerability. If a malicious user gets hold of the key you use to encrypt those passwords, they can recover the plain text of every password. Not only could they then use that information to compromise your system; if your users use the same username/password combination on other sites, those accounts would be compromised as well. You should only store a hash of the password (a salted hash from the SHA-2 family is a reasonable choice, don't use MD5), and when the user attempts to log in, you hash the input and compare the two results. You have no need to know what the plain text password is.

You may want to look into ESAPI for Java: http://code.google.com/p/owasp-esapi-java/
ESAPI 1.4 used PBEWithMD5AndDES, but 2.0 introduced AES.
Check their mailing-list thread here.
You can compare the two implementations to see the difference.

PBEWithMD5AndDES is a method of taking a user's password and deriving from it an encryption key that can be used to protect further data. It is not a method of verifying a password, nor of encrypting one.
If you are only interested in password validation, then decrypt the passwords, replace them with a secure hash, and in future match the hashes. You will also need to change your password reminder service into a password reset service.
The question is where is the password you are passing into the PBE algorithm coming from? If it is a fixed password for your application, then you just need to replace it and perform some kind of rolling upgrade. As an observation, if you are storing encrypted data as text, either hex or base-64 encoded, there are characters that cannot appear in the text output and which you can hence prepend to indicate a newer encryption scheme. For example the : character does not appear in base-64. That will allow you to identify what has been upgraded and what has not.
If the passwords are coming from the user, then each user has their own password derived cipher. In this case you can only re-encrypt whatever data has been encrypted with the user's cipher when the user provides their password.
The most direct replacement is going to be along the lines of PBEWithSHA256And256BitAES. Unfortunately, this is not supported by Java 6, so you will need a 3rd party JCE library such as Bouncy Castle. Bouncy Castle offers PBEWithSHA256And256BitAES-CBC-BC, which would be a suitable replacement.
The process of upgrading the cipher is a challenge. Whatever data has been encrypted with DES can only be decrypted with the user's password. I assume you do not have access to the passwords. This means you can only re-encrypt the data when the person who knows the password provides it. You are going to have a long period of time when your system contains a mixture of ciphers, so you need a way of identifying what is converted.
If we are talking about files, you could change the file suffix, or the folder they are stored in. If we are talking about BLOBs in a database, you could add an extra column to the database table to say what the encryption method is. If neither of those are possible you could add some form of header to the data to indicate that it has been encrypted in a new way. That's slightly risky as your existing data has no header and there is an outside chance it will match the new header by accident.
It may also be advisable to keep a list of which users have not yet had their data converted so you can prompt them to convert.
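
The marker idea described in this answer (prefixing upgraded values with a character such as ':' that cannot occur in Base64) could be sketched as below. The class SchemeRouter and its pluggable functions are hypothetical; the actual legacy (PBEWithMD5AndDES) and new (AES) routines are assumed to be supplied by the application and are not shown.

```java
import java.util.function.Function;

// Hypothetical helper: routes stored values to the right cipher during a rolling upgrade.
final class SchemeRouter {
    // ':' never occurs in Base64 output, so it can mark values written by the new scheme.
    static final String NEW_SCHEME_MARKER = ":";

    private final Function<String, String> legacyDecrypt; // e.g. the existing PBEWithMD5AndDES routine
    private final Function<String, String> newDecrypt;    // e.g. an AES-based routine
    private final Function<String, String> newEncrypt;

    SchemeRouter(Function<String, String> legacyDecrypt,
                 Function<String, String> newDecrypt,
                 Function<String, String> newEncrypt) {
        this.legacyDecrypt = legacyDecrypt;
        this.newDecrypt = newDecrypt;
        this.newEncrypt = newEncrypt;
    }

    // Picks the decryption routine based on the marker prefix.
    String decrypt(String stored) {
        if (stored.startsWith(NEW_SCHEME_MARKER)) {
            return newDecrypt.apply(stored.substring(NEW_SCHEME_MARKER.length()));
        }
        return legacyDecrypt.apply(stored);
    }

    // All new encryption uses the new scheme and is tagged with the marker.
    String encrypt(String plaintext) {
        return NEW_SCHEME_MARKER + newEncrypt.apply(plaintext);
    }

    // Re-encrypts a legacy value; values already upgraded are returned unchanged.
    String upgrade(String stored) {
        return stored.startsWith(NEW_SCHEME_MARKER) ? stored : encrypt(decrypt(stored));
    }
}
```

When the key comes from a user's password, upgrade can only be called at a moment when that user has supplied the password, which is why the mixed state may last a long time.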

Simple encryption in JSP

I am an unwilling JSP/Java noob. I've been asked to hurriedly write up a system for generating secure urls from one site to another. The actual request string (must be passed as GET request) needs to be encrypted or otherwise obfuscated so that the user cannot easily change it to request someone else's document. Because of limitations in the environment, I cannot simply manage the request in a session and really must do it this way.
A sample of what I need:
page1.jsp:
a 7 digit number is generated by our system and needs to be passed to http://otherserver.com/page2.jsp. If the user sees this number, it will be obvious what it represents, and no other number can be used for this purpose.
The number should be encrypted or otherwise obfuscated in page1.jsp code and built into a URL to page2.jsp that can be decrypted / unobfuscated easily.
Thank you for your help!

I wouldn't bother to try to obfuscate it.
Instead, if the two servers can share a common secret, you can use keyed-hashing (see javax.crypto.Mac) to generate keyed hashes for the document number, which is passed to the other server along with the document number.
The target server can then easily verify that the keyed hash corresponds to the document number, and easily detect attempts to modify it.
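
A minimal sketch of this keyed-hash approach with javax.crypto.Mac, assuming both servers hold the same secret bytes. The class name SignedDocumentLink and the query parameter names doc and mac are made up for illustration; the document number stays visible in the URL, but any change to it invalidates the MAC.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper shared by page1.jsp (signing) and page2.jsp (verifying).
final class SignedDocumentLink {
    private final byte[] sharedSecret;

    SignedDocumentLink(byte[] sharedSecret) {
        this.sharedSecret = sharedSecret.clone();
    }

    // URL-safe Base64 HMAC-SHA256 tag over the document number.
    String sign(String documentNumber) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
        byte[] tag = mac.doFinal(documentNumber.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
    }

    // Sending side: the parameter names "doc" and "mac" are assumptions.
    String buildUrl(String documentNumber) throws GeneralSecurityException {
        return "http://otherserver.com/page2.jsp?doc=" + documentNumber
                + "&mac=" + sign(documentNumber);
    }

    // Receiving side: recompute the tag and compare in constant time.
    boolean verify(String documentNumber, String mac) throws GeneralSecurityException {
        byte[] expected = sign(documentNumber).getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, mac.getBytes(StandardCharsets.UTF_8));
    }
}
```

On the target server, the request would be rejected whenever verify returns false for the doc and mac values taken from the query string.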
