How to generate a random string with no duplicates in Java

I have read some answers; they usually use a Set or some other data structure to ensure there are no duplicates. In my situation, though, I have already stored a lot of random strings in a database, and I have to make sure that a newly generated random string does not already exist there.
I don't think retrieving all the random strings from the database into a set and then generating the new string is a good idea.
I found that System.currentTimeMillis() generates a "random" number, but I don't know how to turn that number into a random string. I need a string of length 8.
Any suggestions will be appreciated.

You can use the Apache Commons Lang library for this: RandomStringUtils
RandomStringUtils.randomAlphanumeric(8).toUpperCase() // for alphanumeric
RandomStringUtils.randomAlphabetic(8).toUpperCase() // for pure alphabets
randomAlphabetic(int count)
Creates a random string whose length is the number of characters specified. Characters are chosen from the Latin alphabetic characters (a-z, A-Z).
randomAlphanumeric(int count)
Creates a random string whose length is the number of characters specified. Characters are chosen from the Latin alphabetic characters (a-z, A-Z) and the digits 0-9.

So there are two issues here - creating the random string, and making sure there's no duplicate already in the db.
If you are not bound to 8 characters, you can use a UUID, as the commenter above suggested. The UUID class returns a value that is statistically very unlikely to duplicate a previously generated UUID, so you can use it for this purpose without checking whether it is already in your database.
UUID.randomUUID().toString();
which produces a string that looks like this:
5e0013fd-3ed4-41b4-b05d-0cdf4324bb19
Or, if you don't care what the unique ID is as long as it is unique, you could use an identity or auto-increment column, which pretty much all databases support. If you do that, though, you have to read the record back after you commit it to get the identity assigned by the database.
If you have to have an 8-character string as your unique ID and you don't want to import the Apache library, you can generate a random 8-character string like this:
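import java.util.Random;
final String alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
final Random rand = new Random();
public String myUID() {
    // Build the ID one random uppercase letter at a time.
    StringBuilder uid = new StringBuilder(8);
    for (int i = 0; i < 8; i++) {
        uid.append(alpha.charAt(rand.nextInt(alpha.length())));
    }
    return uid.toString();
}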
To make sure it's not a duplicate, you should add a unique index to the database column that contains it.
You can either query the database first to make sure that no row has that ID before you insert the row, or catch the exception and retry if you've generated a duplicate.
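A minimal sketch of the generate, insert, and retry loop described above. The database is not modelled here: tryInsert is a hypothetical callback that is assumed to attempt the INSERT and return false when the unique index rejects the row. The class name, the 36-character alphabet, and the attempt limit are illustrative choices, not part of the original answer.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

class UniqueIdSketch {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns a random string of the given length drawn from ALPHABET. */
    static String randomId(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    /**
     * Generates 8-character IDs until tryInsert accepts one.
     * tryInsert is a hypothetical stand-in for "INSERT the row and
     * report whether the unique index accepted it".
     */
    static String insertWithRetry(Predicate<String> tryInsert, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String id = randomId(8);
            if (tryInsert.test(id)) {
                return id;
            }
        }
        throw new IllegalStateException("No free ID found after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // In-memory stand-in for a table with a unique index on the ID column.
        Set<String> table = new HashSet<>();
        System.out.println(insertWithRetry(table::add, 10));
    }
}
```

With a real database, the callback would typically run the INSERT and translate the driver's duplicate-key error into false, so the existing strings never have to be loaded into memory.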

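The currentTimeMillis() method returns the current time in milliseconds as a long. Convert the long to a String, and s.substring(5, s.length()) gives you the last 8 digits of the (currently 13-digit) millisecond value. Those digits are identical for every call made within the same millisecond, and they repeat roughly every 28 hours (10^8 ms), so they are not unique on their own.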
public static void main(String[] args) {
String s = String.valueOf(System.currentTimeMillis());
System.out.println(s.substring(5, s.length()));
}
You still have to check each time whether this string already exists in your database.

Related

Java, how to hash a string with low collision probability, specify characters allowed in output to decrease this

Is there any way to hash a string and specify the characters allowed in the output, or a better approach to avoid collisions when producing a hash of 8 characters in length?
I am running into a situation where I am seeing a collision with my current hashing method (see example implementation below).
I am currently using crc32 from https://guava.dev/releases/20.0/api/docs/com/google/common/hash/Hashing.html
The hashes produced are alphanumeric, 8 characters in length.
I need to keep the 8-character length (I am not storing passwords). Is there a way to specify an "alphabet" of allowed output characters for a hashing function?
E.g. to allow (a-z, 0-9) and a set of extra characters such as (_, $, -).
The added characters will need to be URI friendly.
This would allow me to decrease the possibility of collisions occurring.
The hash output will be stored in a cache for a maximum of 60 days, so collisions occurring after that period will have no effect.
Current approach, example code:
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;
public class Test {
private static final String SALT = "4767c3a6-73bc-11ec-90d6-0242ac120003";
public static void main( String[] args )
{
// actual strings causing collisions removed as have to redact some data
String string1 = "myStringOne";
String string2 = "myStringTwo";
System.out.println( "string1:" + string1);
System.out.println( "string1 hashed:" + doHash(string1, SALT));
System.out.println( "string2:" + string2);
System.out.println( "string2 hash:" + doHash(string2, SALT));
}
private static String doHash(String keyValue, String salt){
HashFunction func = Hashing.crc32();
Hasher hasher = func.newHasher();
hasher.putUnencodedChars(keyValue);
hasher.putUnencodedChars(salt);
return hasher.hash().toString();
}
}
Functionality of the code / problem statement:
Using a key-store DB.
A user requests a resource.
A hash is made of (user details & requested resource).
If the resulting ID is already present -> return that item from the DB.
Else, perform processing on the resource and store it in the DB, with the hash result as the ID.
The cache is purged periodically.
Questions:
Is there a way to specify the alphabet the hash is allowed to use in its output?
I checked the docs but do not see an approach: https://guava.dev/releases/20.0/api/docs/com/google/common/hash/Hashing.html
Or is there an alternative approach that would be recommended?
E.g. generating a longer hash and taking a subset.
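One way the "longer hash, then take a subset" idea could look, as a sketch rather than a recommendation from the Guava docs: compute SHA-256 with the JDK's MessageDigest, treat the digest as a large integer, and re-encode it in a custom alphabet. The class name, method name, and the 39-character alphabet (a-z, 0-9, and the three extra characters mentioned in the question) are assumptions for illustration. Eight characters from 39 symbols give about 39^8, roughly 5.4 * 10^12 (about 42 bits), possible values, compared with 2^32 for CRC32.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class AlphabetHashSketch {
    // Illustrative alphabet: a-z, 0-9, and the extra characters _ $ -
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789_$-";

    /** Hashes keyValue + salt with SHA-256 and encodes the result as `length` alphabet characters. */
    static String shortHash(String keyValue, String salt, int length) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(keyValue.getBytes(StandardCharsets.UTF_8));
            sha256.update(salt.getBytes(StandardCharsets.UTF_8));
            // Interpret the 32-byte digest as a non-negative integer.
            BigInteger value = new BigInteger(1, sha256.digest());
            BigInteger base = BigInteger.valueOf(ALPHABET.length());
            StringBuilder out = new StringBuilder(length);
            // Peel off one base-39 digit per output character.
            for (int i = 0; i < length; i++) {
                BigInteger[] quotientAndRemainder = value.divideAndRemainder(base);
                out.append(ALPHABET.charAt(quotientAndRemainder[1].intValue()));
                value = quotientAndRemainder[0];
            }
            return out.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        String salt = "4767c3a6-73bc-11ec-90d6-0242ac120003";
        System.out.println(shortHash("myStringOne", salt, 8));
        System.out.println(shortHash("myStringTwo", salt, 8));
    }
}
```

Collisions remain possible at 8 characters; the larger output space only makes them less likely, so the lookup still needs to tolerate two inputs mapping to the same ID.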

How to ensure values produced randomly are unique in Java [duplicate]

When generating a random value, I need to ensure that the generateUserID() method generates a "random and unique" value every time.
private String generateUserID(String prefix, int digitNumber)
{
return prefix + String.valueOf(digitNumber < 1 ? 0 : new Random()
.nextInt((9 * (int) Math.pow(10, digitNumber - 1)) - 1)
+ (int) Math.pow(10, digitNumber - 1));
}
No output from this method should be the same as another value. Thus, I need to refactor the code, but I cannot use any loop.
I use random IDs like this when moving data records from one system to another. I generally use a UUID (Universally Unique IDentifier) so I can clearly distinguish any one record from any of its (potentially millions of) counterparts.
Java has pretty good support for them. Try something like this:
import java.util.UUID;
private String generateUserID(String prefix, int digitNumber) {
String uuid = UUID.randomUUID().toString();
return prefix + digitNumber + uuid; // Use the prefix and digitNumber however you want.
}
JavaDocs: https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html#randomUUID()

Id Generation for multiple forms

Can anyone tell me whether the IDs will always be unique if I use the code below to generate IDs for my files?
Hundreds of forms are created automatically at the same time, and each one auto-populates an ID into its ID textbox. So the method should be thread-safe, and if I restart the application it should never repeat an ID that was generated before the application stopped.
private static final AtomicLong counter = new AtomicLong(0L);
public static String generateIdforFile()
{
String timeString = Long.toString(System.currentTimeMillis(), 36);
String counterString = Long.toString(counter.incrementAndGet() % 1000, 36);
return timeString + counterString;
}
And forms are getting the Id using ClassName.generateIdforFile();
Why not just use a UUID for your file id? You could use something like the following:
public static String generateIdforFile() {
return UUID.randomUUID().toString();
}
Or do you need a (sequential) numeric value?
If the ID just has to be numeric (and not sequential), you could use UUID#getLeastSignificantBits() or UUID#getMostSignificantBits() for the numeric value.
Quoting this answer on SO:
So the most significant half of your UUID contains 58 bits of
randomness, which means you on average need to generate 2^29 UUIDs to
get a collision (compared to 2^61 for the full UUID).
You will of course not be as collision secure as using the full UUID.
If you make the method synchronized, there is no need to use an AtomicLong variable,
because thread safety is already ensured by the synchronized keyword.
Using unnecessary concurrent variables hurts the efficiency and performance of the application.
It is better to use a global AtomicLong starting at 0L for your entire application, and then concatenate its value with currentTimeMillis().
static AtomicLong counter = new AtomicLong(0L);
public static String generateIdforFile()
{
String timeString = Long.toString(System.currentTimeMillis(), 36);
String counterString = Long.toString(counter.incrementAndGet() % 1000, 36);
return timeString + counterString;
}
This has a greater chance of yielding unique IDs, even across application restarts, provided that your app takes more than a few milliseconds to shut down and restart, and provided that you create fewer than a thousand files in the same millisecond. Note that the method is no longer synchronized (there is no need). But you can't guarantee universal uniqueness.

How to synchronize System Time access in a class in Java

I am writing a class that when called will call a method to use system time to generate a unique 8 character alphanumeric as a reference ID. But I have the fear that at some point, multiple calls might be made in the same millisecond, resulting in the same reference ID. How can I go about protecting this call to system time from multiple threads that might call this method simultaneously?
System time is an unreliable source for unique IDs. That's it. Don't use it.
You need some form of a permanent source (UUID uses a secure random generator whose seed is provided by the OS).
The system time may jump backwards, even by a few milliseconds, and break your logic entirely. If you can tolerate only 64 bits, you can either use a High/Low generator, which is a very good compromise, or cook your own recipe: for example, 18 bits of days since the beginning of 2012 (you have over 700 years to go) followed by 46 bits of randomness from SecureRandom. It is not the best approach and technically it may fail, but it doesn't require external persistence.
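The "18 bits of days plus 46 bits of randomness" recipe could be sketched like this. The class name, the method name, and the choice to pass the date in as a parameter are illustrative assumptions; the bit layout follows the description above.

```java
import java.security.SecureRandom;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DayRandomIdSketch {
    private static final LocalDate EPOCH = LocalDate.of(2012, 1, 1);
    private static final int RANDOM_BITS = 46;
    private static final long RANDOM_MASK = (1L << RANDOM_BITS) - 1;
    private static final long DAY_MASK = (1L << 18) - 1;
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Packs the days since 2012-01-01 into the top 18 bits and 46 random bits below them. */
    static long nextId(LocalDate today) {
        long days = ChronoUnit.DAYS.between(EPOCH, today) & DAY_MASK;
        long random = RANDOM.nextLong() & RANDOM_MASK;
        return (days << RANDOM_BITS) | random;
    }

    public static void main(String[] args) {
        long id = nextId(LocalDate.now());
        // Base 36 gives a compact alphanumeric form; it will usually be longer than 8 characters.
        System.out.println(Long.toString(id, 36));
    }
}
```

Two IDs generated on the same day can still collide in their 46 random bits, which is the "technically it may fail" caveat from the answer.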
I'd suggest adding the thread ID to the reference ID. This makes the reference more likely to be unique. However, even within a thread, consecutive calls to a time source may deliver identical values. Even calls to the highest-resolution source (QueryPerformanceCounter) may return identical values on certain hardware. A possible solution to this problem is to test the collected time value against its predecessor and add an increment to the "time-stamp". You may need more than 8 characters if this should be human readable.
The most efficient source for a timestamp is the GetSystemTimeAsFileTime API. I wrote some details in this answer.
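A sketch of the "compare against the predecessor and add an increment" idea from the answer above, in Java. It uses System.currentTimeMillis() rather than the Windows APIs mentioned there, and the class and method names are hypothetical. An AtomicLong remembers the last value handed out, so concurrent callers in the same millisecond (or after a backwards clock jump) still get strictly increasing values within one JVM.

```java
import java.util.concurrent.atomic.AtomicLong;

class MonotonicTimestampSketch {
    // Last value handed out by this instance.
    private final AtomicLong last = new AtomicLong();

    /** Returns nowMillis, or the previous value plus one if nowMillis has not advanced past it. */
    long next(long nowMillis) {
        return last.updateAndGet(prev -> Math.max(prev + 1, nowMillis));
    }

    /** Convenience overload that reads the system clock. */
    long next() {
        return next(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        MonotonicTimestampSketch clock = new MonotonicTimestampSketch();
        long first = clock.next();
        long second = clock.next();
        System.out.println(first < second); // true even within the same millisecond
    }
}
```

This only protects a single process: the last value is lost on restart, so uniqueness across restarts would still need persistence or an extra random or host-specific component.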
You can use the UUID class to generate the bits for your ID, then use some bitwise operators and Long.toString to convert it to base-36 (alpha-numeric).
public static String getId() {
    UUID uuid = UUID.randomUUID();
    // For randomUUID() (version 4) this half is random apart from the version bits
    long msb = uuid.getMostSignificantBits();
    // This half is random apart from the variant bits
    long lsb = uuid.getLeastSignificantBits();
    long result = msb ^ lsb; // XOR
    String encoded = Long.toString(result, 36);
    // Remove sign if negative
    if (result < 0)
        encoded = encoded.substring(1);
    // Keep the last 8 characters, or left-pad with zeroes
    if (encoded.length() > 8) {
        encoded = encoded.substring(encoded.length() - 8);
    }
    while (encoded.length() < 8) {
        encoded = "0" + encoded;
    }
    return encoded;
}
Since your character space is still smaller compared to UUID, this isn't foolproof. Test it with this code:
public static void main(String[] args) {
Set<String> ids = new HashSet<String>();
int count = 0;
for (int i = 0; i < 100000; i++) {
if (!ids.add(getId())) {
count++;
}
}
System.out.println(count + " duplicate(s)");
}
For 100,000 IDs, the code performs well pretty consistently and is very fast. I start getting duplicate IDs when I increase another order of magnitude to 1,000,000. I modified the trimming to take the end of the encoded string instead of the beginning, and this greatly improved duplicate ID rates. Now having 1,000,000 IDs isn't producing any duplicates for me.
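As a rough cross-check on those observations, the birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)) estimates the chance of at least one duplicate among n IDs drawn uniformly from N possibilities. The sketch below assumes the 8-character base-36 IDs are close to uniform over N = 36^8 values, which is an idealisation of the code above; the class and method names are illustrative. Under that assumption the estimate is roughly 0.2% for 100,000 IDs and roughly 16% for 1,000,000.

```java
class BirthdayEstimateSketch {
    /** Approximate probability of at least one collision among n uniform draws from spaceSize values. */
    static double collisionProbability(double n, double spaceSize) {
        // -expm1(-x) computes 1 - exp(-x) accurately for small x.
        return -Math.expm1(-n * (n - 1) / (2.0 * spaceSize));
    }

    public static void main(String[] args) {
        double space = Math.pow(36, 8); // 8 characters, 36 symbols each
        for (double n : new double[] {1e5, 1e6, 1e7}) {
            System.out.printf("n = %.0f -> p = %.4f%n", n, collisionProbability(n, space));
        }
    }
}
```

So a run of 1,000,000 IDs without duplicates is the more likely outcome, but it is not something an 8-character random ID can guarantee.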
Your best bet may still be to use a synchronized counter like AtomicInteger or AtomicLong and encode the number from that in base-36 using the code above, especially if you plan on having lots of IDs.
Edit: Counter approach, in case you want it:
import java.util.concurrent.atomic.AtomicLong;
public class IdGenerator {
    private final AtomicLong counter;
    public IdGenerator(int start) {
        // start could also be initialized from a file or other
        // external source that stores the most recently used ID
        counter = new AtomicLong(start);
    }
    public String getId() {
        long result = counter.getAndIncrement();
        String encoded = Long.toString(result, 36);
        // Remove sign if negative
        if (result < 0)
            encoded = encoded.substring(1);
        // Trim extra digits or pad with zeroes
        if (encoded.length() > 8) {
            encoded = encoded.substring(0, 8);
        }
        while (encoded.length() < 8) {
            encoded = "0" + encoded;
        }
        return encoded;
    }
}
This code is thread-safe and can be accessed concurrently.

Why is Digester generating a different hash for the same message, salt and iteration count

I have written a password-hashing method in Spring Security. In that method, I save the salt value and the iteration count in the database. The next time the user logs in with a plain password, I will use the salt value and iteration count from the database to digest the password. But this method generates a different hash even if the salt and iteration values are the same.
public Administrator encryptDigestCode(Administrator administrator) {
StandardStringDigester digester = new StandardStringDigester();
Administrator admin = new Administrator();
digester.setAlgorithm("SHA-256");
digester.setStringOutputType("base64");
Random ran = new Random();
int iterate = ran.nextInt(1000);
digester.setIterations(iterate);
RandomSaltGenerator ram = new RandomSaltGenerator();
byte[] salt = ram.generateSalt(10);
String pass = new String(salt) + administrator.getHashedPassword();
String encryptedPassword = digester.digest(pass);
if (digester.matches(administrator.getHashedPassword(),
encryptedPassword)) {
admin.setLoginDetail(new LoginDetail());
admin.getLoginDetail().setSalt(new String(ram.generateSalt(10)));
admin.getLoginDetail().setHashingCycle(iterate);
admin.setUserName(administrator.getUsername());
admin.setSesamiagreementno(administrator.getSesamiagreementno());
admin.setHashedPassword(encryptedPassword);
} else {
admin.setLoginDetail(null);
admin.setHashedPassword(null);
admin.setUserName(null);
}
return admin;
}
What should I do? Any code or site for reference would help. Thanks.
It looks like you use a random salt, so of course the hash is always different.
And there is a second random value:
Random ran = new Random();
int iterate = ran.nextInt(1000);
digester.setIterations(iterate);
I would expect this to influence the result as well. So unless you are lucky enough to get the same random value twice, the hash is different.
The line
admin.getLoginDetail().setSalt(new String(ram.generateSalt(10)));
is wrong. First, it generates a new random salt and puts it into the database rather than using the variable salt. That is, you call generateSalt twice: the first result is used to hash the password and the second result is stored in the database. Second, it converts a random array of bytes into a string.
Unless you are sure that these bytes correspond to printable characters you should e.g. use a base64 encoding here (and of course the appropriate decoding, whenever necessary).
Also using a random integer for the number of iterations is not such a great idea. The number of iterations should make a brute force attack against a single target expensive to compute. If you use a random number here, then an attacker might just try to find the user with the lowest iteration count and start attacking this password.
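A sketch of the points made in these answers: generate the salt once, store that same salt Base64-encoded, and use a fixed iteration count. Note that this swaps the Jasypt StandardStringDigester from the question for the JDK's built-in PBKDF2WithHmacSHA256, so it illustrates the idea and is not a drop-in fix; the class name, method names, salt length, and iteration count are illustrative assumptions.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

class PasswordHashSketch {
    static final int ITERATIONS = 100_000; // fixed for all users; illustrative value

    /** Generates a 16-byte random salt, Base64-encoded so it can be stored as text. */
    static String newSaltBase64() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return Base64.getEncoder().encodeToString(salt);
    }

    /** Derives a Base64-encoded 256-bit hash from the password, the stored salt, and the iteration count. */
    static String hash(String password, String saltBase64, int iterations) {
        try {
            byte[] salt = Base64.getDecoder().decode(saltBase64);
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, iterations, 256);
            byte[] derived = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
            return Base64.getEncoder().encodeToString(derived);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA256 not available", e);
        }
    }

    /** Re-hashes the login password with the stored salt and compares in constant time. */
    static boolean matches(String password, String saltBase64, int iterations, String storedHash) {
        byte[] expected = Base64.getDecoder().decode(storedHash);
        byte[] actual = Base64.getDecoder().decode(hash(password, saltBase64, iterations));
        return MessageDigest.isEqual(expected, actual);
    }

    public static void main(String[] args) {
        String salt = newSaltBase64();                     // store this exact value
        String stored = hash("secret", salt, ITERATIONS);  // store this too
        System.out.println(matches("secret", salt, ITERATIONS, stored)); // true
        System.out.println(matches("wrong", salt, ITERATIONS, stored));  // false
    }
}
```

The key difference from the code in the question is that the salt used for hashing and the salt written to the database are the same value.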
