How to ensure values produced randomly are unique in Java [duplicate] - java

When generating a random value, I need to ensure that the generateUserID() method generates a "random and unique" value every time.
private String generateUserID(String prefix, int digitNumber)
{
    return prefix + String.valueOf(digitNumber < 1 ? 0 : new Random()
            .nextInt((9 * (int) Math.pow(10, digitNumber - 1)) - 1)
            + (int) Math.pow(10, digitNumber - 1));
}
No output from this method should be the same as another value. Thus, I need to refactor the code, but I cannot use any loop.

I use random IDs like this when moving data records from one system to another. I generally use a UUID (Universally Unique IDentifier) so I can clearly distinguish any one record from any of its (potentially millions of) counterparts.
Java has pretty good support for them. Try something like this:
import java.util.UUID;
private String generateUserID(String prefix, int digitNumber) {
    String uuid = UUID.randomUUID().toString();
    return prefix + digitNumber + uuid; // Use the prefix and digitNumber however you want.
}
JavaDocs: https://docs.oracle.com/javase/7/docs/api/java/util/UUID.html#randomUUID()

Related

How to generate 1000 unique email-ids using java

My requirement is to generate 1000 unique email-ids in Java. I have already generated random text, and I'm using a for loop to limit the number of email-ids generated. The problem is that when I execute it, 10 email-ids are generated but all of them are the same.
Below is the code and output:
public static void main() {
    first fr = new first();
    String n = fr.genText()+"#mail.com";
    for (int i = 0; i<=9; i++) {
        System.out.println(n);
    }
}
public String genText() {
    String randomText = "abcdefghijklmnopqrstuvwxyz";
    int length = 4;
    String temp = RandomStringUtils.random(length, randomText);
    return temp;
}
and output is:
myqo#mail.com
myqo#mail.com
...
myqo#mail.com
When I execute the same program again I get another set of mail-ids, for example 'bfta' instead of 'myqo'. But my requirement is to generate different unique ids within a single run.
For Example:
myqo#mail.com
bfta#mail.com
kjuy#mail.com
Put your String initialization in the for statement:
for (int i = 0; i<=9; i++) {
    String n = fr.genText()+"#mail.com";
    System.out.println(n);
}
I would like to rewrite your method a little bit:
public String generateEmail(String domain, int length) {
    return RandomStringUtils.random(length, "abcdefghijklmnopqrstuvwxyz") + "#" + domain;
}
You could then call it like this:
generateEmail("gmail.com", 4);
As I understand it, you want to generate 1000 unique emails. You can do this conveniently with the Stream API:
Stream.generate(() -> generateEmail("gmail.com", 4))
        .limit(1000)
        .collect(Collectors.toSet())
But the problem still exists. I purposely collected the Stream<String> into a Set<String> (which removes duplicates) to check its size(). As you can see, the size does not always equal 1000:
999
1000
997
That means your algorithm returns duplicate values even for such a small range.
Therefore, you'd better look into existing email generators for Java or improve your own (for example, by adding digits or some special characters, which, in turn, will produce plenty of exceptions).
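To put a rough number on the duplicates observed above, here is a minimal sketch of the birthday-problem arithmetic. It assumes each 4-letter string is drawn uniformly and independently from the 26^4 = 456,976 possibilities; the class and method names are made up for illustration.

```java
class DuplicateOdds {
    // Probability that at least two of `draws` uniform random picks
    // from `space` equally likely values coincide (birthday problem).
    static double probabilityOfDuplicate(int draws, double space) {
        double allDistinct = 1.0;
        for (int i = 0; i < draws; i++) {
            // The (i+1)-th pick must avoid the i values already taken
            allDistinct *= (space - i) / space;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        double space = Math.pow(26, 4); // 4 lowercase letters
        System.out.println(probabilityOfDuplicate(1000, space));
    }
}
```

Under these assumptions, a batch of 1000 addresses contains at least one duplicate roughly two times out of three, which is consistent with set sizes such as 999 and 997 showing up regularly.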
If you are planning to use MockNeat, generating email strings is already built in.
Example 1:
String corpEmail = mock.emails().domain("startup.io").val();
// Possible Output: tiptoplunge#startup.io
Example 2:
String domsEmail = mock.emails().domains("abc.com", "corp.org").val();
// Possible Output: funjulius#corp.org
Note: mock is the default "mocking" object.
To guarantee uniqueness you could use a counter as part of the email address:
myqo0000#mail.com
bfta0001#mail.com
kjuy0002#mail.com
If you want to stick to letters only then convert the counter to base 26 representation using 'a' to 'z' as the digits.
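The base-26 idea can be sketched as follows. This is only an illustration: the class name, the fixed width of 4 and the static counter are assumptions, and the counter would need to be persisted somewhere if uniqueness has to survive a restart.

```java
import java.util.concurrent.atomic.AtomicLong;

class CounterLetters {
    // Hypothetical process-wide counter; starts at 0 on every run
    private static final AtomicLong COUNTER = new AtomicLong();

    // Encodes a non-negative value in base 26 with 'a'..'z' as digits,
    // left-padded with 'a' (the zero digit) to the given width.
    static String toLetters(long value, int width) {
        char[] digits = new char[width];
        for (int i = width - 1; i >= 0; i--) {
            digits[i] = (char) ('a' + value % 26);
            value /= 26;
        }
        return new String(digits);
    }

    // Each call returns a different 4-letter local part until the
    // counter reaches 26^4 = 456,976, after which values wrap around.
    static String nextLocalPart() {
        return toLetters(COUNTER.getAndIncrement(), 4);
    }
}
```

The domain can then be appended to each local part exactly as in the code above. Unlike the random approach, the first 456,976 results are distinct by construction.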

How to generate random string with no duplicates in java

I read some answers; usually they use a set or some other data structure to ensure there are no duplicates. But in my situation, I have already stored a lot of random strings in a database, and I have to make sure that a newly generated random string does not already exist in the database.
I don't think retrieving all the random strings from the database into a set and then generating the random string is a good idea...
I found that System.currentTimeMillis() will generate a "random" number, but how to translate that number into a random string is the question... I need a string of length 8.
Any suggestion will be appreciated.
You can use the Apache Commons Lang library for this: RandomStringUtils
RandomStringUtils.randomAlphanumeric(8).toUpperCase() // for alphanumeric
RandomStringUtils.randomAlphabetic(8).toUpperCase() // for pure alphabets
randomAlphabetic(int count)
Creates a random string whose length is the number of characters specified.
randomAlphanumeric(int count)
Creates a random string whose length is the number of characters specified.
So there are two issues here - creating the random string, and making sure there's no duplicate already in the db.
If you are not bound to 8 characters, you can use a UUID as the commenter above suggested. The UUID class returns a string that is statistically highly unlikely to duplicate a previously generated UUID, so you can use it for this purpose without checking whether it's already in your database.
UUID.randomUUID().toString();
which produces a string that looks something like this:
5e0013fd-3ed4-41b4-b05d-0cdf4324bb19
Or, if you don't care what the unique id is as long as it's unique, you could use an identity or autoincrement field, which pretty much all DBs support. If you do that, though, you have to read the record back after you commit it to get the identity assigned by the DB.
If you have to have an 8-character string as your unique id and you don't want to import the Apache library, you can generate a random 8-character string like this:
final String alpha="ABCDEFGHIJKLMNOPQRSTUVWXYZ";
final Random rand= new Random();
public String myUID() {
    int i = 8;
    String uid="";
    while (i-- > 0) {
        uid+=alpha.charAt(rand.nextInt(26));
    }
    return uid;
}
To make sure it's not a duplicate, you should add a unique index to the database column that contains it.
You can either query the db first to make sure that no row has that id before you insert the row, or catch the exception and retry if you've generated a duplicate.
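The "insert and retry on duplicate" option might look roughly like this. It is a sketch only: the HashSet is a stand-in for the database column with a unique index, and the class name, method name and maxAttempts parameter are invented for the example. With a real database the duplicate would surface as a constraint-violation exception from the insert rather than a false return value.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

class RetryOnDuplicate {
    // Stand-in for a table column protected by a unique index
    private final Set<String> uniqueColumn = new HashSet<>();

    // Tries to store a freshly generated id, generating a new one whenever
    // the "index" rejects it, and gives up after maxAttempts tries.
    String insertWithRetry(Supplier<String> generator, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String candidate = generator.get();
            if (uniqueColumn.add(candidate)) { // false means duplicate
                return candidate;
            }
        }
        throw new IllegalStateException(
                "No unused id found after " + maxAttempts + " attempts");
    }
}
```

Bounding the number of attempts keeps the method from spinning forever once most of the 8-character space is already used.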
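The method currentTimeMillis() returns the current time in milliseconds as a long. Convert that long to a String, and s.substring(5, s.length()) gives you the last 8 digits of the millisecond value (assuming the usual 13-digit value). These digits change from one millisecond to the next, but two calls within the same millisecond return the same string, and the sequence repeats every 10^8 ms (about 28 hours).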
public static void main(String[] args) {
    String s = String.valueOf(System.currentTimeMillis());
    System.out.println(s.substring(5, s.length()));
}
You still have to check each time whether this string already exists in your database.

Id Generation for multiple forms

Can anyone tell me whether the code below, which I use to generate IDs for my files, will always produce unique values?
Hundreds of forms are created automatically at the same time, and each auto-populates its ID textbox. So the code should be thread-safe, and if I restart the application it should never repeat an ID that was generated before the application stopped.
private static final AtomicLong counter = new AtomicLong(0L);
public static String generateIdforFile()
{
    String timeString = Long.toString(System.currentTimeMillis(), 36);
    String counterString = Long.toString(counter.incrementAndGet() % 1000, 36);
    return timeString + counterString;
}
And forms are getting the Id using ClassName.generateIdforFile();
Why not just use a UUID for your file id? You could use something like the following:
public static String generateIdforFile() {
    return UUID.randomUUID().toString();
}
Or do you need a (ongoing) numeric value?
If the number just has to be numeric (and not ongoing) you could use UUID#getLeastSignificantBits() or UUID#getMostSignificantBits() for the numeric value.
Quoting this answer on SO:
So the most significant half of your UUID contains 58 bits of
randomness, which means you on average need to generate 2^29 UUIDs to
get a collision (compared to 2^61 for the full UUID).
You will of course not be as collision secure as using the full UUID.
If you make the method synchronized, there is no need to use AtomicLong variables,
because the synchronized keyword already ensures thread safety.
Using more concurrency primitives than necessary hurts the efficiency and performance of the application.
Better to use a single global AtomicLong starting at 0L for your entire application. Then you concatenate it with currentTimeMillis().
static AtomicLong counter = new AtomicLong(0L);
public static String generateIdforFile()
{
    String timeString = Long.toString(System.currentTimeMillis(), 36);
    String counterString = Long.toString(counter.incrementAndGet() % 1000, 36);
    return timeString + counterString;
}
This has a better chance of yielding unique IDs, even across application restarts, provided that your app takes more than a few milliseconds to shut down and restart, and provided that you create fewer than a thousand files in the same millisecond. Note that the method is no longer synchronized (there is no need). But you can't guarantee universal uniqueness.

How to synchronize System Time access in a class in Java

I am writing a class that when called will call a method to use system time to generate a unique 8 character alphanumeric as a reference ID. But I have the fear that at some point, multiple calls might be made in the same millisecond, resulting in the same reference ID. How can I go about protecting this call to system time from multiple threads that might call this method simultaneously?
System time is an unreliable source for unique IDs. That's it. Don't use it.
You need some form of persistent source (UUID uses a secure random generator whose seed is provided by the OS).
The system time may jump backwards, even by a few milliseconds, and break your logic entirely. If you can tolerate only 64 bits, you can either use a High/Low generator, which is a very good compromise, or cook your own recipe: for example, 18 bits of days since the beginning of 2012 (that gives you over 700 years) followed by 46 bits of randomness from SecureRandom. That's not the best case and technically it may fail, but it doesn't require external persistence.
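One way to read the "18 bits of days plus 46 bits of randomness" recipe is sketched below. The bit layout (days in the top 18 bits), the class name and the helper names are assumptions made for illustration, not something the answer specifies.

```java
import java.security.SecureRandom;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DayRandomId {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final LocalDate EPOCH = LocalDate.of(2012, 1, 1);
    private static final long DAY_MASK = (1L << 18) - 1;    // 18 bits
    private static final long RANDOM_MASK = (1L << 46) - 1; // 46 bits

    // Packs the day count into the top 18 bits and the random
    // value into the low 46 bits of a single long.
    static long compose(long days, long random46) {
        return ((days & DAY_MASK) << 46) | (random46 & RANDOM_MASK);
    }

    static long nextId() {
        long days = ChronoUnit.DAYS.between(EPOCH, LocalDate.now());
        long random46 = RANDOM.nextLong() >>> 18; // keep 46 random bits
        return compose(days, random46);
    }
}
```

Two IDs created on the same day can still collide if the 46 random bits happen to match, which is the "technically it may fail" case mentioned above. Also note that the day component still relies on the system date, so only a clock error of a day or more would affect it.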
I'd suggest to add the threadID to the reference ID. This will make the reference more unique. However, even within a thread consecutive calls to a time source may deliver identical values. Even calls to the highest resolution source (QueryPerformanceCounter) may result in identical values on certain hardware. A possible solution to this problem is testing the collected time value against its predecessor and add an increment item to the "time-stamp". You may need more than 8 characters when this should be human readable.
The most efficient source for a timestamp is the GetSystemTimeAsFileTime API. I wrote some details in this answer.
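The "test against the predecessor and add an increment" idea can be sketched in Java as below. It uses System.currentTimeMillis() rather than the Windows APIs named above, and the class name is invented for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

class MonotonicTimestamp {
    // Last value handed out; shared by all threads
    private final AtomicLong last = new AtomicLong();

    // Returns the current time in milliseconds, bumped by at least one
    // past the previously returned value, so results never repeat and
    // never go backwards within this process.
    long next() {
        return last.updateAndGet(
                prev -> Math.max(prev + 1, System.currentTimeMillis()));
    }
}
```

If the clock jumps backwards or many calls land in the same millisecond, the returned values simply run ahead of the real time for a while. Uniqueness holds only within one running process; nothing here survives a restart.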
You can use the UUID class to generate the bits for your ID, then use some bitwise operators and Long.toString to convert it to base-36 (alpha-numeric).
public static String getId() {
    UUID uuid = UUID.randomUUID();
    // For a random (version 4) UUID this half is random apart from the 4 version bits
    long msb = uuid.getMostSignificantBits();
    // This half is random apart from the variant bits
    long lsb = uuid.getLeastSignificantBits();
    long result = msb ^ lsb; // XOR
    String encoded = Long.toString(result, 36);
    // Remove sign if negative
    if (result < 0)
        encoded = encoded.substring(1, encoded.length());
    // Trim extra digits or pad with zeroes
    if (encoded.length() > 8) {
        encoded = encoded.substring(encoded.length() - 8, encoded.length());
    }
    while (encoded.length() < 8) {
        encoded = "0" + encoded;
    }
    return encoded;
}
Since your character space is still smaller compared to UUID, this isn't foolproof. Test it with this code:
public static void main(String[] args) {
    Set<String> ids = new HashSet<String>();
    int count = 0;
    for (int i = 0; i < 100000; i++) {
        if (!ids.add(getId())) {
            count++;
        }
    }
    System.out.println(count + " duplicate(s)");
}
For 100,000 IDs, the code performs well pretty consistently and is very fast. I start getting duplicate IDs when I increase another order of magnitude to 1,000,000. I modified the trimming to take the end of the encoded string instead of the beginning, and this greatly improved duplicate ID rates. Now having 1,000,000 IDs isn't producing any duplicates for me.
Your best bet may still be to use a synchronized counter like AtomicInteger or AtomicLong and encode the number from that in base-36 using the code above, especially if you plan on having lots of IDs.
Edit: Counter approach, in case you want it:
private final AtomicLong counter;
public IdGenerator(int start) {
    // start could also be initialized from a file or other
    // external source that stores the most recently used ID
    counter = new AtomicLong(start);
}
public String getId() {
    long result = counter.getAndIncrement();
    String encoded = Long.toString(result, 36);
    // Remove sign if negative
    if (result < 0)
        encoded = encoded.substring(1, encoded.length());
    // Trim extra digits or pad with zeroes
    if (encoded.length() > 8) {
        encoded = encoded.substring(0, 8);
    }
    while (encoded.length() < 8) {
        encoded = "0" + encoded;
    }
    return encoded;
}
This code is thread-safe and can be accessed concurrently.

I can't understand this programming code for a pseudorandom number generator for hashing

First of all, I have just begun learning Java, and I can say it is more challenging than C or Python. I'm not very keen on programming either, so I have a hard time understanding how some code works. This one in particular:
public class Pseudo
{
    final int a = 2;
    final int c = 3;
    int address;
    String list[][] = new String [100][6];
    public void AddRecord(String ID, String Name, String Course, String Address, String Email, String Contact)
    {
        address = (a * Integer.parseInt(ID) + c) % list.length;
        if((Integer.parseInt(ID)<100000||Integer.parseInt(ID)>999999)||ID.length()==0 || Name.length()==0 || Course.length()==0 || Address.length()==0)
        {
            showMessageDialog(null,"The ID number should be in six digit and the particular field should not be empty","",ERROR_MESSAGE);
        }
        else{
            if(list[address][0]!=null){
                showMessageDialog(null,"Collison is occur, the same address is get. Recalculating...............","",WARNING_MESSAGE);
                while(list[address][0]!=null)
                {
                    address = (a * address + c) % list.length;
                }
            }
            list[address][0] = ID;
            list[address][1] = Name;
            list[address][2] = Course;
            list[address][3] = Address;
            list[address][4] = Email;
            list[address][5] = Contact;
            showMessageDialog(null,"Student Information " + ID + " will be saved in address: " + address,"",INFORMATION_MESSAGE);
        }
    }
The confusion comes at this part:
address = (a * Integer.parseInt(ID) + c) % list.length;
if((Integer.parseInt(ID)<100000||Integer.parseInt(ID)>999999)||ID.length()==0 || Name.length()==0 || Course.length()==0 || Address.length()==0)
What does it mean? From what I understand from this code, inside an IF statement you can have more than 1 condition. I'm not very sure, since this is my first time seeing such code.
The second is this
if(list[address][0]!=null){
    showMessageDialog(null,"Collison is occur, the same address is get. Recalculating...............","",WARNING_MESSAGE);
    while(list[address][0]!=null)
    {
        address = (a * address + c) % list.length;
    }
}
list[address][0] = ID;
list[address][1] = Name;
list[address][2] = Course;
list[address][3] = Address;
list[address][4] = Email;
list[address][5] = Contact;
showMessageDialog(null,"Student Information " + ID + " will be saved in address: " + address,"",INFORMATION_MESSAGE);
If a collision occurs, the address where the record is stored should be recalculated using the pseudorandom number generator again, but what I can't grasp is list[address][0]!=null. I am just baffled by this line. I know its job is to change the address if a collision happens, but I don't know the exact mechanics of how this part is executed.
From what I understand from this code, inside an IF statement you can have more than 1 condition.
Well, yes and no. You can construct complex conditions based on many smaller conditions, but ultimately the whole thing has to resolve to a single boolean true/false result.
Consider the condition in this case:
(Integer.parseInt(ID)<100000||Integer.parseInt(ID)>999999)||ID.length()==0 || Name.length()==0 || Course.length()==0 || Address.length()==0
Let's break that down into its components:
(
    Integer.parseInt(ID)<100000 ||
    Integer.parseInt(ID)>999999
) ||
ID.length()==0 ||
Name.length()==0 ||
Course.length()==0 ||
Address.length()==0
It's really just chaining together a bunch of comparisons into one big true/false statement. You can essentially read something like this as:
If (something) or (something else) or (another thing) then...
And each something can itself contain small somethings, etc. You can build as complex a logical condition as you want, grouping sub-conditions with parentheses, as long as the whole thing resolves to a single true/false result.
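As a small illustration of how the sub-conditions combine into one boolean, the same check could be written with named intermediate results. This is a sketch, not the original code: the class and method names are made up, and the ID is assumed to be parsed to an int already, so the ID.length()==0 test is left out.

```java
class RecordValidation {
    // Returns true when the record should be rejected.
    static boolean isInvalid(int id, String name, String course, String address) {
        // First group: the ID is not a six-digit number
        boolean idOutOfRange = id < 100000 || id > 999999;
        // Second group: at least one required field is empty
        boolean anyFieldEmpty = name.isEmpty() || course.isEmpty() || address.isEmpty();
        // The whole condition still resolves to a single true/false value
        return idOutOfRange || anyFieldEmpty;
    }
}
```

Naming the groups changes nothing about the logic; it only makes each "something" in the chain visible.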
what I can't grasp is list[address][0]!=null
That is just checking if a particular value is null. That value is part of a nested (jagged) array. So you have a variable called list. That variable is an array. Each element in that array is, itself, also an array. So you end up with a kind of 2-dimensional array (but a jagged one, where any given sub-array doesn't have to be the same length as any other).
That specific piece of code looks into the list array, at the address index, and looks at the 0 index of that sub-array, and checks if that value is null.
First of all, understanding any code is much easier if it's properly formatted. All good IDEs have such a function, e.g. for Eclipse the shortcut is Ctrl+Shift+F, for IntelliJ IDEA Ctrl+Alt+L.
The most important part, which might resolve your first confusion: || is the logical OR in Java, meaning the ID must be a number between 100000 and 999999 and the attributes must not be empty. Or literally, if the ID is smaller than 100000 or larger than 999999 or any of the values are empty, there will be an error message and nothing will be done.
For the second part: null means that a variable is not set, so to prevent overwriting an entry you can check if it's already set, i.e. not equal to null. So the code changes the address variable until an address is found for which no data is set yet and then uses it to store the given data.
There are several potential problems in this code, among which:
several calls to the relatively slow Integer.parseInt(String) where it could be called once and stored into a variable
potential NumberFormatException if ID isn't a number (or is empty, or has some excess white spaces)
potential infinite loop if the array is full
But as it looks like some CS homework it shouldn't matter.
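For completeness, the three problems listed above could be addressed along these lines. This is a hedged sketch with invented names (ProbeSketch, parseId, findFreeSlot); it keeps the same a = 2, c = 3 probing formula from the question and uses -1 as an assumed "no result" marker.

```java
class ProbeSketch {
    static final int A = 2;
    static final int C = 3;

    // Parses the ID once; returns -1 if it is not a six-digit number.
    static int parseId(String raw) {
        try {
            int id = Integer.parseInt(raw.trim());
            return (id >= 100000 && id <= 999999) ? id : -1;
        } catch (NumberFormatException e) {
            return -1; // empty, non-numeric, or too large for an int
        }
    }

    // Follows the probe sequence for at most table.length steps and
    // returns a free row index, or -1 if none was found.
    static int findFreeSlot(String[][] table, int id) {
        int address = (A * id + C) % table.length;
        for (int attempts = 0; attempts < table.length; attempts++) {
            if (table[address][0] == null) {
                return address;
            }
            address = (A * address + C) % table.length;
        }
        return -1;
    }
}
```

The bound matters even before the table is full: with a table length of 100, address 97 maps back to itself ((2 * 97 + 3) % 100 = 97), so the original while loop would spin forever once it reaches an occupied slot 97.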
Thank you so much, Mr David. I understand the first part: if you have a condition, you can stack conditions on each other, and from what I can understand it only works with the || (OR) operator, since using this will guarantee either a true or false ending.
while(list[address][0]!=null)
But I'm still a little confused about part 2 of my problem. That line checks whether the array entry is null, meaning no value, right? This is my understanding of the situation: that particular part of the code is supposed to resolve any collision if the user enters the same ID number, so shouldn't it be checking the value that's causing the collision? But what the line seems to do is run the corresponding procedure as long as a null value is detected.
