My requirement is to generate 1000 unique email IDs in Java. I already generate random text, and I use a for loop to limit the number of email IDs generated. The problem is that when I run the program, 10 email IDs are generated but they are all the same.
Below are the code and the output:
public static void main(String[] args) {
    first fr = new first();
    String n = fr.genText() + "@mail.com";
    for (int i = 0; i <= 9; i++) {
        System.out.println(n);
    }
}

public String genText() {
    String randomText = "abcdefghijklmnopqrstuvwxyz";
    int length = 4;
    String temp = RandomStringUtils.random(length, randomText);
    return temp;
}
and output is:
myqo@mail.com
myqo@mail.com
...
myqo@mail.com
When I run the same program again I get a different set of email IDs (for example, 'bfta' instead of 'myqo'), but my requirement is to generate different, unique IDs within a single run.
For example:
myqo@mail.com
bfta@mail.com
kjuy@mail.com
Move the String initialization inside the for loop, so that genText() is called on every iteration:
for (int i = 0; i <= 9; i++) {
    String n = fr.genText() + "@mail.com";
    System.out.println(n);
}
I would also rewrite your method slightly:
public String generateEmail(String domain, int length) {
    return RandomStringUtils.random(length, "abcdefghijklmnopqrstuvwxyz") + "@" + domain;
}
It can then be called like this:
generateEmail("gmail.com", 4);
As I understand it, you want to generate 1000 unique emails. You can do this conveniently with the Stream API:
Stream.generate(() -> generateEmail("gmail.com", 4))
    .limit(1000)
    .collect(Collectors.toSet())
But the problem still exists. I deliberately collected the Stream<String> into a Set<String> (which removes duplicates) to check its size(). As you can see, the size is not always equal to 1000:
999
1000
997
That means your algorithm returns duplicate values even for such a small range. With 4 lowercase letters there are only 26^4 = 456,976 possible local parts, so roughly one collision per 1000 random draws is to be expected.
Therefore, you should either look into existing email generators for Java or improve your own (for example, by adding digits or some special characters, which, in turn, will produce plenty of exceptions).
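One way to get exactly 1000 distinct addresses out of a random generator is to drop duplicates before limiting the stream. The following is only a minimal sketch under assumptions: the class name UniqueEmails and the helper uniqueEmails are made up for illustration, and java.util.Random stands in for RandomStringUtils so the example has no external dependency.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UniqueEmails {
    private static final String LETTERS = "abcdefghijklmnopqrstuvwxyz";
    private static final Random RANDOM = new Random();

    // Builds one random address such as "myqo@mail.com".
    static String generateEmail(String domain, int length) {
        StringBuilder local = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            local.append(LETTERS.charAt(RANDOM.nextInt(LETTERS.length())));
        }
        return local + "@" + domain;
    }

    // distinct() discards repeats, so limit(count) only counts unique values.
    // Assumption: count is well below 26^length, otherwise this never finishes.
    static List<String> uniqueEmails(String domain, int length, int count) {
        return Stream.generate(() -> generateEmail(domain, length))
                .distinct()
                .limit(count)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> emails = uniqueEmails("mail.com", 4, 1000);
        System.out.println(emails.size());
    }
}
```

Because distinct() runs before limit(), the stream keeps drawing new candidates until the requested number of unique addresses has been collected.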
If you are planning to use MockNeat, it already has a built-in feature for generating email strings.
Example 1:
String corpEmail = mock.emails().domain("startup.io").val();
// Possible Output: tiptoplunge@startup.io
Example 2:
String domsEmail = mock.emails().domains("abc.com", "corp.org").val();
// Possible Output: funjulius@corp.org
Note: mock is the default "mocking" object.
To guarantee uniqueness you could use a counter as part of the email address:
myqo0000@mail.com
bfta0001@mail.com
kjuy0002@mail.com
If you want to stick to letters only then convert the counter to base 26 representation using 'a' to 'z' as the digits.
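The base 26 idea can be sketched as follows. This is an illustration rather than a finished generator: the class CounterEmails, the helper names and the fixed width of 4 letters are assumptions made for the example.

```java
class CounterEmails {
    // Encodes a non-negative counter as a fixed-width base 26 string,
    // using 'a'..'z' as the digits (a = 0, b = 1, ..., z = 25).
    // Assumption: n is smaller than 26^width, otherwise high digits are lost.
    static String toBase26(int n, int width) {
        char[] digits = new char[width];
        for (int i = width - 1; i >= 0; i--) {
            digits[i] = (char) ('a' + n % 26);
            n /= 26;
        }
        return new String(digits);
    }

    static String emailFor(int counter, String domain) {
        return toBase26(counter, 4) + "@" + domain;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            System.out.println(emailFor(i, "mail.com"));
        }
    }
}
```

Every counter value maps to a different string, so uniqueness is guaranteed by construction; a random prefix can still be prepended if the addresses should not look sequential.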
I am trying to decrypt an encrypted file with an unknown key. The only thing I know about it is that the key is an integer x, 0 <= x < 10^10 (i.e. a maximum of 10 decimal digits).
public static String enc(String msg, long key) {
    String ans = "";
    Random rand = new Random(key);
    for (int i = 0; i < msg.length(); i = i + 1) {
        char c = msg.charAt(i);
        int s = c;
        int rd = rand.nextInt() % (256 * 256);
        int s2 = s ^ rd;
        char c2 = (char) (s2);
        ans += c2;
    }
    return ans;
}

private static String tryToDecode(String string) {
    String returnedString = "";
    long key;
    String msg = reader(string);
    for (long i = 0; i <= 999999999; i++) {
        System.out.println("decoding message with key + " + i);
        key = i;
        System.out.println("decoding with key: " + i + "\n" + enc(msg, key));
    }
    return returnedString;
}
I expect to find the plain text.
The program runs very slowly. Is there any way to make it more efficient?
If you are on Java 8, you can use the parallel array operations added in that release to achieve this.
The best fit for you would be a Spliterator:
public void spliterate() {
    System.out.println("\nSpliterate:");
    int[] src = getData();
    Spliterator<Integer> spliterator = Arrays.spliterator(src);
    spliterator.forEachRemaining( n -> action(n) );
}

public void action(int value) {
    System.out.println("value:"+value);
    // Perform some real work on this data here...
}
I am still not clear about your situation, but here are some good tutorials and articles to help you figure out which of the Java 8 parallel array operations will help you:
http://www.drdobbs.com/jvm/parallel-array-operations-in-java-8/240166287
https://blog.rapid7.com/2015/10/28/java-8-introduction-to-parallelism-and-spliterator/
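Note that Spliterator.forEachRemaining on its own still runs sequentially. One possible way to actually spread the candidate keys over several cores is a parallel LongStream, sketched below. Everything here apart from the cipher itself is an assumption for illustration: the class ParallelKeySearch, the helper names, and in particular the idea of accepting a key when the decoded text contains only characters 0-127 (which presumes an ASCII plaintext).

```java
import java.util.OptionalLong;
import java.util.Random;
import java.util.stream.LongStream;

class ParallelKeySearch {
    // Same cipher as enc in the question, written with a char array.
    static String enc(String msg, long key) {
        char[] out = new char[msg.length()];
        Random rand = new Random(key);
        for (int i = 0; i < msg.length(); i++) {
            out[i] = (char) (msg.charAt(i) ^ (rand.nextInt() % (256 * 256)));
        }
        return new String(out);
    }

    // Assumed acceptance test: every decoded character is plain ASCII.
    // Stops at the first character outside that range.
    static boolean decodesToAscii(String cipherText, long key) {
        Random rand = new Random(key);
        for (int i = 0; i < cipherText.length(); i++) {
            char c = (char) (cipherText.charAt(i) ^ (rand.nextInt() % (256 * 256)));
            if (c > 127) {
                return false;
            }
        }
        return true;
    }

    // Tries all keys in [fromInclusive, toExclusive) on the common fork-join pool.
    static OptionalLong findKey(String cipherText, long fromInclusive, long toExclusive) {
        return LongStream.range(fromInclusive, toExclusive)
                .parallel()
                .filter(key -> decodesToAscii(cipherText, key))
                .findAny();
    }

    public static void main(String[] args) {
        String cipherText = enc("Hello World", 12345L);
        OptionalLong key = findKey(cipherText, 0L, 100_000L);
        key.ifPresent(k -> System.out.println(k + " -> " + enc(cipherText, k)));
    }
}
```

Each key gets its own Random instance, so the worker threads share no mutable state and need no locking.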
First things first: You can't println billions of lines. This will take forever, and it's pointless: you won't be able to read the text as it scrolls by, and your buffer won't hold billions of lines, so you couldn't scroll back up later even if you wanted to. If you prefer (and don't mind it being 2-3% slower than it otherwise would be), you can output once every hundred million keys, just so you can verify your program is making progress.
You can optimize things by not concatenating Strings inside the loop. Strings are immutable, so the old code was creating a rather large number of Strings, especially in the enc method. Normally I'd use a StringBuilder, but in this case a simple character array will meet our needs.
And there's one more thing we need to do that your current code doesn't do: Detect when we have the answer. If we assume that the message will only contain characters from 0-127 with no Unicode or extended ASCII, then we know we have a possible answer when the entire message contains only characters in this range. And we can also use this to further optimize, as we can then immediately discard any message that has a character outside of this range. We don't even have to finish decoding it and can move on to the next key. (If the message is of any length, the odds are that only one key will produce a decoded message with characters in that range - but it's not guaranteed, which is why I do not stop when I get to a valid message. You could probably do that, though.)
Due to the way random numbers are generated in Java, anything in the seed above 32 bits is not used by the encoding/decoding algorithm: enc keeps only the low 16 bits of each nextInt() result, and in java.util.Random those bits depend only on the low 32 bits of the seed. So you only need to go up to 4294967295 instead of 9999999999. (This also means the key that was originally used to encode the message might not be the key this program uses to decode it, since 2-3 keys in the 10 digit range will produce the same encoding/decoding.)
private static String tryToDecode4(String msg) {
    String returnedString = "";
    for (long i = 0; i <= 4294967295L; i++) {
        if (i % 100000000 == 0) // This part is just to see that it's making progress. Remove if desired for a small speed gain.
            System.out.println("Trying " + i);
        char[] decoded = enc4(msg, i);
        if (decoded == null)
            continue;
        returnedString = String.valueOf(decoded);
        System.out.println("decoding with key: " + i + " " + returnedString);
    }
    return returnedString;
}

private static char[] enc4(String msg, long key) {
    char[] ansC = new char[msg.length()];
    Random rand = new Random(key);
    for (int i = 0; i < msg.length(); i = i + 1) {
        char c = msg.charAt(i);
        int s = c;
        int rd = rand.nextInt() % (256 * 256);
        int s2 = s ^ rd;
        char c2 = (char) (s2);
        if (c2 > 127)
            return null;
        ansC[i] = c2;
    }
    return ansC;
}
This code finished running in a little over 3 minutes on my machine, with a message of "Hello World".
This code will not work well for very short messages (3-4 characters or fewer). It will not work if the message contains Unicode or extended ASCII, although it could easily be modified to do so if you know the range of characters that might be in the message.
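The point that only the low 32 bits of the key matter can be checked with a small experiment. This is a sketch for illustration only: the class name SeedBitsDemo and the sample key are assumptions, and enc is a char-array version of the method from the question.

```java
import java.util.Random;

class SeedBitsDemo {
    // Same cipher as enc in the question, written with a char array.
    static String enc(String msg, long key) {
        char[] out = new char[msg.length()];
        Random rand = new Random(key);
        for (int i = 0; i < msg.length(); i++) {
            out[i] = (char) (msg.charAt(i) ^ (rand.nextInt() % (256 * 256)));
        }
        return new String(out);
    }

    public static void main(String[] args) {
        long key = 1234567890L;
        // Same low 32 bits, different higher bits; still below 10^10.
        long alias = key + (1L << 32);
        String a = enc("Hello World", key);
        String b = enc("Hello World", alias);
        // Expected to print true if the two keys are interchangeable.
        System.out.println(a.equals(b));
    }
}
```

The two keys differ only above bit 31, so if the reasoning holds, both produce the same ciphertext and either one decodes the message.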
I want to query multiple candidates for a search string that could look like "My sear foo".
I want to find documents that have a field containing one (or more) of the entered strings (obtained by splitting the input on whitespace).
I found some code that lets me search by pattern:
@View(name = "find_by_serial_pattern", map = "function(doc) { var i; if(doc.serialNumber) { for(i=0; i < doc.serialNumber.length; i+=1) { emit(doc.serialNumber.slice(i), doc);}}}")
public List<DeviceEntityCouch> findBySerialPattern(String serialNumber) {
    String trim = serialNumber.trim();
    if (StringUtils.isEmpty(trim)) {
        return new ArrayList<>();
    }
    ViewQuery viewQuery = createQuery("find_by_serial_pattern").startKey(trim).endKey(trim + "\u9999");
    return db.queryView(viewQuery, DeviceEntityCouch.class);
}
This works quite nicely when looking for just one pattern. But how do I have to modify my code to match multiple terms against doc.serialNumber?
EDIT:
This is my current workaround, but I guess there must be a better way.
It also only supports OR logic: an entry ends up in the list if it matches term1 or term2.
@View(name = "find_by_serial_pattern", map = "function(doc) { var i; if(doc.serialNumber) { for(i=0; i < doc.serialNumber.length; i+=1) { emit(doc.serialNumber.slice(i), doc);}}}")
public List<DeviceEntityCouch> findBySerialPattern(String serialNumber) {
    String trim = serialNumber.trim();
    if (StringUtils.isEmpty(trim)) {
        return new ArrayList<>();
    }
    String[] split = trim.split(" ");
    List<DeviceEntityCouch> list = new ArrayList<>();
    for (String s : split) {
        ViewQuery viewQuery = createQuery("find_by_serial_pattern").startKey(s).endKey(s + "\u9999");
        list.addAll(db.queryView(viewQuery, DeviceEntityCouch.class));
    }
    return list;
}
It looks like you are implementing a full-text search here. That's not going to be very efficient in CouchDB (I guess the same applies to other databases).
Correct me if I am wrong, but from your code it looks like you are trying to search a list of serial numbers for a pattern. CouchDB (or any other database) is quite efficient if you can somehow index the data you will be searching for.
Otherwise you must fetch every single record and perform a string comparison on it.
The only way I can think of to optimize this in CouchDB would be something like the following (with some assumptions):
Your serial numbers are not very long (say 20 chars?)
You force the search to be always 5 characters
Generate view that emits every single 5 char long substring from your serial number - more or less this (could be optimized and not sure if I got the in):
...
for (var i = 0; i <= doc.serialNo.length - 5; i++) {
    // emit every 5-character substring, including the last one
    emit([doc.serialNo.substring(i, i + 5), doc._id]);
}
...
Use the _count reduce function
Now the following URL:
http://localhost:5984/test/_design/serial/_view/complex-key?startkey=["01234"]&endkey=["01234",{}]&group=true
will return a list of documents with a hit count for a key of 01234.
If you don't group and set the reduce option to false, you will get a list of all matches, including duplicates if a single doc has multiple hits.
Refer to http://ryankirkman.com/2011/03/30/advanced-filtering-with-couchdb-views.html for the information about complex keys lookups.
I am not sure how efficient CouchDB is at updating that view. It depends on how many records you have and how many new entries appear between two queries of the view (as I understand it, CouchDB rebuilds the view's B-tree on demand).
I have generated a view like that, which splits doc ids into 5-character keys. For just over 1K docs it generated over 30K results, the id being 32 characters long. Simple maths really: (serialNo.length - searchablekey.length + 1) * docscount.
Generating the view took a while, but the lookups were fast.
You could generate keys of multiple lengths, etc. It all comes down to your record count versus the speed of lookups.
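The key count in that formula can be reproduced outside CouchDB with a few lines of Java. This is only a sketch of the substring enumeration the view performs; the class SubstringKeys and the method keysOf are hypothetical names, not part of CouchDB or Ektorp.

```java
import java.util.ArrayList;
import java.util.List;

class SubstringKeys {
    // Returns every substring of the given length, in order of position.
    // A string of length n yields n - keyLength + 1 keys (none if it is shorter).
    static List<String> keysOf(String serial, int keyLength) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i + keyLength <= serial.length(); i++) {
            keys.add(serial.substring(i, i + keyLength));
        }
        return keys;
    }

    public static void main(String[] args) {
        // A 32-character id produces 32 - 5 + 1 = 28 keys.
        String id = "0123456789abcdef0123456789abcdef";
        System.out.println(keysOf(id, 5).size());
    }
}
```

With 28 keys per 32-character id, a little over a thousand documents is enough to pass the 30K rows mentioned above.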
Hi, how can I write code to generate an alphanumeric code that looks like HW6KNMQA or CMKQ83JX? I don't want to use the UUID method. Is there a simple way to generate this? Any help would be appreciated.
What I have done so far:
import org.apache.commons.lang.RandomStringUtils;
public String testing() throws Exception
{
    int ID_LENGTH = 10;
    String a = RandomStringUtils.randomAlphanumeric(ID_LENGTH);
    return a;
}
but I received this error:
java.lang.NoClassDefFoundError: org/apache/commons/lang/RandomStringUtils
You could use RandomStringUtils from the Apache Commons project. That said, you do not seem to require a fixed-length value; I think this could cause trouble down the line, since it might make it harder to identify the value you are after.
If this is not a problem, you could use Random to decide the length of the string to generate at random.
You can use BigInteger.toString(int radix).
Random random = new Random(System.currentTimeMillis());

public void test() {
    for (int i = 0; i < 10; i++) {
        String n = BigInteger.valueOf(Math.abs(random.nextLong())).toString(32).toUpperCase();
        if (n.length() > 8) {
            if (n.length() > 10) {
                n = n.substring(n.length() - 10);
            }
            System.out.println(n);
        }
    }
}
Note that because this is in base 32 you will not see WXYZ but you should see all other characters and digits with equal probability.
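If a fixed length and the full A-Z range are wanted, a variant that picks each character from an explicit alphabet is another option. The sketch below is an assumption-laden illustration, not part of the answers above: the class CodeGenerator and method randomCode are made-up names, and it uses only the JDK, so it does not need the commons-lang jar that failed to load in the question.

```java
import java.security.SecureRandom;

class CodeGenerator {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Picks each character independently and uniformly from ALPHABET.
    // The codes are random, not guaranteed unique.
    static String randomCode(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(randomCode(8));
        }
    }
}
```

Ambiguous characters such as O and 0 can be removed from ALPHABET if the codes have to be typed in by people.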
I am trying to write a query such as this:
select {r: referrers(f), count:count(referrers(f))}
from com.a.b.myClass f
However, the output doesn't show the actual objects:
{
count = 3.0,
r = [object Object]
}
Removing the Javascript Object notation once again shows referrers normally, but they are no longer compartmentalized. Is there a way to format it inside the Object notation?
So I see that you asked this question a year ago, so I don't know if you still need the answer, but since I was searching around for something similar, I can answer this. The problem is that referrers(f) returns an enumeration and so it doesn't really translate well when you try to put it into your hashmap. I was doing a similar type of analysis where I was trying to find unique char arrays (count the unique combinations of char arrays up to the first 50 characters). What I came up with was this:
var counts = {};
filter(
    map(
        unique(
            map(
                filter(heap.objects('char[]'), "it.length > 50"), // filter out strings less than 50 chars in length
                function(charArray) { // chop the string at 50 chars and then count the unique combos
                    var subs = charArray.toString().substr(0,50);
                    if (! counts[subs]) {
                        counts[subs] = 1;
                    } else {
                        counts[subs] = counts[subs] + 1;
                    }
                    return subs;
                }
            ) // map
        ) // unique
        , function(subs) { // map the strings into an array that has the string and the counts of that string
            return { string: subs, count: counts[subs] };
        }) // map
    , "it.count > 5000"); // filter out strings that have counts < 5000
This essentially shows how to take an enumeration (heap.objects('char[]') in this case) and filter it and map it so that you can compute statistics on it. Hope this helps someone.
I am writing a class that when called will call a method to use system time to generate a unique 8 character alphanumeric as a reference ID. But I have the fear that at some point, multiple calls might be made in the same millisecond, resulting in the same reference ID. How can I go about protecting this call to system time from multiple threads that might call this method simultaneously?
System time is an unreliable source for unique IDs. That's it. Don't use it.
You need some form of permanent source (UUID uses a secure random generator whose seed is provided by the OS).
The system time may jump backwards, even by a few milliseconds, and break your logic entirely. If you can tolerate only 64 bits, you can either use a High/Low generator, which is a very good compromise, or cook up your own recipe: for example, 18 bits of days since the beginning of 2012 (you have over 700 years to go) followed by 46 bits of randomness from SecureRandom. It's not the best case and technically it may fail, but it doesn't require external persistence.
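The 18 + 46 bit recipe could look roughly like this. It is a sketch under assumptions: the class DayRandomId, the method nextId and the choice to pass the date in as a parameter (to keep the example testable) are all made up for illustration.

```java
import java.security.SecureRandom;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class DayRandomId {
    private static final LocalDate EPOCH = LocalDate.of(2012, 1, 1);
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final long DAY_MASK = (1L << 18) - 1;     // 18 bits for the day count
    private static final long RANDOM_MASK = (1L << 46) - 1;  // 46 bits of randomness

    // Upper 18 bits: days since 2012-01-01. Lower 46 bits: SecureRandom output.
    // Two IDs created on the same day can still collide, just very rarely.
    static long nextId(LocalDate today) {
        long days = ChronoUnit.DAYS.between(EPOCH, today) & DAY_MASK;
        long random = RANDOM.nextLong() & RANDOM_MASK;
        return (days << 46) | random;
    }

    public static void main(String[] args) {
        long id = nextId(LocalDate.now());
        System.out.println(Long.toString(id, 36));
    }
}
```

Only the day is taken from the clock, so small backwards jumps of the system time do not affect the result unless they cross midnight.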
I'd suggest adding the thread ID to the reference ID. This will make collisions between threads less likely. However, even within a thread, consecutive calls to a time source may deliver identical values. Even calls to the highest resolution source (QueryPerformanceCounter) may result in identical values on certain hardware. A possible solution to this problem is to test the collected time value against its predecessor and add an increment to the "time-stamp" when they match. You may need more than 8 characters if this should be human readable.
The most efficient source for a timestamp is the GetSystemTimeAsFileTime API. I wrote some details in this answer.
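In Java, the "compare with the predecessor and increment" idea can be sketched with an AtomicLong. The class MonotonicStamp and its method next are hypothetical names, and the clock value is passed in as a parameter purely to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;

class MonotonicStamp {
    private final AtomicLong last = new AtomicLong();

    // Returns the given clock value, or the previous result plus one if the
    // clock has not advanced (or has jumped backwards) since the last call.
    long next(long nowMillis) {
        return last.updateAndGet(prev -> Math.max(prev + 1, nowMillis));
    }

    public static void main(String[] args) {
        MonotonicStamp stamp = new MonotonicStamp();
        for (int i = 0; i < 5; i++) {
            System.out.println(stamp.next(System.currentTimeMillis()));
        }
    }
}
```

updateAndGet retries on contention, so concurrent callers always receive distinct, strictly increasing values without an explicit lock.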
You can use the UUID class to generate the bits for your ID, then use some bitwise operators and Long.toString to convert it to base-36 (alpha-numeric).
public static String getId() {
    UUID uuid = UUID.randomUUID();
    // randomUUID() returns a version 4 UUID: both halves are random,
    // apart from the fixed version bits in the most significant half...
    long msb = uuid.getMostSignificantBits();
    // ...and the fixed variant bits in the least significant half
    long lsb = uuid.getLeastSignificantBits();
    long result = msb ^ lsb; // XOR
    String encoded = Long.toString(result, 36);
    // Remove sign if negative
    if (result < 0)
        encoded = encoded.substring(1);
    // Trim extra digits (keeping the end) or pad with zeroes
    if (encoded.length() > 8) {
        encoded = encoded.substring(encoded.length() - 8);
    }
    while (encoded.length() < 8) {
        encoded = "0" + encoded;
    }
    return encoded;
}
Since 8 base-36 characters give a much smaller ID space than a UUID, this isn't foolproof. Test it with this code:
public static void main(String[] args) {
    Set<String> ids = new HashSet<String>();
    int count = 0;
    for (int i = 0; i < 100000; i++) {
        if (!ids.add(getId())) {
            count++;
        }
    }
    System.out.println(count + " duplicate(s)");
}
For 100,000 IDs, the code performs well pretty consistently and is very fast. I start getting duplicate IDs when I increase another order of magnitude to 1,000,000. I modified the trimming to take the end of the encoded string instead of the beginning, and this greatly improved duplicate ID rates. Now having 1,000,000 IDs isn't producing any duplicates for me.
Your best bet may still be to use a synchronized counter like AtomicInteger or AtomicLong and encode the number from that in base-36 using the code above, especially if you plan on having lots of IDs.
Edit: Counter approach, in case you want it:
import java.util.concurrent.atomic.AtomicLong;

public class IdGenerator {
    private final AtomicLong counter;

    public IdGenerator(int start) {
        // start could also be initialized from a file or other
        // external source that stores the most recently used ID
        counter = new AtomicLong(start);
    }

    public String getId() {
        long result = counter.getAndIncrement();
        String encoded = Long.toString(result, 36);
        // Remove sign if negative
        if (result < 0)
            encoded = encoded.substring(1);
        // Trim extra digits or pad with zeroes. Keep the last 8 digits:
        // the leading digits of a counter change too rarely to stay unique.
        if (encoded.length() > 8) {
            encoded = encoded.substring(encoded.length() - 8);
        }
        while (encoded.length() < 8) {
            encoded = "0" + encoded;
        }
        return encoded;
    }
}
This code is thread-safe and can be accessed concurrently.