Java Multithreading Implementation for generating unique codes - java

My question is how I would implement multithreading to this task correctly.
I have a program that takes quite a long time to finish executing. About an hour and a half. I need to generate about 10,000 random and unique number codes. The code below is how I first implemented it and have it right now.
import java.util.Random;
import java.util.ArrayList;
public class Main
{
public static void main(String[] args) {
Random random = new Random();
// This holds all the codes
ArrayList<String> database = new ArrayList<>();
int counter = 0;
while(counter < 10000){
// Generate a 10 digit long code and append to sb
StringBuilder sb = new StringBuilder();
for(int i = 0; i < 10; i++){
sb.append(random.nextInt(10));
}
String code = String.valueOf(sb);
sb.setLength(0);
// Check if this code already exists in the database
// If not, then add the code and update counter
if(!database.contains(code)){
database.add(code);
counter++;
}
}
System.out.println("Done");
}
}
This of course is incredibly inefficient. So my question is: is there a way to implement multithreading that can work on this single piece of code? The best way I can word it is to give two cores/threads the same code but have them both check a single ArrayList. Both cores/threads will generate codes, but each checks that the code it just made doesn't already exist, whether it came from the other core/thread or from itself. Any insight, advice, or pointers are greatly appreciated.

Using a more appropriate data structure and a more appropriate representation of the data, this should be a lot faster and easier to read, too:
Set<Long> database = new HashSet<>(10000);
while(database.size() < 10000){
database.add(ThreadLocalRandom.current().nextLong(10_000_000_000L));
}

Start with more obvious optimizations:
Do not use ArrayList, use HashSet. ArrayList.contains() has O(n) time complexity, while HashSet.contains() is O(1) on average. For background, look up a Big O summary for the Java Collections Framework and read about Big O notation.
Initialize your collection with appropriate initial capacity. For your case that would be:
new HashSet<>(10000);
This way the underlying arrays won't have to be copied to increase their capacity as the collection grows. I would suggest looking at (or debugging) the implementations of the Java collections to better understand how they work under the hood. Even try to implement them on your own.
Before you delve into complex multithreading optimizations, fix the simple problems - like bad collection choices.
Edit: As per the suggestion from @Thomas in the comments, you can directly generate a number (long) in the range you need: 0 to 9_999_999_999. You can see in this question how to do it. Stringify the resulting number and, if its length is less than 10, pad it with leading zeroes.
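To make the steps above concrete, here is a minimal sketch, not a definitive implementation. The class name UniqueCodes and the method generate are made up for illustration; it assumes that a code is any 10-digit string, including ones with leading zeroes.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: generates distinct 10-digit codes using a HashSet.
class UniqueCodes {
    static Set<String> generate(int count) {
        // Extra headroom so the hash table does not need to grow while filling.
        Set<String> codes = new HashSet<>(count * 2);
        while (codes.size() < count) {
            // Uniform value in the range 0 .. 9_999_999_999
            long n = ThreadLocalRandom.current().nextLong(10_000_000_000L);
            // %010d pads with leading zeroes to exactly 10 digits;
            // add() silently ignores a code that is already present.
            codes.add(String.format("%010d", n));
        }
        return codes;
    }

    public static void main(String[] args) {
        Set<String> codes = generate(10_000);
        System.out.println(codes.size() + " codes generated");
    }
}
```

With 10,000 codes drawn from 10 billion possibilities, duplicates are rare, so the loop should finish after little more than 10,000 iterations.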

Example:
(use ConcurrentHashMap, use threads, use random.nextLong())
public class Main {
static Map<String,Object> hashMapCache = new ConcurrentHashMap<String,Object>();
public static void main(String[] args) {
Random random = new Random();
// This holds all the codes
ArrayList<String> database = new ArrayList<>();
int counter = 0;
int NumOfThreads = 20;
int total = 10000;
int numberOfCreationsForThread = total/NumOfThreads;
int leftOver = total%NumOfThreads;
List<Thread> threadList = new ArrayList<>();
for(int i=0;i<NumOfThreads;i++){
if(i==0){
threadList.add(new Thread(new OneThread(numberOfCreationsForThread+leftOver,hashMapCache)));
}else {
threadList.add(new Thread(new OneThread(numberOfCreationsForThread,hashMapCache)));
}
}
for(int i=0;i<NumOfThreads;i++){
threadList.get(i).start();
}
for(int i=0;i<NumOfThreads;i++){
try {
threadList.get(i).join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
for(String key : hashMapCache.keySet()){
database.add(key);
}
System.out.println("Done");
}}
OneThread:
public class OneThread implements Runnable{
int numberOfCreations;
Map<String,Object> hashMapCache;
public OneThread(int numberOfCreations,Map<String,Object> hashMapCache){
this.numberOfCreations = numberOfCreations;
this.hashMapCache = hashMapCache;
}
@Override
public void run() {
int counter = 0;
Random random = new Random();
System.out.println("thread "+ Thread.currentThread().getId() + " Start with " +numberOfCreations);
while(counter < numberOfCreations){
String code = generateRandom(random);
while (code.length()!=10){
code = generateRandom(random);
}
// Atomically add the code only if it is not in the map yet.
// A separate get() followed by put() would let two threads both count the same code.
if(hashMapCache.putIfAbsent(code,new Object())==null){
counter++;
}
}
System.out.println("thread "+ Thread.currentThread().getId() + " end with " +numberOfCreations);
}
private static String generateRandom(Random random){
return String.valueOf(digits(random.nextLong(),10));
}
/** Returns the first `digits` decimal digits of the absolute value of val. */
private static String digits(long val, int digits) {
val = val > 0 ? val : val*-1;
return Long.toString(val).substring(0,digits);
}
}
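A shorter variant of the same idea can be sketched with a concurrent set, whose add() reports whether the element was new, so no separate existence check is needed. This is only a sketch: the class ParallelCodes and its generate method are hypothetical names, and for only 10,000 codes the single-threaded HashSet version is likely fast enough on its own.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: several threads fill one shared, thread-safe set.
class ParallelCodes {
    static Set<String> generate(int total, int threads) {
        Set<String> codes = ConcurrentHashMap.newKeySet(total);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            // The first worker also takes the remainder of the division.
            int quota = total / threads + (t == 0 ? total % threads : 0);
            workers[t] = new Thread(() -> {
                int added = 0;
                while (added < quota) {
                    long n = ThreadLocalRandom.current().nextLong(10_000_000_000L);
                    // add() returns true only for the thread that actually inserted the code.
                    if (codes.add(String.format("%010d", n))) {
                        added++;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(generate(10_000, 4).size() + " codes generated");
    }
}
```

Because each worker counts only its own successful insertions, the quotas add up to exactly the requested total.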

Related

Compartmentalizing loops over a large iteration

The Goal of my question is to enhance the performance of my algorithm by splitting the range of my loop iterations over a large array list.
For example: I have an ArrayList with about 10 billion entries of long values. The goal is to run the loop over entries 0 to 100 million and output the result of whatever calculations happen inside the loop for those 100 million entries; then continue with 100 million to 200 million and output that result, then 200-300 million, 300-400 million, and so on.
After I get all the results (100 billion / 100 million of them), I can sum them up outside the loop, collecting the loop outputs in parallel.
I have tried to achieve something similar with a dynamic range-shift method, but I can't seem to get the logic fully implemented the way I would like.
public static void tt4() {
long essir2 = 0;
long essir3 = 0;
List cc = new ArrayList<>();
List<Long> range = new ArrayList<>();
// break point is a method that returns list values, it was converted to
// string because of some concatenations and would be converted back to long here
for (String ari1 : Breakpoint()) {
cc.add(Long.valueOf(ari1));
}
// the size of the List is huge about 1 trillion entries at the minimum
long hy = cc.size() - 1;
for (long k = 0; k < hy; k++) {
long t1 = (long) cc.get((int) k);
long t2 = (long) cc.get((int) (k + 1));
// My main question: I am trying to iterate the entire list in a dynamic way
// which would exclude repeated endpoints on each iteration.
range = LongStream.rangeClosed(t1 + 1, t2)
.boxed()
.collect(Collectors.toList());
for (long i : range) {
// Hard is another method call on the iteration
// complexcalc is a method as well
essir2 = complexcalc((int) i, (int) Hard(i));
essir3 += essir2;
}
}
System.out.println("\n" + essir3);
}
I don't have any errors, I am just looking for a way to enhance performance and time. I can do a million entries in under a second directly, but when I put the size I require it runs forever. The size I'm giving are abstracts to illustrate size magnitudes, I don't want opinions like a 100 billion is not much, if I can do a million under a second, I'm talking massively huge numbers I need to iterate over doing complex tasks and calls, I just need help with the logic I'm trying to achieve if I can.
One thing I would suggest right off the bat would be to store your Breakpoint return value inside a simple array rather than using a List. This should improve your execution time significantly:
List<Long> cc = new ArrayList<>();
for (String ari1 : Breakpoint()) {
cc.add(Long.valueOf(ari1));
}
Long[] ccArray = cc.toArray(new Long[0]);
I believe what you're looking for is to split your tasks across multiple threads. You can do this with ExecutorService "which simplifies the execution of tasks in asynchronous mode".
Note that I am not overly familiar with this whole concept but have experimented with it a bit recently and give you a quick draft of how you could implement this.
I welcome those more experienced with multi-threading to either correct this post or provide additional information in the comments to help improve this answer.
Runnable Task class
public class CompartmentalizationTask implements Runnable {
private final ArrayList<Long> cc;
private final long index;
public CompartmentalizationTask(ArrayList<Long> list, long index) {
this.cc = list;
this.index = index;
}
@Override
public void run() {
Main.compartmentalize(cc, index);
}
}
Main class
private static ExecutorService exeService = Executors.newCachedThreadPool();
private static List<Future> futureTasks = new ArrayList<>();
public static void tt4() throws ExecutionException, InterruptedException
{
long essir2 = 0;
long essir3 = 0;
ArrayList<Long> cc = new ArrayList<>();
List<Long> range = new ArrayList<>();
// break point is a method that returns list values, it was converted to
// string because of some concatenations and would be converted back to long here
for (String ari1 : Breakpoint()) {
cc.add(Long.valueOf(ari1));
}
// the size of the List is huge about 1 trillion entries at the minimum
long hy = cc.size() - 1;
for (long k = 0; k < hy; k++) {
futureTasks.add(Main.exeService.submit(new CompartmentalizationTask(cc, k)));
}
for (int i = 0; i < futureTasks.size(); i++) {
futureTasks.get(i).get();
}
exeService.shutdown();
}
public static long compartmentalize(ArrayList<Long> cc, long index)
{
long essir2;
long essir3 = 0; // partial sum for this interval only
long t1 = cc.get((int) index);
long t2 = cc.get((int) (index + 1));
// My main question: I am trying to iterate the entire list in a dynamic way
// which would exclude repeated endpoints on each iteration.
List<Long> range = LongStream.rangeClosed(t1 + 1, t2)
.boxed()
.collect(Collectors.toList());
for (long i : range) {
// Hard is another method call on the iteration
// complexcalc is a method as well
essir2 = complexcalc((int) i, (int) Hard(i));
essir3 += essir2;
}
// Return the partial sum; the caller still has to collect these and add them up
return essir3;
}
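One way to get the partial results back and sum them outside the loop is to submit Callable tasks instead of Runnable ones. The following is a rough sketch under stated assumptions: ChunkedSum, sum and work are hypothetical names, and work(i) is only a stand-in for the complexcalc/Hard calls from the question, whose implementations are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedSum {
    // Hypothetical stand-in for the per-element calculation in the question.
    static long work(long i) {
        return i % 7;
    }

    // Sums work(i) over every interval (breakpoints[k], breakpoints[k + 1]], one task per interval.
    static long sum(long[] breakpoints, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int k = 0; k + 1 < breakpoints.length; k++) {
                final long from = breakpoints[k] + 1; // skip the endpoint already covered
                final long to = breakpoints[k + 1];
                Callable<Long> task = () -> {
                    long local = 0;
                    // Plain loop over the range: no boxed list is materialized.
                    for (long i = from; i <= to; i++) {
                        local += work(i);
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that interval is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(new long[] {0, 1_000_000, 2_000_000, 3_000_000}, 4));
    }
}
```

A fixed-size pool is used here rather than a cached one so that submitting many intervals does not create one thread per task.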

Creating variables with random string - Nothing will be printed out on the screen

I'm creating Java Program which one prints out as many For loops as user wants to. I'm creating variables in for wih the random -generated string. The user can choose how long the variable will be. My problem is, when I try list my variables for testing this code work, I can't see anything printed out on the screen when I'm using listVariables() -method. If I try to put System.out.println inside the generateVariables(), The new generated strings are in the ArrayList -vector. The code is clear, and it will run on console, but nothing seems to be printed out. Where's the catch?
Here is my code
import java.util.Random;
import java.util.ArrayList;
import java.lang.StringBuffer;
public class Silmukkageneraattori {
//Attributes
//Change the value for how many loops you want to create?
private static final int howManyLoops = 5;
//Next string -array includes all the generated variables;
private static ArrayList<String> variables = new ArrayList<String>();
//Next value is how long variable name do you want to create?
private static final int howLongVariable = 2;
//Next method generates variables
public static void generateVariables(int how) {
String temp = null;
for (int x=0;x<how;x++) {
variables.add((String)createVariable());
}
}
//Next method creates variable
public static String createVariable() {
Random rand = new Random();
StringBuffer sb = new StringBuffer("");
String chars = "qwertyuiopasdfghjklzxcvbnm";
for (int x=0;x<howLongVariable;x++) {
sb.append(""+chars.charAt(rand.nextInt(chars.length()-1)));
}
return sb.toString();
}
//Method for listing created variables;
public static void listVariables(ArrayList<String> varB) {
for (int x=0;x>varB.size()-1;x++) {
String var = (String)varB.get(x).toString();
System.out.println(var);
}
}
//Main -method
public static void main(String[] args) {
generateVariables(howManyLoops);
listVariables(variables);
}
}
In the below for loop in this method:
public static void listVariables(ArrayList<String> varB) {
for (int x=0;x>varB.size()-1;x++) {
String var = (String)varB.get(x).toString();
System.out.println(var);
}
}
The loop body runs only while x is greater than varB.size()-1.
However, x starts at 0 and varB.size()-1 is 4 for your five variables, and that bound does not change within the loop. Therefore the condition is already false on the very first check, and the loop body never runs.
If you only want to get one element from varB, the first one, you could remove the for loop altogether and just do:
System.out.println(varB.get(0));
Change
for (int x=0;x>varB.size()-1;x++) {
to
for (int x=0;x<varB.size();x++) {
Note the change of > to < in the loop condition, and the bound varB.size() rather than varB.size()-1, so that the last element is printed too.
You have used the > condition, and 0 will never be greater than 4, so the loop body never runs. The loop should start at 0 and continue while x is less than varB.size():
for (int x=0;x<varB.size();x++)
instead of
for (int x=0;x>varB.size()-1;x++)
Your problem is in the listVariables method.
You are using a > (greater than) rather than a < (less than).
When listVariables is called, x is 0 and varB.size()-1 is 4 (the list holds 5 elements). At the test condition, the loop will not be entered since 0 is not greater than 4.
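For reference, a corrected version of the method could look like the sketch below. The wrapper class ListVariablesDemo and the returned count are additions for illustration only; the original method returned nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListVariablesDemo {
    // Prints every element and returns how many were printed.
    static int listVariables(List<String> varB) {
        int printed = 0;
        // '<' with the full size visits indexes 0 .. size() - 1, including the last one.
        for (int x = 0; x < varB.size(); x++) {
            System.out.println(varB.get(x));
            printed++;
        }
        return printed;
    }

    public static void main(String[] args) {
        List<String> variables = new ArrayList<>(Arrays.asList("qa", "zx", "pl", "mk", "we"));
        listVariables(variables);
    }
}
```

An enhanced for loop (for (String var : varB)) avoids the index bound altogether and would print the same elements.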

Searching and sorting through an Arraylist

I am a new Java programmer and I am working on a project that requires me to read a text file that has Movie Reviews.
Once I've read the file, I am asked to search and sort through the Array of movies and return the total number of reviews for each movie as well as the average rate for each movie.
The portion I am currently stuck on is iterating through the Array list.
I am using an inner and outer for loop and I seem to be getting an infinite loop.
I would appreciate a second set of eyes. I have been staring at this project for a few days now and starting to not see mistakes.
Here is the code:
import java.io.*;
import java.util.*;
import java.lang.*;
public class MovieReviewApp {
public static void main(String[] args)
{
String strline = "";
String[] result = null;
final String delimit = "\\s+\\|\\s+";
String title ="";
//int rating = (Integer.valueOf(- 1));
ArrayList<MovieReview> movies = new ArrayList<MovieReview>();
//ArrayList<String> titles = new ArrayList<String>();
//ArrayList<Integer> ratings = new ArrayList<Integer>();
//HashMap<String, Integer> hm = new HashMap<String, Integer>();
//ListMultimap<String, Integer> hm = ArrayListMultimap.create();
try
{
BufferedReader f = new BufferedReader(new FileReader("/Users/deborahjaffe/Desktop/Java/midterm/movieReviewData.txt"));
while(true)
{
strline = f.readLine(); // reads line by line of text file
if(strline == null)
{
break;
}
result = strline.split(delimit, 2); //creates two strings
//hm.put(result[0], new Integer [] {Integer.valueOf(result[1])});
//hm.put(result[0], Integer.valueOf(result[1]));
// titles.add(result[0]);
//ratings.add(Integer.valueOf(result[1]));
MovieReview m = new MovieReview(result[0]);
movies.add(m);
MovieReview m2 = new MovieReview();
int rating = Integer.valueOf(result[1]);
int sz = movies.size();
for (int i = 0; i < sz; i++)
{
for (int j = 0; j < sz; j++)
{
m2 = movies.get(i);
if (movies.contains(m2))
{
m2.addRating(rating);
}
else
{
movies.add(m2);
m2.addRating(rating);
}
}
}
movies.toString();
//Collections.sort(movies);
} //end while
f.close();
//Set<String> keys = hm.keySet();
//Collection<Integer> values = hm.values();
} //end of try
catch(FileNotFoundException e)
{
System.out.println("Error: File not found");
}
catch(IOException e)
{
System.out.println("Error opening a file.");
}
} // end main
} // end class
Read the file first and then iterate through the list or map for searching, sorting etc. In the above code, close the while loop before iterating through the list.
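As a rough sketch of that approach, the lines can first be read into a list and then grouped by title in a map. The names ReviewSummary and summarize are hypothetical, and the sketch assumes each line has the form "Title | rating", matching the delimiter used in the question; reading the file itself is left out.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReviewSummary {
    // Groups ratings by movie title; a TreeMap keeps the titles sorted.
    static Map<String, IntSummaryStatistics> summarize(List<String> lines) {
        Map<String, IntSummaryStatistics> byTitle = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("\\s+\\|\\s+", 2);
            if (parts.length < 2) {
                continue; // skip lines without a rating
            }
            // Create the statistics object on first sight of a title, then record the rating.
            byTitle.computeIfAbsent(parts[0], title -> new IntSummaryStatistics())
                   .accept(Integer.parseInt(parts[1].trim()));
        }
        return byTitle;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Jaws | 4", "Alien | 5", "Jaws | 2");
        summarize(lines).forEach((title, stats) ->
                System.out.println(title + ": " + stats.getCount()
                        + " reviews, average " + stats.getAverage()));
    }
}
```

IntSummaryStatistics tracks the count and the average together, so no nested loops over the list are needed.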
If you want to iterate through the ArrayList you can use an enhanced for-loop. Note: while inside an enhanced for-loop you must not add elements to or remove elements from the ArrayList; doing so typically makes the iterator throw a ConcurrentModificationException. This works for iterating through to read values, but not for adding values. Because you are changing the ArrayList inside the loop this won't work here, but I thought that you should know about it, if you don't already. The enhanced for-loop works like this (I will put the separate parts in curly brackets):
for({Object Type of ArrayList} {Dummy Value} : {name of ArrayList}), so it would look like this: for(MovieReview x: movies).
Regarding the inner part of this nested for-loop:
for (int i = 0; i < sz; i++)
{
for (int j = 0; j < sz; j++)
{
m2 = movies.get(i);
if (movies.contains(m2))
{
m2.addRating(rating);
}
else
{
movies.add(m2);
m2.addRating(rating);
}
}
}
Why do you have the inner part? The j variable is never used for anything so the for-loop seems kind of useless. Unless of course you made a mistake at the top of the inner loop and meant to have m2 = movies.get(j); but that seems unlikely.
In regard to the infinite loop: the way you wrote the for-loops, you should not be getting an infinite loop, because they all increment up to a fixed value that is reachable. Your while(true) loop looks infinite, but it exits through the break when strline is null, which readLine() returns at the end of the file. If you would rather have a loop condition than a break, you could read the file with a Scanner and use while(scannerName.hasNextLine()). Note that a Scanner is generally not faster than a BufferedReader; its advantage is convenience methods such as hasNext() and nextInt().
I hope this has helped. If you have other questions, let me know. Good luck.

Unique random number for a particular timestamp

I am kind of learning concepts of Random number generation & Multithreading in java.
The idea is to avoid generating a repeated random number in the range 1 to 1000 within a particular millisecond (assuming that no more than 50 items are processed, in a multithreaded way, per millisecond), so that the list of random numbers generated at a specific time is unique. Can you give me any ideas? I keep ending up with a couple of repeated random numbers within a particular millisecond (and the probability of that is considerable).
I have tried the following things, which failed.
Random random = new Random(System.nanoTime());
double randomNum = random.nextInt(999);
//
int min=1; int max=999;
double randomId = (int)Math.abs(Math.random()* (max - min + 1) + min);
//
Random random = new Random(System.nanoTime()); // also tried new Random();
double randomId = (int)Math.abs(random.nextDouble()* (max - min + 1) + min);
As I am appending the generated timestamp, in a multithreaded environment I see the same IDs (around 8-10 of them) being generated 2-4 times each for 5000+ unique data items.
First, you should use new Random(), since it looks like this (details depend on Java version):
public Random() { this(++seedUniquifier + System.nanoTime()); }
private static volatile long seedUniquifier = 8682522807148012L;
I.e. it already makes use of nanoTime() and makes sure different threads with the same nanoTime() result get different seeds, which new Random(System.nanoTime()) doesn't.
(EDIT: Pyranja pointed out this is a bug in Java 6, but it's fixed in Java 7:
public Random() {
this(seedUniquifier() ^ System.nanoTime());
}
private static long seedUniquifier() {
// L'Ecuyer, "Tables of Linear Congruential Generators of
// Different Sizes and Good Lattice Structure", 1999
for (;;) {
long current = seedUniquifier.get();
long next = current * 181783497276652981L;
if (seedUniquifier.compareAndSet(current, next))
return next;
}
}
private static final AtomicLong seedUniquifier
= new AtomicLong(8682522807148012L);
)
Second, if you generate 50 random numbers from 1 to 1000, the probability some numbers will be the same is quite high thanks to the birthday paradox.
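To put a number on that claim, the collision probability can be computed directly: the chance that all draws are distinct is the product of (range - i) / range over the draws. The class and method names below are made up for this sketch.

```java
class BirthdayCollision {
    // Probability that at least two of `draws` uniform picks from `range` values coincide.
    static double collisionProbability(int draws, int range) {
        double allDistinct = 1.0;
        for (int i = 0; i < draws; i++) {
            // The (i + 1)-th pick must avoid the i values already taken.
            allDistinct *= (double) (range - i) / range;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        System.out.println(collisionProbability(50, 1000));
    }
}
```

For 50 draws from 1000 values this comes out at roughly 0.71, so seeing at least one repeat is more likely than not.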
Third, if you just want unique ids, you could just use AtomicInteger counter instead of random numbers. Or if you want a random part to start with, append a counter as well to guarantee uniqueness.
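A counter-based ID along those lines might be sketched as follows. UniqueIds and next are hypothetical names, and the "timestamp-counter" format is only an assumption about what the ID should look like; an AtomicLong is used instead of an AtomicInteger to make overflow a non-issue.

```java
import java.util.concurrent.atomic.AtomicLong;

class UniqueIds {
    // Shared by all threads; getAndIncrement() never hands out the same value twice.
    private static final AtomicLong COUNTER = new AtomicLong();

    // Combines the current time with the counter, e.g. "1700000000000-42".
    static String next() {
        return System.currentTimeMillis() + "-" + COUNTER.getAndIncrement();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(next());
        }
    }
}
```

Uniqueness comes entirely from the counter; the timestamp part may repeat within a millisecond without causing duplicate IDs.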
This class will allow you to get nonrepeating values from a certain range until the whole range has been used. Once the range is used, it will be reinitialized.
Class comes along with a simple test.
If you want to make the class thread safe, just add synchronized to nextInt() declaration.
Then you can use the singleton pattern or just a static variable to access the generator from multiple threads. That way all your threads will use the same object and the same unique id pool.
public class NotRepeatingRandom {
int size;
int index;
List<Integer> vals;
Random gen = new Random();
public NotRepeatingRandom(int rangeMax) {
size = rangeMax;
index = rangeMax; // to force initial shuffle
vals = new ArrayList<Integer>(size);
fillBaseList();
}
private void fillBaseList() {
for (int a=0; a<size; a++) {
vals.add(a);
}
}
public int nextInt() {
if (index == vals.size()) {
Collections.shuffle(vals);
index = 0;
}
int val = vals.get(index);
index++;
return val;
}
public static void main(String[] args) {
NotRepeatingRandom gen = new NotRepeatingRandom(10);
for (int a=0; a<30; a++) {
System.out.println(gen.nextInt());
}
}
}
If I understand your question correctly, multiple threads are creating their own instances of Random class at the same time and all threads generate the same random number?
The same number is generated because all the Random instances were created at the same time, i.e. with the same seed.
To fix this, create only one instance of the Random class, shared by all threads, so that all your threads call nextDouble() on the same instance. The Random.nextDouble() method is thread safe and will implicitly update the seed with every call.
//create only one Random instance, seed is based on current time
public static final Random generator= new Random();
Now all threads should use the same instance:
double random=generator.nextDouble()

Return True or False Randomly

I need to create a Java method to return true or false randomly. How can I do this?
The class java.util.Random already has this functionality:
public boolean getRandomBoolean() {
Random random = new Random();
return random.nextBoolean();
}
However, it's not efficient to create a new Random instance each time you need a random boolean. Instead, create an attribute of type Random in the class that needs the random boolean, then use that instance for each new random boolean:
public class YourClass {
/* Other stuff here */
private Random random;
public YourClass() {
// ...
random = new Random();
}
public boolean getRandomBoolean() {
return random.nextBoolean();
}
/* More stuff here */
}
(Math.random() < 0.5) returns true or false randomly
This should do:
public boolean randomBoolean(){
return Math.random() < 0.5;
}
You can use the following code
public class RandomBoolean {
Random random = new Random();
public boolean getBoolean() {
return random.nextBoolean();
}
public static void main(String[] args) {
RandomBoolean randomBoolean = new RandomBoolean();
for (int i = 0; i < 10; i++) {
System.out.println(randomBoolean.getBoolean());
}
}
}
You will get it by this:
return Math.random() < 0.5;
You can use the following for an unbiased result:
Random random = new Random();
//For 50% chance of true
boolean chance50oftrue = (random.nextInt(2) == 0) ? true : false;
Note: random.nextInt(2) means that the number 2 is the bound. the counting starts at 0. So we have 2 possible numbers (0 and 1) and hence the probability is 50%!
If you want to give more probability to your result to be true (or false) you can adjust the above as following!
Random random = new Random();
//For 50% chance of true
boolean chance50oftrue = (random.nextInt(2) == 0) ? true : false;
//For 25% chance of true
boolean chance25oftrue = (random.nextInt(4) == 0) ? true : false;
//For 40% chance of true
boolean chance40oftrue = (random.nextInt(5) < 2) ? true : false;
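The same idea generalizes to an arbitrary probability by comparing nextDouble() against a threshold. This is a small sketch with hypothetical names (WeightedBoolean, chance), not part of the java.util.Random API itself.

```java
import java.util.Random;

class WeightedBoolean {
    private static final Random RANDOM = new Random();

    // Returns true with probability p (0.0 = never, 1.0 = always).
    static boolean chance(double p) {
        // nextDouble() is uniform in [0.0, 1.0), so the comparison succeeds with probability p.
        return RANDOM.nextDouble() < p;
    }

    public static void main(String[] args) {
        int hits = 0;
        for (int i = 0; i < 10_000; i++) {
            if (chance(0.4)) {
                hits++;
            }
        }
        System.out.println(hits + " of 10000 were true");
    }
}
```

With p = 0.4 the printed count should land near 4000, though the exact value varies from run to run.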
Java's Random class seeds itself from the system clock by default (System.nanoTime(), as far as I know). Similarly, one can use RAM information as a source of randomness. Just open Windows Task Manager, go to the Performance tab, and take a look at Physical Memory - Available: it changes continuously; most of the time the value updates about every second, and only in rare cases does it remain constant for a few seconds. Other values that change even more often are System Handles and Threads, but I did not find the cmd command to get their values. So in this example I will use the Available Physical Memory as a source of randomness.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
public class Main {
public String getAvailablePhysicalMemoryAsString() throws IOException
{
Process p = Runtime.getRuntime().exec("cmd /C systeminfo | find \"Available Physical Memory\"");
BufferedReader in =
new BufferedReader(new InputStreamReader(p.getInputStream()));
return in.readLine();
}
public int getAvailablePhysicalMemoryValue() throws IOException
{
String text = getAvailablePhysicalMemoryAsString();
int begin = text.indexOf(":")+1;
int end = text.lastIndexOf("MB");
String value = text.substring(begin, end).trim();
int intValue = Integer.parseInt(value);
System.out.println("available physical memory in MB = "+intValue);
return intValue;
}
public boolean getRandomBoolean() throws IOException
{
int randomInt = getAvailablePhysicalMemoryValue();
return (randomInt%2==1);
}
public static void main(String args[]) throws IOException
{
Main m = new Main();
while(true)
{
System.out.println(m.getRandomBoolean());
}
}
}
As you can see, the core part is running the cmd systeminfo command, with Runtime.getRuntime().exec().
For the sake of brevity, I have omitted try-catch statements. I ran this program several times and no error occurred - there is always an 'Available Physical Memory' line in the output of the cmd command.
Possible drawbacks:
There is some delay in executing this program. Please notice that in the main() function , inside the while(true) loop, there is no Thread.sleep() and still, output is printed to console only about once a second or so.
The available memory might be constant for a newly opened OS session - please verify. I have only a few programs running, and the value is changing about every second. I guess if you run this program in a Server environment, getting a different value for every call should not be a problem.
ThreadLocalRandom.current().nextBoolean()
To avoid recreating Random objects, use ThreadLocalRandom. Every thread has just one such object.
boolean rando = ThreadLocalRandom.current().nextBoolean() ;
That code is so short and easy to remember that you need not bother to create a dedicated method as asked in the Question.
