I made two threads: one gets data and the other saves data.
My problem is that the data read by the first thread is not being handled properly while it is stored.
I want to extract 1,000,000 elements and write them to a file. Because the elements are so large, I divide them into chunks of 100,000, so the loop runs 10 times. One thread reads 100,000 entries at a time from another server; the other thread takes the data from the first thread and writes it to a file.
My original scenario is below:
The first thread reads the total key/value count.
It will be somewhere between 100,000 and 1,000,000; assume I have to process 1,000,000 entries, so count is set to 1,000,000. The first thread divides this by 100,000, reads the data from the server 100,000 entries at a time, and then calls setData(Key, Value map).
It loops 10 times.
The second thread also loops 10 times. First it gets the data by calling getMap(), then it calls writeSeq(hashmap), which writes the data to a writer stream (not yet flushed). The problem is here: getMap() returns the expected size, but writeSeq() does not process all of the values. When I get 100,000 entries, the number actually processed looks random: 100, 1500, 0, 8203, ...
First Thread is below:
public void run() {
    getValueCount(); // initialize the total value count
    while (this.jobFlag) {
        getSortedMap(this.count); // count starts at the total number of elements,
                                  // e.g. 1,000,000, and is decreased by 100,000 per pass.
                                  // setMap() is also called inside this method.
        if (!jobFlag) // once all processing is done, jobFlag is set to false
            break;
    }
    resetValue();
}
Second Thread is below:
public void run() {
    setWriter(); // create the writer stream
    double count = 10; // number of loop iterations
    ConcurrentHashMap<String, String> hash = new ConcurrentHashMap<String, String>();
    for (int i = 0; i <= count - 1; i++) {
        hash = share.getMap();
        writeSeq(hash);
    }
    closeWriter(); // close the writer stream
}
This is shared source:
import java.util.HashMap;
import java.util.concurrent.ConcurrentHashMap;

public class ShareData {
    ConcurrentHashMap<String, String> map;

    public synchronized ConcurrentHashMap<String, String> getMap() {
        if (this.map == null) {
            try {
                wait();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        ConcurrentHashMap<String, String> hashmap = map;
        this.map = null;
        return hashmap;
    }

    public synchronized void setMap(ConcurrentHashMap<String, String> KV) {
        if (this.map != null) {
            try {
                wait();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        this.map = KV;
        notify();
    }
}
After that, the second thread, which saves the data, is started. The size of KV is fine, but not all of the values are processed in the forEach. Also, each time I create the file its size is different. Is it a problem with synchronized?
public synchronized void writeSeq(ConcurrentHashMap<String, String> KV) {
    AtomicInteger a = new AtomicInteger(0);
    System.out.println(KV.size()); // e.g. 65300
    KV.entrySet().parallelStream().forEach(
        entry -> {
            try {
                a.incrementAndGet();
                writer.append(new Text(entry.getKey()), new Text(entry.getValue()));
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
    System.out.println(a.get()); // e.g. 1300
    i = 0;
    notify();
}
The size of KV is fine, but not all of the values are processed in the forEach. Also, each time I create the file its size is different. Is it a problem with synchronized?
Unclear. I can see a small issue but it is not likely to cause the problem you describe.
The if (map == null) wait(); code should be a while loop.
The if (map != null) wait(); code should be a while loop.
The issue is that if one thread gets a spurious notify, it may proceed with map in the wrong state. You need to retry the test. (If you read the javadoc for Object, you will see an example that correctly implements a condition variable.)
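A minimal sketch of that correction, keeping the ShareData hand-off from the question. Note two assumptions on my part: I also added a notifyAll() in getMap(), which the posted code appears to be missing, and both methods now propagate InterruptedException instead of swallowing it.

import java.util.concurrent.ConcurrentHashMap;

public class ShareData {
    private ConcurrentHashMap<String, String> map;

    public synchronized ConcurrentHashMap<String, String> getMap() throws InterruptedException {
        while (this.map == null) {   // re-test the condition after every wakeup
            wait();
        }
        ConcurrentHashMap<String, String> hashmap = this.map;
        this.map = null;
        notifyAll();                 // the slot is free again; wake a waiting producer
        return hashmap;
    }

    public synchronized void setMap(ConcurrentHashMap<String, String> KV) throws InterruptedException {
        while (this.map != null) {   // re-test the condition after every wakeup
            wait();
        }
        this.map = KV;
        notifyAll();                 // data is available; wake a waiting consumer
    }
}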
Apart from that, the root cause of your problem does not appear to be in the code that you have shown us.
However, if I was to take a guess, my guess would be that one thread is adding or removing entries in the ConcurrentHashMap while the second thread is processing it¹. The getMap / setMap methods you have shown us have to be used appropriately (i.e. called at appropriate points with appropriate arguments) to avoid the two threads interfering with each other. You haven't shown us that code.
So, if my guess is correct, your problem is a logic error rather than a low level synchronization problem. But if you need a better answer you will need to write and post a proper MCVE.
¹ A ConcurrentHashMap's iterators are weakly consistent. This means that if you update the map while iterating, you may miss entries in the iteration, or possibly see them more than once.
A better way is to use a BlockingQueue: one thread puts into the queue, the other thread takes from it.
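For example, here is a rough sketch of that hand-off for the batch scenario above; the class and method names are placeholders of mine, and a capacity of 1 mimics the original one-batch-at-a-time behaviour:

import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

public class BatchHandoff {
    // A single slot reproduces the "one batch at a time" hand-off.
    private final BlockingQueue<Map<String, String>> batches =
            new ArrayBlockingQueue<Map<String, String>>(1);

    // Called by the reader thread: blocks while the previous batch is unconsumed.
    public void publish(ConcurrentHashMap<String, String> batch) throws InterruptedException {
        batches.put(batch);
    }

    // Called by the writer thread: blocks until a batch is available.
    public Map<String, String> nextBatch() throws InterruptedException {
        return batches.take();
    }
}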
i++; is not thread-safe. You will get a lower count than the number of updates. Use AtomicInteger and its incrementAndGet() method instead.
I am quite new to threads in Java. I am using an API that uses a thread internally and listens for data from the counterparty; I put this data in a queue for further processing. I have created another thread that continuously reads this queue, processes the data, and writes the results into a text file. I use a while(true) loop in that thread, which causes one hundred per cent CPU usage, and if I add sleep(10) it adds latency that keeps increasing over time, as I receive about 20 data items per second.
public void run() {
    while (true) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e2) {
            e2.printStackTrace();
        }
        if (!(queue.isEmpty())) {
            Tick quote = queue.take();
            processTuple(quote);
        }
    } // end while(true)
} // end run()
Could anyone suggest a solution that reduces CPU usage without adding latency?
Check out ArrayBlockingQueue.
EDIT:
Example of how to use a queue based on your code:
LinkedBlockingQueue<Tick> queue;

public void run() {
    while (true) {
        try {
            // No need to check the queue and no need to sleep():
            // take() blocks until an element is available.
            Tick quote = queue.take();
            processTuple(quote);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
    }
}
Yes. Use a BlockingQueue implementation instead of a busy-wait; a while(true) polling loop keeps the thread scheduled.
Use a blocking queue implementation instead of polling in your thread. See this link to learn more about the available queue implementations. You can use ArrayBlockingQueue.
You may change your code something like this:
BlockingQueue<Tick> queue = ..

public void run()
{
    try {
        while (true) {
            Tick quote = queue.take();
            if (quote == someSpecialObjectToIndicateStop)
                break; // to stop this thread (a "poison pill" object)
            processTuple(quote);
        }
    } catch (InterruptedException e) {
        // alternatively, stop the thread when it is interrupted
    }
}
See BlockingQueue documentation here
I am writing a program in Java where I have a HashMap<String, Deque<Integer>> info;
My data is a list of Wikipedia pages that were visited within a one-hour period, along with a count of how many times each was visited.
de Florian_David_Fitz 18
de G%C3%BCnther_Jauch 1
de Gangs_of_New_York 2
de Georg_VI._(Vereinigtes_K%C3%B6nigreich) 7
de Gerry_Rafferty 2
This data gets stored in the HashMap from above with the page name as key and the Deque updated hourly with the number of visits that hour.
I want to have one thread ThreadRead that reads input files and stores the info in the HashMap. And then one ThreadCompute thread for each key in the HashMap that consumes the associated Deque.
ThreadRead needs to lock all ThreadComputes while active, then wake them up when finished so the ThreadComputes can work concurrently.
If I need a different mutex for each ThreadCompute then how can I keep all of them locked while ThreadRead works? And how can I wake up all the ThreadComputes from ThreadRead when is done?
I have used info as a lock for ThreadRead, and info.get(key) for each ThreadCompute, but it is not working as I expected.
Edit:
I added some code to make the problem clearer. This is what I have at the moment:
HashMap<String, Deque<Integer>> info;
boolean controlCompute, controlRead;
private static class ThreadRead extends Thread {
    public void run() {
        while (controlRead) {
            try {
                read();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public void read() throws InterruptedException {
        synchronized (info) {
            while (count == numThreads) {
                for (File file : files) {
                    reader.parse(file, info); // reads the file and stores the data in the HashMap
                    keys = true;
                    while (info.getSizeDeque() > 10) {
                        count = 0;
                        info.wait();
                        info.notifyAll();
                    }
                }
            }
            controlRead = false;
        }
    }
}
private static class ThreadCompute extends Thread {
    public String key;

    public void run() {
        while (controlCompute) {
            try {
                compute();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public void compute() throws InterruptedException {
        synchronized (info.get(key)) {
            if (count != numThreads) {
                algorithms(); // here I apply the algorithms to the integers in the deque
                if (controlRead) {
                    info.get(key).removeFirst();
                    count++;
                    if (count == numThreads) {
                        info.notify();
                        info.get(key).wait();
                    }
                    info.get(key).wait();
                }
                if (info.isEmptyDeque(key)) {
                    controlCompute = false;
                }
            }
        }
    }
}
Class java.util.concurrent.locks.ReentrantReadWriteLock is good for this kind of problem. There should be exactly one instance to guard the whole HashMap. The file reader needs to acquire the write lock of the ReadWriteLock because it wants to modify the map. The other threads need to each acquire their own read lock from the one ReadWriteLock.
All your threads must be careful to limit as much as possible the scope in which they hold their locks, so in particular, the file-read thread should acquire the write lock immediately before modifying the map, hold it until all modifications for one entry are complete, then release it. The other threads don't block each other, so they could in principle hold their locks longer, but doing so will block the file reader.
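A minimal sketch of that arrangement, reusing the info map from the question; the class name VisitStore, the method names, and the bodies marked as comments are placeholders:

import java.io.File;
import java.util.Deque;
import java.util.HashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class VisitStore {
    private final HashMap<String, Deque<Integer>> info = new HashMap<String, Deque<Integer>>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // File-reading thread: exclusive access while it modifies the map.
    public void readFile(File file) {
        lock.writeLock().lock();
        try {
            // parse the file and update the deques in info
        } finally {
            lock.writeLock().unlock();
        }
    }

    // One compute thread per key: these run concurrently with each other,
    // but never while the write lock is held.
    public void compute(String key) {
        lock.readLock().lock();
        try {
            Deque<Integer> counts = info.get(key);
            // apply the algorithms to counts
        } finally {
            lock.readLock().unlock();
        }
    }
}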
I have an ArrayList that I am constantly adding to and removing from in separate threads.
One thread adds, and the other removes.
This is the class that contains the changing list:
public class DataReceiver {
    private static final String DEBUG_TAG = "DataReceiver";

    // Class variables
    private volatile ArrayList<Byte> buffer;
    //private volatile Semaphore dataAmount;

    public DataReceiver() {
        this.buffer = new ArrayList<Byte>();
        //this.dataAmount = new Semaphore(0, true);
    }

    // Adds a data sample to the data buffer.
    public final void addData(byte[] newData, int bytes) {
        int newDataPos = 0;
        // While there is still data
        while (newDataPos < bytes) {
            // Fill the data buffer array with the new data
            buffer.add(newData[newDataPos]);
            newDataPos++;
            //dataAmount.release();
        }
        return;
    }

    public synchronized byte getDataByte() {
        /*
        try {
            dataAmount.acquire();
        }
        catch(InterruptedException e) {
            return 0;
        }
        */
        while (buffer.size() == 0) {
            try {
                Thread.sleep(250);
            } catch (Exception e) {
                Log.d(DEBUG_TAG, "getDataByte: failed to sleep");
            }
        }
        return buffer.remove(0);
    }
}
The problem is that I get a NullPointerException every so often when trying buffer.remove(0). As you can tell from the comments in the code, I tried using a semaphore at one point, but it still intermittently threw NullPointerExceptions, so I created my own sleep-and-poll approach as a semi-proof-of-concept.
I do not understand why a NullPointerException would occur and/or how to fix it.
If you are handling the object initialization in a different thread, it is possible that the constructor has not finished before
public synchronized byte getDataByte()
is called, causing the NullPointerException because
this.buffer = new ArrayList<Byte>();
has not run yet.
I have a guess as to an explanation. I would do it in comments, but I don't have enough reputation, so hopefully this answer is helpful.
First of all, if you were to declare the addData() function as synchronized, would your problem go away? My guess is that it would.
My theory is that although you declared buffer as volatile, that is not sufficient protection for your use case. Imagine this case:
addData() gets called and is calling buffer.add()
at the same time, getDataByte() is checking buffer.size() == 0
My theory is that buffer.add() is not an atomic operation. Somewhere during the buffer.add() operation, its internal size counter increments, enabling the buffer.size() == 0 check in getDataByte() to return false. On occasion, getDataByte() continues with its buffer.remove() call before your buffer.add() call completes.
This is based on an excerpt I read here:
https://www.ibm.com/developerworks/java/library/j-jtp06197/
"While the increment operation (x++) may look like a single operation, it is really a compound read-modify-write sequence of operations that must execute atomically -- and volatile does not provide the necessary atomicity."
Say I have an AtomicReference to a list of objects:
AtomicReference<List<?>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>());
Thread A adds elements to this list: batch.get().add(o);
Later, thread B takes the list and, for example, stores it in a DB: insertBatch(batch.get());
Do I have to do additional synchronization when writing (Thread A) and reading (Thread B) to ensure thread B sees the list the way A left it, or is this taken care of by the AtomicReference?
In other words: if I have an AtomicReference to a mutable object, and one thread changes that object, do other threads see this change immediately?
Edit:
Maybe some example code is in order:
public void process(Reader in) throws IOException {
    List<Future<AtomicReference<List<Object>>>> tasks = new ArrayList<Future<AtomicReference<List<Object>>>>();
    ExecutorService exec = Executors.newFixedThreadPool(4);

    for (int i = 0; i < 4; ++i) {
        tasks.add(exec.submit(new Callable<AtomicReference<List<Object>>>() {
            @Override public AtomicReference<List<Object>> call() throws IOException {
                final AtomicReference<List<Object>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>(batchSize));
                Processor.this.parser.parse(in, new Parser.Handler() {
                    @Override public void onNewObject(Object event) {
                        batch.get().add(event);
                        if (batch.get().size() >= batchSize) {
                            dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));
                        }
                    }
                });
                return batch;
            }
        }));
    }

    List<Object> remainingBatches = new ArrayList<Object>();

    for (Future<AtomicReference<List<Object>>> task : tasks) {
        try {
            AtomicReference<List<Object>> remainingBatch = task.get();
            remainingBatches.addAll(remainingBatch.get());
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw (RuntimeException) cause;
        }
    }

    // these haven't been flushed yet by the worker threads
    if (!remainingBatches.isEmpty()) {
        dao.insertBatch(remainingBatches);
    }
}
What happens here is that I create four worker threads to parse some text (this is the Reader in parameter to the process() method). Each worker saves the lines it has parsed in a batch, and flushes the batch when it is full (dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));).
Since the number of lines in the text isn't a multiple of the batch size, the last objects end up in a batch that isn't flushed, since it's not full. These remaining batches are therefore inserted by the main thread.
I use AtomicReference.getAndSet() to replace the full batch with an empty one. Is this program correct with regard to threading?
Um... it doesn't really work like this. AtomicReference guarantees that the reference itself is visible across threads i.e. if you assign it a different reference than the original one the update will be visible. It makes no guarantees about the actual contents of the object that reference is pointing to.
Therefore, read/write operations on the list contents require separate synchronization.
Edit: So, judging from your updated code and the comment you posted, setting the local reference to volatile is sufficient to ensure visibility.
I think that, forgetting all the code here, your exact question is this:
Do I have to do additional synchronization when writing (Thread A) and
reading (Thread B) to ensure thread B sees the list the way A left it,
or is this taken care of by the AtomicReference?
So, the exact response to that is: YES, atomics take care of visibility. And that is not just my opinion but the JDK documentation's:
The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification, Third Edition (17.4 Memory Model).
I hope this helps.
Adding to Tudor's answer: You will have to make the ArrayList itself threadsafe or - depending on your requirements - even larger code blocks.
If you can get away with a threadsafe ArrayList you can "decorate" it like this:
batch = java.util.Collections.synchronizedList(new ArrayList<Object>());
But keep in mind: even "simple" compound constructs like this are not thread-safe:
Object o = batch.get(batch.size()-1);
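For illustration, a small sketch of how such compound operations (and iteration, which the Collections.synchronizedList documentation says must be synchronized manually) can be guarded by locking on the list itself; the class name and sample values here are my own:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SyncListDemo {
    public static void main(String[] args) {
        List<Object> batch = Collections.synchronizedList(new ArrayList<Object>());
        batch.add("first");

        // size() and get() are each thread-safe, but the pair is not,
        // so the compound operation needs an explicit lock on the list:
        Object last = null;
        synchronized (batch) {
            if (!batch.isEmpty()) {
                last = batch.get(batch.size() - 1);
            }
        }

        // Iteration must also hold the list's lock:
        synchronized (batch) {
            for (Object o : batch) {
                System.out.println(o);
            }
        }
        System.out.println(last);
    }
}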
The AtomicReference will only help you with the reference to the list, it will not do anything to the list itself. More particularly, in your scenario, you will almost certainly run into problems when the system is under load where the consumer has taken the list while the producer is adding an item to it.
This sounds to me like you should be using a BlockingQueue. You can then limit the memory footprint if your producer is faster than your consumer and let the queue handle all contention.
Something like:
ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (50);
// ... Producer
queue.put(o);
// ... Consumer
List<Object> queueContents = new ArrayList<Object> ();
// Grab everything waiting in the queue in one chunk. Should never be more than 50 items.
queue.drainTo(queueContents);
Added
Thanks to @Tudor for pointing out the architecture you are using. ... I have to admit it is rather strange. You don't really need AtomicReference at all as far as I can see. Each thread owns its own ArrayList until it is passed on to dao, at which point it is replaced, so there is no contention at all anywhere.
I am a little concerned about you creating four parsers on a single Reader. I hope you have some way of ensuring each parser does not affect the others.
I personally would use some form of producer-consumer pattern as I have described in the code above. Something like this perhaps.
static final int PROCESSES = 4;
static final int batchSize = 10;

public void process(Reader in) throws IOException, InterruptedException {
    final List<Future<Void>> tasks = new ArrayList<Future<Void>>();
    ExecutorService exec = Executors.newFixedThreadPool(PROCESSES);
    // Queue of objects.
    final ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object>(batchSize * 2);
    // The final object to post.
    final Object FINISHED = new Object();

    // Start the producers.
    for (int i = 0; i < PROCESSES; i++) {
        tasks.add(exec.submit(new Callable<Void>() {
            @Override
            public Void call() throws IOException {
                Processor.this.parser.parse(in, new Parser.Handler() {
                    @Override
                    public void onNewObject(Object event) {
                        queue.add(event);
                    }
                });
                // Post a FINISHED marker down the queue.
                queue.add(FINISHED);
                return null;
            }
        }));
    }

    // Start the consumer.
    tasks.add(exec.submit(new Callable<Void>() {
        @Override
        public Void call() throws IOException, InterruptedException {
            List<Object> batch = new ArrayList<Object>(batchSize);
            int finishedCount = 0;
            // Until all threads have finished.
            while (finishedCount < PROCESSES) {
                Object o = queue.take();
                if (o != FINISHED) {
                    // Batch them up.
                    batch.add(o);
                    if (batch.size() >= batchSize) {
                        dao.insertBatch(batch);
                        // If insertBatch takes a copy we could merely clear it.
                        batch = new ArrayList<Object>(batchSize);
                    }
                } else {
                    // Count the finishes.
                    finishedCount += 1;
                }
            }
            // Finished! Post any incomplete batch.
            if (batch.size() > 0) {
                dao.insertBatch(batch);
            }
            return null;
        }
    }));

    // Wait for everything to finish.
    exec.shutdown();
    // Wait until all is done.
    boolean finished = false;
    do {
        try {
            // Wait up to 1 second for termination.
            finished = exec.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
        }
    } while (!finished);
}
This piece of code:
synchronized (mList) {
    if (mList.size() != 0) {
        int s = mList.size() - 1;
        for (int i = s; i > 0; i -= OFFSET) {
            mList.get(i).doDraw(canv);
        }
        getHead().drawHead(canv);
    }
}
Randomly throws AIOOBEs. From what I've read, the synchronized should prevent that, so what am I doing wrong?
Edits:
AIOOBE = Array Index Out Of Bounds Exception
The code's incomplete, cut down to what is needed. But to make you happy, OFFSET is 4, and just imagine that there is a for-loop adding a bit of data at the beginning. And a second thread reading and / or modifying the list.
Edit 2:
I've noticed it happens when the list is being drawn and the current game ends. The draw thread hasn't drawn all the elements when the list is emptied. Is there a way of telling the game to wait with emptying the list until it's empty?
Edit 3:
I've just noticed that I'm not sure if this is a multi-threading problem. Seems I only have 2 threads, one for calculating and drawing and one for user input.. Gonna have to look into this a bit more than I thought.
What you're doing looks right... but that's all:
It doesn't matter on what object you synchronize, it needn't be the list itself.
What does matter is if all threads always synchronize on the same object, when accessing a shared resource.
Any access to Swing (or another graphics library) must happen on the AWT thread.
To your edit:
I've noticed it happens when the list is being drawn and the current game ends. The draw thread hasn't drawn all the elements when the list is emptied. Is there a way of telling the game to wait with emptying the list until it's empty?
I think you mean "...wait with emptying the list until the drawing has completed." Just synchronize the code doing it on the same lock (i.e., the list itself in your case).
Again: Any access to a shared resource must be protected somehow. It seems like you're using synchronized just here and not where you're emptying the list.
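For instance, a sketch of what that might look like, reusing the names from the question (mList, OFFSET, canv), so this is a fragment rather than a complete program: the code that empties the list at the end of the game takes the same lock as the drawing pass, so it can never run in the middle of a draw.

// Drawing thread (as in the question): holds the list's lock for the whole pass.
synchronized (mList) {
    if (mList.size() != 0) {
        int s = mList.size() - 1;
        for (int i = s; i > 0; i -= OFFSET) {
            mList.get(i).doDraw(canv);
        }
        getHead().drawHead(canv);
    }
}

// Game-over code: takes the same lock, so it cannot empty the list mid-draw.
synchronized (mList) {
    mList.clear();
}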
The safe solution is to only allow one thread to create objects, add and remove them from a List after the game has started.
I had problems myself with random AIOOBE errors, and no synchronization could solve them properly; it also slowed down the response to the user.
My solution, which is now stable and fast (I have never had an AIOOBE since), is to have the UI thread inform the game thread to create or manipulate an object by setting a flag and the coordinates of the touch in persistent variables.
Since the game thread loops about 60 times per second, this proved to be sufficient to pick up the message from the UI thread and act on it.
This is a very simple solution and it works great!
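A minimal sketch of that flag-based hand-off, under the assumption of a single pending touch stored in an AtomicReference; all names here (TouchCommand, GameState, onTouch, update) are hypothetical:

import java.util.concurrent.atomic.AtomicReference;

class TouchCommand {
    final float x, y;
    TouchCommand(float x, float y) { this.x = x; this.y = y; }
}

class GameState {
    // The UI thread writes this; the game loop reads and clears it.
    private final AtomicReference<TouchCommand> pendingTouch = new AtomicReference<TouchCommand>();

    // Called on the UI thread.
    void onTouch(float x, float y) {
        pendingTouch.set(new TouchCommand(x, y));
    }

    // Called by the game loop, roughly 60 times per second.
    void update() {
        TouchCommand cmd = pendingTouch.getAndSet(null);
        if (cmd != null) {
            // create or manipulate the object here, on the game thread only
        }
    }
}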
My suggestion is to use a BlockingQueue, and I think that is the solution you are looking for. How can you do it? It is already shown with an example in the javadoc :)
class Producer implements Runnable {
    private final BlockingQueue queue;
    Producer(BlockingQueue q) { queue = q; }
    public void run() {
        try {
            while (true) { queue.put(produce()); }
        } catch (InterruptedException ex) { ... handle ...}
    }
    Object produce() { ... }
}

class Consumer implements Runnable {
    private final BlockingQueue queue;
    Consumer(BlockingQueue q) { queue = q; }
    public void run() {
        try {
            while (true) { consume(queue.take()); }
        } catch (InterruptedException ex) { ... handle ...}
    }
    void consume(Object x) { ... }
}

class Setup {
    void main() {
        BlockingQueue q = new SomeQueueImplementation();
        Producer p = new Producer(q);
        Consumer c1 = new Consumer(q);
        Consumer c2 = new Consumer(q);
        new Thread(p).start();
        new Thread(c1).start();
        new Thread(c2).start();
    }
}
The beneficial thing for you is that you need not worry about synchronizing your mList. BlockingQueue offers a number of special methods for this; you can check them in the doc. A few words from the javadoc:
BlockingQueue methods come in four forms, with different ways of handling operations that cannot be satisfied immediately, but may be satisfied at some point in the future: one throws an exception, the second returns a special value (either null or false, depending on the operation), the third blocks the current thread indefinitely until the operation can succeed, and the fourth blocks for only a given maximum time limit before giving up.
To be on the safe side: I am not experienced with Android, so I am not certain whether all Java packages are available there, but this one at least should be, I hope.
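For illustration, a small sketch of those four forms on an ArrayBlockingQueue; the capacity of 1 and the 100 ms timeout are arbitrary choices for the example:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BlockingQueueForms {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> q = new ArrayBlockingQueue<String>(1);

        q.add("a");                                          // throws IllegalStateException if full
        boolean offered = q.offer("b");                      // returns false if full (special value)
        String taken = q.take();                             // blocks until an element is available
        String polled = q.poll(100, TimeUnit.MILLISECONDS);  // waits up to the timeout, then returns null

        System.out.println(offered + " " + taken + " " + polled); // false a null
    }
}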
You are getting the IndexOutOfBoundsException because there are 2 threads that operate on the list and are doing it wrongly.
You should have been synchronizing at another level, in such a way that no other thread can iterate through the list while another thread is modifying it! Only one thread at a time should 'work on' the list.
I guess you have the following situation:
// piece of code that adds some item to the list
synchronized (mList) {
    mList.add(1, drawableElem);
    ...
}
and
// code that iterates over your list (your code, simplified)
synchronized (mList) {
    if (mList.size() != 0) {
        int s = mList.size() - 1;
        for (int i = s; i > 0; i -= OFFSET) {
            mList.get(i).doDraw(canv);
        }
        getHead().drawHead(canv);
    }
}
Individually the pieces of code look fine. They seem thread-safe. But 2 individually thread-safe pieces of code might not be thread-safe at a higher level!
It's just as if you had done the following:
Vector v = new Vector();
if (v.size() == 0) {   // size() itself is thread-safe
    v.add("elem");     // add() itself is also thread-safe individually
}
BUT the compound operation is NOT!
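A minimal sketch of making that compound operation atomic by holding the Vector's own lock across both the check and the add; the class name and element value are my own:

import java.util.Vector;

public class CompoundOpDemo {
    public static void main(String[] args) {
        Vector<String> v = new Vector<String>();

        // The check-then-act sequence is atomic only if both steps
        // run while holding the same lock (Vector locks on itself).
        synchronized (v) {
            if (v.isEmpty()) {
                v.add("elem");
            }
        }
        System.out.println(v);
    }
}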