Sharing a resource among threads, different behavior in different Java versions

This is the first time I've encountered something like below.
Multiple threads (inner classes implementing Runnable) share a data structure (an instance variable of the outer class).
WORKING: I took the class files from the Eclipse project's bin folder and ran them on a Unix machine.
NOT WORKING: I compiled the source directly on the Unix machine and used those class files. The code compiles and then runs with no errors/warnings, but one thread is not able to access the shared resource properly.
PROBLEM: One thread adds elements to the common data structure. The second thread does the following...
while (true) {
    if (myArrayList.size() > 0) {
        // do stuff
    }
}
The log shows that the size is updated in Thread 1.
For some mystic reason, the workflow is not entering the if()...
The exact same code runs perfectly if I directly paste in the class files from Eclipse's bin folder.
I apologize if I missed anything obvious.
Code:
ArrayList<CSRequest> newCSRequests = new ArrayList<CSRequest>();

// Thread 1
private class ListeningSocketThread implements Runnable {

    ServerSocket listeningSocket;

    public void run() {
        try {
            LogUtil.log("Initiating...");
            init(); // creates socket
            processIncomingMessages();
            listeningSocket.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void processIncomingMessages() throws IOException {
        while (true) {
            try {
                processMessage(listeningSocket.accept());
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }
    }

    private void processMessage(Socket s) throws IOException, ClassNotFoundException {
        // read message
        ObjectInputStream ois = new ObjectInputStream(s.getInputStream());
        Object message = ois.readObject();
        LogUtil.log("adding...: before size: " + newCSRequests.size());
        synchronized (newCSRequests) {
            newCSRequests.add((CSRequest) message);
        }
        LogUtil.log("adding...: after size: " + newCSRequests.size()); // YES, THE SIZE IS UPDATED TO > 0
        // closing....
    }
    ........
}
// Thread 2
private class CSRequestResponder implements Runnable {

    public void run() {
        LogUtil.log("Initiating..."); // REACHES..
        while (true) {
            // LogUtil.log("inside while..."); // IF NOT COMMENTED, FLOODS THE CONSOLE WITH THIS MSG...
            if (newCSRequests.size() > 0) { // DOES NOT PASS
                LogUtil.log("inside if size > 0..."); // NEVER REACHES....
                try {
                    handleNewCSRequests();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
    ....
}
UPDATE
The solution was to add synchronized(myArrayList) around the size check in Thread 2.

To access a shared structure in a multi-threaded environment, you should use implicit or explicit locking to ensure safe publication and access among threads.
Using the code above, it should look like this:
while (true) {
    synchronized (myArrayList) {
        if (myArrayList.size() > 0) {
            // do stuff
        }
    }
    // sleep(...) // outside the lock!
}
Note: This pattern looks much like a producer-consumer and is better implemented using a queue. LinkedBlockingQueue is a good option for that and provides built-in concurrency control capabilities. It's a good structure for safe publishing of data among threads.
Using a concurrent data structure lets you get rid of the synchronized block:
BlockingQueue<Data> queue = new LinkedBlockingQueue<Data>(...); // take() is declared on BlockingQueue, not Queue
...
while (true) {
    Data data = queue.take(); // this will wait until there's data in the queue
    doStuff(data);
}

Every time you modify a shared variable inside a parallel region (a region where multiple threads run in parallel) you must ensure mutual exclusion. You can guarantee mutual exclusion in Java with synchronized or with locks; you normally use locks when you want finer-grained synchronization.
If the program only performs reads on a given shared variable, there is no need to synchronize/lock access to it.
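For example, a minimal sketch of both options (sharedList and item are placeholders, not names from the question):

// implicit locking: the intrinsic lock of the list object
synchronized (sharedList) {
    sharedList.add(item);
}

// explicit locking: java.util.concurrent.locks.ReentrantLock
ReentrantLock lock = new ReentrantLock();
lock.lock();
try {
    sharedList.add(item);
} finally {
    lock.unlock(); // always release in a finally block
}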
Since you are new to this subject, I recommend this tutorial.

If I got this right... there are at least two threads that work with the same shared data structure, the array you mentioned. One thread adds values to the array and the second thread "does stuff" if the size of the array is > 0.
There is a chance that the thread scheduler ran the second thread (the one that checks whether the collection size is > 0) before the first thread got a chance to run and add a value.
Running the classes from bin versus recompiling them has nothing to do with it. If you ran the application again from the bin directory, you might see the issue there too. How many times did you run the app?
It might not reproduce consistently, but at some point you might see the issue again.
You could access the data structure in a serial fashion, allowing only one thread at a time to access the array. Even that does not guarantee that the first thread will run first and only then the second one will check the size.
Depending on what you need to accomplish, there might be better or other ways to achieve it, not necessarily using an array to coordinate the threads.

Check the return value of
newCSRequests.add((CSRequest) message);
I am guessing it's possible that it didn't get added for some reason. If it were a HashSet or similar, it could be because the hashCode of multiple objects returns the same value. What is the equals implementation of the message object?
You could also use
List<CSRequest> list = Collections.synchronizedList(new ArrayList<CSRequest>());
to ensure the ArrayList is always synchronized correctly.
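Note that the synchronized wrapper only protects individual calls; iteration and other compound actions must still be locked manually, as the Javadoc points out. A sketch:

List<CSRequest> list = Collections.synchronizedList(new ArrayList<CSRequest>());
list.add(request);    // individual calls are thread-safe
synchronized (list) { // iteration still needs the list's own lock
    for (CSRequest r : list) {
        // do stuff
    }
}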
HTH

Related

System.out.println on my boolean made my program able to validate the boolean

My program is based on two threads that share a protocol object. Depending on a boolean in the shared protocol object I try to make the other thread wait before using the protocol.
Main:
GameProtocol protocol = new GameProtocol();
MyThreadedClass thread1 = new MyThreadedClass(protocol);
MyThreadedClass thread2 = new MyThreadedClass(protocol);
thread1.start();
thread2.start();
Thread class:
GameProtocol protocol;

private MyThreadedClass(GameProtocol protocol) {
    this.protocol = protocol;
}
private GamePackage waitCheck(GamePackage gp) {
    if (!gp.isWaiting()) {
        return protocol.update(gp);
    }
    while (protocol.waitForCategory) {
        // System.out.println(protocol.waitForCategory);
    }
    return protocol.update(gp);
}
Protocol class:
boolean waitForCategory = false;
public synchronized GamePackage update(GamePackage gp) {
    if (gp.myTurnToPickCategory) {
        gp.setWaiting(false);
        waitForCategory = true;
    } else {
        gp.setWaiting(true);
        waitForCategory = false;
    }
    return gp;
}
Now my intention is to make one thread wait until the other thread has used the update method a second time. But the second thread gets stuck in the while loop even though the boolean waitForCategory has been set to false. Once I added the line System.out.println(protocol.waitForCategory); it just started to work, and if I remove it, it stops working again. I can't seem to understand how a 'sout' on the boolean makes it work. If anyone understands this, would it be possible to solve it in another way? Having a sout inside a loop like that makes it messy.
As others have already explained, the introduction of println() inserts synchronization into the picture, so your code gives the illusion that it works.
In order to solve this problem you have to make sure everything is properly synchronized. In other words, gp.isWaiting() must also be synchronized, and protocol.waitForCategory must be moved into a method and synchronized.
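A minimal sketch of that second fix (the getter name is an assumption; for a simple flag like this, declaring the field volatile would also restore visibility):

// In GameProtocol: read the flag through a method synchronized on the same lock as update(...)
private boolean waitForCategory = false;

public synchronized boolean isWaitForCategory() {
    return waitForCategory; // now guaranteed to see the writes made in update(...)
}

The busy loop then becomes while (protocol.isWaitForCategory()) {}, though it still burns CPU; wait()/notifyAll() or the queue suggested below avoids the spinning entirely.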
Alternatively, quit trying to work with synchronization and use asynchronous message passing via java.util.concurrent.BlockingQueue instead. Your code will perform better, you will not be running the danger of race conditions, and your code will also be testable. (Whereas with synchronization your code will never be testable, because there is no test that will catch a race condition.)

How to deal with code that runs before foreach block in Apache Spark?

I'm trying to deal with some code that runs differently in Spark stand-alone mode and in Spark running on a cluster. Basically, for each item in an RDD, I'm trying to add it to a list, and once this is done, I want to send this list to Solr.
This works perfectly fine when I run the following code in stand-alone mode of Spark, but it does not work when the same code is run on a cluster. When I run the same code on a cluster, it is as if the "send to Solr" part of the code is executed before the list to be sent to Solr is filled with items. I tried to force execution with solrInputDocumentJavaRDD.collect() after the foreach, but it seems to have no effect.
// For each RDD
solrInputDocumentJavaDStream.foreachRDD(
    new Function<JavaRDD<SolrInputDocument>, Void>() {
        @Override
        public Void call(JavaRDD<SolrInputDocument> solrInputDocumentJavaRDD) throws Exception {
            // For each item in a single RDD
            solrInputDocumentJavaRDD.foreach(
                new VoidFunction<SolrInputDocument>() {
                    @Override
                    public void call(SolrInputDocument solrInputDocument) {
                        // Add the solrInputDocument to the list of SolrInputDocuments
                        SolrIndexerDriver.solrInputDocumentList.add(solrInputDocument);
                    }
                });
            // Try to force execution
            solrInputDocumentJavaRDD.collect();
            // After having finished adding every SolrInputDocument to the list,
            // add it to the solrServer, and commit, waiting for the commit to be flushed
            try {
                if (SolrIndexerDriver.solrInputDocumentList != null
                        && SolrIndexerDriver.solrInputDocumentList.size() > 0) {
                    SolrIndexerDriver.solrServer.add(SolrIndexerDriver.solrInputDocumentList);
                    SolrIndexerDriver.solrServer.commit(true, true);
                    SolrIndexerDriver.solrInputDocumentList.clear();
                }
            } catch (SolrServerException | IOException e) {
                e.printStackTrace();
            }
            return null;
        }
    }
);
What should I do so that the sending-to-Solr part executes after the list of SolrInputDocuments has been filled (and also works in cluster mode)?
As I mentioned on the Spark Mailing list:
I'm not familiar with the Solr API but provided that 'SolrIndexerDriver' is a singleton, I guess that what's going on when running on a cluster is that the call to:
SolrIndexerDriver.solrInputDocumentList.add(elem)
is happening on different singleton instances of the SolrIndexerDriver on different JVMs while
SolrIndexerDriver.solrServer.commit
is happening on the driver.
In practical terms, the lists on the executors are being filled in but they are never committed, while on the driver the opposite is happening.
The recommended way to handle this is to use foreachPartition like this:
rdd.foreachPartition { iter =>
    // prepare connection
    Stuff.connect(...)
    // add elements
    iter.foreach(elem => Stuff.add(elem))
    // submit
    Stuff.commit()
}
This way you can add the data of each partition and commit the results in the local context of each executor. Be aware that this add/commit must be thread safe in order to avoid data loss or corruption.
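For reference, a rough Java translation of that pattern for the question's code. Treat it as a sketch: the HttpSolrServer URL and the batching are assumptions, and in practice you would reuse one server instance per executor JVM rather than building one per partition.

solrInputDocumentJavaRDD.foreachPartition(
    new VoidFunction<Iterator<SolrInputDocument>>() {
        @Override
        public void call(Iterator<SolrInputDocument> docs) throws Exception {
            // Runs on the executor that owns this partition
            List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
            while (docs.hasNext()) {
                batch.add(docs.next());
            }
            if (!batch.isEmpty()) {
                SolrServer solr = new HttpSolrServer("http://solr-host:8983/solr"); // assumed URL
                solr.add(batch);
                solr.commit(true, true);
            }
        }
    });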
Have you checked the Spark UI to see the execution plan of this job?
Check how it is split into stages and their dependencies. That should hopefully give you an idea.

How to make an async listener do blocking?

I am writing a BlackBerry app that communicates with a simple Bluetooth peripheral using text-based AT commands, similar to a modem... I can only get it working on the BlackBerry using an event listener, so the communication is now asynchronous.
However, since it is a simple device and I need to control concurrent access, I would prefer to just have a blocking call.
I have the following code, which tries to convert the communication to blocking by using wait/notify. But when I run it, notifyResults never runs until getStringValue completes, i.e. it will always time out no matter what the delay is.
The btCon object runs on a separate thread already.
I'm sure I am missing something obvious with threading. Could someone kindly point it out?
Thanks
I should also add that the notifyAll blows up with an IllegalMonitorStateException.
I previously tried it with a simple boolean flag and a wait loop, but the same problem existed: notifyResults never ran until after getStringValue completed.
public class BTCommand implements ResultListener {
    String cmd;
    private BluetoothClient btCon;
    private String result;

    public BTCommand(String cmd) {
        this.cmd = cmd;
        btCon = BluetoothClient.getInstance();
        btCon.addListener(this);
        System.out.println("[BTCL] BTCommand init");
    }

    public String getStringValue() {
        result = "TIMEOUT";
        btCon.sendCommand(cmd);
        System.out.println("[BTCL] BTCommand getStringValue sent and waiting");
        synchronized (result) {
            try {
                result.wait(5000);
            } catch (InterruptedException e) {
                System.out.println("[BTCL] BTCommand getStringValue interrupted");
            }
        } // sync
        System.out.println("[BTCL] BTCommand getStringValue result=" + result);
        return result;
    }

    public void notifyResults(String cmd) {
        if (cmd.equalsIgnoreCase(this.cmd)) {
            synchronized (result) {
                result = btCon.getHash(cmd);
                System.out.println("[BTCL] BTCommand resultReady: " + cmd + "=" + result);
                result.notifyAll();
            } // sync
        }
    }
}
Since both notifyResults and getStringValue have synchronized blocks on the same object, and assuming getStringValue gets to the synchronized section first, notifyResults will block at the start of its synchronized block until getStringValue exits the synchronized area. If I understand correctly, this is the behaviour you're seeing.
Nicholas' advice is probably good, but you may not find any of those implementations in the BlackBerry APIs you're using. You may want to have a look at the producer-consumer pattern.
It may be more appropriate to use a Latch, Semaphore, or Barrier, as recommended in Brian Goetz's book Java Concurrency in Practice.
These classes will make it easier to write blocking methods, and will likely help to prevent bugs, especially if you are unfamiliar with wait() and notifyAll(). (I am not suggesting that YOU are unfamiliar, it is just a note for others...)
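For illustration, a CountDownLatch version of the blocking call (a sketch reusing the question's BluetoothClient and ResultListener types; it needs java.util.concurrent.CountDownLatch and TimeUnit, which, as noted above, may not exist on the BlackBerry JVM):

public class BTCommand implements ResultListener {
    private final String cmd;
    private final BluetoothClient btCon = BluetoothClient.getInstance();
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile String result = "TIMEOUT";

    public BTCommand(String cmd) {
        this.cmd = cmd;
        btCon.addListener(this);
    }

    public String getStringValue() throws InterruptedException {
        btCon.sendCommand(cmd);
        done.await(5, TimeUnit.SECONDS); // blocks until notifyResults fires or we time out
        return result;
    }

    public void notifyResults(String cmd) {
        if (cmd.equalsIgnoreCase(this.cmd)) {
            result = btCon.getHash(cmd);
            done.countDown(); // releases the thread blocked in await()
        }
    }
}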
The code will work OK if you use a final lock object instead of the String variable. I'm surprised that you don't get an NPE or an IllegalMonitorStateException.
Create a field:
private final Object resultLock = new Object();
Change all synchronized sections to use it instead of the String field result.
I don't like the magic number of 5 seconds. I hope you treat a null result as a timeout in your application.
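Put together, the corrected wait/notify pair from the question would look roughly like this (a sketch; note that sendCommand is called while holding the lock, which is safe because wait(5000) releases it, so an early reply cannot be missed):

private final Object resultLock = new Object();
private String result;

public String getStringValue() {
    synchronized (resultLock) {
        result = "TIMEOUT";
        btCon.sendCommand(cmd);
        try {
            resultLock.wait(5000); // releases resultLock while waiting
        } catch (InterruptedException e) {
            // fall through and return the timeout value
        }
        return result;
    }
}

public void notifyResults(String cmd) {
    if (cmd.equalsIgnoreCase(this.cmd)) {
        synchronized (resultLock) {
            result = btCon.getHash(cmd);
            resultLock.notifyAll(); // wakes the thread blocked in wait()
        }
    }
}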

Java Multithreaded I/0 and communication problem

I am using Java to create an application for network management. In this application I establish communication with network devices using the SNMP4j library (for the SNMP protocol). I'm supposed to scan certain values of the network devices using this protocol and put the results into a file for caching. At some point I decided to make my application multi-threaded and assign a device to a thread. I created a class that implements the Runnable interface and scans for the values that I want for each device.
When I run this class alone it works fine, but when I run multiple threads at the same time the output gets messed up: additional or out-of-order output is printed into the files. Now I wonder whether this problem is due to the I/O or due to the communication.
Here I'll put some of the code so that you can see what I'm doing and help me figure out what's wrong.
public class DeviceScanner implements Runnable {
    private final SNMPCommunicator comm;
    private OutputStreamWriter out;

    public DeviceScanner(String ip, OutputStream output) throws IOException {
        this.device = ip;
        this.comm = new SNMPV1Communicator(device);
        oids = MIB2.ifTableHeaders;
        out = new OutputStreamWriter(output);
    }

    @Override
    public void run() {
        // Here I use the communicator to request the desired data; goes something like...
        String read = "";
        for (int j = 0; j < num; j++) {
            read = comm.snmpGetNext(oids);
            out.write(read);
            this.updateHeaders(read);
        }
        out.flush();
        // ...
    }
}
Some of the expected output would be something like:
1.3.6.1.2.1.1.1.0 = SmartSTACK ELS100-S24TX2M
1.3.6.1.2.1.1.2.0 = 1.3.6.1.4.1.52.3.9.1.10.7
1.3.6.1.2.1.1.3.0 = 26 days, 22:35:02.31
1.3.6.1.2.1.1.4.0 = admin
1.3.6.1.2.1.1.5.0 = els
1.3.6.1.2.1.1.6.0 = Computer Room
but instead I get something like (it varies):
1.3.6.1.2.1.1.1.0 = SmartSTACK ELS100-S24TX2M
1.3.6.1.2.1.1.2.0 = 1.3.6.1.4.1.52.3.9.1.10.7
1.3.6.1.2.1.1.4.0 = admin
1.3.6.1.2.1.1.5.0 = els
1.3.6.1.2.1.1.3.0 = 26 days, 22:35:02.31
1.3.6.1.2.1.1.6.0 = Computer Room
1.3.6.1.2.1.1.1.0 = SmartSTACK ELS100-S24TX2M
1.3.6.1.2.1.1.2.0 = 1.3.6.1.4.1.52.3.9.1.10.7
*Currently I have one file per device scanner, as desired.
I get the IPs from a list; it looks like this. I'm also using a small thread pool to keep a limited number of threads running at the same time.
for (String s : ips) {
    output = new FileOutputStream(new File(path + s));
    threadpool.add(new DeviceScanner(s, output));
}
I suspect SNMPV1Communicator(device) is not thread-safe. As far as I can see, it's not part of the SNMP4j library.
Taking a wild guess at what's going on here, try putting everything inside a synchronized() block, like this:
synchronized (DeviceScanner.class) {
    for (int j = 0; j < num; j++) {
        read = comm.snmpGetNext(oids);
        out.write(read);
        this.updateHeaders(read);
    }
    out.flush();
}
If this works, my guess is right and the reason for the problems you're seeing is that you have many OutputStreamWriters (one on each thread), all writing to a single OutputStream. Each OutputStreamWriter has its own buffer. When this buffer is full, it passes the data to the OutputStream. It's essentially random when each OutputStreamWriter's buffer fills up - it might well be in the middle of a line.
The synchronized block above means that only one thread at a time can be writing to that thread's OutputStreamWriter. The flush() at the end means that before leaving the synchronized block, the OutputStreamWriter's buffer should have been flushed to the underlying OutputStream.
Note that synchronizing in this way on the class object isn't what I'd consider best practice. You should probably be looking at using a single instance of some other kind of stream class - or something like a LinkedBlockingQueue, with all of the SNMP threads passing their data over to a single file-writing thread. I've added the synchronized as above because it was the only thing available to synchronize on within your pasted example code.
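A sketch of that queue-based alternative (field names are assumptions, and it presumes the scanners really do share one stream): each scanner thread puts its formatted output on a shared queue, and one dedicated thread owns the writer, so lines can never interleave.

final BlockingQueue<String> lines = new LinkedBlockingQueue<String>();

// in each DeviceScanner, instead of out.write(read):
lines.put(read); // declares InterruptedException

// single writer thread that owns the output stream:
new Thread(new Runnable() {
    public void run() {
        try {
            Writer out = new OutputStreamWriter(output);
            while (true) {
                out.write(lines.take()); // blocks until a scanner produces a line
                out.flush();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}).start();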
You've got multiple threads, all using buffered output, and all writing to the same file.
There are no guarantees as to when those threads will be scheduled to run... the output will be in fairly random order, dictated by the thread scheduling.

How to know whether a file copying is 'in progress'/complete in java (1.6) [duplicate]

This question already has answers here:
JAVA NIO Watcher: How to detect end of a long lasting (copy) operation?
(2 answers)
Closed 8 years ago.
I am writing a directory monitoring utility in Java (1.6) using polling at certain intervals, with the lastModified long value as the indication of change. I found that when my polling interval is small (seconds) and the copied file is big, the change event is fired before the file copy has actually completed.
I would like to know whether there is a way to find the status of a file, like in transit, complete, etc.
Environment: Java 1.6; expected to work on Windows and Linux.
There are two approaches I've used in the past which are platform agnostic.
1/ This was for FTP transfers where I controlled what was put, so it may not be directly relevant.
Basically, whatever is putting a file file.txt will, when it's finished, also put a small (probably zero-byte) dummy file called file.txt.marker (for example).
That way, the monitoring tool just looks for the marker file to appear and, when it does, it knows the real file is complete. It can then process the real file and delete the marker.
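The monitoring side then reduces to checking for the marker (a sketch; process() is a placeholder for whatever handles the finished file):

File data = new File(dir, "file.txt");
File marker = new File(dir, "file.txt.marker");
if (marker.exists()) {
    process(data);   // safe: the sender writes the marker only after the real file
    marker.delete();
}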
2/ An unchanged duration.
Have your monitor program wait until the file is unchanged for N seconds (where N is reasonably guaranteed to be large enough that the file will be finished).
For example, if the file size hasn't changed in 60 seconds, there's a good chance it's finished.
There's a balancing act between not thinking the file is finished just because there's no activity on it, and the wait once it is finished before you can start processing it. This is less of a problem for local copying than FTP.
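A minimal sketch of that stability check (n polls of pollMillis each; both values are arbitrary here and need the tuning described above):

// Blocks until the file size has stayed the same for n consecutive polls
static void awaitStableSize(File f, int n, long pollMillis) throws InterruptedException {
    long last = -1L;
    int unchanged = 0;
    while (unchanged < n) {
        long size = f.length();
        if (size == last) {
            unchanged++;
        } else {
            unchanged = 0;
            last = size;
        }
        Thread.sleep(pollMillis);
    }
}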
This solution worked for me:

File ff = new File(fileStr);
if (ff.exists()) {
    for (int timeout = 100; timeout > 0; timeout--) {
        RandomAccessFile ran = null;
        try {
            ran = new RandomAccessFile(ff, "rw");
            break; // no errors, done waiting
        } catch (Exception ex) {
            System.out.println("timeout: " + timeout + ": " + ex.getMessage());
        } finally {
            if (ran != null) try {
                ran.close();
            } catch (IOException ex) {
                // do nothing
            }
            ran = null;
        }
        try {
            Thread.sleep(100); // wait a bit then try again
        } catch (InterruptedException ex) {
            // do nothing
        }
    }
    System.out.println("File lockable: " + fileStr +
            (ff.exists() ? " exists" : " deleted during process"));
} else {
    System.out.println("File does not exist: " + fileStr);
}
This solution relies on the fact that you can't open the file for writing if another process has it open. It will stay in the loop until the timeout value is reached or the file can be opened. The timeout values will need to be adjusted depending on the application's actual needs. I also tried this method with channels and tryLock(), but it didn't seem to be necessary.
Do you mean that you're waiting for the lastModified time to settle? At best that will be a bit hit-and-miss.
How about trying to open the file with write access (appending rather than truncating the file, of course)? That won't succeed if another process is still trying to write to it. It's a bit ugly, particularly as it's likely to be a case of using exceptions for flow control (ick) but I think it'll work.
If I understood the question correctly, you're looking for a way to distinguish whether the copying of a file is complete or still in progress?
How about comparing the size of the source and destination file (i.e. file.length())? If they're equal, then copying is complete. Otherwise, it's still in progress.
I'm not sure it's efficient since it would still require polling. But it "might" work.
You could look into online file upload with progressbar techniques - they use OutputStreamListener and custom writer to notify the listener about bytes written.
http://www.missiondata.com/blog/java/28/file-upload-progress-with-ajax-and-java-and-prototype/
File Upload with Java (with progress bar)
We monitored the file size change to determine whether the file was complete or not.
We used a Spring Integration file endpoint to poll a directory every 200 ms.
Once a file is detected (regardless of whether it is complete or not), we have a custom file filter with an interface method accept(File file) that returns a flag indicating whether we can process the file.
If false is returned by the filter, the File instance is ignored and it will be picked up during the next poll for the same filtering process.
The filter does the following:
First, we get the current file size, wait 200 ms (can be less), and check the size again. If the size differs, we retry up to 5 times. Only when the file size stops growing is the file marked as COMPLETE (i.e. accept returns true).
The sample code used is the following:
public class InCompleteFileFilter<F> extends AbstractFileListFilter<F> {

    protected Object monitor = new Object();

    @Override
    protected boolean accept(F file) {
        synchronized (monitor) {
            File currentFile = (File) file;
            if (!currentFile.getName().contains("Conv1")) { return false; }
            long currentSize = currentFile.length();
            try { Thread.sleep(200); } catch (InterruptedException e) { e.printStackTrace(); }
            int retryCount = 0;
            while (retryCount++ < 4 && currentFile.length() > currentSize) {
                currentSize = currentFile.length(); // re-sample so the comparison tracks the latest size
                try { Thread.sleep(200); } catch (InterruptedException e) { e.printStackTrace(); }
            }
            // retryCount only reaches 5 if the file was still growing on every retry
            return retryCount != 5;
        }
    }
}
