Java concurrent writes from multiple threads to a single text file?

I have a multi-threaded Java 7 program (a jar file) which uses JDBC to perform work (it uses a fixed thread pool).
The program works fine and it logs things as it progresses to the command shell console window (System.out.printf()) from multiple concurrent threads.
In addition to the console output I also need to add the ability for this program to write to a single plain ASCII text log file - from multiple threads.
The volume of output is low, and the file will be relatively small as it's a log file, not a data file.
Can you please suggest a good and relatively simple design/approach to get this done using Java 7 features (I don't have Java 8 yet)?
Any code samples would also be appreciated.
Thank you very much.
EDIT:
I forgot to add: in Java 7, the stream returned by the Files.newOutputStream() static factory method is stated to be thread safe, according to the official Java documentation. Is this the simplest option for writing a single shared text log file from multiple threads?

If you want to log output, why not use a logging library such as log4j2? This will allow you to tailor your log to your specific needs, and it can log without synchronizing your threads on stdout (did you know that calling System.out.print involves locking on System.out?).
Edit: For the latter, if the things you log are thread-safe and you are OK with adding LMAX's disruptor.jar to your build, you can configure async loggers (just add "async"), which have a dedicated logging thread take care of message formatting and writing (and of keeping your log messages in order) while your threads run on without a hitch.

Given that you've said the volume of output is low, the simplest option would probably be to just write a thread-safe writer which uses synchronization to make sure that only one thread can actually write to the file at a time.
If you don't want threads to block each other, you could have a single thread dedicated to the writing, using a BlockingQueue - threads add write jobs (in whatever form they need to - probably just as strings) to the queue, and the single thread takes the values off the queue and writes them to the file.
Either way, it would be worth abstracting out the details behind a class dedicated for this purpose (ideally implementing an interface for testability and flexibility reasons). That way you can change the actual underlying implementation later on - for example, starting off with the synchronized approach and moving to the producer/consumer queue later if you need to.
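As a rough illustration of the synchronized approach described above, here is a minimal Java 7 sketch that hides the file behind a small interface. The names LogSink and SynchronizedFileLogSink are made up for this example, and the choice to prefix each line with the thread name and to flush after every line is an assumption that suits a low-volume log, not a requirement.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical interface: callers depend on this, not on a concrete writer.
interface LogSink extends AutoCloseable {
    void log(String message) throws IOException;

    @Override
    void close() throws IOException;
}

// Synchronized implementation: only one thread can write at a time.
final class SynchronizedFileLogSink implements LogSink {
    private final BufferedWriter writer;

    SynchronizedFileLogSink(Path file) throws IOException {
        // Create the file if needed and append to it; ASCII as in the question.
        writer = Files.newBufferedWriter(file, StandardCharsets.US_ASCII,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    @Override
    public synchronized void log(String message) throws IOException {
        // The whole line is written while holding the lock, so lines never interleave.
        writer.write(Thread.currentThread().getName() + ": " + message);
        writer.newLine();
        writer.flush(); // low volume, so flushing per line is affordable
    }

    @Override
    public synchronized void close() throws IOException {
        writer.close();
    }
}
```

Worker threads would share one instance and call sink.log("...") wherever they currently call System.out.printf(). Moving to the producer/consumer design later would then only mean adding a second LogSink implementation that puts messages on a BlockingQueue.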

Keep a common PrintStream reference where you'll write to (instead of System.out) and set it to System.out or channel it through to a FileOutputStream depending on what you want.
Your code won't change much (barely at all) and PrintStream is already synchronized too.
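A small sketch of that idea, assuming a hypothetical holder class named LogOutput (not part of any library). It relies on the observation that, in the OpenJDK implementation, each individual print/printf call on a PrintStream is executed under a lock, so one call produces one uninterrupted chunk of output; a message built from several separate calls can still interleave with other threads.

```java
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical holder for the shared stream; it defaults to the console.
final class LogOutput {
    private static volatile PrintStream out = System.out;

    private LogOutput() {
    }

    // All threads write through this instead of using System.out directly.
    static PrintStream out() {
        return out;
    }

    // Redirects output to a file (appending), with autoflush enabled.
    static void toFile(String path) throws FileNotFoundException, UnsupportedEncodingException {
        out = new PrintStream(new FileOutputStream(path, true), true, "US-ASCII");
    }
}
```

Existing calls such as System.out.printf("row %d done%n", id) would become LogOutput.out().printf("row %d done%n", id). Note that PrintStream swallows I/O errors; checkError() is the only way to find out that a write failed.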

Related

Find out all java code places where a blocking operation happens

In our Netty application, we are moving all blocking calls in our code to run in a special backgroundThreadGroup.
I'd like to be able to log, in production, the thread name and the line number of the Java code that is about to execute a blocking operation (i.e. synchronous file and network I/O).
That way I can grep the logs for places where we might have missed moving our blocking code to the backgroundThreadGroup.
Is there a way to instrument the JVM so that it can tell me that?

Depends on what you mean by a "blocking operation".
In a broad sense, any operation that causes a voluntary context switch is blocking. Trying to do something special about them is absolutely impractical.
For example, in Java, any method containing synchronized is potentially blocking. This includes
ConcurrentHashMap.put
SecureRandom.nextInt
System.getProperty
and many more. I don't think you really want to avoid calling all these methods that look normal at a first glance.
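To make the point about synchronized concrete, here is a small sketch (the class and method names are invented for illustration) in which one thread holds a monitor while a second thread tries to enter a synchronized block on the same object. The JVM reports the second thread as BLOCKED even though the code it runs looks entirely harmless.

```java
import java.util.concurrent.CountDownLatch;

final class BlockedStateDemo {
    // Returns the state observed for a thread that is waiting to enter a
    // synchronized block whose monitor is held by the calling thread.
    static Thread.State observeContendedThread() throws InterruptedException {
        final Object lock = new Object();
        final CountDownLatch started = new CountDownLatch(1);
        Thread contender = new Thread(new Runnable() {
            @Override
            public void run() {
                started.countDown();
                synchronized (lock) {
                    // Entered only after the holder releases the monitor.
                }
            }
        }, "contender");

        Thread.State observed;
        synchronized (lock) {
            contender.start();
            started.await();
            // Poll (with a 5 second safety limit) until the contender is parked on the monitor.
            long deadline = System.nanoTime() + 5000000000L;
            while (contender.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(1);
            }
            observed = contender.getState();
        }
        contender.join();
        return observed;
    }
}
```

Polling Thread.getState() like this is only suitable for a demonstration; it is not a practical way to find blocking calls in production code.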
Even simple methods without any synchronization primitives can be blocking. E.g., ByteBuffer.get may result in a page fault and a blocking read on the OS level. Furthermore, as mentioned in comments, there are JVM level blocking operations that are not under your control.
In short, it's impractical if not impossible to find all places in the code where a blocking operation happens.
If, however, you are interested in finding particular method calls that you believe are bad (like Thread.sleep and Socket.read), you can definitely do so. There is a BlockHound project specifically for this purpose. It already has a predefined list of "bad" methods, but can be customized with your own list.

There is a library called BlockHound that will throw an exception on a blocking call unless you have configured BlockHound to ignore that specific call.
This is how you configure BlockHound for Netty: https://github.com/violetagg/netty/blob/625f9d5781ed85bfaca6fa4e826d0d46d70fdbd8/common/src/main/java/io/netty/util/internal/Hidden.java
(You can improve the above code by replacing the last line with builder.nonBlockingThreadPredicate(p -> p.or(thread -> thread instanceof FastThreadLocalThread)); )
See https://github.com/reactor/BlockHound
See https://blog.frankel.ch/blockhound-how-it-works/
I personally used it to find all blocking calls within our Netty-based service.
Good Luck

Send Data from multiple threads to a single thread

I'm coding a Java socket server that connects to an Arduino, which in turn sends and receives data. As shown in the Java socket documentation, I've set up the server to open a new thread for every connection.
My question is, how will I be able to send the data from the socket threads to my main thread? The socket will be constantly open, so the data has to be sent while the thread is running.
Any suggestion?
Update: the goal of the server is to send commands to an Arduino (i.e. turn a light on or off) and receive data from sensors. I therefore need a way to obtain that data from the sensors, which are connected to individual threads, and send it to a single thread.

Sharing data among threads is always tricky. There is no "correct" answer; it all depends on your use case. I suppose you are not looking for the highest performance but for ease of use, right?
In that case, I would recommend looking at the thread-safe collections: maps, lists, or perhaps queues. One class that seems like a good fit for you is ConcurrentLinkedQueue.
You can also create synchronized wrappers for all the usual collections using the factory methods in the Collections class:
Collections.synchronizedList(new ArrayList<String>());
You do not have to synchronize individual method calls on them yourself, although you must still synchronize on the wrapper manually while iterating over it.
Another option, which might be overkill, is using a database. There are some in-memory databases, like H2.
In any case, I suggest you reduce the amount of shared information to the lowest possible level. For example, you can keep the "raw" data separate per thread (e.g. in ThreadLocal variables) and then synchronize only during aggregation.
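A minimal sketch of the ConcurrentLinkedQueue suggestion, written in Java 7 syntax. SensorReading and ReadingInbox are hypothetical names invented here; the idea is simply that each socket thread publishes readings and the main thread periodically drains whatever has arrived.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical immutable message carrying one sensor reading.
final class SensorReading {
    final String sensorId;
    final double value;

    SensorReading(String sensorId, double value) {
        this.sensorId = sensorId;
        this.value = value;
    }
}

// Hypothetical mailbox shared by the socket threads and the main thread.
final class ReadingInbox {
    private final Queue<SensorReading> queue = new ConcurrentLinkedQueue<SensorReading>();

    // Called by the per-connection socket threads; never blocks.
    void publish(SensorReading reading) {
        queue.add(reading);
    }

    // Called by the main thread: removes and returns everything queued so far.
    List<SensorReading> drain() {
        List<SensorReading> batch = new ArrayList<SensorReading>();
        SensorReading reading;
        while ((reading = queue.poll()) != null) {
            batch.add(reading);
        }
        return batch;
    }
}
```

If the main thread should sleep until data arrives instead of polling, a LinkedBlockingQueue with take() would be the usual substitute; that is a different class from the one named in this answer.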

You seem to have the right idea - you need a thread to run the connection to the external device and you need a main thread to run your application.
How do you share data between these threads? In general this isn't a problem: different threads can write to the same memory, because threads within the same application share a memory space.
What you probably want to avoid is two threads concurrently changing or reading the data. Java provides a very useful keyword, synchronized, to handle this sort of situation; it is straightforward to use and provides the kind of guarantees you need. This is a bit technical but discusses the concurrency features.

Here is a tutorial you might be able to get some more information on. Please note, a quick google search will bring up lots of answers to your question.
http://tutorials.jenkov.com/java-multithreaded-servers/multithreaded-server.html
In answer to your question, you can send the information from one thread to another by using a number of options - I would recommend if it is a simple setup, just use static variables/methods to pass the information.
Also, for reference: in large-scale programs it is not recommended to start a thread for every connection. It works fine on a smaller scale (e.g. a small number of clients), but scales poorly.

If this is a web application and you are just going to show the current readout of any of the sensors, then blocking queue is a huge overkill and will cause more problems than it solves. Just use a volatile static field of the required type. The field itself can be static, or it could reside in a singleton object, or it could be part of a context passed to the worker.
in the SharedState class:
static volatile float temperature;
in the thread:
SharedState.temperature = 13.2f;
In the web interface (assuming jsp):
<%= SharedState.temperature %>
btw: if you want to access the last 10 readouts, it's equally easy: just store an array with the last 10 readouts instead of a single value (just don't modify what's inside the array; replace the whole array instead, otherwise synchronization issues might occur).
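The "replace the whole array" advice could look roughly like the sketch below. ReadoutHistory is a hypothetical class name, and the sketch assumes a single writer thread per history; with several writers the copy-and-publish step in record() would need additional coordination.

```java
// Hypothetical holder for the last 10 readouts of one sensor.
final class ReadoutHistory {
    private static final int CAPACITY = 10;

    // The reference is volatile; an array is never modified after it is published.
    private static volatile float[] last = new float[0];

    private ReadoutHistory() {
    }

    // Intended for a single writer thread: copy, append, then publish the new array.
    static void record(float value) {
        float[] old = last;
        int keep = Math.min(old.length, CAPACITY - 1);
        float[] next = new float[keep + 1];
        System.arraycopy(old, old.length - keep, next, 0, keep);
        next[keep] = value;
        last = next; // readers see either the old array or the new one, never a mix
    }

    // Readers get their own copy, oldest readout first.
    static float[] snapshot() {
        return last.clone();
    }
}
```

Because readers only ever obtain a fully built array through the volatile reference, they never observe a half-updated history.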

Python: Multithreading between Java subprocess and Python listener?

I am monitoring a Minecraft server and I am making a setup file in Python. I need to be able to run two threads: one running minecraft_server.jar in the console window, and a second thread that constantly checks the output of minecraft_server. Also, how would I send input to the console from Python after starting the Java process?
Example:
thread1 = threading.Thread(target=listener)
thread2 = minecraft_server.jar
def listener():
    if minecraft_server.jarOutput == "Server can't keep up!":
        sendToTheJavaProccessAsUserInputSomeCommandsToRestartTheServer

It's pretty hard to tell here, but I think what you're asking is how to:
Launch a program in the background.
Send it input, as if it came from a user on the console.
Read its output that it tries to display to a user on the console.
At the same time, run another thread that does other stuff.
The last one is pretty easy; in fact, you've mostly written it, you just need to add a thread1.start() somewhere.
The subprocess module lets you launch a program and control its input and output. It's easiest if you want to just feed in all the input at once, wait until it's done, then process all the output, but obviously that's not your case here, so it's a bit more involved:
import subprocess
minecraft = subprocess.Popen(['java', '-jar', 'path/to/minecraft_server.jar', '-other', 'args'],
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
I'm merging stdout and stderr together into one pipe; if you want to read them separately, or send stderr to /dev/null, or whatever, see the docs; it's all pretty simple. While we're making assumptions here, I'm going to assume that minecraft_server uses a simple line-based protocol, where every command, every response, and every info message is exactly one line (that is, under 1K of text ending in a \n).
Now, to send it input, you just do this:
minecraft.stdin.write('Make me a sandwich\n')
Or, in Python 3.x:
minecraft.stdin.write(b'Make me a sandwich\n')
To read its output, you do this:
response = minecraft.stdout.readline()
That works just like a regular file. But note that it works like a binary file. In Python 2.x, the only difference is that newlines don't get automatically converted, but in Python 3.x, it means you can only write bytes (and compatible objects), not strs, and you will receive bytes back. There are good reasons for that, but if you want to get pipes that act like text files instead, see the universal_newlines (and possibly bufsize) arguments under Frequently Used Arguments and Popen Constructor.
Also, it works like a blocking file. With a regular file, this rarely matters, but with a pipe, it's quite possible that there will be data later, but there isn't data yet (because the server hasn't written it yet). So, if there is no output yet (or not a complete line's worth, since I used readline()), your thread just blocks, waiting until there is.
If you don't want that, you probably want to create another thread to service stdout. And its function can actually look pretty similar to what you've got:
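def listener():
    # Python 2 style; in Python 3.x compare against and write bytes literals (b"...")
    for line in minecraft.stdout:
        if line.strip() == "Server can't keep up!":
            minecraft.stdin.write("Restart Universe\n")
            minecraft.stdin.flush()  # make sure the command actually reaches the child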
Now that thread can block all day and there's no problem, because your other threads are still going.
Well, not quite no problem.
First, it's going to be hard to cleanly shut down your program.
More seriously, the pipes between processes have a fixed size; if you don't service stdout fast enough, or the child doesn't service stdin fast enough, the pipe can block. And, the way I've written things, if the stdin pipe blocks, we'll be blocked forever in that stdin.write and won't get to the next read off stdout, so that can block too, and suddenly we're both waiting on each other forever.
You can solve this by having another thread service stdin, so that a blocked write never stalls the thread that reads stdout. The subprocess module itself includes an example, in the Popen._communicate function used by all the higher-level functions. (Make sure to look at Python 3.3 or later, because earlier versions had bugs.)
If you're in Python 3.4+ (or 3.3 with a backport off PyPI), you can instead use asyncio to rewrite your program around an event loop and handle the input and output the same way you'd write a reactor-based network server. That's what all the cool kids are doing in 2017, but back in late 2014 many people still thought it looked new and scary.
If all of this is sounding like a lot more work than you signed on for, you may want to consider using pexpect, which wraps up a lot of the tedious details, and makes some simplifying assumptions that are probably true in your case.

Java simple Analytics/Event Stream Processing with front end

My application takes a lot of measurements of its internal processes. For example, I time certain methods, I time external web service calls, and I also have variables whose values change and processes which have a 'state' (e.g. PAUSED, WAITING etc).
The application uses 100 to 200 threads, and each bit of data would be associated with a particular thread.
I am looking for some software that I can channel all this information into that would produce useful metrics and graphs of the data (ideally in real time or close to real time), let me set thresholds to trigger warnings, would allow me to filter the data by thread or thread group, etc etc.
The application is performing time critical tasks so the software/api would need to be very fast and never block.
The application is written in java, and ideally the software/api would be in java as well. I think what I'm looking for is called Event Stream Processing, but I'm really not sure what language to use to describe it.
All I've found so far are Esper and ERMA. Can anyone give me a recommendation? I'm the only one working on this project so I'm hoping for something that is pretty easy to set up and use, and has a workable front end.

In the end I found Graphite, which was pretty close to exactly what I wanted. It was not the simplest to set up and configure, but I got it working in the end.
http://graphite.wikidot.com/
In my case I send data directly from my application to StatsD (via UDP), which collects the data and does some preprocessing before it ends up in the Whisper back end. There is a simple example of a Java interface here: https://github.com/etsy/statsd/commit/2253223f3c19d2149d65ec5bc802198ff93da4cb
Alternatively, you could send your data directly to Graphite; there is an example here: http://neopatel.blogspot.co.uk/2011/04/logging-to-graphite-monitoring-tool.html
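For a sense of how little code the UDP route needs, here is a minimal sketch of a sender that emits metrics in StatsD's plain-text format (name:value|type). StatsdSender is a made-up class, not the interface from the linked commit, and the metric names in the usage note are placeholders. StatsD conventionally listens on UDP port 8125.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical fire-and-forget sender for StatsD's line format.
final class StatsdSender implements AutoCloseable {
    private final DatagramSocket socket;
    private final InetAddress host;
    private final int port;

    StatsdSender(String host, int port) throws IOException {
        this.socket = new DatagramSocket();
        this.host = InetAddress.getByName(host);
        this.port = port;
    }

    // Timer metric: <name>:<millis>|ms
    void timing(String name, long millis) throws IOException {
        send(name + ":" + millis + "|ms");
    }

    // Counter metric: <name>:<delta>|c
    void count(String name, long delta) throws IOException {
        send(name + ":" + delta + "|c");
    }

    private void send(String line) throws IOException {
        byte[] data = line.getBytes(StandardCharsets.UTF_8);
        // One metric per datagram; nothing waits for an acknowledgement.
        socket.send(new DatagramPacket(data, data.length, host, port));
    }

    @Override
    public void close() {
        socket.close();
    }
}
```

Usage would be along the lines of new StatsdSender("localhost", 8125).timing("webservice.lookup", elapsedMillis). Since UDP gives no delivery guarantee, metrics can be lost silently, which is usually an acceptable trade for not slowing down time-critical threads.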

Use cases of PipedInputStream and PipedOutputStream

What are use cases of Piped streams? Why just not read data into buffer and then write them out?

BlockingQueue or similar collections may serve you better; they are thread-safe, robust, and scale better.

Pipes in Java IO provide the ability for two threads running in the same JVM to communicate. As such, pipes are a common source or destination of data.
This is useful if you have two long-running threads and one is set up to produce data while the other consumes it.
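A minimal sketch of that producer/consumer arrangement, using invented names (PipeDemo, transfer). One thread writes lines into a PipedOutputStream while the calling thread reads them from the connected PipedInputStream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class PipeDemo {
    // A producer thread writes the lines into the pipe; the caller reads them back.
    static List<String> transfer(final List<String> lines)
            throws IOException, InterruptedException {
        final PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out); // connects both ends

        Thread producer = new Thread(new Runnable() {
            @Override
            public void run() {
                PrintWriter writer = new PrintWriter(
                        new OutputStreamWriter(out, StandardCharsets.UTF_8));
                for (String line : lines) {
                    writer.println(line);
                }
                writer.close(); // closing signals end-of-stream to the reader
            }
        }, "pipe-producer");
        producer.start();

        List<String> received = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                received.add(line);
            }
        }
        producer.join();
        return received;
    }
}
```

The writer must close its end before its thread exits; otherwise the reader can fail with an IOException when it notices that the writing thread is gone.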

As the other answers have said, they are designed for use between threads. In practice they are best avoided. I've used them once in 13 years and I wish I hadn't.

They are usually used for simultaneously reading and writing, usually by two different threads.
(Their design is quite bad. You can't switch threads at one end and then have that thread exit without disrupting the pipe.)

One advantage of using piped streams is that they provide stream functionality in our code without compelling us to build new specialized streams.
For example, we can use pipes to create a simple logging facility for our application. We can send messages to the logging facility through an ordinary PrintWriter, and it can then do whatever processing or buffering is required before sending the message off to its final destination.
For more details, see: http://docstore.mik.ua/orelly/java/exp/ch08_01.htm
