Non-Blocking File Reads - java

Is there a non-blocking file read API in Java? If not, would it be wise to build one in C++ and call it from a Java app via JNI?

My original answer is now wrong, since the addition of AsynchronousFileChannel in Java 7.
You still cannot select on a file, but there are now two asynchronous file read methods: one that takes a callback and another that returns a Future.
It may be cleaner to use the callback method (and dispatch an event from the callback) than to have a dedicated thread polling a pipe.
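The two read styles mentioned above can be sketched roughly as follows. This is a minimal illustration, not a production pattern: the class name AsyncFileReadSketch, the 1024-byte buffer and the temp file are assumptions made for the example, and the use of CompletableFuture to surface the callback result assumes Java 8 or later (on Java 7 a CountDownLatch or an event dispatch would play that role).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

final class AsyncFileReadSketch {

    /** Future style: read() returns at once; get() waits for the result. */
    static String readWithFuture(Path path) throws Exception {
        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            Future<Integer> pending = channel.read(buffer, 0);
            // The caller could do other work here before collecting the result.
            return decode(buffer, pending.get());
        }
    }

    /** Callback style: the handler runs on one of the channel's pool threads. */
    static CompletableFuture<String> readWithCallback(Path path) throws IOException {
        AsynchronousFileChannel channel =
            AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        CompletableFuture<String> result = new CompletableFuture<>();
        channel.read(buffer, 0L, channel,
            new CompletionHandler<Integer, AsynchronousFileChannel>() {
                @Override
                public void completed(Integer bytesRead, AsynchronousFileChannel ch) {
                    closeQuietly(ch);
                    result.complete(decode(buffer, bytesRead));
                }

                @Override
                public void failed(Throwable cause, AsynchronousFileChannel ch) {
                    closeQuietly(ch);
                    result.completeExceptionally(cause);
                }
            });
        return result;
    }

    private static String decode(ByteBuffer buffer, int bytesRead) {
        // bytesRead is -1 for an empty file, so clamp it to zero.
        return new String(buffer.array(), 0, Math.max(bytesRead, 0), StandardCharsets.UTF_8);
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // Nothing useful to do in a sketch.
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("async-read", ".txt");
        Files.write(path, "hello, async".getBytes(StandardCharsets.UTF_8));
        System.out.println(readWithFuture(path));
        System.out.println(readWithCallback(path).join());
    }
}
```

Both methods read only the first buffer's worth of the file; reading a larger file would mean issuing further reads at increasing positions, typically from inside the completion handler.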

No, FileChannel does not extend SelectableChannel.
Probably because not all OSes support it.
Windows does, and in theory you could write a Windows-specific C++ library and call it via JNI, but it is a lot of work to integrate this with java.nio.
I would rather have a worker thread copy the file contents to a pipe and do non-blocking reads on the other end of the pipe.
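The worker-thread-plus-pipe idea might look something like the sketch below, using java.nio.channels.Pipe, whose source end is a SelectableChannel. The class and method names are made up for illustration, and a real event loop would register the pipe alongside its sockets rather than run a dedicated select loop as this example does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileToPipeSketch {

    /** Worker side: blocking file reads, copied into the pipe's sink. */
    static void startCopier(Path file, Pipe.SinkChannel sink) {
        Thread worker = new Thread(() -> {
            // The sink is listed first so it is closed even if opening the file fails.
            try (Pipe.SinkChannel out = sink;
                 FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
                ByteBuffer chunk = ByteBuffer.allocate(4096);
                while (in.read(chunk) >= 0) {
                    chunk.flip();
                    while (chunk.hasRemaining()) {
                        out.write(chunk);
                    }
                    chunk.clear();
                }
            } catch (IOException e) {
                e.printStackTrace(); // closing the sink still signals end-of-stream
            }
        }, "file-to-pipe");
        worker.setDaemon(true);
        worker.start();
    }

    /** Selector side: non-blocking reads from the pipe's source. */
    static String readAll(Path file) throws IOException {
        Pipe pipe = Pipe.open();
        startCopier(file, pipe.sink());
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (Pipe.SourceChannel source = pipe.source();
             Selector selector = Selector.open()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            boolean endOfStream = false;
            while (!endOfStream) {
                selector.select(); // wakes up when data or end-of-stream is available
                selector.selectedKeys().clear();
                int n;
                while ((n = source.read(buffer)) > 0) {
                    collected.write(buffer.array(), 0, n);
                    buffer.clear();
                }
                endOfStream = n < 0;
            }
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pipe-demo", ".txt");
        Files.write(file, "copied through a pipe".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(file));
    }
}
```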

AsynchronousFileChannel is the right answer. However, its API is not easy to use: it is quite verbose compared with java.nio.file.Files, which provides simple static methods such as readAllLines and lines. Unfortunately, the Files methods are synchronous.
The AsyncFiles alternative from RxIo provides the corresponding non-blocking methods through three different APIs: callback-based, CompletableFuture, and reactive streams. Here is an example with reactive streams:
AsyncFiles
.lines(path)
.subscribe(doOnNext(line -> /*... use line from background thread ...*/));

Related

Find out all java code places where a blocking operation happens

In our Netty application, we are moving all blocking calls in our code to run in a special backgroundThreadGroup.
I'd like to be able to log, in production, the threadName and the lineNumber of the Java code that is about to execute a blocking operation (i.e. synchronous file and network I/O).
That way I can grep the logs for places where we might have forgotten to move our blocking code to the backgroundThreadGroup.
Is there a way to instrument the JVM so that it can tell me that?
Depends on what you mean by a "blocking operation".
In a broad sense, any operation that causes a voluntary context switch is blocking. Trying to do something special about them is absolutely impractical.
For example, in Java, any method containing synchronized is potentially blocking. This includes
ConcurrentHashMap.put
SecureRandom.nextInt
System.getProperty
and many more. I don't think you really want to avoid calling all these methods that look normal at first glance.
Even simple methods without any synchronization primitives can be blocking. E.g., ByteBuffer.get may result in a page fault and a blocking read on the OS level. Furthermore, as mentioned in comments, there are JVM level blocking operations that are not under your control.
In short, it's impractical if not impossible to find all places in the code where a blocking operation happens.
If, however, you are interested in finding particular method calls that you believe are bad (like Thread.sleep and Socket.read), you can definitely do so. There is a BlockHound project specifically for this purpose. It already has a predefined list of "bad" methods, but can be customized with your own list.
There is a library called BlockHound that will throw an exception on a blocking call unless you have configured BlockHound to ignore that specific blocking call.
This is how you configure BlockHound for Netty: https://github.com/violetagg/netty/blob/625f9d5781ed85bfaca6fa4e826d0d46d70fdbd8/common/src/main/java/io/netty/util/internal/Hidden.java
(You can improve the above code by replacing the last line with builder.nonBlockingThreadPredicate(
p -> p.or(thread -> thread instanceof FastThreadLocalThread)); )
see https://github.com/reactor/BlockHound
see https://blog.frankel.ch/blockhound-how-it-works/
I personally used it to find all blocking calls within our Netty-based service.
Good Luck

Unbounded PipedInputStream in Java

I am using an http library to fetch data that is 200 mb in size. Each line in the data is then processed. To save memory I would like to process the data line by line as the data is streamed in rather than waiting for all 200 mb to be downloaded first.
The http library I am using exposes a method that looks something like OnCharReceived(CharBuffer buffer) that can be overridden so that I can in effect process each chunk of data as it comes in.
I would like to expose this data as an InputStream. My first thought was to use a PipedInputStream and PipedOutputStream pair where in OnCharReceived() I would write to the PipedOutputStream and in my thread read from the PipedInputStream. However, this seems to have the problem that the underlying buffer of the pipe could get full requiring the writing thread to block in OnCharReceived until my thread gets around to processing data. But blocking in OnCharReceived would probably be blocking in the http library's IO thread and would be very bad.
Are there Java classes out there that handle the abstract problem I need to solve here, without my having to roll my own custom implementation? I know of things like BlockingQueue that could be used as part of a larger solution, but are there any simple solutions?
For reasons of legacy code I really need the data exposed as an InputStream.
Edit: To be more precise I am basing my code on the following example from the apache http async library
https://hc.apache.org/httpcomponents-asyncclient-dev/httpasyncclient/examples/org/apache/http/examples/nio/client/AsyncClientHttpExchangeStreaming.java
If there's a simpler solution, I would not go near Piped[In/Out]putStream. It introduces unnecessarily complicated threading concerns, as you pointed out. Keep in mind that you can always write to a temp file and then read from the file as an InputStream. This also has the advantage of closing the HTTP connection as fast as possible and avoiding timeouts.
There might be other solutions depending on the API you are using but I think the proposed solution still makes sense for the reasons above.
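As a rough sketch of the temp-file approach: each chunk handed to the callback is appended to a temporary file, and once the response is complete the file is reopened as an ordinary InputStream. The class TempFileSpoolSketch and its method names are hypothetical; onCharReceived only mimics the shape of the callback described in the question and is not taken from the HTTP library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Writer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempFileSpoolSketch {
    private final Path file;
    private final Writer writer;

    TempFileSpoolSketch() throws IOException {
        file = Files.createTempFile("http-body", ".tmp");
        file.toFile().deleteOnExit();
        writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
    }

    /** Hypothetical stand-in for the library's per-chunk callback. */
    void onCharReceived(CharBuffer chunk) throws IOException {
        // Blocks only on local disk I/O, never on a slow consumer.
        writer.write(chunk.toString());
    }

    /** Call once the response is complete; legacy code then reads the stream. */
    InputStream finish() throws IOException {
        writer.close();
        return Files.newInputStream(file);
    }

    public static void main(String[] args) throws IOException {
        TempFileSpoolSketch spool = new TempFileSpoolSketch();
        spool.onCharReceived(CharBuffer.wrap("first line\nsecond "));
        spool.onCharReceived(CharBuffer.wrap("line\n"));
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(spool.finish(), StandardCharsets.UTF_8))) {
            for (String line; (line = reader.readLine()) != null; ) {
                System.out.println(line);
            }
        }
    }
}
```

The trade-off is that processing only starts after the download has finished, so this saves memory but not latency.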

Java concurrent writes from multiple threads to a single text file?

I have a multi-threaded Java 7 program (a jar file) which uses JDBC to perform work (it uses a fixed thread pool).
The program works fine and it logs things as it progresses to the command shell console window (System.out.printf()) from multiple concurrent threads.
In addition to the console output I also need to add the ability for this program to write to a single plain ASCII text log file - from multiple threads.
The volume of output is low, and the file will be relatively small as it's a log file, not a data file.
Can you please suggest a good and relatively simple design/approach to get this done using Java 7 features (I don't have Java 8 yet)?
Any code samples would also be appreciated.
Thank you very much.
EDIT:
I forgot to add: in Java 7 using Files.newOutputStream() static factory method is stated to be thread safe - according to official Java documentation. Is this the simplest option to write a single shared text log file from multiple threads?
If you want to log output, why not use a logging library such as log4j2? This will allow you to tailor your log to your specific needs, and it can log without synchronizing your threads on stdout (did you know that running System.out.print involves locking on System.out?).
Edit: For the latter, if the things you log are thread-safe and you are OK with adding the LMAX disruptor.jar to your build, you can configure async loggers (just add "async"), which have a logging thread take care of all the message formatting and writing (and of keeping your log messages in order) while allowing your threads to run on without a hitch.
Given that you've said the volume of output is low, the simplest option would probably be to just write a thread-safe writer which uses synchronization to make sure that only one thread can actually write to the file at a time.
If you don't want threads to block each other, you could have a single thread dedicated to the writing, using a BlockingQueue - threads add write jobs (in whatever form they need to - probably just as strings) to the queue, and the single thread takes the values off the queue and writes them to the file.
Either way, it would be worth abstracting out the details behind a class dedicated for this purpose (ideally implementing an interface for testability and flexibility reasons). That way you can change the actual underlying implementation later on - for example, starting off with the synchronized approach and moving to the producer/consumer queue later if you need to.
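The two strategies, hidden behind one interface, could be sketched along these lines. The names LogSink, SynchronizedLogSink and QueueLogSink are invented for this example, and the code sticks to Java 7 syntax (no lambdas) to match the question. It is a starting point under those assumptions rather than a finished logger: error handling is minimal and the queue is unbounded.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical abstraction; callers do not see which strategy is in use. */
interface LogSink {
    void log(String message);
    void close();
}

/** Strategy 1: a lock ensures only one thread writes at a time. */
final class SynchronizedLogSink implements LogSink {
    private final Writer out;

    SynchronizedLogSink(Path file) throws IOException {
        out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII);
    }

    @Override
    public synchronized void log(String message) {
        try {
            out.write(message + System.lineSeparator());
            out.flush();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public synchronized void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

/** Strategy 2: producers enqueue; a single thread owns the file. */
final class QueueLogSink implements LogSink {
    private static final String STOP = new String("STOP"); // matched by identity
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private final Thread writer;

    QueueLogSink(final Path file) {
        writer = new Thread(new Runnable() {
            @Override
            public void run() {
                drainTo(file);
            }
        }, "log-writer");
        writer.start();
    }

    @Override
    public void log(String message) {
        queue.add(message); // unbounded queue, so producers never block
    }

    private void drainTo(Path file) {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (String m = queue.take(); m != STOP; m = queue.take()) {
                out.write(m + System.lineSeparator());
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() {
        queue.add(STOP);
        try {
            writer.join(); // wait until every queued line is on disk
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Worker threads would hold a single shared LogSink and call log(...); switching strategy then only changes the line that constructs the sink.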
Keep a common PrintStream reference where you'll write to (instead of System.out) and set it to System.out or channel it through to a FileOutputStream depending on what you want.
Your code won't change much (barely at all) and PrintStream is already synchronized too.
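A minimal sketch of the shared PrintStream idea follows, again in Java 7 syntax. The holder class SharedLogStream, its log field and the redirectTo helper are illustrative names, not part of any library; the example relies on each printf call being written as a unit by PrintStream, as the answer above describes.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

final class SharedLogStream {
    /** Every thread prints through this reference instead of System.out. */
    static volatile PrintStream log = System.out;

    /** Sends subsequent output to a file (append mode, auto-flush enabled). */
    static void redirectTo(String fileName) throws IOException {
        log = new PrintStream(new FileOutputStream(fileName, true), true, "US-ASCII");
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            redirectTo(args[0]); // otherwise output stays on the console
        }
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    log.printf("worker %d finished its batch%n", id);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        log.flush();
    }
}
```

Existing System.out.printf(...) calls would become SharedLogStream.log.printf(...), which is the only change the calling code needs.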

Signals in Java

I was using signals in C++:
sigaction
struct sigaction sigact;
and set all the attributes needed to use signals.
Now I want to do the same in Java. What is the Java equivalent
of including "signal.h"?
I have two threads:
one runs from the beginning of the program
and the other runs when the alarm signal arrives.
I implemented this functionality in C++ using signals as shown, and now I want to implement it in Java.
Edited to state my goal:
My goal is to run the second thread when the signal arrives from the first thread.
This sounds like a typical "XY-Problem".
In plain Java you have no access to OS signals. They are platform-specific, and Java strives to be platform-agnostic. Also: calling Java from a signal handler with JNI might be "fun" (as explained in Dwarf Fortress).
So you have to go back to the drawing board and think about what is the problem you want to solve and stop thinking about how to solve it with signals.
That said: if you insist on signals and are not afraid to use internal stuff which might change on a whim: Take a look at sun.misc.Signal.
EDIT: The question now makes it clear that the signalling takes place within one JVM. For this, signals are definitely the wrong tool in Java.
So the simplest solution is to create and start the second thread directly from within the first thread. No signalling required.
The next best solution is to code a "rendezvous point" using Object.wait() in the second thread (using any object instance but the Thread itself) and Object.notify() or notifyAll() from the first thread. Searching for these terms in a Java tutorial will bring up enough examples.
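The rendezvous described above might be coded roughly as follows. The class RendezvousSketch and its method names are invented for illustration; the boolean flag is an addition the answer does not mention, included because wait() can return spuriously and because signal() may run before the second thread starts waiting.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RendezvousSketch {
    private final Object lock = new Object();
    private boolean signalled; // guarded by lock

    /** Called by the second thread: blocks until signal() has been called. */
    void awaitSignal() throws InterruptedException {
        synchronized (lock) {
            while (!signalled) { // loop guards against spurious wakeups
                lock.wait();
            }
        }
    }

    /** Called by the first thread. */
    void signal() {
        synchronized (lock) {
            signalled = true;
            lock.notifyAll();
        }
    }

    /** Runs both threads once and returns the order of events. */
    static List<String> demo() throws InterruptedException {
        final RendezvousSketch rendezvous = new RendezvousSketch();
        final List<String> events = Collections.synchronizedList(new ArrayList<String>());
        Thread second = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    rendezvous.awaitSignal();
                    events.add("second thread: running after signal");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "second");
        second.start();
        events.add("first thread: signalling");
        rendezvous.signal();
        second.join();
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String event : demo()) {
            System.out.println(event);
        }
    }
}
```

For a one-shot signal like this, java.util.concurrent.CountDownLatch offers the same behaviour with less code and is worth considering instead of raw wait/notify.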

Use cases of PipedInputStream and PipedOutputStream

What are use cases of Piped streams? Why just not read data into buffer and then write them out?
BlockingQueue or similar collections may serve you better: they are thread-safe, robust, and scale better.
Pipes in Java IO provide the ability for two threads running in the same JVM to communicate. As such, pipes are a common source or destination of data.
This is useful if you have two long-running threads, one set up to produce data and the other to consume it.
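A small sketch of that producer/consumer arrangement: one thread writes lines into a PipedOutputStream while the calling thread reads them from the connected PipedInputStream. The class and method names are illustrative only, and the lambda assumes Java 8 or later.

```java
import static java.nio.charset.StandardCharsets.UTF_8;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PipedStreamsSketch {

    /** Sends the lines through a pipe from a producer thread to the calling thread. */
    static List<String> transfer(List<String> lines) throws IOException, InterruptedException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out); // connect before starting the producer

        Thread producer = new Thread(() -> {
            // Closing the writer closes the pipe, which the reader sees as end-of-stream.
            try (PrintWriter writer = new PrintWriter(new OutputStreamWriter(out, UTF_8))) {
                for (String line : lines) {
                    writer.println(line);
                }
            }
        }, "producer");
        producer.start();

        List<String> received = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, UTF_8))) {
            for (String line; (line = reader.readLine()) != null; ) {
                received.add(line);
            }
        }
        producer.join();
        return received;
    }

    public static void main(String[] args) throws Exception {
        for (String line : transfer(Arrays.asList("alpha", "beta", "gamma"))) {
            System.out.println(line);
        }
    }
}
```

The producer closes its end before it exits; a writer thread that dies without closing leaves the reader with a broken-pipe IOException, which is one of the threading pitfalls of these classes.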
As the other answers have said, they are designed for use between threads. In practice they are best avoided. I've used them once in 13 years and I wish I hadn't.
They are usually used for simultaneously reading and writing, usually by two different threads.
(Their design is quite bad. You can't switch threads at one end and then have that thread exit without disrupting the pipe.)
One advantage of using Piped streams is that they provide stream functionality in our code without compelling us to build new specialized streams.
For example, we can use pipes to create a simple logging facility for our application. We can send messages to the logging facility through an ordinary PrintWriter, and it can then do whatever processing or buffering is required before sending the message off to its final destination.
For more details, refer to: http://docstore.mik.ua/orelly/java/exp/ch08_01.htm
