How to run java for loop in parallel?


I am a newbie learning Selenium and wrote the Java code below. I am trying to run a for loop that is supposed to load the site 20 times. Right now it loops sequentially, and I want the iterations to run in parallel.
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.TimeoutException;
import org.openqa.selenium.firefox.FirefoxDriver;
public class lenders {
    //ExtentReports logger = ExtentReports.get(lenders.class);
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 20; i++) {
            FirefoxDriver driver = new FirefoxDriver();
            driver.manage().timeouts().pageLoadTimeout(1, TimeUnit.SECONDS);
            try {
                driver.get("https://www.google.com");
            } catch (TimeoutException e) {
                driver.quit();
            }
        }
    }
}
In the end I want 20 browsers open and loading the site at the same time, and then all of them killed.

If you are on Java 8, you can run a parallel for loop using a parallel stream:
IntStream.range(0, 20).parallel().forEach(i -> {
    // ... do something here
});

In general and at a high level, trying to run code in parallel within java means that you are trying to run multi-threaded code (more than one thread executing at one time).
As others have said in the comments, this answer comes with a warning. Multi-threading is a complicated topic in itself, and you should enter this territory with caution: there are many issues around "thread safety", and there is also the question of whether this is how you "should" approach the "business request" at all.
In any case, if you really want to create something multi-threaded, here are a few technical items to get you STARTED on the topic (and on your own further research). You could create another class that implements the Callable interface. By implementing that interface, it must have a "call" method. In that "call" method you put the actual logic that you want to run in parallel; in your case, everything involving the driver.
Then in your parent class (the one containing the code above), you can use an ExecutorService backed by a fixed thread pool that accepts this Callable class. It runs the "call" method in a separate thread, so your for loop continues while the logic in the "call" method executes. On the next iteration it uses another thread, and so on. You control how many threads are created through the thread pool, and there are different kinds of thread pools and services to choose from. Again, this is a really BIG topic. I am just trying to point you in a direction so you can research it further.
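The Callable-plus-thread-pool approach described above could look roughly like the sketch below. It is only an illustration: PageLoadTask and ParallelLoads are made-up names, and the Thread.sleep call stands in for the Selenium work (creating the FirefoxDriver, loading the page, quitting) so the example runs without a browser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoads {
    // Submits `count` tasks to a fixed pool of `poolSize` threads and waits for all of them.
    static List<String> runAll(int count, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                tasks.add(new PageLoadTask(i));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 20 tasks on 20 threads, matching the 20 browsers in the question.
        runAll(20, 20).forEach(System.out::println);
    }
}

// Hypothetical stand-in for the per-browser work from the question.
class PageLoadTask implements Callable<String> {
    private final int id;

    PageLoadTask(int id) {
        this.id = id;
    }

    @Override
    public String call() throws Exception {
        Thread.sleep(50); // simulate a slow page load
        return "load-" + id + " on " + Thread.currentThread().getName();
    }
}
```

A smaller pool size (say 4) would still complete all 20 tasks, just with at most 4 running at any moment.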
People might not like my answer because they think you should use completely different technologies rather than Selenium in this manner. I totally understand that point of view and don't disagree with those alternate answers. However, your question was "how to get this running at the same time", and I have tried to give you the building-block answer. Let me know if you need some links to tutorials, but googling "ExecutorService", "Thread Pool" and "Callable" (or combinations of them) with the words "java" and "tutorial" should get you a variety of answers on the topic.
I hope that helps!

Related

Read JSON files into collections, best practice

I'm working on a JavaFX application. I have several JSON files which I would like to read and insert into Collections in domain objects. I am using Gson to read these files at present. My application currently works; however, there is a long delay before it launches. I assume that this is because I'm reading these files sequentially and in the same Thread. Therefore, I am looking to improve the launch time by introducing some concurrency. I'm thinking that if I can figure out how to read the files in parallel, it should speed up the launch. I'm new to concurrency, so I'm trying to learn as I go. Needless to say, I've hit a few roadblocks and can't seem to find much information or many examples online.
Here are my issues:
Not sure if the JSON file reads can be done in a background thread.
Domain classes use these Collections to compute and eventually display values in the GUI. From my understanding, if you modify the GUI it has to be done in the JavaFX Application thread and not in the background. I'm not sure if loading data to be used in the GUI counts as modifying the GUI. I'm not directly updating any GUI Nodes like textField.setText("something") by reading Json, so I would assume no, I'm not. Am I wrong?
What is the difference between a Task<V> and Thread, or an ExecutorService and Callable<V>? Is one method preferred over the other? I've tried both and failed. When I tried using a task and background thread, I would get a NullPointerException because the app tried to access the collection before the files were read and initialized with data. It went from being too slow to being too fast. SMH.
To solve this problem, I heard about Preloaders. The idea here was to launch some sort of splash screen to delay until the loading of resources (reading of JSON files) was complete, then proceed to the main application. However, examples and information here are VERY scarce. I'm using Java 10 and IntelliJ, so I may have cornered myself into a one-in-a-million niche.
I'm not asking for anyone to solve my problem for me. I'm just a little lost and don't know where or how to proceed. I'll be happy to share specifics if needed but I think my issues are still conceptual at this point.
Help me StackOverflow you're my only hope.
edit: code example:
public class Employee {
private List<Employee> employeeList;
public Employee() {
employeeList = new ArrayList<>();
populateEmployees();
}
private final void populateEmployees() {
Task<Void> readEmployees = new Task<>() {
@Override
protected Void call() throws Exception {
System.out.println("Starting to read employee.json"); // #1
InputStream in = getClass().getResourceAsStream("/json/employee.json");
Reader reader = new InputStreamReader(in);
Type type = new TypeToken<List<Employee>>(){}.getType();
Gson gson = new Gson();
employeeList.addAll(gson.fromJson(reader, type));
System.out.println("employeeList has " + employeeList.size() + " elements"); // #2
return null;
}
};
readEmployees.run();
System.out.println(readEmployees.getMessage()); // #3
}
}
I see #1 printed to the console, never #2 or #3. How do I know that it ran all the way through the Task?

How much your app will speed up depends on how big those files are and how many of them there are. Keep in mind that creating threads also consumes resources. I can imagine a situation where you have plenty of files and create a new thread for each one, which could even make your app initialize more slowly.
If there are many files, or the number of files can change over time, you can set up a thread pool of a fixed size, e.g. 5 threads, that work on the file-reading tasks simultaneously.
Back to the question of whether it is worth using separate threads for reading files: I'd say yes, but only if your app has some initialization work that can be done without knowing the content of those files. Be aware that at some point you'll probably need to wait for the file parsing results.
As part of solving the problem, you can benchmark how long parsing each file takes; then you'll know what number of worker threads works best. E.g. you wouldn't create a thread for each file when parsing takes 1 second, but if you have 100 files with 1 second of processing time each, you can create a thread pool and divide the work equally among the threads.
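A minimal sketch of that idea, assuming a pool of 5 threads: parsing starts in the background, other initialization can run meanwhile, and the code blocks only when it actually needs the results. The parse method is a hypothetical placeholder for the Gson call, and the file names other than employee.json are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ParallelFileLoad {
    // Hypothetical stand-in for parsing one JSON file (e.g. with Gson).
    static String parse(String fileName) {
        try {
            Thread.sleep(100); // simulate I/O and parsing time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "parsed:" + fileName;
    }

    static List<String> loadAll(List<String> fileNames) {
        ExecutorService pool = Executors.newFixedThreadPool(5);
        try {
            // Start every parse in the background.
            List<CompletableFuture<String>> pending = fileNames.stream()
                    .map(name -> CompletableFuture.supplyAsync(() -> parse(name), pool))
                    .collect(Collectors.toList());

            // Initialization that does not need the file contents could run here.

            // join() blocks until that file's result is ready.
            return pending.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("employee.json", "department.json", "project.json")));
    }
}
```

Waiting on the futures before touching the collections is also what avoids the NullPointerException described in the question, where the app read the data before loading had finished.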
yes
I don't know JavaFX, but in general the concepts of Thread and Task are the same. A Thread gives you certainty that you're starting a new thread; it's a lower level of abstraction. A Task is a higher-level abstraction: you want to run part of your code separately and asynchronously, but you don't need to know which thread it will run on. In some programming languages, a Task actually hides a thread pool behind it.
Preloaders are fine, because they show the user that some work is being done in the background, so they won't worry that the application has frozen. On the other hand, if you can speed up the initialization process, that will be great. You can combine the two ideas, but remember, no one wants to wait a lot :)

Multiprocessing in Java with Killable thread

I have a scenario in which I am running unreliable code in Java (the scenario is not unlike this). I am providing the framework classes, and the intent is for the third party to override a base class method called doWork(). However, if the client's doWork() enters a funked state (such as an infinite loop), I need to be able to terminate the worker.
Most of the solutions (I've found this example and this example) revolve around a loop check for a volatile boolean:
while (keepRunning) {
//some code
}
or checking the interrupted status:
while (!isInterrupted()) {
//some code
}
However, neither of these solutions deals with the following in the '//some code' section:
for (int i = 0; i < 10; i++) {
i = i - 1;
}
I understand the reasons Thread.stop() was deprecated, and obviously, running faulty code isn't desirable, though my situation forces me to run code I can't verify myself. But I find it hard to believe Java doesn't have some mechanism for handling threads which get into an unacceptable state. So, I have two questions:
Is it possible to launch a Thread or Runnable in Java which can be reliably killed? Or does Java require cooperative multithreading to the point where a thread can effectively hose the system?
If not, what steps can be taken to pass live objects (such as active network connections) so that the code can be run in a Process instead, where I can actually kill it?

If you really don't want to spawn new processes (or probably cannot, due to the requirement of passing network connections), you can try to instrument the code of this 'plugin' when you load its class. I mean changing its bytecode so that it includes static calls to some utility method (e.g. ClientMentalHealthChecker.isInterrupted()). It's actually not that hard to do. Here you can find some tools that might help: https://java-source.net/open-source/bytecode-libraries. It won't be bulletproof, because there are other ways of blocking execution. Also keep in mind that client code can catch InterruptedExceptions.
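To make the idea concrete, here is a hand-written sketch of what the instrumented code would behave like. No bytecode library is used: the call to the guard is inserted manually where an instrumenter would inject it. InterruptGuard and InstrumentedLoopDemo are hypothetical names, and throwing IllegalStateException is just one possible way for the guard to abort the client code.

```java
class InstrumentedLoopDemo {
    // The runaway loop from the question, plus the call an instrumenter would inject.
    static void doWork() {
        for (int i = 0; i < 10; i++) {
            InterruptGuard.check(); // injected at the head of the loop body
            i = i - 1;
        }
    }

    // Runs doWork() on a new thread, interrupts it, and reports whether it stopped.
    static boolean runAndStop() throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                doWork();
            } catch (IllegalStateException stopped) {
                // The guard fired; let the thread end normally.
            }
        });
        worker.start();
        Thread.sleep(100);   // let the loop spin for a moment
        worker.interrupt();  // ask the worker to stop
        worker.join(2000);   // wait up to two seconds for it to finish
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stopped: " + runAndStop());
    }
}

// Hypothetical helper; the answer calls its version ClientMentalHealthChecker.
class InterruptGuard {
    // Throws an unchecked exception once the current thread has been interrupted.
    static void check() {
        if (Thread.currentThread().isInterrupted()) {
            throw new IllegalStateException("worker was asked to stop");
        }
    }
}
```

As the answer notes, this is not bulletproof: client code that catches the exception, or that blocks somewhere no guard call was injected, will still not stop.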

Trouble understanding Java threads

I learned about multiprocessing from Python and I'm having a bit of trouble understanding Java's approach. In Python, I can say I want a pool of 4 processes and then send a bunch of work to my program and it'll work on 4 items at a time. I realized, with Java, I need to use threads to achieve this same task and it seems to be working really really well so far.
But unlike in Python, my CPUs aren't getting 100% utilization (they are at about 70-80%), and I suspect it's the way I'm creating threads (the code is the same between Python and Java, and the processes are independent). In Java, I'm not sure how to create one thread, so I create a thread for every item in a list I want to process, like this:
for (int i = 0; i < 500; i++) {
Runnable task = new MyRunnable(10000000L + i);
Thread worker = new Thread(task);
// We can set the name of the thread
worker.setName(String.valueOf(i));
// Start the thread; never call the run() method directly
worker.start();
// Remember the thread for later usage
threads.add(worker);
}
I took it from here. My question: is this the correct way to launch threads, or is there a way to have Java itself manage the number of threads so it's optimal? I want my code to run as fast as possible, and I'm trying to understand how to detect and resolve any issues that may be arising from too many threads being created.
This is not a major issue; I'm just curious how it works under the Java hood.

You use an Executor, the implementation of which handles a pool of threads, decides how many, and so forth. See the Java tutorial for lots of examples.
In general, bare threads aren’t used in Java except for very simple things. Instead, there will be some higher-level API that receives your Runnable or Task and knows what to do.

Take a look at the Java Executor API. See this article, for example.
Although creating Threads is much 'cheaper' than it used to be, creating large numbers of threads (one per runnable as in your example) isn't the way to go - there's still an overhead in creating them, and you'll end up with too much context switching.
The Executor API allows you to create various types of thread pool for executing Runnable tasks, so you can reuse threads, flexibly manage the number that are created, and avoid the overhead of thread-per-runnable.
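As a rough illustration of the difference from the thread-per-item loop in the question, the sketch below pushes 500 small tasks through a fixed pool sized to the number of available processors and counts how many distinct threads actually ran them. PooledRunnables is a made-up name and the arithmetic inside each task is placeholder work standing in for MyRunnable.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PooledRunnables {
    // Returns {number of completed tasks, number of distinct threads used}.
    static int[] runTasks(int taskCount) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        Set<String> threadNames = ConcurrentHashMap.newKeySet();

        int poolSize = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            final long input = 10_000_000L + i;
            pool.execute(() -> {
                long sum = 0; // placeholder CPU work
                for (long n = 0; n < 10_000; n++) {
                    sum += n ^ input;
                }
                threadNames.add(Thread.currentThread().getName());
                completed.incrementAndGet();
            });
        }
        pool.shutdown();                            // accept no new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for queued tasks to finish
        return new int[] { completed.get(), threadNames.size() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = runTasks(500);
        System.out.println(result[0] + " tasks ran on " + result[1] + " threads");
    }
}
```

The pool size here is only a starting point for CPU-bound work; tasks that spend time waiting on I/O often benefit from a larger pool, which is something to measure rather than assume.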
The Java threading model and the Python threading model (not multiprocessing) are really quite similar, incidentally. However, Java has no Global Interpreter Lock as Python does, so there's usually less need to fork off multiple processes.

Thread is a "low-level" API.
Depending on what you want to do and the version of Java you use, there are better solutions.
If you use Java 7 and your task allows it, you can use the fork/join framework: http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
In any case, take a look at the Java concurrency tutorial: http://docs.oracle.com/javase/tutorial/essential/concurrency/executors.html
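For a feel of what fork/join looks like, here is a small sketch that sums an array by splitting it recursively. It only fits work that can be divided this way; ForkJoinSum is a made-up name and the threshold of 1,000 elements is an arbitrary choice for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cut-off for this sketch
    private final long[] data;
    private final int from;
    private final int to;

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: sum directly on the current thread.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split in half, fork one half, compute the other here.
        int mid = (from + to) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();
        return right.compute() + left.join();
    }

    static long sum(long[] data) {
        return new ForkJoinPool().invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 5000050000
    }
}
```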

Java - how to stop a thread running arbitrary code?

In my application which runs user submitted code[1] in separate threads, there might be some cases where the code might take very long to run or it might even have an infinite loop! In that case how do I stop that particular thread?
I'm not in control of the user code, so I cannot check for Thread.interrupted() from the inside. Nor can I use Thread.stop() carelessly. I also cannot put that code in separate processes.
So, is there any way to handle this situation?
[1] I'm using JRuby, and the user code is in ruby.

With the constraints you've provided:
User submitted code you have no control over.
Cannot force checks for Thread.interrupted().
Cannot use Thread.stop().
Cannot put the user code in a process jail.
The answer to your question is "no, there is no way of handling this situation". You've pretty much systematically designed things so that you have zero control over untrusted third-party code. This is ... a suboptimal design.
If you want to be able to handle anything, you're going to have to relax one (or preferably more!) of the above constraints.
Edited to add:
There might be a way around this for you without forcing your clients to change code if that is a(nother) constraint. Launch the Ruby code in another process and use some form of IPC mechanism to do interaction with your main code base. To avoid forcing the Ruby code to suddenly have to be coded to use explicit IPC, drop in a set of proxy objects for your API that do the IPC behind the scenes which themselves call proxy objects in your own server. That way your client code is given the illusion of working inside your server while you jail that code in its own process (which you can ultimately kill -9 as the ultimate sanction should it come to that).
Later you're going to want to wean your clients from the illusion since IPC and native calls are very different and hiding that behind a proxy can be evil, but it's a stopgap you can use while you deprecate APIs and move your clients over to the new APIs.

I'm not sure about the Ruby angle (or of the threading angle) of things here, but if you're running user-submitted code, you had best run it in a separate process rather than in a separate thread of the same process.
Rule number one: Never trust user input. Much less if the input is code!
Cheers

Usually you have a variable that indicates a thread should stop. Some other thread sets this variable to true, and the running thread periodically checks whether the variable is set.
But given that you can't change the user code, I am afraid there isn't a safe way of doing it.

For a running thread, Thread.interrupt() won't actually stop it, as sfussenegger mentioned earlier (thanks sfussenegger, I recalled this after reading the spec).
Use a shared variable to signal that the thread should stop what it is doing. The thread should check the variable periodically (e.g. in a while loop) and exit in an orderly manner.
private volatile boolean isExit = false; // volatile so the running thread sees the change
public void beforeExit() {
isExit= true;
}
public void run() {
while (!isExit) {
}
}

How to identify sections of code executed by multiple threads

I'm using NetBeans for Java development. I'm working on a multi-threaded application and I want to easily identify code sections which are executed by more than one thread. Is there an easy way to do that?
For example, whether some field or method of class ABC is used by more than one thread?

This is something that can only be determined at runtime.
You can call this method at the beginning of your methods to determine the calling thread.
public static void reportThread(String methodName) {
    // Log however you prefer (println here; a logging framework also works)
    System.out.println(methodName + " was run on thread: " + Thread.currentThread().getName());
}

In general, it is not possible to do this statically, i.e. by inspection of the code. (The problem is undecidable due to the halting problem.)
Your only option is to do the analysis at runtime, that is, to log which actual thread executes which method. You have a few options. Here are two that come to mind immediately.
Add System.out.println(Thread.currentThread()) to the interesting methods
Use, for example, AspectJ to do something similar.
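Instead of reading through println output by hand, the runtime logging can be collected and summarized. The sketch below is one possible way to do that, not something either answer prescribes: ThreadUsageRecorder, recordCall and sharedMethods are made-up names, and each method you want to watch would need a recordCall line added at its start.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

class ThreadUsageRecorder {
    // Method name -> names of the threads that have executed it.
    private static final Map<String, Set<String>> THREADS_BY_METHOD = new ConcurrentHashMap<>();

    // Call this at the start of every method you want to watch.
    static void recordCall(String methodName) {
        THREADS_BY_METHOD
                .computeIfAbsent(methodName, k -> ConcurrentHashMap.newKeySet())
                .add(Thread.currentThread().getName());
    }

    // Returns only the methods that were executed by more than one thread.
    static Map<String, Set<String>> sharedMethods() {
        Map<String, Set<String>> shared = new TreeMap<>();
        THREADS_BY_METHOD.forEach((method, threads) -> {
            if (threads.size() > 1) {
                shared.put(method, threads);
            }
        });
        return shared;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> {
            recordCall("ABC.update");
            recordCall("ABC.onlyA");
        }, "worker-a");
        Thread b = new Thread(() -> recordCall("ABC.update"), "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(sharedMethods().keySet()); // prints [ABC.update]
    }
}
```

This only reports what happened during a particular run, so a method that is shared in some rarely taken code path can still go unnoticed.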
