All,
What is a really simple way of having a program do more than one thing at once, even if the computer does not necessarily have multiple 'cores'? Can I do this by creating more than one Thread?
My goal is to be able to have two computers networked (through Sockets) to respond to each-other's requests, while my program will at the same time be able to be managing a UI. I want the server to potentially handle more than one client at the same time as well.
My understanding is that the communication is done with BufferedReader.readLine() and PrintWriter.println(). My problem is that I want the server to be waiting on multiple readLine() requests, and also be doing other things. How do I handle this?
Many thanks,
Jonathan
Yes, you can do this by having multiple threads inside your Java program.
As the mechanisms in Java get rather complicated when you do this, have a look at the appropriate section in the Java Tutorial:
http://java.sun.com/docs/books/tutorial/essential/concurrency/
Yes, just create multiple threads. They will run concurrently, whether or not the processor has multiple cores. (With a single core, the OS simply suspends the execution of the running thread at certain points and runs another thread for a while, so in effect, multiple ones seem to be running at the same time).
Here's a good concurrency tutorial: http://java.sun.com/docs/books/tutorial/essential/concurrency/
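For instance, here is a minimal sketch of starting two threads that run at the same time (the class and thread names are just made up for illustration):

public class TwoThreads {
    public static void main(String[] args) throws InterruptedException {
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 3; i++) {
                    System.out.println(Thread.currentThread().getName() + " step " + i);
                }
            }
        };
        Thread a = new Thread(work, "thread-a");
        Thread b = new Thread(work, "thread-b");
        a.start();   // both threads now run concurrently with main
        b.start();
        a.join();    // wait for them to finish before exiting
        b.join();
    }
}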
The standard Java tutorial for Sockets is a good start. I wrote the exact program you are describing using this as a base. The last point on the page "Supporting Multiple Clients" describes how threads are implemented.
http://java.sun.com/docs/books/tutorial/networking/sockets/clientServer.html
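As a rough sketch of that thread-per-client pattern (not the tutorial's exact code; the port number and echo logic are placeholders): the accept loop hands each connection to its own thread, so the server can serve several clients while its main thread stays free.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class MultiClientServer {
    public static void main(String[] args) throws IOException {
        ServerSocket serverSocket = new ServerSocket(4444); // example port
        while (true) {
            final Socket client = serverSocket.accept();    // blocks until a client connects
            new Thread(new Runnable() {
                public void run() {
                    try {
                        BufferedReader in = new BufferedReader(
                                new InputStreamReader(client.getInputStream()));
                        PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                        String line;
                        while ((line = in.readLine()) != null) {
                            out.println("echo: " + line);   // respond to each request
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    } finally {
                        try { client.close(); } catch (IOException ignored) {}
                    }
                }
            }).start();
        }
    }
}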
Have a look at this page: http://www.ashishmyles.com/tutorials/tcpchat/index.html -- it gives a good description of threads, UI details, etc, and gives a chat example where they merge the two together.
Also, consider using Apache MINA. It's quite lightweight, doesn't rely on any external libraries (apart from slf4j), and makes it very easy to get data from sockets without having to poll in a loop; it is also non-blocking (or blocking when you need it to be). So, you have a class which implements IoHandler and you register that with an acceptor or some other MINA connection class. It then notifies you when packets are received. It handles all the usually painful backend work for you in a pleasant way (i.e., manually creating multiple threads for clients and then managing them).
It also has codec support, where you can transform sent and received messages. So, say you want to receive Java objects on either end of your connection -- this will do the conversion for you. Perhaps you also want to zip them up to make it more efficient? You can write that too, adding that to the chain below the object codec.
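A rough sketch of what that looks like, assuming MINA 2.x class names (NioSocketAcceptor, IoHandlerAdapter, ObjectSerializationCodecFactory); the port and reply are just examples:

import java.net.InetSocketAddress;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
import org.apache.mina.filter.codec.ProtocolCodecFilter;
import org.apache.mina.filter.codec.serialization.ObjectSerializationCodecFactory;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

public class MinaServerSketch {
    public static void main(String[] args) throws Exception {
        NioSocketAcceptor acceptor = new NioSocketAcceptor();
        // Object serialization codec: sent/received Java objects are (de)serialized for you.
        acceptor.getFilterChain().addLast("codec",
                new ProtocolCodecFilter(new ObjectSerializationCodecFactory()));
        acceptor.setHandler(new IoHandlerAdapter() {
            @Override
            public void messageReceived(IoSession session, Object message) {
                session.write("got: " + message);   // reply to the client
            }
        });
        acceptor.bind(new InetSocketAddress(12345)); // example port
    }
}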
Can I do this by creating more than one Thread?
What is a really simple way of having a program do more than one thing at once, even if the computer does not necessarily have multiple 'cores'? Can I do this by creating more than one Thread?
If you have a single core, then officially only one task can be executed at a time. But because your computer's processor is so fast and executes so many instructions per second, it creates the illusion that it is doing multiple tasks simultaneously, even though in each small time slice it only executes one task. You can create this illusion in Java by creating threads, which get scheduled by your operating system to run for short periods of time.
My advice is to have a look at the java.util.concurrent package, because it contains a lot of helpful tools that make working with threads much easier (back in the days when this package did not exist, it was a lot harder). For example, I like to use
ExecutorService es = Executors.newCachedThreadPool();
to create a thread pool which I can submit tasks to, to run simultaneously. Then, when I have a task which I would like to run, I call
es.execute(runnable);
where runnable looks like:
Runnable runnable = new Runnable() {
    public void run() {
        // code to run.
    }
};
For example say you run the following code:
package mytests;

import java.util.Date;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * @author alfred
 */
public class Main {

    /**
     * @param args the command line arguments
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        final CountDownLatch latch = new CountDownLatch(2);
        final long start = System.nanoTime();
        ExecutorService es = Executors.newCachedThreadPool();
        Runnable runnable = new Runnable() {
            public void run() {
                sleep(1);
                System.out.println(new Date());
                latch.countDown();
            }
        };
        es.submit(runnable);
        es.submit(runnable);
        latch.await(); // waits until latch.countDown() has been called 2 times.
        // 1 nanosecond is equal to 1/1000000000 of a second.
        long total = (System.nanoTime() - start) / 1000000;
        System.out.println("total time: " + total);
        es.shutdown();
    }

    public static void sleep(int i) {
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException ie) {}
    }
}
The output would look like
run:
Fri Apr 02 03:34:14 CEST 2010
Fri Apr 02 03:34:14 CEST 2010
total time: 1055
BUILD SUCCESSFUL (total time: 1 second)
But I ran 2 tasks that each ran for at least 1 second (because of the sleep of 1 second). If I had run those 2 tasks sequentially it would have taken at least 2 seconds, but because I used threads it only took about 1 second. This is what you wanted, and it is easily accomplished using the java.util.concurrent package.
I want the server to potentially handle more than one client at the same time as well.
My goal is to be able to have two computers networked (through Sockets) to respond to each-other's requests, while my program will at the same time be able to be managing a UI. I want the server to potentially handle more than one client at the same time as well.
I would advise you to have a look at the Netty framework (it was developed by the creator of MINA, but in my opinion Netty is better, with more active development):
The Netty project is an effort to provide an asynchronous event-driven network application framework and tools for rapid development of maintainable high performance & high scalability protocol servers & clients.
It will do all the heavy lifting for you. When I read the user guide I was totally amazed by Netty. It uses NIO, which is the newer way to do I/O for highly concurrent servers and scales much better.
My problem is that I want the server to be waiting on multiple readLine() requests, and also be doing other things
My understanding is that the communication is done with BufferedReader.readLine() and PrintWriter.println(). My problem is that I want the server to be waiting on multiple readLine() requests, and also be doing other things. How do I handle this?
Again, when you look into Netty's user guide and examples you will find that it does all the heavy lifting for you in an efficient way. You only have to specify some simple callbacks to get the data from the clients.
Hopefully this has answered all your questions. If not, leave a comment and I will try to explain it better.
Related
I'm working on a JavaFX application. I have several JSON files which I would like to read and insert into Collections in domain objects. I am using Gson to read these files at present. My application currently is working, however, there is a long delay before the application launches. I assume that this is because I'm reading these files sequentially and in the same Thread. Therefore, I am looking to enhance the launch time by introducing some concurrency. I'm thinking if I can figure out how to read the files in parallel it should speed up the launch time. I'm new to the idea of concurrency so I'm trying to learn as I go. Needless to say, I've hit a few roadblocks and can't seem to find much information or examples online.
Here are my issues:
Not sure if the JSON file reads can be done in a background thread.
Domain classes use these Collections to compute and eventually display values in the GUI. From my understanding, if you modify the GUI it has to be done in the JavaFX Application thread and not in the background. I'm not sure if loading data to be used in the GUI counts as modifying the GUI. I'm not directly updating any GUI Nodes like textField.setText("something") by reading Json, so I would assume no, I'm not. Am I wrong?
What is the difference between a Task and a Thread, or an ExecutorService and a Callable? Is one method preferred over the other? I've tried both and failed. When I tried using a Task and a background thread, I would get a NullPointerException because the app tried to access the collection before the files were read and initialized with data. It went from being too slow to being too fast. SMH.
To solve this problem, I heard about Preloaders. The idea here was to launch some sort of splash screen to delay until the loading of resources (reading of JSON files) was complete, then proceed to the main application. However, the examples or information here is VERY scarce. I'm using Java 10 and IntelliJ, so I may have cornered myself into a one in a million niche.
I'm not asking for anyone to solve my problem for me. I'm just a little lost and don't know where or how to proceed. I'll be happy to share specifics if needed but I think my issues are still conceptual at this point.
Help me StackOverflow you're my only hope.
edit: code example:
public class Employee {

    private List<Employee> employeeList;

    public Employee() {
        employeeList = new ArrayList<>();
        populateEmployees();
    }

    private final void populateEmployees() {
        Task<Void> readEmployees = new Task<>() {
            @Override
            protected Void call() throws Exception {
                System.out.println("Starting to read employee.json"); // #1
                InputStream in = getClass().getResourceAsStream("/json/employee.json");
                Reader reader = new InputStreamReader(in);
                Type type = new TypeToken<List<Employee>>(){}.getType();
                Gson gson = new Gson();
                employeeList.addAll(gson.fromJson(reader, type));
                System.out.println("employeeList has " + employeeList.size() + " elements"); // #2
                return null;
            }
        };
        readEmployees.run();
        System.out.println(readEmployees.getMessage()); // #3
    }
}
I see #1 printed to the console, never #2 or 3. How do I know that it processed all through the Task?
How much your app will speed up depends on how big those files are and how many of them there are. You should know that creating threads is also a resource-consuming task. I can imagine a situation where you have plenty of files and you create a new thread for each one, which could even make your app initialize more slowly.
In the case of a large number of files, or a number of files that can change over time, you can arrange a thread pool of constant size, e.g. 5 threads, which work simultaneously on the file-reading tasks.
Back to the problem and the question of whether it is worth using separate threads for reading files: I'd say yes, but only if your app has some initialization work which can be done without knowing the content of those files. You should be aware that at some point in time you'll probably need to wait for the file parsing results.
As part of the problem solving you can run a benchmark to check how long parsing each file takes, and then you'll know what configuration/number of worker threads is best. E.g. you won't create a thread per file when parsing takes 1 second, but if you have 100 files with 1 second of processing time each, you can create a thread pool and divide the job equally between the threads.
Yes.
I don't know JavaFX, but in general the concepts of Thread and Task are the same. A Thread gives you the certainty that you're starting a new thread; it's a lower level of abstraction. A Task is a higher abstraction: you want to run part of your code separately and asynchronously, but you don't care which thread it runs on. Some programming languages actually hide a thread pool behind the Task concept. See the sketch after this answer for one way to combine the two in JavaFX.
Preloaders are fine, because they show the user that some job is being done in the background, so he won't worry that the application has frozen. On the other hand, if you can speed up the initialization process, even better. You can combine those two ideas, but remember, no one wants to wait long :)
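For the Task question, here is a hedged sketch of one common arrangement: do the parsing inside a javafx.concurrent.Task, start it on a background Thread (calling run() directly, as in the posted code, keeps everything on the calling thread), and copy the result into your collection in setOnSucceeded, which runs on the JavaFX Application Thread. The EmployeeLoader class and the readJsonFile helper are made up for illustration; real code would do the Gson parsing there.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javafx.concurrent.Task;

public class EmployeeLoader {
    private final List<String> employeeList = new ArrayList<>();   // stand-in for the domain list

    public void loadInBackground() {
        Task<List<String>> loadTask = new Task<>() {
            @Override
            protected List<String> call() throws Exception {
                return readJsonFile("/json/employee.json");  // placeholder for the Gson parsing
            }
        };
        // setOnSucceeded runs on the JavaFX Application Thread, so it is safe to
        // touch collections that back the GUI here.
        loadTask.setOnSucceeded(event -> employeeList.addAll(loadTask.getValue()));
        new Thread(loadTask, "json-loader").start();   // run() would stay on the caller's thread
    }

    private List<String> readJsonFile(String resource) {
        return Collections.emptyList();   // hypothetical helper; real code would use Gson here
    }
}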
Scenario:
I want to test a communication between 2 devices. They communicate by frames.
I start up the application (on device 1) and I send a number of frames (each frame contains a unique (int) ID). Device 2 receives each frame and sends an acknowledgement (echoing the ID back), or it doesn't (when the frame got lost).
When device 1 receives the ACK I want to compare the time it took to send and receive the ACK back.
From looking around SO
How do I measure time elapsed in Java?
System.nanoTime() is probably the best way to measure the elapsed time. However, this is all happening in different threads according to the classic producer-consumer pattern, where one thread (on device 1) is always reading and another manages the process (and also writes the frames). Now, thank you for bearing with me; my question is:
Question: I need to convey the unique ID from the ACK frame from the reading thread to the managing thread. I've done some research and this seems to be a good candidate for a wait/notify system, or not? Or perhaps I just need a shared array that contains data for each frame sent? But then how does the managing thread know it happened?
Context: I want to compare these times because I want to research what factors can hamper communication.
Why don't you just populate a shared map with <unique id, timestamp> pairs? You can expire old entries by periodically removing entries older than a certain amount.
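A minimal sketch of that idea, assuming the frame ID is an int and using System.nanoTime() as the timestamp (the class and method names are made up): the writing thread records the send time, the reading thread looks it up when the ACK arrives, and a periodic sweep expires entries for lost frames.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AckTimes {
    private final Map<Integer, Long> sentAt = new ConcurrentHashMap<>();

    public void frameSent(int frameId) {                  // called by the writing thread
        sentAt.put(frameId, System.nanoTime());
    }

    public long ackReceived(int frameId) {                // called by the reading thread
        Long start = sentAt.remove(frameId);
        return start == null ? -1 : System.nanoTime() - start;   // round-trip time in ns
    }

    public void expireOlderThan(long maxAgeNanos) {       // periodic cleanup of lost frames
        long now = System.nanoTime();
        for (Iterator<Map.Entry<Integer, Long>> it = sentAt.entrySet().iterator(); it.hasNext();) {
            if (now - it.next().getValue() > maxAgeNanos) {
                it.remove();
            }
        }
    }
}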
I suggest you reformulate your problem with tasks (Callable). Create a task for the writer and one for the reader role. Submit these in pairs in an ExecutorService and let the Java concurrency framework handle the concurrency for you. You only have to think about what will be the result of a task and how would you want to use it.
// Pseudo code
ExecutorService exc = Executors.newCachedThreadPool();
Future<List<Timestamp>> readerFuture = exc.submit(new ReaderRole(sentFrameNum));
Future<List<Timestamp>> writerFuture = exc.submit(new WriterRole(sentFrameNum));
List<Timestamp> writeResult = writerFuture.get(); // wait for the completion of writing
List<Timestamp> readResult = readerFuture.get();  // wait for the completion of reading
This is pretty complex stuff, but it is much cleaner and more stable than a custom-developed synchronization solution.
Here is a pretty good tutorial for the Java concurrency framework: http://www.vogella.com/articles/JavaConcurrency/article.html#threadpools
The question might look like a troll, but it is actually about how vert.x manages concurrency, since a verticle itself runs in a dedicated thread.
Let's look at this simple vert.x http server written in Java:
import org.vertx.java.core.Handler;
import org.vertx.java.core.http.HttpServerRequest;
import org.vertx.java.platform.Verticle;
public class Server extends Verticle {
    public void start() {
        vertx.createHttpServer().requestHandler(new Handler<HttpServerRequest>() {
            public void handle(HttpServerRequest req) {
                req.response().end("Hello");
            }
        }).listen(8080);
    }
}
As far as I understand the docs, this whole file represents a verticle. So the start method is called within the dedicated verticle thread, so far so good. But where is the requestHandler invoked ? If it is invoked on exactly this thread I can't see where it is better than node.js.
I'm pretty familiar with Netty, which is the network/concurrency library vert.x is based on. Every incoming connection is mapped to a dedicated thread, which scales quite nicely. So... does this mean that incoming connections represent verticles as well? But how can the verticle instance "Server" then communicate with those clients? In fact I would say that this concept is as limited as Node.js.
Please help me to understand the concepts right!
Regards,
Chris
I've talked to someone who is quite involved in vert.x and he told me that I'm basically right about the "concurrency" issue.
BUT: He showed me a section in the docs which I totally missed where "Scaling servers" is explained in detail.
The basic concept is that when you write a verticle you just have single-core performance. But it is possible to start the vert.x platform using the -instance parameter, which defines how many instances of a given verticle are run. Vert.x does a bit of magic under the hood so that 10 instances of my server do not try to open 10 server sockets but actually share a single one instead. This way vert.x is horizontally scalable even for single verticles.
This is really a great concept and especially a great framework!!
Every verticle is single-threaded; upon startup the vert.x subsystem assigns an event loop to that verticle. All code in that verticle will be executed on that event loop. Next time you should ask questions in http://groups.google.com/forum/#!forum/vertx; the group is very lively and your question will most likely be answered immediately.
As you correctly answered yourself, vert.x indeed uses async non-blocking programming (like node.js), so you can't do blocking operations because you would otherwise stop the whole (application) world from turning.
You can scale servers as you correctly stated, by spawning more (n = CPU cores) verticle instances, each trying to listen on the same TCP/HTTP port.
Where it shines compared to node.js is that the JVM itself is multi-threaded, which gives you more advantages (from the runtime point of view, not including type safety of Java etc etc):
Multithreaded (cross-verticle) communication, while still being constrained to a thread-safe Actor-like model, does not require IPC (Inter Process Communication) to pass messages between verticles; everything happens inside the same process and the same memory region, which is faster than node.js spawning every forked task as a new system process and using IPC to communicate.
Ability to do compute-heavy and/or blocking tasks within the same JVM process: http://vertx.io/docs/vertx-core/java/#blocking_code or http://vertx.io/docs/vertx-core/java/#worker_verticles
Speed of HotSpot JVM compared to V8 :)
How can we do Parallel Programming in Java? Is there any special framework for that? How can we make the stuff work?
I will tell you guys what I need: think of a web crawler that crawls a lot of data from the internet. One crawling system will not make things work properly, so I need more systems working in parallel. If this is the case, can I apply parallel computing? Can you guys give me an example?
If you are asking about pure parallel programming, i.e. not concurrent programming, then you should definitely try MPJ Express http://mpj-express.org/. It is a thread-safe implementation of mpiJava and it supports both distributed and shared memory models. I have tried it and found it very reliable.
 1  import mpi.*;
 2
 3  /**
 4   * Compile: impl specific.
 5   * Execute: impl specific.
 6   */
 7
 8  public class Send {
 9
10      public static void main(String[] args) throws Exception {
11
12          MPI.Init(args);
13
14          int rank = MPI.COMM_WORLD.Rank(); // The current process.
15          int size = MPI.COMM_WORLD.Size(); // Total number of processes
16          int peer;
17
18          int buffer[] = new int[10];
19          int len = 1;
20          int dataToBeSent = 99;
21          int tag = 100;
22
23          if (rank == 0) {
24
25              buffer[0] = dataToBeSent;
26              peer = 1;
27              MPI.COMM_WORLD.Send(buffer, 0, len, MPI.INT, peer, tag);
28              System.out.println("process <" + rank + "> sent a msg to " +
29                      "process <" + peer + ">");
30
31          } else if (rank == 1) {
32
33              peer = 0;
34              Status status = MPI.COMM_WORLD.Recv(buffer, 0, buffer.length,
35                      MPI.INT, peer, tag);
36              System.out.println("process <" + rank + "> recv'ed a msg\n" +
37                      "\tdata <" + buffer[0] + "> \n" +
38                      "\tsource <" + status.source + "> \n" +
39                      "\ttag <" + status.tag + "> \n" +
40                      "\tcount <" + status.count + ">");
41
42          }
43
44          MPI.Finalize();
45
46      }
47
48  }
One of the most common functionalities provided by messaging libraries like MPJ Express is support for point-to-point communication between executing processes. In this context, two processes belonging to the same communicator (for instance the MPI.COMM_WORLD communicator) may communicate with each other by sending and receiving messages. A variant of the Send() method is used to send the message from the sender process. On the other hand, the sent message is received by the receiver process using a variant of the Recv() method. Both sender and receiver specify a tag that is used to find matching incoming messages at the receiver side.
After initializing the MPJ Express library using the MPI.Init(args) method on line 12, the program obtains its rank and the size of the MPI.COMM_WORLD communicator. Both processes initialize an integer array of length 10 called buffer on line 18. The sender process (rank 0) stores the value to be sent (99) in the first element of the buffer array. A variant of the Send() method is used to send an element of the buffer array to the receiver process.
The sender process calls the Send() method on line 27. The first three arguments are related to the data being sent. The sending buffer (the buffer array) is the first argument, followed by 0 (offset) and 1 (count). The data being sent is of MPI.INT type and the destination is 1 (the peer variable); the datatype and destination are specified as the fourth and fifth arguments to the Send() method. The sixth and last argument is the tag variable. A tag is used to identify messages at the receiver side; a message tag is typically an identifier of a particular message in a specific communicator.
On the other hand the receiver process (rank 1) receives the message using the blocking receive method.
Java supports threads, thus you can have a multi-threaded Java application. I strongly recommend the book Concurrent Programming in Java: Design Principles and Patterns for that:
http://java.sun.com/docs/books/cp/
You want to look at the Java Parallel Processing Framework (JPPF)
You can have a look at Hadoop and the Hadoop Wiki. This is an Apache framework inspired by Google's MapReduce. It enables you to do distributed computing across multiple systems. Many companies like Yahoo and Twitter use it (Sites Powered By Hadoop). Check this book for more information on how to use it: Hadoop Book.
In Java, parallel processing is done using threads, which are part of the runtime library.
The Concurrency Tutorial should answer a lot of questions on this topic if you're new to java and parallel programming.
As far as I know, on most operating systems Java's threading mechanism is based on real kernel threads. This is good from the parallel programming perspective. Other languages like Python simply do some time multiplexing of the processor (namely, if you run a heavy multithreaded application on a multiprocessor machine, you'll see only one processor running).
You can easily find something just googling it: by example this is the first result for "java threading":
http://download-llnw.oracle.com/javase/tutorial/essential/concurrency/
Basically it boils down to extending the Thread class, overriding the "run" method with the code belonging to the other thread, and calling the "start" method on an instance of the class you extended.
Also, if you need to make something thread-safe, have a look at synchronized methods.
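For example, a tiny illustration of a synchronized method guarding shared state:

public class Counter {
    private int count = 0;

    public synchronized void increment() {  // only one thread at a time may enter
        count++;                            // without synchronized, concurrent increments could be lost
    }

    public synchronized int get() {
        return count;
    }
}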
This is the parallel programming resource I've been pointed to in the past:
http://www.jppf.org/
I have no idea whether it's any good or not, just that someone recommended it a while ago.
I have heard about one at conference a few years ago - ParJava. But I'm not sure about the current status of the project.
Read the section on threads in the Java tutorial. http://download-llnw.oracle.com/javase/tutorial/essential/concurrency/procthread.html
The java.util.concurrent package and the Brian Goetz book "Java Concurrency in Practice".
There is also a lot of resources here about parallel patterns by Ralph Johnson (one of the GoF design pattern author) :
http://parlab.eecs.berkeley.edu/wiki/patterns/patterns
Is the Ateji PX parallel-for loop what you're looking for ?
This will crawl all sites in parallel (notice the double bar next to the for keyword) :
for||(Site site : sites) {
    crawl(site);
}
If you need to compose the results of crawling, then you'll probably want to use a parallel comprehension, such as :
Set result = set for||{ crawl(site) | Site site : sites }
Further reading here : http://www.ateji.com/px/whitepapers/Ateji%20PX%20for%20Java%20v1.0.pdf
You might want to check out Hadoop. It's designed to have jobs running over an arbitrary amount of boxes and takes care of all the bookkeeping for you. It's inspired by Google's MapReduce and their related tools and so it even comes from web indexing.
Have you looked at this:
http://www.javacodegeeks.com/2013/02/java-7-forkjoin-framework-example.html?ModPagespeed=noscript
The Fork / Join Framework?
I am also trying to learn a bit about this.
Parallelism
Parallelism means that an application splits its tasks up into smaller subtasks which can be processed in parallel, for instance on multiple CPUs at the exact same time.
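A minimal Fork/Join sketch along those lines (the range-sum task and the 1,000-element threshold are just illustrative choices): the task splits its range until chunks are small, forks one half, computes the other, then joins the results.

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class RangeSum extends RecursiveTask<Long> {
    private final long from, to;

    public RangeSum(long from, long to) { this.from = from; this.to = to; }

    @Override
    protected Long compute() {
        if (to - from <= 1_000) {                 // small enough: compute directly
            long sum = 0;
            for (long i = from; i < to; i++) sum += i;
            return sum;
        }
        long mid = (from + to) / 2;
        RangeSum left = new RangeSum(from, mid);
        left.fork();                              // run the left half asynchronously
        long right = new RangeSum(mid, to).compute();
        return right + left.join();               // wait for the left half and combine
    }

    public static void main(String[] args) {
        long total = new ForkJoinPool().invoke(new RangeSum(0, 10_000_000));
        System.out.println("sum = " + total);
    }
}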
you can use JCSP (http://www.cs.kent.ac.uk/projects/ofa/jcsp/) the library implements CSP (Communicating Sequential Processes) principles in Java, parallelisation is abstracted from thread level and you deal instead with processes.
Java SE 5 and 6 introduced a set of packages in java.util.concurrent.* which provide powerful concurrency building blocks.
check this for more information.
http://www.oracle.com/technetwork/articles/java/fork-join-422606.html
You might try Parallel Java 2 Library.
On the website Prof. Alan Kaminsky wrote:
Fast forward to 2013, when I started developing PJ2. Parallel computing had expanded far beyond what it was a decade earlier. Multicore parallel computers were equipped with many more CPU cores and much larger main memory, such that computations that used to require a whole cluster now could be done on a single multicore node. New kinds of parallel computing hardware had become commonplace, notably graphics processing unit (GPU) accelerators. Cloud computing services, such as Amazon's EC2, allowed anyone to run parallel programs on a virtual supercomputer with thousands of cores. New application areas for parallel computing had opened up, notably big data analytics. New parallel programming APIs had arisen, such as OpenCL and NVIDIA Corporation's CUDA for GPU parallel programming, and map-reduce frameworks like Apache's Hadoop for big data computing. To explore and take advantage of all these trends, I decided that a completely new Parallel Java 2 Library was needed.
In early 2013 when PJ2 wasn't yet available (although an earlier version was), I tried Java Parallel Processing Framework (JPPF). JPPF was okay but at first glance PJ2 looks interesting.
There is a library called Habanero-Java (HJ), developed at Rice University that was built using lambda expressions and can run on any Java 8 JVM.
HJ-lib integrates a wide range of parallel programming constructs (e.g., async tasks, futures, data-driven tasks, forall, barriers, phasers, transactions, actors) in a single programming model that enables unique combinations of these constructs (e.g., nested combinations of task and actor parallelism).
The HJ runtime is responsible for orchestrating the creation, execution, and termination of HJ tasks, and features both work-sharing and work-stealing schedulers. You can follow the tutorial to set it up on your computer.
Here is a simple HelloWorld example:
import static edu.rice.hj.Module1.*;

public class HelloWorld {
    public static void main(final String[] args) {
        launchHabaneroApp(() -> {
            finish(() -> {
                async(() -> System.out.println("Hello World - 1!"));
                async(() -> System.out.println("Hello World - 2!"));
                async(() -> System.out.println("Hello World - 3!"));
                async(() -> System.out.println("Hello World - 4!"));
            });
        });
    }
}
Each async method runs in parallel with the other async methods, while the content within these methods runs sequentially. The program doesn't continue until all the code within the finish block completes.
Short answer with example library
If you are interested in parallel processing using Java, I would recommend you to give a try to Hazelcast Jet.
No more words needed from my side. Just check the website and learn from their examples. It gives you a pretty solid background and idea of what it means to process data in parallel.
https://jet.hazelcast.org/
I'm writing a Java program which uses a lot of CPU because of the nature of what it does. However, lots of it can run in parallel, and I have made my program multi-threaded. When I run it, it only seems to use one CPU until it needs more, then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?
There are two basic ways to multi-thread in Java. Each logical task you create with these methods should run on a fresh core when needed and available.
Method one: define a Runnable or Thread object (which can take a Runnable in the constructor) and start it running with the Thread.start() method. It will execute on whatever core the OS gives it -- generally the less loaded one.
Tutorial: Defining and Starting Threads
Method two: define objects implementing the Runnable (if they don't return values) or Callable (if they do) interface, which contain your processing code. Pass these as tasks to an ExecutorService from the java.util.concurrent package. The java.util.concurrent.Executors class has a bunch of methods to create standard, useful kinds of ExecutorServices. Link to Executors tutorial.
From personal experience, the Executors fixed & cached thread pools are very good, although you'll want to tweak thread counts. Runtime.getRuntime().availableProcessors() can be used at run-time to count available cores. You'll need to shut down thread pools when your application is done, otherwise the application won't exit because the ThreadPool threads stay running.
Getting good multicore performance is sometimes tricky, and full of gotchas:
Disk I/O slows down a LOT when run in parallel. Only one thread should do disk read/write at a time.
Synchronization of objects provides safety to multi-threaded operations, but slows down work.
If tasks are too trivial (small work bits, execute fast), the overhead of managing them in an ExecutorService costs more than you gain from multiple cores.
Creating new Thread objects is slow. The ExecutorServices will try to re-use existing threads if possible.
All sorts of crazy stuff can happen when multiple threads work on something. Keep your system simple and try to make tasks logically distinct and non-interacting.
One other problem: controlling work is hard! A good practice is to have one manager thread that creates and submits tasks, and then a couple working threads with work queues (using an ExecutorService).
I'm just touching on key points here -- multithreaded programming is considered one of the hardest programming subjects by many experts. It's non-intuitive, complex, and the abstractions are often weak.
Edit -- Example using ExecutorService:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public abstract class TaskThreader {

    class DoStuff implements Callable<Object> {
        Object in;

        public DoStuff(Object input) {
            in = input;
        }

        public Object call() {
            in = doStep1(in);
            in = doStep2(in);
            in = doStep3(in);
            return in;
        }
    }

    public abstract Object doStep1(Object input);
    public abstract Object doStep2(Object input);
    public abstract Object doStep3(Object input);
    public abstract void write(Object result);   // application-specific output step

    public void runAll(List<Object> inputs) throws Exception {
        // one worker thread per available core
        ExecutorService exec = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
        for (Object input : inputs) {
            tasks.add(new DoStuff(input));
        }
        List<Future<Object>> results = exec.invokeAll(tasks);
        exec.shutdown();
        for (Future<Object> f : results) {
            write(f.get());
        }
    }
}
When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?
I interpret this part of your question as meaning that you have already addressed the problem of making your application multi-thread capable. And despite that, it doesn't immediately start using multiple cores.
The answer to "is there any way to force ..." is (AFAIK) not directly. Your JVM and/or the host OS decide how many 'native' threads to use, and how those threads are mapped to physical processors. You do have some options for tuning. For example, I found this page which talks about how to tune Java threading on Solaris. And this page talks about other things that can slow down a multi-threaded application.
First, you should prove to yourself that your program would run faster on multiple cores. Many operating systems put effort into running program threads on the same core whenever possible.
Running on the same core has many advantages. The CPU cache is hot, meaning that data for that program is loaded into the CPU. The lock/monitor/synchronization objects are in CPU cache which means that other CPUs do not need to do cache synchronization operations across the bus (expensive!).
One thing that can very easily make your program run on the same CPU all the time is over-use of locks and shared memory. Your threads should not talk to each other. The less often your threads use the same objects in the same memory, the more often they will run on different CPUs. The more often they use the same memory, the more often they must block waiting for the other thread.
Whenever the OS sees one thread block for another thread, it will run that thread on the same CPU whenever it can. It reduces the amount of memory that moves over the inter-CPU bus. That is what I guess is causing what you see in your program.
First, I'd suggest reading "Java Concurrency in Practice" by Brian Goetz.
This is by far the best book describing concurrent java programming.
Concurrency is 'easy to learn, difficult to master'. I'd suggest reading plenty about the subject before attempting it. It's very easy to get a multi-threaded program to work correctly 99.9% of the time, and fail 0.1%. However, here are some tips to get you started:
There are two common ways to make a program use more than one core:
Make the program run using multiple processes. An example is Apache compiled with the Pre-Fork MPM, which assigns requests to child processes. In a multi-process program, memory is not shared by default. However, you can map sections of shared memory across processes. Apache does this with its 'scoreboard'.
Make the program multi-threaded. In a multi-threaded program, all heap memory is shared by default. Each thread still has its own stack, but can access any part of the heap. Typically, most Java programs are multi-threaded, and not multi-process.
At the lowest level, one can create and destroy threads. Java makes it easy to create threads in a portable cross platform manner.
As it tends to get expensive to create and destroy threads all the time, Java now includes Executors to create re-usable thread pools. Tasks can be assigned to the executors, and the result can be retrieved via a Future object.
Typically, one has a task which can be divided into smaller tasks, but the end results need to be brought back together. For example, with a merge sort, one can divide the list into smaller and smaller parts, until one has every core doing the sorting. However, as each sublist is sorted, it needs to be merged in order to get the final sorted list. Since this is "divide-and-conquer" issue is fairly common, there is a JSR framework which can handle the underlying distribution and joining. This framework will likely be included in Java 7.
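As a small illustration of the executor/Future part of that (the task and values are just stand-ins for real work): submit a Callable to a pool sized to the core count and collect the result through a Future.

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureExample {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        Future<Integer> result = pool.submit(new Callable<Integer>() {
            public Integer call() {
                return 6 * 7;                              // stand-in for real work
            }
        });
        System.out.println("result = " + result.get());    // blocks until the task completes
        pool.shutdown();
    }
}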
There is no way to set CPU affinity in Java. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4234402
If you have to do it, use JNI to create native threads and set their affinity.
You should write your program to do its work in the form of a lot of Callable's handed to an ExecutorService and executed with invokeAll(...).
You can then choose a suitable implementation at runtime from the Executors class. A suggestion would be to call Executors.newFixedThreadPool() with a number roughly corresponding to the number of cpu cores to keep busy.
The easiest thing to do is break your program into multiple processes. The OS will allocate them across the cores.
Somewhat harder is to break your program into multiple threads and trust the JVM to allocate them properly. This is -- generally -- what people do to make use of available hardware.
Edit
How can a multi-processing program be "easier"? Here's a step in a pipeline.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class SomeStep {
    public static void main(String[] args) throws IOException {
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        BufferedWriter stdout = new BufferedWriter(new OutputStreamWriter(System.out));
        String line = stdin.readLine();
        while (line != null) {
            // process line, writing to stdout
            line = stdin.readLine();
        }
        stdout.flush();
    }
}
Each step in the pipeline is similarly structured: a few lines of overhead around whatever processing is included.
This may not be the absolute most efficient. But it's very easy.
The overall structure of your concurrent processes is not a JVM problem. It's an OS problem, so use the shell.
java -cp pipline.jar FirstStep | java -cp pipline.jar SomeStep | java -cp pipline.jar LastStep
The only thing left is to work out some serialization for your data objects in the pipeline.
Standard Serialization works well. Read http://java.sun.com/developer/technicalArticles/Programming/serialization/ for hints on how to serialize. You can replace the BufferedReader and BufferedWriter with ObjectInputStream and ObjectOutputStream to accomplish this.
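A hedged sketch of what the same pipeline step might look like with object streams instead of line-oriented readers and writers (the pass-through "processing" is a placeholder):

import java.io.EOFException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class ObjectStep {
    public static void main(String[] args) throws Exception {
        // Create and flush the output side first so the downstream step can read the
        // stream header; then open the input side to read from the upstream step.
        ObjectOutputStream out = new ObjectOutputStream(System.out);
        out.flush();
        ObjectInputStream in = new ObjectInputStream(System.in);
        try {
            while (true) {
                Object item = in.readObject(); // next object from the previous step
                out.writeObject(item);         // process it, then pass it downstream
                out.flush();
            }
        } catch (EOFException endOfInput) {
            // the previous step closed its output: this step is done
        }
    }
}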
I think this issue is related to the Java Parallel Processing Framework (JPPF). Using this you can run different jobs on different processors.
JVM performance tuning has been mentioned before in Why does this Java code not utilize all CPU cores?. Note that this only applies to the JVM, so your application must already be using threads (and more or less "correctly" at that):
http://ch.sun.com/sunnews/events/2009/apr/adworkshop/pdf/5-1-Java-Performance.pdf
With Java 8 you can use the following API from Executors:
public static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Due to the work-stealing mechanism, idle threads steal tasks from the task queues of busy threads, and overall throughput increases.
From grepcode, the implementation of newWorkStealingPool is as follows:
/**
 * Creates a work-stealing thread pool using all
 * {@link Runtime#availableProcessors available processors}
 * as its target parallelism level.
 * @return the newly created thread pool
 * @see #newWorkStealingPool(int)
 * @since 1.8
 */
public static ExecutorService newWorkStealingPool() {
    return new ForkJoinPool
        (Runtime.getRuntime().availableProcessors(),
         ForkJoinPool.defaultForkJoinWorkerThreadFactory,
         null, true);
}
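A small usage sketch (the squaring task is just a stand-in for real CPU-bound work): submit independent tasks to the work-stealing pool and collect the results through Futures.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkStealingExample {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newWorkStealingPool();
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            final int n = i;
            results.add(pool.submit(new Callable<Integer>() {
                public Integer call() {
                    return n * n;                 // stand-in for CPU-bound work
                }
            }));
        }
        for (Future<Integer> f : results) {
            System.out.println(f.get());          // waits for each task to finish
        }
        pool.shutdown();
    }
}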