native errors on DatagramChannel send - java

I have an app that is sending packets using DatagramChannel.send in multiple threads, each to its own IP address/port, and each keeping a constant bit-rate/bandwidth. Every now and then I get this error:
java.net.SocketException: Invalid argument: no further information
at sun.nio.ch.DatagramChannelImpl.send0(Native Method)
at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
...
It happens at random, sometimes 5 minutes after start, sometimes after a day, so I really have problems reproducing it for testing. And on my home machine I can't reproduce it at all.
Environments
Windows 7, 8 and Server 2012 (all 64bit)
64bit Java 7 update 45
More information
The app is sending SI/EIT data to a DVB-C network. I'm creating a list of 188-byte arrays for each of 80-120 threads and giving it to the thread to use. The thread takes the list and loops over it until a new list is provided.
The error usually happens on multiple channels at once, but it can also happen on just one.
The error never happened until we had 40+ threads.
The error happens while looping over the list, not when I'm binding a new list to a thread.
The app is not running out of memory. It's usually using up to 70% of the memory given to the JVM.
Strange part: if I run multiple instances of the app, each handling ~10 threads, the problems are the same.
Simplified code sample
for (int i = 0; i < 100; ++i) {
    final int id = i;
    new Thread(new Runnable() {
        @Override
        public void run() {
            final Random r = new Random();
            final List<byte[]> buffer = Lists.newArrayList();
            for (int i = 0; i < 200; ++i) {
                final byte[] temp = new byte[188];
                r.nextBytes(temp);
                buffer.add(temp);
            }
            final SocketAddress target = new InetSocketAddress("230.0.0.18", 1000 + id);
            try (final DatagramChannel channel = DatagramChannel.open(StandardProtocolFamily.INET)) {
                channel.configureBlocking(false);
                channel.setOption(StandardSocketOptions.IP_MULTICAST_IF, NetworkInterface.getByName("eth0"));
                channel.setOption(StandardSocketOptions.IP_MULTICAST_TTL, 8);
                channel.setOption(StandardSocketOptions.SO_REUSEADDR, true);
                channel.setOption(StandardSocketOptions.SO_SNDBUF, 1024 * 64);
                int counter = 0;
                int index = 0;
                while (true) {
                    final byte[] item = buffer.get(index);
                    channel.send(ByteBuffer.wrap(item), target);
                    index = (index + 1) % buffer.size();
                    counter++;
                    Thread.sleep(1);
                }
            } catch (Exception e) {
                LOG.error("Fail at " + id, e);
            }
        }
    }).start();
}
Edits:
1) @EJP: I'm setting the multicast properties because the actual app that I use was doing joins (and reading some data). But the problem persisted even after I removed them.
2) Should I be using some other API if I just need to send UDP packets? All the samples I could find use DatagramChannel (or its older alternative); a sketch with the older API follows below.
3) I'm still stuck with this. If anyone has an idea what I can even try, please let me know.
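For reference, this is roughly what the send loop would look like with the older blocking MulticastSocket API (untested sketch; the group address, port and interface name are placeholders copied from the sample above):
import java.net.DatagramPacket;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;

public class OldApiSender {
    public static void main(String[] args) throws Exception {
        InetSocketAddress target = new InetSocketAddress("230.0.0.18", 1018);
        MulticastSocket socket = new MulticastSocket();
        try {
            socket.setNetworkInterface(NetworkInterface.getByName("eth0"));
            socket.setTimeToLive(8);
            socket.setSendBufferSize(64 * 1024);
            byte[] payload = new byte[188];
            DatagramPacket packet = new DatagramPacket(payload, payload.length, target);
            while (true) {
                socket.send(packet);  // one 188-byte datagram per millisecond, as in the sample above
                Thread.sleep(1);
            }
        } finally {
            socket.close();
        }
    }
}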

I had exactly the same problem, and it was caused by a zero port in the target InetSocketAddress when calling the send method.
In your code the target port is defined as 1000 + id, so that doesn't seem to be the problem. Anyway, I'd log the target parameters that are used when the exception is thrown, just in case.
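The logging I mean would look roughly like this, reusing the question's channel, item, target and LOG names (just a sketch):
try {
    channel.send(ByteBuffer.wrap(item), target);
} catch (IOException e) {
    // Log the exact arguments of the failing call so a bad address/port (e.g. port 0) shows up.
    LOG.error("send failed: target=" + target + ", payload=" + item.length + " bytes", e);
    throw e;
}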

Related

ZeroMQ: Disappearing messages

We have a Java application which is acting as a server. Client applications (written in C#) are communicating with it using ZeroMQ. We are (mostly) following the Lazy Pirate pattern.
The server has a Router socket, implemented as follows (using JeroMQ):
ZContext context = new ZContext();
Socket socket = context.createSocket(ZMQ.ROUTER);
socket.bind("tcp://*:5555");
The clients connect and send messages like this:
ZContext context = ZContext.Create();
ZSocket socket = ZSocket.Create(context, ZSocketType.REQ);
socket.Identity = Encoding.UTF8.GetBytes("Some identity");
socket.Connect("tcp://my_host:5555");
socket.Send(new ZFrame("request data"));
We have experienced lost messages when multiple clients are sending messages at the same time. With a single client, there doesn't appear to be any problem.
Are we implementing this the right way for a multiple-client-single-server setup?
Update: Example client and server exhibiting this behaviour:
Server:
import org.zeromq.ZContext;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.PollItem;
import org.zeromq.ZMQ.Poller;
import org.zeromq.ZMQ.Socket;
import org.zeromq.ZMsg;

public class SimpleServer
{
    public static void main(String[] args) throws InterruptedException
    {
        ZContext context = new ZContext();
        Socket socket = context.createSocket(ZMQ.ROUTER);
        socket.setRouterMandatory(true);
        socket.bind("tcp://*:5559");
        PollItem pollItem = new PollItem(socket, Poller.POLLIN);
        int messagesReceived = 0;
        int pollCount = 0;
        while ((pollCount = ZMQ.poll(new PollItem[]{pollItem}, 3000)) > -1)
        {
            messagesReceived += pollCount;
            for (int i = 0; i < pollCount; i++)
            {
                ZMsg msg = ZMsg.recvMsg(socket);
                System.out.println(String.format("Received message: %s. Total messages received: %d", msg, messagesReceived));
            }
            if (pollCount == 0)
            {
                System.out.println(String.format("No messages on socket. Total messages received: %d", messagesReceived));
            }
        }
    }
}
Client:
using NetMQ;
using System;
using System.Text;

namespace SimpleClient
{
    class Program
    {
        static byte[] identity = Encoding.UTF8.GetBytes("id" + DateTime.UtcNow.Ticks);

        static void Main(string[] args)
        {
            for (int i = 0; i < 100; i++)
            {
                SendMessage();
            }
        }

        private static void SendMessage()
        {
            using (NetMQContext context = NetMQContext.Create())
            {
                using (NetMQSocket socket = context.CreateRequestSocket())
                {
                    socket.Options.Identity = identity;
                    socket.Connect("tcp://localhost:5559");
                    socket.Send(Encoding.UTF8.GetBytes("hello!"));
                }
            }
        }
    }
}
If I run the server and a single client, I see all 100 messages arrive. If I run, say, 5 clients simultaneously, only around 200-300 messages arrive instead of the full 500. As an aside, it appears that closing the socket in the client somehow briefly stops the router socket on the server from receiving messages, although this is just a theory.
Part 1 - poll may return more than one event
ZMQ.poll() returns the number of events that were found:
int rc = ZMQ.poll(new PollItem[]{pollItem}, 3000);
You currently assume that one return from poll is one event. Instead, you should loop over ZMsg msg = ZMsg.recvMsg(socket); for the number of events that are indicated by the return of ZMQ.Poll().
From the source of JeroMQ:
/**
 * Polling on items. This has very poor performance.
 * Try to use zmq_poll with selector
 * CAUTION: This could be affected by jdk epoll bug
 *
 * @param items
 * @param timeout
 * @return number of events
 */
public static int zmq_poll(PollItem[] items, long timeout)
{
    return zmq_poll(items, items.length, timeout);
}
Part 2 - ZMsg.receive() may return multiple frames
When you receive a ZMsg from ZMsg msg = ZMsg.recvMsg(socket);, the ZMsg may contain multiple ZFrames, each containing client data.
From the comments of the ZMsg class in JeroMQ's source:
* // Receive message from ZMQSocket "input" socket object and iterate over frames
* ZMsg receivedMessage = ZMsg.recvMsg(input);
* for (ZFrame f : receivedMessage) {
* // Do something with frame f (of type ZFrame)
* }
Part 3 - messages can be split across multiple ZFrames
From ZFrame's source in JeroMQ:
* The ZFrame class provides methods to send and receive single message
* frames across 0MQ sockets. A 'frame' corresponds to one underlying zmq_msg_t in the libzmq code.
* When you read a frame from a socket, the more() method indicates if the frame is part of an
* unfinished multipart message.
If I'm understanding this correctly, then for each event you may get multiple frames, and one client message may map to 1..N frames (if the message is big?).
So to summarize:
One return from poll may indicate multiple events.
One event and thus one ZMsg.receive() may contain multiple frames
One frame could contain one complete client message or only part of a client message; one client message maps to 1..N frames.
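Putting the three parts together, the receive side might look roughly like this (a sketch only, reusing the socket and JeroMQ types from the question's server; not tested):
// Sketch: poll, drain one ZMsg per reported event, then walk every frame of each message.
PollItem pollItem = new PollItem(socket, Poller.POLLIN);
while (true) {
    int events = ZMQ.poll(new PollItem[]{pollItem}, 3000);
    for (int i = 0; i < events; i++) {
        ZMsg msg = ZMsg.recvMsg(socket);
        for (ZFrame frame : msg) {          // identity frame, delimiter, payload, ...
            byte[] data = frame.getData();
            // reassemble / handle the client message here
        }
        msg.destroy();
    }
}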
Unfortunately we couldn't solve this particular issue, and have moved away from using ZeroMQ for this interface. In case it helps anyone else, the only things we worked out for definite is that rapidly opening/closing the request sockets caused undesirable behaviour (lost messages) on the router socket end. The problem was exacerbated by a poorly performing server CPU, and didn't appear at all when the server was on a fast multi-core machine.
Unfortunately I was nowhere near working with ZMQ at the time this question was active. But I had the same problem today and found this page, and your answer (not using ZMQ) was not satisfying for me. So I searched a bit more and finally found out what to do.
Just as a reminder: this works with the poller in ZMQ [1].
If you use a PAIR connection you will for sure NOT lose any files, BUT send/receive takes approximately the same time, so you cannot speed things up; that was not a solution for me.
Solution:
With zmq_setsockopt (Python: zmq.setsockopt) you can set ZMQ_HWM (zmq.SNDHWM, zmq.RCVHWM) to 0 [2].
In Python: sock.setsockopt(zmq.SNDHWM, 0) for the sender and sock.setsockopt(zmq.RCVHWM, 0) for the receiver.
Note: I think the notation changed from HWM to SNDHWM/RCVHWM.
HWM = 0 means there is no limit on the number of queued messages (so be careful; maybe set a very high limit instead).
There is also ZMQ_SNDBUF/ZMQ_RCVBUF (Python: zmq.SNDBUF/zmq.RCVBUF) which you can set as well, e.g. sock.setsockopt(zmq.RCVBUF, 0) [2]; this sets the operating system SO_RCVBUF to its default (here my knowledge ends).
Setting this parameter or not did NOT influence my case, but I think it might matter in others.
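Since the rest of this question is Java/JeroMQ, the equivalent would be something like the following sketch, assuming your JeroMQ version exposes setSndHWM/setRcvHWM (older versions only have setHWM; the PUSH/PULL socket types here are just placeholders, use whatever types your setup needs):
// Sketch: raise the high-water marks so ZeroMQ queues instead of silently dropping.
// 0 means "no limit", which can eat all your RAM, so consider a large finite value.
ZContext context = new ZContext();

Socket sender = context.createSocket(ZMQ.PUSH);
sender.setSndHWM(0);

Socket receiver = context.createSocket(ZMQ.PULL);
receiver.setRcvHWM(0);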
Performance:
With this I could "send" 100,000 files of 98 kB each (~10 GB) in about 8 s; this fills your RAM (and if the RAM is completely full I think your program will slow down), see also the picture below.
In the meantime I "received" and saved the files in about 118 s, freeing the RAM again.
Also, with this I have NEVER lost a file so far. (You might if you hit the limits of your PC.)
When is data loss acceptable?
If you really NEED all the data, you should use this method.
If some losses are fine (e.g. live plotting: as long as your FPS is above ~50 you will see the plots smoothly and you won't care if you lose something), you can skip it and save RAM, avoiding blocking your whole PC.
Hope this post helps the next person coming by...
[1]: https://learning-0mq-with-pyzmq.readthedocs.io/en/latest/pyzmq/multisocket/zmqpoller.htm
[2]: http://api.zeromq.org/2-1:zmq-setsockopt
[Picture of the RAM usage: the RAM fills up in about 8 s; afterwards the disk saves the files from the buffer.]

Efficient way to allot the next available VM

The method getNextAvailableVm() allots virtual machines for a particular data center in a round-robin fashion. (The integer returned by this method is the machine allotted.)
In a data center there can be virtual machines with different configurations. For example:
5 VMs with 1024 memory
4 VMs with 512 memory
Total: 9 VMs
For this data center, a machine with 1024 memory will get tasks 2 times as often as a machine with 512 memory.
So machines for this data center are returned by getNextAvailableVm() in the following order:
0 0 1 1 2 2 3 3 4 4 5 6 7 8
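Just to make that ordering concrete, here is a tiny standalone illustration (the class name is made up; this is not the CloudSim code, only the weighted expansion described above, with weight = memory / 512):
import java.util.ArrayList;
import java.util.List;

public class WeightedOrderDemo {
    public static void main(String[] args) {
        int[] vmCounts = {5, 4};                  // 5 VMs with 1024 memory, 4 VMs with 512 memory
        int[] weights  = {1024 / 512, 512 / 512}; // weights 2 and 1
        List<Integer> order = new ArrayList<>();
        int vmId = 0;
        for (int group = 0; group < vmCounts.length; group++) {
            for (int vm = 0; vm < vmCounts[group]; vm++, vmId++) {
                for (int w = 0; w < weights[group]; w++) {
                    order.add(vmId);
                }
            }
        }
        System.out.println(order); // [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8]
    }
}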
This is the current way the machines are being returned. But there is a problem.
There could be cases when a particular machine is busy and cannot be allotted the task. Instead, the next available machine with the highest memory must be allotted the task. I have not been able to implement this.
For example:
0 (allotted the first time)
0 (to be allotted the second time)
but if 0 is busy...
allot 1 if 1 is not busy
on the next cycle check if 0 is busy
if not busy, allot 0 (only when machine 0 has not yet handled the requests it is entitled to handle)
if busy, allot the next one
The cloudSimEventFired method in the following class is called whenever a machine gets freed or is allotted.
public class TempAlgo extends VmLoadBalancer implements CloudSimEventListener {
    /**
     * Key : Name of the data center
     * Value : List of objects of class 'VmAllocationUIElement'.
     */
    private Map<String, LinkedList<DepConfAttr>> confMap = new HashMap<String, LinkedList<DepConfAttr>>();
    private Iterator<Integer> availableVms = null;
    private DatacenterController dcc;
    private boolean sorted = false;
    private int currentVM;
    private boolean calledOnce = false;
    private boolean indexChanged = false;
    private LinkedList<Integer> busyList = new LinkedList<Integer>();
    private Map<String, LinkedList<AlgoAttr>> algoMap = new HashMap<String, LinkedList<AlgoAttr>>();
    private Map<String, AlgoHelper> map = new HashMap<String, AlgoHelper>();
    private Map<String, Integer> vmCountMap = new HashMap<String, Integer>();

    public TempAlgo(DatacenterController dcb) {
        confMap = DepConfList.dcConfMap;
        this.dcc = dcb;
        dcc.addCloudSimEventListener(this);
        if (!this.calledOnce) {
            this.calledOnce = true;
            // Make a new map using dcConfMap that lists 'DataCenter' as a 'key' and 'LinkedList<AlgoAttr>' as 'value'.
            Set<String> keyst = DepConfList.dcConfMap.keySet();
            for (String dataCenter : keyst) {
                LinkedList<AlgoAttr> tmpList = new LinkedList<AlgoAttr>();
                LinkedList<DepConfAttr> list = dcConfMap.get(dataCenter);
                int totalVms = 0;
                for (DepConfAttr o : list) {
                    tmpList.add(new AlgoAttr(o.getVmCount(), o.getMemory() / 512, 0));
                    totalVms = totalVms + o.getVmCount();
                }
                Temp_Algo_Static_Var.algoMap.put(dataCenter, tmpList);
                Temp_Algo_Static_Var.vmCountMap.put(dataCenter, totalVms);
            }
            this.algoMap = new HashMap<String, LinkedList<AlgoAttr>>(Temp_Algo_Static_Var.algoMap);
            this.vmCountMap = new HashMap<String, Integer>(Temp_Algo_Static_Var.vmCountMap);
            this.map = new HashMap<String, AlgoHelper>(Temp_Algo_Static_Var.map);
        }
    }

    @Override
    public int getNextAvailableVm() {
        synchronized (this) {
            String dataCenter = this.dcc.getDataCenterName();
            int totalVMs = this.vmCountMap.get(dataCenter);
            AlgoHelper ah = (AlgoHelper) this.map.get(dataCenter);
            int lastIndex = ah.getIndex();
            int lastCount = ah.getLastCount();
            LinkedList<AlgoAttr> list = this.algoMap.get(dataCenter);
            AlgoAttr aAtr = (AlgoAttr) list.get(lastIndex);
            indexChanged = false;
            if (lastCount < totalVMs) {
                if (aAtr.getRequestAllocated() % aAtr.getWeightCount() == 0) {
                    lastCount = lastCount + 1;
                    this.currentVM = lastCount;
                    if (aAtr.getRequestAllocated() == aAtr.getVmCount() * aAtr.getWeightCount()) {
                        lastIndex++;
                        if (lastIndex != list.size()) {
                            AlgoAttr aAtr_N = (AlgoAttr) list.get(lastIndex);
                            aAtr_N.setRequestAllocated(1);
                            this.indexChanged = true;
                        }
                        if (lastIndex == list.size()) {
                            lastIndex = 0;
                            lastCount = 0;
                            this.currentVM = lastCount;
                            AlgoAttr aAtr_N = (AlgoAttr) list.get(lastIndex);
                            aAtr_N.setRequestAllocated(1);
                            this.indexChanged = true;
                        }
                    }
                }
                if (!this.indexChanged) {
                    aAtr.setRequestAllocated(aAtr.getRequestAllocated() + 1);
                }
                this.map.put(dataCenter, new AlgoHelper(lastIndex, lastCount));
                //System.out.println("Current VM : " + this.currentVM + " for data center : " + dataCenter);
                return this.currentVM;
            }
        }
        System.out.println("--------Before final return statement---------");
        return 0;
    }

    @Override
    public void cloudSimEventFired(CloudSimEvent e) {
        if (e.getId() == CloudSimEvents.EVENT_CLOUDLET_ALLOCATED_TO_VM) {
            int vmId = (Integer) e.getParameter(Constants.PARAM_VM_ID);
            busyList.add(vmId);
            System.out.println("+++++++++++++++++++Machine with vmID : " + vmId + " attached");
        } else if (e.getId() == CloudSimEvents.EVENT_VM_FINISHED_CLOUDLET) {
            int vmId = (Integer) e.getParameter(Constants.PARAM_VM_ID);
            busyList.remove(Integer.valueOf(vmId)); // remove by value, not by index
            //System.out.println("+++++++++++++++++++Machine with vmID : " + vmId + " freed");
        }
    }
}
In the above code, all the lists are already sorted with the highest memory first. The whole idea is to balance the load by allocating more tasks to machines with higher memory.
Each time a machine is allotted, its request-allocated count is incremented by one. Each set of machines has a weight count attached to it, calculated by dividing the allotted memory by 512.
The method getNextAvailableVm() is called by multiple threads at a time. For 3 data centers, 3 threads will simultaneously call getNextAvailableVm(), but on different class objects. The data center returned by this.dcc.getDataCenterName() in the same method is chosen according to the data center broker policy selected earlier.
How do I make sure that the machine I am currently returning is free, and if it is not free, how do I allot the next machine with the highest available memory? I also have to make sure that a machine entitled to process X tasks does eventually process X tasks, even if that machine is currently busy.
This is a general description of the data structures used here:
The code of this class is hosted here on GitHub.
This is the link to the complete project on GitHub.
Most of the data structures/classes used here are inside this package.
Perhaps you are overthinking the problem. A simple strategy is to have a broker which is aware of all the pending tasks. Each worker machine or thread asks the broker for a new message/task to work on, and the broker gives out work in the order it was asked for. This is how JMS queues work. For the machines which can handle two tasks you can start two threads.
There are many standard JMS implementations which do this, but I suggest looking at ActiveMQ as it is simple to get started with.
Note: in your case, a simpler solution is to have one machine with 8 GB of memory. You can buy 8 GB for a server for very little ($40 - $150 depending on the vendor) and it will be used more efficiently in one instance by sharing resources. I assume you are looking at much larger instances; instances smaller than 8 GB are better off just being upgraded.
How do I make sure that the machine I am currently returning is free
This is your scenario; if you don't know how to tell whether a machine is free, I don't see how anyone else would have more knowledge of your application.
and if the machine is not free I allot the next machine with highest memory available.
You need to look at the free machines and pick the one with the most available memory. I don't see what the catch is here, other than doing what you have stated.
I also have to make sure that the machine that is entitled to process X tasks, does process X tasks even that machine is currently busy.
You need a data source or store for this information: what is allowed to run where. In JMS you would have multiple queues and only pass certain queues to the machines which can process those queues.
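To make the broker idea concrete, here is a minimal sketch using a plain BlockingQueue as a stand-in for a JMS queue (ActiveMQ would replace this in a real setup); the class name and thread counts are made up, with the per-VM weights taken from the question:
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy sketch: tasks sit in one queue and each worker thread pulls the next task
// only when it is actually free, so busy machines never have to be skipped explicitly.
public class BrokerSketch {
    public static void main(String[] args) {
        final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<Runnable>();
        for (int i = 0; i < 100; i++) {
            final int taskId = i;
            tasks.add(new Runnable() {
                @Override
                public void run() {
                    System.out.println("running task " + taskId);
                }
            });
        }
        // 2 threads for each 1024-memory VM, 1 thread for each 512-memory VM
        int[] threadsPerVm = {2, 2, 2, 2, 2, 1, 1, 1, 1};
        for (int vm = 0; vm < threadsPerVm.length; vm++) {
            final int vmId = vm;
            for (int t = 0; t < threadsPerVm[vm]; t++) {
                new Thread(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            while (true) {
                                Runnable task = tasks.take(); // blocks until work is available
                                System.out.println("VM " + vmId + " picked up a task");
                                task.run();
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }).start();
            }
        }
    }
}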

How do I count the number of times I ping a host in JSP?

Here is my code, which gives me a result of either true or false depending on whether it can ping the host I mention in it:
try
{
    InetAddress address = InetAddress.getByName("192.168.1.125");
    boolean reachable = address.isReachable(10000);
    out.print(PingHost.DrawTable());
    out.print("Is host reachable? " + reachable);
}
catch (Exception e)
{
    e.printStackTrace();
}
I want to count the number of times it tries to ping the host if it does not ping successfully the first time, and the maximum number of ping attempts should be 10.
Hoping for your suggestions.
Thanks in advance.
final static int MAX_PINGS = 10;
final static int TIMEOUT = 10000;

int countFailed = 0;
for (int i = 0; i < MAX_PINGS; i++) {
    if (address.isReachable(TIMEOUT)) {
        System.out.println("Pinged successfully");
        break;
    } else {
        countFailed++;
    }
}
Note: giving 10000 ms (10 seconds) as the timeout is too much. I suggest it should be around 1000 ms.
Assuming that address.isReachable(10000) is doing the ping and returns true or false, then you want something like this:
int counter = 0;
do
{
    counter++;
    if (address.isReachable(10000))
    {
        break;
    }
}
while (counter < 10);
// now counter contains the number of attempts
I think you'd do well to find a good book on programming; coming up with a solution like this should not be something you need to ask about.
I would first question why this code needs to reside in a JSP. A request to this JSP will take forever to get back to you if the host is unreachable. Any solution that uses a member variable to track the count will also be problematic, since it will run into concurrency issues.
You are better off writing LaceySnr's code in a servlet and spawning that code on a separate thread.
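For example, something along these lines (a rough sketch only, assuming the javax.servlet API is available; the servlet class name and the request parameter are made up, and error handling is minimal):
import java.io.IOException;
import java.net.InetAddress;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch: kick the ping loop off the request thread so the page returns immediately.
public class PingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        final String host = req.getParameter("host");
        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    InetAddress address = InetAddress.getByName(host);
                    int attempts = 1;
                    while (!address.isReachable(1000) && attempts < 10) {
                        attempts++;
                    }
                    // store 'attempts' somewhere the UI can poll, e.g. a shared map keyed by host
                } catch (IOException e) {
                    // log and record the failure
                }
            }
        }).start();
        resp.getWriter().println("Ping started for " + host);
    }
}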

Check servers for active Webserver fast (multithreaded)

I want to check a huge number (thousands) of websites to see if they are still running, because I want to get rid of unnecessary entries in my wiki page about hosts files.
I want to do it in a 2-stage process:
Check if something is running on port 80
Check the HTTP response code (if it's not 200 I have to check the site)
I want to multithread, because if I want to check thousands of addresses, I can't wait for timeouts.
This question is just about step one.
I have the problem that ~1/4 of my connect attempts don't work. If I retry the failed ones, about ~3/4 of them work. Do I not close the sockets correctly? Do I run into a limit of open sockets?
By default I run 16 threads, but I have the same problems with 8 or 4.
Is there something I'm missing?
I have simplified the code a little.
Here is the code of the Thread
public class SocketThread extends Thread {
    int tn;
    int n;
    String[] s;
    private ArrayList<String> good;
    private ArrayList<String> bad;

    public SocketThread(int tn, int n, String[] s) {
        this.tn = tn;
        this.n = n;
        this.s = s;
        good = new ArrayList<String>();
        bad = new ArrayList<String>();
    }

    @Override
    public void run() {
        int answer;
        for (int i = tn * (s.length / n); i < ((tn + 1) * (s.length / n)) - 1; i++) {
            answer = checkPort80(s[i]);
            if (answer == 1) {
                good.add(s[i]);
            } else {
                bad.add(s[i]);
            }
            System.out.println(s[i] + " | " + answer);
        }
    }
}
And here is the checkPort80 method:
public static int checkPort80(String host) {
    Socket socket = null;
    int reachable = -1;
    try {
        //One way of doing it
        //socket = new Socket(host, 80);
        //socket.close();

        //Another way I've tried
        socket = new Socket();
        InetSocketAddress ina = new InetSocketAddress(host, 80);
        socket.connect(ina, 30000);
        socket.close();
        return reachable = 1;
    } catch (Exception e) {
    } finally {
        if (socket != null) {
            if (socket.isBound()) {
                try {
                    socket.close();
                    return reachable;
                } catch (Exception e) {
                    e.getMessage();
                    return reachable;
                }
            }
        }
    }
    return reachable;
}
About the threads: I make an ArrayList of threads, create them and .start() them, and right afterwards I .join() them, collect the "Good" and the "Bad" lists and save them to files.
Help is appreciated.
PS: I rename the hosts file first so that it doesn't affect the process, so this is not an issue.
Edit:
Thanks to Marcelo Hernández Rishr I discovered that HttpURLConnection seems to be the better solution. It works faster and I can also get the HTTP response code, in which I was interested anyway (I just thought it would be much slower than only checking port 80). After a while I still suddenly get errors; I guess this has to do with the DNS server thinking this is a DoS attack ^^ (but I should examine further whether the error lies somewhere else). Also, FYI, I use OpenDNS, so maybe they just don't like me ^^.
x4u suggested adding a sleep() to the threads, which seems to make things a little better, but whether it will help me raise entries/second I don't know.
Still, I can't get anywhere near the speed I wanted (10+ entries/second); even 6 entries per second doesn't seem to work.
Here are a few scenarios I tested (until now all without any sleep()).
number of threads | time until first round of errors | entries processed until then | entries/second
10                | 1 minute 17 seconds              | ~770 entries                 | 10
8                 | 3 minutes 55 seconds             | ~2000 entries                | 8.51
6                 | 6 minutes 30 seconds             | ~2270 entries                | 5.82
I will try to find a sweet spot with threads and sleep (or maybe simply pause everything for one minute if I get many errors).
The problem is, there are hosts files with one million entries, which at one entry per second would take 11 days, which, as I guess everyone understands, is not acceptable.
Are there ways to switch DNS servers on the fly?
Any other suggestions?
Should I post the new questions as separate questions?
Thanks for the help so far.
I'll post new results in about a week.
I have 3 suggestions that may help you in your task:
Maybe you can use the class HttpURLConnection (see the sketch after this list).
Use a maximum of 10 threads, because you are still limited by CPU, bandwidth, etc.
The good and bad lists shouldn't be part of your thread class; maybe they can be static members of the class where you have your main method, with static synchronized methods to add entries to both lists from any thread.
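For suggestions 1 and 3, a rough sketch of what I mean (the class and method names are placeholders, and the URL scheme and timeouts are assumptions):
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Sketch: check a host via HttpURLConnection and collect results through
// synchronized static methods instead of per-thread lists.
public class HostChecker {
    private static final List<String> good = new ArrayList<String>();
    private static final List<String> bad = new ArrayList<String>();

    public static synchronized void addGood(String host) { good.add(host); }
    public static synchronized void addBad(String host) { bad.add(host); }

    // Returns the HTTP response code, or -1 if the host did not answer.
    public static int check(String host) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://" + host + "/").openConnection();
            conn.setRequestMethod("HEAD");   // no body needed, just the status line
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int code = conn.getResponseCode();
            conn.disconnect();
            addGood(host);
            return code;
        } catch (Exception e) {
            addBad(host);
            return -1;
        }
    }
}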
Sockets usually try to shut down gracefully and wait for a response from the destination port. While they are waiting they still block resources, which can make subsequent connection attempts fail if they are executed while too many sockets are still open.
To avoid this you can turn off lingering before you connect the socket:
socket.setSoLinger(false, 0);
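Applied to the question's checkPort80, that would look roughly like this (untested sketch):
// Same check as checkPort80, but with lingering disabled before connect,
// so closed sockets give up their resources immediately.
public static int checkPort80NoLinger(String host) {
    try (Socket socket = new Socket()) {
        socket.setSoLinger(false, 0);
        socket.connect(new InetSocketAddress(host, 80), 30000);
        return 1;
    } catch (Exception e) {
        return -1;
    }
}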

Multi-threading with Java, How to stop?

I am writing code for my homework; I am not so familiar with writing multi-threaded applications. I learned how to open a thread and start it. I'd better show the code.
for (int i = 0; i < a.length; i++) {
    download(host, port, a[i]);
    scan.next();
}
My code above connects to a server and opens a.length parallel requests. In other words, download opens a[i] connections to get the same content on each iteration. However, I want my program to complete the download method for i = 0 and start the next iteration i = 1 only when the threads that download has opened have completed. I did it with scan.next() to pause it by hand, but obviously that is not a nice solution. How can I do that?
Edit:
public static long download(String host, int port) {
    new java.io.File("Folder_" + N).mkdir();
    N--;
    int totalLength = length(host, port);
    long result = 0;
    ArrayList<HTTPThread> list = new ArrayList<HTTPThread>();
    for (int i = 0; i < totalLength; i = i + N + 1) {
        HTTPThread t;
        if (i + N > totalLength) {
            t = (new HTTPThread(host, port, i, totalLength - 1));
        } else {
            t = new HTTPThread(host, port, i, i + N);
        }
        list.add(t);
    }
    for (HTTPThread t : list) {
        t.start();
    }
    return result;
}
And In my HTTPThread;
public void run() {
    init(host, port);
    downloadData(low, high);
    close();
}
Note: our test web server is a modified web server; it accepts "Range: i-j" and the response contains the contents of files i through j.
You will need to call the join() method of the thread that is doing the downloading. This will cause the current thread to wait until the download thread is finished. This is a good post on how to use join.
If you'd like to post your download method you will probably get a more complete solution
EDIT:
Ok, so after you start your threads you will need to join them like so:
for (HTTPThread t : list) {
    t.start();
}

for (HTTPThread t : list) {
    t.join();
}
This will stop the method returning until all HTTPThreads have completed
It's probably not a great idea to create an unbounded number of threads to do an unbounded number of parallel HTTP requests. (Both network sockets and threads are operating system resources, require some bookkeeping overhead, and are therefore subject to quotas in many operating systems. In addition, the webserver you are reading from might not like thousands of concurrent connections, because its network sockets are finite, too!)
You can easily control the number of concurrent connections using an ExecutorService:
List<DownloadTask> tasks = new ArrayList<DownloadTask>();
for (int i = 0; i < length; i++) {
    tasks.add(new DownloadTask(i));
}
ExecutorService executor = Executors.newFixedThreadPool(N);
executor.invokeAll(tasks);
executor.shutdown();
This is both shorter and better than your homegrown concurrency limit, because your limit will delay starting the next batch until all threads from the current batch have completed. With an ExecutorService, a new task is begun whenever an old task has completed (and there are still tasks left). That is, your solution will have 1 to N concurrent requests until all tasks have been started, whereas the ExecutorService will always have N concurrent requests.
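Here DownloadTask is assumed to be a Callable so it can be passed to invokeAll(); roughly:
import java.util.concurrent.Callable;

// Sketch of the assumed DownloadTask: wraps one range request as a Callable.
// The field and return value are illustrative only.
class DownloadTask implements Callable<Long> {
    private final int index;

    DownloadTask(int index) {
        this.index = index;
    }

    @Override
    public Long call() throws Exception {
        // open the connection, send the "Range: low-high" request, read and save the data ...
        return 0L; // e.g. number of bytes downloaded
    }
}
invokeAll() blocks until every task has finished and hands the per-task results back as a list of Futures.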
