node.js performance with zeromq vs. Python vs. Java

I've written a simple echo request/reply test for zeromq using node.js, Python, and Java. The code runs a loop of 100K requests. The platform is a 5yo MacBook Pro with 2 cores and 3G of RAM running Snow Leopard.
node.js is consistently an order of magnitude slower than the other two platforms.
Java:
real 0m18.823s
user 0m2.735s
sys 0m6.042s
Python:
real 0m18.600s
user 0m2.656s
sys 0m5.857s
node.js:
real 3m19.034s
user 2m43.460s
sys 0m24.668s
Interestingly, with Python and Java the client and server processes both use about half of a CPU. The client for node.js uses just about a full CPU and the server uses about 30% of a CPU. The client process also has an enormous number of page faults leading me to believe this is a memory issue. Also, at 10K requests node is only 3 times slower; it definitely slows down more the longer it runs.
Here's the client code (note that the process.exit() line doesn't work, either, which is why I included an internal timer in addition to using the time command):
var zeromq = require("zeromq");
var counter = 0;
var startTime = new Date();
var maxnum = 10000;
var socket = zeromq.createSocket('req');
socket.connect("tcp://127.0.0.1:5502");
console.log("Connected to port 5502.");
function moo()
{
process.nextTick(function(){
socket.send('Hello');
if (counter < maxnum)
{
moo();
}
});
}
moo();
socket.on('message',
function(data)
{
if (counter % 1000 == 0)
{
console.log(data.toString('utf8'), counter);
}
if (counter >= maxnum)
{
var endTime = new Date();
console.log("Time: ", startTime, endTime);
console.log("ms : ", endTime - startTime);
process.exit(0);
}
//console.log("Received: " + data);
counter += 1;
}
);
socket.on('error', function(error) {
console.log("Error: "+error);
});
Server code:
var zeromq = require("zeromq");
var socket = zeromq.createSocket('rep');
socket.bind("tcp://127.0.0.1:5502",
function(err)
{
if (err) throw err;
console.log("Bound to port 5502.");
socket.on('message', function(envelope, blank, data)
{
socket.send(envelope.toString('utf8') + " Blancmange!");
});
socket.on('error', function(err) {
console.log("Error: "+err);
});
}
);
For comparison, the Python client and server code:
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5502")
for counter in range(0, 100001):
socket.send("Hello")
message = socket.recv()
if counter % 1000 == 0:
print message, counter
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://127.0.0.1:5502")
print "Bound to port 5502."
while True:
message = socket.recv()
socket.send(message + " Blancmange!")
And the Java client and server code:
package com.moo.test;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;
public class TestClient
{
public static void main (String[] args)
{
Context context = ZMQ.context(1);
Socket requester = context.socket(ZMQ.REQ);
requester.connect("tcp://127.0.0.1:5502");
System.out.println("Connected to port 5502.");
for (int counter = 0; counter < 100001; counter++)
{
if (!requester.send("Hello".getBytes(), 0))
{
throw new RuntimeException("Error on send.");
}
byte[] reply = requester.recv(0);
if (reply == null)
{
throw new RuntimeException("Error on receive.");
}
if (counter % 1000 == 0)
{
String replyValue = new String(reply);
System.out.println((new String(reply)) + " " + counter);
}
}
requester.close();
context.term();
}
}
package com.moo.test;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;
public class TestServer
{
public static void main (String[] args) {
Context context = ZMQ.context(1);
Socket socket = context.socket(ZMQ.REP);
socket.bind("tcp://127.0.0.1:5502");
System.out.println("Bound to port 5502.");
while (!Thread.currentThread().isInterrupted())
{
byte[] request = socket.recv(0);
if (request == null)
{
throw new RuntimeException("Error on receive.");
}
if (!socket.send(" Blancmange!".getBytes(), 0))
{
throw new RuntimeException("Error on send.");
}
}
socket.close();
context.term();
}
}
I would like to like node, but with the vast difference in code size, simplicity, and performance, I'd have a hard time convincing myself at this point.
So, has anyone seen behavior like this before, or did I do something asinine in the code?

You're using a third party C++ binding. As far as I understand it, the crossover between v8's "js-land" and bindings to v8 written in "c++ land", is very expensive. If you notice, some popular database bindings for node are implemented entirely in JS (although, partly I'm sure, because people don't want to compile things, but also because it has the potential to be very fast).
If I remember correctly, when Ryan Dahl was writing the Buffer objects for node, he noticed that they were actually a lot faster if he implemented them mostly in JS as opposed to C++. He ended up writing what he had to in C++, and did everything else in pure javascript.
So, I'm guessing part of the performance issue here has to do with that particular module being a c++ binding.
Judging node's performance based on a third party module is not a good medium for determining its speed or quality. You would do a lot better to benchmark node's native TCP interface.

"can you try to simulate logic from your Python example (i.e. send next message only after receiving previous)?" – Andrey Sidorov Jul 11 at 6:24
I think that's part of it:
var zeromq = require("zeromq");
var counter = 0;
var startTime = new Date();
var maxnum = 100000;
var socket = zeromq.createSocket('req');
socket.connect("tcp://127.0.0.1:5502");
console.log("Connected to port 5502.");
socket.send('Hello');
socket.on('message',
function(data)
{
if (counter % 1000 == 0)
{
console.log(data.toString('utf8'), counter);
}
if (counter >= maxnum)
{
var endTime = new Date();
console.log("Time: ", startTime, endTime);
console.log("ms : ", endTime - startTime);
socket.close(); // or the process.exit(0) won't work.
process.exit(0);
}
//console.log("Received: " + data);
counter += 1;
socket.send('Hello');
}
);
socket.on('error', function(error) {
console.log("Error: "+error);
});
This version doesn't exhibit the same increasing slowness as the previous, probably because it's not throwing as many requests as possible at the server and only counting responses like the previous version. It's about 1.5 times as slow as Python/Java as opposed to 5-10 times slower in the previous version.
Still not a stunning commendation of node for this purpose, but certainly a lot better than "abysmal".

This was a problem with the zeroMQ bindings of node.
I don't know since when, but it is fixed and you get the same results as with the other languages.

I'm not all that familiar with node.js, but the way you're executing it recursively creates new functions over and over again, so it's no wonder it's blowing up. To be on par with Python or Java, the code needs to be more along the lines of:
if (counter < maxnum)
{
socket.send('Hello');
processmessages(); // or something similar in node.js if available
}

Any performance testing using REQ/REP sockets is going to be skewed due to round-tripping and thread latencies. You're basically waking up the whole stack, all the way down and up, for each message. It's not very useful as a metric because REQ/REP cases are never high performance (they can't be). There are two better performance tests:
Sending many messages of various sizes from 1 byte to 1K, see how many you can send in e.g. 10 seconds. This gives you basic throughput. This tells you how efficient the stack is.
Measure end-to-end latency, but of a stream of messages; i.e. insert a timestamp in each message and see what the deviation is on the receiver. This tells you whether the stack has jitter, e.g. due to garbage collection.
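To make the second test concrete, here is a minimal Java sketch of the receiver-side arithmetic only. It assumes you have already collected one latency sample per message (receive time minus the timestamp embedded in the message); the class name LatencyStats and the microsecond unit are illustrative choices, not part of ZeroMQ.

```java
import java.util.Arrays;

/** Summarises per-message latencies; a large stdDev or max relative to the mean suggests jitter. */
final class LatencyStats {
    final double meanMicros;
    final double stdDevMicros;
    final long maxMicros;

    private LatencyStats(double meanMicros, double stdDevMicros, long maxMicros) {
        this.meanMicros = meanMicros;
        this.stdDevMicros = stdDevMicros;
        this.maxMicros = maxMicros;
    }

    /** Computes mean, population standard deviation and maximum of the samples. */
    static LatencyStats of(long[] latenciesMicros) {
        if (latenciesMicros.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double mean = Arrays.stream(latenciesMicros).average().orElse(0);
        double sumOfSquares = 0;
        for (long latency : latenciesMicros) {
            double diff = latency - mean;
            sumOfSquares += diff * diff;
        }
        double stdDev = Math.sqrt(sumOfSquares / latenciesMicros.length);
        long max = Arrays.stream(latenciesMicros).max().getAsLong();
        return new LatencyStats(mean, stdDev, max);
    }
}
```

A single outlier (for example one 500 µs sample among 100 µs samples) shows up clearly in the deviation and the maximum, which is the kind of signal a garbage-collection pause would leave.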

Your client python code is blocking in the loop. In the node example, you receive the events in the 'message' event handler asynchronously. If all you want from your client is to receive data from zmq then your python code will be more efficient because it is coded as a specialized one-trick pony. If you want to add features like listen to other events that aren't using zmq, then you'll find it complicated to re-write the python code to do so. With node, all you need is to add another event handler. node will never be a performance beast for simple examples. However, as your project gets more complicated with more moving pieces, it's a lot easier to add features correctly to node than to do so with the vanilla python you've written. I'd much rather toss a little bit more money on hardware, increase readability and decrease my development time/cost.

Related

Update a string for N seconds in a while loop

I've just started learning Java and I'm stuck with this problem: I have an infinite while loop which builds a message to send over a socket. Currently the message is not sent until a given number of elements has been polled from a queue and read.
String msg = null;
String toSend = "";
int currentNumOfMsg = 0;
final int MAX_MSG_TO_SEND = 200;
while(true) {
if ((msg = messageQueue.poll()) != null) { // if there is an element in the list
toSend += (msg + "#");
currentNumOfMsg++;
if (currentNumOfMsg == MAX_MSG_TO_SEND) {
try {
sendMessage(toSend); // send to socket
} finally {
msg = null;
toSend = "";
currentNumOfMsg = 0;
}
}
}
}
My goal is to send the message after N seconds, without waiting to reach MAX_MSG_TO_SEND. Is that possible, or should I continue with this approach?

While the other answer is perfectly valid, I thought it might be valuable to tell you that ScheduledExecutorService (documentation found here) lets you call a function foo() every n seconds using the method scheduleAtFixedRate().
Basically, setting up the executor is as easy as:
ScheduledExecutorService ses = Executors.newScheduledThreadPool(1);
ses.scheduleAtFixedRate(foo, 0, n, TimeUnit.SECONDS);
I think putting any more code in here is a bit unnecessary, but to see how to do this in more detail, look here, here, or here. These links give some basic examples. I would really recommend doing it this way, as this class is part of the java.util.concurrent package (so no extra dependencies) and you don't have to worry much about the multithreading/scheduling part of it; it takes care of all that for you. But that's just my $.02.
Leave a question/comment if you have one, I'll try to answer it.
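For readers who want to see the pieces together, here is one possible sketch of a batcher that flushes either when the message limit is reached or when the timer fires, whichever comes first. The class MessageBatcher and the Consumer<String> that stands in for the question's sendMessage(toSend) are assumptions made for illustration; only ScheduledExecutorService and scheduleAtFixedRate() come from the answer above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Collects messages and sends them as one '#'-separated string, by count or by time. */
final class MessageBatcher implements AutoCloseable {
    private final StringBuilder pending = new StringBuilder();
    private final int maxMessages;
    private final Consumer<String> sender; // hypothetical stand-in for sendMessage(toSend)
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "batch-flush");
                t.setDaemon(true); // do not keep the JVM alive just for the timer
                return t;
            });
    private int count;

    MessageBatcher(int maxMessages, long flushEverySeconds, Consumer<String> sender) {
        this.maxMessages = maxMessages;
        this.sender = sender;
        // Note: if flush() throws, scheduleAtFixedRate suppresses later runs.
        timer.scheduleAtFixedRate(this::flush, flushEverySeconds, flushEverySeconds,
                TimeUnit.SECONDS);
    }

    /** Adds one message; sends immediately once maxMessages are pending. */
    synchronized void add(String msg) {
        pending.append(msg).append('#');
        if (++count >= maxMessages) {
            flush();
        }
    }

    /** Sends whatever is pending; does nothing if the batch is empty. */
    synchronized void flush() {
        if (count == 0) {
            return;
        }
        String batch = pending.toString();
        pending.setLength(0);
        count = 0;
        sender.accept(batch);
    }

    /** Stops the timer and sends any leftover messages. */
    @Override
    public void close() {
        timer.shutdownNow();
        flush();
    }
}
```

In the question's loop, the body would shrink to calling add(msg) for every polled element. Both add() and flush() are synchronized because the timer thread and the polling thread touch the same buffer.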

Yes, you can definitely do that. First, store your received messages in a data structure; when you want to send the data via the socket, send whatever is in that data structure.
You can also use Guava's Stopwatch to send the message on time. For further information, see https://dzone.com/articles/guava-stopwatch
Otherwise, you can use a long variable that stores System.currentTimeMillis() and check on each iteration whether the expected time has elapsed, as in the sample code below:
long l = System.currentTimeMillis();
if(System.currentTimeMillis() - l >= 10000) {
//send data
}
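The elapsed-time check can also be folded into the polling itself. The following is a sketch under one assumption that the question does not state: that messageQueue is a java.util.concurrent.BlockingQueue, so poll() can take a timeout. The class and method names (TimedDrain, nextBatch) are made up for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class TimedDrain {
    /**
     * Builds one '#'-separated batch. Returns when maxMessages were collected
     * or maxWaitMillis have passed, whichever happens first.
     */
    static String nextBatch(BlockingQueue<String> queue, int maxMessages, long maxWaitMillis) {
        StringBuilder batch = new StringBuilder();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        int count = 0;
        try {
            while (count < maxMessages) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // time is up, send what we have
                }
                // Wait for the next element, but never past the deadline.
                String msg = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (msg == null) {
                    break; // timed out while waiting
                }
                batch.append(msg).append('#');
                count++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return batch.toString();
    }
}
```

The outer while(true) loop would then call nextBatch(messageQueue, MAX_MSG_TO_SEND, n * 1000) and send the result if it is not empty. Compared with a busy loop around poll(), this blocks while the queue is empty instead of spinning.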

Increasing the performance of a multi-thread Selenium WebDriver based brute force bot

Hi everyone!
I have just created a brute force bot which uses WebDriver and multithreading to brute force a 4-digit code. 4-digit means a range of 0000 - 9999 possible String values. In my case, after clicking the "submit" button, no less than 7 seconds pass before the client gets a response from the server. So I decided to use Thread.sleep(7200) to let the page with the response load fully. Then I found out that I couldn't afford to wait 9999 * 7.5 seconds for the task to complete, so I had to use multithreading. I have a quad-core AMD machine with 1 virtual core per hardware core, which lets me run 8 threads simultaneously. I split the whole job of 9999 combinations equally between 8 threads, so each got 1249 combinations, plus a remainder thread starting at the very end. Now I'm getting the job done in 1.5 hours (because the right code happens to be in the middle of a thread's scope of work). That is much better, BUT it could be even better! The Thread.sleep(7500) is a pure waste of time. My machine could be switching to other threads that are in wait() because of the limited number of hardware cores. How can I do this? Any ideas?
Below are two classes to represent my architecture approach:
public class BruteforceBot extends Thread {
// All the necessary implementation, blah-blah
public void run() {
brutforce();
}
private void brutforce() {
initDriver();
int counter = start;
while (counter <= finish) {
try {
webDriver.get(gatewayURL);
webDriver.findElement(By.name("code")).sendKeys(codes.get(counter));
webDriver.findElement(By.name("code")).submit();
Thread.sleep(7200);
String textFound = "";
try {
do {
textFound = Jsoup.parse(webDriver.getPageSource()).text();
//we need to be sure that the page is fully loaded
} while (textFound.contains("XXXXXXXXXXXXX"));
} catch (org.openqa.selenium.JavascriptException je) {
System.err.println("JavascriptException: TypeError: "
+ "document.documentElement is null");
continue;
}
// Test if the page returns XXXXXXXXXXXXX below
if (textFound.contains("XXXXXXXXXXXXXXXx") && !textFound.contains("YYYYYYY")) {
System.out.println("Not " + codes.get(counter));
counter++;
// Test if the page contains "YYYYYYY" string below
} else if (textFound.contains("YYYYYYY")) {
System.out.println("Correct Code is " + codes.get(counter));
botLogger.writeTheLogToFile("We have found it: " + textFound
+ " ... at the code of " + codes.get(counter));
break;
// Test if any other case of response below
} else {
System.out.println("WTF?");
botLogger.writeTheLogToFile("Strange response for code "
+ codes.get(counter));
continue;
}
} catch (InterruptedException intrrEx) {
System.err.println("Interrupted exception: ");
intrrEx.printStackTrace();
}
}
destroyDriver();
} // end of bruteforce() method
And
public class ThreadMaster {
// All the necessary implementation, blah-blah
public ThreadMaster(int amountOfThreadsArgument,
ArrayList<String> customCodes) {
this();
this.codes = customCodes;
this.amountOfThreads = amountOfThreadsArgument;
this.lastCodeIndex = codes.size() - 1;
this.remainderThread = codes.size() % amountOfThreads;
this.scopeOfWorkForASingleThread
= codes.size()/amountOfThreads;
}
public static void runThreads() {
do {
bots = new BruteforceBot[amountOfThreads];
System.out.println("Bots array is populated");
} while (bots.length != amountOfThreads);
for (int j = 0; j <= amountOfThreads - 1;) {
int finish = start + scopeOfWorkForASingleThread;
try {
bots[j] = new BruteforceBot(start, finish, codes);
} catch (Exception e) {
System.err.println("Putting a bot into a theads array failed");
continue;
}
bots[j].start();
start = finish;
j++;
}
try {
for (int j = 0; j <= amountOfThreads - 1; j++) {
bots[j].join();
}
} catch (InterruptedException ie) {
System.err.println("InterruptedException has occured "
+ "while a Bot was joining a thread ...");
ie.printStackTrace();
}
// if there are any codes that are still remain to be tested -
// this last bot/thread will take care of them
if (remainderThread != 0) {
try {
int remainderStart = lastCodeIndex - remainderThread;
int remainderFinish = lastCodeIndex;
BruteforceBot remainderBot
= new BruteforceBot(remainderStart, remainderFinish, codes);
remainderBot.start();
remainderBot.join();
} catch (InterruptedException ie) {
System.err.println("The remainder Bot has failed to "
+ "create or start or join a thread ...");
}
}
}
I need your advice on how to improve the architecture of this app so that it runs successfully with, say, 20 threads instead of 8. My problem is that when I simply remove Thread.sleep(7200) and at the same time run 20 Thread instances instead of 8, each thread constantly fails to get a response from the server because it doesn't wait the 7 seconds for it to arrive. The performance then doesn't just drop, it == 0. Which approach would you choose in this case?
P.S.: I order the amount of threads from the main() method:
public static void main(String[] args)
throws InterruptedException, org.openqa.selenium.SessionNotCreatedException {
System.setProperty("webdriver.gecko.driver", "lib/geckodriver.exe");
ThreadMaster tm = new ThreadMaster(8, new CodesGenerator().getListOfCodesFourDigits());
tm.runThreads();

Okay, since nobody can wait for my question to get a response, I decided to answer it myself as soon as I could (now!).
If you would like to increase the performance of a Selenium WebDriver-based brute force bot like this one, you need to stop using Selenium WebDriver. The WebDriver is a separate process in the OS; it does not even need a JVM to run. So every single instance of the bot was not only a thread managed by my JVM, but also a Windows process! This was the reason why I could hardly use my PC when this app was running with more than 8 threads (each thread was invoking a Windows process, geckodriver.exe or chromedriver.exe). What you really need to do to increase the performance of such a brute force bot is to use HtmlUnit instead of Selenium! HtmlUnit is a pure Java framework; its jar can be found on Maven Central, and its dependency can be added to your pom.xml. This way, brute forcing a 4-digit code takes 15 - 20 minutes, even though the website takes no less than 7 seconds to respond after each attempt. By comparison, with Selenium WebDriver it took 90 minutes to accomplish the task.
And thanks again to @MartinJames, who pointed out that Thread.sleep() does let the hardware core switch to other threads!
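Independently of the browser library, the way the question's ThreadMaster splits the work can be simplified so that no separate remainder thread is needed. The sketch below is only one possible approach; WorkSplitter and split() are hypothetical names. It returns half-open index ranges, whereas the posted loop (counter <= finish, then start = finish) appears to test each boundary code twice.

```java
import java.util.ArrayList;
import java.util.List;

final class WorkSplitter {
    /**
     * Splits the indices 0..total-1 into 'parts' consecutive ranges.
     * Each int[] is {startInclusive, endExclusive}; range sizes differ by at most one.
     */
    static List<int[]> split(int total, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = total / parts;   // minimum size of every range
        int extra = total % parts;  // the first 'extra' ranges get one more element
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new int[] {start, start + size});
            start += size;
        }
        return ranges;
    }
}
```

Each range could then be handed to one bot, with the bot looping while counter < end. Because the threads spend most of their time waiting for the server, the number of ranges does not have to match the number of cores.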

ZeroMQ: Disappearing messages

We have a Java application which is acting as a server. Client applications (written in C#) are communicating with it using ZeroMQ. We are (mostly) following the Lazy Pirate pattern.
The server has a Router socket, implemented as follows (using JeroMQ):
ZContext context = new ZContext();
Socket socket = context.createSocket(ZMQ.ROUTER);
socket.bind("tcp://*:5555");
The clients connect and send messages like this:
ZContext context = ZContext.Create();
ZSocket socket = ZSocket.Create(context, ZSocketType.REQ);
socket.Identity = Encoding.UTF8.GetBytes("Some identity");
socket.Connect("tcp://my_host:5555");
socket.Send(new ZFrame("request data"));
We have experienced lost messages when multiple clients are sending messages at the same time. With a single client, there doesn't appear to be any problem.
Are we implementing this the right way for a multiple-client-single-server setup?
Update: Example client and server exhibiting this behaviour:
Server:
import org.zeromq.ZContext;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.PollItem;
import org.zeromq.ZMQ.Poller;
import org.zeromq.ZMQ.Socket;
import org.zeromq.ZMsg;
public class SimpleServer
{
public static void main(String[] args) throws InterruptedException
{
ZContext context = new ZContext();
Socket socket = context.createSocket(ZMQ.ROUTER);
socket.setRouterMandatory(true);
socket.bind("tcp://*:5559");
PollItem pollItem = new PollItem(socket, Poller.POLLIN);
int messagesReceived = 0;
int pollCount = 0;
while ((pollCount = ZMQ.poll(new PollItem[]{pollItem}, 3000)) > -1)
{
messagesReceived += pollCount;
for (int i = 0 ; i < pollCount ; i++)
{
ZMsg msg = ZMsg.recvMsg(socket);
System.out.println(String.format("Received message: %s. Total messages received: %d", msg, messagesReceived));
}
if (pollCount == 0)
{
System.out.println(String.format("No messages on socket. Total messages received: %d", messagesReceived));
}
}
}
}
Client:
using NetMQ;
using System;
using System.Text;
namespace SimpleClient
{
class Program
{
static byte[] identity = Encoding.UTF8.GetBytes("id" + DateTime.UtcNow.Ticks);
static void Main(string[] args)
{
for (int i = 0; i < 100; i++)
{
SendMessage();
}
}
private static void SendMessage()
{
using (NetMQContext context = NetMQContext.Create())
{
using (NetMQSocket socket = context.CreateRequestSocket())
{
socket.Options.Identity = identity;
socket.Connect("tcp://localhost:5559");
socket.Send(Encoding.UTF8.GetBytes("hello!"));
}
}
}
}
}
If I run the server and a single client, I can see all my 100 messages arrive. If I run, say, 5 clients simultaneously, I only get around 200 -> 300 messages arrive, instead of the full 500. As an aside, it appears that closing the socket in the client is somehow stopping the router socket on the server from receiving messages briefly, although this is just a theory.

Part 1 - poll may return more than one event
ZMQ.poll() returns the number of events that were found:
int rc = ZMQ.poll(new PollItem[]{pollItem}, 3000);
You currently assume that one return from poll is one event. Instead, you should loop over ZMsg msg = ZMsg.recvMsg(socket); for the number of events indicated by the return value of ZMQ.poll().
From the source of JeroMQ:
/**
* Polling on items. This has very poor performance.
* Try to use zmq_poll with selector
* CAUTION: This could be affected by jdk epoll bug
*
* @param items
* @param timeout
* @return number of events
*/
public static int zmq_poll(PollItem[] items, long timeout)
{
return zmq_poll(items, items.length, timeout);
}
Part 2 - ZMsg.recvMsg() may return multiple frames
When you receive a ZMsg from ZMsg msg = ZMsg.recvMsg(socket);, the ZMsg may contain multiple ZFrames, each containing client data.
From the comments of the ZMsg class in JeroMQ's source:
* // Receive message from ZMQSocket "input" socket object and iterate over frames
* ZMsg receivedMessage = ZMsg.recvMsg(input);
* for (ZFrame f : receivedMessage) {
* // Do something with frame f (of type ZFrame)
* }
Part 3 - messages can be split across multiple ZFrames
From ZFrame's source in JeroMQ:
* The ZFrame class provides methods to send and receive single message
* frames across 0MQ sockets. A 'frame' corresponds to one underlying zmq_msg_t in the libzmq code.
* When you read a frame from a socket, the more() method indicates if the frame is part of an
* unfinished multipart message.
If I'm understanding this correctly, then for each event you may get multiple frames, and one client message may map to 1..N frames (if the message is big?).
So to summarize:
One return from poll may indicate multiple events.
One event, and thus one ZMsg.recvMsg(), may contain multiple frames.
One frame could contain one complete client message or only part of a client message; one client message maps to 1..N frames.
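The relationship between frames and messages in this summary can be illustrated without any ZeroMQ dependency. The following toy model is not the JeroMQ API: Frame and group() are invented for the example, and the only thing borrowed from the quoted documentation is the "more" flag that marks a frame as part of an unfinished multipart message.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameGrouping {
    /** Toy stand-in for a frame: a payload plus the "more frames follow" flag. */
    static final class Frame {
        final String data;
        final boolean more;

        Frame(String data, boolean more) {
            this.data = data;
            this.more = more;
        }
    }

    /** Groups a flat sequence of frames into messages; a frame with more == false ends one. */
    static List<List<String>> group(List<Frame> frames) {
        List<List<String>> messages = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (Frame frame : frames) {
            current.add(frame.data);
            if (!frame.more) {
                messages.add(current);
                current = new ArrayList<>();
            }
        }
        return messages;
    }
}
```

For the setup in the question, a ROUTER socket receiving from a REQ socket sees three frames per request: the sender's identity, an empty delimiter, and the payload. In this model that is one message of three frames, so counting frames instead of messages would overstate what arrived.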

Unfortunately we couldn't solve this particular issue, and have moved away from using ZeroMQ for this interface. In case it helps anyone else, the only things we worked out for definite is that rapidly opening/closing the request sockets caused undesirable behaviour (lost messages) on the router socket end. The problem was exacerbated by a poorly performing server CPU, and didn't appear at all when the server was on a fast multi-core machine.

Unfortunately I was nowhere near working with ZMQ at the time this question was active. But I had the same problem today and found this page, and your answer (not using ZMQ) was not satisfying for me. So I searched a bit more and finally found out what to do.
Just as a reminder: this works with the "POLLER" in ZMQ [1]
If you use a "PAIR" connection you will certainly NOT lose any files, BUT send/receive take approximately the same time. So you cannot speed things up, and that was not a solution for me.
Solution:
in zmq_setsockopt (python: zmq.setsockopt) you can set ZMQ_HWM (zmq.SNDHWM, zmq.RCVHWM) to '0' [2]
in python: sock.setsockopt(zmq.SNDHWM, 0) resp. sock.setsockopt(zmq.RCVHWM, 0) for the sender resp. receiver
note: I think the notation changed from HWM to SNDHWM/RCVHWM
HWM = 0 means that there is "NO limit" on the number of messages (so be careful, maybe set a (very high) limit instead)
there is also ZMQ_SNDBUF/ZMQ_RCVBUF (python: zmq.SNDBUF/zmq.RCVBUF) which you can set as well, i.e. sock.setsockopt(zmq.RCVBUF, 0) resp. ..... [2]
so this will set the operating system "SO_RCVBUF" to its default (here my knowledge ends)
setting this parameter or not did NOT influence my case, but I think it might in others
Performance:
So with this I could "send" 100'000 files of 98kB each in ~8s (~10GB): this will fill your RAM (if it gets full I think your program will slow down), see also the picture
in the meantime I "received" and saved the files in about ~118s, freeing the RAM again
Also, with that I have NEVER lost a file up to now. (you might if you hit the limits of your PC)
data loss is "GOOD":
if you really NEED all the data you should use this method
if you can accept some losses (e.g. live plotting: as long as your FPS > ~50 you will see the plots smoothly and you do not care if you lose something)
--> you can save RAM and avoid blocking your whole PC!
Hope this post helps the next person coming by...
[1]: https://learning-0mq-with-pyzmq.readthedocs.io/en/latest/pyzmq/multisocket/zmqpoller.htm
[2]: http://api.zeromq.org/2-1:zmq-setsockopt
Picture of the RAM usage (image not included): the RAM fills up in about 8s; afterwards the files are written from the buffer to disk.

native errors on DatagramChannel send

Basic
I have an app that is sending packets using DatagramChannel.send in multiple threads each to its own IP address/port and each of them keeping constant bit-rate/bandwidth. Every now and then I get this error:
java.net.SocketException: Invalid argument: no further information
at sun.nio.ch.DatagramChannelImpl.send0(Native Method)
at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
...
It happens at random, sometimes 5 minutes after start, sometimes after a day, so I really have problems reproducing it for testing. And on my home machine I can't reproduce it at all.
Environments
Windows 7, 8 and Server 2012 (all 64bit)
64bit Java 7 update 45
More information
The app is sending SI/EIT data to a DVB-C network. I'm creating a list of 188-byte arrays for each of 80-120 threads and handing it over for use. Each thread takes its list and loops over it until a new list is provided.
The error usually happens on multiple channels at once. But it can happen on just one also.
The error never happened until we had 40+ threads.
The error happens while looping over the list, not when I'm binding a new list to a thread.
The app is not running out of memory. It's usually running at up to 70% of the memory given to the JVM.
Strange part: if I run multiple instances of the app, each handling ~10 threads, the problems are the same.
Simplified code sample
for(int i = 0; i < 100; ++i) {
final int id = i;
new Thread(new Runnable() {
@Override
public void run() {
final Random r = new Random();
final List<byte[]> buffer = Lists.newArrayList();
for(int i = 0; i < 200; ++i) {
final byte[] temp = new byte[188];
r.nextBytes(temp);
buffer.add(temp);
}
final SocketAddress target = new InetSocketAddress("230.0.0.18", 1000 + id);
try (final DatagramChannel channel = DatagramChannel.open(StandardProtocolFamily.INET)) {
channel.configureBlocking(false);
channel.setOption(StandardSocketOptions.IP_MULTICAST_IF, NetworkInterface.getByName("eth0"));
channel.setOption(StandardSocketOptions.IP_MULTICAST_TTL, 8);
channel.setOption(StandardSocketOptions.SO_REUSEADDR, true);
channel.setOption(StandardSocketOptions.SO_SNDBUF, 1024 * 64);
int counter = 0;
int index = 0;
while(true) {
final byte[] item = buffer.get(index);
channel.send(ByteBuffer.wrap(item), target);
index = (index + 1) % buffer.size();
counter++;
Thread.sleep(1);
}
}
catch(Exception e) {
LOG.error("Fail at " + id, e);
}
}
}).start();
}
Edits:
1) @EJP: I'm setting multicast properties because the actual app that I use was doing joins (and reading some data). But the problems persisted even after I removed them.
2) Should I be using some other API if I just need to send UDP packets? All the samples I could find use DatagramChannel (or its older alternative).
3) I'm still stuck with this. If anyone has an idea what can I even try, please let me know.

I had exactly the same problem, and it was caused by a zero port in the target InetSocketAddress, when calling the send method.
In your code, the target port is defined as 1000 + i, so it doesn't seem to be the problem. Anyway, I'd log the target parameters that are used when the exception is thrown, just in case.
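To act on the logging suggestion, the target can be inspected right before channel.send(...) or inside the catch block. This is a small sketch with a made-up helper (TargetCheck.problemWith); it only covers the zero-port case described above plus an unresolved host, and it is not a claim about what actually caused the error in the question.

```java
import java.net.InetSocketAddress;

final class TargetCheck {
    /** Returns null when the target looks sendable, otherwise a short reason to log. */
    static String problemWith(InetSocketAddress target) {
        if (target == null) {
            return "target is null";
        }
        if (target.isUnresolved()) {
            return "unresolved host: " + target.getHostString();
        }
        if (target.getPort() == 0) {
            return "port is 0 for " + target.getHostString();
        }
        return null;
    }
}
```

Logging the result together with the thread id when the SocketException is caught would show whether the failing sends share a suspicious address or port.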

Check servers for active Webserver fast (multithreaded)

I want to check a huge number (thousands) of websites to see if they are still running, because I want to get rid of unnecessary entries in my hosts file (wiki page about hosts files).
I want to do it in a 2 Stage process.
Check if something is running on Port 80
Check the HTTP response code (if it's not 200 I have to check the site)
I want to multithread because, if I want to check thousands of addresses, I can't wait for timeouts.
This question is just about Step one.
I have the problem that ~1/4 of my connection attempts don't work. If I retry the failed ones, about ~3/4 work. Am I not closing the sockets correctly? Am I running into a limit of open sockets?
By default I run 16 threads, but I have the same problems with 8 or 4.
Is there something I'm missing?
I have simplified the code a little.
Here is the code of the Thread
public class SocketThread extends Thread{
int tn;
int n;
String[] s;
private ArrayList<String> good;
private ArrayList<String> bad;
public SocketThread(int tn, int n, String[] s) {
this.tn = tn;
this.n = n;
this.s = s;
good = new ArrayList<String>();
bad = new ArrayList<String>();
}
@Override
public void run() {
int answer;
for (int i = tn * (s.length / n); i < ((tn + 1) * (s.length / n)) - 1; i++) {
answer = checkPort80(s[i]);
if (answer == 1) {
good.add(s[i]);
} else {
bad.add(s[i]);
}
System.out.println(s[i] + " | " + answer);
}
}
}
And here is the checkPort80 Method
public static int checkPort80(String host) {
    // Returns 1 if a TCP connection to port 80 succeeds, -1 otherwise.
    Socket socket = null;
    int reachable = -1;
    try {
        //One way of doing it
        //socket = new Socket(host, 80);
        //socket.close();
        //Another way I've tried
        socket = new Socket();
        InetSocketAddress ina = new InetSocketAddress(host, 80);
        socket.connect(ina, 30000);
        reachable = 1;
    } catch (Exception e) {
        // connection refused, timed out, or host unknown: reachable stays -1
    } finally {
        if (socket != null) {
            try {
                socket.close();
            } catch (Exception e) {
                // nothing useful to do if close() fails
            }
        }
    }
    return reachable;
}
As for the threads, I make an ArrayList of Threads, create them and .start() them, and right afterwards I .join() them, get the "Good" and the "Bad" lists, and save them to files.
Help is appreciated.
PS: I rename the Hosts-file first so that it doesn't affect the process, so this is not an issue.
Edit:
Thanks to Marcelo Hernández Rishr I discovered that HttpURLConnection seems to be the better solution. It works faster and I can also get the HTTP response code, which I was interested in anyway (I just thought it would be much slower than just checking port 80). After a while I still suddenly get errors; I guess this has to do with the DNS server thinking this is a DoS attack ^^ (but I should examine further whether the error lies somewhere else). Also, FYI, I use OpenDNS, so maybe they just don't like me ^^.
x4u suggested adding a sleep() to the threads, which seems to make things a little better, but whether it will help me raise entries/second I don't know.
Still, I can't (by far) get to the speed I wanted (10+ entries/second); even 6 entries per second doesn't seem to work.
Here are a few scenarios I tested (until now all without any sleep()).
number of threads | time until first round of errors | entries processed until then | entries/second
10                | 1 minute 17 seconds              | ~770 entries                 | 10
8                 | 3 minutes 55 seconds             | ~2000 entries                | 8.51
6                 | 6 minutes 30 seconds             | ~2270 entries                | 5.82
I will try to find a sweet spot with Threads and sleep (or maybe simply pause all for one minute if I get many errors).
The problem is that there are hosts files with one million entries, which at one entry per second would take about 11 days, which, as I guess everyone understands, is not acceptable.
Are there ways to switch DNS-Servers on the fly?
Any other suggestions?
Should I post the new questions as separate questions?
Thanks for the help until now.
I'll post new results in about a week.

I have 3 suggestions that may help you in your task.
Maybe you can use the class HttpURLConnection
Use a maximum of 10 threads because you are still limited by cpu, bandwidth, etc.
The lists good and bad shouldn't be part of your thread class; maybe they can be static members of the class where you have your main method, with static synchronized methods to add members to both lists from any thread.
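One way to combine the second and third suggestions is a fixed thread pool that collects all results in one place, so no list is shared between threads at all. This is a sketch, not the poster's code: HostChecker, Result, checkAll() and port80Open() are names invented here, and the probe is passed in as a Predicate so that either a socket check or an HttpURLConnection check could be plugged in.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class HostChecker {
    /** Hosts that passed the probe and hosts that did not, in input order. */
    static final class Result {
        final List<String> good = new ArrayList<>();
        final List<String> bad = new ArrayList<>();
    }

    /** Example probe: true if a TCP connection to port 80 succeeds within the timeout. */
    static boolean port80Open(String host, int timeoutMillis) {
        try (Socket socket = new Socket()) { // try-with-resources always closes the socket
            socket.connect(new InetSocketAddress(host, 80), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Runs the probe for every host on 'threads' worker threads and sorts the outcomes. */
    static Result checkAll(List<String> hosts, int threads, Predicate<String> probe) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Result result = new Result();
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String host : hosts) {
                tasks.add(() -> probe.test(host));
            }
            // invokeAll blocks until every task has finished.
            List<Future<Boolean>> outcomes = pool.invokeAll(tasks);
            for (int i = 0; i < hosts.size(); i++) {
                boolean up;
                try {
                    up = outcomes.get(i).get();
                } catch (ExecutionException e) {
                    up = false; // the probe threw, count the host as bad
                }
                (up ? result.good : result.bad).add(hosts.get(i));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return result;
    }
}
```

A call such as HostChecker.checkAll(hosts, 10, h -> HostChecker.port80Open(h, 30000)) would cover every entry, including the last element of each chunk and the leftover entries that the index arithmetic in SocketThread.run() appears to skip. For files with a million entries, feeding the hosts in slices rather than creating all tasks at once would keep memory use down.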

Sockets usually try to shut down gracefully and wait for a response from the destination port. While they are waiting they are still blocking resources which can make successive connection attempts fail if they were executed while there have still been too many open sockets.
To avoid this you can turn off the lingering before you connect the socket:
socket.setSoLinger(false, 0);
