We have a Java application which is acting as a server. Client applications (written in C#) are communicating with it using ZeroMQ. We are (mostly) following the Lazy Pirate pattern.
The server has a Router socket, implemented as follows (using JeroMQ):
ZContext context = new ZContext();
Socket socket = context.createSocket(ZMQ.ROUTER);
socket.bind("tcp://*:5555");
The clients connect and send messages like this:
ZContext context = ZContext.Create();
ZSocket socket = ZSocket.Create(context, ZSocketType.REQ);
socket.Identity = Encoding.UTF8.GetBytes("Some identity");
socket.Connect("tcp://my_host:5555");
socket.Send(new ZFrame("request data"));
We have experienced lost messages when multiple clients are sending messages at the same time. With a single client, there doesn't appear to be any problem.
Are we implementing this the right way for a multiple-client-single-server setup?
Update: Example client and server exhibiting this behaviour:
Server:
import org.zeromq.ZContext;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.PollItem;
import org.zeromq.ZMQ.Poller;
import org.zeromq.ZMQ.Socket;
import org.zeromq.ZMsg;
public class SimpleServer
{
    public static void main(String[] args) throws InterruptedException
    {
        ZContext context = new ZContext();
        Socket socket = context.createSocket(ZMQ.ROUTER);
        socket.setRouterMandatory(true);
        socket.bind("tcp://*:5559");
        PollItem pollItem = new PollItem(socket, Poller.POLLIN);
        int messagesReceived = 0;
        int pollCount = 0;
        while ((pollCount = ZMQ.poll(new PollItem[]{pollItem}, 3000)) > -1)
        {
            messagesReceived += pollCount;
            for (int i = 0; i < pollCount; i++)
            {
                ZMsg msg = ZMsg.recvMsg(socket);
                System.out.println(String.format("Received message: %s. Total messages received: %d", msg, messagesReceived));
            }
            if (pollCount == 0)
            {
                System.out.println(String.format("No messages on socket. Total messages received: %d", messagesReceived));
            }
        }
    }
}
Client:
using NetMQ;
using System;
using System.Text;
namespace SimpleClient
{
    class Program
    {
        static byte[] identity = Encoding.UTF8.GetBytes("id" + DateTime.UtcNow.Ticks);

        static void Main(string[] args)
        {
            for (int i = 0; i < 100; i++)
            {
                SendMessage();
            }
        }

        private static void SendMessage()
        {
            using (NetMQContext context = NetMQContext.Create())
            {
                using (NetMQSocket socket = context.CreateRequestSocket())
                {
                    socket.Options.Identity = identity;
                    socket.Connect("tcp://localhost:5559");
                    socket.Send(Encoding.UTF8.GetBytes("hello!"));
                }
            }
        }
    }
}
If I run the server and a single client, I can see all 100 messages arrive. If I run, say, 5 clients simultaneously, only around 200-300 messages arrive instead of the full 500. As an aside, it appears that closing the socket in the client briefly stops the router socket on the server from receiving messages, although this is just a theory.
Part 1 - poll may return more than one event
ZMQ.poll() returns the number of events that were found:
int rc = ZMQ.poll(new PollItem[]{pollItem}, 3000);
You currently assume that one return from poll is one event. Instead, you should call ZMsg msg = ZMsg.recvMsg(socket); in a loop, once for each event indicated by the return value of ZMQ.poll().
From the source of JeroMQ:
/**
 * Polling on items. This has very poor performance.
 * Try to use zmq_poll with selector
 * CAUTION: This could be affected by jdk epoll bug
 *
 * @param items
 * @param timeout
 * @return number of events
 */
public static int zmq_poll(PollItem[] items, long timeout)
{
    return zmq_poll(items, items.length, timeout);
}
Part 2 - ZMsg.recvMsg() may return multiple frames
When you receive a ZMsg from ZMsg msg = ZMsg.recvMsg(socket);, the ZMsg may contain multiple ZFrames, each containing client data.
From the comments of the ZMsg class in JeroMQ's source:
* // Receive message from ZMQSocket "input" socket object and iterate over frames
* ZMsg receivedMessage = ZMsg.recvMsg(input);
* for (ZFrame f : receivedMessage) {
* // Do something with frame f (of type ZFrame)
* }
Part 3 - messages can be split across multiple ZFrames
From ZFrame's source in JeroMQ:
* The ZFrame class provides methods to send and receive single message
* frames across 0MQ sockets. A 'frame' corresponds to one underlying zmq_msg_t in the libzmq code.
* When you read a frame from a socket, the more() method indicates if the frame is part of an
* unfinished multipart message.
If I'm understanding this correctly, then for each event you may get multiple frames, and one client message may map to 1..N frames (if the message is big?).
So to summarize:
One return from poll may indicate multiple events.
One event, and thus one ZMsg.recvMsg(), may contain multiple frames.
One frame could contain one complete client message or only part of one; one client message maps to 1..N frames.
Putting the three together, a receive loop might look like the sketch below.
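A minimal sketch of such a loop, untested and using only the JeroMQ API already shown in this thread (the printout stands in for real handling):

import org.zeromq.ZContext;
import org.zeromq.ZFrame;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.PollItem;
import org.zeromq.ZMQ.Poller;
import org.zeromq.ZMQ.Socket;
import org.zeromq.ZMsg;

public class ReceiveLoopSketch
{
    public static void main(String[] args)
    {
        ZContext context = new ZContext();
        Socket socket = context.createSocket(ZMQ.ROUTER);
        socket.bind("tcp://*:5559");
        PollItem pollItem = new PollItem(socket, Poller.POLLIN);
        while (true)
        {
            // Part 1: the return value is the number of events, not just 0 or 1
            int events = ZMQ.poll(new PollItem[]{pollItem}, 3000);
            for (int i = 0; i < events; i++)
            {
                // Part 2: one receive yields a ZMsg holding one or more frames
                ZMsg msg = ZMsg.recvMsg(socket);
                for (ZFrame frame : msg)
                {
                    // Part 3: each frame is one zmq_msg_t; a single client
                    // message may span several of them
                    System.out.println("Frame: " + frame.toString());
                }
            }
        }
    }
}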
Unfortunately we couldn't solve this particular issue and have moved away from using ZeroMQ for this interface. In case it helps anyone else, the only thing we worked out for certain is that rapidly opening/closing the request sockets caused undesirable behaviour (lost messages) at the router socket end. The problem was exacerbated by a poorly performing server CPU and didn't appear at all when the server was on a fast multi-core machine.
Unfortunately I was not working with ZMQ at the time this question was active. But I had the same problem today and found this page, and your answer (not using ZMQ) was not satisfying for me. So I searched a bit more and finally found out what to do.
Just as a reminder: this works with the poller in ZMQ [1].
If you use a PAIR connection you will for sure NOT lose any files, BUT send/receive take approximately the same time. So you cannot speed things up, and that was not a solution for me.
Solution:
In zmq_setsockopt (Python: zmq.setsockopt) you can set ZMQ_HWM (zmq.SNDHWM, zmq.RCVHWM) to 0 [2].
In Python: sock.setsockopt(zmq.SNDHWM, 0) resp. sock.setsockopt(zmq.RCVHWM, 0) for the sender resp. receiver.
Note: I think the notation changed from HWM to SNDHWM/RCVHWM.
HWM = 0 means that there is no limit on the number of queued messages (so be careful; maybe set a very high limit instead).
There is also ZMQ_SNDBUF/ZMQ_RCVBUF (Python: zmq.SNDBUF/zmq.RCVBUF), which you can set as well, i.e. sock.setsockopt(zmq.RCVBUF, 0) etc. [2]. This sets the operating system SO_RCVBUF to its default (here my knowledge ends).
Setting this parameter or not did NOT influence my case, but I think it might matter. A Java sketch of the HWM settings follows below.
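For the Java/JeroMQ side of this thread, the equivalent would be something like the following sketch. I am assuming the setSndHWM/setRcvHWM setters exist on your JeroMQ version (they mirror ZMQ_SNDHWM/ZMQ_RCVHWM; older versions expose a single setHWM()):

import org.zeromq.ZContext;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Socket;

public class HwmSketch
{
    public static void main(String[] args)
    {
        ZContext context = new ZContext();
        Socket socket = context.createSocket(ZMQ.ROUTER);
        // HWM = 0 means "no limit" on queued messages; set it before bind/connect.
        socket.setSndHWM(0);
        socket.setRcvHWM(0);
        socket.bind("tcp://*:5559");
    }
}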
Performance:
With this I could send 100,000 files of ~98 kB each (~10 GB) in about 8 s. This will fill your RAM (and once it is full, I think your program will slow down); see also the picture below.
In the meantime I received and saved the files in about ~118 s, freeing the RAM again.
Also, with this I have never lost a file up to now (you might if you hit the limits of your PC).
Is data loss acceptable?
If you really NEED all the data, you should use this method (HWM = 0).
If some losses are fine (e.g. live plotting: as long as your FPS is > ~50 you will see the plots smoothly and not care if you lose something), you can keep a limit, save RAM, and avoid blocking your whole PC!
Hope this post helps the next person coming by...
[1]: https://learning-0mq-with-pyzmq.readthedocs.io/en/latest/pyzmq/multisocket/zmqpoller.htm
[2]: http://api.zeromq.org/2-1:zmq-setsockopt
(Picture of the RAM usage: the RAM fills in about 8 s; afterwards the disk saves the files from the buffer.)
Related
I have to write a Java program that receives G-code commands via network and sends them to a 3D printer via serial communication. In principle everything seems to be okay, as long as the printer needs more than 300 ms to execute a command. If execution time is shorter than that, it takes too long for the printer to receive the next command, and that results in a delay between command executions (the printer nozzle standing still for about 100-200 ms). This can become a problem in 3D printing, so I have to eliminate that delay.
For comparison: software like Repetier Host or Cura can send the same commands via serial without any delay between command executions, so it has to be possible somehow.
I use the jSerialComm library for serial communication.
This is the Thread that sends commands to the printer:
@Override
public void run() {
    if (printer == null) return;
    log("Printer Thread started!");
    // wait just in case
    Main.sleep(3000);
    long last = 0;
    while (true) {
        String cmd = printer.cmdQueue.poll();
        if (cmd != null && !cmd.equals("") && !cmd.equals("\n")) {
            log(cmd + " last: " + (System.currentTimeMillis() - last) + "ms");
            last = System.currentTimeMillis();
            send(cmd + "\n", 0);
        }
    }
}

private void send(String cmd, int timeout) {
    printer.serialWrite(cmd);
    waitForBuffer(timeout);
}

private void waitForBuffer(int timeout) {
    if (!blockForOK(timeout))
        log("OK Timeout (" + timeout + "ms)");
}

public boolean blockForOK(int timeoutMillis) {
    long millis = System.currentTimeMillis();
    while (!printer.bufferAvailable) {
        if (timeoutMillis != 0)
            if (millis + timeoutMillis < System.currentTimeMillis()) return false;
        try {
            sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    printer.bufferAvailable = false;
    return true;
}
This is printer.serialWrite ("inspired" by Arduino Java Lib):
public void serialWrite(String s) {
    comPort.setComPortTimeouts(SerialPort.TIMEOUT_SCANNER, 0, 500);
    try { Thread.sleep(5); } catch (Exception e) {}
    PrintWriter pout = new PrintWriter(comPort.getOutputStream());
    pout.print(s);
    pout.flush();
}
printer is an object of class Printer, which implements com.fazecast.jSerialComm.SerialPortDataListener.
Relevant functions of Printer:
@Override
public int getListeningEvents() {
    return SerialPort.LISTENING_EVENT_DATA_AVAILABLE;
}

@Override
public void serialEvent(SerialPortEvent serialPortEvent) {
    byte[] newData = new byte[comPort.bytesAvailable()];
    int numRead = comPort.readBytes(newData, newData.length);
    handleData(new String(newData));
}

private void handleData(String line) {
    //log("RX: " + line);
    if (line.contains("ok")) {
        bufferAvailable = true;
    }
    if (line.contains("T:")) {
        printerThread.printer.temperature[0] = Utils.readFloat(line.substring(line.indexOf("T:") + 2));
    }
    if (line.contains("T0:")) {
        printerThread.printer.temperature[0] = Utils.readFloat(line.substring(line.indexOf("T0:") + 3));
    }
    if (line.contains("T1:")) {
        printerThread.printer.temperature[1] = Utils.readFloat(line.substring(line.indexOf("T1:") + 3));
    }
    if (line.contains("T2:")) {
        printerThread.printer.temperature[2] = Utils.readFloat(line.substring(line.indexOf("T2:") + 3));
    }
}
Printer.bufferAvailable is declared volatile
I also tried the blocking functions of jSerialComm in another thread; same result.
Where is my bottleneck? Is there a bottleneck in my code at all, or does jSerialComm produce too much overhead?
For those who do not have experience with 3D printing:
When the printer receives a valid command, it puts that command into an internal buffer to minimize delay. As long as there is free space in the internal buffer, it replies with ok. When the buffer is full, the ok is delayed until there is free space again.
So basically, you just have to send a command, wait for the ok, and send the next one immediately.
@Override
public void serialEvent(SerialPortEvent serialPortEvent) {
    byte[] newData = new byte[comPort.bytesAvailable()];
    int numRead = comPort.readBytes(newData, newData.length);
    handleData(new String(newData));
}
This part is problematic: the event may have been triggered before a full line was read, so potentially only half an ok has been received yet. You need to buffer (over multiple events) and reassemble the data into messages before attempting to parse it as full messages.
Worst case, this may have resulted in entirely losing temperature readings or ok messages because they were ripped in half.
See the InputStream example and wrap it in a BufferedReader to get access to BufferedReader::readLine(). With the BufferedReader in place, you can then just use it to poll directly in the main thread and process the response synchronously, as in the sketch below.
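A minimal sketch of that approach, assuming jSerialComm's getInputStream() and a printer that terminates lines with \n (the port name is a placeholder):

import com.fazecast.jSerialComm.SerialPort;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ReadLineSketch {
    public static void main(String[] args) throws Exception {
        SerialPort comPort = SerialPort.getCommPort("/dev/ttyUSB0"); // placeholder
        comPort.setBaudRate(115200);
        // Set the timeout mode ONCE per connection (see below: setting it
        // repeatedly sleeps internally).
        comPort.setComPortTimeouts(SerialPort.TIMEOUT_READ_SEMI_BLOCKING, 0, 0);
        comPort.openPort();

        BufferedReader in = new BufferedReader(new InputStreamReader(comPort.getInputStream()));
        String line;
        // readLine() reassembles full lines across partial reads, so an "ok"
        // can no longer arrive ripped in half.
        while ((line = in.readLine()) != null) {
            if (line.contains("ok")) {
                // buffer space available: send the next command here
            }
        }
    }
}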
try{Thread.sleep(5);} catch(Exception e){}
sleep(1);
You don't want to sleep. Depending on your system environment (and I strongly assume that this isn't running on Windows on x86, but rather Linux on an embedded platform), a sleep can be much longer than anticipated: up to 30 ms or 100 ms, depending on the kernel configuration.
The sleep before write doesn't make much sense in the first place: you know that the serial port is ready to write, as you have already received an ok confirming reception of the previously sent command.
The sleep during receive becomes pointless when using the BufferedReader.
comPort.setComPortTimeouts(SerialPort.TIMEOUT_SCANNER, 0, 500);
And this is actually causing your problems. SerialPort.TIMEOUT_SCANNER activates a wait period on read: after receiving the first byte, it will wait at least another 100 ms to see if that byte becomes part of a message. So after it has seen the ok, it waits another 100 ms internally on the OS side before it assumes that this was all there is.
You need SerialPort.TIMEOUT_READ_SEMI_BLOCKING for low latency, but then the problem predicted in the first paragraph will occur unless you buffer.
Setting it repeatedly also causes yet another problem, because there is a 200 ms sleep inside SerialPort::setComPortTimeouts. Set it once per serial connection, no more than that.
Check the manual of the printer (or tell us the model); I'm not sure you actually need to wait for the ok at all, in which case you can read and write concurrently. Some of the time there is hardware flow control handling this stuff for you, with large enough buffers. Try just sending the commands without waiting for ok and see what happens.
If you just want to pipe commands from the network to serial port, you can use ready-made solution like socat. For example running the following:
socat TCP-LISTEN:8888,fork,reuseaddr FILE:/dev/ttyUSB0,b115200,raw
would pipe all bytes coming from clients connected to the 8888 port directly to the /dev/ttyUSB0 at baud rate of 115200 (and vice-versa).
Basic
I have an app that is sending packets using DatagramChannel.send in multiple threads, each to its own IP address/port and each keeping a constant bit-rate/bandwidth. Every now and then I get this error:
java.net.SocketException: Invalid argument: no further information
at sun.nio.ch.DatagramChannelImpl.send0(Native Method)
at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
...
It happens at random - sometimes 5 minutes after start, sometimes after a day - so I have real problems reproducing it for testing. And on my home machine I can't reproduce it at all.
Environments
Windows 7, 8 and Server 2012 (all 64bit)
64bit Java 7 update 45
More information
The app is sending SI/EIT data to a DVB-C network. I'm creating a list of 188-byte arrays for each of 80-120 threads and giving it to the thread to use. The thread takes the list and loops over it until a new list is provided.
The error usually happens on multiple channels at once, but it can also happen on just one.
The error never happened until we had 40+ threads.
The error happens while looping over the list, not when I'm handing a new list to a thread.
The app is not running out of memory; it usually runs at up to 70% of the memory given to the JVM.
Strange part: if I run multiple instances of the app, each handling ~10 threads, the problems are the same.
Simplified code sample
for (int i = 0; i < 100; ++i) {
    final int id = i;
    new Thread(new Runnable() {
        @Override
        public void run() {
            final Random r = new Random();
            final List<byte[]> buffer = Lists.newArrayList();
            for (int i = 0; i < 200; ++i) {
                final byte[] temp = new byte[188];
                r.nextBytes(temp);
                buffer.add(temp);
            }
            final SocketAddress target = new InetSocketAddress("230.0.0.18", 1000 + id);
            try (final DatagramChannel channel = DatagramChannel.open(StandardProtocolFamily.INET)) {
                channel.configureBlocking(false);
                channel.setOption(StandardSocketOptions.IP_MULTICAST_IF, NetworkInterface.getByName("eth0"));
                channel.setOption(StandardSocketOptions.IP_MULTICAST_TTL, 8);
                channel.setOption(StandardSocketOptions.SO_REUSEADDR, true);
                channel.setOption(StandardSocketOptions.SO_SNDBUF, 1024 * 64);
                int counter = 0;
                int index = 0;
                while (true) {
                    final byte[] item = buffer.get(index);
                    channel.send(ByteBuffer.wrap(item), target);
                    index = (index + 1) % buffer.size();
                    counter++;
                    Thread.sleep(1);
                }
            }
            catch (Exception e) {
                LOG.error("Fail at " + id, e);
            }
        }
    }).start();
}
Edits:
1) @EJP: I'm setting multicast properties because the actual app I use was doing joins (and reading some data). But the problems persisted even after I removed them.
2) Should I be using some other API if I just need to send UDP packets? All the samples I could find use DatagramChannel (or its older alternative).
3) I'm still stuck with this. If anyone has an idea what I can even try, please let me know.
I had exactly the same problem, and it was caused by a zero port in the target InetSocketAddress, when calling the send method.
In your code, the target port is defined as 1000 + id, so it doesn't seem to be the problem. Anyway, I'd log the target parameters that are used when the exception is thrown, just in case, as in the sketch below.
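A sketch of that kind of defensive logging, wrapping the send call from the question (the class and method names are hypothetical):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.net.SocketException;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

final class LoggingSend {
    // Wraps channel.send() and logs the destination on failure, so a zero or
    // otherwise invalid port shows up immediately.
    static int send(DatagramChannel channel, ByteBuffer buf, SocketAddress target) throws IOException {
        try {
            return channel.send(buf, target);
        } catch (SocketException e) {
            InetSocketAddress isa = (InetSocketAddress) target;
            System.err.println("send failed: host=" + isa.getHostString()
                    + " port=" + isa.getPort()
                    + " unresolved=" + isa.isUnresolved());
            throw e;
        }
    }
}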
I have access to a few servers, say A, B, C, D and E. I'd like to fetch data from these servers one by one, in a round-robin way. I am new to Java and threads; it would be a great help if you could help me with this.
What I am trying to do is load a map in my application: I send HTTP requests to the servers, which reply with responses in bitmap format. I arrange these images (tiles) and show them in my application, but I am doing it sequentially, e.g. I request server A first to get the tiles, then server B, and so on. I would like to get the tiles in such a way that server A downloads one image while server B downloads another. If I did it all using one server, without multithreading, it would take a long time to display the whole map.
Create a URL builder which has the base URLs of each server in an array and also keeps track of which server was hit last time. The next time you need data, just return the base URL of the next server.
Use modulo; see the example below (a String stands in for the URL):
public static final int MAX_SERVER = 4;

public static void main(String[] args)
{
    String urlarr[] = new String[MAX_SERVER];
    init(urlarr);
    int idx = 0;
    while (idx < 1000) {
        String next = urlarr[idx++ % urlarr.length];
        System.out.println(next);
    }
}

private static void init(String[] urlarr)
{
    for (int i = 0; i < urlarr.length; i++) {
        urlarr[i] = "url(" + i + ")";
    }
}
Using modulo with the size of the array on idx makes it iterate over all available indexes (0, 1, 2, 3 in this case). A thread-safe variant is sketched after the output below.
part of output:
url(0)
url(1)
url(2)
url(3)
url(0)
url(1)
url(2)
url(3)
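The sketch above uses a plain int index, which is fine for a single thread but not safe if several download threads share it. A thread-safe variant (hypothetical class name; Math.floorMod keeps the index non-negative even if the counter overflows, assuming Java 8+):

import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinUrls
{
    private final String[] urls;
    private final AtomicInteger idx = new AtomicInteger();

    public RoundRobinUrls(String... urls)
    {
        this.urls = urls.clone();
    }

    // Safe to call from many download threads at once:
    // getAndIncrement() hands every caller a distinct counter value.
    public String next()
    {
        return urls[Math.floorMod(idx.getAndIncrement(), urls.length)];
    }
}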
I've written a simple echo request/reply test for zeromq using node.js, Python, and Java. The code runs a loop of 100K requests. The platform is a 5yo MacBook Pro with 2 cores and 3G of RAM running Snow Leopard.
node.js is consistently an order of magnitude slower than the other two platforms.
Java:
real 0m18.823s
user 0m2.735s
sys 0m6.042s
Python:
real 0m18.600s
user 0m2.656s
sys 0m5.857s
node.js:
real 3m19.034s
user 2m43.460s
sys 0m24.668s
Interestingly, with Python and Java the client and server processes both use about half of a CPU. The client for node.js uses just about a full CPU and the server uses about 30% of a CPU. The client process also has an enormous number of page faults leading me to believe this is a memory issue. Also, at 10K requests node is only 3 times slower; it definitely slows down more the longer it runs.
Here's the client code (note that the process.exit() line doesn't work either, which is why I included an internal timer in addition to using the time command):
var zeromq = require("zeromq");
var counter = 0;
var startTime = new Date();
var maxnum = 10000;
var socket = zeromq.createSocket('req');
socket.connect("tcp://127.0.0.1:5502");
console.log("Connected to port 5502.");
function moo()
{
process.nextTick(function(){
socket.send('Hello');
if (counter < maxnum)
{
moo();
}
});
}
moo();
socket.on('message',
function(data)
{
if (counter % 1000 == 0)
{
console.log(data.toString('utf8'), counter);
}
if (counter >= maxnum)
{
var endTime = new Date();
console.log("Time: ", startTime, endTime);
console.log("ms : ", endTime - startTime);
process.exit(0);
}
//console.log("Received: " + data);
counter += 1;
}
);
socket.on('error', function(error) {
console.log("Error: "+error);
});
Server code:
var zeromq = require("zeromq");
var socket = zeromq.createSocket('rep');
socket.bind("tcp://127.0.0.1:5502",
function(err)
{
if (err) throw err;
console.log("Bound to port 5502.");
socket.on('message', function(envelope, blank, data)
{
socket.send(envelope.toString('utf8') + " Blancmange!");
});
socket.on('error', function(err) {
console.log("Error: "+err);
});
}
);
For comparison, the Python client and server code:
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5502")
for counter in range(0, 100001):
    socket.send("Hello")
    message = socket.recv()
    if counter % 1000 == 0:
        print message, counter
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://127.0.0.1:5502")
print "Bound to port 5502."
while True:
    message = socket.recv()
    socket.send(message + " Blancmange!")
And the Java client and server code:
package com.moo.test;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;
public class TestClient
{
    public static void main(String[] args)
    {
        Context context = ZMQ.context(1);
        Socket requester = context.socket(ZMQ.REQ);
        requester.connect("tcp://127.0.0.1:5502");
        System.out.println("Connected to port 5502.");
        for (int counter = 0; counter < 100001; counter++)
        {
            if (!requester.send("Hello".getBytes(), 0))
            {
                throw new RuntimeException("Error on send.");
            }
            byte[] reply = requester.recv(0);
            if (reply == null)
            {
                throw new RuntimeException("Error on receive.");
            }
            if (counter % 1000 == 0)
            {
                String replyValue = new String(reply);
                System.out.println(replyValue + " " + counter);
            }
        }
        requester.close();
        context.term();
    }
}
package com.moo.test;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;
public class TestServer
{
    public static void main(String[] args)
    {
        Context context = ZMQ.context(1);
        Socket socket = context.socket(ZMQ.REP);
        socket.bind("tcp://127.0.0.1:5502");
        System.out.println("Bound to port 5502.");
        while (!Thread.currentThread().isInterrupted())
        {
            byte[] request = socket.recv(0);
            if (request == null)
            {
                throw new RuntimeException("Error on receive.");
            }
            if (!socket.send(" Blancmange!".getBytes(), 0))
            {
                throw new RuntimeException("Error on send.");
            }
        }
        socket.close();
        context.term();
    }
}
I would like to like node, but with the vast difference in code size, simplicity, and performance, I'd have a hard time convincing myself at this point.
So, has anyone seen behavior like this before, or did I do something asinine in the code?
You're using a third-party C++ binding. As far as I understand it, the crossover between v8's "JS land" and bindings written in "C++ land" is very expensive. If you notice, some popular database bindings for node are implemented entirely in JS (partly, I'm sure, because people don't want to compile things, but also because it has the potential to be very fast).
If I remember correctly, when Ryan Dahl was writing the Buffer objects for node, he noticed that they were actually a lot faster if he implemented them mostly in JS as opposed to C++. He ended up writing what he had to in C++ and did everything else in pure JavaScript.
So, I'm guessing part of the performance issue here has to do with that particular module being a C++ binding.
Judging node's performance based on a third-party module is not a good way to determine its speed or quality. You would do a lot better to benchmark node's native TCP interface.
"can you try to simulate logic from your Python example (e.i send next message only after receiving previous)?" – Andrey Sidorov Jul 11 at 6:24
I think that's part of it:
var zeromq = require("zeromq");
var counter = 0;
var startTime = new Date();
var maxnum = 100000;
var socket = zeromq.createSocket('req');
socket.connect("tcp://127.0.0.1:5502");
console.log("Connected to port 5502.");
socket.send('Hello');
socket.on('message',
function(data)
{
if (counter % 1000 == 0)
{
console.log(data.toString('utf8'), counter);
}
if (counter >= maxnum)
{
var endTime = new Date();
console.log("Time: ", startTime, endTime);
console.log("ms : ", endTime - startTime);
socket.close(); // or the process.exit(0) won't work.
process.exit(0);
}
//console.log("Received: " + data);
counter += 1;
socket.send('Hello');
}
);
socket.on('error', function(error) {
console.log("Error: "+error);
});
This version doesn't exhibit the same increasing slowness as the previous one, probably because it isn't throwing as many requests as possible at the server while only counting responses, like the previous version did. It's about 1.5 times as slow as Python/Java, as opposed to 5-10 times slower with the previous version.
Still not a stunning endorsement of node for this purpose, but certainly a lot better than "abysmal".
This was a problem with the ZeroMQ bindings of node.
I don't know since when, but it has been fixed, and you get the same results as with the other languages.
I'm not all that familiar with node.js, but the way you're executing it is recursively creating new functions over and over again; no wonder it's blowing up. To be on par with Python or Java, the code needs to be more along the lines of:
if (counter < maxnum)
{
socket.send('Hello');
processmessages(); // or something similar in node.js if available
}
Any performance testing using REQ/REP sockets is going to be skewed due to round-tripping and thread latencies. You're basically waking up the whole stack, all the way down and up, for each message. It's not very useful as a metric because REQ/REP cases are never high performance (they can't be). There are two better performance tests:
Send many messages of various sizes, from 1 byte to 1K, and see how many you can send in e.g. 10 seconds. This gives you basic throughput and tells you how efficient the stack is.
Measure end-to-end latency of a stream of messages; i.e. insert a timestamp in each message and see what the deviation is on the receiver. This tells you whether the stack has jitter, e.g. due to garbage collection. A sketch of the latency test follows below.
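A sketch of the second test in Java, with the same jzmq API used elsewhere in this thread. PUSH/PULL is chosen to avoid REQ/REP round-tripping; the endpoint and message count are placeholders, and the timestamps are only comparable when both ends run on the same machine:

import java.nio.ByteBuffer;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;

public class LatencySketch
{
    public static void main(String[] args)
    {
        Context context = ZMQ.context(1);
        if (args.length > 0 && args[0].equals("send"))
        {
            Socket push = context.socket(ZMQ.PUSH);
            push.connect("tcp://127.0.0.1:5503");
            for (int i = 0; i < 100000; i++)
            {
                // Stamp each message with the sender's clock.
                byte[] msg = ByteBuffer.allocate(8).putLong(System.nanoTime()).array();
                push.send(msg, 0);
            }
        }
        else
        {
            Socket pull = context.socket(ZMQ.PULL);
            pull.bind("tcp://127.0.0.1:5503");
            long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
            for (int i = 0; i < 100000; i++)
            {
                long sent = ByteBuffer.wrap(pull.recv(0)).getLong();
                long latency = System.nanoTime() - sent;
                min = Math.min(min, latency);
                max = Math.max(max, latency);
            }
            // The spread between min and max is the jitter of the stack.
            System.out.println("latency ns: min=" + min + " max=" + max);
        }
        context.term();
    }
}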
Your client Python code is blocking in the loop. In the node example, you receive the events in the 'message' event handler asynchronously. If all you want from your client is to receive data from zmq, then your Python code will be more efficient because it is coded as a specialized one-trick pony. If you want to add features, like listening to other events that aren't using zmq, then you'll find it complicated to rewrite the Python code to do so. With node, all you need is to add another event handler. node will never be a performance beast for simple examples. However, as your project gets more complicated with more moving pieces, it's a lot easier to add features correctly to node than to the vanilla Python you've written. I'd much rather toss a little bit more money at hardware, increase readability, and decrease my development time/cost.
I want to check a huge number (thousands) of websites to see if they are still running, because I want to get rid of unnecessary entries in my wiki page about hosts files.
I want to do it in a 2-stage process:
Check if something is running on port 80.
Check the HTTP response code (if it's not 200 I have to check the site).
I want to multithread because, when checking thousands of addresses, I can't wait for timeouts.
This question is just about stage one.
I have the problem that ~1/4 of my connect attempts don't work, but if I retry the failed ones, about ~3/4 of them work. Do I not close the sockets correctly? Do I run into a limit of open sockets?
By default I run 16 threads, but I have the same problems with 8 or 4.
Is there something I'm missing?
I have simplified the code a little.
Here is the code of the Thread
public class SocketThread extends Thread {
    int tn;
    int n;
    String[] s;
    private ArrayList<String> good;
    private ArrayList<String> bad;

    public SocketThread(int tn, int n, String[] s) {
        this.tn = tn;
        this.n = n;
        this.s = s;
        good = new ArrayList<String>();
        bad = new ArrayList<String>();
    }

    @Override
    public void run() {
        int answer;
        for (int i = tn * (s.length / n); i < ((tn + 1) * (s.length / n)) - 1; i++) {
            answer = checkPort80(s[i]);
            if (answer == 1) {
                good.add(s[i]);
            } else {
                bad.add(s[i]);
            }
            System.out.println(s[i] + " | " + answer);
        }
    }
}
And here is the checkPort80 method:

public static int checkPort80(String host) {
    Socket socket = null;
    int reachable = -1;
    try {
        // One way of doing it:
        //socket = new Socket(host, 80);
        //socket.close();

        // Another way I've tried:
        socket = new Socket();
        InetSocketAddress ina = new InetSocketAddress(host, 80);
        socket.connect(ina, 30000);
        socket.close();
        reachable = 1;
    } catch (Exception e) {
    } finally {
        if (socket != null && socket.isBound()) {
            try {
                socket.close();
            } catch (Exception e) {
                e.getMessage();
            }
        }
    }
    return reachable;
}
About threads: I make an ArrayList of threads, create them and .start() them, and right afterwards I .join() them, get the "Good" and the "Bad" lists, and save them to files.
Help is appreciated.
PS: I rename the hosts file first so that it doesn't affect the process, so this is not an issue.
Edit:
Thanks to Marcelo Hernández Rishr I discovered that HttpURLConnection seems to be the better solution. It works faster, and I also get the HttpResponseCode, which I was interested in anyway (I just thought it would be much slower than only checking port 80). After a while I still suddenly get errors; I guess this has to do with the DNS server thinking this is a DoS attack ^^ (but I should examine further whether the error lies somewhere else). FYI, I use OpenDNS, so maybe they just don't like me ^^.
x4u suggested adding a sleep() to the threads, which seems to make things a little better, but whether it will help me raise entries/second I don't know.
Still, I can't (by far) get to the speed I wanted (10+ entries/second); even 6 entries per second doesn't seem to work.
Here are a few scenarios I tested (so far all without any sleep()).
number of threads | time until first round of errors | entries processed until then | entries/second
10                | 1 minute 17 seconds              | ~770 entries                 | 10
8                 | 3 minutes 55 seconds             | ~2000 entries                | 8.51
6                 | 6 minutes 30 seconds             | ~2270 entries                | 5.82
I will try to find a sweet spot with threads and sleep (or maybe simply pause everything for one minute if I get many errors).
The problem is, there are hosts files with one million entries, which at one entry per second would take 11 days; as I guess everyone understands, that is not acceptable.
Are there ways to switch DNS servers on the fly?
Any other suggestions?
Should I post the new questions as separate questions?
Thanks for the help until now.
I'll post new results in about a week.
I have 3 suggestions that may help you in your task.
Maybe you can use the class HttpURLConnection.
Use a maximum of 10 threads, because you are still limited by CPU, bandwidth, etc.
The lists good and bad shouldn't be part of your thread class; they could be static members of the class where you have your main method, with static synchronized methods to add members to both lists from any thread, as in the sketch below.
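A minimal sketch of suggestions 1 and 3 together (untested; class and method names are placeholders):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class Checker {
    private static final List<String> good = new ArrayList<String>();
    private static final List<String> bad = new ArrayList<String>();

    // Synchronized so any worker thread can record a result safely.
    public static synchronized void addGood(String host) { good.add(host); }
    public static synchronized void addBad(String host) { bad.add(host); }

    // Suggestion 1: ask for the response code directly instead of probing port 80.
    public static int responseCode(String host) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://" + host + "/").openConnection();
            conn.setRequestMethod("HEAD"); // headers only, no body download
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int code = conn.getResponseCode();
            conn.disconnect();
            return code;
        } catch (Exception e) {
            return -1; // unreachable
        }
    }
}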
Sockets usually try to shut down gracefully and wait for a response from the destination port. While they are waiting they still block resources, which can make successive connection attempts fail if they are executed while there are still too many open sockets.
To avoid this you can turn off the graceful lingering before you connect the socket (enabling SO_LINGER with a zero timeout makes close() reset the connection immediately):
socket.setSoLinger(true, 0);
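In the question's checkPort80() this would go right after creating the socket and before connect(); a sketch with the question's 30-second timeout kept:

socket = new Socket();
// SO_LINGER enabled with a zero timeout: close() resets the connection
// immediately instead of waiting for a graceful shutdown.
socket.setSoLinger(true, 0);
InetSocketAddress ina = new InetSocketAddress(host, 80);
socket.connect(ina, 30000);
socket.close();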