Trying to speed up reads from a Socket in Java

I'm trying to make a client that can send HTTP requests and receive responses from web servers. I tried using Java's HttpURLConnection class but it doesn't give me enough control over what actually gets sent to the server, so I'd like to compose my own HTTP request messages and send them over a Socket. However, reading from the Socket's InputStream is prohibitively slow for some servers, and I'd like to speed that up if possible. Here's some code that I used to test how slow the reads were for the socket as compared to the HttpURLConnection:
public static void useURLConnection() throws Exception
{
    URL url = new URL("http://" + hostName + "/");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    InputStream in = conn.getInputStream();
    byte[] buffer = new byte[buffersize];
    long start = System.currentTimeMillis();
    while(in.read(buffer) != -1) { }
    System.out.println(System.currentTimeMillis() - start);
}
public static void useSocket() throws Exception
{
    byte[] request = ("GET / HTTP/1.1\r\nHost: " + hostName + "\r\n\r\n").getBytes();
    Socket socket = new Socket(hostName, 80);
    OutputStream out = socket.getOutputStream();
    InputStream in = socket.getInputStream();
    out.write(request);
    byte[] buffer = new byte[buffersize];
    long start = System.currentTimeMillis();
    while(in.read(buffer) != -1) { }
    System.out.println(System.currentTimeMillis() - start);
}
Both methods run in about the same amount of time for some servers, such as www.wikipedia.org, but reading from the socket is much slower -- minutes as opposed to milliseconds -- for others, such as www.google.com. Can someone explain why this is, and perhaps give me some pointers as to what, if anything, I can do to speed up the reads from the socket? Thanks.

HTTP/1.1 uses persistent (keep-alive) connections by default. Your socket example sends HTTP/1.1 as its version string, so you are implicitly telling the server that you support keep-alive, yet your code ignores it entirely.
Basically, you are blocked trying to read more from the server while the server is waiting for you to do something (either send another request or close the connection).
You need to either send a "Connection: close" header or send HTTP/1.0 as your version string.
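A minimal sketch of the first option, keeping HTTP/1.1 but adding the Connection: close header. The class and method names (CloseRequest, buildRequest) are made up for illustration; only the header itself comes from the answer above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a GET request that asks the server to close
// the connection after one response, so a read-until-EOF loop can terminate.
final class CloseRequest {

    static byte[] buildRequest(String hostName) {
        String request = "GET / HTTP/1.1\r\n"
                + "Host: " + hostName + "\r\n"
                + "Connection: close\r\n"   // opt out of keep-alive
                + "\r\n";                   // blank line ends the header section
        return request.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Print the request so the exact bytes on the wire can be inspected.
        System.out.print(new String(buildRequest("www.example.com"), StandardCharsets.US_ASCII));
    }
}
```

In useSocket() from the question, writing these bytes instead of the original request should let the while loop finish as soon as the server has sent its response and closed the connection.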

Related

How to submit HTTP request with an INTENTIONAL syntax error?

I'm trying to write a simple test where I submit a request to http://localhost:12345/%, knowing that this is an illegal URI, because I want to assert that my HTTP Server's error-handling code behaves correctly. However, I am having a hard time forcing Java to do this.
If I try to create a Java 11 HttpRequest with URI.create("localhost:12345/%"), I get a URISyntaxException, which is correct but not helpful.
Similarly, using a JAX-RS WebTarget:
ClientBuilder.newBuilder().build().target("http://localhost:12345").path("/%")
builds me a WebTarget pointing to /%25, which would normally be very helpful, but is not what I want in this particular situation.
Is there a way to test my error-handling behavior without resorting to low-level bytestream manipulation?
Another possibility is just to use plain Socket - it's easy enough if you know the protocol (especially if using the new text-block feature). This will allow you to misformat the request in any way you like. Reading the response and analysing the result is - of course - a bit more involved:
String request = """
GET %s HTTP/1.1\r
Host: localhost:%s\r
Connection: close\r
\r
""".formatted("/%", port);
try (Socket client = new Socket("localhost", port);
     OutputStream os = client.getOutputStream();
     InputStream in = client.getInputStream()) {
    os.write(request.getBytes(StandardCharsets.US_ASCII));
    os.flush();
    // This is optimistic: the server should close the
    // connection since we asked for it, and we're hoping
    // that the response will be in ASCII for the headers
    // and UTF-8 for the body - and that it won't use
    // chunk encoding.
    byte[] bytes = in.readAllBytes();
    String response = new String(bytes, StandardCharsets.UTF_8);
    System.out.println("response: " + response);
}
Noah's comment led me down the right path; I was able to do this with the URL class:
@Test
public void testUriMalformed() throws Exception {
    final URL url = new URL(TEST_URI + "/%");
    final HttpURLConnection connection = (HttpURLConnection)url.openConnection();
    final int code = connection.getResponseCode();
    final String contentType = connection.getHeaderField("Content-Type");
    final String entity = IOUtils.toString(connection.getErrorStream(), Charsets.UTF_8);
    assertEquals(500, code);
    assertEquals(MediaType.APPLICATION_JSON, contentType);
    assertTrue(entity.contains("error_id"));
}

Java HTTP/1.1 GET request BufferedReader readLine never stops

Hello, I'm making an HTTP client and trying to fetch google.com's HTML. My problem is that BufferedReader.readLine() blocks endlessly, apparently because the remote server doesn't send a blank line. Or could it be that my request is wrong?
Appreciate any help!
public static void main(String[] args) {
    String uri = "www.google.com";
    int port = 80;
    Socket socket = new Socket(uri, port);
    PrintWriter toServer = new PrintWriter(socket.getOutputStream(), true);
    InputStream inputStream = socket.getInputStream();
    get(uri, port, language, socket, toServer, inputStream);
}
public static void get(String uri, int port, String language, Socket socket, PrintWriter toServer, InputStream inputStream) {
    try {
        toServer.println("GET / HTTP/1.1");
        toServer.println("Host: " + uri + ":" + port);
        toServer.println();
        // Parse header
        StringBuilder stringBuilder = new StringBuilder();
        BufferedReader fromServer = new BufferedReader(new InputStreamReader(inputStream));
        String line;
        while ((line = fromServer.readLine()) != null) {
            stringBuilder.append(line);
        }
        System.out.println("done");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
You are sending an HTTP/1.1 request, which enables HTTP keep-alive by default. This means the server may keep the TCP connection open after sending the response so that it can accept more requests from the client. Your code instead assumes that the server will close the connection once the response is finished, because it waits for readLine to return null. Since the server will not close the connection (or will do so only after a long timeout), readLine just blocks.
To fix this, either use HTTP/1.0 (which has keep-alive off by default) instead of HTTP/1.1, or explicitly tell the server that no more requests will be sent by adding a Connection: close header.
Please note that in general HTTP is way more complex than you might think if you've just seen a few examples. The problem you face in your question is only a glimpse into more problems which you will face when continuing this path. If you really want to implement your own HTTP handling instead of using established libraries please study the actual standard instead of just assuming a specific behavior.
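To give a feel for what "more complex" means, here is a rough sketch of reading one response on a keep-alive connection by honouring Content-Length instead of waiting for the connection to close. The class ContentLengthReader and its methods are hypothetical, and the sketch deliberately ignores chunked transfer encoding, responses without a body, and responses without a Content-Length header, all of which a real client has to handle.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: reads the status line and headers, then exactly
// Content-Length bytes of body, and leaves the rest of the stream untouched.
final class ContentLengthReader {

    // Reads bytes up to the next LF (or end of stream); CR bytes are dropped.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    static byte[] readBody(InputStream in) throws IOException {
        readLine(in);                                // status line, ignored in this sketch
        int contentLength = -1;
        String line;
        while (!(line = readLine(in)).isEmpty()) {   // a blank line ends the headers
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                contentLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (contentLength < 0) throw new IOException("no Content-Length header");
        byte[] body = new byte[contentLength];
        int off = 0;
        while (off < contentLength) {                // read() may return fewer bytes than asked
            int n = in.read(body, off, contentLength - off);
            if (n == -1) throw new EOFException("connection closed mid-body");
            off += n;
        }
        return body;
    }
}
```

Because readLine pulls one byte at a time, the socket's stream would normally be wrapped in a BufferedInputStream before being passed to readBody.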

TCP detect disconnected server from client

I'm writing a simple TCP client/server program pair in Java, and the server must disconnect if the client hasn't sent anything in 10 seconds. socket.setSoTimeout() gets me that, and the server disconnects just fine. The problem is - how can I get the client to determine if the server is closed? Currently I'm using DataOutputStream for writing to the server, and some answers here on SO suggest that writing to a closed socket will throw an IOException, but that doesn't happen.
What kind of writer object should I use to send arbitrary byte blocks to the server, that would throw an exception or otherwise indicate that the connection has been closed remotely?
Edit: here's the client code. This is a test function that reads one file from the file system and sends it to the server. It sends it in chunks, and pauses for some time between each chunk.
public static void sendFileWithTimeout(String file, String address, int dataPacketSize, int timeout) {
    Socket connectionToServer = null;
    DataOutputStream outStream = null;
    FileInputStream inStream = null;
    try {
        connectionToServer = new Socket(address, 2233);
        outStream = new DataOutputStream(connectionToServer.getOutputStream());
        Path fileObject = Paths.get(file);
        outStream.writeUTF(fileObject.getFileName().toString());
        byte[] data = new byte[dataPacketSize];
        inStream = new FileInputStream(fileObject.toFile());
        boolean fileFinished = false;
        while (!fileFinished) {
            int bytesRead = inStream.read(data);
            if (bytesRead == -1) {
                fileFinished = true;
            } else {
                outStream.write(data, 0, bytesRead);
                System.out.println("Thread " + Thread.currentThread().getName() + " wrote " + bytesRead + " bytes.");
                Thread.sleep(timeout);
            }
        }
    } catch (IOException | InterruptedException e) {
        System.out.println("Something something.");
        throw new RuntimeException("Problem sending data to server.", e);
    } finally {
        TCPUtil.silentCloseObject(inStream);
        TCPUtil.silentCloseObject(outStream);
        TCPUtil.silentCloseObject(connectionToServer);
    }
}
I'd expect the outStream.write to throw an IOException when it tries to write to a closed server, but nothing.
"I'd expect the outStream.write to throw an IOException when it tries to write to a closed server, but nothing."
It won't do that the first time, because of the socket send buffer. If you keep writing, it will eventually throw an IOException: 'connection reset'. If you don't have data to get to that point, you will never find out that the peer has closed.
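Since writes only fail late, one way to notice the close sooner is to read from the socket: read() returns -1 once the peer has closed its side. The loopback demo below is a sketch with hypothetical names (PeerCloseDemo, peerClosed), and it assumes the server never sends application data, because the probe consumes one byte if any arrives.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Loopback demo: the "server" accepts a connection and closes it at once;
// the client learns about the close from read(), not from write().
final class PeerCloseDemo {

    // Returns true if the peer has closed its side of the connection.
    // Blocks until either a byte arrives or the end of stream is reached.
    static boolean peerClosed(Socket socket) throws IOException {
        InputStream in = socket.getInputStream();
        return in.read() == -1;
    }

    static boolean run() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            server.accept().close();     // server side hangs up right away
            return peerClosed(client);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer closed: " + run());
    }
}
```

In a client like the one in the question, the blocking read would have to run on its own thread, or with a read timeout set via Socket.setSoTimeout(), so that it does not stall the sending loop.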
I think you need to flush and close your streams once you have finished writing, for example: outStream.flush(); outStream.close(); inStream.close();
Remember that ServerSocket.setSoTimeout() is different from the client-side method of the same name.
On the server, it only makes the blocking call throw a SocketTimeoutException for you to catch once the timeout expires; the server socket itself remains open.
On the client, setSoTimeout() sets the read timeout for reads from the socket's stream.
In your case, you should show the server code that closes the connected socket after catching the SocketTimeoutException, to confirm that the server really closes the socket associated with that client. If it does, then on the client side your line:
throw new RuntimeException("Problem sending data to server.", e);
will be reached.
[Update]
I noticed that you set the timeout for the accepted socket on the server to 10 seconds (10,000 milliseconds). Did your client finish sending the whole file within that period? If it did, the exception will never occur.
[Suggestion]
To probe this, comment out the code that reads the file content and sends it to the server, and replace it with a few writes to the output stream:
outStream.writeUTF("ONE");
outStream.writeUTF("TWO");
outStream.writeUTF("THREE");
Then you can draw your own conclusion.

Sending Java GET parameters to the server (without hanging)

I would like to send a GET parameter to a server. I really do not need the InputStream (below), but the request is actually sent when I call "getInputStream". The problem is, this code hangs on getInputStream. The timeout does not apply because the connection is actually established (does not time-out).
What do I need to change so that I'm sending a clean GET to the server without hanging?
URL url = new URL("http://localhost:8888/abc?message=abc"); //[edit]
URLConnection uc = url.openConnection();
uc.setRequestProperty("Accept-Charset", "UTF-8");
uc.setConnectTimeout(1000);
InputStream in = uc.getInputStream();
in.close();
In case it matters, I'm testing with netcat -l as the server instead of using an actual web server. Nonetheless, I would like this code to be very fail-safe, so that the server can't adversely affect it.
I basically gave up on using the URLConnection and wrote the code to use a socket instead. I'm still open to improvements; lightweight posting to a web server is very useful.
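URL u = new URL("http://localhost:8888/abc?message=abc");
// Request target: path plus query. The fragment (u.getRef()) is
// client-side only and must not be sent to the server.
String get = "";
if (u.getPath() != null)
    get += u.getPath();
if (u.getQuery() != null)
    get += "?" + u.getQuery();
Socket socket = new Socket();
// 750 ms connect timeout; assumes the URL names an explicit port
// (getPort() returns -1 otherwise).
socket.connect(new InetSocketAddress(u.getHost(), u.getPort()), 750);
OutputStream out = socket.getOutputStream();
// A full request line with version, a Host header, CRLF line endings,
// and a blank line to end the header section.
out.write(("GET " + get + " HTTP/1.0\r\nHost: " + u.getHost() + "\r\n\r\n").getBytes());
out.close(); // also closes the socket; the response is never read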
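For the original URLConnection approach, the hang happens while waiting for the response, which setConnectTimeout() does not cover. URLConnection also has setReadTimeout(), which makes getInputStream() throw a SocketTimeoutException if the server stays silent. The sketch below is only an illustration: the class ReadTimeoutDemo, its methods, and the 500 ms value are assumptions, and a listening socket that never answers stands in for netcat -l.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.URLConnection;

final class ReadTimeoutDemo {

    // Sends a GET and returns true if a response arrived, or false if the
    // server stayed silent for longer than readTimeoutMillis.
    static boolean sendGet(URL url, int readTimeoutMillis) throws IOException {
        URLConnection uc = url.openConnection();
        uc.setRequestProperty("Accept-Charset", "UTF-8");
        uc.setConnectTimeout(1000);              // limits connection setup only
        uc.setReadTimeout(readTimeoutMillis);    // limits the wait for response data
        try (InputStream in = uc.getInputStream()) {
            return true;
        } catch (SocketTimeoutException e) {
            return false;
        }
    }

    // Stand-in for "netcat -l": a listener that never accepts or answers.
    static boolean demo() throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/abc?message=abc");
            return sendGet(url, 500);
        }
    }
}
```

With both timeouts set, the call is bounded whether the server is unreachable or merely unresponsive.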

Java server socket communication is VERY slow

Local on Linux. It's about 10 seconds for a 20k message. My guess is my Java is bad and Python is fine.
py client:
def scan(self, msg):
    try:
        print 'begin scan'
        HOST = 'localhost'
        PORT = 33000
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((HOST, PORT))
        s.sendall(msg)
        data = s.recv(1024)
        s.close()
        print 'Received', repr(data)
    except Exception, e:
        print "error: " + str(e)
Java server:
ServerSocket service = new ServerSocket(33000);
while(true) {
    debug("Begin waiting for connection");
    //this spins
    Socket connection = service.accept();
    debug("Connection received from " + connection.getInetAddress().getHostName());
    OutputStreamWriter out = new OutputStreamWriter(connection.getOutputStream());
    BufferedInputStream in = new BufferedInputStream(connection.getInputStream());
    ScanResultsHeader results = new ScanResultsHeader();
    Scanner scanner = new Scanner();
    results = scanner.scan("scannerfake@gmail.com", "123", in);
and
public ScanResultsHeader scan (String userEmail,
        String imapRetrievalId,
        BufferedInputStream mimeEmail)
        throws IOException, FileNotFoundException, MimeException, ScannerException {
    //how fast would it be to just slurp up stream?
    debug("slurp!");
    String slurp = IOUtils.toString(mimeEmail);
    debug("slurped " + slurp.length() + " characters");
    slurp = slurp.toLowerCase();
    debug("lc'ed it");
    //...
My guess is I'm juggling the input streams wrong. One catch is the "BufferedInputStream mimeEmail" signature is required by the library API scan is using, so I'll need to get to that form eventually. But I noticed the simple act of slurping up a string takes ludicrously long so I'm already doing something incorrect.
Revising my answer....
If you are reading efficiently, and it appears you are, it will only be taking a long time for one of these reasons:
You are creating a new connection every time you send a message, which can be very expensive.
You are not sending the data as fast as you think.
The message is very large (unlikely, but it could be).
There are plenty of examples of how to do this, and IOUtils is a good library that makes it simpler.
You should be able to send about 200K messages per second over a single socket in Java.
If your protocol sends the payload length as a big-endian int followed by that many bytes, you can read a message like this:
DataInputStream dis = new DataInputStream( ...
int len = dis.readInt();
byte[] bytes = new byte[len];
dis.readFully(bytes);
String text = new String(bytes, "UTF-8");
The original problem was that the client never signals end-of-input, so the "slurp" operation keeps waiting for more data to cross the connection.
The solution was to implement an application-layer protocol that sends the size of the message in advance and then stops reading after that many bytes. I would have preferred a standard library class, something like a FiniteInputStream that extends BufferedInputStream and takes a size as an argument, but I wrote my own.
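The poster's own implementation is not shown, so the following is only a sketch of what such a length-prefixed protocol could look like on the Java side. The class LengthPrefixed and its two methods are hypothetical; the 4-byte big-endian length follows from DataOutputStream.writeInt and DataInputStream.readInt.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Each message on the wire is a 4-byte big-endian length followed by
// that many bytes of UTF-8 payload.
final class LengthPrefixed {

    static void writeMessage(OutputStream out, String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(payload.length);   // length prefix
        dos.write(payload);
        dos.flush();
    }

    static String readMessage(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int len = dis.readInt();
        byte[] payload = new byte[len];
        dis.readFully(payload);         // returns once exactly len bytes have arrived
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In-memory round trip of two messages over one "connection".
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMessage(wire, "first");
        writeMessage(wire, "second");
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```

The reader never depends on the sender closing the connection, so the same socket can carry further messages. A Python client would need to send the same 4-byte big-endian length before the payload, for instance with struct.pack('>i', len(msg)).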
