I'm trying to implement a simple client-server application, using NIO.
As an exercise, communication should be text-based and line-oriented. But when the server reads the bytes sent by the client, it gets nothing, or rather, the buffer is filled with a bunch of zeroes.
I'm using a selector, and this is the code that is triggered when the channel is readable:
private void handleRead() throws IOException {
System.out.println("Handler Read");
while (lineIndex < 0) {
buffer.clear();
switch (channel.read(buffer)) {
case -1:
// Close the connection.
return;
case 0:
System.out.println("Nothing to read.");
return;
default:
System.out.println("Converting to String...");
buffer.flip();
bufferToString();
break;
}
}
// Do something with the line read.
}
In this snippet, lineIndex is an int holding the index at which the first \n occurred, when reading. It is initialized with -1, meaning there's no \n present.
The variable buffer references a ByteBuffer, and channel represents a SocketChannel.
To keep it simple, without Charsets and whatnot, this is how bufferToString is coded:
private void bufferToString() {
char c;
System.out.println("-- Buffer to String --");
for (int i = builder.length(); buffer.remaining() > 1; ++i) {
c = buffer.getChar();
builder.append(c);
System.out.println("Appending: " + c + "(" + (int) c + ")");
if (c == '\n' && lineIndex < 0) {
System.out.println("Found a new-line character!");
lineIndex = i;
}
}
}
The variable builder holds a reference to a StringBuilder.
I expected getChar to do a reasonable conversion, but all I get in my output is a bunch (corresponding to half of the buffer capacity) of
Appending: (0)
Terminated by a
Nothing to read.
Any clues of what may be the cause? I have similar code in the client which is also unable to properly read anything from the server.
If it is of any help, here is a sample of what the writing code looks like:
private void handleWrite() throws IOException {
buffer.clear();
String msg = "Some message\n";
for (int i = 0; i < msg.length(); ++i) {
buffer.putChar(msg.charAt(i));
}
channel.write(buffer);
}
I've also confirmed that the result from channel.write is greater than zero, reassuring that the bytes are indeed written and sent.
Turns out, this was a buffer indexing problem. In the server, a flip() was missing before writing to the socket. In the client code, a few flip() were missing too, after reading and before writing. Now everything works as expected.
Current writing code (server side):
private void handleWrite() throws IOException {
String s = extractLine();
for (int i = 0, len = s.length(); i < len;) {
buffer.clear();
while (buffer.remaining() > 1 && i < len) {
buffer.putChar(s.charAt(i));
++i;
}
buffer.flip();
channel.write(buffer);
}
// some other operations...
}
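The effect of a missing flip() can be reproduced without any sockets. The following is a minimal sketch, not the code from the post: FlipDemo, encode and decode are hypothetical names introduced here, and it assumes the default big-endian byte order with two bytes per char, which is what putChar and getChar use.

```java
import java.nio.ByteBuffer;

final class FlipDemo {

    /** Encodes text with putChar (two bytes per char) and prepares the buffer for writing. */
    static ByteBuffer encode(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(2 * text.length());
        for (int i = 0; i < text.length(); i++) {
            buffer.putChar(text.charAt(i));
        }
        // flip() sets limit = position and position = 0, so a following
        // channel.write(buffer) would send exactly the bytes just put.
        buffer.flip();
        return buffer;
    }

    /** Decodes a buffer that is ready for reading (position at the first byte to read). */
    static String decode(ByteBuffer buffer) {
        StringBuilder sb = new StringBuilder();
        while (buffer.remaining() >= 2) {
            sb.append(buffer.getChar());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ByteBuffer flipped = encode("Hi\n");
        System.out.println(flipped.remaining());              // 6
        System.out.println(decode(flipped).equals("Hi\n"));   // true

        // Without flip(), the readable region is whatever lies between
        // position and limit: the untouched, zero-filled rest of the buffer.
        ByteBuffer unflipped = ByteBuffer.allocate(16);
        unflipped.putChar('H').putChar('i');
        System.out.println(unflipped.remaining());            // 12
        System.out.println((int) unflipped.getChar());        // 0
    }
}
```

The second half of main mirrors the symptom in the question: a write issued without flip() sends the zero-filled remainder of the buffer, so the reader sees a run of (0) characters.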
I'm having some issues with Arduino. In class, we are learning Arduino/Java communication, so we are asked to interpret bytes sent from the Arduino and write them to the Eclipse console as whatever type the "key" of the message specifies.
As of now, I'm just testing input streams, but I can't seem to get a complete message ever. This is what I'm doing:
public void run() throws SerialPortException {
while (true) {
if (port.available()) { //code written in another class, referenced below
byte byteArray[] = port.readByte(); //also code written in another class, referenced below
char magicNum = (char) byteArray[0];
String outputString = null;
for (int i = 0; i < byteArray.length; ++i) {
char nextChar = (char) byteArray[i];
outputString += Character.toString(nextChar);
}
System.out.println(outputString);
}
}
}
Below is the code from the other class that is used in the code above:
public boolean available() throws SerialPortException {
if (port.getInputBufferBytesCount() == 0) {
return false;
}
return true;
}
public byte[] readByte() throws SerialPortException {
boolean debug= true;
byte bytesRead[] = port.readBytes();
if (debug) {
System.out.println("[0x" + String.format("%02x", bytesRead[0]) + "]");
}
return bytesRead;
}
It is not possible to know when data is going to be available, nor whether input data is going to be available all at once rather than in several chunks.
This is a quick and dirty fix:
public void run() throws SerialPortException {
String outputString = "";
while (true) {
if (port.available()) {
byte byteArray[] = port.readByte();
for (int i = 0; i < byteArray.length; ++i) {
char nextChar = (char) byteArray[i];
if (nextChar == '\n') {
System.out.println(outputString);
outputString = "";
} else {
// Only keep characters that belong to the line itself, not the terminator.
outputString += Character.toString(nextChar);
}
}
}
}
}
The declaration of outputString is moved out of the loop, and it is initialized to "" to get rid of that ugly null on standard output.
Each time \n is encountered in the serial input data, the content of outputString is printed on standard output first and cleared afterwards.
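The same idea can be packaged so that the chunk handling is testable without a serial port. This is only a sketch under stated assumptions: LineAssembler and feed are hypothetical names, the data is assumed to be single-byte (ASCII/Latin-1) text, and a '\r' before '\n' is simply dropped.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineAssembler {
    // Characters received so far that do not yet form a complete line.
    private final StringBuilder pending = new StringBuilder();

    /** Feeds one chunk of input; returns the lines completed by this chunk, without '\n'. */
    List<String> feed(byte[] chunk) {
        List<String> lines = new ArrayList<>();
        for (byte b : chunk) {
            char c = (char) (b & 0xFF); // assumption: one byte per character
            if (c == '\n') {
                lines.add(pending.toString());
                pending.setLength(0);
            } else if (c != '\r') {
                pending.append(c);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        LineAssembler assembler = new LineAssembler();
        // A message arriving in three arbitrary chunks.
        System.out.println(assembler.feed("hel".getBytes(StandardCharsets.US_ASCII)));     // []
        System.out.println(assembler.feed("lo\nwor".getBytes(StandardCharsets.US_ASCII))); // [hello]
        System.out.println(assembler.feed("ld\n".getBytes(StandardCharsets.US_ASCII)));    // [world]
    }
}
```

In the polling loop, each array returned by port.readByte() would be passed to feed, and every returned line printed or parsed.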
I'm trying to code a program where I can:
Load a file
Input start and end offset addresses between which to scan the data
Scan that offset range in search of specific sequence of bytes (such as "05805A6C")
Retrieve the offset of every match and write them to a .txt file
i66.tinypic.com/2zelef5.png
As the picture shows I need to search the file for "05805A6C" and then print to a .txt file the offset "0x21F0".
I'm using Java Swing for this. So far I've been able to load the file into a byte[] array, but I haven't found a way to search for a specific sequence of bytes, or to restrict that search to a range of offsets.
This is my code that opens and reads the file into byte array[]
public class Read {
static public byte[] readBytesFromFile () {
try {
JFileChooser chooser = new JFileChooser();
int returnVal = chooser.showOpenDialog(null);
if (returnVal == JFileChooser.APPROVE_OPTION) {
FileInputStream input = new FileInputStream(chooser.getSelectedFile());
byte[] data = new byte[input.available()];
input.read(data);
input.close();
return data;
}
return null;
}
catch (IOException e) {
System.out.println("Unable to read bytes: " + e.getMessage());
return null;
}
}
}
And my code where I try to search among the bytes.
byte[] model = Read.readBytesFromFile();
String x = new String(model);
boolean found = false;
for (int i = 0; i < model.length; i++) {
if(x.contains("05805A6C")){
found = true;
}
}
if(found == true){
System.out.println("Yes");
}else{
System.out.println("No");
}
Here's a bomb-proof [1] way to search for a sequence of bytes in a byte array:
public boolean find(byte[] buffer, byte[] key) {
for (int i = 0; i <= buffer.length - key.length; i++) {
int j = 0;
while (j < key.length && buffer[i + j] == key[j]) {
j++;
}
if (j == key.length) {
return true;
}
}
return false;
}
There are more efficient ways to do this for large-scale searching; e.g. using the Boyer-Moore algorithm. However:
converting the byte array to a String and using Java string search is NOT more efficient, and it is potentially fragile depending on what encoding you use when converting the bytes to a string.
converting the byte array to a hexadecimal encoded String is even less efficient ... and memory hungry ... though not fragile if you have enough memory. (You may need up to 5 times the memory as the file size while doing the conversion ...)
[1] bomb-proof, modulo any bugs :-)
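To cover the rest of the question (a hex pattern, an offset range, and the offset of every match), the same loop can be extended. This is a sketch rather than a definitive implementation: ByteSearch, parseHex and findAll are hypothetical names, the range is assumed to be half-open [from, to), and a match is assumed to count only if it lies entirely inside that range.

```java
import java.util.ArrayList;
import java.util.List;

final class ByteSearch {

    /** Parses an even-length hex string such as "05805A6C" into bytes. */
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Returns the offset of every occurrence of key lying entirely within [from, to). */
    static List<Integer> findAll(byte[] data, byte[] key, int from, int to) {
        List<Integer> offsets = new ArrayList<>();
        if (key.length == 0) {
            return offsets; // an empty key would match everywhere; treat it as no match
        }
        int end = Math.min(to, data.length);
        for (int i = Math.max(from, 0); i <= end - key.length; i++) {
            int j = 0;
            while (j < key.length && data[i + j] == key[j]) {
                j++;
            }
            if (j == key.length) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] data = parseHex("0005805A6C05805A6C");
        byte[] key = parseHex("05805A6C");
        for (int offset : findAll(data, key, 0, data.length)) {
            System.out.println("0x" + Integer.toHexString(offset).toUpperCase()); // 0x1, then 0x5
        }
    }
}
```

The formatted offsets can then be collected into a list of strings and written out with java.nio.file.Files.write.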
EDIT: The default charset differs from system to system, so you may get different results. Here is another approach that searches a hex-encoded string instead:
String x = HexBin.encode(model);
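String b = "05805A6C"; // the pattern from the question; assumes encode() produces upper-case hex
int index = 0;
while ((index = x.indexOf(b, index)) != -1)
{
// Only an even index is a match on a byte boundary; an odd one straddles two bytes.
if (index % 2 == 0) {
System.out.println("0x" + Integer.toHexString(index / 2));
}
index = index + 1;
}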
...
Suppose you are writing a reader which removes all instances of a character (let's say that you are removing 'x'.)
You might write that like this:
public class ExampleReader extends FilterReader {
public ExampleReader(Reader in) {
super(in);
}
@Override
public int read() throws IOException {
int ch;
while ((ch = in.read()) != -1) {
if (ch != 'x') {
return ch;
}
}
// End of stream reached without finding a character to return.
return -1;
}
@Override
public int read(char[] cbuf, int off, int len) throws IOException {
int charsRead = in.read(cbuf, off, len);
if (charsRead == -1) {
return -1;
}
// srcPos will always be >= dstPos
int charsRemoved = 0;
int srcEnd = off + charsRead;
for (int srcPos = off, dstPos = off; srcPos < srcEnd; srcPos++, dstPos++) {
char ch = cbuf[srcPos];
if (ch == 'x') {
dstPos--;
charsRemoved++;
} else {
cbuf[dstPos] = cbuf[srcPos];
}
}
return charsRead - charsRemoved;
}
}
In code review, another developer claims that if you return less than len, you're not supposed to have written any characters outside of the slice you read, according to the return value. However, this is not mentioned in the docs at all - they just say that the len value passed in is the maximum number of chars to read.
My own view is that if you are passed len, then you have been given permission to write anything you want in off..off+len and if you happen to return less, then you're not making any guarantees about the contents of the rest of the array. Likewise, if I were calling a reader, I wouldn't assume that the data outside of the range which was returned is meaningful to read.
Who is right?
(As a side note, what I actually implemented was line separator normalisation, \r\n and others to \n. I was sure that something so common would have been in Guava, but it didn't seem to be. Is it really that rare a task?)
The Javadoc for InputStream.read(byte[], int, int) says "these bytes will be stored in elements b[off] through b[off+k-1], leaving elements b[off+k] through b[off+len-1] unaffected." It's not reiterated for Reader.read(char[], int, int), but it clearly should be.
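If you want to honour the stricter InputStream-style contract in a filtering reader, one option is to read into a scratch array and copy only the characters you keep. The sketch below is an assumption-laden illustration, not the reviewed code: StripXReader and demo are hypothetical names, and allocating a scratch array per call trades some garbage for simplicity.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class StripXReader extends FilterReader {

    StripXReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch != 'x') {
                return ch;
            }
        }
        return -1;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        char[] scratch = new char[len];
        while (true) {
            int n = in.read(scratch, 0, len);
            if (n <= 0) {
                return n; // -1 at end of stream
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {
                if (scratch[i] != 'x') {
                    cbuf[off + kept++] = scratch[i];
                }
            }
            if (kept > 0) {
                return kept; // only cbuf[off .. off+kept-1] has been written
            }
            // The whole chunk was filtered out: read again rather than return 0.
        }
    }

    /** Reads "axbxc" into a '#'-filled array and reports the count and array contents. */
    static String demo() {
        char[] buf = new char[8];
        Arrays.fill(buf, '#');
        try (Reader r = new StripXReader(new StringReader("axbxc"))) {
            int n = r.read(buf, 0, buf.length);
            return n + " " + new String(buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3 abc#####
    }
}
```

The '#' characters after "abc" show that nothing beyond the returned count was touched, which is the behaviour the reviewer was asking for.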
Does read() return -1 if EOF is reached during the read operation, or on the subsequent call? The Java docs aren't entirely clear on this and neither is the book I'm reading.
The following code from the book reads a file containing three different types of values in a repeating pattern: a double, a string of varying length and a binary long. The buffer is supposed to fill up at some arbitrary place in the middle of any of the values, and the code will handle it. What I don't understand is that if the -1 is returned during the read operation, the last values won't get output in the printf statement.
try(ReadableByteChannel inCh = Files.newByteChannel(file)) {
ByteBuffer buf = ByteBuffer.allocateDirect(256);
buf.position(buf.limit());
int strLength = 0;
byte[] strChars = null;
while(true) {
if(buf.remaining() < 8) {
if(inCh.read(buf.compact()) == -1) {
break;
}
buf.flip();
}
strLength = (int)buf.getDouble();
if (buf.remaining() < 2*strLength) {
if(inCh.read(buf.compact()) == -1) {
System.err.println("EOF found while reading the prime string.");
break;
}
buf.flip();
}
strChars = new byte[2*strLength];
buf.get(strChars);
if(buf.remaining() <8) {
if(inCh.read(buf.compact()) == -1) {
System.err.println("EOF found reading the binary prime value.");
break;
}
buf.flip();
}
System.out.printf("String length: %3s String: %-12s Binary Value: %3d%n",
strLength, ByteBuffer.wrap(strChars).asCharBuffer(), buf.getLong());
}
System.out.println("\n EOF Reached.");
}
I suggest writing a simple test to see how it works, like this:
ReadableByteChannel in = Files.newByteChannel(Paths.get("1.txt"));
ByteBuffer b = ByteBuffer.allocate(100);
System.out.println(in.read(b));
System.out.println(in.read(b));
1.txt contains 1 byte, the test prints
1
-1
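A self-contained variant of that test, which records the return value of every read until -1 appears, might look like the sketch below. EofDemo and readResults are hypothetical names, and a temporary file stands in for 1.txt. For a regular local file of 5 bytes and a 4-byte buffer I would expect [4, 1, -1]: the read that reaches the end of the file still returns the bytes it got, and only the following call returns -1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class EofDemo {

    /** Writes content to a temporary file and returns the result of each read, ending with -1. */
    static List<Integer> readResults(byte[] content, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        try {
            Path file = Files.createTempFile("eof-demo", ".bin");
            try {
                Files.write(file, content);
                List<Integer> results = new ArrayList<>();
                try (ReadableByteChannel in = Files.newByteChannel(file)) {
                    ByteBuffer buf = ByteBuffer.allocate(bufferSize);
                    int n;
                    do {
                        buf.clear();      // make the whole buffer available again
                        n = in.read(buf); // bytes read, or -1 once nothing is left
                        results.add(n);
                    } while (n != -1);
                }
                return results;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readResults(new byte[] {1, 2, 3, 4, 5}, 4));
    }
}
```

So a loop that breaks as soon as read returns -1 does not lose data, provided everything already in the buffer has been processed before that call.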
I have a Java application that fetches data as XML, but once in a while the data contains some sort of control code:
An invalid XML character (Unicode: 0x6) was found in the CDATA section.
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x6) was found in the CDATA section.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at domain.Main.processLogFromUrl(Main.java:342)
at domain.Main.<init>(Main.java:67)
at domain.Main.main(Main.java:577)
Can anyone explain what exactly this control code does? I cannot find much information about it.
Thanks in advance.
You need to write a FilterInputStream to filter the data before the SAX parser gets it. It must either remove or recode the bad data.
Apache have a super-flexible example. You may wish to put together a much simpler one.
Here's one of mine which does other cleaning up but I am sure it will be a good start.
/* Cleans up often very bad xml.
*
* 1. Strips leading white space.
* 2. Recodes &pound; etc to &#...;.
* 3. Recodes lone & as &amp;.
*
*/
public class XMLInputStream extends FilterInputStream {
private static final int MIN_LENGTH = 2;
// Everything we've read.
StringBuilder red = new StringBuilder();
// Data I have pushed back.
StringBuilder pushBack = new StringBuilder();
// How much we've given them.
int given = 0;
// How much we've read.
int pulled = 0;
public XMLInputStream(InputStream in) {
super(in);
}
public int length() {
// NB: This is a Troll length (i.e. it goes 1, 2, many) so 2 actually means "at least 2"
try {
StringBuilder s = read(MIN_LENGTH);
pushBack.append(s);
return s.length();
} catch (IOException ex) {
log.warning("Oops ", ex);
}
return 0;
}
private StringBuilder read(int n) throws IOException {
// Input stream finished?
boolean eof = false;
// Read that many.
StringBuilder s = new StringBuilder(n);
while (s.length() < n && !eof) {
// Always get from the pushBack buffer.
if (pushBack.length() == 0) {
// Read something from the stream into pushBack.
eof = readIntoPushBack();
}
// Pushback only contains deliverable codes.
if (pushBack.length() > 0) {
// Grab one character
s.append(pushBack.charAt(0));
// Remove it from pushBack
pushBack.deleteCharAt(0);
}
}
return s;
}
// Returns true at eof.
// Might not actually push back anything but usually will.
private boolean readIntoPushBack() throws IOException {
// File finished?
boolean eof = false;
// Next char.
int ch = in.read();
if (ch >= 0) {
// Discard whitespace at start?
if (!(pulled == 0 && isWhiteSpace(ch))) {
// Good code.
pulled += 1;
// Parse out the &stuff;
if (ch == '&') {
// Process the &
readAmpersand();
} else {
// Not an '&', just append.
pushBack.append((char) ch);
}
}
} else {
// Hit end of file.
eof = true;
}
return eof;
}
// Deal with an ampersand in the stream.
private void readAmpersand() throws IOException {
// Read the whole word, up to and including the ;
StringBuilder reference = new StringBuilder();
int ch;
// Should end in a ';'
for (ch = in.read(); isAlphaNumeric(ch); ch = in.read()) {
reference.append((char) ch);
}
// Did we tidily finish?
if (ch == ';') {
// Yes! Translate it into a &#nnn; code.
String code = XML.hash(reference);
if (code != null) {
// Keep it.
pushBack.append(code);
} else {
throw new IOException("Invalid/Unknown reference '&" + reference + ";'");
}
} else {
// Did not terminate properly!
// Perhaps an & on its own or a malformed reference.
// Either way, escape the &
pushBack.append("&amp;").append(reference).append((char) ch);
}
}
private void given(CharSequence s, int wanted, int got) {
// Keep track of what we've given them.
red.append(s);
given += got;
log.finer("Given: [" + wanted + "," + got + "]-" + s);
}
@Override
public int read() throws IOException {
StringBuilder s = read(1);
given(s, 1, 1);
return s.length() > 0 ? s.charAt(0) : -1;
}
@Override
public int read(byte[] data, int offset, int length) throws IOException {
int n = 0;
StringBuilder s = read(length);
for (int i = 0; i < Math.min(length, s.length()); i++) {
data[offset + i] = (byte) s.charAt(i);
n += 1;
}
given(s, length, n);
return n > 0 ? n : -1;
}
@Override
public String toString() {
String s = red.toString();
String h = "";
// Hex dump the small ones.
if (s.length() < 8) {
Separator sep = new Separator(" ");
for (int i = 0; i < s.length(); i++) {
h += sep.sep() + Integer.toHexString(s.charAt(i));
}
}
return "[" + given + "]-\"" + s + "\"" + (h.length() > 0 ? " (" + h + ")" : "");
}
private boolean isWhiteSpace(int ch) {
switch (ch) {
case ' ':
case '\r':
case '\n':
case '\t':
return true;
}
return false;
}
private boolean isAlphaNumeric(int ch) {
return ('a' <= ch && ch <= 'z')
|| ('A' <= ch && ch <= 'Z')
|| ('0' <= ch && ch <= '9');
}
}
Quite why you've got that character will depend on what the data is meant to represent. (Apparently it's ACK, but that's odd to represent in a file...) However, the important point is that it makes the XML invalid - you simply can't represent that character in XML.
From the XML 1.0 spec, section 2.2:
Character Range
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
| [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Note how this excludes Unicode values below U+0020 other than U+0009 (tab), U+000A (line-feed) and U+000D (carriage return).
If you have any influence over the data coming back, you should change it to return valid XML. If not, you'll have to do some preprocessing on it before parsing it as XML. Quite what you'll want to do with unwanted control characters depends on what meaning they have in your situation.
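If simply dropping the offending characters is acceptable, the preprocessing can be a direct translation of the Char production quoted above. This is a minimal sketch under that assumption: XmlCharFilter, isValidXmlChar and stripInvalid are hypothetical names, and it works on text that has already been decoded into a String.

```java
final class XmlCharFilter {

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the text with every character that XML 1.0 does not allow removed. */
    static String stripInvalid(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i); // handles surrogate pairs as one code point
            if (isValidXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "<a><![CDATA[ok\u0006!]]></a>";
        System.out.println(stripInvalid(dirty)); // <a><![CDATA[ok!]]></a>
    }
}
```

Whether to drop, replace or log such characters is a decision that depends on what they mean in your data; the sketch only shows the dropping variant.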
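Try declaring your XML as version 1.1. Note that XML 1.1 accepts control characters such as U+0006 only when they are written as character references (for example &#6;), not as raw characters, so this helps only if you can also change how the data is produced: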
<?xml version="1.1"?>