Reading Image from j2me to c++ - java

I am learning C++ in order to create a little application which displays an image stream. The images come from a J2ME device and aren't stored as a file (so I just have the bytes).
I am thinking I need to send the size of the image as an int first, so the client knows how much to read from the stream for that particular image.
My problem is that when I read the size, it is always way too big - not the actual size of the image. (On the Java server I send this length with socket.write(int) and have also tried DataOutputStream.writeInt.) I will post some code as it's probably pretty simple.
Why is the size different from what I send?
ssize_t
readLine(int fd, char *buffer, size_t n)
{
    ssize_t numRead;            /* # of bytes fetched by last read() */
    ssize_t tt = 0;             /* Running byte total, for debugging */
    size_t totRead;             /* Total bytes read so far */
    char *buf;
    char ch;

    if (n <= 0 || buffer == NULL) {
        return -1;
    }

    buf = buffer;               /* No pointer arithmetic on "void *" */
    totRead = 0;
    for (;;) {
        numRead = read(fd, &ch, 1);

        if (numRead == -1) {
            return -1;                      /* Some other error */
        } else if (numRead == 0) {          /* EOF */
            if (totRead == 0)               /* No bytes read; return 0 */
                return 0;
            else                            /* Some bytes read; add '\0' */
                break;
        } else {                            /* 'numRead' must be 1 if we get here */
            tt += numRead;
            if (totRead < n - 1) {          /* Discard > (n - 1) bytes */
                totRead++;
                *buf++ = ch;
            }
            if (ch == '\n')
                break;
        }
    }

    *buf = '\0';                            /* Terminate before printing */
    printf("read line %s ", buffer);        /* Print from the start of the buffer */
    fflush(stdout);
    printf("read line int %zd ", tt);
    fflush(stdout);
    return totRead;
}

WBXML defines a platform-independent way to write int values: multi-byte integers.
A multi-byte integer consists of a series of octets, where the most significant bit is the continuation flag and the remaining seven bits are a scalar value. The continuation flag indicates that an octet is not the end of the multi-byte sequence. A single integer value is encoded into a sequence of N octets. The first N-1 octets have the continuation flag set to a value of one (1). The final octet in the series has a continuation flag value of zero (0).
The remaining seven bits in each octet are encoded in a big-endian order, i.e., most significant bit first. The octets are arranged in a big-endian order, i.e., the most significant seven bits are transmitted first. In the situation where the initial octet has fewer than seven bits of value, all unused bits must be set to zero (0).
For example, the integer value 0xA0 would be encoded with the two-byte sequence 0x81 0x20. The integer value 0x60 would be encoded with the one-byte sequence 0x60.
I did it for Java ME and Bada but it is pretty straightforward to implement in any language.
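For reference, here is a minimal sketch of that encoding in Java. The method names are mine, not from any particular library, and java.io is assumed to be imported:

static void writeMultiByte(OutputStream out, int value) throws IOException {
    byte[] tmp = new byte[5];                  // a 32-bit int needs at most 5 octets
    int n = 0;
    do {
        tmp[n++] = (byte) (value & 0x7F);      // take the low 7 bits
        value >>>= 7;
    } while (value != 0);
    while (n-- > 1) {
        out.write(tmp[n] | 0x80);              // continuation flag set on all but the last octet
    }
    out.write(tmp[0]);                         // final octet has the flag clear
}

static int readMultiByte(InputStream in) throws IOException {
    int value = 0;
    int b;
    do {
        b = in.read();
        if (b == -1) throw new EOFException();
        value = (value << 7) | (b & 0x7F);     // append the next 7 bits
    } while ((b & 0x80) != 0);                 // stop at the octet with the flag clear
    return value;
}

With these, writeMultiByte(out, 0xA0) emits the two bytes 0x81 0x20 from the example above, and readMultiByte reverses it.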

Your reading code handles text files: it works one char after the other, it checks for newlines, etc.
The image ("so I just have the bytes") seems to be binary data. When you interpret binary data as text, you get all sorts of random errors. The binary data may include, for example, a "\n" whenever a byte happens to have the value 10, and it may also include "\0", which will end the string before the real end.
When you store the size first, you send it as an int, which is represented by 4 bytes. When you read it as 4 separate characters, you get garbage.
You also need to beware of the byte order (endianness). Java uses "network order" (big-endian); on x86, C may read it just the other way around.
You are using the old C standard library. It may be easier to use the C++ iostreams.
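For illustration, a minimal sketch of the sending side in Java, assuming out is the socket's OutputStream and imageBytes holds the raw image data. DataOutputStream.writeInt always writes exactly 4 bytes in big-endian ("network") order, so the C++ side must read 4 raw bytes and reassemble them the same way, not parse them as text:

void sendImage(OutputStream out, byte[] imageBytes) throws IOException {
    DataOutputStream dos = new DataOutputStream(out);
    dos.writeInt(imageBytes.length); // 4-byte big-endian length prefix
    dos.write(imageBytes);           // raw binary payload, no newline framing
    dos.flush();
}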

Write bits in a file and retrieve them to a string of "0101.." in java?

I am working on a compression algorithm, and for that I need to write strings of bits to a binary file and retrieve back exactly the same string again.
Say I have a string "10100100100....." and I will write it to a file as bits (not as chars '0' and '1'), then read it back as bits and convert it to a string. And this is for a large amount of data (>100 megabytes).
Is there any neat and fast way of doing this?
So far I tried (and failed) writing the bits to bytes by sub-stringing into 8-bit chunks, converting those to ASCII characters, concatenating them into a string, and finally writing that to a .txt file:
{
    String Bits = "10001010100000000000"; // a lot larger in actual program
    String nCoded = "";
    char nextChar;
    int i = 0;
    for (i = 0; i < Bits.length() - 8; i += 8) {
        nextChar = (char) Integer.parseInt(Bits.substring(i, i + 8), 2);
        nCoded += nextChar;
    }
    // for the remaining bits, padding
    if (Bits.length() % 8 != 0) {
        nCoded += (char) Integer.parseInt(Bits.substring(i), 2);
    }
    nCoded += (char) (Bits.length() % 8); // to track the remainder of Bits that was padded
    writeToTextFile(nCoded, "file.txt"); // write the nCoded string to file
}
But this seems to corrupt information and is inefficient.
Again, for clarification: I don't want the String itself to be written; it's just a representation of the actual data. So I want to convert each 0 and 1 from the string representation to its binary form and write that to the file.
Here is a method you can use to convert the String to a series of bits, ready for output to file:
private byte[] toByteArray(String input) {
    // pad the input up to a multiple of 8 bits with trailing '0's
    char[] preBitChars = input.toCharArray();
    int bitShortage = (8 - (preBitChars.length % 8)) % 8;
    char[] bitChars = new char[preBitChars.length + bitShortage];
    System.arraycopy(preBitChars, 0, bitChars, 0, preBitChars.length);
    for (int i = 0; i < bitShortage; i++) {
        bitChars[preBitChars.length + i] = '0';
    }
    // pack the chars into bytes, most significant bit first
    byte[] byteArray = new byte[bitChars.length / 8];
    for (int i = 0; i < bitChars.length; i++) {
        if (bitChars[i] == '1') {
            byteArray[i / 8] |= 1 << (7 - (i % 8));
        }
    }
    return byteArray;
}
Passing the String "01010101" will return the result [85] as a byte[].
It turns out there is an easier way. There is a static Byte.parseByte(String, int radix) method that parses a string in the given radix and returns the byte value (without the radix argument the string would be parsed as decimal and overflow the byte range). Calling:
byte aByte = Byte.parseByte("01010101", 2);
System.out.println(aByte);
displays the same value: 85.
So you may ask a couple of questions here.
1) Why are we passing a String that is 8 characters in length? Well, you can prefix the String with a 9th character that represents the sign. I don't think you have this case, but if you needed to, the documentation for Byte.parseByte() states it should be:
An ASCII minus sign '-' ('\u002D') to indicate a negative value or an ASCII plus sign '+' ('\u002B') to indicate a positive value.
So from this information, you would need to break up your String manually into 8-bit Strings and call Byte.parseByte() to get a byte for each.
2) What about writing bits to a file? No, file writing is done in whole bytes. If you need to write the file, then read it back in and convert back to a String, you will need to reverse the process: read the file in as a byte[], then convert that to its String representation.
A Hint on how to convert a byte to a nice String format can be found here:
Convert byte (java data type) value to bits (a string containing only 8 bits)
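For example, a minimal sketch of that reverse step, assuming the same most-significant-bit-first layout as the toByteArray method above (the method name is illustrative):

private String toBitString(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 8);
    for (byte b : bytes) {
        for (int bit = 7; bit >= 0; bit--) {  // emit the most significant bit first
            sb.append((b >> bit) & 1);
        }
    }
    return sb.toString();
}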
You can get an InputStream from a String, read each byte and write it to a file (a byte is the smallest unit that you can read/write). Once everything is written, you can read the data back in a similar way (i.e. with an InputStream) and use it. Below is an example:
String hugeString = "10101010010101010110101001010101";
InputStream in = new ByteArrayInputStream(hugeString.getBytes());
OutputStream out = new FileOutputStream("Test.txt");
int b; // int, so the -1 EOF marker isn't confused with a data byte
while ((b = in.read()) != -1) {
    out.write(b);
}
out.close();
in.close();
in = new FileInputStream("Test.txt");
// Read data

How does this piece of code verify a checksum?

Context:
My teacher ported the Darwin-OP Framework from C++ to Java, allowing students like me to use it without having to master C++. Darwin has two controllers: the main controller runs Linux and executes the Java code, and has a serial connection with the sub controller (a microcontroller) that drives all the sensors/servos/transducers etc.
Darwin uses a motion.bin file in which it stores a list of 256 pages. Each page is 512 (8 * 64) bytes and consists of 7 steps (64 bytes each) plus a page header (also 64 bytes). Each step contains positions (values between 0 and 4095) for the servos to take. So for Darwin to move his arm, he goes through up to 7 steps until he finishes the final one.
Inside the page header there is a checksum of 1 byte. The Java code contains two methods in which the checksum is calculated and verified:
private static boolean VerifyChecksum(PAGE page) {
    byte checksum = (byte) 0x00;
    byte[] pagebytes = page.GetBytes();

    for (int i = 0; i < pagebytes.length; i++) {
        checksum += pagebytes[i];
    }

    if (checksum != (byte) 0xFF) {
        return false;
    }
    return true;
}

private static void SetChecksum(PAGE page) {
    byte checksum = (byte) 0x00;
    byte[] pagebytes = page.GetBytes();

    page.header.checksum = (byte) 0x00;

    for (int i = 0; i < pagebytes.length; i++) {
        checksum += pagebytes[i];
    }

    page.header.checksum = (byte) ((byte) 0xFF - checksum);
}
Main question: Can someone explain how the checksum is verified? I don't understand why it checks checksum != (byte)0xFF. Why not just compare the calculated checksum to page.header.checksum?
Bonus question: Why check the file integrity in the first place? Would it be that common for a page inside a .bin file to become corrupted?
To compute the checksum, you sum all the bytes in the page, then store 0xFF minus that value. Because the arithmetic is done in a byte, everything wraps around modulo 256.
The page passed in to the checksum method is the final page with 0x00 in the checksum position.
sum = 0xFF - SUM(page)
The line checksum += pagebytes[i]; accumulates exactly that wrapped sum.
Your professor's verification method sums the entire page; which is to say, the original argument to the checksum method plus an additional byte, which is the output of the checksum method.
So the expected result is then:
SUM(page, sum)
= SUM(page) + sum
= SUM(page) + 0xFF - SUM(page)
= 0xFF
Assuming the header is part of the page, you calculate the sum of all bytes x and store 255 - x as the checksum. When verifying it, you calculate y + 255 - x, which must be equal to 255. If it is, x and y are the same number.
Note that all calculations are performed modulo 256, therefore x and y are always in the range of 0 to 255.
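A small self-contained Java check of that identity, with the last array slot standing in for the header's checksum field (the sample bytes are arbitrary):

byte[] page = {0x12, 0x34, (byte) 0xAB, 0x00}; // checksum slot still 0x00
byte sum = 0;
for (byte b : page) sum += b;                  // wrapped (mod-256) sum, as in SetChecksum
page[page.length - 1] = (byte) (0xFF - sum);   // store 0xFF minus the sum

byte verify = 0;
for (byte b : page) verify += b;               // VerifyChecksum's loop, now including the checksum
System.out.println(verify == (byte) 0xFF);     // prints true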

Represent long in least amount of characters

I need to represent both very large and small numbers in the shortest string possible. The numbers are unsigned. I have tried just straight Base64 encoding, but for some smaller numbers, the encoded string is longer than just storing the number as a string. What would be the best way to store a very large or small number in the shortest possible URL-safe string?
I have tried just straight Base64 encode, but for some smaller numbers, the encoded string is longer than just storing the number as a string
Base64 encoding of binary byte data will make it longer, by about a third. It is not supposed to make it shorter, but to allow safe transport of binary data in formats that are not binary safe.
However, base 64 is more compact than decimal representation of a number (or of byte data), even if it is less compact than base 256 (the raw byte data). Encoding your numbers in base 64 directly will make them more compact than decimal. This will do it:
private static final String base64Chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

static String encodeNumber(long x) {
    char[] buf = new char[11];
    int p = buf.length;
    do {
        buf[--p] = base64Chars.charAt((int) (x % 64));
        x /= 64;
    } while (x != 0);
    return new String(buf, p, buf.length - p);
}

static long decodeNumber(String s) {
    long x = 0;
    for (char c : s.toCharArray()) {
        int charValue = base64Chars.indexOf(c);
        if (charValue == -1) throw new NumberFormatException(s);
        x *= 64;
        x += charValue;
    }
    return x;
}
Using this encoding scheme, Long.MAX_VALUE will be the string H__________, which is 11 characters long, compared to its decimal representation 9223372036854775807 which is 19 characters long. Numbers up to about 16 million will fit in a mere 4 characters. That's about as short as you'll get it. (Technically there are two other characters which do not need to be encoded in URLs: . and ~. You can incorporate those to get base 66, which would be a smidgen shorter for some numbers, although that seems a bit pedantic.)
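A quick round trip with the methods above, illustrating those sizes:

System.out.println(encodeNumber(Long.MAX_VALUE)); // H__________ (11 characters)
System.out.println(decodeNumber("H__________"));  // 9223372036854775807
System.out.println(encodeNumber(16777215));       // ____ (64^4 - 1 still fits in 4 characters)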
To extend on Stephen C's answer, here is a piece of code to convert to base 62. You can increase the base by adding more characters to the digits String; just pick whichever characters are valid for you:
public static String toString(long n) {
    String digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    int base = digits.length();
    if (n == 0) return "0";
    String s = "";
    while (n > 0) {
        int d = (int) (n % base); // charAt needs an int index
        s = digits.charAt(d) + s;
        n = n / base;
    }
    return s;
}
This will never result in the string representation being longer than the decimal one.
Assuming that you don't do any compression, and that you restrict yourself to URL safe characters, then the following procedure will give you the most compact encoding possible.
Make a list of all URL safe characters
Count them. Suppose you have N.
Represent your number in base N, representing 0 by the first character, 1 by the 2nd and so on.
So, what about compression ...
If you assume that the numbers you are representing are uniformly distributed across their range, then there is no real opportunity for compression.
Otherwise, there is potential for compression. If you can reduce the size of the common numbers then you can typically achieve a saving by compression. This is how Huffman encoding works.
But the downside is that compression at this level is not perfect across the range of numbers. It reduces the size of some numbers, but it inevitably increases the size of others.
So what does this mean for your use-case?
I think it means that you are looking at the problem the wrong way. You should not be aiming for a minimal encoded size for every number. You should be aiming to minimize the size on average ... averaged over the actual distribution of your numbers.

Bit-wise efficient uniform random number generation

I recall reading about a method for efficiently using random bits in an article on a math-oriented website, but I can't seem to get the right keywords in Google to find it anymore, and it's not in my browser history.
The gist of the problem that was being asked was to take a sequence of random numbers in the domain [domainStart, domainEnd) and efficiently use the bits of the random number sequence to project uniformly into the range [rangeStart, rangeEnd). Both the domain and the range are integers (more correctly, longs and not Z). What's an algorithm to do this?
Implementation-wise, I have a function with this signature:
long doRead(InputStream in, long rangeStart, long rangeEnd);
in is based on a CSPRNG (fed by a hardware RNG, conditioned through SecureRandom) that I am required to use; the return value must be between rangeStart and rangeEnd, but the obvious implementation of this is wasteful:
long doRead(InputStream in, long rangeStart, long rangeEnd) throws IOException {
    long retVal = 0;
    long range = rangeEnd - rangeStart;
    // Fill until we get to range
    for (int i = 0; (1L << (8 * i)) < range; i++) {
        int b; // renamed from 'in', which shadowed the stream parameter
        do {
            b = in.read();
            // but be sure we don't exceed range
        } while (retVal + ((long) b << (8 * i)) >= range);
        retVal += (long) b << (8 * i);
    }
    return retVal + rangeStart;
}
I believe this is effectively the same idea as (rand() * (max - min)) + min, only we're discarding bits that push us over max. Rather than using a modulo operator, which may incorrectly bias the results towards the lower values, we discard those bits and try again. Since hitting the CSPRNG may trigger re-seeding (which can block the InputStream), I'd like to avoid wasting random bits. Henry points out that this code biases against 0 and 256; Banthar demonstrates it in an example.
First edit: Henry reminded me that summation invokes the Central Limit Theorem. I've fixed the code above to get around that problem.
Second edit: Mechanical snail suggested that I look at the source for Random.nextInt(). After reading it for a while, I realized that this problem is similar to the base conversion problem. See answer below.
Your algorithm produces biased results. Let's assume rangeStart=0 and rangeEnd=257. If the first byte read is greater than 0, that will be the result. If it's 0, the result will be either 0 or 256 with 50/50 probability. So 0 and 256 are each only half as likely to be chosen as any other number.
I did a simple test to confirm this:
p(0)=0.001945
p(1)=0.003827
p(2)=0.003818
...
p(254)=0.003941
p(255)=0.003817
p(256)=0.001955
I think you need to do the same as java.util.Random.nextInt and discard the whole number, instead of just the last byte.
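For illustration, here is a sketch of that approach, adapted from the rejection loop in java.util.Random.nextInt(int) but drawing 4 bytes at a time from the stream (the method name and the DataInputStream wrapper are my own):

static int nextInt(DataInputStream in, int range) throws IOException {
    int bits, val;
    do {
        bits = in.readInt() >>> 1;          // 31 random bits, always non-negative
        val = bits % range;
    } while (bits - val + (range - 1) < 0); // overflow: 'bits' fell in the biased tail, retry
    return val;
}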
After reading the source to Random.nextInt(), I realized that this problem is similar to the base conversion problem.
Rather than converting a single symbol at a time, it would be more effective to convert blocks of input symbols at a time through an accumulator "buffer" which is large enough to represent at least one symbol in the domain and in the range. The new code looks like this:
public int[] fromStream(InputStream input, int length, int rangeLow, int rangeHigh) throws IOException {
    int[] outputBuffer = new int[length];

    // buffer is initially 0, so there is only 1 possible state it can be in
    int numStates = 1;
    long buffer = 0;
    int alphaLength = rangeHigh - rangeLow; // size of the output alphabet

    // Fill outputBuffer from 0 to length
    for (int i = 0; i < length; i++) {
        // Until buffer has sufficient data filled in from input to emit one
        // symbol in the output alphabet, fill buffer.
        fill:
        while (numStates < alphaLength) {
            // Shift buffer by 8 (*256) to mix in new data (of 8 bits)
            buffer = buffer << 8 | input.read();
            // Multiply by 256, as that's the number of states that we have possibly introduced
            numStates = numStates << 8;
        }

        // spit out the least significant symbol in alphaLength
        outputBuffer[i] = (int) (rangeLow + (buffer % alphaLength));

        // We have consumed the least significant portion of the input.
        buffer = buffer / alphaLength;

        // Track the number of states we've introduced into buffer
        numStates = numStates / alphaLength;
    }

    return outputBuffer;
}
There is a fundamental difference between converting numbers between bases and this problem, however: in order to convert between bases, one needs to have enough information about the number to perform the calculation - successive divisions by the target base produce remainders which are used to construct the digits in the target alphabet. In this problem, I don't really need to know all that information, as long as I'm not biasing the data, which means I can do what I did in the loop labeled "fill".
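A quick way to exercise fromStream is to feed it bytes from SecureRandom, standing in for the hardware-backed stream described in the question (this driver is illustrative only):

SecureRandom rng = new SecureRandom();
InputStream in = new InputStream() {
    @Override
    public int read() {
        return rng.nextInt(256); // one random octet per call, never EOF
    }
};
int[] draws = fromStream(in, 10, 0, 6); // ten uniform values in [0, 6)
System.out.println(java.util.Arrays.toString(draws));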

Java ByteBuffer - relative repositioning in a chain of put calls?

Here's what I want to do, except there are two problems with it: position() does an absolute positioning, not relative (and an argument of -1 is thus illegal), and you apparently can't chain another method call following a position() call - the compiler complains that it doesn't recognize putShort().
// Method to create a packet header for sending a packet. The placement of the two numbers is
// done according to little-endian encoding.
private byte[] createPacketHeader(EPacketType packetType, int fourBits,
        int totalMessageLength, int segmentSize) {
    return ByteBuffer.allocate(CPacketHeaderSize).order(ByteOrder.LITTLE_ENDIAN).
        put((byte) ((byte) (packetType.getValue() << 4) | (byte) fourBits)).
        putInt(totalMessageLength).    // Bottom 3 bytes of total length (+ 1 byte discarded)
        position(-1).                  // Reposition to discard last byte from above call !!DOESN'T WORK!!
        putShort((short) segmentSize). // Segment length
        put(_connectIdUtf8).           // Connection ID in UTF-8, should be <= 10 bytes
        array();                       // This assumes zero initialization so final bytes are zero
}
So here's what I'm currently doing. It does work, but seems rather inelegant compared to what I was hoping I could do.
ByteBuffer byteBuffer =
    ByteBuffer.allocate(CPacketHeaderSize).order(ByteOrder.LITTLE_ENDIAN);
byteBuffer.put((byte) ((byte) (packetType.getValue() << 4) | (byte) fourBits)).
    putInt(totalMessageLength).           // Bottom 3 bytes of total length (+ 1 byte discarded)
    position(byteBuffer.position() - 1);  // Discard last byte from above call
byteBuffer.putShort((short) segmentSize). // Segment length
    put(_connectIdUtf8);                  // Connection ID in UTF-8, should be <= 10 bytes
return byteBuffer.array();                // This assumes zero initialization so final bytes are zero
Any suggestions as to how I can get back to something closer to my first attempt?
EDIT:
Thanks for the answers, they were all helpful. If anyone is curious, here's what I ended up doing:
// Method to create a packet header for sending a packet. The placement of the two numbers is
// done according to little-endian encoding.
private byte[] createPacketHeader(EPacketType packetType, int fourBits,
        int totalMessageLength, int segmentSize) {
    return ByteBuffer.allocate(CPacketHeaderSize).order(ByteOrder.LITTLE_ENDIAN).
        put((byte) ((byte) (packetType.getValue() << 4) | (byte) fourBits)).
        put(intToThreeBytes(totalMessageLength)). // Bottom 3 bytes of total length
        putShort((short) segmentSize).            // Segment length
        put(_connectIdUtf8).                      // Connection ID in UTF-8, should be <= 10 bytes
        array();                                  // This assumes zero initialization so final bytes are zero
}

// Method to convert an int into a three-byte byte array, using little-endian encoding
private byte[] intToThreeBytes(int aNumber) {
    byte[] byteArray = new byte[3];
    for (int i = 0; i < 3; i++)
        byteArray[i] = (byte) (aNumber >> i * 8);
    return byteArray;
}
Not exactly elegant either, but you can build the int in a scratch buffer and copy only its bottom three bytes (the explicit little-endian order puts the low bytes first):
byte[] bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(totalMessageLength).array();
byteBuffer.put(bytes, 0, 3);
I don't think you can. ByteBuffer just doesn't have the functionality to move the write cursor backwards relative to its current position; relative operations only ever advance it.
I was thinking you could use mark, but as you are adding 4 bytes in one operation, you can't mark the third for an easy reset.
The position method is not defined in ByteBuffer but in its superclass, Buffer, and it returns Buffer. So you will have to explicitly cast back to ByteBuffer after calling position and before calling putShort. Also note that position takes an absolute index: after the one-byte put and the four-byte putInt the position is 5, so repositioning to 4 discards the last byte written by putInt. Change the code as below:
return ((ByteBuffer) ByteBuffer.allocate(CPacketHeaderSize).order(ByteOrder.LITTLE_ENDIAN).
    put((byte) ((byte) (packetType.getValue() << 4) | (byte) fourBits)).
    putInt(totalMessageLength).    // Bottom 3 bytes of total length (+ 1 byte discarded)
    position(4)).                  // Absolute reposition, discarding the int's last byte
    putShort((short) segmentSize). // Segment length
    put(_connectIdUtf8).           // Connection ID in UTF-8, should be <= 10 bytes
    array();
