Sending messages to a Java server in Objective-C via AsyncSocket

Hi, I am trying to send messages from an iOS device to a Java server that I cannot change. I am using AsyncSocket and was wondering how to send and receive strings with the length prepended. I am converting the string to NSData using a UTF encoding, but I was wondering whether the primitive sizes differ between the two languages, and whether there is a big-endian/little-endian variation. Basically I need to be able to convert the following Java methods:
inStream.readUTF();
inStream.readInt();
inStream.readChar();
inStream.readShort();
inStream.readFully(recvBuff, 0, recvLen);
outStream.writeInt();
outStream.writeUTF();
outStream.writeChars();
outStream.writeShort();
outStream.write(sendBytes, 0, sendBytes.length);
I know I am very close, but something is not quite right. This is what I have got so far. I am using an NSMutableData buffer to append the data, and then AsyncSocket's read and write methods:
[theSocket readDataToData:[AsyncSocket ZeroData] withTimeout:timeout buffer:buffer bufferOffset:offset tag:tag]; // inStream.readUTF();
[theSocket readDataToLength:sizeof(int32_t) withTimeout:timeout buffer:buffer bufferOffset:offset tag:tag]; // inStream.readInt();
[theSocket readDataToLength:sizeof(unichar) withTimeout:timeout buffer:buffer bufferOffset:offset tag:tag]; // inStream.readChar();
[theSocket readDataToLength:sizeof(int16_t) withTimeout:timeout tag:tag]; // inStream.readShort();
[theSocket readDataWithTimeout:timeout buffer:buffer bufferOffset:offset maxLength:maxLength tag:tag]; // inStream.readFully(recvBuff, 0, recvLen);
int32_t sendLength = (int32_t)[sendString length];
[outputBufferStream appendBytes:&sendLength length:sizeof(sendLength)]; // outStream.writeInt();
[outputBufferStream appendData:[sendString dataUsingEncoding:NSUTF8StringEncoding]]; // outStream.writeUTF();
char array[5];
[outputBufferStream appendBytes:array length:sizeof(array)]; // outStream.writeChars();
int16_t _short;
[outputBufferStream appendBytes:&_short length:sizeof(_short)]; // outStream.writeShort();
unsigned char *sendBytes;
[outputBufferStream appendBytes:sendBytes length:sendBytesLength]; // outStream.write(sendBytes, 0, sendBytes.length);
I usually append the length at the beginning like so:
int32_t sendStringLength = [sendString length];
[outputBufferStream appendBytes:&sendStringLength length:sizeof(sendStringLength)];
At the end of the write I am appending the following as a terminator:
[outputBufferStream appendData:[@"\n" dataUsingEncoding:NSUTF8StringEncoding]];
I would really appreciate any help with this. Thanks.
EDIT::
I have got most of it working, thanks to Robadob. Here is a little Java snippet (working) of the part I am currently stuck trying to reproduce in Objective-C:
private int sendData(String stringToSend) {
    if (theSocket == null) {
        lastError = "sendData() called before socket was set up.";
        return 1; // Error
    }
    try {
        System.out.println("Sending " + stringToSend.length() + " chars [" + stringToSend.length() * 2 + " bytes]");
        System.out.println("'" + stringToSend + "'");
        outStream.writeInt(stringToSend.length() * 2);
        outStream.writeChars(stringToSend);
        outStream.flush();
    } catch (IOException e) {
        lastError = "sendData() exception: " + e;
        System.out.println(lastError);
        return 2; // Error
    }
    return 0; // Ok
}
Here is a snippet of what I have got so far in Objective-C:
- (int32_t)sendData:(NSString *)stringToSend {
    if (theSocket == nil) {
        lastError = @"sendData called before socket was set up";
        return 1; // Error
    }
    @try {
        NSLog(@"Sending %lu chars [%lu bytes]", (unsigned long)[stringToSend length], (unsigned long)([stringToSend length] * 2));
        NSLog(@"'%@'", stringToSend);
        uint32_t stringToSendInt = (uint32_t)([stringToSend length] * 2);
        uint32_t stringToSendIntBigE = CFSwapInt32HostToBig(stringToSendInt);
        [outputBufferStream appendBytes:&stringToSendIntBigE length:sizeof(stringToSendIntBigE)];
        stringToSend = [stringToSend stringByAppendingString:@"\n"];
        for (int i = 0; i < [stringToSend length]; i++) {
            unichar characterTmp = [stringToSend characterAtIndex:i];
            unichar character = characterTmp << 8; // acts as a host-to-big-endian swap only for characters <= 0xFF
            [outputBufferStream appendBytes:&character length:sizeof(character)];
        }
        [self syncWriteData:outputBufferStream withTimeout:socketTimeout tag:kSendDataSocketTag];
        outputBufferStream = [NSMutableData data];
    }
    @catch (NSException *e) {
        lastError = [NSString stringWithFormat:@"sendData exception: %@", [e reason]];
        NSLog(@"%@", lastError);
        return 2; // Error
    }
    return 0; // Ok
}

If you read the docs for writeUTF, it says it writes the first 2 bytes using writeShort, whose documentation says:
Writes a short to the underlying output stream as two bytes, high byte
first. If no exception is thrown, the counter written is incremented
by 2.
A byte is 8 bits, so the value being written is 16 bits. You are using int32_t, which is 32 bits; you should be writing an int16_t instead (I don't know Objective-C).
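For reference, a minimal self-contained Java sketch that dumps what writeUTF actually emits; it shows the 2-byte big-endian length prefix (exactly what writeShort produces) followed by the (modified) UTF-8 bytes:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteUtfWireFormat {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF("hi"); // emits 00 02 68 69
        out.flush();
        byte[] wire = bytes.toByteArray();
        // The first two bytes are an unsigned 16-bit big-endian length.
        int length = ((wire[0] & 0xFF) << 8) | (wire[1] & 0xFF);
        System.out.println("length prefix = " + length); // prints 2
    }
}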

Related

How to read data from RFID tag?

I want to read data from an NFC-V tag. I tried this method but didn't get the data, and I couldn't find any clue on the internet on how to read it. Another Play Store application tells me that there are 128 blocks, each block is 4 bytes, so in total there are 512 bytes:
try {
int offset = 0; // offset of first block to read
int blocks = 128; // number of blocks to read
byte[] cmd = new byte[]{
(byte)0x60, // flags: addressed (= UID field present)
(byte)0x23, // command: READ MULTIPLE BLOCKS
(byte)0x00, (byte)0x00, (byte)0x00, (byte)0x00, (byte)0x00, (byte)0x00, (byte)0x00, (byte)0x00, // placeholder for tag UID
(byte)(offset & 0x0ff), // first block number
(byte)((blocks - 1) & 0x0ff) // number of blocks (-1 as 0x00 means one block)
};
System.arraycopy(id, 0, cmd, 2, 8);
byte[] userdata = nfcvTag.transceive(cmd);
userdata = Arrays.copyOfRange(userdata, 0, 32);
tagData.setText("DATA:" + bytesToHex(userdata));
This is the raw string received from the NFC-V tag:
303330363036422031343530323030383034ffff
ffffffffffffffffffffffff3333303030204120
2046542031353033203030303030393433ffffff
ffffffff32322f30312f323031352d2d31343136
3037ffffffffffffffffffffffffffff752a307c
20dd0aeaffffffffffffffff089cffffffffffff
ffffffffffffffff0000093dffffffffffffffff
ffffffff27130fb60af1ffffffffffffffffffff
8000ffffffffffffffffffffffffffff00fd7d74
ffffffffffffffffffffffff2dcf6030ab0ee1ad
2db36004aadbe17c089f121b20362a7e089d1217
202f2a75ffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffff30303032
3030303600ac9b5300000aca00ac9bb700ac9bc4
00000000fffffffc02dd02de02de02de02dd02dd
02dd02db0000861300000a9c00ac9bff00acb829
00acb82a00acb8330000020dffffffeb03a0039e
039c039d039a039a0397039600008ad300000a51
00002a0800acb83d000000000000000000000000
00009ed500000000000000000000000000007ef9
ffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffff
ffffffffffffffff0000391effffffffffffffff
ffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffff
ffffffff000136ce2e656e64
This is the solution I came up with: reading each block successively and appending each one to a string, so that at the end I have the complete hexadecimal value and also the UTF-8 string:
if(NfcAdapter.ACTION_TAG_DISCOVERED.equals(action)
|| NfcAdapter.ACTION_TECH_DISCOVERED.equals(action))
{
currentTag = intent.getParcelableExtra(NfcAdapter.EXTRA_TAG);
byte[] id = currentTag.getId();
StringBuffer buf = new StringBuffer();
tagId.setText(bytesToHex(id));
for (String tech : currentTag.getTechList()) {
if (tech.equals(NfcV.class.getName())) {
NfcV nfcvTag = NfcV.get(currentTag);
int numberOfBlocks = 0;
fullData = new StringBuffer();
utf8String = new StringBuffer();
blocksData = new ArrayList<String>();
while(numberOfBlocks < 128)
{
try {
nfcvTag.connect();
// connectTag.setText("Hello NFC!");
} catch (IOException e) {
Toast.makeText(getApplicationContext(), "Could not open a connection!", Toast.LENGTH_SHORT).show();
return;
}
try {
//
byte[] tagUid = currentTag.getId(); // store tag UID for use in addressed commands
//
byte[] cmd = new byte[] {
(byte)0x20, // FLAGS
(byte)0x20, // READ_SINGLE_BLOCK
0, 0, 0, 0, 0, 0, 0, 0,
(byte)(numberOfBlocks & 0x0ff)
};
System.arraycopy(tagUid, 0, cmd, 2, 8); // paste tag UID into command
byte[] response = nfcvTag.transceive(cmd);
String data = bytesToHex(response).substring(2);
String utf8 = new String(response , "UTF-8");
blocksData.add(data.replaceAll(" " , ""));
fullData.append(data.replaceAll(" " , ""));
utf8String.append(utf8);
nfcvTag.close();
numberOfBlocks = numberOfBlocks + 1;
} catch (IOException e) {
Toast.makeText(getApplicationContext(), "An error occurred while reading! :" + e.toString() , Toast.LENGTH_SHORT).show();
return;
}
}
The Android NFC stack supports NfcV by default - therefore use class NfcV, which abstracts all of that - instead of dealing with byte[], which you probably don't understand (else you wouldn't ask).
bytesToHex() may be useful for logging, but to decode a byte[] to a String, use:
new String(bytes, StandardCharsets.UTF_8)
The NfcV Android class does not have any high-level access methods; it only has the transceive(byte[]) method, so you are using the right method and do have to deal with byte arrays.
Note: adding the make/model of the tag, or a link to its datasheet, would help in understanding how to read the tag correctly.
But you have not taken into account the MaxTransceiveLength; this might be smaller than the amount of data you are trying to read in one go. The datasheet would also tell you the MaxTransceiveLength.
I don't know the max value for this tag/NfcV, but for the cards I used the MaxTransceiveLength is 253 bytes, so I guess you might be trying to read too many blocks in one go and the card is returning the maximum it can.
Therefore I use code like the below for my NfcA tags with similar commands (FAST READ). I cannot give an NfcV example as I don't have the datasheet to know the exact command format, but for showing how to take the MaxTransceiveLength into account that is not relevant.
Update: added logging in a more understandable format.
// Work out how big a fast read we can do (2 bytes for CRC)
int maxTransceiveBytes = mNfcA.getMaxTransceiveLength() - 2;
// Work out how many pages can be fast read
int maxTransceivePages = maxTransceiveBytes / 4;
// Work out how many pages I want to read
int numOfPages = endPage - startPage + 1;
int readPages;
while (numOfPages > 0){
// Work out the number of pages to read this time around
if (numOfPages > maxTransceivePages) {
readPages = maxTransceivePages;
// adjust numOfPages left
numOfPages -= maxTransceivePages;
} else {
// Last read
readPages = numOfPages;
numOfPages = 0;
}
// We can read the right number of pages
byte[] result = mNfcA.transceive(new byte[] {
(byte)0x3A, // FAST_READ
(byte)(startPage & 0x0ff),
(byte)((startPage + readPages - 1) & 0x0ff),
});
// Adjust startpage for the number of pages read for next time around
startPage += readPages;
// Do some result checking
// Log the data in more understandable format (one block/page per line)
for(int i = 0; i < (result.length / 4); i++){
String pageData = new String(result, (i * 4), 4,
StandardCharsets.UTF_8 );
Log.v("NFC", i + "=" + pageData);
}
}
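For the NFC-V case itself, a purely hypothetical adaptation of the same chunking idea, using the READ MULTIPLE BLOCKS command layout from the question, might look like the sketch below; the 4-byte block size, the 0x60 flags, and the one-status-byte response overhead are assumptions taken from the question, not from a datasheet:

// Hypothetical NfcV sketch: command layout copied from the question;
// block size (4) and response overhead (1 status byte) are assumptions.
int blockSize = 4;
int maxBytes = nfcvTag.getMaxTransceiveLength() - 1; // leave room for the status byte
int maxBlocksPerRead = Math.max(1, maxBytes / blockSize);
int firstBlock = 0;
int blocksLeft = 128;
while (blocksLeft > 0) {
    int readBlocks = Math.min(blocksLeft, maxBlocksPerRead);
    byte[] cmd = new byte[12];
    cmd[0] = (byte) 0x60;                       // flags: addressed (UID field present)
    cmd[1] = (byte) 0x23;                       // READ MULTIPLE BLOCKS
    System.arraycopy(tagUid, 0, cmd, 2, 8);     // tag UID
    cmd[10] = (byte) (firstBlock & 0xFF);       // first block number
    cmd[11] = (byte) ((readBlocks - 1) & 0xFF); // number of blocks - 1
    byte[] response = nfcvTag.transceive(cmd);
    // response[0] is the status flag; block data starts at response[1]
    fullData.append(bytesToHex(response).substring(2));
    firstBlock += readBlocks;
    blocksLeft -= readBlocks;
}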

How can I check the Payload size when using Azure Eventhubs and avoid a PayloadSizeExceededException?

So I am getting this exception:
com.microsoft.azure.servicebus.PayloadSizeExceededException: Size of the payload exceeded Maximum message size: 256 kb
I believe the exception is self-explanatory; however, I am not sure what to do about it.
private int MAXBYTES = (int) ((1024 * 256) * .8);
for (EHubMessage message : payloads) {
byte[] payloadBytes = message.getPayload().getBytes(StandardCharsets.UTF_8);
EventData sendEvent = new EventData(payloadBytes);
events.add(sendEvent);
byteCount += payloadBytes.length;
if (byteCount > this.MAXBYTES) {
calls.add(ehc.sendASync(events));
logs.append("[Size:").append(events.size()).append(" - ").append(byteCount / 1024).append("kb] ");
events = new LinkedList<EventData>();
byteCount = 0;
pushes++;
}
}
I am counting the bytes as shown. I have thought about the UTF-8 issue, but I believe it should not matter: a UTF-8 character can be more than one byte, but getBytes() counts that correctly.
I could not find a reliable way to get the byte size of a String, and I am not even sure how Azure counts the bytes; "payload" is a broad term and could include boilerplate and such.
Any ideas? It would be great if there were a
EventHubClient.checkPayload(list);
method, but there doesn't seem to be one. How do you guys check the payload size?
Per my experience, I think you need to check the size of the current payload count plus the new payload before you add the new payload into events, as below.
private final int MAXBYTES = 1024 * 256; // not necessary to multiply by .8
for (EHubMessage message : payloads) {
byte[] payloadBytes = message.getPayload().getBytes(StandardCharsets.UTF_8);
if (byteCount + payloadBytes.length > this.MAXBYTES) {
calls.add(ehc.sendASync(events));
logs.append("[Size:").append(events.size()).append(" - ").append(byteCount / 1024).append("kb] ");
events = new LinkedList<EventData>();
byteCount = 0;
pushes++;
}
EventData sendEvent = new EventData(payloadBytes);
events.add(sendEvent);
}
If you add the new event data first and only then count the payload size, it is too late: the batch of events to be sent might already exceed the payload limit.
Well, I should have included more of the actual code in the original post. Here is what I came up with:
private int MAXBYTES = (int) ((1024 * 256) * .9);
for (EHubMessage message : payloads) {
byte[] payloadBytes = message.getPayload().getBytes(StandardCharsets.UTF_8);
int propsSize = message.getProps() == null ? 0 : message.getProps().toString().getBytes().length;
int messageSize = payloadBytes.length + propsSize;
if (byteCount + messageSize > this.MAXBYTES) {
calls.add(ehc.sendASync(events));
logs.append("[Size:").append(events.size()).append(" - ").append(byteCount / 1024).append("kb] ");
events = new LinkedList<EventData>();
byteCount = 0;
pushes++;
}
byteCount += messageSize;
EventData sendEvent = new EventData(payloadBytes);
sendEvent.getProperties().putAll(message.getProps());
events.add(sendEvent);
}
if (!events.isEmpty()) {
calls.add(ehc.sendASync(events));
logs.append("[Size:").append(events.size()).append(" - ").append(byteCount / 1024).append("kb]");
pushes++;
}
// lets wait til they are done.
CompletableFuture.allOf(calls.toArray(new CompletableFuture[0])).join();
}
If you notice, I was adding Properties to the EventData but not counting their bytes. The toString() of a Map returns something like:
{markiscool=markiscool}
Again, I am not sure of the boilerplate characters that the Azure API adds, but I am sure it is not much. Notice I still back MAXBYTES off a bit, just in case.
It would still be good to get a "payload size checker" method in the API, but I would imagine it would have to build the payload first in order to give it back to you. I experimented with having my EHubMessage object figure this out for me, but getBytes() on a String actually does some conversion that I don't want to do twice.
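As a rough sketch of that idea (assuming getProps() returns a Map<String, String>; the exact per-event overhead Azure adds is not documented here, so keep a safety margin on top of this), the per-message estimate could live in a helper like this:

import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper: estimates the bytes one message contributes to a batch.
static int estimateEventBytes(EHubMessage message) {
    int size = message.getPayload().getBytes(StandardCharsets.UTF_8).length;
    Map<String, String> props = message.getProps();
    if (props != null) {
        for (Map.Entry<String, String> e : props.entrySet()) {
            size += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            size += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
    }
    return size;
}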

Extract bytes at a specified location from a byte array

Hello, I am new to byte manipulation in Java. I already have a byte array in the following format:
1 -> data packet length (length of name) (first byte)
2 -> name (second byte + data packet length)
3 -> data packet length (length of datetime)
4 -> current date and time
How can I extract the name and the current date and time? Should I use the Arrays.copyOfRange() method?
Regards,
mcd
You can use a ByteBuffer wrapped around your existing byte array, then use the methods that come with it to get the next float, int, etc. (such as buffer.getInt() and buffer.getFloat()).
You wrap your byte array into a ByteBuffer with the wrap() method, I believe. The possibilities are endless :). To get strings, as you asked, you simply need to do something like:
byte[] name = new byte[nameLength];
buffer.get(name);
nameString = byteRangeToString(name);
where byteRangeToString is a method to return a new string representation of the byte[] data you pass it.
public String byteRangeToString(byte[] data)
{
    try
    {
        return new String(data, "UTF-8");
    }
    catch (UnsupportedEncodingException e)
    {
        /* handle accordingly */
        return null;
    }
}
See: http://developer.android.com/reference/java/nio/ByteBuffer.html
Using copyOfRange() may run you into memory issues if used excessively.
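Putting those pieces together, a minimal sketch (assuming each length is a single unsigned byte, as the question's format suggests) would be:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

ByteBuffer buffer = ByteBuffer.wrap(bytesArray);
int nameLength = buffer.get() & 0xFF;  // 1 -> length of name
byte[] name = new byte[nameLength];
buffer.get(name);                      // 2 -> name
int dateLength = buffer.get() & 0xFF;  // 3 -> length of datetime
byte[] date = new byte[dateLength];
buffer.get(date);                      // 4 -> current date and time
String nameString = new String(name, StandardCharsets.UTF_8);
String dateString = new String(date, StandardCharsets.UTF_8);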
What about something like:
int nameLength = 0;
int dateLength = 0;
byte[] nameByteArray = null;
byte[] dateByteArray = null;
for (int i = 0; i < bytesArray.length; i++) {
    if (i == 0) {
        nameLength = bytesArray[i] & 0xFF;
        nameByteArray = new byte[nameLength];
    }
    else if (i == nameLength + 1) {
        dateLength = bytesArray[i] & 0xFF;
        dateByteArray = new byte[dateLength];
    }
    else if (i < nameLength + 1) {
        nameByteArray[i - 1] = bytesArray[i];
    }
    else {
        dateByteArray[i - (nameLength + 2)] = bytesArray[i];
    }
}
You want to use a DataInputStream.
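Expanded into a runnable sketch (again assuming single-byte lengths and UTF-8 text, which the question implies but does not state):

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

static String[] parsePacket(byte[] packet) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(packet));
    int nameLength = in.readUnsignedByte();  // 1 -> length of name
    byte[] name = new byte[nameLength];
    in.readFully(name);                      // 2 -> name
    int dateLength = in.readUnsignedByte();  // 3 -> length of datetime
    byte[] date = new byte[dateLength];
    in.readFully(date);                      // 4 -> current date and time
    return new String[] {
        new String(name, StandardCharsets.UTF_8),
        new String(date, StandardCharsets.UTF_8)
    };
}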

What is the best way to convert this Java code into Objective-C code?

public byte[] toBytes() {
size = 12;
ByteBuffer buf = ByteBuffer.allocate(size);
buf.putInt(type.ordinal()); // type is an enum
buf.putInt(id);
buf.putInt(size);
return buf.array();
}
@Override
public void fromBytes(byte[] data) {
ByteBuffer buf = ByteBuffer.allocate(data.length);
buf.put(data);
buf.rewind();
type = MessageType.values()[buf.getInt()];
id = buf.getInt();
size = buf.getInt();
}
I have these two Java methods and want to write the equivalent Objective-C methods.
For the first method, I wrote the Objective-C code like this:
- (NSMutableData *) toBytes{
size = 12;
NSMutableData *buf = [[NSMutableData alloc] initWithCapacity:size];
NSData *dataType = [NSData dataWithBytes: &type length: sizeof(type)];
NSData *dataId = [NSData dataWithBytes: &msgId length: sizeof(msgId)];
NSData *dataSize = [NSData dataWithBytes: &size length: sizeof(size)];
[buf appendData:dataType];
[buf appendData:dataId];
[buf appendData:dataSize];
// dataType, dataId and dataSize come from autoreleased convenience constructors,
// so they must not be released here
return [buf autorelease];
}
But I am not sure how to read it back...
It would have been easier if I had added only one piece of data into the buffer, but I added three, so I don't know how to read those back.
Thanks in advance...
Note to LCYSoft: I'm making this a community wiki, please correct any issues. I didn't compile this. Since you posted one direction and really want an answer, I provided one. Sorry, I am kinda busy atm.
This demonstrates both directions, and expands on the OP:
typedef enum t_mon_enum_type {
MONEnum_Edno = 1,
MONEnum_Dve = 2,
MONEnum_Tre = 3
} t_mon_enum_type;
@interface MONObject : NSObject
{
t_mon_enum_type type;
int msgId;
int size;
}
@end
@implementation MONObject
/* ... */
- (NSMutableData *)dataRepresentation
{
const int typeAsInt = (int)type;
const size_t capacity = sizeof(typeAsInt) + sizeof(msgId) + sizeof(size);
NSMutableData * data = [[NSMutableData alloc] initWithCapacity:capacity];
[data appendBytes:&typeAsInt length:sizeof(typeAsInt)];
[data appendBytes:&msgId length:sizeof(msgId)];
[data appendBytes:&size length:sizeof(size)];
return [data autorelease];
}
- (BOOL)isDataRepresentationValid:(NSData *)data { /* @todo: validate */ return YES; }
- (BOOL)restoreFromDataRepresentation:(NSData *)data
{
if (![self isDataRepresentationValid:data]) {
return NO;
}
NSRange range = { 0, 0 };
int tmp = 0;
/* restore `type` */
range.length = sizeof(tmp);
[data getBytes:&tmp range:range];
type = (t_mon_enum_type)tmp;
/* advance read position */
range.location += range.length;
/* restore `msgId` */
range.length = sizeof(msgId);
[data getBytes:&msgId range:range];
/* advance read position */
range.location += range.length;
/*
setting the length here is redundant in this case, but it's how we
write it when dealing with more complex pod types.
*/
range.length = sizeof(size);
[data getBytes:&size range:range];
return YES;
}
I'm not going to rewrite the program for you, but I'll provide a tip:
You can use C++ in ObjC programs. Specifically, you can compile as C (.c), ObjC (.m), C++ (.cpp), and ObjC++ (.mm); one common extension follows each language, and the compiler will (by default) compile using the language implied by the file extension.
Many Java programs closely resemble C++ programs. If you're porting a program, also consider writing it in C++, since the result will often be closer to the Java original.
For ObjC, you'd probably use CF/NS-MutableData.
For C++, you can use std::vector.
Good luck.

Are there C++ equivalents for the Protocol Buffers delimited I/O functions in Java?

I'm trying to read / write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing length prefixes before the messages, but there's no way to do that by default (that I could see).
However, the Java API in version 2.1.0 received a set of "Delimited" I/O functions which apparently do that job:
parseDelimitedFrom
mergeDelimitedFrom
writeDelimitedTo
Are there C++ equivalents? And if not, what's the wire format for the size prefixes the Java API attaches, so I can parse those messages in C++?
Update:
These now exist in google/protobuf/util/delimited_message_util.h as of v3.3.0.
I'm a bit late to the party here, but the below implementations include some optimizations missing from the other answers and will not fail after 64MB of input (though it still enforces the 64MB limit on each individual message, just not on the whole stream).
(I am the author of the C++ and Java protobuf libraries, but I no longer work for Google. Sorry that this code never made it into the official lib. This is what it would look like if it had.)
bool writeDelimitedTo(
const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL) {
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
} else {
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError()) return false;
}
return true;
}
bool readDelimitedFrom(
google::protobuf::io::ZeroCopyInputStream* rawInput,
google::protobuf::MessageLite* message) {
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
Okay, so I haven't been able to find top-level C++ functions implementing what I need, but some spelunking through the Java API reference turned up the following, inside the MessageLite interface:
void writeDelimitedTo(OutputStream output)
/* Like writeTo(OutputStream), but writes the size of
the message as a varint before writing the data. */
So the Java size prefix is a (Protocol Buffers) varint!
Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:
bool CodedInputStream::ReadVarint32(uint32 * value)
void CodedOutputStream::WriteVarint32(uint32 value)
Using those, I should be able to roll my own C++ functions that do the job.
They should really add this to the main Message API though; it's missing functionality considering Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).
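For the Java side of such interop, a minimal round trip with those delimited helpers looks like this (MyMessage, message1, and message2 stand in for a generated message type and its instances):

import java.io.FileInputStream;
import java.io.FileOutputStream;

// Write: each call prefixes the payload with its byte size as a varint.
try (FileOutputStream out = new FileOutputStream("messages.bin")) {
    message1.writeDelimitedTo(out);
    message2.writeDelimitedTo(out);
}

// Read: parseDelimitedFrom returns null on a clean end-of-stream.
try (FileInputStream in = new FileInputStream("messages.bin")) {
    MyMessage msg;
    while ((msg = MyMessage.parseDelimitedFrom(in)) != null) {
        System.out.println(msg);
    }
}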
I solved the same problem using CodedOutputStream/ArrayOutputStream to write the message (with the size) and CodedInputStream/ArrayInputStream to read the message (with the size).
For example, the following pseudo-code writes the message size followed by the message:
const unsigned bufLength = 256;
unsigned char buffer[bufLength];
Message protoMessage;
google::protobuf::io::ArrayOutputStream arrayOutput(buffer, bufLength);
google::protobuf::io::CodedOutputStream codedOutput(&arrayOutput);
codedOutput.WriteLittleEndian32(protoMessage.ByteSize());
protoMessage.SerializeToCodedStream(&codedOutput);
When writing you should also check that your buffer is large enough to fit the message (including the size). And when reading, you should check that your buffer contains a whole message (including the size).
It definitely would be handy if they added convenience methods to C++ API similar to those provided by the Java API.
IstreamInputStream is very fragile to EOFs and other errors that easily occur when used together with std::istream. After such an error the protobuf streams are permanently damaged and any already-buffered data is destroyed. There is proper support for reading from traditional streams in protobuf:
Implement google::protobuf::io::CopyingInputStream and use it together with CopyingInputStreamAdaptor. Do the same for the output variants.
In practice a parsing call ends up in google::protobuf::io::CopyingInputStream::Read(void* buffer, int size), where a buffer is given. The only thing left to do is read into it somehow.
Here's an example for use with Asio synchronized streams (SyncReadStream/SyncWriteStream):
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
using namespace google::protobuf::io;
template <typename SyncReadStream>
class AsioInputStream : public CopyingInputStream {
public:
AsioInputStream(SyncReadStream& sock);
int Read(void* buffer, int size);
private:
SyncReadStream& m_Socket;
};
template <typename SyncReadStream>
AsioInputStream<SyncReadStream>::AsioInputStream(SyncReadStream& sock) :
m_Socket(sock) {}
template <typename SyncReadStream>
int
AsioInputStream<SyncReadStream>::Read(void* buffer, int size)
{
std::size_t bytes_read;
boost::system::error_code ec;
bytes_read = m_Socket.read_some(boost::asio::buffer(buffer, size), ec);
if(!ec) {
return bytes_read;
} else if (ec == boost::asio::error::eof) {
return 0;
} else {
return -1;
}
}
template <typename SyncWriteStream>
class AsioOutputStream : public CopyingOutputStream {
public:
AsioOutputStream(SyncWriteStream& sock);
bool Write(const void* buffer, int size);
private:
SyncWriteStream& m_Socket;
};
template <typename SyncWriteStream>
AsioOutputStream<SyncWriteStream>::AsioOutputStream(SyncWriteStream& sock) :
m_Socket(sock) {}
template <typename SyncWriteStream>
bool
AsioOutputStream<SyncWriteStream>::Write(const void* buffer, int size)
{
boost::system::error_code ec;
m_Socket.write_some(boost::asio::buffer(buffer, size), ec);
return !ec;
}
Usage:
AsioInputStream<boost::asio::ip::tcp::socket> ais(m_Socket); // Where m_Socket is an instance of boost::asio::ip::tcp::socket
CopyingInputStreamAdaptor cis_adp(&ais);
CodedInputStream cis(&cis_adp);
Message protoMessage;
uint32_t msg_size;
/* Read message size */
if(!cis.ReadVarint32(&msg_size)) {
// Handle error
}
/* Make sure not to read beyond limit of message */
CodedInputStream::Limit msg_limit = cis.PushLimit(msg_size);
if(!protoMessage.ParseFromCodedStream(&cis)) {
// Handle error
}
/* Remove limit */
cis.PopLimit(msg_limit);
Here you go:
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/coded_stream.h>
using namespace google::protobuf::io;
class FASWriter
{
std::ofstream mFs;
OstreamOutputStream *_OstreamOutputStream;
CodedOutputStream *_CodedOutputStream;
public:
FASWriter(const std::string &file) : mFs(file,std::ios::out | std::ios::binary)
{
assert(mFs.good());
_OstreamOutputStream = new OstreamOutputStream(&mFs);
_CodedOutputStream = new CodedOutputStream(_OstreamOutputStream);
}
inline void operator()(const ::google::protobuf::Message &msg)
{
_CodedOutputStream->WriteVarint32(msg.ByteSize());
if ( !msg.SerializeToCodedStream(_CodedOutputStream) )
std::cout << "SerializeToCodedStream error " << std::endl;
}
~FASWriter()
{
delete _CodedOutputStream;
delete _OstreamOutputStream;
mFs.close();
}
};
class FASReader
{
std::ifstream mFs;
IstreamInputStream *_IstreamInputStream;
CodedInputStream *_CodedInputStream;
public:
FASReader(const std::string &file) : mFs(file,std::ios::in | std::ios::binary)
{
assert(mFs.good());
_IstreamInputStream = new IstreamInputStream(&mFs);
_CodedInputStream = new CodedInputStream(_IstreamInputStream);
}
template<class T>
bool ReadNext()
{
T msg;
google::protobuf::uint32 size;
bool ret;
if ( ret = _CodedInputStream->ReadVarint32(&size) )
{
CodedInputStream::Limit msgLimit = _CodedInputStream->PushLimit(size);
if ( ret = msg.ParseFromCodedStream(_CodedInputStream) )
{
_CodedInputStream->PopLimit(msgLimit);
std::cout << "FASReader ReadNext: " << msg.DebugString() << std::endl;
}
}
return ret;
}
~FASReader()
{
delete _CodedInputStream;
delete _IstreamInputStream;
mFs.close();
}
};
I ran into the same issue in both C++ and Python.
For the C++ version, I used a mix of the code Kenton Varda posted on this thread and the code from the pull request he sent to the protobuf team (because the version posted here doesn't handle EOF while the one he sent to github does).
#include <google/protobuf/message_lite.h>
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/io/coded_stream.h>
bool writeDelimitedTo(const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput)
{
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL)
{
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
}
else
{
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError())
return false;
}
return true;
}
bool readDelimitedFrom(google::protobuf::io::ZeroCopyInputStream* rawInput, google::protobuf::MessageLite* message, bool* clean_eof)
{
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
const int start = input.CurrentPosition();
if (clean_eof)
*clean_eof = false;
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size))
{
if (clean_eof)
*clean_eof = input.CurrentPosition() == start;
return false;
}
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit = input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
And here is my Python 2 implementation:
from google.protobuf.internal import encoder
from google.protobuf.internal import decoder
# I had to implement this because the tools in google.protobuf.internal.decoder
# read from a buffer, not from a file-like object
def readRawVarint32(stream):
    mask = 0x80  # (1 << 7)
    raw_varint32 = []
    while 1:
        b = stream.read(1)
        # eof
        if b == "":
            break
        raw_varint32.append(b)
        if not (ord(b) & mask):
            # we found a byte starting with a 0, which means it's the last byte of this varint
            break
    return raw_varint32

def writeDelimitedTo(message, stream):
    message_str = message.SerializeToString()
    delimiter = encoder._VarintBytes(len(message_str))
    stream.write(delimiter + message_str)

def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message

# In-place version that takes an already built protobuf object.
# In my tests, this is around 20% faster than the other version
# of readDelimitedFrom().
def readDelimitedFrom_inplace(message, stream):
    raw_varint32 = readRawVarint32(stream)
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message.ParseFromString(data)
        return message
    else:
        return None
It might not be the best looking code and I'm sure it can be refactored a fair bit, but at least that should show you one way to do it.
Now the big problem: It's SLOW.
Even when using the C++ implementation of python-protobuf, it's one order of magnitude slower than in pure C++. I have a benchmark where I read 10M protobuf messages of ~30 bytes each from a file. It takes ~0.9s in C++, and 35s in python.
One way to make it a bit faster would be to re-implement the varint decoder to make it read from a file and decode in one go, instead of reading from a file and then decoding as this code currently does. (profiling shows that a significant amount of time is spent in the varint encoder/decoder). But needless to say that alone is not enough to close the gap between the python version and the C++ version.
Any idea to make it faster is very welcome :)
Just for completeness, I post here an up-to-date version that works with the master version of protobuf and Python 3.
For the C++ version it is sufficient to use the utils in delimited_message_util.h; here is an MWE:
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/util/delimited_message_util.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
template <typename T>
bool writeManyToFile(std::deque<T> messages, std::string filename) {
int outfd = open(filename.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
google::protobuf::io::FileOutputStream fout(outfd);
bool success;
for (auto msg: messages) {
success = google::protobuf::util::SerializeDelimitedToZeroCopyStream(
msg, &fout);
if (! success) {
std::cout << "Writing Failed" << std::endl;
break;
}
}
fout.Close();
close(outfd);
return success;
}
template <typename T>
std::deque<T> readManyFromFile(std::string filename) {
int infd = open(filename.c_str(), O_RDONLY);
google::protobuf::io::FileInputStream fin(infd);
bool keep = true;
bool clean_eof = true;
std::deque<T> out;
while (keep) {
T msg;
keep = google::protobuf::util::ParseDelimitedFromZeroCopyStream(
&msg, &fin, &clean_eof);
if (keep)
out.push_back(msg);
}
fin.Close();
close(infd);
return out;
}
For the Python 3 version, building on @fireboot's answer, the only thing that needed modification is the decoding of raw_varint32:
def getSize(raw_varint32):
    result = 0
    shift = 0
    # accumulate all varint bytes, not just the first one
    for b in raw_varint32:
        result |= (six.indexbytes(b, 0) & 0x7f) << shift
        shift += 7
    return result

def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size = getSize(raw_varint32)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message
I was also looking for a solution for this. Here's the core of our solution, assuming some Java code wrote many MyRecord messages with writeDelimitedTo into a file. Open the file and loop, doing:
if(someCodedInputStream->ReadVarint32(&bytes)) {
CodedInputStream::Limit msgLimit = someCodedInputStream->PushLimit(bytes);
if(myRecord->ParseFromCodedStream(someCodedInputStream)) {
//do your stuff with the parsed MyRecord instance
} else {
//handle parse error
}
someCodedInputStream->PopLimit(msgLimit);
} else {
//maybe end of file
}
Hope it helps.
Working with an Objective-C version of protocol buffers, I ran into this exact issue. When sending from the iOS client to a Java-based server that uses parseDelimitedFrom, which expects the length as the first byte, I needed to call writeRawByte on the CodedOutputStream first. Posting here to hopefully help others who run into this issue. While working through this, one would think that Google protobufs would come with a simple flag that does this for you...
Request* request = [rBuild build];
[self sendMessage:request];
}
- (void) sendMessage:(Request *) request {
//** get length (note: a single raw byte only matches the varint framing parseDelimitedFrom expects for messages up to 127 bytes)
NSData* n = [request data];
uint8_t len = [n length];
PBCodedOutputStream* os = [PBCodedOutputStream streamWithOutputStream:outputStream];
//** prepend it to message, such that Request.parseDelimitedFrom(in) can parse it properly
[os writeRawByte:len];
[request writeToCodedOutputStream:os];
[os flush];
}
Since I'm not allowed to write this as a comment to Kenton Varda's answer above, I'll post it as an answer: I believe there is a bug in the code he posted (as well as in other answers that have been provided). The following code:
...
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
sets an incorrect limit because it does not take into account the size of the varint32 which has already been read from input. This can result in data loss/corruption as additional bytes are read from the stream which may be part of the next message. The usual way of handling this correctly is to delete the CodedInputStream used to read the size and create a new one for reading the payload:
...
uint32_t size;
{
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
if (!input.ReadVarint32(&size)) return false;
}
google::protobuf::io::CodedInputStream input(rawInput);
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
You can use getline for reading a string from a stream, using the specified delimiter:
istream& getline ( istream& is, string& str, char delim );
(defined in the <string> header)
