I'm trying to use JNA with the USBXPRESS library from SiLabs (siusbxp.dll), and while the basic functions work fine, there is a problem with the SI_GetDeviceProductString function.
public class usbxpress
{
public interface SiUSBXp extends StdCallLibrary
{
SiUSBXp INSTANCE = (SiUSBXp) Native.loadLibrary("SiUSBXp", SiUSBXp.class);
byte SI_GetNumDevices (IntByReference lpdwNumDevices);
byte SI_GetProductString( int dwDeviceNum, byte[] lpvDeviceString, int dwFlags);
byte SI_Open (int dwDevice, HANDLEByReference cyHandle);
byte SI_GetPartNumber (HANDLE cyHandle, ByteByReference lpbPartNum);
byte SI_GetDeviceProductString (HANDLE cyHandle, PointerByReference lpProduct, ByteByReference lpbLength, int bConvertToASCII);
//byte SI_GetDeviceProductString (HANDLE cyHandle, LPVOID lpProduct, LPBYTE lpbLength, BOOL bConvertToASCII = TRUE); //original c function
}
public static void main(String[] args)
{
//checking number of connected devices
IntByReference lpdwNumDevices = new IntByReference();
SiUSBXp.INSTANCE.SI_GetNumDevices (lpdwNumDevices);
System.out.println(lpdwNumDevices.getValue());
//opening the device
HANDLEByReference dev_handle_ref = new HANDLEByReference();
byte status = SiUSBXp.INSTANCE.SI_Open(0, dev_handle_ref);
System.out.printf("Status %d\n", status);
HANDLE device_handle = dev_handle_ref.getValue();
//checking part number
ByteByReference lpbPartNum = new ByteByReference();
SiUSBXp.INSTANCE.SI_GetPartNumber(device_handle, lpbPartNum);
System.out.printf("Part number is CP210%d\n", lpbPartNum.getValue());
//checking product string - does not work
PointerByReference lpProduct = new PointerByReference();
ByteByReference lpbLength = new ByteByReference();
SiUSBXp.INSTANCE.SI_GetDeviceProductString(device_handle, lpProduct, lpbLength, 1);
}}
When I try to run it, I get the following error:
> Exception in thread "main" java.lang.UnsatisfiedLinkError: Error looking up function 'SI_GetDeviceProductString': at com.sun.jna.Function.<init>(Function.java:179)
at com.sun.jna.NativeLibrary.getFunction(NativeLibrary.java:391)
at com.sun.jna.NativeLibrary.getFunction(NativeLibrary.java:371)
at com.sun.jna.Library$Handler.invoke(Library.java:205)
at $Proxy0.SI_GetDeviceProductString(Unknown Source)
It feels like the problem is with the default argument of the C function. I tried using int as the argument and also tried omitting it, but neither helped.
I haven't even reached the SI_Write (HANDLE cyHandle, LPVOID lpBuffer, DWORD dwBytesToWrite, LPDWORD lpdwBytesWritten, OVERLAPPED* o = NULL); function, which promises to cause even more problems :)
Could anyone please suggest how I can deal with default function arguments in JNA?
Update: SI_Write function works fine this way:
byte SI_Write (HANDLE cyHandle, PointerByReference lpBuffer, int dwBytesToWrite, IntByReference lpdwBytesWritten, Pointer o);
...
SiUSBXp.INSTANCE.SI_Write (device_handle, lpBuffer, message.length, lpdwBytesWritten, null);
So the problem is caused by something else, but it still exists.
I managed to solve the problem by using an older version of the SiUSBXp library, which I found on the internet.
The newer version downloaded from the SiLabs website behaves oddly: sometimes the SI_GetDeviceProductString function is visible in Dependency Walker and sometimes it is not, while the older version is fine.
I need to read the text of a locally open MS Word document. As I understand it, with the following approach, given the path of the document, I can perform any operation on the document.
https://github.com/java-native-access/jna/blob/master/contrib/msoffice/src/com/sun/jna/platform/win32/COM/util/office/Wordautomation_KB_313193_Mod.java
But in my case I have a handle (WinDef.HWND) to the locally opened Word window, and I am not able to get the local path from it. Below is the code I am trying, which does not achieve what I am looking for. Please give me any pointers on how I can solve this.
Please note that the following line gives the path of WINWORD.EXE:
System.out.println("File Path: "+desktop.getFilePath());
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.platform.DesktopWindow;
import com.sun.jna.platform.FileUtils;
import com.sun.jna.platform.WindowUtils;
import com.sun.jna.platform.WindowUtils.NativeWindowUtils;
import com.sun.jna.platform.win32.WinDef;
import com.sun.jna.platform.win32.Kernel32Util;
import com.sun.jna.platform.win32.WinUser;
import com.sun.jna.win32.StdCallLibrary;
import java.util.List;
public class NativeWordpadExtractor {
public static void main(String ar[]){
executeNativeCommands();
}
public static void executeNativeCommands(){
NativeExtractor.User32 user32 = NativeExtractor.User32.INSTANCE;
user32.EnumWindows(new WinUser.WNDENUMPROC() {
int count = 0;
@Override
public boolean callback(WinDef.HWND hWnd, Pointer arg1) {
byte[] windowText = new byte[512];
user32.GetWindowTextA(hWnd, windowText, 512);
String wText = Native.toString(windowText);
// get rid of this if block if you want all windows regardless of whether
// or not they have text
if (wText.isEmpty()) {
return true;
}
if("SampleTextForScreenScrapping_Word - WordPad".equals(wText)){
System.out.println("Got the 'Wordpad'" + hWnd + ", class " + hWnd.getClass() +"getPointer"+ hWnd.getPointer()+ " Text: " + wText);
//WinDef.HWND notePadHwnd = user32.FindWindowA("Wordpad",null );
byte[] fileText = new byte[1024];
System.out.println("fileText : " + WindowUtils.getWindowTitle(hWnd));
List<DesktopWindow> desktops=WindowUtils.getAllWindows(true);
// Approach 1) For getting a handle to the Desktop object . I am not able to achieve result with this.
for(DesktopWindow desktop:desktops){
System.out.println("File Path: "+desktop.getFilePath());
System.out.println("Title : "+desktop.getTitle());
}
System.out.println("fileText : " + WindowUtils.getAllWindows(true));
// Approach 2) For getting a handle to the native object .
// This is also not working
WinDef.HWND editHwnd = user32.FindWindowExA(hWnd, null, null, null);
byte[] lParamStr = new byte[512];
WinDef.LRESULT resultBool = user32.SendMessageA(editHwnd, NativeExtractor.User32.WM_GETTEXT, 512, lParamStr);
System.out.println("The content of the file is : " + Native.toString(lParamStr));
return false;
}
System.out.println("Found window with text " + hWnd + ", total " + ++count + " Text: " + wText);
return true;
}
}, null);
}
interface User32 extends StdCallLibrary {
NativeExtractor.User32 INSTANCE = (NativeExtractor.User32) Native.loadLibrary("user32", NativeExtractor.User32.class);
int WM_SETTEXT = 0x000c;
int WM_GETTEXT = 0x000D;
int GetWindowTextA(WinDef.HWND hWnd, byte[] lpString, int nMaxCount);
boolean EnumWindows(WinUser.WNDENUMPROC lpEnumFunc, Pointer arg);
WinDef.HWND FindWindowA(String lpClassName, String lpWindowName);
WinDef.HWND FindWindowExA(WinDef.HWND hwndParent, WinDef.HWND hwndChildAfter, String lpClassName, String lpWindowName);
WinDef.LRESULT SendMessageA(WinDef.HWND paramHWND, int paramInt, WinDef.WPARAM paramWPARAM, WinDef.LPARAM paramLPARAM);
WinDef.LRESULT SendMessageA(WinDef.HWND editHwnd, int wmGettext, long l, byte[] lParamStr);
int GetClassNameA(WinDef.HWND hWnd, byte[] lpString, int maxCount);
}
}
I'm not quite sure you can achieve what you want, but I'll do what I can to answer your questions to get you closer to the goal.
There are two ways to get the file information: one more generic, with Java/JNA, and the other requiring you to peer inside the process memory space. I'll address the first one.
Rather than dealing with a window handle, let's get the Process ID, which is easier to use later. That's relatively straightforward:
IntByReference pidPtr = new IntByReference();
com.sun.jna.platform.win32.User32.INSTANCE.GetWindowThreadProcessId(hWnd, pidPtr);
int pid = pidPtr.getValue();
(Of note, you should probably have your own User32 interface extend the one above so you can just use the one class and not have to fully qualify the JNA version like I did.)
Now, armed with the PID, there are a few options to try to get the path.
If you're lucky and the user opened the file directly (rather than using File>Open), you can recover the command line they used, and it will likely contain the path. You can retrieve this from the WMI class Win32_Process. You can find full code in my project OSHI, in the WindowsOperatingSystem class, or you can use Runtime.getRuntime().exec() to run the command-line WMI version: wmic path Win32_Process where ProcessID=1234 get CommandLine, capturing the result in a BufferedReader (or see OSHI's ExecutingCommand class for an implementation).
If the command line check is unsuccessful you can search for which file handles are open by that process. The easiest way to do that is to download the Handle utility (but all your users would have to do this) and then just execute the command line handle -p 1234. This will list open files held by that process.
If you can't rely on your users downloading Handle, you can try to implement the same code yourself. This is an undocumented API using NtQuerySystemInformation. See the JNA project Issue 657 for sample code which will iterate all of an operating system's handles, allowing you to look at the files. Given that you already know the PID you can shortcut the iteration after SYSTEM_HANDLE sh = info.Handles[i]; by skipping the remainder of the code unless sh.ProcessID matches your pid. As stated in that issue, the code listed is mostly unsupported and dangerous. There is no guarantee it will work in future versions of Windows.
Finally, you can see what you can do with process memory. Armed with the PID, you could open the Process to get a handle:
HANDLE pHandle = Kernel32.INSTANCE.OpenProcess(WinNT.PROCESS_QUERY_INFORMATION, false, pid);
Then you could enumerate its modules with EnumProcessModules; for each module use GetModuleInformation to retrieve a MODULEINFO structure. This gives you a pointer to memory that you can explore to your heart's content. Of course, accurately knowing at what offsets to find what information requires the API for the executable you're exploring (Word, WordPad, etc., and the appropriate version.) And you need admin rights. This exploration is left as an exercise for the reader.
I need to use a DLL inside my Java application. The DLL exports a set of functions, which the authors call the "Direct DLL API". I'm trying to define the Java equivalent of the following function declaration:
int XcCompress( HXCEEDCMP hComp, const BYTE* pcSource, DWORD dwSourceSize, BYTE** ppcCompressed, DWORD* pdwCompressedSize, BOOL bEndOfData );
Inside my interface that extends Library I declared it as follows:
int XcCompress(WString hComp, Pointer pcSource, int dwSourceSize, Pointer[] ppcCompressed, IntByReference pdwCompressedSize, boolean bEndOfData);
The problem is that every time I call it, I get an error:
Exception in thread "main" java.lang.Error: Invalid memory access
So basically I'm stuck at this point.
HXCEEDCMP hComp - is supposed to hold the handle; it works fine as a WString for the init DLL / destroy DLL functions, so I kept it like this.
The header declaration of the function that creates it is:
typedef HXCEEDCMP ( XCD_WINAPI *LPFNXCCREATEXCEEDCOMPRESSIONW )( const WCHAR* );
const BYTE* pcSource - is the source data for compression, inside my code I instantiate it this way:
private static Pointer setByteArrayPointer(String dataToCompress) {
Pointer pointer = new Memory(1024);
pointer.write(0, dataToCompress.getBytes(), 0,
dataToCompress.getBytes().length);
return pointer;
}
DWORD dwSourceSize - for this I take the size of the reserved Memory, like this:
String testData = "ABCDABCDABCDAAD";
Pointer source = setByteArrayPointer(testData);
(int) ((Memory)source).size()
BYTE** ppcCompressed - the function should populate the ppcCompressed reference once its work is done. I assume I made a mistake here by doing it this way:
Pointer[] compressed = {new Pointer(1024), new Pointer(1024)};
DWORD* pdwCompressedSize - the size of the compressed data, returned by the function. I map it this way:
IntByReference intByReference = new IntByReference();
I'm not sure this is a good idea either.
BOOL bEndOfData - I need to set it to true.
So finally my method call, which returns an error, looks like this:
xceedApiDll.XcCompress(handle, source, (int) ((Memory)source).size(), compressed, intByReference, true);
Any help will be appreciated. Thank you.
I think I solved the issue (thanks for the comments, guys). Maybe this will be useful to someone using this library:
In the end, the main problems were the handle declaration and the ppcCompressed value.
I used the following solution which works fine for me:
Method declarations inside java interface:
int XcCompress(Pointer hComp, byte[] pcSource, int dwSourceSize, PointerByReference ppcCompressed, IntByReference pdwCompressedSize, int bEndOfData);
int XcUncompress(Pointer hComp, byte[] pcSource, int dwSourceSize, PointerByReference ppcUncompressed, IntByReference pdwUncompressedSize, int bEndOfdata);
Usage:
private static final XceedFunctions XCEED_DLL_API;
static {
XCEED_DLL_API = Native.load("XceedZipX64", XceedFunctions.class);
}
private static final String TEST_DATA = "abcabcddd";
//Data pointers
private static Pointer compHandle;
private static byte[] baSource = TEST_DATA.getBytes();
private static PointerByReference pbrCompressed = new PointerByReference();
private static PointerByReference pbrUncompressed = new PointerByReference();
private static IntByReference ibrCompressedSize = new IntByReference();
private static IntByReference ibrUncompressedSize = new IntByReference();
public static void main(String[] args) {
try {
boolean isSuccessfulInit = XCEED_DLL_API.XceedZipInitDLL();
if(isSuccessfulInit) {
compHandle = XCEED_DLL_API.XcCreateXceedCompressionW(new WString("YOUR_LICENCE_KEY_HERE"));
int compressionResult = XCEED_DLL_API.XcCompress(compHandle, baSource, baSource.length, pbrCompressed, ibrCompressedSize, 1);
byte[] compressed = getDataFromPbr(pbrCompressed, ibrCompressedSize);
System.out.println("Compression result: " + compressionResult + " Data: " + new String(compressed));
int decompressionResult = XCEED_DLL_API.XcUncompress(compHandle, compressed, compressed.length, pbrUncompressed, ibrUncompressedSize, 1);
byte[] uncompressed = getDataFromPbr(pbrUncompressed, ibrUncompressedSize);
System.out.println("Decompression result: " + decompressionResult + " Data: " + new String(uncompressed));
}
} finally {
System.out.println("Free memory and shutdown");
if(compHandle != null) {
XCEED_DLL_API.XcDestroyXceedCompression(compHandle);
}
XCEED_DLL_API.XceedZipShutdownDLL();
}
}
private static byte[] getDataFromPbr(PointerByReference pbr, IntByReference ibr) {
return pbr.getValue().getByteArray(0, ibr.getValue());
}
Example output:
Compression result: 0 Data: KLJNLJNII yK
Decompression result: 0 Data: abcabcddd
Free memory and shutdown
Java Doc for Function
I can't seem to figure out how to use this function. I have a javax.sound.midi.Sequence and the File I want to write to, but I can't figure out what "int fileType" is. There are no static ints to reference in MidiSystem, Sequence, or MidiFileWriter, and 0 doesn't help either.
The error I get when using zero is:
Exception in thread "AWT-EventQueue-0" java.lang.ClassCastException: seph.reed.effigy.MidiLoader$1 cannot be cast to javax.sound.midi.ShortMessage
at com.sun.media.sound.StandardMidiFileWriter.writeTrack(StandardMidiFileWriter.java:386)
at com.sun.media.sound.StandardMidiFileWriter.getFileStream(StandardMidiFileWriter.java:204)
at com.sun.media.sound.StandardMidiFileWriter.write(StandardMidiFileWriter.java:137)
at com.sun.media.sound.StandardMidiFileWriter.write(StandardMidiFileWriter.java:153)
at javax.sound.midi.MidiSystem.write(MidiSystem.java:1060)
at seph.reed.effigy.MidiLoader.saveClipAs(MidiLoader.java:197)
at seph.reed.effigy.EffigyMenuBar$2.onClick(EffigyMenuBar.java:47)
My own function referenced in the trace is:
public void saveClipAs(File selectedFile) {
try {
Sequence out = new Sequence(Sequence.PPQ, 256);
Track toMe = out.createTrack();
Sequencer fromMe = ANCESTOR(Effigy.class).m_gui.getCurrentClip().m_sequencer;
//traverse linked list adding notes to track
for(MidiEventEntity ptr = fromMe.m_head; ptr != null; ptr = ptr.m_next) {
byte[] midiData = new byte[3];
midiData[0] = MidiToolBox.NOTE_ON;
midiData[1] = (byte)ptr.getNote();
midiData[2] = (byte)127;
long tick = (long) (256 * ptr.getBeat()); //256 ticks per 1/4 note
MidiEvent addMe = new MidiEvent(new MidiMessage(midiData) {
@Override
public Object clone() {
return null; }
}, tick);
toMe.add(addMe);
}
//THIS LINE BELOW
MidiSystem.write(out, 0, selectedFile);
}
catch (InvalidMidiDataException e) {
e.printStackTrace(); }
catch (IOException e) {
e.printStackTrace();
}
}
Thanks for any help. I'm utterly at a loss as to what int fileType is really asking for.
EDIT: removed a dumb secondary question.
EDIT: functional code:
for(MidiEventEntity ptr = fromMe.m_head; ptr != null; ptr = ptr.m_next) {
byte status = MidiToolBox.NOTE_ON;
byte note = (byte)ptr.getNote();
byte velocity = (byte)127;
long tick = (long) (256 * ptr.getBeat()); //256 ticks per 1/4 note
ShortMessage msg = new ShortMessage(status, note, velocity);
MidiEvent addMe = new MidiEvent(msg, tick);
toMe.add(addMe);
}
It looks like the int corresponds to MIDI Type 0, MIDI Type 1, or MIDI Type 2 (more details here).
To determine which MIDI types your system supports, you can call the MidiSystem.getMidiFileTypes(Sequence sequence) method.
According to https://docs.oracle.com/javase/tutorial/sound/SPI-providing-MIDI.html :
There are three standard MIDI file formats, all of which an implementation of the Java Sound API can support: Type 0, Type 1, and Type 2. These file formats differ in their internal representation of the MIDI sequence data in the file, and are appropriate for different kinds of sequences. If an implementation doesn't itself support all three types, a service provider can supply the support for the unimplemented ones. There are also variants of the standard MIDI file formats, some of them proprietary, which similarly could be supported by a third-party vendor.
Thus the fileType is either 0, 1, or 2.
Which file types your implementation supports can be seen via MidiSystem.getMidiFileTypes().
The file type of a midi file can be identified via
MidiSystem.getMidiFileFormat() (see
http://docs.oracle.com/javase/7/docs/api/javax/sound/midi/MidiSystem.html#getMidiFileFormat%28java.io.File%29
and http://docs.oracle.com/javase/7/docs/api/javax/sound/midi/MidiFileFormat.html)
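To make the three types concrete: a Standard MIDI File starts with an MThd header chunk, and the first 16-bit field after the chunk length is the format number (0, 1, or 2). The following is a minimal Python sketch, outside Java Sound and purely for illustration, that reads that field from raw bytes. The helper name midi_file_type and the hand-built header are assumptions made for this example.

```python
import struct

def midi_file_type(data: bytes) -> int:
    """Return the SMF format number (0, 1 or 2) from the MThd header chunk."""
    # Header chunk layout: 4-byte tag "MThd", 4-byte big-endian length (6),
    # then three 16-bit big-endian fields: format, track count, division.
    if len(data) < 14 or data[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    length, fmt, ntracks, division = struct.unpack(">IHHH", data[4:14])
    if length < 6:
        raise ValueError("header chunk too short")
    return fmt

# Hand-built header (hypothetical data): format 1, 2 tracks, 256 ticks per quarter note.
header = b"MThd" + struct.pack(">IHHH", 6, 1, 2, 256)
print(midi_file_type(header))  # 1
```

A sequence with a single track, like the one built in saveClipAs above, is typically written as Type 0, while multi-track sequences are typically written as Type 1.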
I have created this method that should return the full path and file name so that I can uniquely identify a program. However, it only returns C:\Program Files (x86)\Java\jre6\bin\javaw.exe or an empty string instead of the path for the particular program in focus. What is it that I am doing wrong?
private void getFocusWindow() {
HWND focusedWindow = User32.INSTANCE.GetForegroundWindow();
char[] nameName = new char[512];
User32.INSTANCE.GetWindowModuleFileName(focusedWindow, nameName, 512);
System.out.println(nameName);
}
Using psapi:
Solution:
This provides the full path and module file name; the only exception is in Eclipse, where it prints out '�'. See @technomage's answer for more detail about the GetModuleFileNameEx method.
private void getFocusWindow() {
PsApi psapi = (PsApi) Native.loadLibrary("psapi", PsApi.class);
HWND focusedWindow = User32.INSTANCE.GetForegroundWindow();
byte[] name = new byte[1024];
IntByReference pid = new IntByReference();
User32.INSTANCE.GetWindowThreadProcessId(focusedWindow, pid);
HANDLE process = Kernel32.INSTANCE.OpenProcess(0x0400 | 0x0010, false, pid.getValue());
psapi.GetModuleFileNameExA(process, null, name, 1024);
String nameString= Native.toString(name);
System.out.println(nameString);
}
psapi class:
public interface PsApi extends StdCallLibrary {
int GetModuleFileNameExA(HANDLE process, HANDLE module ,
byte[] name, int i);
}
GetWindowModuleFileName and GetModuleFileName work only with the current process (i.e. you'll only get useful information for the current process's windows) in Windows NT 4 and later.
http://support.microsoft.com/?id=228469
The article recommends using the PSAPI function GetModuleFileNameEx instead.
EDIT
You'll need to convert the Window handle to a module handle (which is probably shorter than converting window handle to PID to module handle). Keep in mind that the window handle is just an address (so you'll need the GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS flag).
I'm trying to read / write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing length prefixes before the messages, but there's no way to do that by default (that I could see).
However, the Java API in version 2.1.0 received a set of "Delimited" I/O functions which apparently do that job:
parseDelimitedFrom
mergeDelimitedFrom
writeDelimitedTo
Are there C++ equivalents? And if not, what's the wire format for the size prefixes the Java API attaches, so I can parse those messages in C++?
Update:
These now exist in google/protobuf/util/delimited_message_util.h as of v3.3.0.
I'm a bit late to the party here, but the below implementations include some optimizations missing from the other answers and will not fail after 64MB of input (though it still enforces the 64MB limit on each individual message, just not on the whole stream).
(I am the author of the C++ and Java protobuf libraries, but I no longer work for Google. Sorry that this code never made it into the official lib. This is what it would look like if it had.)
bool writeDelimitedTo(
const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL) {
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
} else {
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError()) return false;
}
return true;
}
bool readDelimitedFrom(
google::protobuf::io::ZeroCopyInputStream* rawInput,
google::protobuf::MessageLite* message) {
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
Okay, so I haven't been able to find top-level C++ functions implementing what I need, but some spelunking through the Java API reference turned up the following, inside the MessageLite interface:
void writeDelimitedTo(OutputStream output)
/* Like writeTo(OutputStream), but writes the size of
the message as a varint before writing the data. */
So the Java size prefix is a (Protocol Buffers) varint!
Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:
bool CodedInputStream::ReadVarint32(uint32 * value)
void CodedOutputStream::WriteVarint32(uint32 value)
Using those, I should be able to roll my own C++ functions that do the job.
They should really add this to the main Message API though; it's missing functionality considering Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).
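For anyone parsing the prefix by hand, the varint layout is simple: the value is split into 7-bit groups, least significant group first, and every byte except the last has its high bit set. Below is a small Python sketch of that encoding and of framing a payload with it. The function names are made up for this illustration and are not part of any protobuf API.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("size must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more groups follow: set the high bit
        else:
            out.append(low7)         # last group: high bit clear
            return bytes(out)

def decode_varint(buf: bytes, pos: int = 0):
    """Decode a varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]  # raises IndexError if the buffer is truncated
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7

def frame(payload: bytes) -> bytes:
    """Prefix an already-serialized message with its length as a varint."""
    return encode_varint(len(payload)) + payload

framed = frame(b"x" * 300)
print(framed[:2].hex())        # ac02
print(decode_varint(framed))   # (300, 2)
```

Messages shorter than 128 bytes get a one-byte prefix, which is why the delimited format is cheaper than a fixed 4-byte length for small messages.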
I solved the same problem using CodedOutputStream/ArrayOutputStream to write the message (with the size) and CodedInputStream/ArrayInputStream to read the message (with the size).
For example, the following pseudo-code writes the message size followed by the message:
const unsigned bufLength = 256;
unsigned char buffer[bufLength];
Message protoMessage;
google::protobuf::io::ArrayOutputStream arrayOutput(buffer, bufLength);
google::protobuf::io::CodedOutputStream codedOutput(&arrayOutput);
codedOutput.WriteLittleEndian32(protoMessage.ByteSize());
protoMessage.SerializeToCodedStream(&codedOutput);
When writing you should also check that your buffer is large enough to fit the message (including the size). And when reading, you should check that your buffer contains a whole message (including the size).
It definitely would be handy if they added convenience methods to C++ API similar to those provided by the Java API.
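The "whole message" check on the reading side can be sketched as follows. Note that the code above writes a fixed 4-byte little-endian prefix (WriteLittleEndian32), which is not the varint prefix used by the Java delimited functions, so the two formats are not interchangeable. This Python sketch assumes that 4-byte format and works on already-serialized payload bytes; the names write_frame and try_read_frame are hypothetical.

```python
import struct

PREFIX = struct.Struct("<I")  # 4-byte little-endian unsigned length

def write_frame(payload: bytes) -> bytes:
    """Return the payload preceded by its 4-byte little-endian size."""
    return PREFIX.pack(len(payload)) + payload

def try_read_frame(buf: bytes):
    """Return (payload, remaining_bytes) if buf holds a whole frame, else None."""
    if len(buf) < PREFIX.size:
        return None                  # the size prefix itself is incomplete
    (size,) = PREFIX.unpack_from(buf, 0)
    end = PREFIX.size + size
    if len(buf) < end:
        return None                  # the payload has not fully arrived yet
    return buf[PREFIX.size:end], buf[end:]

data = write_frame(b"hello") + write_frame(b"hi")
print(try_read_frame(data[:6]))  # None
print(try_read_frame(data))      # (b'hello', b'\x02\x00\x00\x00hi')
```

Returning None instead of raising lets the caller keep accumulating bytes and retry, which suits sockets where a message may arrive in several pieces.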
IstreamInputStream is very fragile with respect to EOFs and other errors that easily occur when it is used together with std::istream. After such an error the protobuf streams are permanently damaged and any already used buffer data is destroyed. There is proper support for reading from traditional streams in protobuf.
Implement google::protobuf::io::CopyingInputStream and use that together with CopyingInputStreamAdaptor. Do the same for the output variants.
In practice a parsing call ends up in google::protobuf::io::CopyingInputStream::Read(void* buffer, int size) where a buffer is given. The only thing left to do is read into it somehow.
Here's an example for use with Asio synchronized streams (SyncReadStream/SyncWriteStream):
#include <boost/asio.hpp>
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
using namespace google::protobuf::io;
template <typename SyncReadStream>
class AsioInputStream : public CopyingInputStream {
public:
AsioInputStream(SyncReadStream& sock);
int Read(void* buffer, int size);
private:
SyncReadStream& m_Socket;
};
template <typename SyncReadStream>
AsioInputStream<SyncReadStream>::AsioInputStream(SyncReadStream& sock) :
m_Socket(sock) {}
template <typename SyncReadStream>
int
AsioInputStream<SyncReadStream>::Read(void* buffer, int size)
{
std::size_t bytes_read;
boost::system::error_code ec;
bytes_read = m_Socket.read_some(boost::asio::buffer(buffer, size), ec);
if(!ec) {
return bytes_read;
} else if (ec == boost::asio::error::eof) {
return 0;
} else {
return -1;
}
}
template <typename SyncWriteStream>
class AsioOutputStream : public CopyingOutputStream {
public:
AsioOutputStream(SyncWriteStream& sock);
bool Write(const void* buffer, int size);
private:
SyncWriteStream& m_Socket;
};
template <typename SyncWriteStream>
AsioOutputStream<SyncWriteStream>::AsioOutputStream(SyncWriteStream& sock) :
m_Socket(sock) {}
template <typename SyncWriteStream>
bool
AsioOutputStream<SyncWriteStream>::Write(const void* buffer, int size)
{
boost::system::error_code ec;
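// boost::asio::write loops until the whole buffer is sent; a single write_some call may send only part of it.
boost::asio::write(m_Socket, boost::asio::buffer(buffer, size), ec);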
return !ec;
}
Usage:
AsioInputStream<boost::asio::ip::tcp::socket> ais(m_Socket); // Where m_Socket is a instance of boost::asio::ip::tcp::socket
CopyingInputStreamAdaptor cis_adp(&ais);
CodedInputStream cis(&cis_adp);
Message protoMessage;
uint32_t msg_size;
/* Read message size */
if(!cis.ReadVarint32(&msg_size)) {
// Handle error
}
/* Make sure not to read beyond limit of message */
CodedInputStream::Limit msg_limit = cis.PushLimit(msg_size);
if(!protoMessage.ParseFromCodedStream(&cis)) {
// Handle error
}
/* Remove limit */
cis.PopLimit(msg_limit);
Here you go:
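#include <cassert>
#include <fstream>
#include <iostream>
#include <string>
#include <google/protobuf/message.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/coded_stream.h>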
using namespace google::protobuf::io;
class FASWriter
{
std::ofstream mFs;
OstreamOutputStream *_OstreamOutputStream;
CodedOutputStream *_CodedOutputStream;
public:
FASWriter(const std::string &file) : mFs(file,std::ios::out | std::ios::binary)
{
assert(mFs.good());
_OstreamOutputStream = new OstreamOutputStream(&mFs);
_CodedOutputStream = new CodedOutputStream(_OstreamOutputStream);
}
inline void operator()(const ::google::protobuf::Message &msg)
{
_CodedOutputStream->WriteVarint32(msg.ByteSize());
if ( !msg.SerializeToCodedStream(_CodedOutputStream) )
std::cout << "SerializeToCodedStream error " << std::endl;
}
~FASWriter()
{
delete _CodedOutputStream;
delete _OstreamOutputStream;
mFs.close();
}
};
class FASReader
{
std::ifstream mFs;
IstreamInputStream *_IstreamInputStream;
CodedInputStream *_CodedInputStream;
public:
FASReader(const std::string &file) : mFs(file,std::ios::in | std::ios::binary)
{
assert(mFs.good());
_IstreamInputStream = new IstreamInputStream(&mFs);
_CodedInputStream = new CodedInputStream(_IstreamInputStream);
}
template<class T>
bool ReadNext()
{
T msg;
uint32_t size; // portable replacement for the MSVC-only "unsigned __int32"
bool ret;
if ( ret = _CodedInputStream->ReadVarint32(&size) )
{
CodedInputStream::Limit msgLimit = _CodedInputStream->PushLimit(size);
if ( ret = msg.ParseFromCodedStream(_CodedInputStream) )
{
_CodedInputStream->PopLimit(msgLimit);
std::cout << "FASReader ReadNext: " << msg.DebugString() << std::endl;
}
}
return ret;
}
~FASReader()
{
delete _CodedInputStream;
delete _IstreamInputStream;
mFs.close();
}
};
I ran into the same issue in both C++ and Python.
For the C++ version, I used a mix of the code Kenton Varda posted on this thread and the code from the pull request he sent to the protobuf team (because the version posted here doesn't handle EOF while the one he sent to github does).
#include <google/protobuf/message_lite.h>
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/io/coded_stream.h>
bool writeDelimitedTo(const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput)
{
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL)
{
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
}
else
{
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError())
return false;
}
return true;
}
bool readDelimitedFrom(google::protobuf::io::ZeroCopyInputStream* rawInput, google::protobuf::MessageLite* message, bool* clean_eof)
{
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
const int start = input.CurrentPosition();
if (clean_eof)
*clean_eof = false;
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size))
{
if (clean_eof)
*clean_eof = input.CurrentPosition() == start;
return false;
}
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit = input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
And here is my Python 2 implementation:
from google.protobuf.internal import encoder
from google.protobuf.internal import decoder
#I had to implement this because the tools in google.protobuf.internal.decoder
#read from a buffer, not from a file-like object
def readRawVarint32(stream):
mask = 0x80 # (1 << 7)
raw_varint32 = []
while 1:
b = stream.read(1)
#eof
if b == "":
break
raw_varint32.append(b)
if not (ord(b) & mask):
#we found a byte starting with a 0, which means it's the last byte of this varint
break
return raw_varint32
def writeDelimitedTo(message, stream):
    message_str = message.SerializeToString()
    delimiter = encoder._VarintBytes(len(message_str))
    stream.write(delimiter + message_str)
def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message
#In place version that takes an already built protobuf object
#In my tests, this is around 20% faster than the other version
#of readDelimitedFrom()
def readDelimitedFrom_inplace(message, stream):
    raw_varint32 = readRawVarint32(stream)
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message.ParseFromString(data)
        return message
    else:
        return None
It might not be the best looking code and I'm sure it can be refactored a fair bit, but at least that should show you one way to do it.
Now the big problem: It's SLOW.
Even when using the C++ implementation of python-protobuf, it's more than an order of magnitude slower than pure C++. I have a benchmark where I read 10M protobuf messages of ~30 bytes each from a file: it takes ~0.9s in C++ and 35s in Python.
One way to make it a bit faster would be to re-implement the varint decoder so that it reads from a file and decodes in one go, instead of reading from the file and then decoding, as this code currently does (profiling shows that a significant amount of time is spent in the varint encoder/decoder). Needless to say, that alone is not enough to close the gap between the Python version and the C++ version.
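A minimal sketch of what "read and decode in one go" could look like, assuming Python 3 and a stream opened in binary mode. The name `read_varint32` is made up for this illustration and is not part of the protobuf package; I have not benchmarked it, so treat any speed-up as unverified.

```python
import io


def read_varint32(stream):
    """Read one base-128 varint from a binary stream, decoding as we go.

    Returns the decoded integer, or None if the stream is already at EOF.
    (Hypothetical helper, not a protobuf API.)
    """
    result = 0
    shift = 0
    while True:
        b = stream.read(1)
        if not b:
            if shift == 0:
                return None  # clean EOF: no bytes of a new varint were read
            raise EOFError("stream ended in the middle of a varint")
        byte = b[0]  # indexing bytes yields an int in Python 3
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result  # high bit clear: this was the last byte
        shift += 7
        if shift >= 35:
            raise ValueError("varint is too long for 32 bits")


# Example: the two bytes 0xAC 0x02 encode the value 300.
size = read_varint32(io.BytesIO(b"\xac\x02"))
```

With this helper, `readDelimitedFrom` could call `read_varint32(stream)` directly and skip the intermediate list of raw bytes.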
Any idea to make it faster is very welcome :)
Just for completeness, here is an up-to-date version that works with the master version of protobuf and Python 3.
For the C++ version it is sufficient to use the utilities in delimited_message_util.h; here is an MWE:
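#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/util/delimited_message_util.h>
#include <fcntl.h>
#include <deque>
#include <iostream>
#include <string>
// Write every message to the file, each one prefixed with its varint size.
template <typename T>
bool writeManyToFile(const std::deque<T>& messages, const std::string& filename) {
    // O_CREAT requires a mode argument; 0644 is a conventional choice.
    int outfd = open(filename.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (outfd < 0)
        return false;
    google::protobuf::io::FileOutputStream fout(outfd);
    bool success = true;  // stays true if there is nothing to write
    for (const auto& msg : messages) {
        success = google::protobuf::util::SerializeDelimitedToZeroCopyStream(
            msg, &fout);
        if (!success) {
            std::cout << "Writing Failed" << std::endl;
            break;
        }
    }
    // Close() flushes and closes the underlying descriptor, so no extra close().
    fout.Close();
    return success;
}
// Read size-prefixed messages until the end of the file.
template <typename T>
std::deque<T> readManyFromFile(const std::string& filename) {
    std::deque<T> out;
    int infd = open(filename.c_str(), O_RDONLY);
    if (infd < 0)
        return out;
    google::protobuf::io::FileInputStream fin(infd);
    bool keep = true;
    bool clean_eof = true;
    while (keep) {
        T msg;
        keep = google::protobuf::util::ParseDelimitedFromZeroCopyStream(
            &msg, &fin, &clean_eof);
        if (keep)
            out.push_back(msg);
    }
    // clean_eof is false when parsing stopped for a reason other than end of file.
    if (!clean_eof)
        std::cout << "Reading stopped before a clean end of file" << std::endl;
    fin.Close();
    return out;
}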
For the Python 3 version, building on @fireboot's answer, the only thing that needed modification is the decoding of raw_varint32:
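# raw_varint32 is the list of single-byte bytes objects returned by readRawVarint32.
# Note: under Python 3 the EOF check in readRawVarint32 must compare against b"", not "".
def getSize(raw_varint32):
    result = 0
    shift = 0
    # Each byte contributes its low 7 bits, least-significant group first.
    for b in raw_varint32:
        result |= ((ord(b) & 0x7f) << shift)
        shift += 7
    return result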
def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size = getSize(raw_varint32)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message
I was also looking for a solution to this. Here's the core of our solution, assuming some Java code wrote many MyRecord messages into a file with writeDelimitedTo. Open the file and loop, doing:
if(someCodedInputStream->ReadVarint32(&bytes)) {
CodedInputStream::Limit msgLimit = someCodedInputStream->PushLimit(bytes);
if(myRecord->ParseFromCodedStream(someCodedInputStream)) {
//do your stuff with the parsed MyRecord instance
} else {
//handle parse error
}
someCodedInputStream->PopLimit(msgLimit);
} else {
//maybe end of file
}
Hope it helps.
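For comparison, the same read loop can be sketched in Python 3 over raw payload bytes, without the protobuf package. This is only an illustration of the framing (varint length, then that many bytes); `iter_delimited` and `_read_length` are hypothetical names, and each yielded payload would still have to be passed to something like `MyRecord.ParseFromString`.

```python
import io


def _read_length(stream):
    """Decode one varint length prefix; return None on a clean EOF."""
    result = 0
    shift = 0
    while True:
        b = stream.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a length prefix")
        result |= (b[0] & 0x7F) << shift
        if not (b[0] & 0x80):
            return result
        shift += 7


def iter_delimited(stream):
    """Yield the raw bytes of each length-prefixed record in the stream."""
    while True:
        size = _read_length(stream)
        if size is None:
            return  # end of file between records
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("record is shorter than its length prefix")
        yield payload


# Three records: b"abc", an empty one, and b"hi".
records = list(iter_delimited(io.BytesIO(b"\x03abc\x00\x02hi")))
```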
Working with an Objective-C version of protocol-buffers, I ran into this exact issue. When sending from the iOS client to a Java-based server that uses parseDelimitedFrom, which expects the length as the first byte, I needed to call writeRawByte on the CodedOutputStream first. Posting here to hopefully help others who run into this issue. While working through it, one would think that Google protobufs would come with a simple flag that does this for you...
Request* request = [rBuild build];
[self sendMessage:request];
}
- (void) sendMessage:(Request *) request {
//** get length
NSData* n = [request data];
uint8_t len = [n length];
PBCodedOutputStream* os = [PBCodedOutputStream streamWithOutputStream:outputStream];
//** prepend it to message, such that Request.parseDelimitedFrom(in) can parse it properly
[os writeRawByte:len];
[request writeToCodedOutputStream:os];
[os flush];
}
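One caveat, sketched here in Python 3 rather than Objective-C: the length prefix that parseDelimitedFrom reads is a base-128 varint, so a single raw byte matches it only while the message is shorter than 128 bytes. The helper name `encode_varint` is made up for this illustration.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (hypothetical helper)."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(low7)  # last byte: high bit clear
            return bytes(out)


short_prefix = encode_varint(30)   # one byte, same as writing the raw length
long_prefix = encode_varint(300)   # two bytes; a single uint8_t cannot hold this
```

For longer messages the sender would need to write the varint form of the length instead of one raw byte.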
Since I'm not allowed to write this as a comment to Kenton Varda's answer above; I believe there is a bug in the code he posted (as well as in other answers which have been provided). The following code:
...
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
sets an incorrect limit because it does not take into account the size of the varint32 which has already been read from input. This can result in data loss/corruption as additional bytes are read from the stream which may be part of the next message. The usual way of handling this correctly is to delete the CodedInputStream used to read the size and create a new one for reading the payload:
...
uint32_t size;
{
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
if (!input.ReadVarint32(&size)) return false;
}
google::protobuf::io::CodedInputStream input(rawInput);
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
You can use getline for reading a string from a stream, using the specified delimiter:
istream& getline ( istream& is, string& str, char delim );
(defined in the <string> header)