mkdir() directory created but not found [duplicate] - java

File "~/workspace/Test.txt" does exist, but fd always returns -1. Can somebody please give a hint as to what is wrong with the code? Thanks.
int fd = open("~/workspace/Test.txt", O_RDONLY);
cout << "fd is "<<fd<<endl;
if (fd < 0) {
cout << "did not find file"<<endl;
return false;
}

(Assuming your OS is some Posix like Linux)
The ~ should be expanded. Usually the shell expands it. But open wants a real file path.
You could try:
std::string fname (getenv("HOME"));
fname += "/workspace/Test.txt";
int fd = open(fname.c_str(), O_RDONLY);
if (fd<0) {
std::cerr << "failed to open " << fname
<< " : " << strerror(errno) << std::endl;
return false;
}
See glob(7), wordexp(3), getenv(3), strerror(3), open(2), environ(7)
Read Advanced Linux Programming

Related

OpenCV Assertion Error in detectMultiScale

I'm trying to test out the facial recognition API in OpenCV. I have imported the .jar provided and it loads the DLL's correctly. The imageLibIit() function will load the DLL. I also have the provided XML files in a directory:src\main\resources\opencv
public boolean faceDetect(String inputFilename, String outputFilename){
if(!loaded){
if(!imageLibInit()){ // initializes and loads the dynamic libraries
return false;
}
}
//TODO #Austin fix image resource issues
ClassLoader loader = Thread.currentThread().getContextClassLoader();
String xmlResource = loader.getResource("opencv/haarcascade_frontalface_alt.xml").getPath();
CascadeClassifier faceDetector = new CascadeClassifier(xmlResource);
Mat inputImage = Imgcodecs.imread(inputFilename);
ErrorUtils.println(xmlResource);
if(inputImage.empty()){
ErrorUtils.println("image is empty");
return false;
}
MatOfRect facesDetected = new MatOfRect();
faceDetector.detectMultiScale(inputImage, facesDetected);
Rect[] detections = facesDetected.toArray();
ErrorUtils.println("Faces detected in '" + inputFilename + "': " + detections.length);
for(Rect detection : detections){
Imgproc.rectangle(
inputImage,
new Point(detection.x, detection.y),
new Point(detection.x + detection.width, detection.y + detection.height),
new Scalar(0, 0, 255)
);
}
Imgcodecs.imwrite(outputFilename, inputImage);
return true;
}
I'm still getting an error:
OpenCV Error: Assertion failed (!empty()) in cv::CascadeClassifier::detectMultiScale, file C:\builds\master_PackSlaveAddon-win64-vc12-static\opencv\modules\objdetect\src\cascadedetect.cpp, line 1639
I have researched this error and each time the solution seems to be something with the resources. It's more than likely an incredibly simple issue, but I am stuck as of now.
Inspecting the OpenCV source code, we can find the following:
Internally the CascadeClassifier implementation uses class FileStorage to load the data. [1]
Class FileStorage internally uses the function fopen(...) to open the data file. [2] [3]
Since the path returned by the resouce loader ("/D:/Programming/workspaces/github/project1/HomeServer/target/classes/opencv/ha‌​arcascade_frontalface_alt.xml") is a little odd, containing the leading slash, the first suspicion leads to some issue with opening the file.
I wrote a simple little test to check this with Visual C++:
#include <cstdio>
#include <iostream>
bool try_open(char const* filename)
{
FILE* f;
f = fopen(filename, "rb");
if (f) {
fclose(f);
return true;
}
return false;
}
int main()
{
char const* path_1("/e:/foo.txt");
char const* path_2("e:/foo.txt");
std::cout << "fopen(\"" << path_1 << "\") -> " << try_open(path_1) << "\n";
std::cout << "fopen(\"" << path_2 << "\") -> " << try_open(path_2) << "\n";
return 0;
}
Output:
fopen("/e:/foo.txt") -> 0
fopen("e:/foo.txt") -> 1
Therefore the path is a culprit. According to this answer, one platform independent way is to modify the code as follows to generate a valid path:
// ...
ClassLoader loader = Thread.currentThread().getContextClassLoader();
String xmlResource = loader.getResource("opencv/haarcascade_frontalface_alt.xml").getPath();
File file = new File(xmlResource);
xmlResource = file.getAbsolutePath());
// ...
CascadeClassifier faceDetector = new CascadeClassifier(xmlResource);
if(faceDetector.empty()){
ErrorUtils.println("cascade classifier is empty");
return false;
}
// ...

SeekableByteChannel.read() always returns 0, InputStream is fine

We have a data file for which we need to generate a CRC. (As a placeholder, I'm using CRC32 while the others figure out what CRC polynomial they actually want.) This code seems like it ought to work:
broken:
Path in = ......;
try (SeekableByteChannel reading =
Files.newByteChannel (in, StandardOpenOption.READ))
{
System.err.println("byte channel is a " + reading.getClass().getName() +
" from " + in + " of size " + reading.size() + " and isopen=" + reading.isOpen());
java.util.zip.CRC32 placeholder = new java.util.zip.CRC32();
ByteBuffer buffer = ByteBuffer.allocate (reasonable_buffer_size);
int bytesread = 0;
int loops = 0;
while ((bytesread = reading.read(buffer)) > 0) {
byte[] raw = buffer.array();
System.err.println("Claims to have read " + bytesread + " bytes, have buffer of size " + raw.length + ", updating CRC");
placeholder.update(raw);
loops++;
buffer.clear();
}
// do stuff with placeholder.getValue()
}
catch (all the things that go wrong with opening files) {
and handle them;
}
The System.err and loops stuff is just for debugging; we don't actually care how many times it takes. The output is:
byte channel is a sun.nio.ch.FileChannelImpl from C:\working\tmp\ls2kst83543216xuxxy8136.tmp of size 7196 and isopen=true
finished after 0 time(s) through the loop
There's no way to run the real code inside a debugger to step through it, but from looking at the source to sun.nio.ch.FileChannelImpl.read() it looks like a 0 is returned if the file magically becomes closed while internal data structures are prepared; the code below is copied from the Java 7 reference implementation, comments added by me:
// sun.nio.ch.FileChannelImpl.java
public int read(ByteBuffer dst) throws IOException {
ensureOpen(); // this throws if file is closed...
if (!readable)
throw new NonReadableChannelException();
synchronized (positionLock) {
int n = 0;
int ti = -1;
Object traceContext = IoTrace.fileReadBegin(path);
try {
begin();
ti = threads.add();
if (!isOpen())
return 0; // ...argh
do {
n = IOUtil.read(fd, dst, -1, nd);
} while (......)
.......
But the debugging code tests isOpen() and gets true. So I don't know what's going wrong.
As the current test data files are tiny, I dropped this in place just to have something working:
works for now:
try {
byte[] scratch = Files.readAllBytes(in);
java.util.zip.CRC32 placeholder = new java.util.zip.CRC32();
placeholder.update(scratch);
// do stuff with placeholder.getValue()
}
I don't want to slurp the entire file into memory for the Real Code, because some of those files can be large. I do note that readAllBytes uses an InputStream in its reference implementation, which has no trouble reading the same file that SeekableByteChannel failed to. So I'll probably rewrite the code to just use input streams instead of byte channels. I'd still like to figure out what's gone wrong in case a future scenario comes up where we need to use byte channels. What am I missing with SeekableByteChannel?
Check that 'reasonable_buffer_size' isn't zero.

Programmatically change the desktop wallpaper periodically

What is the best way to go about creating a program that would change the desktop wallpaper periodically? I would also like to create a GUI around the program. I am a Computer Science student, and as such I know basic programming in Java, and C++ among others. This will be done on Windows 7 OS.
What would be the best language to use for a project like this?
Ideally I would like to use the system clock to trigger the change. Is this possible?
Am I in over my head?
Any answers will be very much appreciated. Thank you.
In Java :
import java.util.*;
public class changer
{
public static native int SystemParametersInfo(int uiAction,int uiParam,String pvParam,int fWinIni);
static
{
System.loadLibrary("user32");
}
public int Change(String path)
{
return SystemParametersInfo(20, 0, path, 0);
}
public static void main(String args[])
{
String wallpaper_file = "c:\\wallpaper.jpg";
changer mychanger = new changer();
mychanger.Change(wallpaper_file);
}
}
In Win32 C++, You can use SetTimer to trigger a change.
#define STRICT 1
#include <windows.h>
#include <iostream.h>
VOID CALLBACK TimerProc(HWND hWnd, UINT nMsg, UINT nIDEvent, DWORD dwTime)
{
LPWSTR wallpaper_file = L"C:\\Wallpapers\\wallpaper.png";
int return_value = SystemParametersInfo(SPI_SETDESKWALLPAPER, 0, wallpaper_file, SPIF_UPDATEINIFILE);
cout << "Programmatically change the desktop wallpaper periodically: " << dwTime << '\n';
cout.flush();
}
int main(int argc, char *argv[], char *envp[])
{
int Counter=0;
MSG Msg;
UINT TimerId = SetTimer(NULL, 0, 2000, &TimerProc); //2000 milliseconds
cout << "TimerId: " << TimerId << '\n';
if (!TimerId)
return 16;
while (GetMessage(&Msg, NULL, 0, 0))
{
++Counter;
if (Msg.message == WM_TIMER)
cout << "Counter: " << Counter << "; timer message\n";
else
cout << "Counter: " << Counter << "; message: " << Msg.message << '\n';
DispatchMessage(&Msg);
}
KillTimer(NULL, TimerId);
return 0;
}
This is a reasonably straightforward project, and can be done easily with any language that can call Win32 API functions (C++, for example). The non-obvious function to change the wallpaper is SystemParametersInfo with the SPI_SETDESKWALLPAPER flag. You give it a file name of a new image, and the wallpaper changes.

java writeInt from php,

hey all i am trying to make a data output stream in php
to write back primitive data types to a java application
i created a class that write the data to an array
(write it same as java do , copy from java code)
and finally i am writing back the array to the client.
feels like its not working well
for example the writeInt method
send to the java client some wrong values
am i doing ok ?
thank you
here is my code
private $buf = array();
public function writeByte($b) {
$this->buf[] = pack('c' ,$b);
}
public function writeInt($v) {
$this->writeByte($this->shiftRight3($v , 24) & 0xFF);
$this->writeByte($this->shiftRight3($v , 16) & 0xFF);
$this->writeByte($this->shiftRight3($v , 8) & 0xFF);
$this->writeByte($this->shiftRight3($v , 0) & 0xFF);
}
private function shiftRight3($a ,$b){
if(is_numeric($a) && $a < 0){
return ($a >> $b) + (2<<~$b);
}else{
return ($a >> $b);
}
}
public function toByteArray(){
return $this->buf;
}
this is how i am setting the main php file
header("Content-type: application/octet-stream" ,true);
header("Content-Transfer-Encoding: binary" ,true);
this is how i am returning the data
$arrResult = $dataOutputStream->toByteArray();
for ($i = 0 ; $i < count($arrResult) ; $i ++){
echo $arrResult[$i];
}
I EDIT THE QUESTION ,ACCOURDING TO MY CODE CHANGING
in the java client side seems that i have 2 bytes to read start always
i have 13 , 10 , which is \r \n
how come i am reading them always ?
(in my test i am sending one byte to the java client side ,
URL u = new URL("http://localhost/jtpc/test/inputTest.php");
URLConnection c = u.openConnection();
InputStream in = c.getInputStream();
int read = 0;
for (int j = 0; read != -1 ; j++) {
read = in.read();
System.out.println("More to read : " + read);
}
)
the output will be ,
More to read : 13
More to read : 10
More to read : 1 (this is the byte i am sending)
Php has pack() function for turning data into binary form. Unpack() reverses the operation.
$binaryInt = pack('I', $v);
The one thing that strikes me as odd is that you are setting the content type to application/zip, but you don't seem to be creating a ZIP encoded output stream. Is this an oversight ... or does PHP perform the encoding for you without you asking?
EDIT
According to RFC 2046, the recommended content type for a binary data format whose content type is not standardized is "application/octet-stream". There is also a practice of defining custom content subtypes with a name starting with "x-" (for experimental), but RFC 2046 says that this practice is now strongly discouraged.
You don't need that shiftRight3() method, just use >>, as you are masking the result, and then turning it into a chr(). Throw it away.

Are there C++ equivalents for the Protocol Buffers delimited I/O functions in Java?

I'm trying to read / write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing length prefixes before the messages, but there's no way to do that by default (that I could see).
However, the Java API in version 2.1.0 received a set of "Delimited" I/O functions which apparently do that job:
parseDelimitedFrom
mergeDelimitedFrom
writeDelimitedTo
Are there C++ equivalents? And if not, what's the wire format for the size prefixes the Java API attaches, so I can parse those messages in C++?
Update:
These now exist in google/protobuf/util/delimited_message_util.h as of v3.3.0.
I'm a bit late to the party here, but the below implementations include some optimizations missing from the other answers and will not fail after 64MB of input (though it still enforces the 64MB limit on each individual message, just not on the whole stream).
(I am the author of the C++ and Java protobuf libraries, but I no longer work for Google. Sorry that this code never made it into the official lib. This is what it would look like if it had.)
bool writeDelimitedTo(
const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL) {
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
} else {
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError()) return false;
}
return true;
}
bool readDelimitedFrom(
google::protobuf::io::ZeroCopyInputStream* rawInput,
google::protobuf::MessageLite* message) {
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
Okay, so I haven't been able to find top-level C++ functions implementing what I need, but some spelunking through the Java API reference turned up the following, inside the MessageLite interface:
void writeDelimitedTo(OutputStream output)
/* Like writeTo(OutputStream), but writes the size of
the message as a varint before writing the data. */
So the Java size prefix is a (Protocol Buffers) varint!
Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:
bool CodedInputStream::ReadVarint32(uint32 * value)
void CodedOutputStream::WriteVarint32(uint32 value)
Using those, I should be able to roll my own C++ functions that do the job.
They should really add this to the main Message API though; it's missing functionality considering Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).
I solved the same problem using CodedOutputStream/ArrayOutputStream to write the message (with the size) and CodedInputStream/ArrayInputStream to read the message (with the size).
For example, the following pseudo-code writes the message size following by the message:
const unsigned bufLength = 256;
unsigned char buffer[bufLength];
Message protoMessage;
google::protobuf::io::ArrayOutputStream arrayOutput(buffer, bufLength);
google::protobuf::io::CodedOutputStream codedOutput(&arrayOutput);
codedOutput.WriteLittleEndian32(protoMessage.ByteSize());
protoMessage.SerializeToCodedStream(&codedOutput);
When writing you should also check that your buffer is large enough to fit the message (including the size). And when reading, you should check that your buffer contains a whole message (including the size).
It definitely would be handy if they added convenience methods to C++ API similar to those provided by the Java API.
IsteamInputStream is very fragile to eofs and other errors that easily occurs when used together with std::istream. After this the protobuf streams are permamently damaged and any already used buffer data is destroyed. There are proper support for reading from traditional streams in protobuf.
Implement google::protobuf::io::CopyingInputStream and use that together with CopyingInputStreamAdapter. Do the same for the output variants.
In practice a parsing call ends up in google::protobuf::io::CopyingInputStream::Read(void* buffer, int size) where a buffer is given. The only thing left to do is read into it somehow.
Here's an example for use with Asio synchronized streams (SyncReadStream/SyncWriteStream):
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
using namespace google::protobuf::io;
template <typename SyncReadStream>
class AsioInputStream : public CopyingInputStream {
public:
AsioInputStream(SyncReadStream& sock);
int Read(void* buffer, int size);
private:
SyncReadStream& m_Socket;
};
template <typename SyncReadStream>
AsioInputStream<SyncReadStream>::AsioInputStream(SyncReadStream& sock) :
m_Socket(sock) {}
template <typename SyncReadStream>
int
AsioInputStream<SyncReadStream>::Read(void* buffer, int size)
{
std::size_t bytes_read;
boost::system::error_code ec;
bytes_read = m_Socket.read_some(boost::asio::buffer(buffer, size), ec);
if(!ec) {
return bytes_read;
} else if (ec == boost::asio::error::eof) {
return 0;
} else {
return -1;
}
}
template <typename SyncWriteStream>
class AsioOutputStream : public CopyingOutputStream {
public:
AsioOutputStream(SyncWriteStream& sock);
bool Write(const void* buffer, int size);
private:
SyncWriteStream& m_Socket;
};
template <typename SyncWriteStream>
AsioOutputStream<SyncWriteStream>::AsioOutputStream(SyncWriteStream& sock) :
m_Socket(sock) {}
template <typename SyncWriteStream>
bool
AsioOutputStream<SyncWriteStream>::Write(const void* buffer, int size)
{
boost::system::error_code ec;
m_Socket.write_some(boost::asio::buffer(buffer, size), ec);
return !ec;
}
Usage:
AsioInputStream<boost::asio::ip::tcp::socket> ais(m_Socket); // Where m_Socket is a instance of boost::asio::ip::tcp::socket
CopyingInputStreamAdaptor cis_adp(&ais);
CodedInputStream cis(&cis_adp);
Message protoMessage;
uint32_t msg_size;
/* Read message size */
if(!cis.ReadVarint32(&msg_size)) {
// Handle error
}
/* Make sure not to read beyond limit of message */
CodedInputStream::Limit msg_limit = cis.PushLimit(msg_size);
if(!msg.ParseFromCodedStream(&cis)) {
// Handle error
}
/* Remove limit */
cis.PopLimit(msg_limit);
Here you go:
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/coded_stream.h>
using namespace google::protobuf::io;
class FASWriter
{
std::ofstream mFs;
OstreamOutputStream *_OstreamOutputStream;
CodedOutputStream *_CodedOutputStream;
public:
FASWriter(const std::string &file) : mFs(file,std::ios::out | std::ios::binary)
{
assert(mFs.good());
_OstreamOutputStream = new OstreamOutputStream(&mFs);
_CodedOutputStream = new CodedOutputStream(_OstreamOutputStream);
}
inline void operator()(const ::google::protobuf::Message &msg)
{
_CodedOutputStream->WriteVarint32(msg.ByteSize());
if ( !msg.SerializeToCodedStream(_CodedOutputStream) )
std::cout << "SerializeToCodedStream error " << std::endl;
}
~FASWriter()
{
delete _CodedOutputStream;
delete _OstreamOutputStream;
mFs.close();
}
};
class FASReader
{
std::ifstream mFs;
IstreamInputStream *_IstreamInputStream;
CodedInputStream *_CodedInputStream;
public:
FASReader(const std::string &file), mFs(file,std::ios::in | std::ios::binary)
{
assert(mFs.good());
_IstreamInputStream = new IstreamInputStream(&mFs);
_CodedInputStream = new CodedInputStream(_IstreamInputStream);
}
template<class T>
bool ReadNext()
{
T msg;
unsigned __int32 size;
bool ret;
if ( ret = _CodedInputStream->ReadVarint32(&size) )
{
CodedInputStream::Limit msgLimit = _CodedInputStream->PushLimit(size);
if ( ret = msg.ParseFromCodedStream(_CodedInputStream) )
{
_CodedInputStream->PopLimit(msgLimit);
std::cout << mFeed << " FASReader ReadNext: " << msg.DebugString() << std::endl;
}
}
return ret;
}
~FASReader()
{
delete _CodedInputStream;
delete _IstreamInputStream;
mFs.close();
}
};
I ran into the same issue in both C++ and Python.
For the C++ version, I used a mix of the code Kenton Varda posted on this thread and the code from the pull request he sent to the protobuf team (because the version posted here doesn't handle EOF while the one he sent to github does).
#include <google/protobuf/message_lite.h>
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/io/coded_stream.h>
bool writeDelimitedTo(const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput)
{
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL)
{
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
}
else
{
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError())
return false;
}
return true;
}
bool readDelimitedFrom(google::protobuf::io::ZeroCopyInputStream* rawInput, google::protobuf::MessageLite* message, bool* clean_eof)
{
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
const int start = input.CurrentPosition();
if (clean_eof)
*clean_eof = false;
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size))
{
if (clean_eof)
*clean_eof = input.CurrentPosition() == start;
return false;
}
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit = input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
And here is my python2 implementation:
from google.protobuf.internal import encoder
from google.protobuf.internal import decoder
#I had to implement this because the tools in google.protobuf.internal.decoder
#read from a buffer, not from a file-like objcet
def readRawVarint32(stream):
mask = 0x80 # (1 << 7)
raw_varint32 = []
while 1:
b = stream.read(1)
#eof
if b == "":
break
raw_varint32.append(b)
if not (ord(b) & mask):
#we found a byte starting with a 0, which means it's the last byte of this varint
break
return raw_varint32
def writeDelimitedTo(message, stream):
message_str = message.SerializeToString()
delimiter = encoder._VarintBytes(len(message_str))
stream.write(delimiter + message_str)
def readDelimitedFrom(MessageType, stream):
raw_varint32 = readRawVarint32(stream)
message = None
if raw_varint32:
size, _ = decoder._DecodeVarint32(raw_varint32, 0)
data = stream.read(size)
if len(data) < size:
raise Exception("Unexpected end of file")
message = MessageType()
message.ParseFromString(data)
return message
#In place version that takes an already built protobuf object
#In my tests, this is around 20% faster than the other version
#of readDelimitedFrom()
def readDelimitedFrom_inplace(message, stream):
raw_varint32 = readRawVarint32(stream)
if raw_varint32:
size, _ = decoder._DecodeVarint32(raw_varint32, 0)
data = stream.read(size)
if len(data) < size:
raise Exception("Unexpected end of file")
message.ParseFromString(data)
return message
else:
return None
It might not be the best looking code and I'm sure it can be refactored a fair bit, but at least that should show you one way to do it.
Now the big problem: It's SLOW.
Even when using the C++ implementation of python-protobuf, it's one order of magnitude slower than in pure C++. I have a benchmark where I read 10M protobuf messages of ~30 bytes each from a file. It takes ~0.9s in C++, and 35s in python.
One way to make it a bit faster would be to re-implement the varint decoder to make it read from a file and decode in one go, instead of reading from a file and then decoding as this code currently does. (profiling shows that a significant amount of time is spent in the varint encoder/decoder). But needless to say that alone is not enough to close the gap between the python version and the C++ version.
Any idea to make it faster is very welcome :)
Just for completeness, I post here an up-to-date version that works with the master version of protobuf and Python3
For the C++ version it is sufficient to use the utils in delimited_message_utils.h, here a MWE
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/util/delimited_message_util.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
template <typename T>
bool writeManyToFile(std::deque<T> messages, std::string filename) {
int outfd = open(filename.c_str(), O_WRONLY | O_CREAT | O_TRUNC);
google::protobuf::io::FileOutputStream fout(outfd);
bool success;
for (auto msg: messages) {
success = google::protobuf::util::SerializeDelimitedToZeroCopyStream(
msg, &fout);
if (! success) {
std::cout << "Writing Failed" << std::endl;
break;
}
}
fout.Close();
close(outfd);
return success;
}
template <typename T>
std::deque<T> readManyFromFile(std::string filename) {
int infd = open(filename.c_str(), O_RDONLY);
google::protobuf::io::FileInputStream fin(infd);
bool keep = true;
bool clean_eof = true;
std::deque<T> out;
while (keep) {
T msg;
keep = google::protobuf::util::ParseDelimitedFromZeroCopyStream(
&msg, &fin, nullptr);
if (keep)
out.push_back(msg);
}
fin.Close();
close(infd);
return out;
}
For the Python3 version, building on #fireboot 's answer, the only thing thing that needed modification is the decoding of raw_varint32
def getSize(raw_varint32):
result = 0
shift = 0
b = six.indexbytes(raw_varint32, 0)
result |= ((ord(b) & 0x7f) << shift)
return result
def readDelimitedFrom(MessageType, stream):
raw_varint32 = readRawVarint32(stream)
message = None
if raw_varint32:
size = getSize(raw_varint32)
data = stream.read(size)
if len(data) < size:
raise Exception("Unexpected end of file")
message = MessageType()
message.ParseFromString(data)
return message
Was also looking for a solution for this. Here's the core of our solution, assuming some java code wrote many MyRecord messages with writeDelimitedTo into a file. Open the file and loop, doing:
if(someCodedInputStream->ReadVarint32(&bytes)) {
CodedInputStream::Limit msgLimit = someCodedInputStream->PushLimit(bytes);
if(myRecord->ParseFromCodedStream(someCodedInputStream)) {
//do your stuff with the parsed MyRecord instance
} else {
//handle parse error
}
someCodedInputStream->PopLimit(msgLimit);
} else {
//maybe end of file
}
Hope it helps.
Working with an objective-c version of protocol-buffers, I ran into this exact issue. On sending from the iOS client to a Java based server that uses parseDelimitedFrom, which expects the length as the first byte, I needed to call writeRawByte to the CodedOutputStream first. Posting here to hopegully help others that run into this issue. While working through this issue, one would think that Google proto-bufs would come with a simply flag which does this for you...
Request* request = [rBuild build];
[self sendMessage:request];
}
- (void) sendMessage:(Request *) request {
//** get length
NSData* n = [request data];
uint8_t len = [n length];
PBCodedOutputStream* os = [PBCodedOutputStream streamWithOutputStream:outputStream];
//** prepend it to message, such that Request.parseDelimitedFrom(in) can parse it properly
[os writeRawByte:len];
[request writeToCodedOutputStream:os];
[os flush];
}
Since I'm not allowed to write this as a comment to Kenton Varda's answer above; I believe there is a bug in the code he posted (as well as in other answers which have been provided). The following code:
...
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
sets an incorrect limit because it does not take into account the size of the varint32 which has already been read from input. This can result in data loss/corruption as additional bytes are read from the stream which may be part of the next message. The usual way of handling this correctly is to delete the CodedInputStream used to read the size and create a new one for reading the payload:
...
uint32_t size;
{
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
if (!input.ReadVarint32(&size)) return false;
}
google::protobuf::io::CodedInputStream input(rawInput);
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
You can use getline for reading a string from a stream, using the specified delimiter:
istream& getline ( istream& is, string& str, char delim );
(defined in the header)

Categories

Resources