Passing a byte array from java to python on Android using Chaquopy - java

I'm running an android camera app and I would like to do the image processing in Python. To test this, I want to pass a single image frame to a python function, divide all values by 2 using integer division and return the result.
To that end, I have the following code.
In Java:
public void onCapturedImage(Image image)
{
    Image.Plane[] tmp = image.getPlanes();
    byte[] bytes = null;
    ByteBuffer buffer = tmp[0].getBuffer();
    buffer.rewind();
    bytes = new byte[buffer.remaining()];
    buffer.get(bytes, 0, buffer.remaining());
    buffer.rewind();
    Log.d(TAG, "start python section");
    // assume python.start() is elsewhere
    Python py = Python.getInstance();
    PyObject array1 = PyObject.fromJava(bytes);
    Log.d(TAG, "get python module");
    PyObject py_module = py.getModule("mymod");
    Log.d(TAG, "call pic func");
    byte[] result = py_module.callAttr("pic_func", array1).toJava(byte[].class);
    // compare the values at some random location to make sure the result is as expected
    Log.d(TAG, "Compare: " + Byte.toString(bytes[33]) + " and " + Byte.toString(result[33]));
    Log.d(TAG, "DONE");
}
In python, I have the following:
import numpy as np

def pic_func(o):
    a = np.array(o)
    b = a//2
    return b.tobytes()
I have several issues with this code.
It does not behave as expected - the value at location 33 is not half. I probably have a mix-up with the byte values, but I'm not sure what's going on exactly. The same code without "tobytes" and using a python list rather than a numpy array does work as expected.
Passing parameters - not sure what happens under the hood. Is it pass by value or by reference? Is the array being copied, or just a pointer being passed around?
It is SLOW. It takes about 90 seconds to compute this operation over 12 million values. Any pointers on speeding this up?
Thanks!

Your last two questions are related, so I'll answer them together.
PyObject array1 = PyObject.fromJava(bytes)
py_module.callAttr("pic_func", array1)
This passes by reference: the Python code receives a jarray object which accesses the original array.
np.array(o)
As of Chaquopy 8.x, this is a direct memory copy when o is a Java primitive array, so performance shouldn't be a problem. On older versions of Chaquopy, you can avoid a slow element-by-element copy by converting to a Python bytes object first, which can be done in either language:
In Java: PyObject array1 = py.getBuiltins().callAttr("bytes", bytes)
Or in Python: np.array(bytes(o))
b.tobytes()
toJava(byte[].class)
Both of these expressions also make a copy, but again these are direct memory copies, so performance shouldn't be a problem.
As for it returning the wrong answer, I think that's probably because NumPy is using its default data type of float64. When calling np.array, you should specify the data type explicitly by passing dtype=np.int8 or dtype=np.uint8. (If you search for byte[] in the Chaquopy documentation you'll find the exact details of how signed/unsigned conversion works, but it's probably easier just to try both and see which one gives the answer you expect.)
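To see why the choice between a signed and an unsigned 8-bit type changes the result, here is a minimal sketch. It deliberately uses the standard-library array module instead of NumPy so that it runs anywhere; the type codes "b" and "B" play the roles of np.int8 and np.uint8, and the helper name halve is made up for this illustration.

```python
from array import array

def halve(raw: bytes, typecode: str) -> bytes:
    """Interpret raw as 8-bit integers, floor-divide each by 2, return bytes.

    typecode "b" means signed (like np.int8), "B" means unsigned (like np.uint8).
    """
    values = array(typecode, raw)
    return array(typecode, (v // 2 for v in values)).tobytes()

raw = bytes([100, 200])        # 200 is 0xC8, which reads as -56 when signed
print(list(halve(raw, "B")))   # unsigned view: 200 // 2 = 100
print(list(halve(raw, "b")))   # signed view: -56 // 2 = -28, stored as byte 228
```

Java's byte type is signed, so a value logged on the Java side as -56 is the same bit pattern as 200 in an unsigned view. Which interpretation is "half" depends on what the pixel values are supposed to mean.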

Related

c++ (u256)*(h256 const*)(char*[] + int) cast rewriting to java

I need to rewrite some code from C++ to Java and I've run into trouble with the following C++ code:
using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
using h256 = FixedHash<32>;
using bytes = std::vector<byte>;
uint32_t offset = ...;
bytes m_data = ...;
u256 result;
result = (u256)*(h256 const*)(m_data.data() + (size_t)offset);
I have no idea what's going on here or how to rewrite it in Java.
As I understand it, we first apply an offset so that we point at some element of the m_data array, then cast that to an array of type h256 (I watched this in the debugger, and the cast did the following: we get the data from 0 to offset from m_data and then cast it to a 32-element array with leading zeros).
Then we take the first value (I'm not sure about this) of that array and cast it to u256? But the first value after the (h256 const*) cast is zero, yet the resulting value is not zero.
Do you have any ideas?
I don't know what a u256 is, and the question omits the typedef, but this is the typical way in C to get a scalar type (int16_t, int32_t, int64_t, double, ...) from a buffer in memory.
Essentially, the syntax:
type t = (type)*(const type *)(buffer + offset)
... lets you obtain an object of a specific type from a byte array, starting at a particular index.
It's not very safe, but it's blazing fast once compiled to assembly!
NOTE: the pointer arithmetic depends on the declared type of "buffer": if it's int8_t *, for instance, the value is read starting at the "offset"-th byte; if it's int32_t *, it is read starting at the "offset * 4"-th byte.
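The same idea, reading a fixed-size value out of a byte buffer at a given offset, can be sketched in Python. This is only an illustration: the helper name read_u256 is made up, and the big-endian byte order is an assumption about how the 32-byte hash is turned into a number, which the code in the question does not show.

```python
def read_u256(data: bytes, offset: int) -> int:
    """Read 32 bytes starting at offset and interpret them as one unsigned integer.

    Assumption: big-endian byte order (most significant byte first).
    """
    chunk = data[offset:offset + 32]
    if len(chunk) != 32:
        raise ValueError("need 32 bytes starting at offset")
    return int.from_bytes(chunk, "big")

# 4 bytes to skip, then a 32-byte value with leading zeros, then unrelated data
data = bytes(4) + bytes(31) + b"\x2a" + b"trailing"
print(read_u256(data, 4))
```

Note that the first byte of the 32-byte window is zero while the resulting number is not, which matches the observation in the question. In Java, the constructor BigInteger(int signum, byte[] magnitude) performs the same big-endian interpretation on a copied sub-array.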

Using Python Tensor of TensorFlow in Java

I have a TensorFlow program running in Python, and for convenience I want to run the same program in Java, so I have to save my model and load it in my Java application.
My problem is that I don't know how to save a Tensor object. Here is my code:
class Main:
    def __init__(self, checkpoint):
        ...
        self.g = tf.Graph()
        self.sess = tf.Session()
        self.img_placeholder = tf.placeholder(tf.float32,
            shape=(1, 679, 1024, 3), name='img_placeholder')
        # self.preds is an instance of Tensor
        self.preds = transform(self.img_placeholder)
        self.saver = tf.train.Saver()
        self.saver.restore(self.sess, checkpoint)

    def ffwd(...):
        ...
        _preds = self.sess.run(self.preds, feed_dict=
            {self.img_placeholder: self.X})
        ...
So since I can't create my Tensor (the transform function creates the NN behind the scenes...), I am obliged to save it and reload it in Java. I have found ways of saving the session but not Tensor instances.
Could someone give me some insight into how to achieve this?
Python Tensor objects are symbolic references to a specific output of an operation in the graph.
An operation in a graph can be uniquely identified by its string name. A specific output of that operation is identified by an integer index into the list of outputs of that operation. That index is typically zero since a vast majority of operations produce a single output.
To obtain the name of an Operation and the output index referred to by a Tensor object in Python you could do something like:
print(preds.op.name)
print(preds.value_index) # Most likely will be 0
And then in Java, you can feed/fetch nodes by name.
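Let's say preds.op.name returned the string foo, and preds.value_index returned the integer 1. Then in Java you'd do something like the following, where inputTensor is a placeholder name for whatever Tensor you feed in as the image:
session.runner().feed("img_placeholder", inputTensor).fetch("foo", 1)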
(See javadoc for org.tensorflow.Session.Runner for details).
You may find the slides linked to in https://github.com/tensorflow/models/tree/master/samples/languages/java along with the speaker notes in those slides useful.
Hope that helps.

Hashing raw bytes in Python and Java produces different results

I'm trying to replicate the behavior of a Python 2.7 function in Java, but I'm getting different results when running a (seemingly) identical sequence of bytes through a SHA-256 hash. The bytes are generated by manipulating a very large integer (exactly 2048 bits long) in a specific way (2nd line of my Python code example).
For my examples, the original 2048-bit integer is stored as big_int and bigInt in Python and Java respectively, and both variables contain the same number.
Python2 code I'm trying to replicate:
raw_big_int = ("%x" % big_int).decode("hex")
buff = struct.pack(">i", len(raw_big_int) + 1) + "\x00" + raw_big_int
pprint("Buffer contains: " + buff)
pprint("Encoded: " + buff.encode("hex").upper())
digest = hashlib.sha256(buff).digest()
pprint("Digest contains: " + digest)
pprint("Encoded: " + digest.encode("hex").upper())
Running this code prints the following (note that the only result I'm actually interested in is the last one - the hex-encoded digest. The other 3 prints are just to see what's going on under the hood):
'Buffer contains: \x00\x00\x01\x01\x00\xe3\xbb\xd3\x84\x94P\xff\x9c\'\xd0P\xf2\xf0s,a^\xf0i\xac~\xeb\xb9_\xb0m\xa2&f\x8d~W\xa0\xb3\xcd\xf9\xf0\xa8\xa2\x8f\x85\x02\xd4&\x7f\xfc\xe8\xd0\xf2\xe2y"\xd0\x84ck\xc2\x18\xad\xf6\x81\xb1\xb0q\x19\xabd\x1b>\xc8$g\xd7\xd2g\xe01\xd4r\xa3\x86"+N\\\x8c\n\xb7q\x1c \x0c\xa8\xbcW\x9bt\xb0\xae\xff\xc3\x8aG\x80\xb6\x9a}\xd9*\x9f\x10\x14\x14\xcc\xc0\xb6\xa9\x18*\x01/eC\x0eQ\x1b]\n\xc2\x1f\x9e\xb6\x8d\xbfb\xc7\xce\x0c\xa1\xa3\x82\x98H\x85\xa1\\\xb2\xf1\'\xafmX|\x82\xe7%\x8f\x0eT\xaa\xe4\x04*\x91\xd9\xf4e\xf7\x8c\xd6\xe5\x84\xa8\x01*\x86\x1cx\x8c\xf0d\x9cOs\xebh\xbc1\xd6\'\xb1\xb0\xcfy\xd7(\x8b\xeaIf6\xb4\xb7p\xcdgc\xca\xbb\x94\x01\xb5&\xd7M\xf9\x9co\xf3\x10\x87U\xc3jB3?vv\xc4JY\xc9>\xa3cec\x01\x86\xe9c\x81F-\x1d\x0f\xdd\xbf\xe8\xe9k\xbd\xe7c5'
'Encoded: 0000010100E3BBD3849450FF9C27D050F2F0732C615EF069AC7EEBB95FB06DA226668D7E57A0B3CDF9F0A8A28F8502D4267FFCE8D0F2E27922D084636BC218ADF681B1B07119AB641B3EC82467D7D267E031D472A386222B4E5C8C0AB7711C200CA8BC579B74B0AEFFC38A4780B69A7DD92A9F101414CCC0B6A9182A012F65430E511B5D0AC21F9EB68DBF62C7CE0CA1A382984885A15CB2F127AF6D587C82E7258F0E54AAE4042A91D9F465F78CD6E584A8012A861C788CF0649C4F73EB68BC31D627B1B0CF79D7288BEA496636B4B770CD6763CABB9401B526D74DF99C6FF3108755C36A42333F7676C44A59C93EA36365630186E96381462D1D0FDDBFE8E96BBDE76335'
'Digest contains: Q\xf9\xb9\xaf\xe1\xbey\xdc\xfa\xc4.\xa9 \xfckz\xfeB\xa0>\xb3\xd6\xd0*S\xff\xe1\xe5*\xf0\xa3i'
'Encoded: 51F9B9AFE1BE79DCFAC42EA920FC6B7AFE42A03EB3D6D02A53FFE1E52AF0A369'
Now, below is my Java code so far. When I test it, I get the same value for the input buffer, but a different value for the digest. (bigInt contains a BigInteger object containing the same number as big_int in the Python example above)
byte[] rawBigInt = bigInt.toByteArray();
ByteBuffer buff = ByteBuffer.allocate(rawBigInt.length + 4);
buff.order(ByteOrder.BIG_ENDIAN);
buff.putInt(rawBigInt.length).put(rawBigInt);
System.out.print("Buffer contains: ");
System.out.println( DatatypeConverter.printHexBinary(buff.array()) );
MessageDigest hash = MessageDigest.getInstance("SHA-256");
hash.update(buff);
byte[] digest = hash.digest();
System.out.print("Digest contains: ");
System.out.println( DatatypeConverter.printHexBinary(digest) );
Notice that in my Python example, I started the buffer off with len(raw_big_int) + 1 packed, where in Java I started with just rawBigInt.length. I also omitted the extra 0-byte ("\x00") when writing in Java. I did both of these for the same reason - in my tests, calling toByteArray() on a BigInteger returned a byte array already beginning with a 0-byte that was exactly 1 byte longer than Python's byte sequence. So, at least in my tests, len(raw_big_int) + 1 equaled rawBigInt.length, since rawBigInt began with a 0-byte and raw_big_int did not.
Alright, that aside, here is the Java code's output:
Buffer contains: 0000010100E3BBD3849450FF9C27D050F2F0732C615EF069AC7EEBB95FB06DA226668D7E57A0B3CDF9F0A8A28F8502D4267FFCE8D0F2E27922D084636BC218ADF681B1B07119AB641B3EC82467D7D267E031D472A386222B4E5C8C0AB7711C200CA8BC579B74B0AEFFC38A4780B69A7DD92A9F101414CCC0B6A9182A012F65430E511B5D0AC21F9EB68DBF62C7CE0CA1A382984885A15CB2F127AF6D587C82E7258F0E54AAE4042A91D9F465F78CD6E584A8012A861C788CF0649C4F73EB68BC31D627B1B0CF79D7288BEA496636B4B770CD6763CABB9401B526D74DF99C6FF3108755C36A42333F7676C44A59C93EA36365630186E96381462D1D0FDDBFE8E96BBDE76335
Digest contains: E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855
As you can see, the buffer contents appear the same in both Python and Java, but the digests are obviously different. Can someone point out where I'm going wrong?
I suspect it has something to do with the strange way Python seems to store bytes - the variables raw_big_int and buff show as type str in the interpreter, and when printed out by themselves have that strange format with the '\x's that is almost the same as the bytes themselves in some places, but is utter gibberish in others. I don't have enough Python experience to understand exactly what's going on here, and my searches have turned up fruitless.
Also, since I'm trying to port the Python code into Java, I can't just change the Python - my goal is to write Java code that takes the same input and produces the same output. I've searched around (this question in particular seemed related) but didn't find anything to help me out. Thanks in advance, if for nothing else than for reading this long-winded question! :)
In Java, you've got the data in the buffer, but the cursor positions are all wrong. After you've written your data to the ByteBuffer it looks like this, where the x's represent your data and the 0's are unwritten bytes in the buffer:
xxxxxxxxxxxxxxxxxxxx00000000000000000000000000000000000000000
                    ^ position                               ^ limit
The cursor is positioned after the data you've written. A read at this point will read from position to limit, which is the bytes you haven't written.
Instead, you want this:
xxxxxxxxxxxxxxxxxxxx00000000000000000000000000000000000000000
^ position          ^ limit
where the position is 0 and the limit is the number of bytes you've written. To get there, call flip(). Flipping a buffer conceptually switches it from write mode to read mode. I say "conceptually" because ByteBuffers don't have explicit read and write modes, but you should think of them as if they do.
(The opposite operation is compact(), which goes back to write mode.)
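The digest printed by the Java code in the question is in fact the SHA-256 of empty input, which is consistent with the explanation above: the hash saw zero remaining bytes. Here is a rough Python analogy, using io.BytesIO as a stand-in for ByteBuffer (it also has a cursor, though it is not the same API) and a made-up payload:

```python
import hashlib
import io

payload = b"\x00\x00\x01\x01payload"     # made-up data for illustration

buf = io.BytesIO()
buf.write(payload)                        # the cursor now sits after the written data

unread = buf.read()                       # reads from the cursor to the end: nothing
empty_digest = hashlib.sha256(unread).hexdigest().upper()

buf.seek(0)                               # move the cursor back, roughly what flip() achieves
full_digest = hashlib.sha256(buf.read()).hexdigest().upper()

print(empty_digest)
print(full_digest == hashlib.sha256(payload).hexdigest().upper())
```

The first line printed is the same E3B0C442... value as the Java digest in the question. Once the cursor is moved back to the start, the digest covers the actual data.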

how to convert php unpack() in a similar method in Java

I've no coding experience in PHP at all. But while looking for a solution for my Java project, i found an example of the problem in PHP, which incidentally is alien to me.
Can anyone please explain the working and the result of the unpack('N*',"string") function of PHP and how to implement it in Java?
An example would help me a lot!
Thanks!
In PHP (and in Perl, where PHP copied it from), unpack("N*", ...) takes a string (actually representing a sequence of bytes) and parses each 4-byte segment of it as an unsigned 32-bit big-endian ("Network byte order") integer, returning them in an array.
There are several ways to do the same in Java, but one way would be to wrap the input byte array in a java.nio.ByteBuffer, convert it to an IntBuffer and then read the integers from that:
public static int[] unpackNStar ( byte[] bytes ) {
    // first, wrap the input array in a ByteBuffer:
    ByteBuffer byteBuf = ByteBuffer.wrap( bytes );
    // then turn it into an IntBuffer, using big-endian ("Network") byte order:
    byteBuf.order( ByteOrder.BIG_ENDIAN );
    IntBuffer intBuf = byteBuf.asIntBuffer();
    // finally, dump the contents of the IntBuffer into an array
    int[] integers = new int[ intBuf.remaining() ];
    intBuf.get( integers );
    return integers;
}
Of course, if you just want to iterate over the integers, you don't really need the IntBuffer or the array:
ByteBuffer buf = ByteBuffer.wrap( bytes );
buf.order( ByteOrder.BIG_ENDIAN );
while ( buf.hasRemaining() ) {
    int num = buf.getInt();
    // do something with num...
}
In fact, iterating over a ByteBuffer like this is a convenient way to emulate the behavior of even more complicated examples of unpack() in Perl or PHP.
(Disclaimer: I have not tested this code. I believe it should work, but it's always possible that I may have mistyped or misunderstood something. Please test before using.)
Ps. If you're reading the bytes from an input stream, you could also wrap it in a DataInputStream and use its readInt() method. Of course, it's also possible to use a ByteArrayInputStream to read the input from a byte array, achieving the same results as the ByteBuffer examples above.
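For comparison, here is a short Python sketch of what unpack("N*", ...) computes, using the standard struct module. The helper name unpack_n_star is made up, and ignoring trailing bytes that do not fill a 4-byte group is an assumption made here, not something taken from the PHP documentation.

```python
import struct

def unpack_n_star(data: bytes) -> list:
    """Parse data as consecutive unsigned 32-bit big-endian integers.

    Assumption: trailing bytes that do not fill a 4-byte group are ignored.
    """
    count = len(data) // 4
    return list(struct.unpack(">%dI" % count, data[:count * 4]))

data = bytes([0, 0, 0, 1, 0xFF, 0xFF, 0xFF, 0xFF])
print(unpack_n_star(data))          # unsigned view: [1, 4294967295]
print(struct.unpack(">2i", data))   # signed view, as a Java int sees it: (1, -1)
```

The second print shows why the Java versions above can return negative numbers: a Java int is signed, so values of 2^31 and above wrap around. Masking with num & 0xFFFFFFFFL into a long recovers the unsigned value.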

reading data from Matlab into Java

I'm trying to read a matrix produced in Matlab into a 2D array in java.
I've been using jmatio so far for writing from java to a .mat file (successfully), but now can't manage to go the other way around.
I've managed to import a matrix into an MLArray object using this code:
matfilereader = new MatFileReader("filename.mat");
MLArray j = matfilereader.getMLArray("dataname");
But other than getting its string representation, I couldn't manage to access the data itself. I found no example for this or documentation on the library itself, and I actually wrote a function to parse the entire string into a double[][] array, but that's only good if the matrix is smaller than 1000 items...
Would be grateful for any experience or tips,
thanks,
Amir
MLArray, the type returned by matfilereader.getMLArray, has several subclasses for accessing different kinds of data.
To work with a double array, you can cast the MLArray to MLDouble:
MLDouble j = (MLDouble)matfilereader.getMLArray("dataname");
I'm not familiar with that tool, but it's pretty old. Try saving to an older version of the *.mat file format and see if your results change. That is, add either the '-v7.0' or '-v6' flag when you save your *.mat file.
Example code:
save filename var1 var2 -v7.0
or
save filename var1 var2 -v6
