e.g. I save the tuple T = {k1, v1, k2, v2} to Redis from the Erlang side (via eredis):
eredis:q(Conn, ["SET", <<"mykey">>, term_to_binary(T)]).
I am trying to use the Java code below to read this Erlang term:
Jedis j = Redis.pool.getResource();
byte[] t = j.get("mykey").getBytes();
OtpInputStream ois = new OtpInputStream(t);
System.out.println(OtpErlangObject.decode(ois));
The error is: com.ericsson.otp.erlang.OtpErlangDecodeException: Uknown data type: 239.
So how can I read the Erlang term correctly?
Erlang side:
term_to_binary({k1, v1, k2, v2}).
<<131,104,4,100,0,2,107,49,100,0,2,118,49,100,0,2,107,50,
100,0,2,118,50>>
Java side:
j.get("mykey").getBytes():
-17 -65 -67 104 4 100 0 2 107 49 100 0 2 118 49 100 0 2 107 50 100 0 2 118 50.
It seems that only the first 3 bytes are different, so I changed them to byte(131), and then the term printed correctly with System.out.println(OtpErlangObject.decode(ois)).
But when the term is more complicated, such as a record with a list inside, it won't work, because extra characters appear not only at the start of the data but also in the middle and at the end.
Why is the data I saved different from what I got back?
The negative numbers at the beginning of the byte array are not valid values for erlang external term syntax.
I would assume that since you have been storing the erlang terms in redis this way for some time, you are inserting them correctly.
That really only leaves one thing: When you call getBytes() your encoding is off, it is most likely using whatever encoding is set as the default on your system (probably UTF-8, but I'm not sure). Really what you want to do is pass a different encoding to getBytes(), probably like this: getBytes("US-ASCII").
Check out the documentation for which encodings are available.
Here's a link on SO that shows how to convert a string to an ASCII byte array.
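As an alternative sketch that sidesteps the String round-trip entirely (an assumption about your setup, not the only possible fix): Jedis also exposes a binary get(byte[]) overload, so the term_to_binary() bytes never pass through a charset conversion and the leading 131 tag survives intact. The connection setup below is a placeholder; substitute your pool's getResource().
import com.ericsson.otp.erlang.OtpInputStream;
import java.nio.charset.StandardCharsets;
import redis.clients.jedis.Jedis;

public class ReadErlangTerm {
    public static void main(String[] args) throws Exception {
        Jedis j = new Jedis("localhost"); // assumed connection details
        byte[] t = j.get("mykey".getBytes(StandardCharsets.US_ASCII)); // binary GET, no String in between
        OtpInputStream ois = new OtpInputStream(t);
        System.out.println(ois.read_any()); // decodes the external term format, e.g. {k1,v1,k2,v2}
        j.close();
    }
}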
### Update 2 (newest)
Here's the situation:
A foreign application is storing zlib deflated (compressed) data in this format:
78 9C BC (...data...) 00 00 FF FF - let's call it DATA1
If I take original XML file and deflate it in Java or Tcl, I get:
78 9C BD (...data...) D8 9F 29 BB - let's call it DATA2
The last 4 bytes in DATA2 are definitely the Adler-32 checksum; in DATA1 they are replaced with the zlib FULL-SYNC marker (why? I have no idea).
The 3rd byte differs by a value of 1.
The (...data...) is equal between DATA1 and DATA2.
Now the most interesting part: if I update DATA1 by changing the 3rd byte from BC to BD, leave the last 8 bytes untouched (so 0000FFFF), and inflate this data with new Inflater(true) (https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/zip/Inflater.html#%3Cinit%3E(boolean)), I am able to decode it correctly, because the Inflater in this mode requires neither the zlib header nor the Adler-32 checksum.
Questions:
Why does changing BC to BD work? Is it safe to do in all cases? I checked a few cases and it worked each time.
Why would any application output an incorrect (?) deflate value of BC at all?
Why would the application start with a zlib header (78 9C) but not produce a compliant zlib structure (a SYNC-FLUSH marker instead of Adler-32)? It's not a small hobby application, but a widely used business app (I would say tens of thousands of business users).
### Update 1 (old)
After further analysis it seems that I have a zlib-compressed byte array that misses the final checksum (adler32).
According to RFC 1950, the correct zlib format must end with the Adler-32 checksum, but for some reason the dataset I work with has zlib data that is missing that checksum. It always ends with 00 00 FF FF, which in the zlib format is a SYNC-FLUSH marker. For a complete zlib stream, the Adler-32 should follow, but there is none.
Still it should be possible to inflate such data, right?
As mentioned earlier (in the original question below), I've tried to pass this byte array to the Java Inflater (and to Tcl's, too), with no luck. Somehow the application that produces these bytes is able to read them correctly (as also mentioned below).
How can I decompress it?
Original question, before update:
Context
There is an application (closed source) that connects to MS SQL Server and stores a compressed XML document there in a column of the image type. This application, when requested, can export the document into a regular XML file on the local disk, so I have access to both the plain-text XML data and the compressed data directly in the database.
The problem
I'd like to be able to decompress any value from this column using my own code connecting to the SQL Server.
The problem is that it is some kind of weird zlib format. It does start with the typical zlib header bytes (78 9C), but I'm unable to decompress it (I used the method described at Java Decompress a string compressed with zlib deflate).
The whole data looks like 789CBC58DB72E238...7E526E7EFEA5E3D5FF0CFE030000FFFF (of course dots mean more bytes inside - total of 1195).
What I've tried already
What caught my attention was the ending 0000FFFF, but even if I truncate it, the decompression still fails. I also tried truncating every possible number of bytes from the end (in a loop, chopping one more byte per iteration); none of those attempts worked either.
I also compressed the original XML file into zlib bytes to see what it looks like, and apart from the 2 zlib header bytes and maybe 5-6 bytes after them, the rest of the data was different. The number of output bytes was also slightly smaller (roughly 1180 vs. 1195 bytes).
The difference on the deflate side is that the foreign application is using Z_SYNC_FLUSH or Z_FULL_FLUSH to flush the provided data so far to the compressed stream. You are (correctly) using Z_FINISH to end the stream. In the first case you end up with a partial deflate stream that is not terminated and has no check value. Instead it just ends with an empty stored block, which results in the 00 00 ff ff bytes at the end. In the second case you end up with a complete deflate stream and a zlib trailer with the check value. In that case, there happens to be a single deflate block (the data must be relatively small), so the first block is the last block, and is marked as such with a 1 as the low bit of the first byte.
What you are doing is setting the last block bit on the first block. This will in general not always work, since the stream may have more than one block. In that case, some other bit in the middle of the stream would need to be set.
I'm guessing that what you are getting is part, but not all of the compressed data. There is a flush to permit transmission of the data so far, but that would normally be followed by continued compression and more such flushed packets.
(Same question as #2, with the same answer.)
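For illustration, a minimal Java sketch of inflating such a partial stream (assuming, as above, that it starts with the 78 9C zlib header and ends with the 00 00 FF FF empty stored block): strip the 2-byte header and inflate in raw "nowrap" mode, which needs neither a final block nor the Adler-32 check value, so no bytes have to be patched.
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.Inflater;

public class PartialZlib {
    static byte[] inflatePartial(byte[] zlibBytes) throws Exception {
        Inflater inflater = new Inflater(true); // raw deflate: no zlib header or trailer expected
        inflater.setInput(Arrays.copyOfRange(zlibBytes, 2, zlibBytes.length)); // drop 78 9C
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0) break; // a sync-flushed stream never "finishes"; stop when input is exhausted
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}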
I have an ASCII .DAT file that contains multiple records of a fixed length, and I would like to read each record and generate output based on certain portions of its contents.
So far my program does exactly this, but I was alerted to the fact that the first field of each .DAT file holds the record length and the number of records. The only issue I am having is reading this first field and extracting its data in a usable form, because the values are stored as raw bytes that display as ASCII characters rather than as decimal numbers.
Below is a code snippet in BASIC that reads the same file and extracts the initial data required:
CLS
INPUT "Survey System Data File? : ", survey$
survey$ = "f:\apps\survey\" + survey$
reclen = 3004
OPEN survey$ + ".dat" FOR RANDOM AS 1 LEN = reclen
FIELD #1, 3 AS RL$, 9 AS n$
GET #1, 1
RL = CVI(RL$): n = CVI(n$)
PRINT "Record Length = "; RL
reclen = RL
PRINT "Number of Records = "; n
CLOSE #1
Is there a way of doing something similar in Java?
The initial record and the second record are shown below. The second record starts from 0001511.
#Å Õ 000151115 2 351228 6 8131720 1121211 12111121121111111112112111 Treat people fairly. Motivated people who go the extra mile should be recognised. Trust employees to make decisions and find out what is best for the business. Examine the workload and the performance and timing of the work. 11 6 5 6 5 2003/10/007:12 21 111 2 1154 1 1 113 1 1 1 1 1 4000100 0 0 0 400 0 0 0 400 4100.0000.0000.0 0 0 10 24 12111none 9 1346
As you can see, the initial characters are ASCII characters, not the decimal values I'm looking for.
Many thanks in advance for the help.
I have found a way around this issue: the initial record of the file is basically a blank indicator record, so using this initial record's length I was able to determine the recurring record length of the others.
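For reference, a rough Java equivalent of the BASIC header read above, under the assumption (implied by the CVI calls) that the record length and record count are little-endian 16-bit integers at the start of the 3-byte and 9-byte fields; the file path is a placeholder.
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DatHeader {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("survey.dat", "r")) {
            byte[] header = new byte[12]; // 3-byte RL$ field followed by 9-byte N$ field
            raf.readFully(header);
            ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
            int recordLength = buf.getShort(0) & 0xFFFF;    // CVI(RL$): first 2 bytes of the file
            int numberOfRecords = buf.getShort(3) & 0xFFFF; // CVI(N$): first 2 bytes of the next field
            System.out.println("Record Length = " + recordLength);
            System.out.println("Number of Records = " + numberOfRecords);
        }
    }
}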
What's the proper way of using the MifareUltralight.writePage() method?
Querying getMaxTransceiveLength() returns 253 bytes, yet the tag is advertised as holding 888 bytes. Are transceive(), and thus writePage(), supposed to be invoked multiple times? The payload being written has a size of 457 bytes.
val jsonString = Gson().toJson(casualty)
val casualtyBytes = toBlob(casualty)
var currentPage = PAGE_OFFSET
val pageBytes = ByteArray(MifareUltralight.PAGE_SIZE)
var byteIndex = 0
for(i in 0 until casualtyBytes.size){
pageBytes[byteIndex] = casualtyBytes[i]
byteIndex++
if(byteIndex == 4 || i == (casualtyBytes.size-1)) {
tag.writePage(currentPage, pageBytes)
currentPage++
byteIndex = 0
}
}
fun toBlob(item : Any) : ByteArray{
val bos = ByteArrayOutputStream()
val gzip = GZIPOutputStream(bos) //compress
val oos = ObjectOutputStream(gzip)
oos.writeObject(item)
oos.close()
return bos.toByteArray()
}
Exception
java.io.IOException: Transceive failed
at android.nfc.TransceiveResult.getResponseOrThrow(TransceiveResult.java:52)
at android.nfc.tech.BasicTagTechnology.transceive(BasicTagTechnology.java:151)
at android.nfc.tech.MifareUltralight.writePage(MifareUltralight.java:193)
at some.package.nfc.NfcCasualtyPublisher.writeToTag(NfcCasualtyPublisher.kt:42)
at some.package.nfc.NfcCasualtyPublisher.access$writeToTag(NfcCasualtyPublisher.kt:11)
at some.package.nfc.NfcCasualtyPublisher$publishCasualty$1.run(NfcCasualtyPublisher.kt:21)
at java.lang.Thread.run(Thread.java:818)
The memory of MIFARE Ultralight and NTAG tags is organized in pages of 4 bytes each. Consequently, the WRITE command (MifareUltralight.writePage()) writes 4 bytes at a time. (Note that the READ command (MifareUltralight.readPages()) reads 4 pages (= 16 bytes) at a time.)
Therefore, when you want to write to your NTAG216 tag, you need to split the data into chunks of 4 bytes. You seem to already do that with the for-loop in your code (though you'll run into some issues since you do not clear the unused bytes of the last page if your data is not page-aligned).
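To illustrate that point, a small Java sketch (assumptions: user memory starts at page 4 and the data fits) of a write loop that zero-pads the last page instead of reusing stale bytes:
import android.nfc.tech.MifareUltralight;
import java.io.IOException;
import java.util.Arrays;

class PageWriter {
    // Write data into user memory, 4 bytes per page, zero-padding the final chunk.
    static void writeUserData(MifareUltralight tag, byte[] data) throws IOException {
        final int firstUserPage = 4; // user memory of NTAG21x / MIFARE Ultralight starts at page 4
        for (int i = 0; i < data.length; i += MifareUltralight.PAGE_SIZE) {
            // copyOfRange zero-pads past the end of data, so no stale bytes reach the tag
            byte[] page = Arrays.copyOfRange(data, i, i + MifareUltralight.PAGE_SIZE);
            tag.writePage(firstUserPage + i / MifareUltralight.PAGE_SIZE, page);
        }
    }
}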
Not all the pages of a MIFARE Ultralight/NTAG tag are freely usable for data storage. Only the user memory area in pages 4 to 225 (for NTAG216) is. The first 2 pages (pages 0 and 1) are read-only and reserved for the tag serial number. The next 2 pages (pages 2 and 3) contain write-once memory (i.e. memory areas where a bit that is once written to 1 can't be changed to 0 again). Specifically, there are the lock-bits in page 2 (also in page 226, but you shouldn't have touched them if your data is only 457 bytes). If you set the lock-bits, you prevent write access to parts of the user memory pages, which would result in a "Transceive failed" exception. Consequently, if the value of PAGE_OFFSET is less than 4, you probably rendered the tag unusable by writing data to reserved memory areas.
In general, if you only intend to store (freely-readable) data and won't make use of additional features of the tag (such as password protection), I would strongly suggest that you do not use low-level IO methods for accessing NFC tags. Instead, stick to the NDEF abstraction layer and store your data in NDEF records. Android will then take care of putting the data into appropriate memory locations on any NFC tag.
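A minimal sketch of that NDEF approach (the MIME type is a made-up example, and the tag is assumed to already be NDEF-formatted):
import android.nfc.NdefMessage;
import android.nfc.NdefRecord;
import android.nfc.Tag;
import android.nfc.tech.Ndef;

class NdefWriter {
    static void writeNdef(Tag tag, byte[] payload) throws Exception {
        NdefRecord record = NdefRecord.createMime("application/vnd.example.casualty", payload);
        NdefMessage message = new NdefMessage(record);
        Ndef ndef = Ndef.get(tag); // returns null if the tag is not NDEF-formatted
        ndef.connect();
        try {
            ndef.writeNdefMessage(message); // fails if the message exceeds ndef.getMaxSize()
        } finally {
            ndef.close();
        }
    }
}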
Finally, the transceive length is the number of bytes that can be transferred in one command or response. So, for instance, for a WRITE command this would be 6 bytes in total (4 bytes of data payload, one address byte, and one command code byte). For a READ response this would be the 16 bytes of data payload. The value of getMaxTransceiveLength() indicates the maximum transceive length theoretically possible with the underlying libraries, HAL, and hardware.
Details: read data from EEPROM -> output to Tera Term -> save off the log file -> parse through it with a Java program.
What I have: all EEPROM reads are good. I take each hex value I read and, using sprintf (in Atmel Studio), turn each byte into its two ASCII hex characters. Then I send this out to Tera Term. The output is as follows:
00=00=00=c5=03=76=00=01=00=05=00=cf=00=01=fa=ef=
00=00=00=c6=00=44=00=01=00=05=00=cf=00=00=fe=21=
00=00=00=c8=02=41=00=01=00=05=00=d0=00=01=fc=20=
etc...
I can then parse through it in this manner, using a Java program I slightly modified:
Seconds: 0x15150380 Milliseconds: 0x0062 Cycle Count: 0x0001 Assert Code: 0x0005 Parameter: 0x00d1 Data Value: 0x006c Checksum: 0xfa5e
(first 4 bytes are seconds, next 2 are milliseconds, etc.)
Next:
For starters, I would just like to read each line (one log entry) into a byte array so I can verify the packet against the checksum at the end, etc.
My questions:
1) How do I read that type of output into an array?
2) Would it be better/easier to output the data to Tera Term in a different manner? If so, any pointers are appreciated.
I'm completely new to Java, so I'm trying to piece through this...
Thanks for the help.
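One possible sketch for question 1 (assuming each log line is a run of two-digit hex bytes separated by '='; the log file name is a placeholder):
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class LogParser {
    // Split a "xx=xx=...=" line on '=' and parse each hex token into one byte.
    static byte[] parseLine(String line) {
        String[] tokens = line.trim().split("=");
        byte[] packet = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            packet[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return packet;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(Paths.get("teraterm.log"));
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;
            byte[] packet = parseLine(line);
            // packet[0..3] = seconds, packet[4..5] = milliseconds, ..., checksum at the end
        }
    }
}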
I have a number of text files which are in a fixed, repeated format like:
Q 32,0 16
q 27
b 21
I 0
P 1
d 0
m 31,0
Q 48,0 16
q 27
b 2
I 2
P 1
d 0
m 31,0
.
.
.
I want to parse them in Java. What I want to know is the fastest method to parse such a text file. I can change the output format of the text file if that helps with the performance, as the only requirement here is speed of parsing.
I can use external libraries too.
The fastest parsing you can get is to use a binary format. I suggest you use native byte order, and you should be able to read about 20 million entries per second for this sort of data.
An example of reading and writing binary data with a high throughput AND low latency is here.
https://github.com/peter-lawrey/Java-Chronicle
This format is designed to be read as it is written (with less than one micro-second latency between processes)
You could use a simpler format than this as I suspect all you need is high throughput. ;)
BTW: The library supports GC-less reading and writing of text, such as long and double values, directly to/from a memory-mapped ByteBuffer. As such it can be used as a fast text logger supporting over one million realistic text messages per second.
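Not Java-Chronicle itself, just a sketch of the underlying idea under one assumed layout: the record fields from the question packed as fixed-width ints in native byte order through a memory-mapped buffer.
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;

public class BinaryRecords {
    // Assumed layout: Q(x,y,len), q, b, I, P, d, m(x,y) packed as 10 ints = 40 bytes per record.
    static final int RECORD_SIZE = 10 * Integer.BYTES;

    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("records.bin", "rw");
             FileChannel channel = file.getChannel()) {
            ByteBuffer buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, RECORD_SIZE)
                                    .order(ByteOrder.nativeOrder());
            int[] record = {32, 0, 16, 27, 21, 0, 1, 0, 31, 0}; // first record from the question
            for (int v : record) buf.putInt(v);                 // write
            buf.flip();
            while (buf.hasRemaining()) System.out.print(buf.getInt() + " "); // read back
        }
    }
}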