How to compute CRC64 in a distributed way (using its linearity property)? - java

I need a hash over fairly large files that are stored on a distributed FS. I can process parts of a file with much better performance than the whole file, so I'd like to calculate the hash over the parts and then combine the results.
I'm considering CRC64 as the hashing algorithm, but I have no clue how to use its theoretical 'linear function' property to combine CRCs computed over parts of a file. Any recommendations? Anything I missed here?
Additional notes on why I'm looking at CRC64:
I can control the file blocks, but because of the nature of the application they need to have different sizes (down to 1 byte; fixed-size blocks are not possible).
I know about the CRC32 implementation (zlib), which includes a way to combine CRCs over parts, but I'd like something wider. 8 bytes looks good to me.
I know CRC is pretty fast. I'd like to benefit from this, as a file can be really huge (up to a few GB).

Decided that this was generally useful enough to write and make available:
/* crc64.c -- compute CRC-64
* Copyright (C) 2013 Mark Adler
* Version 1.4 16 Dec 2013 Mark Adler
*/
/*
This software is provided 'as-is', without any express or implied
warranty. In no event will the author be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Mark Adler
madler@alumni.caltech.edu
*/
/* Compute CRC-64 in the manner of xz, using the ECMA-182 polynomial,
bit-reversed, with one's complement pre and post processing. Provide a
means to combine separately computed CRC-64's. */
/* Version history:
1.0 13 Dec 2013 First version
1.1 13 Dec 2013 Fix comments in test code
1.2 14 Dec 2013 Determine endianess at run time
1.3 15 Dec 2013 Add eight-byte processing for big endian as well
Make use of the pthread library optional
1.4 16 Dec 2013 Make once variable volatile for limited thread protection
*/
#include <stdio.h>
#include <inttypes.h>
#include <assert.h>
/* The include of pthread.h below can be commented out in order to not use the
pthread library for table initialization. In that case, the initialization
will not be thread-safe. That's fine, so long as it can be assured that
there is only one thread using crc64(). */
#include <pthread.h> /* link with -lpthread */
/* 64-bit CRC polynomial with these coefficients, but reversed:
64, 62, 57, 55, 54, 53, 52, 47, 46, 45, 40, 39, 38, 37, 35, 33, 32,
31, 29, 27, 24, 23, 22, 21, 19, 17, 13, 12, 10, 9, 7, 4, 1, 0 */
#define POLY UINT64_C(0xc96c5795d7870f42)
/* Tables for CRC calculation -- filled in by initialization functions that are
called once. These could be replaced by constant tables generated in the
same way. There are two tables, one for each endianess. Since these are
static, i.e. local, one should be compiled out of existence if the compiler
can evaluate the endianess check in crc64() at compile time. */
static uint64_t crc64_little_table[8][256];
static uint64_t crc64_big_table[8][256];
/* Fill in the CRC-64 constants table. */
static void crc64_init(uint64_t table[][256])
{
unsigned n, k;
uint64_t crc;
/* generate CRC-64's for all single byte sequences */
for (n = 0; n < 256; n++) {
crc = n;
for (k = 0; k < 8; k++)
crc = crc & 1 ? POLY ^ (crc >> 1) : crc >> 1;
table[0][n] = crc;
}
/* generate CRC-64's for those followed by 1 to 7 zeros */
for (n = 0; n < 256; n++) {
crc = table[0][n];
for (k = 1; k < 8; k++) {
crc = table[0][crc & 0xff] ^ (crc >> 8);
table[k][n] = crc;
}
}
}
/* This function is called once to initialize the CRC-64 table for use on a
little-endian architecture. */
static void crc64_little_init(void)
{
crc64_init(crc64_little_table);
}
/* Reverse the bytes in a 64-bit word. */
static inline uint64_t rev8(uint64_t a)
{
uint64_t m;
m = UINT64_C(0xff00ff00ff00ff);
a = ((a >> 8) & m) | (a & m) << 8;
m = UINT64_C(0xffff0000ffff);
a = ((a >> 16) & m) | (a & m) << 16;
return a >> 32 | a << 32;
}
/* This function is called once to initialize the CRC-64 table for use on a
big-endian architecture. */
static void crc64_big_init(void)
{
unsigned k, n;
crc64_init(crc64_big_table);
for (k = 0; k < 8; k++)
for (n = 0; n < 256; n++)
crc64_big_table[k][n] = rev8(crc64_big_table[k][n]);
}
/* Run the init() function exactly once. If pthread.h is not included, then
this macro will use a simple static state variable for the purpose, which is
not thread-safe. The init function must be of the type void init(void). */
#ifdef PTHREAD_ONCE_INIT
# define ONCE(init) \
do { \
static pthread_once_t once = PTHREAD_ONCE_INIT; \
pthread_once(&once, init); \
} while (0)
#else
# define ONCE(init) \
do { \
static volatile int once = 1; \
if (once) { \
if (once++ == 1) { \
init(); \
once = 0; \
} \
else \
while (once) \
; \
} \
} while (0)
#endif
/* Calculate a CRC-64 eight bytes at a time on a little-endian architecture. */
static inline uint64_t crc64_little(uint64_t crc, void *buf, size_t len)
{
unsigned char *next = buf;
ONCE(crc64_little_init);
crc = ~crc;
while (len && ((uintptr_t)next & 7) != 0) {
crc = crc64_little_table[0][(crc ^ *next++) & 0xff] ^ (crc >> 8);
len--;
}
while (len >= 8) {
crc ^= *(uint64_t *)next;
crc = crc64_little_table[7][crc & 0xff] ^
crc64_little_table[6][(crc >> 8) & 0xff] ^
crc64_little_table[5][(crc >> 16) & 0xff] ^
crc64_little_table[4][(crc >> 24) & 0xff] ^
crc64_little_table[3][(crc >> 32) & 0xff] ^
crc64_little_table[2][(crc >> 40) & 0xff] ^
crc64_little_table[1][(crc >> 48) & 0xff] ^
crc64_little_table[0][crc >> 56];
next += 8;
len -= 8;
}
while (len) {
crc = crc64_little_table[0][(crc ^ *next++) & 0xff] ^ (crc >> 8);
len--;
}
return ~crc;
}
/* Calculate a CRC-64 eight bytes at a time on a big-endian architecture. */
static inline uint64_t crc64_big(uint64_t crc, void *buf, size_t len)
{
unsigned char *next = buf;
ONCE(crc64_big_init);
crc = ~rev8(crc);
while (len && ((uintptr_t)next & 7) != 0) {
crc = crc64_big_table[0][(crc >> 56) ^ *next++] ^ (crc << 8);
len--;
}
while (len >= 8) {
crc ^= *(uint64_t *)next;
crc = crc64_big_table[0][crc & 0xff] ^
crc64_big_table[1][(crc >> 8) & 0xff] ^
crc64_big_table[2][(crc >> 16) & 0xff] ^
crc64_big_table[3][(crc >> 24) & 0xff] ^
crc64_big_table[4][(crc >> 32) & 0xff] ^
crc64_big_table[5][(crc >> 40) & 0xff] ^
crc64_big_table[6][(crc >> 48) & 0xff] ^
crc64_big_table[7][crc >> 56];
next += 8;
len -= 8;
}
while (len) {
crc = crc64_big_table[0][(crc >> 56) ^ *next++] ^ (crc << 8);
len--;
}
return ~rev8(crc);
}
/* Return the CRC-64 of buf[0..len-1] with initial crc, processing eight bytes
at a time. This selects one of two routines depending on the endianess of
the architecture. A good optimizing compiler will determine the endianess
at compile time if it can, and get rid of the unused code and table. If the
endianess can be changed at run time, then this code will handle that as
well, initializing and using two tables, if called upon to do so. */
uint64_t crc64(uint64_t crc, void *buf, size_t len)
{
uint64_t n = 1;
return *(char *)&n ? crc64_little(crc, buf, len) :
crc64_big(crc, buf, len);
}
#define GF2_DIM 64 /* dimension of GF(2) vectors (length of CRC) */
static uint64_t gf2_matrix_times(uint64_t *mat, uint64_t vec)
{
uint64_t sum;
sum = 0;
while (vec) {
if (vec & 1)
sum ^= *mat;
vec >>= 1;
mat++;
}
return sum;
}
static void gf2_matrix_square(uint64_t *square, uint64_t *mat)
{
unsigned n;
for (n = 0; n < GF2_DIM; n++)
square[n] = gf2_matrix_times(mat, mat[n]);
}
/* Return the CRC-64 of two sequential blocks, where crc1 is the CRC-64 of the
first block, crc2 is the CRC-64 of the second block, and len2 is the length
of the second block. */
uint64_t crc64_combine(uint64_t crc1, uint64_t crc2, uintmax_t len2)
{
unsigned n;
uint64_t row;
uint64_t even[GF2_DIM]; /* even-power-of-two zeros operator */
uint64_t odd[GF2_DIM]; /* odd-power-of-two zeros operator */
/* degenerate case */
if (len2 == 0)
return crc1;
/* put operator for one zero bit in odd */
odd[0] = POLY; /* CRC-64 polynomial */
row = 1;
for (n = 1; n < GF2_DIM; n++) {
odd[n] = row;
row <<= 1;
}
/* put operator for two zero bits in even */
gf2_matrix_square(even, odd);
/* put operator for four zero bits in odd */
gf2_matrix_square(odd, even);
/* apply len2 zeros to crc1 (first square will put the operator for one
zero byte, eight zero bits, in even) */
do {
/* apply zeros operator for this bit of len2 */
gf2_matrix_square(even, odd);
if (len2 & 1)
crc1 = gf2_matrix_times(even, crc1);
len2 >>= 1;
/* if no more bits set, then done */
if (len2 == 0)
break;
/* another iteration of the loop with odd and even swapped */
gf2_matrix_square(odd, even);
if (len2 & 1)
crc1 = gf2_matrix_times(odd, crc1);
len2 >>= 1;
/* if no more bits set, then done */
} while (len2 != 0);
/* return combined crc */
crc1 ^= crc2;
return crc1;
}
/* Test crc64() on vector[0..len-1] which should have CRC-64 crc. Also test
crc64_combine() on vector[] split in two. */
static void crc64_test(void *vector, size_t len, uint64_t crc)
{
uint64_t crc1, crc2;
/* test crc64() */
crc1 = crc64(0, vector, len);
if (crc1 ^ crc)
printf("mismatch: %" PRIx64 ", should be %" PRIx64 "\n", crc1, crc);
/* test crc64_combine() */
crc1 = crc64(0, vector, (len + 1) >> 1);
crc2 = crc64(0, (unsigned char *)vector + ((len + 1) >> 1), len >> 1); /* cast: arithmetic on void * is a GNU extension, not standard C */
crc1 = crc64_combine(crc1, crc2, len >> 1);
if (crc1 ^ crc)
printf("mismatch: %" PRIx64 ", should be %" PRIx64 "\n", crc1, crc);
}
/* Test vectors. */
#define TEST1 "123456789"
#define TESTLEN1 9
#define TESTCRC1 UINT64_C(0x995dc9bbdf1939fa)
#define TEST2 "This is a test of the emergency broadcast system."
#define TESTLEN2 49
#define TESTCRC2 UINT64_C(0x27db187fc15bbc72)
int main(void)
{
crc64_test(TEST1, TESTLEN1, TESTCRC1);
crc64_test(TEST2, TESTLEN2, TESTCRC2);
return 0;
}
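
To see what crc64_combine() relies on, here is a deliberately slow Java sketch (the class and method names are my own and are not part of crc64.c). Appending len2 bytes to the first block moves crc1 through 8*len2 register steps. Doing those steps on zero bits and then XORing in crc2 gives the CRC of the concatenation. The matrix squaring in crc64_combine() computes the same zero-bit operator in O(log len2) steps instead of the linear loop used here.

```java
import java.nio.charset.StandardCharsets;

final class Crc64CombineDemo {
    // Reflected ECMA-182 polynomial, the same constant as POLY in the C code.
    static final long POLY = 0xc96c5795d7870f42L;

    /** One raw register step for a single zero bit (no inversion before or after). */
    static long shiftZeroBit(long crc) {
        return (crc & 1) != 0 ? (crc >>> 1) ^ POLY : crc >>> 1;
    }

    /** Bit-at-a-time CRC-64 with the same pre- and post-inversion as crc64(). */
    static long crc64(long crc, byte[] data, int off, int len) {
        crc = ~crc;
        for (int i = off; i < off + len; i++) {
            crc ^= data[i] & 0xffL;
            for (int k = 0; k < 8; k++) {
                crc = shiftZeroBit(crc);
            }
        }
        return ~crc;
    }

    /** Slow combine: push crc1 through 8*len2 zero bits, then XOR in crc2. */
    static long combineSlow(long crc1, long crc2, long len2) {
        for (long bit = 0; bit < 8 * len2; bit++) {
            crc1 = shiftZeroBit(crc1);
        }
        return crc1 ^ crc2;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        long whole = crc64(0, data, 0, data.length);

        // Three parts of different sizes (2, 3 and 4 bytes), each computed independently.
        long a = crc64(0, data, 0, 2);
        long b = crc64(0, data, 2, 3);
        long c = crc64(0, data, 5, 4);

        // Combine left to right; only the CRC and the length of each later part are needed.
        long combined = combineSlow(combineSlow(a, b, 3), c, 4);

        System.out.printf("whole    = %016x%n", whole);
        System.out.printf("combined = %016x%n", combined);
    }
}
```

Both lines should print the same value, which for this input is the TESTCRC1 constant from the test code. The parts have to be combined in file order, but each part's CRC can be computed on a different node.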

OK, here is my contribution: a port to Java.
I could not gain anything from 8-byte block processing without resorting to unsafe operations, so I removed the block calculation.
I'm staying with the ECMA polynomial; the ISO one looks too transparent to me.
Of course, in the final version I will move the test code into JUnit.
So here is the code:
package com.test;
import java.util.Arrays;
/**
* CRC-64 implementation with ability to combine checksums calculated over different blocks of data.
**/
public class CRC64 {
private final static long POLY = (long) 0xc96c5795d7870f42L; // ECMA-182
/* CRC64 calculation table. */
private final static long[] table;
/* Current CRC value. */
private long value;
static {
table = new long[256];
for (int n = 0; n < 256; n++) {
long crc = n;
for (int k = 0; k < 8; k++) {
if ((crc & 1) == 1) {
crc = (crc >>> 1) ^ POLY;
} else {
crc = (crc >>> 1);
}
}
table[n] = crc;
}
}
public CRC64() {
this.value = 0;
}
public CRC64(long value) {
this.value = value;
}
public CRC64(byte [] b, int len) {
this.value = 0;
update(b, len);
}
/**
* Construct new CRC64 instance from its 8-byte big-endian representation
* (the inverse of getBytes()).
**/
public static CRC64 fromBytes(byte [] b) {
long l = 0;
for (int i = 0; i < 8; i++) {
l <<= 8;
l ^= (long) b[i] & 0xFF;
}
return new CRC64(l);
}
/**
* Get 8 byte representation of current CRC64 value.
**/
public byte[] getBytes() {
byte [] b = new byte[8];
for (int i = 0; i < 8; i++) {
b[7 - i] = (byte) (this.value >>> (i * 8));
}
return b;
}
/**
* Get long representation of current CRC64 value.
**/
public long getValue() {
return this.value;
}
/**
* Update CRC64 with new byte block.
**/
public void update(byte [] b, int len) {
int idx = 0;
this.value = ~this.value;
while (len > 0) {
this.value = table[((int) (this.value ^ b[idx])) & 0xff] ^ (this.value >>> 8);
idx++;
len--;
}
this.value = ~this.value;
}
private static final int GF2_DIM = 64; /* dimension of GF(2) vectors (length of CRC) */
private static long gf2MatrixTimes(long [] mat, long vec)
{
long sum = 0;
int idx = 0;
while (vec != 0) {
if ((vec & 1) == 1)
sum ^= mat[idx];
vec >>>= 1;
idx++;
}
return sum;
}
private static void gf2MatrixSquare(long [] square, long [] mat)
{
for (int n = 0; n < GF2_DIM; n++)
square[n] = gf2MatrixTimes(mat, mat[n]);
}
/*
* Return the CRC-64 of two sequential blocks, where summ1 is the CRC-64 of the
* first block, summ2 is the CRC-64 of the second block, and len2 is the length
* of the second block.
*/
static public CRC64 combine(CRC64 summ1, CRC64 summ2, long len2)
{
// degenerate case.
if (len2 == 0)
return new CRC64(summ1.getValue());
int n;
long row;
long [] even = new long[GF2_DIM]; // even-power-of-two zeros operator
long [] odd = new long[GF2_DIM]; // odd-power-of-two zeros operator
// put operator for one zero bit in odd
odd[0] = POLY; // CRC-64 polynomial
row = 1;
for (n = 1; n < GF2_DIM; n++) {
odd[n] = row;
row <<= 1;
}
// put operator for two zero bits in even
gf2MatrixSquare(even, odd);
// put operator for four zero bits in odd
gf2MatrixSquare(odd, even);
// apply len2 zeros to crc1 (first square will put the operator for one
// zero byte, eight zero bits, in even)
long crc1 = summ1.getValue();
long crc2 = summ2.getValue();
do {
// apply zeros operator for this bit of len2
gf2MatrixSquare(even, odd);
if ((len2 & 1) == 1)
crc1 = gf2MatrixTimes(even, crc1);
len2 >>>= 1;
// if no more bits set, then done
if (len2 == 0)
break;
// another iteration of the loop with odd and even swapped
gf2MatrixSquare(odd, even);
if ((len2 & 1) == 1)
crc1 = gf2MatrixTimes(odd, crc1);
len2 >>>= 1;
// if no more bits set, then done
} while (len2 != 0);
// return combined crc.
crc1 ^= crc2;
return new CRC64(crc1);
}
private static void test(byte [] b, int len, long crcValue) throws Exception {
/* Test CRC64 default calculation. */
CRC64 crc = new CRC64(b, len);
if (crc.getValue() != crcValue) {
throw new Exception("mismatch: " + String.format("%016x", crc.getValue())
+ " should be " + String.format("%016x", crcValue));
}
/* test combine() */
CRC64 crc1 = new CRC64(b, (len + 1) >>> 1);
CRC64 crc2 = new CRC64(Arrays.copyOfRange(b, (len + 1) >>> 1, b.length), len >>> 1);
crc = CRC64.combine(crc1, crc2, len >>> 1);
if (crc.getValue() != crcValue) {
throw new Exception("mismatch: " + String.format("%016x", crc.getValue())
+ " should be " + String.format("%016x", crcValue));
}
}
public static void main(String [] args) throws Exception {
final byte[] TEST1 = "123456789".getBytes();
final int TESTLEN1 = 9;
final long TESTCRC1 = 0x995dc9bbdf1939faL; // ECMA.
test(TEST1, TESTLEN1, TESTCRC1);
final byte[] TEST2 = "This is a test of the emergency broadcast system.".getBytes();
final int TESTLEN2 = 49;
final long TESTCRC2 = 0x27db187fc15bbc72L; // ECMA.
test(TEST2, TESTLEN2, TESTCRC2);
final byte[] TEST3 = "IHATEMATH".getBytes();
final int TESTLEN3 = 9;
final long TESTCRC3 = 0x3920e0f66b6ee0c8L; // ECMA.
test(TEST3, TESTLEN3, TESTCRC3);
}
}
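
On the point about 8-byte blocks: one option that avoids sun.misc.Unsafe is to let java.nio.ByteBuffer assemble the eight bytes in little-endian order and then reuse the eight-table lookup from crc64_little() in the C code. The following is only a sketch (the class name Crc64Slice8 is made up for illustration), and whether it actually beats the byte-at-a-time loop on a given JVM would have to be measured.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class Crc64Slice8 {
    private static final long POLY = 0xc96c5795d7870f42L; // ECMA-182, reflected
    private static final long[][] TABLE = new long[8][256];

    static {
        // TABLE[0]: CRC of every single byte value.
        for (int n = 0; n < 256; n++) {
            long crc = n;
            for (int k = 0; k < 8; k++) {
                crc = (crc & 1) != 0 ? (crc >>> 1) ^ POLY : crc >>> 1;
            }
            TABLE[0][n] = crc;
        }
        // TABLE[k]: the same byte followed by k zero bytes.
        for (int n = 0; n < 256; n++) {
            long crc = TABLE[0][n];
            for (int k = 1; k < 8; k++) {
                crc = TABLE[0][(int) (crc & 0xff)] ^ (crc >>> 8);
                TABLE[k][n] = crc;
            }
        }
    }

    static long crc64(long crc, byte[] data) {
        // Little-endian order makes getLong() match the word load in crc64_little().
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        crc = ~crc;
        while (buf.remaining() >= 8) {
            crc ^= buf.getLong();
            crc = TABLE[7][(int) (crc & 0xff)]
                ^ TABLE[6][(int) ((crc >>> 8) & 0xff)]
                ^ TABLE[5][(int) ((crc >>> 16) & 0xff)]
                ^ TABLE[4][(int) ((crc >>> 24) & 0xff)]
                ^ TABLE[3][(int) ((crc >>> 32) & 0xff)]
                ^ TABLE[2][(int) ((crc >>> 40) & 0xff)]
                ^ TABLE[1][(int) ((crc >>> 48) & 0xff)]
                ^ TABLE[0][(int) (crc >>> 56)];
        }
        // Remaining 0 to 7 bytes, one at a time.
        while (buf.hasRemaining()) {
            crc = TABLE[0][(int) ((crc ^ buf.get()) & 0xff)] ^ (crc >>> 8);
        }
        return ~crc;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("%016x%n", crc64(0, data));
    }
}
```

The result should agree with the byte-at-a-time update() in the class above for the same input, so the existing test vectors can be reused to check it.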

Related

CRC16 CCITT Java implementation

There is a function written in C that calculates a CRC16 CCITT. It helps with reading data from an RFID reader and basically works fine. I would like to write a function in Java that does the same thing.
I tried an online converter page to do this, but the code I got is garbage.
Could you please take a look at this and advise why the Java code, which should do the same, generates a different CRC?
Here is the original C function:
void CRC16(unsigned char * Data, unsigned short * CRC, unsigned char Bytes)
{
int i, byte;
unsigned short C;
*CRC = 0;
for (byte = 1; byte <= Bytes; byte++, Data++)
{
C = ((*CRC >> 8) ^ *Data) << 8;
for (i = 0; i < 8; i++)
{
if (C & 0x8000)
C = (C << 1) ^ 0x1021;
else
C = C << 1;
}
*CRC = C ^ (*CRC << 8);
}
}
And here is a different CRC function, written in Java, that should calculate the same checksum but does not:
public static int CRC16_CCITT_Test(byte[] buffer) {
int wCRCin = 0x0000;
int wCPoly = 0x1021;
for (byte b : buffer) {
for (int i = 0; i < 8; i++) {
boolean bit = ((b >> (7 - i) & 1) == 1);
boolean c15 = ((wCRCin >> 15 & 1) == 1);
wCRCin <<= 1;
if (c15 ^ bit)
wCRCin ^= wCPoly;
}
}
wCRCin &= 0xffff;
return wCRCin;
}
When I run both functions on the numbers 0, 2, 3 I get different results:
for the C function it is (DEC): 22017
for the Java function it is (DEC): 28888
OK. I have converted the C code into Java and got it partially working.
public static int CRC16_Test(byte[] data, byte bytes) {
int dataIndex = 0;
short c;
short [] crc= {0};
crc[0] = (short)0;
for(int j = 1; j <= Byte.toUnsignedInt(bytes); j++, dataIndex++) {
c = (short)((Short.toUnsignedInt(crc[0]) >> 8 ^ Byte.toUnsignedInt(data[dataIndex])) << 8);
for(int i = 0; i < 8; i++) {
if((Short.toUnsignedInt(c) & 0x8000) != 0) {
c = (short)(Short.toUnsignedInt(c) << 1 ^ 0x1021);
} else {
c = (short)(Short.toUnsignedInt(c) << 1);
}
}
crc[0] = (short)(Short.toUnsignedInt(c) ^ Short.toUnsignedInt(crc[0]) << 8);
}
return crc[0];
}
It gives the same CRC values as the C code for the numbers 0, 2, 3, but for e.g. 255, 216, 228 the C code's CRC is 60999 while the Java CRC is -4537.
OK. Finally, thanks to your pointers, I got this working.
The last change required was changing 'return crc[0]' to:
return (int) crc[0] & 0xffff;
... and it works...
Many thanks to all :)
There is nothing wrong. For a 16-bit value, -4537 is represented as the exact same 16 bits as 60999 is. If you would like for your routine to return the positive version, convert to int (which is 32 bits) and do an & 0xffff.
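
To make that concrete, here is a small sketch (the class name Crc16Demo is mine) that follows the C routine step by step but keeps everything in int and masks to 16 bits, so no signed short ever leaks out. It then shows that casting the result to short gives the negative number from the question and that & 0xffff restores it.

```java
final class Crc16Demo {
    /** Same steps as the C CRC16() routine, using int masked to 16 bits. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            // (b & 0xFF) undoes the sign extension of Java's signed byte.
            int c = (((crc >>> 8) ^ (b & 0xFF)) << 8) & 0xFFFF;
            for (int i = 0; i < 8; i++) {
                c = ((c & 0x8000) != 0) ? ((c << 1) ^ 0x1021) : (c << 1);
                c &= 0xFFFF; // emulate the 16-bit unsigned short of the C code
            }
            crc = (c ^ (crc << 8)) & 0xFFFF;
        }
        return crc;
    }

    public static void main(String[] args) {
        int crc = crc16(new byte[] {(byte) 255, (byte) 216, (byte) 228});
        short asShort = (short) crc;

        System.out.println(crc);              // the question reports 60999 for this input
        System.out.println(asShort);          // same 16 bits read as signed: -4537
        System.out.println(asShort & 0xffff); // masked back to the unsigned value
    }
}
```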

crc four bytes converting C code to Java yielding unexpected result

This is a follow-up to the question
Converting crc code from C to Java yields unexpected results
I am trying to convert C code to Java and am getting unexpected results.
The C code is as follows:
/// swap endianess
static inline uint32_t swap(uint32_t x)
{
return (x >> 24) |
((x >> 8) & 0x0000FF00) |
((x << 8) & 0x00FF0000) |
(x << 24);
}
/// look-up table, already declared above
const uint32_t Crc32Lookup[8][256] =
{
{ 0x00000000,0x77073096,0xEE0E612C,0x990951BA,0x076DC419,0x706AF48F,0xE963A535,0x9E6495A3,
0x0EDB8832,0x79DCB8A4,0xE0D5E91E,0x97D2D988,0x09B64C2B,0x7EB17CBD,0xE7B82D07,0x90BF1D91,
0x1DB71064,0x6AB020F2,0xF3B97148,0x84BE41DE,0x1ADAD47D,0x6DDDE4EB,0xF4D4B551,0x83D385C7,
0x136C9856,0x646BA8C0,0xFD62F97A,0x8A65C9EC,0x14015C4F,0x63066CD9,0xFA0F3D63,0x8D080DF5,
0x3B6E20C8,0x4C69105E,0xD56041E4,0xA2677172,0x3C03E4D1,0x4B04D447,0xD20D85FD,0xA50AB56B,
0x35B5A8FA,0x42B2986C,0xDBBBC9D6,0xACBCF940,0x32D86CE3,0x45DF5C75,0xDCD60DCF,0xABD13D59,
0x26D930AC,0x51DE003A,0xC8D75180,0xBFD06116,0x21B4F4B5,0x56B3C423,0xCFBA9599,0xB8BDA50F,
0x2802B89E,0x5F058808,0xC60CD9B2,0xB10BE924,0x2F6F7C87,0x58684C11,0xC1611DAB,0xB6662D3D,
0x76DC4190,0x01DB7106,0x98D220BC,0xEFD5102A,0x71B18589,0x06B6B51F,0x9FBFE4A5,0xE8B8D433,
0x7807C9A2,0x0F00F934,0x9609A88E,0xE10E9818,0x7F6A0DBB,0x086D3D2D,0x91646C97,0xE6635C01,
0x6B6B51F4,0x1C6C6162,0x856530D8,0xF262004E,0x6C0695ED,0x1B01A57B,0x8208F4C1,0xF50FC457,
0x65B0D9C6,0x12B7E950,0x8BBEB8EA,0xFCB9887C,0x62DD1DDF,0x15DA2D49,0x8CD37CF3,0xFBD44C65,
0x4DB26158,0x3AB551CE,0xA3BC0074,0xD4BB30E2,0x4ADFA541,0x3DD895D7,0xA4D1C46D,0xD3D6F4FB,
0x4369E96A,0x346ED9FC,0xAD678846,0xDA60B8D0,0x44042D73,0x33031DE5,0xAA0A4C5F,0xDD0D7CC9,
0x5005713C,0x270241AA,0xBE0B1010,0xC90C2086,0x5768B525,0x206F85B3,0xB966D409,0xCE61E49F,
0x5EDEF90E,0x29D9C998,0xB0D09822,0xC7D7A8B4,0x59B33D17,0x2EB40D81,0xB7BD5C3B,0xC0BA6CAD,
0xEDB88320,0x9ABFB3B6,0x03B6E20C,0x74B1D29A,0xEAD54739,0x9DD277AF,0x04DB2615,0x73DC1683,
0xE3630B12,0x94643B84,0x0D6D6A3E,0x7A6A5AA8,0xE40ECF0B,0x9309FF9D,0x0A00AE27,0x7D079EB1,
0xF00F9344,0x8708A3D2,0x1E01F268,0x6906C2FE,0xF762575D,0x806567CB,0x196C3671,0x6E6B06E7,
0xFED41B76,0x89D32BE0,0x10DA7A5A,0x67DD4ACC,0xF9B9DF6F,0x8EBEEFF9,0x17B7BE43,0x60B08ED5,
0xD6D6A3E8,0xA1D1937E,0x38D8C2C4,0x4FDFF252,0xD1BB67F1,0xA6BC5767,0x3FB506DD,0x48B2364B,
0xD80D2BDA,0xAF0A1B4C,0x36034AF6,0x41047A60,0xDF60EFC3,0xA867DF55,0x316E8EEF,0x4669BE79,
0xCB61B38C,0xBC66831A,0x256FD2A0,0x5268E236,0xCC0C7795,0xBB0B4703,0x220216B9,0x5505262F,
0xC5BA3BBE,0xB2BD0B28,0x2BB45A92,0x5CB36A04,0xC2D7FFA7,0xB5D0CF31,0x2CD99E8B,0x5BDEAE1D,
0x9B64C2B0,0xEC63F226,0x756AA39C,0x026D930A,0x9C0906A9,0xEB0E363F,0x72076785,0x05005713,
0x95BF4A82,0xE2B87A14,0x7BB12BAE,0x0CB61B38,0x92D28E9B,0xE5D5BE0D,0x7CDCEFB7,0x0BDBDF21,
0x86D3D2D4,0xF1D4E242,0x68DDB3F8,0x1FDA836E,0x81BE16CD,0xF6B9265B,0x6FB077E1,0x18B74777,
0x88085AE6,0xFF0F6A70,0x66063BCA,0x11010B5C,0x8F659EFF,0xF862AE69,0x616BFFD3,0x166CCF45,
0xA00AE278,0xD70DD2EE,0x4E048354,0x3903B3C2,0xA7672661,0xD06016F7,0x4969474D,0x3E6E77DB,
0xAED16A4A,0xD9D65ADC,0x40DF0B66,0x37D83BF0,0xA9BCAE53,0xDEBB9EC5,0x47B2CF7F,0x30B5FFE9,
0xBDBDF21C,0xCABAC28A,0x53B39330,0x24B4A3A6,0xBAD03605,0xCDD70693,0x54DE5729,0x23D967BF,
0xB3667A2E,0xC4614AB8,0x5D681B02,0x2A6F2B94,0xB40BBE37,0xC30C8EA1,0x5A05DF1B,0x2D02EF8D
},
.....
};
/// compute CRC32 (Slicing-by-4 algorithm)
uint32_t Crc32FourBytes(const void* data, size_t length, uint32_t previousCrc32 = 0)
{
uint32_t crc = ~previousCrc32; // same as previousCrc32 ^ 0xFFFFFFFF
const uint32_t* current = (const uint32_t*) data;
// process four bytes at once (Slicing-by-4)
while (length >= 4)
{
printf("Here\n");
#if __BYTE_ORDER == __BIG_ENDIAN
printf("BIG ENDIAN\n");
uint32_t one = *current++ ^ swap(crc);
printf("one is %d\n", (int32_t)one);
crc = Crc32Lookup[0][ one & 0xFF] ^
Crc32Lookup[1][(one>> 8) & 0xFF] ^
Crc32Lookup[2][(one>>16) & 0xFF] ^
Crc32Lookup[3][(one>>24) & 0xFF];
#else
printf("LITTLE ENDIAN\n");
uint32_t one = *current++ ^ crc;
crc = Crc32Lookup[0][(one>>24) & 0xFF] ^
Crc32Lookup[1][(one>>16) & 0xFF] ^
Crc32Lookup[2][(one>> 8) & 0xFF] ^
Crc32Lookup[3][ one & 0xFF];
#endif
length -= 4;
}
const uint8_t* currentChar = (const uint8_t*) current;
// remaining 1 to 3 bytes (standard algorithm)
while (length-- > 0)
crc = (crc >> 8) ^ Crc32Lookup[0][(crc & 0xFF) ^ *currentChar++];
return ~crc; // same as crc ^ 0xFFFFFFFF
}
int main(int argc, char* argv[])
{
const char* test_string = "Hello World";
printf("strlen of test_string %ld\n",strlen(test_string));
uint32_t test_crc32_fb = Crc32FourBytes((void*)test_string,strlen(test_string),0);
printf("test_crc32_fb = %d\n",(int32_t) test_crc32_fb);
return 0;
}
Result
BIG ENDIAN
one is -1819043145
BIG ENDIAN
one is 861613025
test_crc32_fb = 202227096
And my Java implementation is as follows
import java.nio.ByteOrder;
public class CRC32{
/// zlib's CRC32 polynomial
private static final long CrcPolynomial = 0xEDB88320L;
/// swap endianess
private static long swap(long x)
{
return ((x >> 24) & 0x000000FFL) |
((x >> 8) & 0x0000FF00L) |
((x << 8) & 0x00FF0000L) |
((x << 24) & 0xFF000000L);
}
final static long Crc32Lookup[][] = new long [][]
{
{ 0x00000000,0x77073096,0xEE0E612C,0x990951BA,0x076DC419,0x706AF48F,0xE963A535,0x9E6495A3,
0x0EDB8832,0x79DCB8A4,0xE0D5E91E,0x97D2D988,0x09B64C2B,0x7EB17CBD,0xE7B82D07,0x90BF1D91,
0x1DB71064,0x6AB020F2,0xF3B97148,0x84BE41DE,0x1ADAD47D,0x6DDDE4EB,0xF4D4B551,0x83D385C7,
0x136C9856,0x646BA8C0,0xFD62F97A,0x8A65C9EC,0x14015C4F,0x63066CD9,0xFA0F3D63,0x8D080DF5,
0x3B6E20C8,0x4C69105E,0xD56041E4,0xA2677172,0x3C03E4D1,0x4B04D447,0xD20D85FD,0xA50AB56B,
0x35B5A8FA,0x42B2986C,0xDBBBC9D6,0xACBCF940,0x32D86CE3,0x45DF5C75,0xDCD60DCF,0xABD13D59,
0x26D930AC,0x51DE003A,0xC8D75180,0xBFD06116,0x21B4F4B5,0x56B3C423,0xCFBA9599,0xB8BDA50F,
0x2802B89E,0x5F058808,0xC60CD9B2,0xB10BE924,0x2F6F7C87,0x58684C11,0xC1611DAB,0xB6662D3D,
0x76DC4190,0x01DB7106,0x98D220BC,0xEFD5102A,0x71B18589,0x06B6B51F,0x9FBFE4A5,0xE8B8D433,
0x7807C9A2,0x0F00F934,0x9609A88E,0xE10E9818,0x7F6A0DBB,0x086D3D2D,0x91646C97,0xE6635C01,
0x6B6B51F4,0x1C6C6162,0x856530D8,0xF262004E,0x6C0695ED,0x1B01A57B,0x8208F4C1,0xF50FC457,
0x65B0D9C6,0x12B7E950,0x8BBEB8EA,0xFCB9887C,0x62DD1DDF,0x15DA2D49,0x8CD37CF3,0xFBD44C65,
0x4DB26158,0x3AB551CE,0xA3BC0074,0xD4BB30E2,0x4ADFA541,0x3DD895D7,0xA4D1C46D,0xD3D6F4FB,
0x4369E96A,0x346ED9FC,0xAD678846,0xDA60B8D0,0x44042D73,0x33031DE5,0xAA0A4C5F,0xDD0D7CC9,
0x5005713C,0x270241AA,0xBE0B1010,0xC90C2086,0x5768B525,0x206F85B3,0xB966D409,0xCE61E49F,
0x5EDEF90E,0x29D9C998,0xB0D09822,0xC7D7A8B4,0x59B33D17,0x2EB40D81,0xB7BD5C3B,0xC0BA6CAD,
0xEDB88320,0x9ABFB3B6,0x03B6E20C,0x74B1D29A,0xEAD54739,0x9DD277AF,0x04DB2615,0x73DC1683,
0xE3630B12,0x94643B84,0x0D6D6A3E,0x7A6A5AA8,0xE40ECF0B,0x9309FF9D,0x0A00AE27,0x7D079EB1,
0xF00F9344,0x8708A3D2,0x1E01F268,0x6906C2FE,0xF762575D,0x806567CB,0x196C3671,0x6E6B06E7,
0xFED41B76,0x89D32BE0,0x10DA7A5A,0x67DD4ACC,0xF9B9DF6F,0x8EBEEFF9,0x17B7BE43,0x60B08ED5,
0xD6D6A3E8,0xA1D1937E,0x38D8C2C4,0x4FDFF252,0xD1BB67F1,0xA6BC5767,0x3FB506DD,0x48B2364B,
0xD80D2BDA,0xAF0A1B4C,0x36034AF6,0x41047A60,0xDF60EFC3,0xA867DF55,0x316E8EEF,0x4669BE79,
0xCB61B38C,0xBC66831A,0x256FD2A0,0x5268E236,0xCC0C7795,0xBB0B4703,0x220216B9,0x5505262F,
0xC5BA3BBE,0xB2BD0B28,0x2BB45A92,0x5CB36A04,0xC2D7FFA7,0xB5D0CF31,0x2CD99E8B,0x5BDEAE1D,
0x9B64C2B0,0xEC63F226,0x756AA39C,0x026D930A,0x9C0906A9,0xEB0E363F,0x72076785,0x05005713,
0x95BF4A82,0xE2B87A14,0x7BB12BAE,0x0CB61B38,0x92D28E9B,0xE5D5BE0D,0x7CDCEFB7,0x0BDBDF21,
0x86D3D2D4,0xF1D4E242,0x68DDB3F8,0x1FDA836E,0x81BE16CD,0xF6B9265B,0x6FB077E1,0x18B74777,
0x88085AE6,0xFF0F6A70,0x66063BCA,0x11010B5C,0x8F659EFF,0xF862AE69,0x616BFFD3,0x166CCF45,
0xA00AE278,0xD70DD2EE,0x4E048354,0x3903B3C2,0xA7672661,0xD06016F7,0x4969474D,0x3E6E77DB,
0xAED16A4A,0xD9D65ADC,0x40DF0B66,0x37D83BF0,0xA9BCAE53,0xDEBB9EC5,0x47B2CF7F,0x30B5FFE9,
0xBDBDF21C,0xCABAC28A,0x53B39330,0x24B4A3A6,0xBAD03605,0xCDD70693,0x54DE5729,0x23D967BF,
0xB3667A2E,0xC4614AB8,0x5D681B02,0x2A6F2B94,0xB40BBE37,0xC30C8EA1,0x5A05DF1B,0x2D02EF8D
},
....
};
public static int LongToInt(long value){
return (int)(value & 0xFFFFFFFFL);
}
public static long Complement(long value){
return (value ^ 0xFFFFFFFFL);
}
private static long Crc32FourBytes(byte[] data, long length, long previousCrc32, boolean is_bigendian)
{ //long crc = ~previousCrc32; // same as previousCrc32 ^ 0xFFFFFFFF
//force long to unsigned integer below
long crc = Complement(previousCrc32);
int i = 0;
for( int j = data.length; j >= 4 ; j = j-4){
if (is_bigendian == true){
long one = data[i] ^ LongToInt(swap(crc));
System.out.format("one is %d\n", LongToInt(one));
crc = Crc32Lookup[0][LongToInt((one) & 0xFF)] ^
Crc32Lookup[1][LongToInt((one>>>8) & 0xFF)] ^
Crc32Lookup[2][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[3][LongToInt((one>>>24) & 0xFF)];
} else {
long one = data[i] ^ (crc);
crc = Crc32Lookup[0][LongToInt((one>>>24) & 0xFF)] ^
Crc32Lookup[1][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[2][LongToInt((one>>>8 ) & 0xFF)] ^
Crc32Lookup[3][LongToInt((one ) & 0xFF)];
}
i += 1;
//System.out.format("%d\n",k);
}
for (int k=0; k < data.length; k++)
{
crc = (LongToInt(crc) >>> 8) ^ Crc32Lookup[0][LongToInt((LongToInt(crc) & 0xFF) ^ data[k])];
}
return Complement(crc); //return crc ^ 0xFFFFFFFF;
}
public static void main(String []args){
System.out.println("Hello World");
final String str = "Hello World";
byte[] test_string = str.getBytes();
long test_crc32_fb = Crc32FourBytes(test_string,test_string.length,0,true);
System.out.format("%d\n",LongToInt(test_crc32_fb));
}
}
Result
one is -73
one is 1105837251
369408888
I am not sure where I am making a mistake in the code. Thanks.
You'll need to use ByteBuffer's asIntBuffer() to access the input byte array as a series of four-byte words instead. That would be the equivalent of the C cast const uint32_t* current = (const uint32_t*) data;, which has current access the bytes at data four at a time.
The C code declares current to be an uint32_t*, a pointer to uint32_t:
const uint32_t* current = (const uint32_t*) data;
Therefore (within the loop)
uint32_t one = *current++ ^ swap(crc);
reads and processes 4 bytes at once.
Your Java code, on the other hand, works only on a single byte from the input:
long one = data[i] ^ LongToInt(swap(crc));
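
A minimal sketch of the four-bytes-at-a-time access described in these answers (the class and helper names are mine). Reading the words in little-endian order and XORing the first one with the byte-swapped initial CRC reproduces the first "one is -1819043145" line of the C output above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.charset.StandardCharsets;

final class WordAccessDemo {
    /** View the array as 32-bit words; trailing bytes that do not fill a word are not part of the view. */
    static IntBuffer asWords(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).asIntBuffer();
    }

    public static void main(String[] args) {
        byte[] data = "Hello World".getBytes(StandardCharsets.US_ASCII);
        IntBuffer words = asWords(data, ByteOrder.LITTLE_ENDIAN);

        int crc = ~0; // initial CRC after inversion, as in the C code
        int one = words.get(0) ^ Integer.reverseBytes(crc);

        System.out.println("full words: " + words.remaining()); // 11 bytes give 2 full words
        System.out.println("one is " + one);
    }
}
```

The three bytes left over after the two full words still have to go through the byte-at-a-time loop.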
Thanks to your guidance, especially @Thomas Kläger and @Mark Adler, I got this working.
//helper functions (getLongFromByteArray also needs: import java.nio.ByteBuffer;)
public static final byte[] getByteArrayFromIndex(byte[] data, int index){
return new byte[] {
data[index],
data[index+1],
data[index+2],
data[index+3]
};
}
public static final byte getByteFromIndex(byte[] data, int index){
return data[index];
}
public static long getLongFromByteArray(byte[] array){
return ByteBuffer.wrap(array)
.order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
}
private static byte[] convertLongToByteArray(long l) {
byte[] b = new byte[4];
if(java.nio.ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN){
for (int i=0; i<4; i++) {
b[i] = (byte)(l % 256) ;
l = l / 256;
}
}else{
for (int i=3; i>=0; i--) {
b[i] = (byte)(l % 256) ;
l = l / 256;
}
}
return b;
}
//Crc32FourBytes
private static long Crc32FourBytes(byte[] data, long length, long previousCrc32, boolean is_bigendian)
{ //long crc = ~previousCrc32; // same as previousCrc32 ^ 0xFFFFFFFF
//force long to unsigned integer below
long crc = Complement(previousCrc32);
int i = 0;
int j = data.length;
long temp = 0;
for( ; j >= 4 ; j = j-4){
if (is_bigendian == true){
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long one = temp ^ swap(crc);
System.out.format("one is %d\n", LongToInt(one));
crc = Crc32Lookup[0][LongToInt((one) & 0xFF)] ^
Crc32Lookup[1][LongToInt((one>>>8) & 0xFF)] ^
Crc32Lookup[2][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[3][LongToInt((one>>>24) & 0xFF)];
System.out.format("crc is %d\n", LongToInt(crc));
} else {
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long one = temp ^ (crc);
crc = Crc32Lookup[0][LongToInt((one>>>24) & 0xFF)] ^
Crc32Lookup[1][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[2][LongToInt((one>>>8 ) & 0xFF)] ^
Crc32Lookup[3][LongToInt((one ) & 0xFF)];
System.out.format("crc is %d\n", LongToInt(crc));
}
i += 4;
//System.out.format("%d\n",k);
}
// Remaining 1 to 3 bytes (standard algorithm)
for (int k=data.length-i; k >0; k--)
{
crc = (LongToInt(crc) >>> 8) ^ Crc32Lookup[0][(LongToInt(crc) ^ data[data.length-k]) & 0xFF]; // mask after the XOR so bytes >= 0x80 cannot produce a negative index
System.out.format("=> crc is %d\n", LongToInt(crc));
}
return Complement(crc); //return crc ^ 0xFFFFFFFF;
}
By extension I also got slice by 8 working.
private static long Crc32EightBytes(byte[] data, long length, long previousCrc32, boolean is_bigendian)
{ //long crc = ~previousCrc32; // same as previousCrc32 ^ 0xFFFFFFFF
//force long to unsigned integer below
long crc = Complement(previousCrc32);
int i = 0;
int j = data.length;
long temp = 0;
for( ; j >= 8 ; j = j-8){
if (is_bigendian == true){
//First set of four
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long one = temp ^ swap(crc);
//Second set of four
i += 4;
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long two = temp;
System.out.format("one is %d\n", LongToInt(one));
System.out.format("two is %d\n", LongToInt(two));
crc = Crc32Lookup[0][LongToInt((two ) & 0xFF)] ^
Crc32Lookup[1][LongToInt((two>>>8) & 0xFF)] ^
Crc32Lookup[2][LongToInt((two>>>16) & 0xFF)] ^
Crc32Lookup[3][LongToInt((two>>>24) & 0xFF)] ^
Crc32Lookup[4][LongToInt((one ) & 0xFF)] ^
Crc32Lookup[5][LongToInt((one>>>8) & 0xFF)] ^
Crc32Lookup[6][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[7][LongToInt((one>>>24) & 0xFF)];
System.out.format("crc is %d\n", LongToInt(crc));
} else {
//First set of four
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long one = temp ^ (crc);
//Second set of four
i += 4;
temp = getLongFromByteArray(getByteArrayFromIndex(data,i));
long two = temp;
crc = Crc32Lookup[0][LongToInt((two>>>24) & 0xFF)] ^
Crc32Lookup[1][LongToInt((two>>>16) & 0xFF)] ^
Crc32Lookup[2][LongToInt((two>>>8 ) & 0xFF)] ^
Crc32Lookup[3][LongToInt((two ) & 0xFF)] ^
Crc32Lookup[4][LongToInt((one>>>24) & 0xFF)] ^
Crc32Lookup[5][LongToInt((one>>>16) & 0xFF)] ^
Crc32Lookup[6][LongToInt((one>>>8 ) & 0xFF)] ^
Crc32Lookup[7][LongToInt((one ) & 0xFF)];
System.out.format("crc is %d\n", LongToInt(crc));
}
//Increment by 4 here because we increment by another 4 for second set in the iterative loop making total increment by 4+4 = 8
i += 4;
System.out.format("%d\n",i);
}
// Remaining 1 to 7 bytes (standard algorithm)
for (int k=data.length-i; k >0; k--)
{
crc = (LongToInt(crc) >>> 8) ^ Crc32Lookup[0][LongToInt((LongToInt(crc) & 0xFF) ^ LongToInt(data[data.length-k]))];
System.out.format("=>>>> crc is %d\n", LongToInt(crc));
}
return Complement(crc); //return crc ^ 0xFFFFFFFF;
}
You need to use unsigned right shifts: >>> instead of >>. The C code uses unsigned ints, which has the same effect.
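To illustrate the difference, here is a minimal sketch of how the two shifts treat a CRC value whose top bit is set when it is held in a Java int. The class name ShiftDemo, the method names and the sample value are made up for illustration; they are not part of the code above.

```java
public class ShiftDemo {
    // Arithmetic shift: copies the sign bit into the vacated high bits.
    static int signedShift(int crc) {
        return crc >> 8;
    }

    // Logical shift: fills the vacated high bits with zeros, like >> on a C unsigned int.
    static int unsignedShift(int crc) {
        return crc >>> 8;
    }

    public static void main(String[] args) {
        int crc = 0x80000000; // top bit set, so negative as a Java int
        System.out.println(Integer.toHexString(signedShift(crc)));   // ff800000
        System.out.println(Integer.toHexString(unsignedShift(crc))); // 800000
    }
}
```

With >> the extra one bits end up XORed into the table lookups, which is presumably why the results differed from the C version.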

What do "& 0xFF" and ">>>" shifting do?

I am trying to understand the below code.
The method getKey() returns a string, and getDistance() returns a double. The code is a taken from a class which is meant to hold String (the key) and Double (the distance) pairs.
To be more specific I am unsure as to what the lines that do the shifting do.
public void serialize (byte[] outputArray) {
// write the length of the string out
byte[] data = getKey().getBytes ();
for (int i = 0; i < 2; i++) {
outputArray[i] = (byte) ((data.length >>> ((1 - i) * 8)) & 0xFF);
}
// write the key out
for (int i = 0; i < data.length; i++) {
outputArray[i + 2] = data[i];
}
// now write the distance out
long bits = Double.doubleToLongBits (getDistance());
for (int i = 0; i < 8; i++) {
outputArray[i + 2 + data.length] = (byte) ((bits >>> ((7 - i) * 8)) & 0xFF);
}
}
Any help would be very appreciated.
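>>> is the unsigned right shift operator. It shifts zeros in from the left, so the sign bit is moved along like any other bit instead of being copied.
& 0xFF keeps only the lowest 8 bits, so the value fits in a byte. Here the cast to byte would discard the higher bits anyway, so the mask mainly makes the intent explicit.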
Start by reading Java's tutorial on bitwise operators. In short:
>>> is an unsigned right shift
& 0xFF is ANDing the outcome of (bits >>> ((7 - i) * 8)) with 0xFF
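As a rough illustration of what the shift-and-mask loops in serialize() do, the sketch below splits a long into eight bytes, most significant byte first, and reassembles it. ByteSplitDemo, toBytes and fromBytes are hypothetical names, not part of the class in the question.

```java
public class ByteSplitDemo {
    // Write the 8 bytes of 'bits' most significant byte first, as serialize() does.
    static byte[] toBytes(long bits) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            // Shift the wanted byte down to the lowest 8 bits, then mask the rest away.
            out[i] = (byte) ((bits >>> ((7 - i) * 8)) & 0xFF);
        }
        return out;
    }

    // Reverse operation: & 0xFF undoes the sign extension of each byte.
    static long fromBytes(byte[] in) {
        long bits = 0;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (in[i] & 0xFF);
        }
        return bits;
    }

    public static void main(String[] args) {
        long bits = Double.doubleToLongBits(-1.5);
        byte[] bytes = toBytes(bits);
        System.out.println(fromBytes(bytes) == bits); // true
    }
}
```

When reading the bytes back, the & 0xFF is not optional: without it a negative byte would be sign-extended and overwrite the bits already collected.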

Bitwise operator for simply flipping all bits in an integer?

I have to flip all bits in a binary representation of an integer. Given:
10101
The output should be
01010
What is the bitwise operator to accomplish this when used with an integer? For example, if I were writing a method like int flipBits(int n);, what would go in the body? I need to flip only what's already present in the number, not all 32 bits in the integer.
The ~ unary operator is bitwise negation. If you need fewer bits than what fits in an int then you'll need to mask it with & after the fact.
Simply use the bitwise not operator ~.
int flipBits(int n) {
return ~n;
}
To flip only the k least significant bits, AND the result with a mask that has exactly those k bits set.
(I assume you want at least 1 bit of course, that's why mask starts at 1)
int flipBits(int n, int k) {
int mask = 1;
for (int i = 1; i < k; ++i)
mask |= mask << 1;
return ~n & mask;
}
As suggested by Lưu Vĩnh Phúc, one can create the mask as (1 << k) - 1 instead of using a loop.
int flipBits2(int n, int k) {
int mask = (1 << k) - 1;
return ~n & mask;
}
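One caveat with (1 << k) - 1: Java takes the shift count of an int modulo 32, so for k = 32 the mask comes out as 0 rather than all ones. A possible variant, sketched here under the assumption that 1 <= k <= 32 (the names FlipLowBits and flipLowBits are made up), builds the mask with an unsigned right shift instead:

```java
public class FlipLowBits {
    // Flip the k least significant bits of n and clear the rest; assumes 1 <= k <= 32.
    static int flipLowBits(int n, int k) {
        int mask = -1 >>> (32 - k); // k ones; for k == 32 the shift is 0 and the mask is all ones
        return ~n & mask;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(flipLowBits(0b10101, 5))); // 1010
        System.out.println(flipLowBits(0, 32)); // -1
    }
}
```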
There are a number of ways to flip all the bits using operations
x = ~x; // has been mentioned and the most obvious solution.
x = -x - 1; or x = -1 * (x + 1);
x ^= -1; or x = x ^ ~0;
Well, since so far there's only one solution that gives the "correct" result, and that's... really not a nice solution (using a string to count leading zeros? That'll haunt me in my dreams ;) )
So here we go with a nice clean solution that should work. I haven't tested it thoroughly, but you get the gist. Really, Java not having an unsigned type is extremely annoying for this kind of problem, but it should be quite efficient nonetheless (and, if I may say so, MUCH more elegant than creating a string out of the number).
private static int invert(int x) {
if (x == 0) return 0; // edge case; otherwise returns -1 here
int nlz = nlz(x);
return ~x & (0xFFFFFFFF >>> nlz);
}
private static int nlz(int x) {
// Replace with whatever number leading zero algorithm you want - I can think
// of a whole list and this one here isn't that great (large immediates)
if (x < 0) return 0;
if (x == 0) return 32;
int n = 0;
if ((x & 0xFFFF0000) == 0) {
n += 16;
x <<= 16;
}
if ((x & 0xFF000000) == 0) {
n += 8;
x <<= 8;
}
if ((x & 0xF0000000) == 0) {
n += 4;
x <<= 4;
}
if ((x & 0xC0000000) == 0) {
n += 2;
x <<= 2;
}
if ((x & 0x80000000) == 0) {
n++;
}
return n;
}
faster and simpler solution :
/* inverts all bits of n, with a binary length of the return equal to the length of n
k is the number of bits in n, eg k=(int)Math.floor(Math.log(n)/Math.log(2))+1
if n is a BigInteger : k= n.bitLength();
*/
int flipBits2(int n, int k) {
int mask = (1 << k) - 1;
return n ^ mask;
}
One Line Solution
int flippingBits(int n) {
return n ^ ((1 << 31) - 1);
}
I'd have to see some examples to be sure, but you may be getting unexpected values because of two's complement arithmetic. If the number has leading zeros (as it would in the case of 26), the ~ operator would flip these to make them leading ones - resulting in a negative number.
One possible workaround would be to use the Integer class:
int flipBits(int n){
// toBinaryString never emits leading zeros (it returns "0" for n == 0)
String bitString = Integer.toBinaryString(n);
// Strings are immutable, so build the flipped copy in a StringBuilder
StringBuilder flipped = new StringBuilder(bitString.length());
for (int i = 0; i < bitString.length(); i++){
flipped.append(bitString.charAt(i) == '0' ? '1' : '0');
}
// parse the flipped digits back as an unsigned base-2 number
return Integer.parseUnsignedInt(flipped.toString(), 2);
}
I don't have a java environment set up right now to test it on, but that's the general idea. Basically just convert the number to a string, cut off the leading zeros, flip the bits, and convert it back to a number. The Integer class may even have some way to parse a string into a binary number. I don't know if that's how the problem needs to be done, and it probably isn't the most efficient way to do it, but it would produce the correct result.
Edit: polygenlubricants' answer to this question may also be helpful
I have another way to solve this case,
public static int complementIt(int c){
return c ^ (int)(Math.pow(2, Math.ceil(Math.log(c)/Math.log(2))) -1);
}
It uses XOR to get the complement: XORing a bit with 1 flips it, for example:
101 XOR 111 = 010
(111 is the 'key'; it is 2^n - 1, where n is the base-2 logarithm of the data rounded up, not a square root. Note that when c is 1 the logarithm is 0, so the key becomes 0 and nothing is flipped; other exact powers of two are likely to be affected in the same way.)
If you use ~ (complement), the result depends on the variable type: with an int, all 32 bits are processed.
As we are only required to flip the minimum bits required for the integer (say 50 is 110010 and when inverted, it becomes 001101 which is 13), we can invert individual bits one at a time from the LSB to MSB, and keep shifting the bits to the right and accordingly apply the power of 2. The code below does the required job:
int invertBits (int n) {
int pow2=1, bit=0;
int newnum=0;
while(n>0) {
bit = (n & 1);
if(bit==0)
newnum+= pow2;
n=n>>1;
pow2*=2;
}
return newnum;
}
import java.math.BigInteger;
import java.util.Scanner;
public class CodeRace1 {
public static void main(String[] s) {
long input;
BigInteger num,bits = new BigInteger("4294967295");
Scanner sc = new Scanner(System.in);
input = sc.nextInt();
sc.nextLine();
while (input-- > 0) {
num = new BigInteger(sc.nextLine().trim());
System.out.println(num.xor(bits));
}
}
}
The implementation from openJDK, Integer.reverse():
public static int reverse(int i) {
i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
i = (i << 24) | ((i & 0xff00) << 8) |
((i >>> 8) & 0xff00) | (i >>> 24);
return i;
}
Based on my experiments on my laptop, the implementation below was faster:
public static int reverse2(int i) {
i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
i = (i & 0x00ff00ff) << 8 | (i >>> 8) & 0x00ff00ff;
i = (i & 0x0000ffff) << 16 | (i >>> 16) & 0x0000ffff;
return i;
}
I'm not sure what the reason is; it may depend on how the Java code is compiled into machine code.
If you just want to flip the bits which are "used" in the integer, try this:
public int flipBits(int n) {
int mask = (Integer.highestOneBit(n) << 1) - 1;
return n ^ mask;
}
public static int findComplement(int num) {
return (~num & (Integer.highestOneBit(num) - 1));
}
int findComplement(int num) {
int i = 0, ans = 0;
while(num) {
if(not (num & 1)) {
ans += (1 << i);
}
i += 1;
num >>= 1;
}
return ans;
}
Binary 10101 == Decimal 21
Flipped Binary 01010 == Decimal 10
One liner (in Javascript - You could convert to your favorite programming language )
10 == ~21 & (1 << (Math.floor(Math.log2(21))+1)) - 1
Explanation:
10 == ~21 & mask
mask : For filtering out all the leading bits before the significant bits count (nBits - see below)
How to calculate the significant bit counts ?
Math.floor(Math.log2(21))+1 => Returns how many significant bits are there (nBits)
Ex:
0000000001 returns 1
0001000001 returns 7
0000010101 returns 5
(1 << nBits) - 1 => 1111111111.....nBits times = mask
It can be done in a simple way: subtract the number from the value
obtained when all of its bits are set to 1.
For example:
Number: the given number
Value: a number with the same bit length as Number and all bits set.
Flipped number = Value - Number.
Example:
Number = 23,
Binary form: 10111
After flipping, the number will be: 01000
Value: 11111 = 31
Flipped number = 31 - 23 = 8 (binary 01000)
We can find the most significant set bit in O(1) time for a fixed size integer. For
example below code is for a 32-bit integer.
int setBitNumber(int n)
{
n |= n>>1;
n |= n>>2;
n |= n>>4;
n |= n>>8;
n |= n>>16;
n = n + 1;
return (n >> 1);
}
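Putting the two pieces together, a minimal Java sketch might look as follows. The names FlipBySubtraction, allOnes and flip are illustrative only. Note that the sketch uses the smeared value from before the final n + 1 step of setBitNumber, because that intermediate result is already the all-ones Value.

```java
public class FlipBySubtraction {
    // Smear the highest set bit downwards: 10111 -> 11111.
    static int allOnes(int n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n;
    }

    // Value - Number, where Value has every bit up to n's highest set bit equal to 1.
    static int flip(int n) {
        return allOnes(n) - n;
    }

    public static void main(String[] args) {
        System.out.println(flip(23)); // 8
        System.out.println(flip(21)); // 10
    }
}
```

Because Value has a 1 wherever Number has one, the subtraction never borrows, so it gives the same result as Value XOR Number.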

Extract bit sequences of arbitrary length from byte[] array efficiently

I'm looking for the most efficient way of extracting (unsigned) bit sequences of arbitrary length (0 <= length <= 16) at an arbitrary position. The skeleton class shows how my current implementation essentially handles the problem:
public abstract class BitArray {
byte[] bytes = new byte[2048];
int bitGet;
public BitArray() {
}
public void readNextBlock(int initialBitGet, int count) {
// substitute for reading from an input stream
for (int i=(initialBitGet>>3); i<=count; ++i) {
bytes[i] = (byte) i;
}
prepareBitGet(initialBitGet, count);
}
public abstract void prepareBitGet(int initialBitGet, int count);
public abstract int getBits(int count);
static class Version0 extends BitArray {
public void prepareBitGet(int initialBitGet, int count) {
bitGet = initialBitGet;
}
public int getBits(int len) {
// intentionally gives meaningless result
bitGet += len;
return 0;
}
}
static class Version1 extends BitArray {
public void prepareBitGet(int initialBitGet, int count) {
bitGet = initialBitGet - 1;
}
public int getBits(int len) {
int byteIndex = bitGet;
bitGet = byteIndex + len;
int shift = 23 - (byteIndex & 7) - len;
int mask = (1 << len) - 1;
byteIndex >>= 3;
return (((bytes[byteIndex] << 16) |
((bytes[++byteIndex] & 0xFF) << 8) |
(bytes[++byteIndex] & 0xFF)) >> shift) & mask;
}
}
static class Version2 extends BitArray {
static final int[] mask = { 0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF,
0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF };
public void prepareBitGet(int initialBitGet, int count) {
bitGet = initialBitGet;
}
public int getBits(int len) {
int offset = bitGet;
bitGet = offset + len;
int byteIndex = offset >> 3; // originally used /8
int bitIndex = offset & 7; // originally used %8
if ((bitIndex + len) > 16) {
return ((bytes[byteIndex] << 16 |
(bytes[byteIndex + 1] & 0xFF) << 8 |
(bytes[byteIndex + 2] & 0xFF)) >> (24 - bitIndex - len)) & mask[len];
} else if ((offset + len) > 8) {
return ((bytes[byteIndex] << 8 |
(bytes[byteIndex + 1] & 0xFF)) >> (16 - bitIndex - len)) & mask[len];
} else {
return (bytes[byteIndex] >> (8 - offset - len)) & mask[len];
}
}
}
static class Version3 extends BitArray {
int[] ints = new int[2048];
public void prepareBitGet(int initialBitGet, int count) {
bitGet = initialBitGet;
int put_i = (initialBitGet >> 3) - 1;
int get_i = put_i;
int buf;
buf = ((bytes[++get_i] & 0xFF) << 16) |
((bytes[++get_i] & 0xFF) << 8) |
(bytes[++get_i] & 0xFF);
do {
buf = (buf << 8) | (bytes[++get_i] & 0xFF);
ints[++put_i] = buf;
} while (get_i < count);
}
public int getBits(int len) {
int bit_idx = bitGet;
bitGet = bit_idx + len;
int shift = 32 - (bit_idx & 7) - len;
int mask = (1 << len) - 1;
int int_idx = bit_idx >> 3;
return (ints[int_idx] >> shift) & mask;
}
}
static class Version4 extends BitArray {
int[] ints = new int[1024];
public void prepareBitGet(int initialBitGet, int count) {
bitGet = initialBitGet;
int g = initialBitGet >> 3;
int p = (initialBitGet >> 4) - 1;
final byte[] b = bytes;
int t = (b[g] << 8) | (b[++g] & 0xFF);
final int[] i = ints;
do {
i[++p] = (t = (t << 16) | ((b[++g] & 0xFF) <<8) | (b[++g] & 0xFF));
} while (g < count);
}
public int getBits(final int len) {
final int i;
bitGet = (i = bitGet) + len;
return (ints[i >> 4] >> (32 - len - (i & 15))) & ((1 << len) - 1);
}
}
public void benchmark(String label) {
int checksum = 0;
readNextBlock(32, 1927);
long time = System.nanoTime();
for (int pass=1<<18; pass>0; --pass) {
prepareBitGet(32, 1927);
for (int i=2047; i>=0; --i) {
checksum += getBits(i & 15);
}
}
time = System.nanoTime() - time;
System.out.println(label+" took "+Math.round(time/1E6D)+" ms, checksum="+checksum);
try { // avoid having the console interfere with our next measurement
Thread.sleep(369);
} catch (InterruptedException e) {}
}
public static void main(String[] argv) {
BitArray test;
// for the sake of getting a little less influence from the OS for stable measurement
Thread.currentThread().setPriority(Thread.MAX_PRIORITY);
while (true) {
test = new Version0();
test.benchmark("no implementaion");
test = new Version1();
test.benchmark("Durandal's (original)");
test = new Version2();
test.benchmark("blitzpasta's (adapted)");
test = new Version3();
test.benchmark("MSN's (posted)");
test = new Version4();
test.benchmark("MSN's (half-buffer modification)");
System.out.println("--- next pass ---");
}
}
}
This works, but I'm looking for a more efficient solution (performance wise). The byte array is guaranteed to be relatively small, between a few bytes up to a max of ~1800 bytes. The array is read exactly once (completely) between each call to the read method. There is no need for any error checking in getBits(), such as exceeding the array etc.
It seems my initial question above isn't clear enough. A "bit sequence" of N bits forms an integer of N bits, and I need to extract those integers with minimal overhead. I have no use for strings, as the values are either used as lookup indices or are directly fed into some computation. So basically, the skeleton shown above is a real class and getBits() signature shows how the rest of the code interacts with it.
I extended the example code into a microbenchmark and included blitzpasta's solution (with the missing byte masking fixed). On my old AMD box it turns out as ~11400ms vs ~38000ms. FYI: it's the divide and modulo operations that kill the performance. If you replace /8 with >>3 and %8 with &7, both solutions are pretty close to each other (jdk1.7.0ea104).
There seemed to be a bit confusion about how and what to work on. The first, original post of the example code included a read() method to indicate where and when the byte buffer was filled. This got lost when the code was turned into the microbench. I re-introduced it to make this a little clearer.
The idea is to beat all existing versions by adding another subclass of BitArray which needs to implement getBits() and prepareBitGet(); the latter may be empty. Do not change the benchmarking to give your solution an advantage: the same could be done for all the existing solutions, making this a completely moot optimization! (really!!)
I added a Version0, which does nothing but increment the bitGet state. It always returns 0 to get a rough idea how big the benchmark overhead is. Its only there for comparison.
Also, an adaption on MSN's idea was added (Version3). To keep things fair and comparable for all competitors, the byte array filling is now part of the benchmark, as well as a preparatory step (see above). Originally MSN's solution did not do so well, there was lots of overhead in preparing the int[] buffer. I took the liberty of optimizing the step a little, which turned it into a fierce competitor :)
You might also find that I de-convoluted your code a little. Your getBit() could be condensed into a 3-liner, probably shaving off one or two percent. I deliberately did this to keep the code readable and because the other versions aren't as condensed as possible either (again for readability).
Conclusion (code example above updated to include versions based on all applicable contributions). On my old AMD box (Sun JRE 1.6.0_21), they come out as:
V0 no implementaion took 5384 ms
V1 Durandal's (original) took 10283 ms
V2 blitzpasta's (adapted) took 12212 ms
V3 MSN's (posted) took 11030 ms
V4 MSN's (half-buffer modification) took 9700 ms
Notes: In this benchmark an average of 7.5 bits is fetched per call to getBits(), and each bit is only read once. Since V3/V4 have to pay a high initialization cost, they tend to show better runtime behavior with more, shorter fetches (and consequently worse the closer to the maximum of 16 the average fetch size gets). Still, V4 stays slightly ahead of all others in all scenarios.
In an actual application, cache contention must be taken into account, since the extra space needed for V3/V4 may increase cache misses to a point where V0 would be a better choice. If the array is to be traversed more than once, V4 should be favored, since it fetches faster than every other version and the costly initialization is amortized after the first pass.
If you just want the unsigned bit sequence as an int.
static final int[] lookup = {0x0, 0x1, 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF };
/*
* bytes: byte array, with the bits indexed from 0 (MSB) to (bytes.length * 8 - 1) (LSB)
* offset: index of the MSB of the bit sequence.
* len: length of bit sequence, must be in the range [0,16].
* Not checked for overflow
*/
static int getBitSeqAsInt(byte[] bytes, int offset, int len){
int byteIndex = offset / 8;
int bitIndex = offset % 8;
int val;
// every byte is masked with 0xFF so that negative bytes do not sign-extend into the higher bits
if ((bitIndex + len) > 16) {
val = (((bytes[byteIndex] & 0xFF) << 16 | (bytes[byteIndex + 1] & 0xFF) << 8 | (bytes[byteIndex + 2] & 0xFF)) >> (24 - bitIndex - len)) & lookup[len];
} else if ((offset + len) > 8) {
val = (((bytes[byteIndex] & 0xFF) << 8 | (bytes[byteIndex + 1] & 0xFF)) >> (16 - bitIndex - len)) & lookup[len];
} else {
val = (bytes[byteIndex] >> (8 - offset - len)) & lookup[len];
}
return val;
}
If you want it as a String (modification of Margus' answer).
static String getBitSequence(byte[] bytes, int offset, int len){
int byteIndex = offset / 8;
int bitIndex = offset % 8;
int count = 0;
StringBuilder result = new StringBuilder();
outer:
for(int i = byteIndex; i < bytes.length; ++i) {
for(int j = (1 << (7 - bitIndex)); j > 0; j >>= 1) {
if(count == len) {
break outer;
}
if((bytes[i] & j) == 0) { // test the byte the outer loop is currently on
result.append('0');
} else {
result.append('1');
}
++count;
}
bitIndex = 0;
}
return result.toString();
}
Well, depending on how far you want to go down the time vs. memory see-saw, you can allocate a side table holding the 32 bits that start at every byte offset and then do a shift and mask based on the bit offset within that byte:
byte[] bytes = new byte[2048];
int bitGet;
// Java has no unsigned int; a plain int works as long as every byte is masked with 0xFF
int[] dwords = new int[2045];
public BitArray() {
for (int i=0; i<bytes.length; ++i) {
bytes[i] = (byte) i;
}
for (int i= 0; i<dwords.length; ++i) {
dwords[i]=
((bytes[i ] & 0xFF) << 24) |
((bytes[i + 1] & 0xFF) << 16) |
((bytes[i + 2] & 0xFF) << 8) |
((bytes[i + 3] & 0xFF));
}
}
int getBits(int len)
{
int offset= bitGet;
bitGet= offset + len;
int offset_index= offset>>3;
int offset_offset= offset & 7;
// bits are numbered from the MSB, so shift the wanted bits down from the top of the 32-bit window
return (dwords[offset_index] >>> (32 - offset_offset - len)) & ((1 << len) - 1);
}
You avoid the branching (at the cost of quadrupling your memory footprint). And is looking up the mask really that much faster than (1 << len) - 1?
Just wondering why can't you use java.util.BitSet;
Basically, what you can do is read the whole data as byte[], convert it to binary in string format and use string utilities like .substring() to do the work. This will also work for bit sequences > 16.
Lets say you have 3 bytes: 1, 2, 3 and you want to extract bit sequence from 5th to 16th bit.
Number Binary
1 00000001
2 00000010
3 00000011
Code example:
public static String getRealBinary(byte[] input){
StringBuilder sb = new StringBuilder();
for (byte c : input) {
for (int n = 128; n > 0; n >>= 1){
if ((c & n) == 0)
sb.append('0');
else sb.append('1');
}
}
return sb.toString();
}
public static void main(String[] args) {
byte bytes[] = new byte[]{1,2,3};
String sbytes = getRealBinary(bytes);
System.out.println(sbytes);
System.out.println(sbytes.substring(5,16));
}
Output:
000000010000001000000011
00100000010
Speed:
I did a test run of 1 million iterations, and on my computer it took 0.995s, so it's reasonably fast:
Code to repeat the test yourself:
public static void main(String[] args) {
Random r = new Random();
byte bytes[] = new byte[4];
long start, time, total=0;
for (int i = 0; i < 1000000; i++) {
r.nextBytes(bytes);
start = System.currentTimeMillis();
getRealBinary(bytes).substring(5,16);
time = System.currentTimeMillis() - start;
total+=time;
}
System.out.println("It took " +total + "ms");
}
You want at most 16 bits, taken from an array of bytes. 16 bits can span at most 3 bytes.
Here's a possible solution:
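int GetBits(int bit_index, int bit_length) {
int byte_offset = bit_index >> 3;
// mask each byte with 0xFF so that negative bytes do not sign-extend
int bits24 = ((byte_array[byte_offset] & 0xFF) << 16)
+ ((byte_array[byte_offset+1] & 0xFF) << 8)
+ (byte_array[byte_offset+2] & 0xFF);
// shift the wanted bits down to the bottom, then mask off everything above them
return (bits24 >> (24 - (bit_index & 7) - bit_length))
& ((1 << bit_length) - 1);
}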
[Untested]
If you call this a lot you should precompute the 24-bit values for the 3 concatenated bytes, and store those into an int array.
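The precomputation could be sketched roughly like this. The class PrecomputedBits and its members are hypothetical names, and padding the input with two zero bytes is an assumption made here so that the last windows stay inside the array; bits are counted from the most significant bit of byte 0.

```java
import java.util.Arrays;

public class PrecomputedBits {
    private final int[] windows; // windows[i] holds bytes i, i+1, i+2 as a 24-bit value

    PrecomputedBits(byte[] data) {
        // Pad with two zero bytes so the last windows do not read past the end.
        byte[] padded = Arrays.copyOf(data, data.length + 2);
        windows = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            windows[i] = ((padded[i] & 0xFF) << 16)
                       | ((padded[i + 1] & 0xFF) << 8)
                       |  (padded[i + 2] & 0xFF);
        }
    }

    // Extract bitLength bits (0..16) starting at bitIndex.
    int getBits(int bitIndex, int bitLength) {
        int shift = 24 - (bitIndex & 7) - bitLength;
        return (windows[bitIndex >> 3] >> shift) & ((1 << bitLength) - 1);
    }

    public static void main(String[] args) {
        PrecomputedBits bits = new PrecomputedBits(new byte[] {1, 2, 3});
        System.out.println(Integer.toBinaryString(bits.getBits(5, 11))); // 100000010
    }
}
```

The table costs four bytes of memory per input byte, so whether it pays off depends on how often each byte is read.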
I'll observe that if you are coding this in C on an x86, you don't even need to precompute the 24-bit array; simply access the byte array at the desired offset as a 32-bit value. The x86 will do unaligned fetches just fine. [A commenter noted that endianness mucks this up, so it isn't an answer. OK, do the 24-bit version.]
Since Java 7 BitSet has the toLongArray method, which I believe will do exactly what the question asks for:
int subBits = (int) bitSet.get(lowBit, highBit).toLongArray()[0];
This has the advantage that it works with sequences larger than ints or longs. It has the performance disadvantage that a new BitSet object must be allocated, and a new array object to hold the result.
It would be really interesting to see how this compares with the other methods in the benchmark.
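Two details are worth keeping in mind when trying this out: toLongArray() returns an empty array when none of the requested bits is set, and BitSet.valueOf(byte[]) numbers the bits starting from the least significant bit of byte 0, which is not the MSB-first order used in the question. A small sketch (BitSetExtract and subBits are made-up names) that guards against the empty array:

```java
import java.util.BitSet;

public class BitSetExtract {
    // Bits lowBit (inclusive) to highBit (exclusive), in BitSet's own LSB-first numbering.
    static int subBits(BitSet bitSet, int lowBit, int highBit) {
        long[] words = bitSet.get(lowBit, highBit).toLongArray();
        // toLongArray() returns an empty array when no bit in the range is set.
        return words.length == 0 ? 0 : (int) words[0];
    }

    public static void main(String[] args) {
        // BitSet.valueOf(byte[]) treats bit 0 as the least significant bit of byte 0.
        BitSet bitSet = BitSet.valueOf(new byte[] {(byte) 0b10110100, 0b00000001});
        System.out.println(subBits(bitSet, 2, 10)); // 109
        System.out.println(subBits(bitSet, 9, 16)); // 0
    }
}
```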
