Why is variable index ANDed twice with 0x1f? - java

I was reading about the underlying data structures for immutable collections in Scala (HashMap and Vector, more precisely) and while reading the code, I came across the get0 function in HashTrieMap.
Why is the variable index AND-ed with 0x1f again in line 312? It seems to me that the result is the same (i.e. the second AND is not necessary). What am I missing?
Here is the aforementioned function:
override def get0(key: A, hash: Int, level: Int): Option[B] = {
  val index = (hash >>> level) & 0x1f // index is AND-ed with 0x1f
  val mask = (1 << index)
  if (bitmap == -1) {
    elems(index & 0x1f).get0(key, hash, level + 5) // once again, AND-ed with 0x1f
  } else if ((bitmap & mask) != 0) {
    val offset = Integer.bitCount(bitmap & (mask - 1))
    elems(offset).get0(key, hash, level + 5)
  } else
    None
}

You are not missing anything. Indeed the second AND is not necessary.
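To see why, note that masking with 0x1f keeps only the low five bits, so applying the same mask a second time cannot change the value. A minimal check, written in Java for illustration (the class and method names are hypothetical and not part of the Scala library):

```java
// Sketch: masking with 0x1f is idempotent, so a second mask is redundant.
final class MaskTwiceDemo {
    // Same index computation as in get0: five bits of the hash at the given level.
    static int index(int hash, int level) {
        return (hash >>> level) & 0x1f;
    }

    // True if masking the index again with 0x1f leaves it unchanged.
    static boolean secondMaskIsNoOp(int hash, int level) {
        int index = index(hash, level);
        return (index & 0x1f) == index;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, Integer.MIN_VALUE, Integer.MAX_VALUE};
        boolean allEqual = true;
        for (int level = 0; level < 32; level += 5) {
            for (int hash : samples) {
                allEqual &= secondMaskIsNoOp(hash, level);
            }
        }
        System.out.println(allEqual);
    }
}
```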

Related

IO-Link CRC32 implementation

I am currently trying to implement the CRC32 check algorithm from IO-Link for IODD files in Java [1]. In IO-Link each sensor is described by an IODD XML file [2]. The file contains a Stamp tag with a CRC code. My problem is that the algorithm uses only unsigned integers, and Java does not have unsigned integer types. Is there still a way to implement it?
I also used JetBrains dotPeek to decompile the IODDChecker (latest version). The CRC logic is in the file clsCRC.cs. At the end of the file I found the method "CRC32", which calculates the CRC. The problem is that it uses unsigned integers for the lookup table. Interestingly, the algorithm in the IODD checker (crc32_octetwise) is slightly different from the one described in the specification.
Hope you can help me :)
What I tested:
Java CRC32 from zip
Guava UnsignedInteger
BigInteger
long instead of int
Procedure:
The program reads the IODD as a String/byte array
Remove the CRC value of the Stamp tag from the String/byte array
Compute the CRC32 of the String/byte array
In the test IODD you can find the CRC 1195433981 which should be equal to the output of the algorithm
Algorithms:
There are two algorithms. The first one is bitwise and the second one is octetwise and uses a lookup table, which you can find here [2].
uint32_t crc32_bitwise(char *data, size_t length, uint32_t previousCrc32 = 1)
{
    uint32_t crc = ~previousCrc32;
    int j;
    const uint8_t* current = (const uint8_t*) data;
    while (length-- > 0)
    {
        crc ^= *current++;
        for (j = 0; j < 8; j++)
        {
            crc = (crc >> 1) ^ (-(int32_t)(crc & 1) & 0xEB31D82E);
        }
    }
    return ~crc;
}
uint32_t crc32_octetwise(uint8_t const * current, uint32_t length, uint32_t previousCrc32 = 1, const uint32_t crc32Lookup[256])
{
    uint32_t crc = ~previousCrc32;
    while (length-- > 0)
        crc = (crc >> 8) ^ crc32Lookup[(crc & 0xFF) ^ *current++];
    return ~crc;
}
[1] https://io-link.com/share/Downloads/Profiles/IOL-Profile_Firmware-Update_V11_10082_Sep19.zip
[2] https://ioddfinder.io-link.com/productvariants/search/11765
[3] Yet another Java CRC32 implementation related issue
Update 1
I forgot to add that I currently receive 1120211745 as CRC Code but 1195433981 is the correct code.
Update 2
I implemented the following code. The big arrays are the lookup tables. The first int array is the table from [1] and the second one is the array from the decompiled source. In the original, the integer literals were marked as unsigned (e.g. 1996959894U).
private static final int[] numArr1 = new int[] {
0x00000000, 0x9695C4CA, 0xFB4839C9, 0x6DDDFD03, 0x20F3C3CF, 0xB6660705, 0xDBBBFA06, 0x4D2E3ECC, 0x41E7879E, 0xD7724354, 0xBAAFBE57, 0x2C3A7A9D, 0x61144451, 0xF781809B, 0x9A5C7D98, 0x0CC9B952, 0x83CF0F3C, 0x155ACBF6, 0x788736F5, 0xEE12F23F, 0xA33CCCF3, 0x35A90839, 0x5874F53A, 0xCEE131F0, 0xC22888A2, 0x54BD4C68, 0x3960B16B, 0xAFF575A1, 0xE2DB4B6D, 0x744E8FA7, 0x199372A4, 0x8F06B66E, 0xD1FDAE25, 0x47686AEF, 0x2AB597EC, 0xBC205326, 0xF10E6DEA, 0x679BA920, 0x0A465423, 0x9CD390E9, 0x901A29BB, 0x068FED71, 0x6B521072, 0xFDC7D4B8, 0xB0E9EA74, 0x267C2EBE, 0x4BA1D3BD, 0xDD341777, 0x5232A119, 0xC4A765D3, 0xA97A98D0, 0x3FEF5C1A, 0x72C162D6, 0xE454A61C, 0x89895B1F, 0x1F1C9FD5, 0x13D52687, 0x8540E24D, 0xE89D1F4E, 0x7E08DB84, 0x3326E548, 0xA5B32182, 0xC86EDC81, 0x5EFB184B, 0x7598EC17, 0xE30D28DD, 0x8ED0D5DE, 0x18451114, 0x556B2FD8, 0xC3FEEB12, 0xAE231611, 0x38B6D2DB, 0x347F6B89, 0xA2EAAF43, 0xCF375240, 0x59A2968A, 0x148CA846, 0x82196C8C, 0xEFC4918F, 0x79515545, 0xF657E32B, 0x60C227E1, 0x0D1FDAE2, 0x9B8A1E28, 0xD6A420E4, 0x4031E42E, 0x2DEC192D, 0xBB79DDE7, 0xB7B064B5, 0x2125A07F, 0x4CF85D7C, 0xDA6D99B6, 0x9743A77A, 0x01D663B0, 0x6C0B9EB3, 0xFA9E5A79, 0xA4654232, 0x32F086F8, 0x5F2D7BFB, 0xC9B8BF31, 0x849681FD, 0x12034537, 0x7FDEB834, 0xE94B7CFE, 0xE582C5AC, 0x73170166, 0x1ECAFC65, 0x885F38AF, 0xC5710663, 0x53E4C2A9, 0x3E393FAA, 0xA8ACFB60, 0x27AA4D0E, 0xB13F89C4, 0xDCE274C7, 0x4A77B00D, 0x07598EC1, 0x91CC4A0B, 0xFC11B708, 0x6A8473C2, 0x664DCA90, 0xF0D80E5A, 0x9D05F359, 0x0B903793, 0x46BE095F, 0xD02BCD95, 0xBDF63096, 0x2B63F45C, 0xEB31D82E, 0x7DA41CE4, 0x1079E1E7, 0x86EC252D, 0xCBC21BE1, 0x5D57DF2B, 0x308A2228, 0xA61FE6E2, 0xAAD65FB0, 0x3C439B7A, 0x519E6679, 0xC70BA2B3, 0x8A259C7F, 0x1CB058B5, 0x716DA5B6, 0xE7F8617C, 0x68FED712, 0xFE6B13D8, 0x93B6EEDB, 0x05232A11, 0x480D14DD, 0xDE98D017, 0xB3452D14, 0x25D0E9DE, 0x2919508C, 0xBF8C9446, 0xD2516945, 0x44C4AD8F, 0x09EA9343, 0x9F7F5789, 0xF2A2AA8A, 0x64376E40, 0x3ACC760B, 0xAC59B2C1, 0xC1844FC2, 0x57118B08, 0x1A3FB5C4, 0x8CAA710E, 
0xE1778C0D, 0x77E248C7, 0x7B2BF195, 0xEDBE355F, 0x8063C85C, 0x16F60C96, 0x5BD8325A, 0xCD4DF690, 0xA0900B93, 0x3605CF59, 0xB9037937, 0x2F96BDFD, 0x424B40FE, 0xD4DE8434, 0x99F0BAF8, 0x0F657E32, 0x62B88331, 0xF42D47FB, 0xF8E4FEA9, 0x6E713A63, 0x03ACC760, 0x953903AA, 0xD8173D66, 0x4E82F9AC, 0x235F04AF, 0xB5CAC065, 0x9EA93439, 0x083CF0F3, 0x65E10DF0, 0xF374C93A, 0xBE5AF7F6, 0x28CF333C, 0x4512CE3F, 0xD3870AF5, 0xDF4EB3A7, 0x49DB776D, 0x24068A6E, 0xB2934EA4, 0xFFBD7068, 0x6928B4A2, 0x04F549A1, 0x92608D6B, 0x1D663B05, 0x8BF3FFCF, 0xE62E02CC, 0x70BBC606, 0x3D95F8CA, 0xAB003C00, 0xC6DDC103, 0x504805C9, 0x5C81BC9B, 0xCA147851, 0xA7C98552, 0x315C4198, 0x7C727F54, 0xEAE7BB9E, 0x873A469D, 0x11AF8257, 0x4F549A1C, 0xD9C15ED6, 0xB41CA3D5, 0x2289671F, 0x6FA759D3, 0xF9329D19, 0x94EF601A, 0x027AA4D0, 0x0EB31D82, 0x9826D948, 0xF5FB244B, 0x636EE081, 0x2E40DE4D, 0xB8D51A87, 0xD508E784, 0x439D234E, 0xCC9B9520, 0x5A0E51EA, 0x37D3ACE9, 0xA1466823, 0xEC6856EF, 0x7AFD9225, 0x17206F26, 0x81B5ABEC, 0x8D7C12BE, 0x1BE9D674, 0x76342B77, 0xE0A1EFBD, 0xAD8FD171, 0x3B1A15BB, 0x56C7E8B8, 0xC0522C72
};
private static final long[] numArr2 = new long[] {
0L, 1996959894L, 3993919788L, 2567524794L, 124634137L, 1886057615L, 3915621685L, 2657392035L, 249268274L, 2044508324L, 3772115230L, 2547177864L, 162941995L, 2125561021L, 3887607047L, 2428444049L, 498536548L, 1789927666L, 4089016648L, 2227061214L, 450548861L, 1843258603L, 4107580753L, 2211677639L, 325883990L, 1684777152L, 4251122042L, 2321926636L, 335633487L, 1661365465L, 4195302755L, 2366115317L, 997073096L, 1281953886L, 3579855332L, 2724688242L, 1006888145L, 1258607687L, 3524101629L, 2768942443L, 901097722L, 1119000684L, 3686517206L, 2898065728L, 853044451L, 1172266101L, 3705015759L, 2882616665L, 651767980L, 1373503546L, 3369554304L, 3218104598L, 565507253L, 1454621731L, 3485111705L, 3099436303L, 671266974L, 1594198024L, 3322730930L, 2970347812L, 795835527L, 1483230225L, 3244367275L, 3060149565L, 1994146192L, 31158534L, 2563907772L, 4023717930L, 1907459465L, 112637215L, 2680153253L, 3904427059L, 2013776290L, 251722036L, 2517215374L, 3775830040L, 2137656763L, 141376813L, 2439277719L, 3865271297L, 1802195444L, 476864866L, 2238001368L, 4066508878L, 1812370925L, 453092731L, 2181625025L, 4111451223L, 1706088902L, 314042704L, 2344532202L, 4240017532L, 1658658271L, 366619977L, 2362670323L, 4224994405L, 1303535960L, 984961486L, 2747007092L, 3569037538L, 1256170817L, 1037604311L, 2765210733L, 3554079995L, 1131014506L, 879679996L, 2909243462L, 3663771856L, 1141124467L, 855842277L, 2852801631L, 3708648649L, 1342533948L, 654459306L, 3188396048L, 3373015174L, 1466479909L, 544179635L, 3110523913L, 3462522015L, 1591671054L, 702138776L, 2966460450L, 3352799412L, 1504918807L, 783551873L, 3082640443L, 3233442989L, 3988292384L, 2596254646L, 62317068L, 1957810842L, 3939845945L, 2647816111L, 81470997L, 1943803523L, 3814918930L, 2489596804L, 225274430L, 2053790376L, 3826175755L, 2466906013L, 167816743L, 2097651377L, 4027552580L, 2265490386L, 503444072L, 1762050814L, 4150417245L, 2154129355L, 426522225L, 1852507879L, 4275313526L, 2312317920L, 282753626L, 1742555852L, 4189708143L, 
2394877945L, 397917763L, 1622183637L, 3604390888L, 2714866558L, 953729732L, 1340076626L, 3518719985L, 2797360999L, 1068828381L, 1219638859L, 3624741850L, 2936675148L, 906185462L, 1090812512L, 3747672003L, 2825379669L, 829329135L, 1181335161L, 3412177804L, 3160834842L, 628085408L, 1382605366L, 3423369109L, 3138078467L, 570562233L, 1426400815L, 3317316542L, 2998733608L, 733239954L, 1555261956L, 3268935591L, 3050360625L, 752459403L, 1541320221L, 2607071920L, 3965973030L, 1969922972L, 40735498L, 2617837225L, 3943577151L, 1913087877L, 83908371L, 2512341634L, 3803740692L, 2075208622L, 213261112L, 2463272603L, 3855990285L, 2094854071L, 198958881L, 2262029012L, 4057260610L, 1759359992L, 534414190L, 2176718541L, 4139329115L, 1873836001L, 414664567L, 2282248934L, 4279200368L, 1711684554L, 285281116L, 2405801727L, 4167216745L, 1634467795L, 376229701L, 2685067896L, 3608007406L, 1308918612L, 956543938L, 2808555105L, 3495958263L, 1231636301L, 1047427035L, 2932959818L, 3654703836L, 1088359270L, 936918000L, 2847714899L, 3736837829L, 1202900863L, 817233897L, 3183342108L, 3401237130L, 1404277552L, 615818150L, 3134207493L, 3453421203L, 1423857449L, 601450431L, 3009837614L, 3294710456L, 1567103746L, 711928724L, 3020668471L, 3272380065L, 1510334235L, 755167117L
};
Test with CRC32 from java.util.zip:
public static String crc32(byte[] data) {
    Checksum c = new CRC32();
    c.update(1);
    c.update(data);
    return String.valueOf(Long.parseLong(Long.toHexString(c.getValue()), 16)); // 1544046132
}
The solution from [3]
public static String crc32(byte[] data){
    int crc = Integer.MAX_VALUE;
    for (byte b : data)
        crc = numArr[(crc & 0xff) ^ (b & 0xff) & 255] ^ (crc >>> 8);
    crc = ~crc; // flip bit/sign
    return String.valueOf(Long.parseLong(Integer.toHexString(crc), 16)); // 1120211745
}
This implementation uses Guava and is my direct translation of the decompiled code. I also tried BigInteger, bit operations to enforce unsigned behavior, and byte arrays instead of int or long.
public static String crc32(byte[] data) {
    long num = UnsignedInteger.MAX_VALUE.longValue();
    for (byte aByte : data)
        num = numArr2[(Integer.parseUnsignedInt(Long.toUnsignedString(num)) ^ aByte) & 255] ^ (num >>> 8);
    num ^= UnsignedInteger.MAX_VALUE.longValue();
    return String.valueOf(num); // 2128031166
}
Update 3
I tried a new start with BigInteger and now the result is closer. The CRC is 1190225612 (expected: 1195433981).
public static String crc32(byte[] bytes) {
    BigInteger num = BigInteger.valueOf(0xffffffffL);
    for (byte aByte: bytes)
        num = BigInteger.valueOf(numArr[num.xor(BigInteger.valueOf(aByte)).and(BigInteger.valueOf(numArr.length - 1)).intValue()]).xor((num.shiftRight(8)));
    num = num.xor(BigInteger.valueOf(0xffffffffL));
    System.out.println("1195433981");
    return String.valueOf(~num.intValue());
}
Update 4
Updated the procedure description: the whole line with the Stamp tag is not removed; only the value of the CRC attribute is removed.
You have the initial CRC wrong. It is 1. From the document you linked:
The seed value shall be "1" (see previousCrc32 = 1 in Figure B.1 and Figure B.2).
That value is then inverted before the loop, so at that point your "num" should be 0xfffffffe.
Your numArr1 is correct. I don't know where your numArr2 came from.
All you need is this, calling with the first argument being 1:
static private int crc32iodd(int crc, byte[] data){
    crc = ~crc;
    for (byte b : data)
        crc = numArr1[(crc ^ b) & 0xff] ^ (crc >>> 8);
    return ~crc;
}
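As a cross-check, the lookup table does not have to be pasted in: it can be generated from the polynomial 0xEB31D82E that appears in crc32_bitwise in the question. The following is a minimal sketch under that assumption; the class and method names are hypothetical. It ports both reference routines to plain Java int arithmetic, relying on >>> for the unsigned shift.

```java
// Sketch: the firmware-update CRC from the question, using only Java int.
final class IoLinkFirmwareCrc {
    // Reflected polynomial taken from crc32_bitwise in the question.
    static final int POLY = 0xEB31D82E;

    // Build the 256-entry lookup table instead of pasting it.
    static int[] buildTable() {
        int[] table = new int[256];
        for (int i = 0; i < 256; i++) {
            int crc = i;
            for (int j = 0; j < 8; j++) {
                // -(crc & 1) is all ones when the low bit is set, otherwise zero.
                crc = (crc >>> 1) ^ (-(crc & 1) & POLY);
            }
            table[i] = crc;
        }
        return table;
    }

    // Bit-by-bit port of crc32_bitwise; >>> is the unsigned right shift.
    static int bitwise(int previousCrc, byte[] data) {
        int crc = ~previousCrc;
        for (byte b : data) {
            crc ^= (b & 0xff); // treat the byte as unsigned, like uint8_t
            for (int j = 0; j < 8; j++) {
                crc = (crc >>> 1) ^ (-(crc & 1) & POLY);
            }
        }
        return ~crc;
    }

    // Table-driven port of crc32_octetwise.
    static int octetwise(int previousCrc, byte[] data, int[] table) {
        int crc = ~previousCrc;
        for (byte b : data) {
            crc = (crc >>> 8) ^ table[(crc ^ b) & 0xff];
        }
        return ~crc;
    }
}
```

The generated entries at index 1 and index 128 come out as 0x9695C4CA and 0xEB31D82E, which matches numArr1, and both routines return the same value for the same input.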
Since Java doesn't have unsigned int, half the time your result will be negative. That's fine, since it's the exact same 32 bits as you would have as an unsigned result, just printed differently.
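If the unsigned decimal form is needed, for example to compare against a value stored in a file, the standard library (Java 8 and later) can reinterpret the same 32 bits. A small sketch with hypothetical wrapper names:

```java
// Sketch: view a 32-bit CRC held in a Java int as an unsigned value.
final class UnsignedCrcView {
    // Decimal text of the int interpreted as unsigned 32-bit.
    static String asUnsignedDecimal(int crc) {
        return Integer.toUnsignedString(crc);
    }

    // The same bits widened to a non-negative long.
    static long asUnsignedLong(int crc) {
        return Integer.toUnsignedLong(crc);
    }

    public static void main(String[] args) {
        int crc = 0x9695C4CA; // negative as a signed int
        System.out.println(crc < 0);
        System.out.println(asUnsignedDecimal(crc));
    }
}
```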
There is no sense of "closer" for the answer. The integer difference between what you want and what you're getting has no meaning in the world of CRCs, since they are not integer calculations.
Update:
Looking more closely at your links and the CRC reference code you put in your question, it appears you have referenced entirely the wrong documents for the CRC you actually need. You referenced the CRC for BLOBs used for firmware updates, which is not the CRC used for validation of the XML files. This document provides the specification for the CRC used for the XML files as the 32-bit CRC found in the ITU-T recommendation V.42. That is a different CRC using a different polynomial and different initial value. It is in fact the standard CRC-32 used by zip, and is available in Java as java.util.zip.CRC32. You don't need to write any CRC code. Just use that.
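A minimal sketch of using java.util.zip.CRC32 on a byte array follows. The class name is hypothetical, and the input "123456789" is only the conventional check string for the standard CRC-32, not IODD data.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch: standard CRC-32 over a byte array with the JDK class.
final class ZipCrcDemo {
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue(); // already an unsigned value in a long
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32(check))); // prints cbf43926
    }
}
```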
That document also notes but does not detail special processing of the XML input before computing the CRC. You will need to do more research to find out exactly what the CRC is calculated over in the XML file, and how it is pre-processed. This package has a description that notes that the associated language xml's have the CRC of main xml's appended to the end before each of their CRC calculations. However it is not made clear as to exactly how the CRC is appended.
This time please conduct adequate research and verification on your part, before coming here with questions about how to implement an incorrectly identified CRC.
For the sake of completeness.
You are right, I can use java.util.zip.CRC32. I think the problem was that in my first implementation tests I used the whole XML file, and so I thought that the description in the PDF was wrong.
And here is the solution:
If you have an IODD XML file, you have to remove the value of crc in the Stamp tag. Then, if not already done, convert the file content without the crc value to a byte array. Now you can use one of these two functions:
Implementation with java.util.zip.CRC32
public boolean isValid() {
    CRC32 m = new CRC32();
    m.update(file);
    long x = 1195433981L;
    return (m.getValue() == x);
}
Manual implementation (The table is according to the site https://rosettacode.org/wiki/CRC-32#Lingo or https://web.mit.edu/freebsd/head/sys/libkern/crc32.c)
public class Crc32 {
private static final long[] numArr2 = new long[] {
0L, 1996959894L, 3993919788L, 2567524794L, 124634137L, 1886057615L, 3915621685L, 2657392035L, 249268274L, 2044508324L, 3772115230L, 2547177864L, 162941995L, 2125561021L, 3887607047L, 2428444049L, 498536548L, 1789927666L, 4089016648L, 2227061214L, 450548861L, 1843258603L, 4107580753L, 2211677639L, 325883990L, 1684777152L, 4251122042L, 2321926636L, 335633487L, 1661365465L, 4195302755L, 2366115317L, 997073096L, 1281953886L, 3579855332L, 2724688242L, 1006888145L, 1258607687L, 3524101629L, 2768942443L, 901097722L, 1119000684L, 3686517206L, 2898065728L, 853044451L, 1172266101L, 3705015759L, 2882616665L, 651767980L, 1373503546L, 3369554304L, 3218104598L, 565507253L, 1454621731L, 3485111705L, 3099436303L, 671266974L, 1594198024L, 3322730930L, 2970347812L, 795835527L, 1483230225L, 3244367275L, 3060149565L, 1994146192L, 31158534L, 2563907772L, 4023717930L, 1907459465L, 112637215L, 2680153253L, 3904427059L, 2013776290L, 251722036L, 2517215374L, 3775830040L, 2137656763L, 141376813L, 2439277719L, 3865271297L, 1802195444L, 476864866L, 2238001368L, 4066508878L, 1812370925L, 453092731L, 2181625025L, 4111451223L, 1706088902L, 314042704L, 2344532202L, 4240017532L, 1658658271L, 366619977L, 2362670323L, 4224994405L, 1303535960L, 984961486L, 2747007092L, 3569037538L, 1256170817L, 1037604311L, 2765210733L, 3554079995L, 1131014506L, 879679996L, 2909243462L, 3663771856L, 1141124467L, 855842277L, 2852801631L, 3708648649L, 1342533948L, 654459306L, 3188396048L, 3373015174L, 1466479909L, 544179635L, 3110523913L, 3462522015L, 1591671054L, 702138776L, 2966460450L, 3352799412L, 1504918807L, 783551873L, 3082640443L, 3233442989L, 3988292384L, 2596254646L, 62317068L, 1957810842L, 3939845945L, 2647816111L, 81470997L, 1943803523L, 3814918930L, 2489596804L, 225274430L, 2053790376L, 3826175755L, 2466906013L, 167816743L, 2097651377L, 4027552580L, 2265490386L, 503444072L, 1762050814L, 4150417245L, 2154129355L, 426522225L, 1852507879L, 4275313526L, 2312317920L, 282753626L, 1742555852L, 4189708143L, 
2394877945L, 397917763L, 1622183637L, 3604390888L, 2714866558L, 953729732L, 1340076626L, 3518719985L, 2797360999L, 1068828381L, 1219638859L, 3624741850L, 2936675148L, 906185462L, 1090812512L, 3747672003L, 2825379669L, 829329135L, 1181335161L, 3412177804L, 3160834842L, 628085408L, 1382605366L, 3423369109L, 3138078467L, 570562233L, 1426400815L, 3317316542L, 2998733608L, 733239954L, 1555261956L, 3268935591L, 3050360625L, 752459403L, 1541320221L, 2607071920L, 3965973030L, 1969922972L, 40735498L, 2617837225L, 3943577151L, 1913087877L, 83908371L, 2512341634L, 3803740692L, 2075208622L, 213261112L, 2463272603L, 3855990285L, 2094854071L, 198958881L, 2262029012L, 4057260610L, 1759359992L, 534414190L, 2176718541L, 4139329115L, 1873836001L, 414664567L, 2282248934L, 4279200368L, 1711684554L, 285281116L, 2405801727L, 4167216745L, 1634467795L, 376229701L, 2685067896L, 3608007406L, 1308918612L, 956543938L, 2808555105L, 3495958263L, 1231636301L, 1047427035L, 2932959818L, 3654703836L, 1088359270L, 936918000L, 2847714899L, 3736837829L, 1202900863L, 817233897L, 3183342108L, 3401237130L, 1404277552L, 615818150L, 3134207493L, 3453421203L, 1423857449L, 601450431L, 3009837614L, 3294710456L, 1567103746L, 711928724L, 3020668471L, 3272380065L, 1510334235L, 755167117L
};
    private static long crc32(byte[] data){
        long crc = 0xFFFFFFFFL;
        for (byte b : data)
            crc = numArr2[(int)(crc ^ b) & 0xff] ^ (crc >>> 8);
        return crc ^ 0xFFFFFFFFL;
    }
    public static boolean isValid(byte[] data, long checkCrc) {
        long crc = crc32(data);
        return (checkCrc == crc);
    }
}
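The pasted table can also be generated at startup. The sketch below assumes the standard reflected CRC-32 polynomial 0xEDB88320 (not stated in this thread, but it is the one behind java.util.zip.CRC32) and compares the manual loop with the JDK class. All names are hypothetical.

```java
import java.util.zip.CRC32;

// Sketch: generate the standard CRC-32 table and compare with java.util.zip.CRC32.
final class StandardCrc32Table {
    // Assumption: standard reflected CRC-32 polynomial.
    static final long POLY = 0xEDB88320L;

    static long[] buildTable() {
        long[] table = new long[256];
        for (int i = 0; i < 256; i++) {
            long c = i;
            for (int j = 0; j < 8; j++) {
                c = ((c & 1L) != 0) ? (c >>> 1) ^ POLY : (c >>> 1);
            }
            table[i] = c;
        }
        return table;
    }

    // Same loop as the manual implementation above, with the table passed in.
    static long crc32(byte[] data, long[] table) {
        long crc = 0xFFFFFFFFL;
        for (byte b : data) {
            crc = table[(int) ((crc ^ b) & 0xff)] ^ (crc >>> 8);
        }
        return crc ^ 0xFFFFFFFFL;
    }

    // Reference value from the JDK for comparison.
    static long referenceCrc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }
}
```

The first non-zero entry is 1996959894 and the last entry is 755167117, the same values as in numArr2.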
I had the same problem, but I use Python. One comment first: CRCs are made for the detection of changes. If you change only one bit in the input stream, the CRC will be completely different. So you can't say that the calculated CRC is close to the right value.
This is my code, which calculates the right CRC:
import zlib

prev = 0
with open(file, "rb") as f:  # file is the path to the IODD with the crc value cleared
    for eachLine in f:
        prev = zlib.crc32(eachLine, prev)
crc = prev & 0xFFFFFFFF
First, you clear the crc value in your file to an empty string "". Then you use this file to calculate the CRC. Because the file is processed as binary, don't change anything besides the crc (tabs, spaces, line endings). One space, automatically removed by the IODD_Checker tool, cost me nearly a complete day.
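In Java, the same idea could be sketched as below. This assumes the stamp looks like `<Stamp crc="...">`, which is an assumption based on the "Stamp tag" and "crc attribute" wording in this thread; the class, method names and regular expression are hypothetical. Decoding as ISO-8859-1 maps every byte to one char and back, so nothing other than the attribute value is altered.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import java.util.zip.CRC32;

// Sketch: blank the crc attribute value byte-for-byte, then compute the CRC.
final class IoddStampCrc {
    // Assumed shape of the stamp element: <Stamp crc="1234567890" ...>
    private static final Pattern CRC_ATTR =
            Pattern.compile("(<Stamp\\b[^>]*?\\bcrc=\")[^\"]*(\")");

    // Returns a copy of the file bytes with the crc value replaced by "".
    static byte[] clearCrcValue(byte[] file) {
        // ISO-8859-1 round-trips all 256 byte values unchanged.
        String text = new String(file, StandardCharsets.ISO_8859_1);
        String cleared = CRC_ATTR.matcher(text).replaceFirst("$1$2");
        return cleared.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Standard CRC-32 over the file with the crc value cleared.
    static long crcOfClearedFile(byte[] file) {
        byte[] cleared = clearCrcValue(file);
        CRC32 crc = new CRC32();
        crc.update(cleared, 0, cleared.length);
        return crc.getValue();
    }
}
```

The result of crcOfClearedFile would then be compared with the number that was stored in the attribute.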
If you calculate a 32-bit CRC, there is a chance of 1 in 4294967296 that 0 is the valid CRC. Because the checker uses this value to mark the CRC as invalid, I'm not sure whether a special handling is defined.
if 0 == prev:
    crc = 0xFFFFFFFF
else:
    crc = prev & 0xFFFFFFFF
This issue I will have to check.

Java - return null or string by function parameter without if

Simple question, but I can't find the answer. I have a function
private String getStrOrNull(String str) {
    if (str == null) return null;
    else {
        return "#" + str + "#"; // ANY STRING MODIFICATION
    }
}
Can I modify this function so that it does not use any if or switch clause? For example, using a HashMap etc.
I'll explain what I need it for. I have 1000s of rows in a database and, for example, to determine the icon according to the state of every row I do it this way:
private HashMap<Integer, String> iconPaths;
public String getIconPathByState(int id) {
    return iconPaths.get(id);
}
I don't use any switches or ifs, to make it as fast as possible, as this function will be called 1000s of times.
The code "#"+str+"#" is going to take at least 100 times as long as the compare, so I think you are optimizing the wrong part here.
This is premature optimization. Today's processors, like the i7, perform 100,000 million instructions per second. 1000s of instructions are lost in the noise.
If you really think you need to speed it up, don't guess. Measure.
http://docs.oracle.com/javase/8/docs/technotes/guides/visualvm/index.html
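If the goal is only to avoid writing the if by hand, and not to gain speed, java.util.Optional (Java 8 and later) can express the same null handling. This is a sketch with a hypothetical class name; it still branches internally and allocates an Optional, so it should not be expected to be faster.

```java
import java.util.Optional;

// Sketch: null-in, null-out string decoration without an explicit if.
final class NullSafeWrap {
    static String getStrOrNull(String str) {
        return Optional.ofNullable(str)
                .map(s -> "#" + s + "#") // any string modification
                .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(getStrOrNull("abc")); // prints #abc#
        System.out.println(getStrOrNull(null));  // prints null
    }
}
```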
Here is one approach which you can use sometimes. It will be slower in this artificial example with "#"+..., but the main idea could be useful:
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

@SuppressWarnings("unchecked")
private final static String getStrOrNull(final String str) throws Exception {
    // first step: prepare a table for the different states:
    // 0 - input value is null
    // 1 - input value is some non-null string
    // in a single-threaded application you can move this array out of the function as a static variable to avoid memory allocation
    final Callable<String>[] func = new Callable[2];
    func[0] = new Callable<String>() {
        @Override
        public String call() {
            return null;
        }
    };
    func[1] = new Callable<String>() {
        @Override
        public String call() {
            return str.toLowerCase(); // any other string modification
        }
    };
    // now we need to convert our input variable into a "state" (0 or 1)
    // to do this we insert the string into a set
    // if the string is not null - it will be in the set, final size will be 1
    // if the string is null - we remove it via remove(null), final size will be 0
    final Set<String> set = new HashSet<String>();
    set.add(str);
    set.remove(null);
    return func[set.size()].call();
}
Here is another way to convert the input variable into 0 or 1:
int i = System.identityHashCode(str); // returns 0 for null; in practice non-zero for other values (not guaranteed by the specification), you can also use unsafe getaddr
// converts 0 to 0, any non-zero value to 1:
int b = ((i >>> 16) & 0xffff) | (i & 0xffff);
b = ((b >>> 8) & 0xff) | (b & 0xff);
b = ((b >>> 4) & 0xf) | (b & 0xf);
b = ((b >>> 2) & 0x3) | (b & 0x3);
b = ((b >>> 1) & 0x1) | (b & 0x1);
return func[b].call();
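The five folding steps above can also be written as a single expression: for any non-zero int, either the value or its negation has the sign bit set. The sketch below (hypothetical names) shows both variants side by side so they can be compared.

```java
// Sketch: map 0 to 0 and any non-zero int to 1 without a branch.
final class NonZeroToOne {
    // One-expression variant: (i | -i) has the sign bit set iff i != 0.
    static int nonZeroFlag(int i) {
        return (i | -i) >>> 31;
    }

    // The folding variant from the answer above, for comparison.
    static int foldFlag(int i) {
        int b = ((i >>> 16) & 0xffff) | (i & 0xffff);
        b = ((b >>> 8) & 0xff) | (b & 0xff);
        b = ((b >>> 4) & 0xf) | (b & 0xf);
        b = ((b >>> 2) & 0x3) | (b & 0x3);
        b = ((b >>> 1) & 0x1) | (b & 0x1);
        return b;
    }
}
```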

Scala: Function test a string for unique char

Solved! Solution at the bottom.
I'm porting some Java code to Scala for fun and I stumbled upon a pretty nifty way of bit-shifting in Java. The Java code below takes a String as input and tests if it consists of unique characters.
public static boolean isUniqueChars(String str) {
    if (str.length() > 256) return false;
    int checker = 0;
    for (int i = 0; i < str.length(); i++) {
        int val = str.charAt(i) - 'a';
        if ((checker & (1 << val)) > 0) return false;
        checker |= (1 << val);
    }
    return true;
}
Full listing is here: https://github.com/marvin-hansen/ctci/blob/master/java/Chapter%201/Question1_1/Question.java
How the code exactly works is explained here:
How does this Java code which determines whether a String contains all unique characters work?
Porting this directly to Scala doesn't really work so I'm looking for a more functional way to re-write the stuff above.
I have tried BigInt & BitSet
def isUniqueChars2(str: String): Boolean = {
  // A Java char is a 16-bit UTF-16 code unit, so there are 65536 possible values
  if (str.length > 65536) return false
  var checker = BigInt(0) // BigInt is immutable, so keep the updated value in a var
  for (i <- 0 until str.length) {
    val value = str.charAt(i).toInt
    if (checker.testBit(value)) return false
    checker = checker.setBit(value)
  }
  true
}
This works, but without bit-shifting and without the lowercase assumption.
The performance is unknown.
However, I would like a more functional-style solution.
Thanks to user3189923 for the solution.
def isUniqueChars(str : String) = str.distinct == str
That's it. Thank you.
str.distinct == str
In general, method distinct preserves order of occurrence after removing duplicates. Consider
implicit class RichUnique(val str: String) extends AnyVal {
  def isUniqueChars() = str.distinct == str
}
and so
"abc".isUniqueChars
res: Boolean = true
"abcc".isUniqueChars
res: Boolean = false
How about:
str.toSet.size == str.size
?
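For comparison with the original Java, the same "distinct" idea can be sketched with Java 8 streams, next to a java.util.BitSet variant that keeps the early exit of the bit-shifting version while covering every char value. The class and method names are hypothetical.

```java
import java.util.BitSet;

// Sketch: two Java ways to test whether all chars in a string are unique.
final class UniqueChars {
    // Counterpart of str.distinct == str: compare the distinct count with the length.
    static boolean isUniqueChars(String str) {
        return str.chars().distinct().count() == str.length();
    }

    // One bit per char value; stops at the first repeated char.
    static boolean isUniqueCharsBitSet(String str) {
        BitSet seen = new BitSet();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (seen.get(c)) return false;
            seen.set(c);
        }
        return true;
    }
}
```

Unlike the int-based checker, neither variant assumes lowercase 'a' to 'z' input.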

Save to binary file

I'd like to save data to a binary file using Java. For example, I have the number 101, and with my program the output file has 4 bytes. How can I save the number in only three bits (101) in the output file?
My program looks like this:
public static void main(String args[]) throws FileNotFoundException, IOException {
int i = 101;
DataOutputStream os = new DataOutputStream(new FileOutputStream("file"));
os.writeInt(i);
os.close();
}
I found something like that: http://www.developer.nokia.com/Community/Wiki/Bit_Input/Output_Stream_utility_classes_for_efficient_data_transfer
You can not write less than one byte to a file. If you want to write the binary number 101 then do int i = 5 and use os.write(i) instead. This will write one byte: 0000 0101.
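The size difference can be seen without touching the file system by writing into a ByteArrayOutputStream. This is a sketch with hypothetical names; the same DataOutputStream calls apply to a FileOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch: writeInt emits four bytes, write(int) emits only the low byte.
final class ByteVsIntDemo {
    static byte[] writeAsInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream os = new DataOutputStream(bytes)) {
            os.writeInt(value); // big-endian, 4 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] writeAsByte(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream os = new DataOutputStream(bytes)) {
            os.write(value); // low 8 bits only, 1 byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeAsInt(0b101).length);  // prints 4
        System.out.println(writeAsByte(0b101).length); // prints 1
    }
}
```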
First off, you can't write just 3 bits to a file: the smallest unit you can write is one byte (8 bits), so anything smaller is padded to a full byte.
Secondly, the decimal number 101 written in binary is 0b01100101. The binary number 0b00000101 is decimal 5.
Thirdly, both of these numbers fit in 1 byte (8 bits), so you can use a byte instead of an int.
And last but not least, to write a single byte instead of a 4-byte int, use os.write().
So to get what you want, first check whether you want to write 0b01100101 or 0b00000101. Change the int to a byte and to the appropriate number (you can write 0b01100101 as a literal in Java 7 and later). And use os.write().
A really naive implementation, I hope it helps you grasp the idea. Also untested, likely to contain off-by-one errors etc...
class BitArray {
    // public fields for example code, in real code encapsulate
    public int bits = 0; // actual count of stored bits
    public byte[] buf = new byte[1];

    private void doubleBuf() {
        byte[] tmp = new byte[buf.length * 2];
        System.arraycopy(buf, 0, tmp, 0, buf.length);
        buf = tmp;
    }

    private int arrayIndex(int bitNum) {
        return bitNum / 8;
    }

    private int bitNumInArray(int bitNum) {
        return bitNum & 7; // change to change bit order in buf's bytes
    }

    // returns how many elements of buf are actually in use, for saving etc.
    // note that last element usually contains unused bits.
    public int getUsedArrayElements() {
        return (this.bits + 7) / 8; // rounds up, and gives 0 when no bits are stored
    }

    // bitValue is 0 for 0, non-0 for 1
    public void setBit(int bitValue, int bitNum) {
        if (bitNum >= this.bits || bitNum < 0) throw new IllegalArgumentException();
        if (bitValue == 0) this.buf[arrayIndex(bitNum)] &= ~(1 << bitNumInArray(bitNum));
        else this.buf[arrayIndex(bitNum)] |= 1 << bitNumInArray(bitNum);
    }

    public void addBit(int bitValue) {
        // this.bits is old bit count, which is same as index of new last bit
        if (this.buf.length <= arrayIndex(this.bits)) doubleBuf();
        ++this.bits;
        setBit(bitValue, this.bits - 1);
    }

    int readBit(int bitNum) { // return 0 or 1
        if (bitNum >= this.bits || bitNum < 0) throw new IllegalArgumentException();
        int value = buf[arrayIndex(bitNum)] & (1 << bitNumInArray(bitNum));
        return (value == 0) ? 0 : 1;
    }

    void addBits(int bitCount, int bitValues) {
        for (int num = bitCount - 1; num >= 0; --num) {
            // change loop iteration order to change bit order of bitValues
            addBit(bitValues & (1 << num));
        }
    }
}
For efficient solution, it should use int or long array instead of byte array, and include more efficient method for multiple bit addition (add parts of bitValues an entire buf array element at a time, not bit-by-bit like above).
To save this, you need to save the right number of bytes from buf, calculated by getUsedArrayElements().
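As a smaller, self-contained illustration of the same packing idea, the sketch below turns a string of '0' and '1' characters into bytes. It is hypothetical code, not part of the class above, and it packs the first bit into the highest bit of the first byte and pads the last byte with zeros; that bit order is an arbitrary choice.

```java
// Sketch: pack a string of '0'/'1' characters into bytes, first bit in the high position.
final class BitPacker {
    static byte[] pack(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8]; // round up to whole bytes
        for (int i = 0; i < bits.length(); i++) {
            char c = bits.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("not a bit: " + c);
            }
            if (c == '1') {
                out[i / 8] |= 0x80 >>> (i % 8); // set the bit, counting from the left
            }
        }
        return out;
    }
}
```

With this layout, the three bits 101 end up as the single byte 1010 0000, which can then be written with a single write call; the reader has to know the bit count separately, because the padding bits are not distinguishable from data.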

Bitwise operations in Java: Test if in "1010101111011" a bit is set?

Let's say I have some user input (as a String) like "11010011011".
Now I want to check if the bit at a particular position is set (each digit should act as a flag).
Note: I am receiving the user's input as a String.
How can I do that?
You could work with the string as is. Say you want to check the first bit on the left:
if (input.charAt(0) == '1') { /* the bit is set */ }
Alternatively if you want to work with a BitSet you can initialise it in a loop:
String input = "11010011011";
BitSet bs = new BitSet(input.length());
int i = 0;
for (char c : input.toCharArray()) {
    if (c == '1') bs.set(i);
    i++;
}
Then to check if the i-th bit is set:
boolean isSet = bs.get(i);
If you want to use bitwise operations, then first convert the string to integer and test with bitmasks:
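int val = Integer.parseInt("11010011011", 2);
System.out.println((val & (1 << 0)) != 0); // first bit from the right
System.out.println((val & (1 << 1)) != 0); // second bit from the right
System.out.println((val & (1 << 2)) != 0); // third bit from the right
// ... and so on for higher bits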
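Note that the approaches count positions from different ends: charAt(0) and the BitSet loop start at the leftmost character, while a bitmask on the parsed number treats the rightmost digit as bit 0. A sketch that makes the right-to-left convention explicit in both styles (hypothetical names; Long.parseLong limits the mask variant to 63 digits):

```java
// Sketch: test bit n of a binary string, where bit 0 is the rightmost character.
final class BitStringTest {
    // Works directly on the string; no length limit.
    static boolean isSetFromRight(String bits, int n) {
        return bits.charAt(bits.length() - 1 - n) == '1';
    }

    // Parses the string as a binary number and tests with a shift and mask.
    static boolean isSetViaMask(String bits, int n) {
        long val = Long.parseLong(bits, 2);
        return ((val >>> n) & 1L) == 1L;
    }

    public static void main(String[] args) {
        String input = "11010011011";
        System.out.println(isSetFromRight(input, 0)); // prints true
        System.out.println(isSetViaMask(input, 2));   // prints false
    }
}
```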
