As the title suggests this is a question about an implementation detail from HashMap#resize - that's when the inner array is doubled in size.
It's a bit wordy, but I've really tried to prove that I did my best understanding this...
This happens at a point when entries in this particular bucket/bin are stored in a linked-list fashion - thus having an exact order, which is important in the context of this question.
Generally the resize could be called from other places as well, but let's look at this case only.
Suppose you put these strings as keys in a HashMap (on the right there's the hashcode after HashMap#hash - that's the internal re-hashing.) Yes, these are carefully generated, not random.
DFHXR - 11111
YSXFJ - 01111
TUDDY - 11111
AXVUH - 01111
RUTWZ - 11111
DEDUC - 01111
WFCVW - 11111
ZETCU - 01111
GCVUR - 11111
There's a simple pattern to notice here: the last 4 bits are the same for all of them, which means that when we insert 8 of these keys (there are 9 total), they will end up in the same bucket; and on the 9th HashMap#put, resize will be called.
So if there are currently 8 entries (having one of the keys above) in the HashMap, it means there are 16 buckets in this map and the last 4 bits of the key decide where the entries end up.
We put the ninth key. At this point TREEIFY_THRESHOLD is hit and resize is called. The number of buckets is doubled to 32, and one more bit from the key decides where an entry will go (so, 5 bits now).
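For anyone who wants to check this, here is a minimal sketch (mine, not part of the original question) that applies the same spreading step as java.util.HashMap#hash and prints the bucket index for 16 and for 32 buckets:

import java.util.List;

public class BucketCheck {
    // identical to the spreading done in java.util.HashMap#hash
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        List<String> keys = List.of("DFHXR", "YSXFJ", "TUDDY", "AXVUH",
                "RUTWZ", "DEDUC", "WFCVW", "ZETCU", "GCVUR");
        for (String k : keys) {
            // with 16 buckets only the last 4 bits matter; with 32 buckets, 5 bits
            System.out.printf("%s -> bucket(16)=%2d, bucket(32)=%2d%n",
                    k, hash(k) & 15, hash(k) & 31);
        }
    }
}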
Ultimately this piece of code is reached (when resize happens):
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
    next = e.next;
    if ((e.hash & oldCap) == 0) {
        if (loTail == null)
            loHead = e;
        else
            loTail.next = e;
        loTail = e;
    }
    else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e;
    }
} while ((e = next) != null);
if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead;
}
if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead;
}
It's actually not that complicated... what it does is split the current bin into entries that will move to another bin and entries that will not move - they will stay in this one for sure.
And it's actually pretty smart how it does that - it's via this piece of code:
if ((e.hash & oldCap) == 0)
What this does is check whether the next bit (the 5th in our case) is zero: if it is, this entry will stay where it is; if it's not, it will move by an offset of oldCap (a power of two) in the new table.
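To make that concrete, here is a hedged, stand-alone rendering of the same decision for oldCap = 16 (variable names are mine, not the JDK's):

int oldCap = 16;                      // old number of buckets (a power of two)
int hash = 0b1_1111;                  // a spread hash whose 5th bit happens to be set
int oldIndex = hash & (oldCap - 1);   // index in the old table (last 4 bits)
if ((hash & oldCap) == 0) {
    // 5th bit is zero: the entry keeps index oldIndex in the new table
} else {
    // 5th bit is one: the entry moves to oldIndex + oldCap in the new table
}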
And now, finally, the question: that piece of code in resize is carefully written so that it preserves the order the entries had in that bin.
So after you put those 9 keys in the HashMap the order is going to be :
DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin)
YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)
Why would you want to preserve the order of some entries in a HashMap? Relying on order in a Map is really bad, as detailed here or here.
The design consideration has been documented within the same source file, in a code comment at line 211:
* When bin lists are treeified, split, or untreeified, we keep
* them in the same relative access/traversal order (i.e., field
* Node.next) to better preserve locality, and to slightly
* simplify handling of splits and traversals that invoke
* iterator.remove. When using comparators on insertion, to keep a
* total ordering (or as close as is required here) across
* rebalancings, we compare classes and identityHashCodes as
* tie-breakers.
Since removing mappings via an iterator can’t trigger a resize, the reasons to retain the order specifically in resize are “to better preserve locality, and to slightly simplify handling of splits”, as well as being consistent regarding the policy.
There are two common reasons for maintaining order in bins implemented as a linked list:
One is that you maintain order by increasing (or decreasing) hash-value.
That means when searching a bin you can stop as soon as the current item is greater (or less, as applicable) than the hash being searched for.
Another approach involves moving entries to the front (or nearer the front) of the bucket when accessed or just adding them to the front. That suits situations where the probability of an element being accessed is high if it has just been accessed.
I've looked at the source for JDK 8 and it appears to be (at least for the most part) doing the latter, i.e., the passive version (add to front):
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashMap.java
While it's true that you should never rely on iteration order from containers that don't guarantee it, that doesn't mean that it can't be exploited for performance if it's structural. Also notice that the implementation of a class is in a privileged position to exploit details of its implementation in a formal way that a user of that class should not.
If you look at the source and understand how its implemented and exploit it, you're taking a risk. If the implementer does it, that's a different matter!
Note:
I have an implementation of an algorithm that relies heavily on a hash-table called Hashlife. It uses this model: a hash-table whose size is a power of two, because (a) you can get the entry by bit-masking (& mask) rather than a division and (b) rehashing is simplified because you only ever 'unzip' hash-bins.
Benchmarking shows that algorithm gaining around 20% by actively moving patterns to the front of their bin when accessed.
The algorithm pretty much exploits repeating structures in cellular automata, which are common so if you've seen a pattern the chances of seeing it again are high.
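For illustration only (this is my own sketch, not the Hashlife code and not what java.util.HashMap does), a bucket with that "move to the front when accessed" policy could look like this:

class Bucket<K, V> {
    static class Entry<K, V> {
        final K key; V value; Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) { this.key = key; this.value = value; this.next = next; }
    }

    private Entry<K, V> head;

    V get(K key) {
        Entry<K, V> prev = null;
        for (Entry<K, V> e = head; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev != null) {      // unlink the hit and splice it to the front
                    prev.next = e.next;
                    e.next = head;
                    head = e;
                }
                return e.value;
            }
        }
        return null;                     // not in this bucket
    }
}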
Order in a Map is really bad [...]
It's not bad, it's (in academic terminology) whatever. What Stuart Marks wrote at the first link you posted:
[...] preserve flexibility for future implementation changes [...]
Which means (as I understand it) that the implementation currently happens to keep the order, but if a better implementation is found in the future, it will be used whether it keeps the order or not.
Related
As per the following link document: Java HashMap Implementation
I'm confused by the implementation of HashMap (or rather, by an enhancement in HashMap). My queries are:
Firstly
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
static final int MIN_TREEIFY_CAPACITY = 64;
Why and how are these constants used? I want some clear examples for this.
How do they achieve a performance gain with this?
Secondly
If you see the source code of HashMap in JDK, you will find the following static inner class:
static final class TreeNode<K, V> extends java.util.LinkedHashMap.Entry<K, V> {
    HashMap.TreeNode<K, V> parent;
    HashMap.TreeNode<K, V> left;
    HashMap.TreeNode<K, V> right;
    HashMap.TreeNode<K, V> prev;
    boolean red;

    TreeNode(int arg0, K arg1, V arg2, HashMap.Node<K, V> arg3) {
        super(arg0, arg1, arg2, arg3);
    }

    final HashMap.TreeNode<K, V> root() {
        HashMap.TreeNode arg0 = this;
        while (true) {
            HashMap.TreeNode arg1 = arg0.parent;
            if (arg0.parent == null) {
                return arg0;
            }
            arg0 = arg1;
        }
    }
    //...
}
How is it used? I just want an explanation of the algorithm.
HashMap contains a certain number of buckets. It uses hashCode to determine which bucket to put these into. For simplicity's sake imagine it as a modulus.
If our hashcode is 123456 and we have 4 buckets, 123456 % 4 = 0 so the item goes in the first bucket, Bucket 1.
If our hashCode function is good, it should provide an even distribution so that all the buckets will be used somewhat equally. In this case, the bucket uses a linked list to store the values.
But you can't rely on people to implement good hash functions. People will often write poor hash functions which will result in a non-even distribution. It's also possible that we could just get unlucky with our inputs.
The less even this distribution is, the further we're moving from O(1) operations and the closer we're moving towards O(n) operations.
The implementation of HashMap tries to mitigate this by organising some buckets into trees rather than linked lists if the buckets become too large. This is what TREEIFY_THRESHOLD = 8 is for. If a bucket contains more than eight items, it should become a tree.
This tree is a Red-Black tree, presumably chosen because it offers some worst-case guarantees. It is first sorted by hash code. If the hash codes are the same, it uses the compareTo method of Comparable if the objects implement that interface, else the identity hash code.
If entries are removed from the map, the number of entries in the bucket might reduce such that this tree structure is no longer necessary. That's what the UNTREEIFY_THRESHOLD = 6 is for. If the number of elements in a bucket drops below six, we might as well go back to using a linked list.
Finally, there is the MIN_TREEIFY_CAPACITY = 64.
When a hash map grows in size, it automatically resizes itself to have more buckets. If we have a small HashMap, the likelihood of us getting very full buckets is quite high, because we don't have that many different buckets to put stuff into. It's much better to have a bigger HashMap, with more buckets that are less full. This constant basically says not to start making buckets into trees if our HashMap is very small - it should resize to be larger first instead.
To answer your question about the performance gain, these optimisations were added to improve the worst case. You would probably only see a noticeable performance improvement because of these optimisations if your hashCode function was not very good.
It is designed to protect against bad hashCode implementations and also provides basic protection against collision attacks, where a bad actor may attempt to slow down a system by deliberately selecting inputs which occupy the same buckets.
To put it more simply (as simply as I could), plus some more details.
These properties depend on a lot of internal things that would be very cool to understand - before moving to them directly.
TREEIFY_THRESHOLD -> when a single bucket reaches this size (and the total number of buckets reaches MIN_TREEIFY_CAPACITY), it is transformed into a (balanced) red/black tree node. Why? Because of search speed. Think about it in a different way:
it would take at most 32 steps to search for an Entry within a bucket/bin with Integer.MAX_VALUE entries.
Some intro for the next topic. Why is the number of bins/buckets always a power of two? At least two reasons: bit-masking is faster than a modulo operation, and modulo on negative numbers is negative. And you can't put an Entry into a "negative" bucket:
int arrayIndex = hashCode % numberOfBuckets; // can be negative
buckets[arrayIndex] = entry;                 // will obviously fail for a negative index
There is a nice trick used instead of modulo:
(n - 1) & hash // n is the number of bins, hash - is the hash function of the key
That is semantically the same as modulo operation. It will keep the lower bits. This has an interesting consequence when you do:
Map<String, String> map = new HashMap<>();
In the case above (a default HashMap starts with 16 buckets), the decision of where an entry goes is based on only the last 4 bits of your hashcode.
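A quick way to convince yourself of that (my own snippet, not from the answer): for a power-of-two n, the mask gives the same result as a negative-safe modulo.

int n = 16;                               // default HashMap capacity
int hash = "DFHXR".hashCode();            // any int, possibly negative
int viaMask  = (n - 1) & hash;            // what HashMap actually does
int viaFloor = Math.floorMod(hash, n);    // a negative-safe modulo
System.out.println(viaMask + " == " + viaFloor);   // always equal when n is a power of two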
This is where doubling the number of buckets comes into play. Under certain conditions (it would take a long time to explain in exact detail), buckets are doubled in size. Why? Because when the number of buckets is doubled, one more bit comes into play.
So you have 16 buckets: the last 4 bits of the hashcode decide where an entry goes. You double the buckets to 32: now the last 5 bits decide where an entry will go.
This process is called re-hashing, and it can be slow. That is why (for people who care) HashMap is jokingly described as: fast, fast, fast, slooow. There are other implementations - search for "pauseless hashmap"...
Now UNTREEIFY_THRESHOLD comes into play after re-hashing. At that point, some entries might move from this bin to others (one more bit is added to the (n-1) & hash computation, so they might move to other buckets), and the number of entries left in a bin might drop to UNTREEIFY_THRESHOLD. At that point it does not pay off to keep the bin as a red-black tree node; it becomes a LinkedList instead, like
entry.next.next....
MIN_TREEIFY_CAPACITY is the minimum number of buckets before a certain bucket is transformed into a Tree.
TreeNode is an alternative way to store the entries that belong to a single bin of the HashMap. In older implementations, the entries of a bin were stored in a linked list. In Java 8, if the number of entries in a bin passes a threshold (TREEIFY_THRESHOLD), they are stored in a tree structure instead of the original linked list. This is an optimization.
From the implementation:
/*
* Implementation notes.
*
* This map usually acts as a binned (bucketed) hash table, but
* when bins get too large, they are transformed into bins of
* TreeNodes, each structured similarly to those in
* java.util.TreeMap. Most methods try to use normal bins, but
* relay to TreeNode methods when applicable (simply by checking
* instanceof a node). Bins of TreeNodes may be traversed and
* used like any others, but additionally support faster lookup
* when overpopulated. However, since the vast majority of bins in
* normal use are not overpopulated, checking for existence of
* tree bins may be delayed in the course of table methods.
You would need to visualize it: say there is a class Key with only the hashCode() method overridden to always return the same value:
public class Key implements Comparable<Key> {
    private String name;

    public Key(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return 1;
    }

    public String keyName() {
        return this.name;
    }

    @Override
    public int compareTo(Key key) {
        // returns a negative, zero or positive integer
        return this.name.compareTo(key.name);
    }
}
and then somewhere else, I am inserting 9 entries into a HashMap with all keys being instances of this class. e.g.
Map<Key, String> map = new HashMap<>();
Key key1 = new Key("key1");
map.put(key1, "one");
Key key2 = new Key("key2");
map.put(key2, "two");
Key key3 = new Key("key3");
map.put(key3, "three");
Key key4 = new Key("key4");
map.put(key4, "four");
Key key5 = new Key("key5");
map.put(key5, "five");
Key key6 = new Key("key6");
map.put(key6, "six");
Key key7 = new Key("key7");
map.put(key7, "seven");
Key key8 = new Key("key8");
map.put(key8, "eight");
// Since the hashcode is the same, all entries will land in the same bucket, let's call it bucket 1. Up to here all entries in bucket 1 will be arranged in a LinkedList structure, e.g. key1 -> key2 -> key3 -> ... and so on. But when I insert one more entry:
Key key9 = new Key("key9");
map.put(key9, "nine");
the threshold value of 8 will be reached and it will rearrange bucket 1's entries into a Tree (red-black) structure, replacing the old linked list, e.g.
         key1
        /    \
    key2      key3
   /    \    /    \
Tree traversal is faster {O(log n)} than LinkedList {O(n)} and as n grows, the difference becomes more significant.
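If you want to try this out, here is a hedged sketch (mine, not the asker's code); it keeps a reference to one key for the final lookup because the Key class above does not override equals:

Map<Key, Integer> collisions = new HashMap<>();
Key probe = null;
for (int i = 0; i < 100_000; i++) {
    Key k = new Key("key" + i);
    collisions.put(k, i);                 // every key lands in the same bucket
    if (i == 99_999) probe = k;           // remember one instance for the lookup
}
// Still found quickly: the overgrown bucket has been turned into a red-black tree,
// navigated via compareTo, so the lookup is O(log n) within the bin.
System.out.println(collisions.get(probe));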
The change in HashMap implementation was added with JEP-180. The purpose was to:
Improve the performance of java.util.HashMap under high hash-collision conditions by using balanced trees rather than linked lists to store map entries. Implement the same improvement in the LinkedHashMap class
However, pure performance is not the only gain. It also prevents a HashDoS attack, in case a hash map is used to store user input, because the red-black tree that is used to store data in the bucket has a worst-case insertion complexity of O(log n). The tree is used after a certain criterion is met - see Eugene's answer.
To understand the internal implementation of hashmap, you need to understand the hashing.
Hashing, in its simplest form, is a way of assigning a unique code to any variable/object by applying a formula/algorithm to its properties.
A true hash function must follow this rule –
“Hash function should return the same hash code each and every time when the function is applied on same or equal objects. In other words, two equal objects must produce the same hash code consistently.”
I have sort of a hard time understanding an implementation detail from Java 9's ImmutableCollections.SetN; specifically, why there is a need for the internal array to be twice the size.
Suppose you do this:
Set.of(1,2,3,4) // 4 elements, but internal array is 8
To be more exact, I perfectly understand why this is done (a doubled expansion) in the case of a HashMap, where you (almost) never want the load factor to be one. A value != 1 improves search time, as entries are better dispersed across buckets, for example.
But in the case of an immutable Set, I can't really tell - especially considering the way an index into the internal array is chosen.
Let me provide some details. First how the index is searched:
int idx = Math.floorMod(pe.hashCode() ^ SALT, elements.length);
pe is the actual value we put in the set. SALT is just 32 bits generated at start-up, once per JVM (this is the actual randomization if you want). elements.length for our example is 8 (4 elements, but 8 here - double the size).
This expression is like a negative-safe modulo operation. Notice that the same logical thing is done in HashMap for example ((n - 1) & hash) when the bucket is chosen.
So if elements.length is 8 for our case, then this expression will return any positive value that is less than 8 (0, 1, 2, 3, 4, 5, 6, 7).
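As a small stand-in (the values below are made up, only the shape of the computation matches), you can see that the index always lands in range, even for negative hash codes:

int SALT = 0x6AF8D2B1;                                 // stand-in for the per-JVM random salt
Object pe = "some element";
int len = 8;                                           // elements.length in this example
int idx = Math.floorMod(pe.hashCode() ^ SALT, len);    // always in 0..7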
Now the rest of the method:
while (true) {
    E ee = elements[idx];
    if (ee == null) {
        return -idx - 1;
    } else if (pe.equals(ee)) {
        return idx;
    } else if (++idx == elements.length) {
        idx = 0;
    }
}
Let's break it down:
if (ee == null) {
    return -idx - 1;
This is good, it means that the current slot in the array is empty - we can put our value there.
} else if (pe.equals(ee)) {
    return idx;
This is bad - the slot is occupied and the entry already in place is equal to the one we want to put. Sets can't have duplicate elements, so an Exception is thrown later.
else if (++idx == elements.length) {
    idx = 0;
}
This means that this slot is occupied (hash collision), but the elements are not equal. In a HashMap this entry would be put into the same bucket as a linked node or TreeNode - but that's not the case here.
So index is incremented and the next position is tried (with the small caveat that it moves in a circular way when it reaches the last position).
And here is the question: if nothing too fancy (unless I'm missing something) is done when searching for the index, why is there a need for an array twice as big? Or why was the function not written like this:
int idx = Math.floorMod(pe.hashCode() ^ SALT, input.length);
// notice the diff elements.length (8) and not input.length (4)
The current implementation of SetN is a fairly simple closed hashing scheme, as opposed to the separate chaining approach used by HashMap. ("Closed hashing" is also confusingly known as "open addressing".) In a closed hashing scheme, elements are stored in the table itself, instead of being stored in a list or tree of elements that are linked from each table slot, which is separate chaining.
This implies that if two different elements hash to the same table slot, this collision needs to be resolved by finding another slot for one of the elements. The current SetN implementation resolves this using linear probing, where the table slots are checked sequentially (wrapping around at the end) until an open slot is found.
If you want to store N elements, they'll certainly fit into a table of size N. You can always find any element that's in the set, though you might have to probe several (or many) successive table slots to find it, because there will be lots of collisions. But if the set is probed for an object that's not a member, linear probing will have to check every table slot before it can determine that object isn't a member. With a full table, most probe operations will degrade to O(N) time, whereas the goal of most hash-based approaches is for operations to be O(1) time.
Thus we have a classic space-time tradeoff. If we make the table larger, there will be empty slots sprinkled throughout the table. When storing items, there should be fewer collisions, and linear probing will find empty slots more quickly. The clusters of full slots next to each other will be smaller. Probes for non-members will proceed more quickly, since they're more likely to encounter an empty slot sooner while probing linearly -- possibly after not having to reprobe at all.
In bringing up the implementation, we ran a bunch of benchmarks using different expansion factors. (I used the term EXPAND_FACTOR in the code whereas most literature uses load factor. The reason is that the expansion factor is the reciprocal of the load factor, as used in HashMap, and using "load factor" for both meanings would be confusing.) When the expansion factor was near 1.0, the probe performance was quite slow, as expected. It improved considerably as the expansion factor was increased. The improvement was really flattening out by the time it got up to 3.0 or 4.0. We chose 2.0 since it got most of the performance improvement (close to O(1) time) while providing good space savings compared to HashSet. (Sorry, we haven't published these benchmark numbers anywhere.)
Of course, all of these are implementation specifics and may change from one release to the next, as we find better ways to optimize the system. I'm certain there are ways to improve the current implementation. (And fortunately we don't have to worry about preserving iteration order when we do this.)
A good discussion of open addressing and performance tradeoffs with load factors can be found in section 3.4 of
Sedgewick, Robert and Kevin Wayne. Algorithms, Fourth Edition. Addison-Wesley, 2011.
The online book site is here but note that the print edition has much more detail.
Referencing a previous answer to a question on SO, there is a method used called TestForNull. This was my original code before I was told I could make it more efficient:
My original code:
for (int i = 0; i < temp.length; i++) {
    if (map.containsKey(temp[i]))
        map.put(temp[i], map.get(temp[i]) + 1);
    else
        map.put(temp[i], 1);
}
In this snippet, I'm doing three look-ups to the map. I was told that this could be accomplished in just one lookup, so I ended up looking for an answer on SO and found the linked answer, and modified my code to look like:
My modified code:
for (int i = 0; i < temp.length; i++) {
    Integer value = map.get(temp[i]);
    if (value != null)
        map.put(temp[i], value + 1);
    else
        map.put(temp[i], 1);
}
Even though it seems better, it looks like two look-ups to me and not one. I was wondering if there was an implementation of this that only uses one, and if it can be done without the use of third-party libraries. If it helps I'm using a HashMap for my program.
Java 8 has added a number of default methods to the Map interface that could help, including merge:
map.merge(temp[i], 1, v -> v + 1);
And compute:
map.compute(temp[i], (k, v) -> v == null ? 1 : v + 1);
HashMap's implementations of these methods are appropriately optimized to effectively only perform a single key lookup. (Curiously, the same cannot be said for TreeMap.)
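For example, the counting loop from the question collapses to a single call per element (assuming temp is a String[], as the question implies):

Map<String, Integer> map = new HashMap<>();
for (String word : temp) {
    map.merge(word, 1, Integer::sum);   // one lookup: insert 1, or add 1 to the existing count
}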
@John Kugelman's answer is the best (as long as you can use Java 8).
The first example has a worst case of 3 map calls (in the case of a value already present):
containsKey
get
put
The second example always has exactly 2 calls (and a null check):
get
put
So you are basically trading containsKey for a null check.
In a HashMap, these operations are roughly constant time, assuming good hash code distribution (and that the distribution works well with the size of the HashMap). Other Map implementations (such as TreeMap) have log(n) execution time. Even in the case of HashMap, a null check will be faster than containsKey, making the second option the winner. However, you are unlikely to see a measurable difference unless you have poorly distributed hash codes (or this is the only thing your application is doing) or poorly performing equals checks.
I've got a lot of true/false results saved as bits in long[] arrays. I do have a huge number of these (millions and millions of longs).
For example, say I have only five results, I'd have:
+----- condition 5 is true
|
|+---- condition 4 is false
||
||+--- condition 3 is true
|||
|||+-- condition 2 is true
||||
||||+- condition 1 is false
10110
I also do have a few trees representing statements like:
condition1 AND (condition2 OR (condition3 AND condition4))
The trees are very simple but very long. They basically look like this (it's an oversimplification below, just to show what I've got):
class Node {
    int operator();
    List<Node> nodes;
    int conditionNumber();
}
Basically, either the Node is a leaf, and then has a condition number (matching one of the bits in the long[] arrays), or the Node is not a leaf and hence refers to several subnodes.
They're simple, yet they allow expressing complicated boolean expressions. It works great.
So far so good, everything is working great. However, I do have a problem: I need to evaluate a LOT of expressions, determining if they're true or false. Basically I need to do some brute-force computation for a problem for which there's no known better solution than brute-forcing.
So I need to walk the tree and answer either true or false depending on the content of the tree and the content of the long[].
The method I need to optimize looks like this:
boolean solve( Node node, long[] trueorfalse ) {
...
}
where on the first call the node is the root node and then, obviously, subnodes (being recursive, that solve method calls itself).
Knowing that I'll only have a few trees (maybe up to a hundred or so) but millions and millions of long[] to check, what steps can I take to optimize this?
The obvious recursive solution passes parameters (the (sub)tree and the long[], I could get rid of the long[] by not passing it as a parameter) and is quite slow with all the recursive calls etc. I need to check which operator is used (AND or OR or NOT etc.) and there's quite a lot of if/else or switch statements involved.
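For reference, a plain recursive evaluator along those lines might look like the sketch below. The operator encoding and the field-style access (operator, conditionNumber, nodes) are my assumptions, since the Node above is explicitly an oversimplification:

static final int AND = 0, OR = 1, NOT = 2, LEAF = 3;   // assumed operator encoding

static boolean solve(Node node, long[] trueorfalse) {
    switch (node.operator) {
        case LEAF: {                                    // leaf: test one bit in the packed array
            int i = node.conditionNumber;
            return (trueorfalse[i >>> 6] & (1L << (i & 63))) != 0;
        }
        case AND:
            for (Node child : node.nodes)
                if (!solve(child, trueorfalse)) return false;   // short-circuit
            return true;
        case OR:
            for (Node child : node.nodes)
                if (solve(child, trueorfalse)) return true;     // short-circuit
            return false;
        case NOT:
            return !solve(node.nodes.get(0), trueorfalse);
        default:
            throw new IllegalArgumentException("unknown operator: " + node.operator);
    }
}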
I'm not looking for another algorithm (there aren't) so I'm not looking for going from O(x) to O(y) where y would be smaller than x.
What I'm looking for is "times x" speedup: if I can write code performing 5x faster, then I'll have a 5x speedup and that's it and I'd be very happy with it.
The only enhancement I see as of now -- and I think it would be a huge "times x" speedup compared to what I have now -- would be to generate bytecode for every tree and have the logic for every tree hardcoded into a class. It should work well because I'll only ever have a hundred or so trees (but the trees aren't fixed: I cannot know in advance what the trees are going to look like, otherwise it would be trivial to simply hardcode every tree manually).
Any idea besides generating bytecode for every tree?
Now if I want to try the bytecode generation route, how should I go about it?
In order to maximize the opportunities for shortcut evaluation, you need to do your own branch prediction.
You might want to profile it, tallying
which AND branches evaluate to false
which OR branches evaluate to true
You can then reorder the tree relative to the weights that you found in the profiling step. If you want/need to be particularly nifty, you can devise a mechanism that detects the weighting for a certain dataset during runtime, so you can reorder the branches on the fly.
Note that in the latter case, it might be advisable to not reorder the actual tree (with respect to storage efficiency and correctness of result while still executing), but rather devise a tree-node visitor (traversal algorithm) that is able to locally sort the branches according to the 'live' weights.
I hope all of this makes sense, because I realize the prose version is dense. However, like Fermat said, the code example is too big to fit into this margin :)
There is a simple and fast way to evaluate boolean operations like this in C. Assuming you want to evaluate z=(x op y) you can do this:
z = result[op+x+(y<<1)];
So op will be a multiple of 4 to select your operation (AND, OR, XOR, etc.), and you create a lookup table of all possible answers. If this table is small enough, you can encode it into a single value and use a right shift and a mask to select the output bit:
z = (MAGIC_NUMBER >> (op+x+(y<<1))) & 1;
That would be the fastest way to evaluate large numbers of these. Of course you'll have to split operations with multiple inputs into trees where each node has only 2 inputs. There is no easy way to short circuit this however. You can convert the tree into a list where each item contains the operation number and pointers to the 2 inputs and output. Once in list form, you can use a single loop to blow through that one line a million times very quickly.
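A hedged Java rendering of that magic-number idea (the constants are mine): each operation's four-entry truth table occupies four bits of the constant, indexed by x + 2*y.

static final int AND = 0, OR = 4, XOR = 8;                     // bit offset of each 4-bit truth table
static final int MAGIC = 0b1000 | (0b1110 << 4) | (0b0110 << 8);

static int eval(int op, int x, int y) {                        // x and y are 0 or 1
    return (MAGIC >> (op + x + (y << 1))) & 1;
}
// eval(AND, 1, 1) == 1, eval(OR, 0, 1) == 1, eval(XOR, 1, 1) == 0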
For small trees, this is a win. For larger trees with short circuiting it's probably not a win, because short circuiting reduces the average number of branches that need to be evaluated from 2 to 1.5, which is a huge win for large trees. YMMV.
EDIT:
On second thought, you can use something like a skip-list to implement short circuiting. Each operation (node) would include a compare value and a skip-count. if the result matched the compare value, you can bypass the next skip-count values. So the list would be created from a depth-first traversal of the tree, and the first child would include a skip count equal to the size of the other child. This takes a bit more complexity to each node evaluation but allows short circuiting. Careful implementation could do it without any condition checking (think 1 or 0 times the skip-count).
I think your byte-coding idea is the right direction.
What I would do, regardless of language, is write a precompiler.
It would walk each tree, and use print statements to translate it into source code, such as:
((word&1) && ((word&2) || ((word&4) && (word&8))))
That can be compiled on the fly whenever the trees change, and the resulting byte code / dll loaded, all of which takes under a second.
The thing is, at present you are interpreting the contents of the trees.
Turning them into compiled code should make them run 10-100 times faster.
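A rough sketch of that source-emitting walk (field-style access on Node and the operator encoding are my assumptions, not the asker's actual code):

static String emit(Node n) {
    if (n.nodes == null || n.nodes.isEmpty()) {                // leaf: test one bit of the word
        return "((word & " + (1L << n.conditionNumber) + "L) != 0)";
    }
    String op = (n.operator == 0 /* assumed code for AND */) ? " && " : " || ";
    StringBuilder sb = new StringBuilder("(");
    for (int i = 0; i < n.nodes.size(); i++) {
        if (i > 0) sb.append(op);
        sb.append(emit(n.nodes.get(i)));                       // recurse into subexpressions
    }
    return sb.append(")").toString();
}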
ADDED in response to your comments on not having the JDK. Then, if you can't generate Java byte code, I would try to write my own byte-code interpreter that would run as fast as possible. It could look something like this:
while (iop < nop) {
    switch (code[iop++]) {
        case BIT1: // check the 1 bit and stack a boolean
            stack[nstack++] = ((word & 1) != 0);
            break;
        case BIT2: // check the 2 bit and stack a boolean
            stack[nstack++] = ((word & 2) != 0);
            break;
        case BIT4: // check the 4 bit and stack a boolean
            stack[nstack++] = ((word & 4) != 0);
            break;
        // etc. etc.
        case AND: // pop 2 booleans and push their AND
            nstack--;
            stack[nstack-1] = (stack[nstack-1] && stack[nstack]);
            break;
        case OR: // pop 2 booleans and push their OR
            nstack--;
            stack[nstack-1] = (stack[nstack-1] || stack[nstack]);
            break;
    }
}
The idea is to cause the compiler to turn the switch into a jump table, so it performs each operation with the smallest number of cycles.
To generate the opcodes, you can just do a postfix walk of the tree.
On top of that, you might be able to simplify it by some manipulation of De Morgan's laws, so you could check more than one bit at a time.
I am looking for something like a Queue that would allow me to put elements at the end of the queue and pop them out in the beginning, like a regular Queue does.
The difference would be that I also need to compact the Queue from time to time. That is, let's assume I have the following items in my Queue (each character, including the dot, is an item in the Queue):
e d . c . b . a
(this Queue has 8 items)
Then, I'd need for example to remove the last dot, so to get:
e d . c . b a
Is there anything like that in the Java Collection classes? I need to use this for a program I am doing where I can't use anything but Java's classes. I am not allowed to design one for myself. Currently I'm just using a LinkedList, but I thought maybe this would be more like a Queue than a LinkedList.
Thanks
edit:
Basically here is what the project is about:
There is a traffic light that can be either green(and the symbol associated is '-') or red('|'). That traffic light is on the right:
(image: http://img684.imageshack.us/img684/9602/0xfhiliidyxdy43q5mageu0.png)
In the beginning, you don't have any cars and the traffic light is green, so our list is represented as:
....... -
Now, on the next iteration, I have a random variable that will tell me whether there is a car coming or not. If there's a car coming, then we can see it appearing from the left. At each iteration, all cars move one step to the right. If they have any car directly on their right, then they can't move:
a...... - (iteration 1)
.a..... - (iteration 2)
..a.... - (iteration 3)
etc.
Now, what happens is that sometimes the traffic light can turn red ('|'). In that case, if you have several cars, then even if they had some distance between them while moving, when they have to stop for the traffic light they will get close:
...b.a. - (iteration n)
....b.a - (iteration n+1)
.....ba - (iteration n+2) here they got close to each other
Now, that is the reason why this works like a Queue, but sometimes I have to take down those dots, when the cars are near the red traffic light.
Keep also in mind that the size of the street here was 7 characters, but it sometimes grows, so we can't assume this is a fixed-length list.
A queue is basically a list of items with a defined behavior, in this case FIFO (First In First Out). You add items at the end, and remove them from the beginning.
Now a queue can be implemented in any way you choose; either with a linked-list or with an Array. I think you're on the right path. A linked list would definitely make it easier.
You'll have O(1) for the add and the remove with your queue (if you maintain a reference to the front and the back), but the worst-case scenario for compacting (removing the dot) would be O(n).
I believe there might be a way to reduce the compact operation to O(1) (if you're only removing one dot at a time) if you use a secondary data structure. What you need is another queue (implemented using another linked-list) that maintains a reference to dots in the first linked list.
So, when you insert (a, ., b, c, ., d) you have a list that looks like:
[pointer to rear] -> [d] -> [.] -> [c] -> [b] -> [.] -> [a] <- [pointer to front]
and you also have a secondary queue (implemented as a linked list) that maintains a reference to the dot:
[pointer to rear] -> [reference to second dot] -> [reference to first dot] <- [pointer to front]
Then, when you need to perform a compact operation, all you have to do is remove the first element from the second queue and retain the reference. So you now have a reference to a dot that is in the middle of the first linked list. You can now easily remove that dot from the first list.
You mentioned in a comment that you need to keep track of the order. A queue by definition is an ordered structure (in the sense that things remain in the order they were inserted). So all you need to do is insert a reference to the dot into the second queue when you insert a dot into the first. That way, order is maintained. So when you pull off a reference to a dot from the second queue, you have a reference to the actual and corresponding dot in the first queue.
The trade-off here for speed is that you need more memory, because you're maintaining a second list of references. Worst-case memory requirement is 2x what you're using now. But that is a decent trade-off to get O(1) vs O(n).
Homework exercises/school projects are always tricky, adding subtle stuff to the requirements that may make someone's brain melt down. Do you have any requirement to include the spaces as part of the queue?
Personally, I wouldn't do that unless explicitly required: it seems simpler to represent your cars as Car, Space pairs (you can define the pair as a struct, assuming you are allowed to use structs), where Space is a numeric value representing the space to the next car ahead. Then, to compact, you only need to look through the list items: when you find one that has Space > 0, do Space--; return;, and all the other cars will have already "advanced", as they keep the space relative to the ones in front of them. In order to output, make sure to print Space dots for each car after (if the stoplight is at the right and the cars come from the left) or before (stoplight at left and cars coming from the right) the car itself, and there you go. Also note that the Space of the first car represents the distance to the stoplight itself, since it has no car before it.
If you add to the struct a pointer to the next car (and a null pointer for the last car), you already have a linked list: keep a "global" variable that points to the first car (or null for an empty queue). Since Java doesn't directly support pointers, turn the struct into a class and use "object references" (which are the same as pointers for all purposes other than C'ish pointer arithmetic), and there you go: only one class, built from scratch. The only thing you'll need to touch from Java's libraries is the standard IO and, maybe, a bit of string manipulation, which is an inherent need derived from having to take input and produce output (some colleges have their own course-specific IO libraries, but that doesn't make a big difference here). To loop through the queue you'd do something like this (assuming the class is named "Node", which is quite generic, and obvious names for the fields):
for (Node pos = First; pos != null; pos = pos.Next) {
    /* Do your stuff here, knowing that "pos" points to the "current" item on each iteration. */
}
To add new nodes you probably have to traverse the queue to figure out at which distance from the last car the new car will "spawn"; when you do so, keep the reference to the last node and make its "next" reference point to the new node:
Node last = First;
int distance = 0;
for (Node pos = First; pos != null; pos = pos.Next) {
    distance += pos.Space;
    last = pos;
}
last.Next = new Node(the_letter_for_this_car, MaxDistance - distance, null);
Of course, tune the constructor to whatever you have.
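For completeness, a hypothetical Node class matching the snippets above could be as small as this (field names chosen to line up with the loops; adapt as needed):

class Node {
    char Letter;    // which car this is
    int Space;      // free cells between this car and the one in front (or the light itself)
    Node Next;      // the car behind this one, or null for the last car

    Node(char letter, int space, Node next) {
        this.Letter = letter;
        this.Space = space;
        this.Next = next;
    }
}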
Considering that this is a college project, let's take a look at some details: the compact process time becomes O(n) and its memory usage is O(1) (the process itself doesn't need any "local" variables, other than maybe a pointer to traverse the collection, which is independent of the length of the queue). In addition, the memory usage for the queue itself is guaranteed to be smaller than or equal to what it would be representing the spaces as items (it only gets to be equal in the worst scenario, when enough cars are stuck at a red light). So, unless the requirements include something incompatible with this approach, I'd expect this to be what your teachers want: it's reasoned, efficient, and compliant with what you were asked for.
Hope this helps.
I would say that a LinkedList would be the best approach here... as a LinkedList allows you to push/pop from the front/back and allows you to remove an item in the middle of the list.
Obviously, the lookup time sucks, but if you are adding/removing from the front/back of the list more often than looking up, then I'd say stick with the LinkedList.
Maybe a second LinkedList which keeps the dot elements?