"Simulating" a 64-bit integer with two 32-bit integers - java

I'm writing a very computationally intense procedure for a mobile device and I'm limited to 32-bit CPUs. In essence, I'm performing dot products of huge sets of data (>12k signed 16-bit integers). Floating point operations are just too slow, so I've been looking for a way to perform the same computation with integer types. I stumbled upon something called Block Floating Point arithmetic (page 17 in the linked paper). It does a pretty good job, but now I'm faced with a problem of 32 bits just not being enough to store the output of my calculation with enough precision.
Just to clarify, the reason it's not enough precision is that I would have to drastically reduce precision of each of my arrays' elements to get a number fitting into a 32-bit integer in the end. It's the summation of ~16000 things that makes my result so huge.
Is there a way (I'd love a reference to an article or a tutorial) to use two 32-bit integers as the most significant word and the least significant word and define arithmetic on them (+, -, *, /) to process data efficiently? Also, are there perhaps better ways of doing such things? Is there a problem with this approach? I'm fairly flexible on the programming language I use. I would prefer C/C++, but Java works as well. I'm sure someone has done this before.
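(For illustration only, not from any answer below: the two-word arithmetic being asked about boils down to propagating a carry from the low word into the high word. A minimal sketch in Java, using a made-up Int64 holder class and Java 8's Integer.compareUnsigned for the carry test:)

final class Int64 {
    int hi; // most significant 32 bits
    int lo; // least significant 32 bits

    static Int64 add(Int64 a, Int64 b) {
        Int64 r = new Int64();
        r.lo = a.lo + b.lo;
        // Unsigned wrap-around of the low word means a carry into the high word.
        int carry = Integer.compareUnsigned(r.lo, a.lo) < 0 ? 1 : 0;
        r.hi = a.hi + b.hi + carry;
        return r;
    }
}

Subtraction works the same way with a borrow; multiplication requires splitting each word into 16-bit halves and summing the partial products.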

I'm pretty sure that the JVM must support 64-bit arithmetic through the long type, and if the platform doesn't support it, the VM must emulate it. However, if you can't afford to use float because of performance problems, then a JVM will probably destroy you.
Most C and C++ implementations will provide 64-bit arithmetic emulated for 32-bit targets; I know that MSVC and GCC do. However, you should be aware that you can be trading many integer instructions to save a single floating-point instruction. You should consider whether the specifications for this program are unreasonable, or whether you could free up performance somewhere else.

Yes, just use 64-bit integers:
long val; // Java
#include <stdint.h>
int64_t val; // C
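(For the dot product in the question specifically, a plain 64-bit accumulator is all that's needed; a sketch:)

// Each 16-bit product fits comfortably in 32 bits, and ~16000 of them
// summed cannot overflow a 64-bit accumulator.
static long dot(short[] a, short[] b) {
    long acc = 0;
    for (int i = 0; i < a.length; i++) {
        acc += (long) a[i] * b[i];
    }
    return acc;
}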

There is a list of libraries on the wikipedia page about Arbitrary Precision Arithmetic. Perhaps something on there would work for you?

If you can use Java, the short answer is: use Java longs. The Java standard defines long as 64 bits. Any JVM must implement this or it is not compliant with the standard. Nothing requires the CPU to support 64-bit arithmetic; if it's not natively supported, the JVM must implement it in software.
If you really have some crippled Java that does not support longs, use BigInteger. It handles integers of arbitrary size.

Talking about C/C++:
Any normal compiler supports the long long type as a 64-bit integer with all the normal arithmetic.
Combined with -O3, it has a very good chance of producing the best possible code for 64-bit arithmetic on your platform.


Java's BigInteger implementation

I'm new here, so please excuse my noob mistakes. I'm currently working on a little project of mine that sees me dealing with numbers whose lengths are in the forty thousands of digits and beyond.
I'm currently using BigInteger to handle these values, and I need something that performs faster. I've read that BigInteger uses an array of integers in its implementation, and what I need to know is whether BigInteger uses each index in this array to represent a single decimal digit (0 through 9), or something more efficient.
I ask this because I already have an implementation in mind that uses bit operations, which makes it more efficient, memory- and processing-wise.
So the final question is: is BigInteger already efficient enough, and should I just rely on that? It would be better to know this rather than putting it to the test unnecessarily, which would take a lot of time.
Thank you.
At least with Oracle's Java 8 and OpenJDK 8, it doesn't store one decimal digit per int. It stores full 32-bit portions per 32-bit int in the int[], as can be seen in its source code.
Bit operations are fast for it, since it's a sign-magnitude value and the magnitude is stored packed just as you'd expect; just make sure that you use the relevant BigInteger bitwise methods rather than implementing your own.
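(For example, a small sketch using the standard BigInteger bitwise API:)

import java.math.BigInteger;

public class BitOpsDemo {
    public static void main(String[] args) {
        BigInteger x = BigInteger.ONE.shiftLeft(40_000);  // 2^40000, ~40k bits
        BigInteger y = x.or(BigInteger.valueOf(0xFF));    // set the low 8 bits
        System.out.println(y.testBit(3));                 // true: set by the or above
        System.out.println(y.bitLength());                // 40001
    }
}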
If you still need more speed, try something like GMP, though be aware that it uses a LGPL or GPL license. It would also be better to use it outside of Java.

Java - operations on native short always return int, why? and how? [duplicate]

Why does the Java API use int, when short or even byte would be sufficient?
Example: The DAY_OF_WEEK field in class Calendar uses int.
If the difference is too minimal, then why do those datatypes (short, int) exist at all?
Some of the reasons have already been pointed out. For example, the fact that "...(Almost) All operations on byte, short will promote these primitives to int". However, the obvious next question would be: WHY are these types promoted to int?
So to go one level deeper: The answer may simply be related to the Java Virtual Machine Instruction Set. As summarized in the Table in the Java Virtual Machine Specification, all integral arithmetic operations, like adding, dividing and others, are only available for the type int and the type long, and not for the smaller types.
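You can see this with a snippet like the following (compile it and inspect the bytecode with javap -c): even an addition of two bytes is performed with the int instruction iadd, because no byte-sized add exists.

byte a = 1, b = 2;
int sum = a + b; // a and b are treated as int values on the operand stack
                 // and added with iadd; there is no "badd" instruction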
(An aside: The smaller types (byte and short) are basically only intended for arrays. An array like new byte[1000] will take 1000 bytes, and an array like new int[1000] will take 4000 bytes)
Now, of course, one could say that "...the obvious next question would be: WHY are these instructions only offered for int (and long)?".
One reason is mentioned in the JVM Spec mentioned above:
If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte
Additionally, the Java Virtual Machine can be considered an abstraction of a real processor. And introducing dedicated Arithmetic Logic Units for the smaller types would not be worth the effort: they would need additional transistors, but could still only execute one addition per clock cycle. The dominant architecture when the JVM was designed was 32 bits, just right for a 32-bit int. (Operations that involve a 64-bit long value are implemented as a special case.)
(Note: The last paragraph is a bit oversimplified, considering possible vectorization etc., but should give the basic idea without diving too deep into processor design topics)
EDIT: A short addendum, focusing on the example from the question, but in a more general sense: one could also ask whether it would not be beneficial to store fields using the smaller types. For example, one might think that memory could be saved by storing Calendar.DAY_OF_WEEK as a byte. But here, the Java Class File Format comes into play: all the fields in a class file occupy at least one "slot", which has the size of one int (32 bits). (The "wide" fields, double and long, occupy two slots.) So explicitly declaring a field as short or byte would not save any memory either.
(Almost) all operations on byte and short will promote them to int; for example, you cannot write:
short x = 1;
short y = 2;
short z = x + y; //error
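An explicit cast is what makes the last line compile, since the int result has to be truncated back:

short z = (short) (x + y); // compiles: the int sum is narrowed back to short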
Arithmetic is easier and more straightforward when using int; there is no need to cast.
In terms of space, it makes very little difference. byte and short would complicate things; I don't think this micro-optimization is worth it, since we are talking about a fixed number of variables.
byte is relevant and useful when you program for embedded devices or deal with files/networks. Also, these primitives are limited: what if the calculations exceed their limits in the future? Try to think about an extension of the Calendar class that might evolve to bigger numbers.
Also note that on 64-bit processors, locals will be kept in registers and won't use any resources, so using int, short, or other primitives won't make any difference at all. Moreover, many Java implementations align variables* (and objects).
* byte and short occupy the same space as int if they are local variables, class variables, or even instance variables. Why? Because in (most) computer systems, variables' addresses are aligned, so for example if you use a single byte, you'll actually end up with two bytes: one for the variable itself and another for the padding.
On the other hand, in arrays, byte takes 1 byte, short takes 2 bytes, and int takes 4 bytes, because in arrays only the start (and maybe the end) has to be aligned. This will make a difference if you want to use, for example, System.arraycopy(); then you'll really notice a performance difference.
Because arithmetic operations are easier when using integers compared to shorts. Assume that the constants were indeed modeled by short values. Then you would have to use the API in this manner:
short month = Calendar.JUNE;
month = (short) (month + 1); // is July
Notice the explicit casting. Short values are implicitly promoted to int values when they are used in arithmetic operations. (On the operand stack, shorts are even expressed as ints.) This would be quite cumbersome to use, which is why int values are often preferred for constants.
Compared to that, the gain in storage efficiency is minimal, because there only exists a fixed number of such constants. We are talking about 40 constants. Changing their storage from int to short would save you 40 × 16 bits = 80 bytes. See this answer for further reference.
The design complexity of a virtual machine is a function of how many kinds of operations it can perform. It's easier to have four implementations of an instruction like "multiply" (one each for 32-bit integer, 64-bit integer, 32-bit floating-point, and 64-bit floating-point) than to have, in addition to the above, versions for the smaller numerical types as well. A more interesting design question is why there should be four types, rather than fewer (performing all integer computations with 64-bit integers and/or doing all floating-point computations with 64-bit floating-point values). The reason for using 32-bit integers is that Java was expected to run on many platforms where 32-bit types could be acted upon just as quickly as 16-bit or 8-bit types, but operations on 64-bit types would be noticeably slower. Even on platforms where 16-bit types would be faster to work with, the extra cost of working with 32-bit quantities would be offset by the simplicity afforded by only having 32-bit types.
As for performing floating-point computations on 32-bit values, the advantages are a bit less clear. There are some platforms where a computation like float a=b+c+d; could be performed most quickly by converting all operands to a higher-precision type, adding them, and then converting the result back to a 32-bit floating-point number for storage. There are other platforms where it would be more efficient to perform all computations using 32-bit floating-point values. The creators of Java decided that all platforms should be required to do things the same way, and that they should favor the hardware platforms for which 32-bit floating-point computations are faster than longer ones, even though this severely degraded both the speed and the precision of floating-point math on a typical PC, as well as on many machines without floating-point units. Note, btw, that depending upon the values of b, c, and d, using higher-precision intermediate computations when computing expressions like the aforementioned float a=b+c+d; will sometimes yield results which are significantly more accurate than would be achieved if all intermediate operands were computed at float precision, but will sometimes yield a value which is a tiny bit less accurate. In any case, Sun decided everything should be done the same way, and they opted for using minimal-precision float values.
Note that the primary advantages of smaller data types become apparent when large numbers of them are stored together in an array; even if there were no advantage to having individual variables of types smaller than 64 bits, it's worthwhile to have arrays which can store smaller values more compactly; having a local variable be a byte rather than a long saves seven bytes; having an array of 1,000,000 numbers hold each number as a byte rather than a long saves 7,000,000 bytes. Since each array type only needs to support a few operations (most notably read one item, store one item, copy a range of items within an array, or copy a range of items from one array to another), the added complexity of having more array types is not as severe as the complexity of having more types of directly-usable discrete numerical values.
If you used the philosophy where integral constants are stored in the smallest type that they fit in, then Java would have a serious problem: whenever programmers write code using integral constants, they would have to pay careful attention to their code to check whether the type of the constants matters, and if so, look up the type in the documentation and/or do whatever type conversions are needed.
So now that we've outlined a serious problem, what benefits could you hope to achieve with that philosophy? I would be unsurprised if the only runtime-observable effect of that change would be what type you get when you look the constant up via reflection. (and, of course, whatever errors are introduced by lazy/unwitting programmers not correctly accounting for the types of the constants)
Weighing the pros and the cons is very easy: it's a bad philosophy.
Actually, there'd be a small advantage. If you have a
class MyTimeAndDayOfWeek {
    byte dayOfWeek;
    byte hour;
    byte minute;
    byte second;
}
then on a typical JVM it needs as much space as a class containing a single int. The memory consumption gets rounded up to the next multiple of 8 or 16 bytes (IIRC, that's configurable), so the cases where there are real savings are rather rare.
This class would be slightly easier to use if the corresponding Calendar methods returned a byte. But there are no such Calendar methods, only get(int), which must return an int because of the other fields. Each operation on smaller types promotes to int, so you need a lot of casting.
Most probably, you'll either give up and switch to an int or write setters like
void setDayOfWeek(int dayOfWeek) {
    this.dayOfWeek = checkedCastToByte(dayOfWeek);
}
Then the type of DAY_OF_WEEK doesn't matter, anyway.
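(checkedCastToByte is not spelled out in the answer; a minimal version might look like this:)

static byte checkedCastToByte(int value) {
    if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE)
        throw new IllegalArgumentException("out of byte range: " + value);
    return (byte) value;
}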
Using variables smaller than the bus size of the CPU means more cycles are necessary. For example, when updating a single byte in memory, a 64-bit CPU needs to read a whole 64-bit word, modify only the changed part, and then write back the result.
Also, using a smaller data type incurs overhead when the variable is stored in a register, since the behavior of the smaller data type has to be accounted for explicitly. Since the whole register is used anyway, there is nothing to be gained by using a smaller data type for method parameters and local variables.
Nevertheless, these data types might be useful for representing data structures that require specific widths, such as network packets, or for saving space in large arrays, sacrificing speed.
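(For example, a sketch of reading a fixed-width field out of such a structure, with a hypothetical packet layout:)

// Read an unsigned 16-bit big-endian field from a packet buffer.
static int readUint16(byte[] packet, int offset) {
    return ((packet[offset] & 0xFF) << 8) | (packet[offset + 1] & 0xFF);
}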

Primitive data types and portability in Java

I quote from Herbert Schildt, Chapter 3, Data Types, Variables and Arrays:
The primitive types represent single values, not complex objects.
Although Java is otherwise completely object-oriented, the primitive
types are not. The reason for this is efficiency. Making the primitive
types into objects would have degraded performance too much.
The primitive types are defined to have an explicit range and
mathematical behavior. Languages such as C, C++ allow the size of an
integer to vary based upon the dictates of the execution environment.
However, Java is different. Because of Java’s portability requirement,
all data types have a strongly defined range. For example, an int is
always 32-bit regardless of the particular platform. This allows
programs to be written that are guaranteed to run without porting on
any machine architecture. While strictly specifying the size of an
integer may cause a small loss of performance in some environments, it
is necessary in order to achieve portability.
What does he mean by the last two lines? And how come specifying the size of an integer may cause a small loss of performance in some environments?
In "lower" languages, primitive data types sizes are often derived from the CPU's ability to handle them.
E.g., in c, an int is defined as being "at least 16 bits in size", but its size may vary between architectures in order to assure that "The type int should be the integer type that the target processor is most efficient working with." (source). This means that if your code makes careless assumptions about an int's size, it may very well break if you port it from 32-bit x86 to 64-bit powerpc.
java, as noted above, is different. An int, e.g., will always be 32 bits. This means you don't have to worry about its size changing when you run the same code on a different architecture. The tradeoff, as also mentioned above, is performance - on any architecture that doesn't natively handle 32 bit calculations, these ints need to be expanded to the native size the CPU can handle (which will have a small penalty), or worse, if the CPU can only handle smaller ints, every operation on an int may require several CPU operations.
A few (very few) computers use a 36-bit architecture, so you need an extra step to mask off bits, simulate overflows, etc.
Java implements its own data pointer mechanism on top of the underlying system's pointer mechanism. Many systems may use smaller pointers if the data is not large enough.
For example: if your integer data only requires a 16-bit pointer, the system will only allocate the required storage, using a 16-bit pointer. But if you are using Java, it will turn that into a 32-bit pointer allocation, where a larger memory space is required, which also degrades performance in terms of storage and data seek, because your pointer is large.
AFAIK, there is no way in Java to define the size of an integer. It is always a 32-bit int, as mentioned here. But some programming languages may allow you to specify the size of an integer (e.g., Ada).
The performance issue comes when the compiler tries to convert our code to machine code instructions (see here and here). Normally, machine code instructions are 32 or 64 bits. If our ints are the same size as the machine word, it is easy to convert them into machine code. Otherwise, the compiler needs to put in extra effort to convert them, and that's where the performance issue comes from.

Are 64 bit integers less efficient than 32 bit integers in the JVM?

Background: I want to store numbers that are precise to 4 decimal places, without roundoff. So I thought of using integers internally; for example, 12.3456 is represented as 123456 internally. But with 32-bit integers, I can count only up to 214748, which is very small.
I guess that 64-bit integers are the solution. But are operations involving 64-bit integers less efficient than 32-bit integers, given a machine running a 64-bit JVM?
BTW, I am using an information retrieval package (Solr), an optimization package (Drools) and other packages written in Java, and they may not play well with decimal datatype (if you suggest it).
Even if it is slower, I doubt this would be the bottleneck in your system. You are very likely going to have more significant performance issues in other parts of your program.
Also, the answer to this question provides more details, but basically "it's platform dependent". It's not necessarily true that 64-bit will be slower than 32-bit.
This is likely to be platform dependent. I have seen cases where using long instead of int is about 10% faster. The 64-bit JVM for Java 5.0 was about 5%-10% slower than the 32-bit JVM for Java 5.0. Java 6 doesn't appear to have this problem.
I imagine the cost of dividing by 10000 far outweighs the cost of using a long instead of an int value.
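(A sketch of that fixed-point scheme, assuming a fixed scale of 10000; the divisions after multiplication are where that cost shows up:)

// Fixed-point with 4 decimal places: 12.3456 is stored as 123456L.
static final long SCALE = 10_000;

static long add(long a, long b) { return a + b; }          // no rescaling needed
static long mul(long a, long b) { return a * b / SCALE; }  // rescale the double-scaled product
static long div(long a, long b) { return a * SCALE / b; }  // pre-scale to keep 4 decimals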
You could also use double, rounding the result to four decimal places before printing/outputting it.
Generally, the more data you need to hurl around, the slower it is, so even on a 64-bit VM sticking to int instead of long is faster in most cases.
This becomes very clear if you think in terms of memory footprint: an array of 1 million ints requires 4MB, 1M longs eat 8MB.
As for the computational speed, there is some overhead to perform operations on 64-bit types with 32-bit instructions. But even if the VM can use 64-bit instructions (which it should on a 64-bit VM), depending on the CPU they may still be slower than their 32-bit counterparts (add/subtract will probably go in one clock, but multiply and divide in 64-bit are usually slower than in 32-bit).
A very common misconception is that integer math is faster than floating-point math. As soon as you need to perform extra operations to "normalize" your integers, floating point will beat your integer implementation flat in performance. The actual difference in clock cycles spent between integer and floating-point instructions is negligible for most applications, so if floating point is what you need, use it and don't attempt to emulate it yourself.
For the question of which type to actually use: use the type that's most appropriate in terms of data representation. Worry about performance when you get there. Look at what operations you need to perform and what precision you need. Then select the type that offers exactly that. Judging by the libraries you mentioned, double will probably be the winner.

Operation on different data types

Considering the basic data types like char, int, float, double, etc. in any standard language (C/C++, Java, etc.):
Is there anything like "operating on integers is faster than operating on characters"? By operating I mean assignment, arithmetic operations, comparison, etc.
Are some data types slower than others?
For almost anything you're doing this has almost no effect, but purely for informational purposes, it is usually fastest to work with data types whose size is the machine word size (i.e. 32 bits on x86 and 64 bits on amd64). Additionally, SSE/MMX instructions give you benefits as well if you can group values and operate on them at the same time.
Rules for this are a bit like rules for English spelling and/or grammar. The rules are broken at least as often as they're followed.
Just for example, for years "everybody has known" that floating point operations are slower than integers, especially for more complex operations like multiply and divide. In reality, some processors do some integer operations (especially multiplication and division) by converting the operands to floating point, doing the operation in floating point, then converting the result back to an integer. As you'd expect from that, the floating point operation is actually faster (though only a little bit).
Most of the time, however, it doesn't matter much -- in a lot of cases, it's quite reasonable to think of the operations on the processor itself as free, and concern yourself primarily with optimizing your use of bandwidth to memory. Of course, doing that well is often even harder...
Yes, some data types are definitely slower than others. For example, floats are more complicated than ints and thus may incur additional penalties when doing divides and multiplies. It all depends on how your hardware is set up and what kind of instructions it supports.
Data types which are larger than the machine word size will also be slower, because it takes more cycles to perform operations on them.
Depending on what you do, the difference can be quite large, especially when working with float versus double versus long double.
In modern processors it comes down to SIMD instructions, which have a certain width, most commonly 128 bits: so four float values versus two double values.
However, some processors only have 32-bit SIMD instructions (PPC), and GPU hardware has a factor-of-eight performance difference between float and double.
When you add trigonometric, exponential, and square root functions into the mix, float numbers are going to have better performance overall, given a number of factors.
Almost all of the answers on this page are mostly right. The answer, however, varies wildly depending upon your hardware, language, compiler, and VM (in managed languages like Java). On most CPUs, your best performance will be to do the operations on a data type that fits the native operand size of your CPU. In some cases, some compilers will optimize this for you, however.
On most modern desktop CPUs the difference between floating point and integer operations has become pretty trivial. However, on older hardware and a lot of embedded systems the difference in all of these factors can still be really, really big.
The important thing is to know the specifics of your target architecture and your tools.
This answer relates to the Java case (only).
The literal answer is that the relative speed of the primitive types and operators depends on your processor hardware and your JVM implementation.
But a better answer is that it usually doesn't make a lot of difference to performance which representations you use. Indeed, any clever data type optimizations you make to run fast on your current machine / JVM may turn out to be anti-optimizations on a different machine / JVM combination.
In general, it is better to pick a data type that represents your data in a correct and natural way, and leave it to the compiler to sort out the details. However, if you are creating large arrays of a primitive type, it is worth knowing that Java uses compact representations for arrays of boolean, byte and short.
