I have a program that works with snippets of Java code that do math on doubles using the standard mathematical operators, like this:
double someVal = 25.03;
return (someVal * 3) - 50;
For Reasons (mostly rounding errors) I would like to change all these snippets to use BigDecimal instead of double, modifying the math functions along the way, like this:
MathContext mc = MathContext.DECIMAL32;
BigDecimal someVal = new BigDecimal("25.03", mc);
return someVal.multiply(BigDecimal.valueOf(3), mc).subtract(BigDecimal.valueOf(50), mc);
The snippets are mostly pretty simple, but I would prefer to avoid a fragile solution (eg, regex) if I can. Is there a relatively straightforward way to do this?
Note I want to have a program or code perform these modifications (metaprogramming). Clearly I'm capable of making the changes by hand, but life is too short.
You could try Google's "Refaster", which, according to the paper, is "a tool that uses normal, compilable before-and-after examples of Java code to specify a Java refactoring."
The code lives under core/src/main/java/com/google/errorprone/refaster in Google's error-prone github project. (It used to live in its own github project.)
This is more of a hint than an answer, though, since I've never worked directly with Refaster and don't know how well it does on primitive expressions like the ones in your example. I also don't know how well-suited it is for toolchain use like yours, vs. one-time refactors (which is how Google tends to use it). But it's actively maintained, and I've seen it used really effectively in the past.
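To give a flavour of the before/after style (a sketch only: I haven't verified that Refaster can retype a whole expression tree from double to BigDecimal, so the rule below keeps double on both sides and is meant purely to illustrate how templates are written):
import com.google.errorprone.refaster.annotation.AfterTemplate;
import com.google.errorprone.refaster.annotation.BeforeTemplate;
import java.math.BigDecimal;
import java.math.MathContext;

// Hypothetical rule: rewrite a double multiplication into BigDecimal arithmetic,
// converting back to double so that the before/after types still line up.
class MultiplyViaBigDecimal {
    @BeforeTemplate
    double before(double a, double b) {
        return a * b;
    }

    @AfterTemplate
    double after(double a, double b) {
        return BigDecimal.valueOf(a)
                .multiply(BigDecimal.valueOf(b), MathContext.DECIMAL32)
                .doubleValue();
    }
}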
We use BigDecimal for financial calculations. As other comments note, you are going to take some performance hit and the code will become much harder to read. The performance impact depends on how many operations you perform. You usually run into rounding issues with doubles when the calculation chain is long: c = a + b is rarely a problem, but c += a + b executed a million times is. And with thousands of operations you will notice how much slower BigDecimal is than double, so do performance testing.
Be careful when changing your code, especially with division: you will have to specify the rounding mode and the scale of the result. This is the step people usually skip, and it leads to errors.
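For example, something along these lines; the scale of 2 and HALF_UP rounding here are arbitrary choices you would need to make per calculation:
import java.math.BigDecimal;
import java.math.RoundingMode;

BigDecimal a = new BigDecimal("10");
BigDecimal b = new BigDecimal("3");
// a.divide(b) alone would throw ArithmeticException here, because 10/3
// has no exact decimal representation; scale and rounding must be chosen.
BigDecimal quotient = a.divide(b, 2, RoundingMode.HALF_UP);   // 3.33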
I assume this is not only about replacing the calculation logic; you will also need to change your domain model, so I doubt you can come up with a script that does it in a reasonable time. Do it by hand; a good IDE will help you a lot.
However you end up converting your code, I suggest first making sure that all of your calculation logic is covered by unit tests, and converting the tests before changing the logic, i.e. replace the asserted values by wrapping them in BigDecimals. That way you will avoid silly typing/algorithm mistakes.
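One pitfall to watch for when converting assertions (assuming JUnit here): BigDecimal.equals() is scale-sensitive, so comparing with compareTo() is usually what you actually want:
import static org.junit.Assert.assertTrue;
import java.math.BigDecimal;

// 2.5 and 2.50 are numerically equal but have different scales, so an
// equals()-based assertion would fail even though the values match.
BigDecimal expected = new BigDecimal("2.50");
BigDecimal actual = new BigDecimal("2.5");
assertTrue(expected.compareTo(actual) == 0);   // robust: compareTo() ignores scale
// assertEquals(expected, actual);             // would fail: equals() also compares scale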
I have not answered your question of how to convert from double to BigDecimal; I just wanted to share some notes regarding the change itself.
Don't do this.
It's a huge readability hit. Your example turned "25.03 * 3 - 50" into 4 lines of code.
Financial code usually uses double, or long for cents. It's precise enough that it's just a question of proper programming to avoid rounding errors: What to do with Java BigDecimal performance?
It's likely a huge performance hit, in particular from erratic garbage collection, which is not acceptable for HFT and the like: https://stackoverflow.com/a/1378156/1339987
I don't know how much code you are talking about but I do expect you will have to make lots of small decisions. This reduces the chance there is an openly available tool (which I'm all but sure does not exist) and increases the amount of configuration work you would have to do should you find one.
This will introduce bugs for the same reason you expect it to be an improvement, unless you have extremely good test coverage.
Programmatically manipulating Java source is covered here: Automatically generating Java source code
You don't need to accept this answer but I believe my advice is correct and think other readers of this question need to see up-front the case for not making this transformation, which is my rationale for posting this somewhat nonanswer.
We developed an API call which uses Java 8 parallel streams, and we got very good performance, almost double that of sequential processing, in stress tests.
I know it depends on the use case, but I am using it for crypto operations, so I assume that this is a good use case.
However, I have read a lot of articles that encourage being very careful with them. There are also articles arguing that they are not very well designed internally, like here.
Thus: are parallel streams production-ready, and are they widely used in production systems?
This question invites "opinions", but I will try to answer fact-based.
Fork/Join
These classes aren't new! As you can see, they were already introduced with Java 1.7. In other words: these classes have been around for several years now, and are used in many places. Thus: low risk.
Parallel Streams
Were added "just recently" in Java terms (keep in mind how much of legacy Java has in 2017; and how slowly [compared to other languages] Java is evolving). I think the simple answer here is: we don't know yet if parallel streams will become a "cornerstone" of Java programming, or if people will prefer other ways to solve the problems addresses by parallel streams at some point.
Beyond that: users of other languages (such as JavaScript) are used to "changes gears" (aka frameworks) on an almost "monthly" basis. That is means a lot of churn, but it also means that "good things" are applied quickly; like in: why post-pone improving things?!
What I mean by that: when you find that parallel streams help you to improve performance; and when your team agrees "yes, we can deal with the stream()-way of writing code" ... then just go forward.
In other words: when parallel streams help your team/product to "get better", then why not try to capitalize on that? Now, not in 12 or 24 months.
If streams are "not that great big thing"; then well, maybe you have to rewrite some code at some point in the future.
Long story short: this is about balancing potential risks against potential gains. It seems that you made some positive experiences already; so I think a reasonable compromise would be: apply streams, but in a controlled way. So, that a later decision "wrong turn, get rid of them" doesn't become too expensive.
Since I wrote the article you linked to, I should say a few words.
As others have said, try it and see. Parallel streams want to split the work into a balanced tree: split left, right, left, right. If your source splits that way, performance is good. If not, performance is terrible.
The framework uses dyadic recursive division, while streams are linear; that is not a good match. And never forget that volume changes everything. Adding scaling to the mix may surprise you, but you won't know until you try it.
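A quick way to see the splitting effect for yourself (a rough sketch, not a proper benchmark; use JMH or similar for real measurements):
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// An array-backed source (ArrayList) splits into balanced halves and
// parallelizes well; a LinkedList (or Stream.iterate) splits poorly.
List<Long> arrayBacked = LongStream.rangeClosed(1, 10_000_000).boxed()
        .collect(Collectors.toList());
List<Long> linked = new LinkedList<>(arrayBacked);

long s1 = arrayBacked.parallelStream().mapToLong(Long::longValue).sum();
long s2 = linked.parallelStream().mapToLong(Long::longValue).sum();   // typically much slower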
Let us know how it works out.
I'm developing an application that relies heavily on Joda-Money, and have a number of unit tests that verify my business logic. One (admittedly minor) sticking point for me has been what sort of Money/BigMoney objects to test with; specifically, what CurrencyUnit to use.
As I see it, I have a couple of options:
Just use USD
This is clearly the easiest way to go, and most of my actual application will be working with US Dollars so it makes a fair bit of sense. On the other hand, it feels rather US-centric, and I'm concerned it would risk letting currency-specific errors go unchecked.
Use another real currency, like CAD
This would catch erroneous hard-codings of USD, but otherwise isn't much better than just using USD.
Use a dedicated "fake" currency, i.e. XTS
This clearly makes sense; after all, XTS is "reserved for use in testing". But Joda denotes pseudo-currencies as currencies with -1 decimal places. In practice, the primary difference between currencies in Joda-Money is the number of decimal places, so this risks masking any errors involving decimal-place precision, such as erroneously rounding to an integer value.
Register my own custom currency with CurrencyUnit.registerCurrency()
This would obviously work, but seems a little odd seeing as there are alternatives (a rough sketch of what it would look like follows this list).
Use a CurrencyUnit instance created by a mocking library
Pretty much the same as registering a custom currency.
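For reference, option 4 would look roughly like this (a sketch only; check the registerCurrency signature against the Joda-Money version you're using, and note that the code "XTT", the numeric code 991 and the country code "XX" are invented for illustration):
import java.math.BigDecimal;
import java.util.Arrays;
import org.joda.money.CurrencyUnit;
import org.joda.money.Money;

// Register a made-up two-decimal test currency once, e.g. in test setup.
CurrencyUnit testCurrency = CurrencyUnit.registerCurrency("XTT", 991, 2, Arrays.asList("XX"));
Money amount = Money.of(testCurrency, new BigDecimal("25.03"));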
Again, this is obviously a minor issue, but I'm curious if there's a standard practice for cases like this, or if there's a clear reason to prefer one of these options in particular.
Use USD (or, generally, whatever currency is most commonly used in your application). I say this for two reasons:
Good test data is unremarkable in every respect except that part of it which the test is actually about. When you're writing tests that don't have anything to do with differences between currencies, you don't want to have to think about differences between currencies. Just use whatever is most natural in the application.
The idea that using an unusual currency everywhere will somehow result in better testing of unusual currencies is a red herring. Tests should be explicit and focused. If you need to test something about a specific currency, write a test whose point is to test that thing. And if a test is not about a specific currency, it shouldn't break when handling of some unusual aspect of that currency breaks -- it's not valuable to have half your tests break for the same reason; you only want one to break. So there is just no need to spread unusual currencies around the test suite and hope that that will catch something. Instead, optimize for readability; see point 1.
Since most of your application deals with U. S. dollars, it makes sense to use them for the vast majority of your tests. What currency-specific errors are you worried might happen? Missing halfpennies when compounding interest? Conversion to yen and back is off by a cent or two? Write tests for that.
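For instance, a sketch of such a focused test (assuming JUnit; double-check the dividedBy overloads against your Joda-Money version):
import static org.junit.Assert.assertEquals;
import java.math.RoundingMode;
import org.joda.money.Money;

// A test that is explicitly about rounding in a two-decimal currency.
Money third = Money.parse("USD 10.00").dividedBy(3, RoundingMode.HALF_UP);
assertEquals(Money.parse("USD 3.33"), third);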
If I were you, I'd fire up a local REPL (a Scala REPL works very well if you don't have JShell), run a few experiments. You'll probably find that you worried for nothing. Or you might discover that there is indeed a flaw, and your REPL session will inform how you write a test for that flaw.
You're using Joda Money because you don't want to reinvent this particular wheel. I didn't realize that until I read the question details. Presumably Joda Money has been tested rigorously by its developers and it does everything you expect it to do. You don't need to test it again.
But if you were making your own class to represent money amounts, I would suggest you use U.S. dollars, euros, Japanese yen and Libyan dinars (LYD). The reason for the first three is obvious. I suggest Libyan dinars because of the three decimal places. From what I can gather, there is no physical 1 dirham coin, so, for example, 513 darahim would get rounded down to 500 darahim and 997 darahim would get rounded up to a full dinar.
To test conversions, I would start with East Caribbean dollars (XCD), since they are pegged at a fixed rate of EC$2.70 per US$1.00. Later on you can worry about currencies that fluctuate in relation to each other, and whether you're going to deal with them by mocking a rate server, by connecting to an actual rate server but allowing variances in your tests, or in some other way.
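A sketch of what such a pegged-rate test might look like (the convertedTo signature may differ slightly between Joda-Money versions):
import static org.junit.Assert.assertEquals;
import java.math.BigDecimal;
import java.math.RoundingMode;
import org.joda.money.CurrencyUnit;
import org.joda.money.Money;

// US$1.00 converted at the fixed peg of EC$2.70 per US$1.00.
Money oneUsd = Money.parse("USD 1.00");
Money inXcd = oneUsd.convertedTo(CurrencyUnit.of("XCD"), new BigDecimal("2.70"), RoundingMode.HALF_UP);
assertEquals(Money.parse("XCD 2.70"), inXcd);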
I understand the JVM optimizes some things for you (not clear on which things yet), but let's say I were to do this:
while(true) {
int var = 0;
}
would doing:
int var;
while(true) {
var = 0;
}
take less space? Since you aren't declaring a new variable every time, you don't have to specify the type every time.
I understand I would really only need to put var outside the while if I wanted to use it outside of that loop (instead of it only being usable locally, as in the first example). Also, what about objects: would it be different than with primitive types in that situation? I understand it's a small thing, but a build-up of this kind of stuff can cause my application to take a lot of memory/CPU. I'm trying to use the least number of operations possible, but I don't completely understand what's going on behind the scenes.
If someone could help me out, or even link me to somewhere I can learn about saving CPU by decreasing the number of operations, it would be highly appreciated. Please no books (unless they're free! :D), I have no way of getting one right now /:
Don't. Premature optimization is the root of all evil.
Instead, write your code as it makes most sense conceptually. Write it thoughtfully, yes. But don't think you can be a 'human compiler' and optimize and still write good code.
Once you have written your code (more or less naively, depending on your level of experience) you write performance tests for it. Try to think of different ways in which the code may be used (many times in a row, from front to back or reversed, many concurrent invocations etc) and try to cover these in test cases. Then benchmark your code.
If you find that some test cases are not performing well, investigate why. Measure parts of the test case to see where the time is going. Zoom into the parts where most time is spent.
Mostly, you will find weird loops where, upon reading the code again, you will think 'that was silly to write it that way. Of course this is slow' and easily fix it. In my experience most performance problems can be solved this way and 'hardcore optimization' is hardly ever needed.
In the end you will find that 99* percent of all performance problems can be solved by touching only 1 percent of the code. The other code never comes into play. This is why you should not 'prematurely' optimize. You will be spending valuable time optimizing code that had no performance issues in the first place. And making it less readable in the process.
Numbers made up of course but you know what I mean :)
Hot Licks points out the fact that this isn't much of an answer, so let me expand on this with some good ol' performance tips:
Keep an eye out for I/O
Most performance problems are not in pure Java. Instead they are in interfacing with other systems. In particular, disk access is notoriously slow. So is the network. So minimize their use.
Optimize SQL queries
SQL queries will add seconds, even minutes, to your program's execution time if you don't watch out. So think about those very carefully. Again, benchmark them. You can write very optimized Java code, but if it first spends ten seconds waiting for the database to run some monster SQL query then it will never be fast.
Use the right kind of collections
Most performance problems are related to doing things lots of times. Usually when working with big sets of data. Putting your data in a Map instead of in a List can make a huge difference. Also there are specialized collection types for all sorts of performance requirements. Study them and pick wisely.
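A typical before/after of the kind of change that matters here (Order, getId and wantedId are stand-ins for your own types and data):
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Looking an order up by id in a List is a linear scan every single time:
List<Order> orders = new ArrayList<>();
Order found = null;
for (Order o : orders) {
    if (o.getId().equals(wantedId)) { found = o; break; }
}

// Building a Map once turns every subsequent lookup into a constant-time get:
Map<String, Order> ordersById = new HashMap<>();
for (Order o : orders) {
    ordersById.put(o.getId(), o);
}
Order fast = ordersById.get(wantedId);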
Don't write code
When performance really matters, squeezing the last 'drops' out of some piece of code becomes a science all in itself. Unless you are writing some very exotic code, chances are great there will be some library or toolkit to solve your kind of problems. It will be used by many in the real world. Tried and tested. Don't try to beat that code. Use it.
We humble Java developers are end-users of code. We take the building blocks that the language and its ecosystem provide and tie them together to form an application. For the most part, performance problems are caused by us not using the provided tools correctly, or not using any tools at all for that matter. But we really need specifics to be able to discuss those. Benchmarking gives you that specificity. And when the slow code is identified it is usually just a matter of changing a collection from list to map, or sorting it beforehand, or dropping a join from some query, etc.
Attempting to optimise code which doesn't need to be optimised increases complexity and decreases readability.
However, there are cases where improving readability also comes with improved performance.
For example,
if a numeric value cannot be null, use a primitive instead of a wrapper. This makes it clearer that the value cannot be null but also uses less memory and reduces pressure on the GC.
use a Set when you have a collection which cannot have duplicates. Often a List is used when in fact a Set would be more appropriate; depending on the operations you perform, this can also be faster by reducing time complexity.
consider using an enum with one instance for a singleton (if you have to use singletons at all). This is much simpler as well as faster than double-checked locking. Hint: try to only have stateless singletons.
writing simpler, well-structured code is also easier for the JIT to optimise. This is where trying to outsmart the JIT with more complex solutions will backfire, because you end up confusing the JIT and what you think should be faster is actually slower. (And it's more complicated as well.)
try to reduce how much you write to the console (and I/O in general) in critical sections. Writing to the console is so expensive, both for the program and for the poor human having to read it, that it is worth spending more time producing concise console output.
use a StringBuilder when you are appending elements in a loop, as in the sketch below. Note: avoid using StringBuilder for one-liners as just a series of append() calls; this can actually be slower and harder to read.
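A trivial sketch of the StringBuilder point:
import java.util.Arrays;
import java.util.List;

List<String> parts = Arrays.asList("a", "b", "c");

// One buffer, appended to in place:
StringBuilder sb = new StringBuilder();
for (String part : parts) {
    sb.append(part).append(',');
}
String joined = sb.toString();

// By contrast, += creates and copies a brand-new String on every iteration:
String joinedSlowly = "";
for (String part : parts) {
    joinedSlowly += part + ',';
}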
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away. --
Antoine de Saint-Exupery,
French writer (1900 - 1944)
Developers like to solve hard problems, and there is a very strong temptation to solve problems which don't need to be solved. This is very common behaviour for developers with up to about 10 years' experience (it was for me, anyway ;). After about this point you have already solved most common problems before, and you start selecting the best/minimum set of solutions which will solve a problem. This is the point you want to get to in your career, and then you will be able to develop quality software in far less time than you could before.
If you dream up an interesting problem to solve, go ahead and solve it in your own time, see what difference it makes, but don't include it in your working code unless you know (because you measured) that it really makes a difference.
However, if you find a simpler, elegant solution to a problem, it is worth including not because it might be faster (though it might be), but because it should make the code easier to understand and maintain, and that is usually a far more valuable use of your time. Successfully used software usually costs three times as much to maintain as it cost to develop. Do what will make life easier for the poor person who has to understand why you did something (which is harder if you didn't do it for any good reason in the first place), as this might be you one day ;)
A good example of when you might make an application slower to improve reasoning is the use of immutable values and concurrency. Immutable values are usually slower than mutable ones, sometimes much slower. However, when used with concurrency, mutable state is very hard to get provably right, and you need that proof because testing concurrent code is good but not reliable. Using concurrency you have much more CPU to burn, so a bit more cost in using immutable objects is a very sensible trade-off. In some cases using immutable objects can allow you to avoid using locks and actually improve throughput, e.g. CopyOnWriteArrayList, if you have a high read-to-write ratio.
Example: a simple program for swapping two numbers.
int a = 10;
int b = 20;
a = a+b;
b = a-b;
a = a-b;
Now in the following piece of code:
a=a+b-(b=a);
What is the difference between these two pieces of code?
Addendum: what if the sum of the two values exceeds the maximum value of an integer, which differs between Java and C++?
Neither of these looks good to me. Readability is key. If you want to swap values, the most "obvious" way to do it is via a temporary value:
int a = 10;
int b = 20;
int tmp = a;
a = b;
b = tmp;
I neither know nor would I usually care whether this was as efficient as the "clever" approaches involving arithmetic. Until someone proves that the difference in performance is significant within a real application, I'd aim for the simplest possible code that works. Not just here, but for all code. Decide how well you need it to perform (and in what dimensions), test it, and change it to be more complicated but efficient if you need to.
(Of course, if you've got a swap operation available within your platform, use that instead... even clearer.)
In C++, the code yields undefined behavior because there's no sequence point in a+b-(b=a) and you're changing b and reading from it.
You're better off using std::swap(a,b), it is optimized for speed and much more readable than what you have there.
Since your specific code has already been commented upon, I would just add a general point. Writing one-liners doesn't really matter, because at the instruction level you cannot escape the number of steps your code is going to translate into; most compilers will already optimize accordingly.
That is, unless the one-liner actually uses a different mechanism to achieve the goal. For example, when swapping two variables, if you do not use a third variable and can avoid hurdles such as overflow, for instance by using bitwise operators, then you might save one memory location and thereby the time needed to access it.
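For instance, the bitwise variant in Java would be the following (shown purely for illustration; as the next paragraph argues, the plain temporary-variable version is almost always the better choice):
int a = 10;
int b = 20;
// XOR swap: no temporary variable and no overflow to worry about,
// but considerably harder to read than the plain version.
a ^= b;
b ^= a;
a ^= b;
// a is now 20 and b is now 10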
In practice, this is of almost no value and is trouble for readability as already mentioned in other answers. Professional programs need to be maintained by people so they should be easy to understand.
One definition of good code is: code that actually does what it appears to be doing.
Even you yourself will find it hard to fix your own code if it is written cleverly, in terms of somewhat shortened but complex operations. Readability should always be prioritized, and most of the time the efficiency that is really needed comes from improving the design, the approach, or the data structures and algorithms, rather than from short one-liners.
Quoting Dijkstra: The competent programmer is fully aware of the limited size of his own skull. He therefore approaches his task with full humility, and avoids clever tricks like the plague.
A couple points:
Code should first reflect your intentions. After all, it's meant for humans to read. After that, if you really really must, you can start to tweak the code for performance. Most of all never write code to demonstrate a gimmick or bit twiddling hack.
Breaking code onto multiple lines has absolutely no impact on performance.
Don't underestimate the compiler's optimizer. Just write the code as intuitively as possible, and the optimizer will ensure it has the best performance.
In this regard, the most descriptive, intuitive, and fastest code is:
std::swap(a, b);
Readability and instant understandability are what I personally value (and several others may agree) when writing and reading code; they improve maintainability. In the particular example provided, it is difficult to understand immediately what the author is trying to achieve in those few lines.
The single line a=a+b-(b=a);, although very clever, does not obviously convey the author's intent to others.
In terms of efficiency, optimisation by the compiler will achieve that anyway.
In terms of Java at least, I remember reading that the JVM is optimized for normal, straightforward code, so oftentimes you just fool yourself if you try to do stuff like that.
Moreover it looks awful.
OK, try this. Next time you have a strange bug, start by squashing up as much code into single lines as you can.
Wait a couple weeks so you've forgotten how it's supposed to work.
Try to debug it.
Of course it depends on the compiler, although I cannot foresee any kind of earth-shattering difference. The main result is abstruse code.
I want to ask a complex question.
I have to code a heuristic for my thesis. I need to do the following:
Evaluate some integral functions
Minimize functions over an interval
Do this thousands and thousands of times.
So I need a faster programming language to do these jobs. Which language do you suggest? I started with Java at first, but taking integrals became a problem, and I'm not sure about its speed.
Connecting Java to other software such as MATLAB may be a good idea. Since I'm not sure, I'd like to hear your opinions.
Thanks!
C, Java, and the rest are all Turing-complete languages; they can compute the same functions with the same precision.
If you want to achieve performance goals, use C, which is a compiled, high-performance language. It can decrease your computation time by avoiding the method calls and high-level features present in an interpreted language like Java.
Anyway, remember that your implementation may affect performance more than which language you choose, because as the input size grows it is the computational complexity that matters ( http://en.wikipedia.org/wiki/Computational_complexity_theory ).
It's probably not the programming language, it's your algorithm. Determine the big-O complexity of your algorithm. If you use loops within loops where you could use a hash lookup in a Map instead, your algorithm can be made n times faster.
Note: modern JVMs (JDK 1.5 or 1.6) compile just-in-time to native code (as in, not interpreted) for a specific OS, OS version, and hardware architecture. You could try the -server flag to JIT even more aggressively (at the cost of an even longer initialization time).
Do this thousands and thousands of times.
Are you sure it's not more, something like 10^1000 instead? Try accurately calculating how many times you need to run that loop; it might surprise you. The types of problems on which heuristics are used tend to have a really big search space.
Before you start switching languages, I'd first try to do the following things:
Find the best available algorithms.
Find available implementations of those algorithms usable from your language.
There are, for example, scientific libraries for Java; try to use those libraries (see the sketch at the end of this answer).
If they are not fast enough, investigate whether there is anything to be done about it. Is your problem more specific than what the library assumes? Are you able to improve the algorithm based on that knowledge?
What is it that takes so much time/memory? Is this really related to your language? Take care not to measure JVM start-up time instead of the time it actually spent calculating for you.
Only then would I consider switching languages. But don't expect it to be easy to beat optimized third-party Java libraries with hand-written C.
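For instance, the two requirements in the question map fairly directly onto Apache Commons Math (a sketch only; I'm assuming version 3.x here, so double-check the class names and signatures against the release you actually use):
import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.analysis.integration.SimpsonIntegrator;
import org.apache.commons.math3.optim.MaxEval;
import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
import org.apache.commons.math3.optim.univariate.BrentOptimizer;
import org.apache.commons.math3.optim.univariate.SearchInterval;
import org.apache.commons.math3.optim.univariate.UnivariateObjectiveFunction;
import org.apache.commons.math3.optim.univariate.UnivariatePointValuePair;

// An example function; substitute your own integrand/objective.
UnivariateFunction f = x -> (x - 0.3) * (x - 0.3);

// 1. Evaluate a definite integral numerically.
double integral = new SimpsonIntegrator().integrate(10_000, f, 0.0, 1.0);

// 2. Minimize the same function over an interval.
BrentOptimizer optimizer = new BrentOptimizer(1e-10, 1e-14);
UnivariatePointValuePair minimum = optimizer.optimize(
        new MaxEval(200),
        new UnivariateObjectiveFunction(f),
        GoalType.MINIMIZE,
        new SearchInterval(0.0, 1.0));
double argMin = minimum.getPoint();   // should be close to 0.3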
Order of the algorithm
Typically, switching languages only reduces the time required by a constant factor. Say you can double the speed using C: if your algorithm is O(n^2), it will still take four times as long when you double the data, no matter the language.
And the JVM can optimize a lot of things, with good results.
Some possible optimizations in Java
If you have methods that are called many times, make them final, and the same for entire classes. The compiler will know that it can inline the method code, avoiding the creation of method-call stack frames for those calls.