What currency to use in unit tests? (Java)

I'm developing an application that relies heavily on Joda-Money, and have a number of unit tests that verify my business logic. One (admittedly minor) sticking point for me has been what sort of Money/BigMoney objects to test with; specifically, what CurrencyUnit to use.
As I see it, I have a couple of options:
Just use USD
This is clearly the easiest way to go, and most of my actual application will be working with US Dollars so it makes a fair bit of sense. On the other hand, it feels rather US-centric, and I'm concerned it would risk letting currency-specific errors go unchecked.
Use another real currency, like CAD
This would catch erroneous hard-codings of USD, but otherwise isn't much better than just using USD.
Use a dedicated "fake" currency, i.e. XTS
This clearly makes sense, after all, XTS is "reserved for use in testing". But Joda denotes pseudo-currencies as currencies with -1 decimal places. In practice, the primary difference between currencies in Joda-Money is the number of decimal places, so this risks masking any errors involving decimal place precision, such as erroneously rounding to an integer value.
Register my own custom currency with CurrencyUnit.registerCurrency()
This would obviously work, but seems a little odd seeing as there are alternatives.
Use a CurrencyUnit instance created by a mocking library
Pretty much the same as registering a custom currency.
Again, this is obviously a minor issue, but I'm curious if there's a standard practice for cases like this, or if there's a clear reason to prefer one of these options in particular.

Use USD (or, generally, whatever currency is most commonly used in your application). I say this for two reasons:
Good test data is unremarkable in every respect except that part of it which the test is actually about. When you're writing tests that don't have anything to do with differences between currencies, you don't want to have to think about differences between currencies. Just use whatever is most natural in the application.
The idea that using an unusual currency everywhere will somehow result in better testing of unusual currencies is a red herring. Tests should be explicit and focused. If you need to test something about a specific currency, write a test whose point is to test that thing. And if a test is not about a specific currency, it shouldn't break when handling of some unusual aspect of that currency breaks -- it's not valuable to have half your tests break for the same reason; you only want one to break. So there is just no need to spread unusual currencies around the test suite and hope that that will catch something. Instead, optimize for readability; see point 1.
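To make point 2 concrete, a focused, explicit test might look something like this - a sketch using Joda-Money and JUnit 4, where splitEvenly is a made-up stand-in for whatever business logic you are actually testing:
import static org.junit.Assert.assertEquals;

import java.math.RoundingMode;
import org.joda.money.CurrencyUnit;
import org.joda.money.Money;
import org.junit.Test;

public class CurrencyHandlingTest {

    // Stand-in for the business logic under test.
    static Money splitEvenly(Money total, int parts) {
        return total.dividedBy(parts, RoundingMode.DOWN);
    }

    // Everyday tests just use the application's usual currency.
    @Test
    public void splitsDollarInvoiceEvenly() {
        assertEquals(Money.of(CurrencyUnit.USD, 33.33), splitEvenly(Money.of(CurrencyUnit.USD, 100.00), 3));
    }

    // One explicit test covers the zero-decimal-place case.
    @Test
    public void splitsZeroDecimalCurrencyWithoutFractions() {
        assertEquals(Money.of(CurrencyUnit.JPY, 33), splitEvenly(Money.of(CurrencyUnit.JPY, 100), 3));
    }
}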

Since most of your application deals with U. S. dollars, it makes sense to use them for the vast majority of your tests. What currency-specific errors are you worried might happen? Missing halfpennies when compounding interest? Conversion to yen and back is off by a cent or two? Write tests for that.
If I were you, I'd fire up a local REPL (a Scala REPL works very well if you don't have JShell) and run a few experiments. You'll probably find that you worried for nothing. Or you might discover that there is indeed a flaw, and your REPL session will inform how you write a test for that flaw.
You're using Joda Money because you don't want to reinvent this particular wheel. I didn't realize that until I read the question details. Presumably Joda Money has been tested rigorously by its developers and it does everything you expect it to do. You don't need to test it again.
But if you were making your own class to represent money amounts, I would suggest you use U.S. dollars, euros, Japanese yen and Libyan dinars (LYD). The reason for the first three is obvious. I suggest Libyan dinars because of the three decimal places. From what I can gather, there is no physical 1 dirham coin, so, for example, 513 darahim would get rounded down to 500 darahim and 997 darahim would get rounded up to a full dinar.
To test conversions, I would start with East Caribbean dollars (XCD), since they are pegged at EC$2.70 per US$1. Later on you can worry about currencies that fluctuate in relation to each other, and whether you're going to deal with them by mocking a rate server, by connecting to an actual rate server but allowing for variances in your tests, or in some other way.
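For instance, a fixed-peg conversion test could look roughly like this (a sketch that assumes Joda-Money's convertedTo method; the only magic number is the EC$2.70-per-US$1 peg):
import static org.junit.Assert.assertEquals;

import java.math.BigDecimal;
import java.math.RoundingMode;
import org.joda.money.CurrencyUnit;
import org.joda.money.Money;
import org.junit.Test;

public class FixedPegConversionTest {

    // The East Caribbean dollar is pegged at EC$2.70 per US$1.
    private static final BigDecimal USD_TO_XCD = new BigDecimal("2.70");

    @Test
    public void convertsUsdToXcdAtTheFixedPeg() {
        Money tenDollars = Money.of(CurrencyUnit.USD, 10.00);
        Money converted = tenDollars.convertedTo(CurrencyUnit.of("XCD"), USD_TO_XCD, RoundingMode.HALF_UP);
        assertEquals(Money.of(CurrencyUnit.of("XCD"), 27.00), converted);
    }
}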

Related

XBRL: How do you merge rows from different filings?

We use an XBRL processor to ingest filings from the SEC. Oftentimes, a company declares a metric in different filings under different concepts - with values that may or may not match exactly - that should nevertheless be regarded as the same financial metric. Essentially, when you want to create a stitched view of all the filings, these numbers should appear on the same row. I'll provide an example to make it clear:
ASGN's 2020 10-K filing uses us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestments to report EBT.
ASGN's 2021 10-K filing uses us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest to report EBT.
If you notice, even the figures for 2020 and 2019 do not match between the two filings. My question is - how do you reconcile these cases in code - to create a stitched/continuous view? Is this a solved problem or is it more of a process where you need to make manual interventions? Are there libraries that help with this? Is there mapping information available with the SEC that can be used - even when the data do not agree? Would be great if anyone can help with this. Thanks.
From personal experience I can give you a list of considerations when it comes to non-program-development people who work in the financial sector and submit standardized information:
the level of respect they have for the "you have to do things this way" paradigm is effectively 0.
the expectation that filings aren't filled out properly/correctly should be at 100%.
Even though SEC filings are meant to consolidate data in a standardized, meaningful, readily available and transparent format, the financial sector is plagued with ambiguity and interchangeable terms which may differ from corporate entity to corporate entity.
... or in short ... from their point of view, "ILFCOBITEINI and ILFCOBITMIAILFEMI look pretty similar, so they pretty much mean the same thing."
As far as I know, there is no support from the SEC or other federal entities in charge of controlling SEC filing accuracy, since the idea is "you file it wrong... you pay a fine" .... meaning that, due to the interchangeability of terms, that "wrong" level is pretty ambiguous.
As such, the problem is that you must account for unexpected pseudo-fails when it comes to filings, meaning that you should probably write some code which does structural-to-content identity matches across different entries.
I'd advise using a reasoning logic subsystem (that you'll have to write) instead of a simple switch-case statement operating on an "if-this-exists-else" basis ... and always consider that the level of incompetence in the financial sector is disgustingly high.
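In code, the most pedestrian version of such a matching layer is a curated alias table that folds the different us-gaap concepts onto one canonical row. A sketch (the canonical key "EBT" and the hand-maintained map are assumptions; a real system would also compare contexts, periods and values):
import java.util.HashMap;
import java.util.Map;

public class ConceptNormalizer {

    private static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        CANONICAL.put("us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestments", "EBT");
        CANONICAL.put("us-gaap:IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest", "EBT");
    }

    // Returns the canonical row name, or the raw concept if no mapping is known yet.
    public static String canonicalRow(String xbrlConcept) {
        return CANONICAL.getOrDefault(xbrlConcept, xbrlConcept);
    }
}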
It depends ...
Why is the data for the same "row" (e.g. revenues), for the same time period (e.g. the 12 months ending Dec 31, 2020), different? (Merger or acquisition? Accounting restatement? Something else?)
How might you handle this example, if you were manually "by hand" creating a financial model for this company in a spreadsheet?
Possible approaches:
"Most recent": For each row for each time period, use the most recently reported data.
"As first reported": For each row and each time period, use the "as first reported" data.
These are only two of several ways to present the data.
Neither of the above is "correct" or "better". Each has pros and cons.
Thoughts? Questions?
Point 1: differences aren't unusual, as companies make restatements and corrections from one year to the next. You will find them everywhere, not only with XBRL.
Point 2: they are using labels that look the same for two distinct concepts. On its face, that should not happen, as it invites error if one is just downloading the labeled tables from the SEC. However, the FASB may have changed that from one year to the other. Did you check it? There are other reasons for this kind of error, which are actually the subject of an ongoing research project of mine. They involve error and fraud. So, be careful. There could be more to it.
To answer your question, there is no way to make sure you are doing your work correctly given those discrepancies other than getting an accountant/lawyer to check them. You could also get an intern ;)

How can randomly generated input for a function be used for testing in a systematic way?

How can a method be tested with random input in a systematic way? What I mean by this is: if the input data for the units being tested changes each run, how can you effectively find out which values caused the failure and retest those same values after the code has been changed? For example, take the function int[] foo(int[] data), where it can be checked whether the return value is correct or not. After writing test cases for "hard-coded inputs" (e.g. {1,2,3}), how can random input be tested? If it's random each time, any errors won't be reproducible. I could print the random input to the screen, but that would become messy for each function. Any suggestions?
Also, is it still unit testing if the whole program is being tested? For example, calling the constructor and all methods of a class in one @Test? The class only has one public method to begin with, so this can't really be avoided, but I'm wondering if JUnit is the best tool for it.
How can a method be tested with random input in a systematic way?
In the general case and in the most strict sense, you cannot.
By "strict sense" I mean: validating the correctness of the output no matter which input is presented.
Assuming it were possible, 'strictness' implies that your test case can compute the result of the function on each and every (random) input. Which means you would need to write a piece of code that replicates the method to be tested - theoretically possible, but leading to a paradoxical situation:
assume you find multiple bugs in the method under test. What is the cheapest way to correct it? Of course, substituting the method code with the testing code
and now the tester (the author of the currently implemented method) needs to... what?... write another "incarnation" of the function in order to test her own implementation?
However, "fuzzing" is still a valid method, except that it is never to be taken in the strict sense; the tests expect the results to expose certain traits/invariants/etc., something that can be defined/checked no matter what the input is. For example, "the method never throws", or "the returned array has the same length as the input array (or double the size, or all elements are odd, or whatever)", or "the result is always a proper HTML page which passes the W3C markup validator".
tested with random input in a systematic way?
You almost have an oxymoron here, mate, like "honest politician" or "aeroplane-safe 2016-made Galaxy Note 7". If testing is "systematic" it means "there is a (rule) system that governs the way the tests are conducted" - almost the exact opposite of "random input".
The trick to reconcile the two: you still have a (rule-based) system to categorize your input (e.g. equivalence partitioning), except that instead of taking a certain (constant) representative from each category, you pick the representatives at random. I'm going to repeat this: inside each category, you pick your representative at random (as opposed to "pick a random input and see which category it belongs to").
How's this useful? Not much added value, because according to the system/equivalence, picking a representative is as good as picking any other.
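For illustration, here's what drawing the representative at random inside fixed classes might look like - the four equivalence classes below are illustrative assumptions, not taken from the question; only the representative drawn from each class changes between runs:
import java.util.concurrent.ThreadLocalRandom;

public class RandomRepresentatives {

    // Class 1: the empty array.
    public static int[] emptyArray() {
        return new int[0];
    }

    // Class 2: exactly one element, value chosen at random.
    public static int[] singleElement() {
        return new int[] { ThreadLocalRandom.current().nextInt() };
    }

    // Class 3: all elements strictly negative.
    public static int[] allNegative(int length) {
        int[] a = new int[length];
        for (int i = 0; i < length; i++) {
            a[i] = ThreadLocalRandom.current().nextInt(Integer.MIN_VALUE, 0);
        }
        return a;
    }

    // Class 4: mixed signs, no constraint on values.
    public static int[] mixedSigns(int length) {
        int[] a = new int[length];
        for (int i = 0; i < length; i++) {
            a[i] = ThreadLocalRandom.current().nextInt();
        }
        return a;
    }
}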
Sorry, my QA mate, you don't get off the hook with regard to your responsibility as a tester to design and plan the tests (no matter whether you use random techniques to generate your input).
If it's random each time, any errors won't be reproducible. I could print the random input to the screen, but that would become messy for each function. Any suggestions?
This is a frivolous reason to avoid random input if random input is deemed necessary: just use some tools to visually organize the testing logs if the simple flowing text format is so hard to read.
E.g. output your test log formatted as JSON with a certain structure and use/write a visualisation tool to represent/explore/fold/unfold it in such a way that human exploration is not a pain in the nether back part of the body.
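A lighter-weight variant of the same idea is to attach the generated input to the failure itself, so a failing case can be copied straight into a hard-coded regression test. A sketch, where foo is a stand-in for the method from the question, stubbed here only so the example compiles:
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;
import org.junit.Test;

public class RandomInputLoggingTest {

    // Stand-in for the method under test.
    static int[] foo(int[] data) {
        int[] copy = Arrays.copyOf(data, data.length);
        Arrays.sort(copy);
        return copy;
    }

    @Test
    public void fooPreservesLengthForRandomInput() {
        for (int run = 0; run < 1000; run++) {
            int[] input = ThreadLocalRandom.current().ints(10).toArray();
            // The generated input only shows up when the check fails, so the
            // log stays readable and the failing case is easy to replay.
            assertEquals("failed for input " + Arrays.toString(input), input.length, foo(input).length);
        }
    }
}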
If your job is to automate the testing, it is supposed you are able to code, isn't it?
Also, is it still unit testing if the whole program is being tested?
"Whole program is being tested" exactly how? What is the specific goal of this "whole system testing"?
There is a distinction between "unit testing" (even a more-than-comprehensive 10000% coverage unit testing) and "functional testing" or "integration testing" or "performance testing" or "usability testing" or "localization testing" (this including the gra'ma' of the UI/error messages) - all the latter belonging to "whole program testing" without being unit testing.
Hint: in defining the type of testing, the specific goal one has in mind when designing/executing tests takes precedence over the means used in testing; I've seen testing performed manually using GUI test harnesses, in which the testers were manually writing values to unit test the SDK underneath.
On the other hand, there are categories of non-unit testing which may make use of unit testing techniques (e.g. a Web or REST service - wrap and present it as a proxy-function API and then you can write your tests using JUnit/TestNG or whatever unit test framework you fancy. And yet, you are doing functional or integration testing).
Property-based testing could be a solution for you. Basically, the idea is to have a framework generating all sorts of input data (random, edge cases) which is then fed into your implementation.
It's true that with random data you may end up with test cases behaving differently on every run. But at least the test framework would usually show you which input was used when a test fails, so you can take a further look into the reasons for the failure. It's not a guarantee that your method will work in 100% of the cases, but at least you get some coverage, and it's still better than nothing.
Typically, such a framework also allows you to restrict the generated data to a set which makes sense for your implementation. Or you can implement your own generator providing data for your tests.
For Java there is e.g. JUnit-Quickcheck which integrates with JUnit: http://pholser.github.io/junit-quickcheck/site/0.6.1/
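A property for the int[] foo(int[] data) example from the question might look roughly like this - a sketch assuming the @RunWith/@Property API of the junit-quickcheck version linked above, with foo stubbed so the example compiles and a same-length invariant chosen purely as an illustration:
import static org.junit.Assert.assertEquals;

import com.pholser.junit.quickcheck.Property;
import com.pholser.junit.quickcheck.runner.JUnitQuickcheck;
import java.util.Arrays;
import org.junit.runner.RunWith;

@RunWith(JUnitQuickcheck.class)
public class FooProperties {

    // Stand-in for the method under test.
    static int[] foo(int[] data) {
        int[] copy = Arrays.copyOf(data, data.length);
        Arrays.sort(copy);
        return copy;
    }

    // The framework generates many random arrays and reports the input
    // that made the property fail.
    @Property
    public void outputHasSameLengthAsInput(int[] data) {
        assertEquals(data.length, foo(data).length);
    }
}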
A lot has already been written about the differences between unit tests/integration tests etc. Maybe have a look here: What is the difference between integration and unit tests?

Replace Java math operators with BigDecimal equivalents

I have a program that works with snippets of Java code that do math on doubles using the standard mathematical operators, like this:
double someVal = 25.03;
return (someVal * 3) - 50;
For Reasons (mostly rounding errors) I would like to change all these snippets to use BigDecimal instead of double, modifying the math functions along the way, like this:
MathContext mc = MathContext.DECIMAL32;
BigDecimal someVal = new BigDecimal("25.03", mc);
return someVal.multiply(BigDecimal.valueOf(3), mc).subtract(BigDecimal.valueOf(50), mc);
The snippets are mostly pretty simple, but I would prefer to avoid a fragile solution (eg, regex) if I can. Is there a relatively straightforward way to do this?
Note I want to have a program or code perform these modifications (metaprogramming). Clearly I'm capable of making the changes by hand, but life is too short.
You could try Google's "Refaster", which, according to the paper, is "a tool that uses normal, compilable before-and-after examples of Java code to specify a Java refactoring."
The code lives under core/src/main/java/com/google/errorprone/refaster in Google's error-prone github project. (It used to live in its own github project.)
This is more of a hint than an answer, though, since I've never worked directly with Refaster and don't know how well it does on primitive expressions like the ones in your example. I also don't know how well-suited it is for toolchain use like yours, vs. one-time refactors (which is how Google tends to use it). But it's actively maintained, and I've seen it used really effectively in the past.
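To give a flavour of what a Refaster rule looks like, here is a sketch of the before/after template shape. This particular rule only swaps the lossy double constructor for BigDecimal.valueOf; whether a full double-arithmetic-to-BigDecimal rewrite can be expressed this way is exactly the open question above:
import com.google.errorprone.refaster.annotation.AfterTemplate;
import com.google.errorprone.refaster.annotation.BeforeTemplate;
import java.math.BigDecimal;

public class BigDecimalValueOfTemplate {

    // Matches every occurrence of "new BigDecimal(someDouble)"...
    @BeforeTemplate
    BigDecimal before(double value) {
        return new BigDecimal(value);
    }

    // ...and rewrites it to the better-behaved factory method.
    @AfterTemplate
    BigDecimal after(double value) {
        return BigDecimal.valueOf(value);
    }
}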
We use BigDecimal for financial calculation. As per other comments, you're going to have some performance degradation and the code will be very hard to read. The performance impact depends on how many operations you're going to have. Usually, you face rounding issues with doubles when your calculation chain is long. You won't have many issues if you do c=a+b, but you will if you do c+=a+b a million times. And with thousands of operations you will notice how much slower BigDecimal is than double, so do performance testing.
Be careful when changing your code, especially with division: you will have to specify the rounding mode and the scale of the result. This is something people usually don't do, and it leads to errors.
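A minimal illustration of the division pitfall, with nothing project-specific about it:
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DivisionExample {

    static BigDecimal oneThird() {
        BigDecimal one = new BigDecimal("1");
        BigDecimal three = new BigDecimal("3");
        // one.divide(three) would throw ArithmeticException, because the
        // exact quotient 0.333... never terminates.
        return one.divide(three, 2, RoundingMode.HALF_UP); // 0.33
    }
}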
I assume it is not only about replacing calculation logic; you'll also need to change your domain model, so I doubt you can come up with a script to do it in a reasonable time. Do it by hand; a good IDE will help you a lot.
No matter how you are going to convert your code, I suggest you first make sure that all your calculation logic is covered by unit tests, and convert the unit tests before changing the logic, i.e. replace assertions on plain values by wrapping them in BigDecimals. That way you will avoid silly typing/algorithm mistakes.
I won't answer your question of how to convert from double to BigDecimal; I just want to share some notes regarding the idea itself:
Don't do this.
It's a huge readability hit. Your example turned "25.03 * 3 - 50" into 4 lines of code.
Financial code usually uses double, or long for cents. It's precise enough that it's just a question of proper programming to avoid rounding errors: What to do with Java BigDecimal performance?
It's, likely, a huge performance hit, in particular with erratic garbage collections, which is not acceptable for HFT and the like: https://stackoverflow.com/a/1378156/1339987
I don't know how much code you are talking about but I do expect you will have to make lots of small decisions. This reduces the chance there is an openly available tool (which I'm all but sure does not exist) and increases the amount of configuration work you would have to do should you find one.
This will introduce bugs for the same reason you expect it to be an improvement, unless you have extremely good test coverage.
Programmatically manipulating Java source is covered here: Automatically generating Java source code
You don't need to accept this answer, but I believe my advice is correct and think other readers of this question need to see, up front, the case for not making this transformation, which is my rationale for posting this somewhat-non-answer.

Do testing frameworks exist that allow a percentage of failure?

When reading a question about testing a Java program several times with random seeds, the word testing sparked an association with unit testing in my mind, but that might not have been the kind of testing going on there.
In my opinion introducing randomness in a unit test would be considered bad practice, but then I started considering the case where a (small) percentage of failure might be acceptable for the moment.
For example, the code fails to pass the unit test once every 10^n runs for n > 3, and gradually you want n to go to infinity without the test going red - maybe yellowish.
Another example might be a system-wide test where most of the time things go right, but you still want to limit/know how often they might go wrong.
So my question is: are there any frameworks (across the full spectrum of testing) out there that can be persuaded to allow a percentage of failure over a huge/excessive number of repeated tests?
You can "persuade" most testing frameworks to allow partial failures by performing your "tests" not with the framework's assertions directly but with plain old conditionals (e.g. if-statements), recording the percentage of failures. Then use the framework to assert that this percentage is below your threshold.
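A sketch of that shape in JUnit, where behavesCorrectly is a placeholder for whatever check you would normally assert directly and the 1-in-10^3 threshold is arbitrary:
import static org.junit.Assert.assertTrue;

import java.util.concurrent.ThreadLocalRandom;
import org.junit.Test;

public class FailureRateTest {

    // Placeholder for the real check.
    static boolean behavesCorrectly(int[] input) {
        return input.length == 10;
    }

    @Test
    public void failureRateStaysBelowThreshold() {
        int runs = 100000;
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            int[] input = ThreadLocalRandom.current().ints(10).toArray();
            if (!behavesCorrectly(input)) {
                failures++;
            }
        }
        double rate = (double) failures / runs;
        // The framework only ever sees this single assertion on the aggregate.
        assertTrue("failure rate was " + rate, rate < 0.001);
    }
}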
If your tests are deterministic (which is generally considered to be a good thing) then another approach is to test for the current behavior even when it's wrong, but to comment the incorrect assertions (typically with what the "right" answer should be). If these tests ever fail you can check the comment. If the code has become "correct" then great, update that assertion. If not, then decide whether the change in behavior is better or worse than the old behavior, and act accordingly. This approach lets you sort of "tighten the screws" over time.

For java applications, is it safe to use BigDecimal when dealing with money, or should I use integers and create an abstraction for money?

It's been a while since I've written applications that dealt with money. Many years ago, I would make a money object that dealt with integers behind the scenes. Whenever the amount was printed somewhere, it would simply put the decimal point in the correct spot. This was to prevent decimal problems.
Do I still need to do this, or can I just use BigDecimal? What is considered the best practice right now?
It depends on your requirements. You may only have a need for resolution to the nearest K (for example, salary requirements on a job posting website).
Assuming you mean you need granularity, BigDecimal seems perfectly suited for the job. It seems certainly "safe" to use, but without knowing exactly what you plan to do with it, it's hard to say for certain.
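If you do go with BigDecimal, the main discipline it asks of you is keeping the scale and rounding explicit. A minimal sketch; the tax example and the HALF_UP choice are arbitrary:
import java.math.BigDecimal;
import java.math.RoundingMode;

public class PriceMath {

    // Monetary amounts are kept at the currency's scale (2 here) and every
    // operation that can change the scale rounds explicitly.
    static BigDecimal addTax(BigDecimal netAmount, BigDecimal taxRate) {
        return netAmount.multiply(BigDecimal.ONE.add(taxRate)).setScale(2, RoundingMode.HALF_UP);
    }
}
For example, addTax(new BigDecimal("19.99"), new BigDecimal("0.07")) yields 21.39.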
