I have recently set up find bugs in eclipse to see what reports it generates. Ive set all the settings to be as sensitive as possible. If i create a small application that writes to a file and do not close the stream it picks it up which is all good.
However, using a project that has been written where we no there are a couple of bugs, especially in the output, we get no errors at all (in terms of find bugs)
I was wondering if anyone could run this through their version and report whether I may have find bugs set up incorrectly or whether in fact it could find no bugs?
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
public class SpellUtil {
private final static String teenSpelling[] = {"Zero", "One", "Two", "Three",
"Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten", "Eleven",
"Twelve", "Thirteen", "Fourteen", "Fifteen", "Sixteen",
"Seventeen", "Eighteen", "Nineteen"};
private final static String centSpelling[] = {"Twenty", "Thirty", "Forty",
"Fifty", "Sixty", "Seventy", "Eighty", "Ninety"};
private final static String suffixSpelling[] = {
"", // Dummy! no level 0 (added for nicer indexing in code)
"", // Nothing for level 1
" Thousand, ", " Million, ", " Billion, ", " Trillion, ", " Quadrillion, ",
" Quintillion, "};
public static String spell(int number) {
int rem, placeIndicator = 1;
boolean isNegative = false;
List<String> spelling = new ArrayList<String>();
if (number < 0) {
isNegative = true;
number = Math.abs(number);
}
while (number > 0) {
rem = number % 1000;
number = number / 1000;
spelling.add(suffixSpelling[placeIndicator]);
try {
spelling.add(spellBelow1000(rem));
} catch (SpellingException e) {
System.out.println(e.getMessage());
}
placeIndicator++;
}
StringBuilder sb = new StringBuilder();
if (isNegative) sb.append("Minus ");
for (int i = spelling.size() - 1; i >= 0; i--) {
sb.append(spelling.get(i));
}
return sb.toString();
}
private static String spellBelow1000(int number) throws SpellingException {
if (number < 0 || number >= 1000)
throw new SpellingException("Expecting a number between 0 and 999: " + number);
if (number < 20) {
// if number is a teen,
// find it in teen table and return its equivalent text (word).
return teenSpelling[number];
} else if (number < 100) {
// otherwise, if it is a cent,
// find the most (div) and least (rem) significant digits (MSD/LSD)
int div = (int) number / 10;
int rem = (int) number % 10;
if (rem == 0) {
// if LSD is zero, return the cent key word directly (like
// fifty).
return centSpelling[div-2];
} else {
// otherwise, return the text as cent-teen (like fifty-one)
return centSpelling[div-2] + "-" + teenSpelling[rem];
}
} else {
// otherwise, it is a mil;
// find it's MSD and remaining cent.
int div = number / 100;
int rem = (int) number % 100; // TODO will findbugs detect unnecessary (int)?
// Prepare the mil prefix:
String milText = teenSpelling[div] + " Hundred";
// decide whether to append the cent tail or not.
if (rem == 0) {
// if it does have a non-zero cent, that's it.
// return the mil prefix, for example three hundred:
return milText;
} else {
// otherwise, spell the cent and append it to mil prefix.
// (now, rem is a cent).
// For example, three Hundred and Sixty-Four:
return milText + " and " + spellBelow1000(rem);
}
}
}
}
Your expectation to find a bug in this line:
int rem = (int) number % 100; // TODO will findbugs detect unnecessary (int)?
is WRONG because the result of a % operation is not an integer in general.
In C and C++, the remainder operator accepts only integral operands, but in Java, it also accepts floating-point operands. This means statements such as double x = 8.2 % 4; are quite valid in Java and the result could be a non-integer value. (0.1999999999999993 in this case)
Please see the Java language specification here.
You seem to have a problem with findbugs configuration. I suggest using Findbugs via Sonar. It is much more easy to configure and you get checkstyle, pmd and a system for managing and resolving violations.
Sonar findbugs page
I think that the problem is that you misunderstanding what FindBugs does and what its capabilities are.
Basically, FindBugs parses each class to produce a parse-tree - an in-memory representation of the structure of the program. Then attempts to find places in the tree that match known patterns which represent incorrect or questionable programming. For example:
if (someString == "42") {
....
}
FindBugs will most likely tell you comparing strings using the '==' operator here is wrong. What it has done is look through the class for any expression node where the operator is '==' and one or both of the operands is a String. It will repeat this procedure for a large number of patterns that it has been programmed to detect and report. Some will be more complicated than this one, but basically, FindBug is only doing a form of structural pattern matching.
What FindBugs does not and cannot do is to understand what your program is actually supposed to do. So for example:
public boolean isOdd(int arg) {
return (arg % 2) == 0;
}
This is obviously incorrect to anyone who understands simple mathematics ... but FindBugs won't notice it. That is because FindBugs has no idea what the method is actually supposed to do. Furthermore, it is not capable of doing the elementary semantic analysis that is required to figure out that the code doesn't implement the math.
The reason I am doing this is because I need to do a presentation of find bugs and I need an application to generate some bugs to show how it works.
Maybe you need to cheat a bit:
Read the Findbugs documentation to understand the things that it is capable of finding.
Write some "toy" applications with bugs that you know Findbugs can find for you.
It is also worthwhile including examples that you know it won't find ... so that you can explain the limitations of Findbugs.
What findbugs does is to look for some common mistakes that could (and most probably would) lead to unexpected/unwanted behavior. Most of those errors are technical or wrong use of java and its API. A list of all findbugs checks can be found here In other words if you did something you don't wanted in a right way, findbugs would not detect it. In your code I can't see anything that would be detected by findbugs. The unnecessary cast you mentioned in your comments is not a findbugs rule because it doesn't change the behavior of your code. It is more of a style or efficiency error and would be detected by tools like checkstyle or PMD.
Related
The method returns an "Ingredient" object that is constructed from a given line in a recipe txt file. Note: an InvalidIngredientException is like an Ingredient version of an InputMismatchException. This isn't thrown by any of the lines in the given recipe file.
public static Ingredient parseString(String line)
throws InvalidIngredientException {
double quantity = 1;
String measurement = "";
String[] parts = line.split(";");
if (parts.length == 1) {
throw new InvalidIngredientException(EXP_MSG);
}
if (!parts[0].trim().isEmpty()
&& !(Double.parseDouble(parts[0]) == 1)) {
quantity = Double.parseDouble(parts[0].trim());
}
if (!parts[1].trim().isEmpty()) {
measurement = parts[1].trim();
}
return new Ingredient(quantity, measurement, parts[2].trim());
}
A recipe file looks like this:
Cranberry Oatmeal Chews
8; tablespoon; butter
2; tablespoon; oil
1; cup; light brown sugar
1; ; zest of one orange
6; tablespoon; sour cream
2; teaspoon; vanilla
1.5; cup; flour
.5; teaspoon; baking soda
1; teaspoon; cinammon
.5; teaspoon; salt
2; cup; oats
1.5; cup; dried cranberries
.5; cup; walnuts
The method works, but I feel like it could use less code.
What you are trying to do is called "bind CSV row to an object". There are quite a few good libraries for parsing CSV, most mature ones offering the binding functionality as well. Also there are annotation-based code generation enabled frameworks like Lombok or Jackson which make Java one step closer to convenient languages like Scala, by saving you from writing very verbose getters/setters by hand (with a minor complication to the build process, maybe).
And once you use correct search term you will find a plenty of examples. One doing just what I described above is this one, below is a version adjusted to your naming. It is using Jackson.
Object definition with Jackson annotations:
#JsonPropertyOrder({"quantity", "measure", "ingredient"})
public class Ingredient {
public double quantity;
public String measure;
public int ingredient;
}
Invocation code with the Jackson CsvMapper:
List<Ingredient> result = new CsvMapper()
.readerWithTypedSchemaFor(Ingredient.class)
.readValues(csvFile)
.readAll();
There are a few little things that you can do to make your code look better and a increase the performance a bit also.
Change the regex when splitting the line to "\s;\s". This way you avoid the trim() calls multiple times
Use else if...else syntax. This will make your code not only a bit shorter but also easier to read.
I'm not used to Java, so there might be minor errors in this code. Feel free to edit if you see one.
Coming from a C# perspective, I'd make the following changes:
Move the .trim() calls to one place. Here I do that just after splitting the input lines into parts by creating a Stream using of() then calling map (assuming Java 8 or newer).
Remove the lines instantiating default values for quantity and measurement. Because we don't have to trim, we can use the ternary operator to declare and instantiate the variables on the same line.
Don't check if parts[1] is empty. Since "" is the fallback value, it doesn't matter if parts[1] is also "". This also means you don't need the intermediate measurement variable.
public static Ingredient parseString(String line)
throws InvalidIngredientException {
String[] parts = Stream.of(line.split(";")).map(p => p.trim()).toArray();
if (parts.length == 1) {
throw new InvalidIngredientException(EXP_MSG);
}
double quantity = parts[0].isEmpty() ? 1 : Double.parseDouble(parts[0]);
return new Ingredient(quantity, parts[1], parts[2]);
}
You also have the potential to throw errors other than InvalidIngredientException if parts.length == 0 or if parts[0] cannot be parsed to a Double. I'm not sure how strict you're supposed to be when declaring which exceptions your method can throw, but here's a version that should catch any exceptions and only return the InvalidIngredientException you declared.
public static Ingredient parseString(String line)
throws InvalidIngredientException {
try {
String[] parts = Stream.of(line.split(";")).map(p => p.trim()).toArray();
double quantity = parts[0].isEmpty() ? 1 : Double.parseDouble(parts[0]);
return new Ingredient(quantity, parts[1], parts[2]);
}
catch (Exception e) {
throw new InvalidIngredientException(EXP_MSG);
}
}
For Java practice I started working on a method countBinary that accepts an integer n as a parameter that prints all binary numbers that have n digits in ascending order, printing each value on a separate line. Assuming n is non-negative and greater than 0, some example outputs would look like this.
I am getting pretty much nowhere with this. I am able to write a program that finds all possible letter combinations of a String and similar things, but I have been unable to make almost any progress with this specific problem using binary and integers.
Apparently the best way to go about this issue is by defining a helper method that accepts different parameters than the original method and by building up a set of characters as a String for eventual printing.
Important Note: I am NOT supposed to use for loops at all for this exercise.
Edit - Important Note: I need to have trailing 0's so that all outputs are the same length.
So far this is what I have:
public void countBinary(int n)
{
String s = "01";
countBinary(s, "", n);
}
private static void countBinary(String s, String chosen, int length)
{
if (s.length() == 0)
{
System.out.println(chosen);
}
else
{
char c = s.charAt(0);
s = s.substring(1);
chosen += c;
countBinary(s, chosen, length);
if (chosen.length() == length)
{
chosen = chosen.substring(0, chosen.length() - 1);
}
countBinary(s, chosen, length);
s = c + s;
}
}
When I run my code my output looks like this.
Can anyone explain to me why my method is not running the way I expect it to, and if possible show me a solution to my issue so that I might get the correct output? Thank you!
There are more efficient ways to do it, but this will give you a start:
public class BinaryPrinter {
static void printAllBinary(String s, int n) {
if (n == 0) System.out.println(s);
else {
printAllBinary(s + '0', n - 1);
printAllBinary(s + '1', n - 1);
}
}
public static void main(String [] args) {
printAllBinary("", 4);
}
}
I'll let you work out the more efficient way.
QUESTION:
How can I read the string "d6+2-d4" so that each d# will randomly generate a number within the parameter of the dice roll?
CLARIFIER:
I want to read a string and have it so when a d# appears, it will randomly generate a number such as to simulate a dice roll. Then, add up all the rolls and numbers to get a total. Much like how Roll20 does with their /roll command for an example. If !clarifying {lstThen.add("look at the Roll20 and play with the /roll command to understand it")} else if !understandStill {lstThen.add("I do not know what to say, someone else could try explaining it better...")}
Info:
I was making a Java program for Dungeons and Dragons, only to find that I have come across a problem in figuring out how to calculate the user input: I do not know how to evaluate a string such as this.
I theorize that I may need Java's eval at the end. I do know what I want to happen/have a theory on how to execute (this is more so PseudoCode than Java):
Random rand = new Random();
int i = 0;
String toEval;
String char;
String roll = txtField.getText();
while (i<roll.length) {
check if character at i position is a d, then highlight the numbers
after d until it comes to a special character/!aNumber
// so if d was found before 100, it will then highlight 100 and stop
// if the character is a symbol or the end of the string
if d appears {
char = rand.nextInt(#);
i + #'s of places;
// so when i++ occurs, it will move past whatever d# was in case
// d# was something like d100, d12, or d5291
} else {
char = roll.length[i];
}
toEval = toEval + char;
i++;
}
perform evaluation method on toEval to get a resulting number
list.add(roll + " = " + evaluated toEval);
EDIT:
With weston's help, I have honed in on what is likely needed, using a splitter with an array, it can detect certain symbols and add it into a list. However, it is my fault for not clarifying on what else was needed. The pseudocode above doesn't helpfully so this is what else I need to figure out.
roll.split("(+-/*^)");
As this part is what is also tripping me up. Should I make splits where there are numbers too? So an equation like:
String[] numbers = roll.split("(+-/*^)");
String[] symbols = roll.split("1234567890d")
// Rough idea for long way
loop statement {
loop to check for parentheses {
set operation to be done first
}
if symbol {
loop for symbol check {
perform operations
}}} // ending this since it looks like a bad way to do it...
// Better idea, originally thought up today (5/11/15)
int val[];
int re = 1;
loop {
if (list[i].containsIgnoreCase(d)) {
val[]=list[i].splitIgnoreCase("d");
list[i] = 0;
while (re <= val[0]) {
list[i] = list[i] + (rand.nextInt(val[1]) + 1);
re++;
}
}
}
// then create a string out of list[]/numbers[] and put together with
// symbols[] and use Java's evaluator for the String
wenton had it, it just seemed like it wasn't doing it for me (until I realised I wasn't specific on what I wanted) so basically to update, the string I want evaluated is (I know it's a little unorthodox, but it's to make a point; I also hope this clarifies even further of what is needed to make it work):
(3d12^d2-2)+d4(2*d4/d2)
From reading this, you may see the spots that I do not know how to perform very well... But that is why I am asking all you lovely, smart programmers out there! I hope I asked this clearly enough and thank you for your time :3
The trick with any programming problem is to break it up and write a method for each part, so below I have a method for rolling one dice, which is called by the one for rolling many.
private Random rand = new Random();
/**
* #param roll can be a multipart roll which is run and added up. e.g. d6+2-d4
*/
public int multiPartRoll(String roll) {
String[] parts = roll.split("(?=[+-])"); //split by +-, keeping them
int total = 0;
for (String partOfRoll : parts) { //roll each dice specified
total += singleRoll(partOfRoll);
}
return total;
}
/**
* #param roll can be fixed value, examples -1, +2, 15 or a dice to roll
* d6, +d20 -d100
*/
public int singleRoll(String roll) {
int di = roll.indexOf('d');
if (di == -1) //case where has no 'd'
return Integer.parseInt(roll);
int diceSize = Integer.parseInt(roll.substring(di + 1)); //value of string after 'd'
int result = rand.nextInt(diceSize) + 1; //roll the dice
if (roll.startsWith("-")) //negate if nessasary
result = -result;
return result;
}
Recently I have been had to search a number of string values to see which one matches a certain pattern. Neither the number of string values nor the pattern itself is clear until a search term has been entered by the user. The problem is I have noticed each time my application runs the following line:
if (stringValue.matches (rexExPattern))
{
// do something so simple
}
it takes about 40 micro second. No need to say when the number of string values exceeds a few thousands, it'll be too slow.
The pattern is something like:
"A*B*C*D*E*F*"
where A~F are just examples here, but the pattern is some thing like the above. Please note* that the pattern actually changes per search. For example "A*B*C*" may change to W*D*G*A*".
I wonder if there is a better substitution for the above pattern or, more generally, an alternative for java regular expressions.
Regular expressions in Java are compiled into an internal data structure. This compilation is the time-consuming process. Each time you invoke the method String.matches(String regex), the specified regular expression is compiled again.
So you should compile your regular expression only once and reuse it:
Pattern pattern = Pattern.compile(regexPattern);
for(String value : values) {
Matcher matcher = pattern.matcher(value);
if (matcher.matches()) {
// your code here
}
}
Consider the following (quick and dirty) test:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test3 {
// time that tick() was called
static long tickTime;
// called at start of operation, for timing
static void tick () {
tickTime = System.nanoTime();
}
// called at end of operation, prints message and time since tick().
static void tock (String action) {
long mstime = (System.nanoTime() - tickTime) / 1000000;
System.out.println(action + ": " + mstime + "ms");
}
// generate random strings of form AAAABBBCCCCC; a random
// number of characters each randomly repeated.
static List<String> generateData (int itemCount) {
Random random = new Random();
List<String> items = new ArrayList<String>();
long mean = 0;
for (int n = 0; n < itemCount; ++ n) {
StringBuilder s = new StringBuilder();
int characters = random.nextInt(7) + 1;
for (int k = 0; k < characters; ++ k) {
char c = (char)(random.nextInt('Z' - 'A') + 'A');
int rep = random.nextInt(95) + 5;
for (int j = 0; j < rep; ++ j)
s.append(c);
mean += rep;
}
items.add(s.toString());
}
mean /= itemCount;
System.out.println("generated data, average length: " + mean);
return items;
}
// match all strings in items to regexStr, do not precompile.
static void regexTestUncompiled (List<String> items, String regexStr) {
tick();
int matched = 0, unmatched = 0;
for (String item:items) {
if (item.matches(regexStr))
++ matched;
else
++ unmatched;
}
tock("uncompiled: regex=" + regexStr + " matched=" + matched +
" unmatched=" + unmatched);
}
// match all strings in items to regexStr, precompile.
static void regexTestCompiled (List<String> items, String regexStr) {
tick();
Matcher matcher = Pattern.compile(regexStr).matcher("");
int matched = 0, unmatched = 0;
for (String item:items) {
if (matcher.reset(item).matches())
++ matched;
else
++ unmatched;
}
tock("compiled: regex=" + regexStr + " matched=" + matched +
" unmatched=" + unmatched);
}
// test all strings in items against regexStr.
static void regexTest (List<String> items, String regexStr) {
regexTestUncompiled(items, regexStr);
regexTestCompiled(items, regexStr);
}
// generate data and run some basic tests
public static void main (String[] args) {
List<String> items = generateData(1000000);
regexTest(items, "A*");
regexTest(items, "A*B*C*");
regexTest(items, "E*C*W*F*");
}
}
Strings are random sequences of 1-8 characters with each character occurring 5-100 consecutive times (e.g. "AAAAAAGGGGGDDFFFFFF"). I guessed based on your expressions.
Granted this might not be representative of your data set, but the timing estimates for applying those regular expressions to 1 million randomly generates strings of average length 208 each on my modest 2.3 GHz dual-core i5 was:
Regex Uncompiled Precompiled
A* 0.564 sec 0.126 sec
A*B*C* 1.768 sec 0.238 sec
E*C*W*F* 0.795 sec 0.275 sec
Actual output:
generated data, average length: 208
uncompiled: regex=A* matched=6004 unmatched=993996: 564ms
compiled: regex=A* matched=6004 unmatched=993996: 126ms
uncompiled: regex=A*B*C* matched=18677 unmatched=981323: 1768ms
compiled: regex=A*B*C* matched=18677 unmatched=981323: 238ms
uncompiled: regex=E*C*W*F* matched=25495 unmatched=974505: 795ms
compiled: regex=E*C*W*F* matched=25495 unmatched=974505: 275ms
Even without the speedup of precompiled expressions, and even considering that the results vary wildly depending on the data set and regular expression (and even considering that I broke a basic rule of proper Java performance tests and forgot to prime HotSpot first), this is very fast, and I still wonder if the bottleneck is truly where you think it is.
After switching to precompiled expressions, if you still are not meeting your actual performance requirements, do some profiling. If you find your bottleneck is still in your search, consider implementing a more optimized search algorithm.
For example, assuming your data set is like my test set above: If your data set is known ahead of time, reduce each item in it to a smaller string key by removing repetitive characters, e.g. for "AAAAAAABBBBCCCCCCC", store it in a map of some sort keyed by "ABC". When a user searches for "ABC*" (presuming your regex's are in that particular form), look for "ABC" items. Or whatever. It highly depends on your scenario.
Long.parseLong("string") throws an error if string is not parsable into long.
Is there a way to validate the string faster than using try-catch?
Thanks
You can create rather complex regular expression but it isn't worth that. Using exceptions here is absolutely normal.
It's natural exceptional situation: you assume that there is an integer in the string but indeed there is something else. Exception should be thrown and handled properly.
If you look inside parseLong code, you'll see that there are many different verifications and operations. If you want to do all that stuff before parsing it'll decrease the performance (if we are talking about parsing millions of numbers because otherwise it doesn't matter). So, the only thing you can do if you really need to improve performance by avoiding exceptions is: copy parseLong implementation to your own function and return NaN instead of throwing exceptions in all correspondent cases.
From commons-lang StringUtils:
public static boolean isNumeric(String str) {
if (str == null) {
return false;
}
int sz = str.length();
for (int i = 0; i < sz; i++) {
if (Character.isDigit(str.charAt(i)) == false) {
return false;
}
}
return true;
}
You could do something like
if(s.matches("\\d*")){
}
Using regular expression - to check if String s is full of digits.
But what do you stand to gain? another if condition?
org.apache.commons.lang3.math.NumberUtils.isParsable(yourString) will determine if the string can be parsed by one of: Integer.parseInt(String), Long.parseLong(String), Float.parseFloat(String) or Double.parseDouble(String)
Since you are interested in Longs you could have a condition that checks for isParsable and doesn't contain a decimal
if (NumberUtils.isParsable(yourString) && !StringUtils.contains(yourString,".")){ ...
This is a valid question because there are times when you need to infer what type of data is being represented in a string. For example, you may need to import a large CSV into a database and represent the data types accurately. In such cases, calling Long.parseLong and catching an exception can be too slow.
The following code only handles ASCII decimal:
public class LongParser {
// Since tryParseLong represents the value as negative during processing, we
// counter-intuitively want to keep the sign if the result is negative and
// negate it if it is positive.
private static final int MULTIPLIER_FOR_NEGATIVE_RESULT = 1;
private static final int MULTIPLIER_FOR_POSITIVE_RESULT = -1;
private static final int FIRST_CHARACTER_POSITION = 0;
private static final int SECOND_CHARACTER_POSITION = 1;
private static final char NEGATIVE_SIGN_CHARACTER = '-';
private static final char POSITIVE_SIGN_CHARACTER = '+';
private static final int DIGIT_MAX_VALUE = 9;
private static final int DIGIT_MIN_VALUE = 0;
private static final char ZERO_CHARACTER = '0';
private static final int RADIX = 10;
/**
* Parses a string representation of a long significantly faster than
* <code>Long.ParseLong</code>, and avoids the noteworthy overhead of
* throwing an exception on failure. Based on the parseInt code from
* http://nadeausoftware.com/articles/2009/08/java_tip_how_parse_integers_quickly
*
* #param stringToParse
* The string to try to parse as a <code>long</code>.
*
* #return the boxed <code>long</code> value if the string was a valid
* representation of a long; otherwise <code>null</code>.
*/
public static Long tryParseLong(final String stringToParse) {
if (stringToParse == null || stringToParse.isEmpty()) {
return null;
}
final int inputStringLength = stringToParse.length();
long value = 0;
/*
* The absolute value of Long.MIN_VALUE is greater than the absolute
* value of Long.MAX_VALUE, so during processing we'll use a negative
* value, then we'll multiply it by signMultiplier before returning it.
* This allows us to avoid a conditional add/subtract inside the loop.
*/
int signMultiplier = MULTIPLIER_FOR_POSITIVE_RESULT;
// Get the first character.
char firstCharacter = stringToParse.charAt(FIRST_CHARACTER_POSITION);
if (firstCharacter == NEGATIVE_SIGN_CHARACTER) {
// The first character is a negative sign.
if (inputStringLength == 1) {
// There are no digits.
// The string is not a valid representation of a long value.
return null;
}
signMultiplier = MULTIPLIER_FOR_NEGATIVE_RESULT;
} else if (firstCharacter == POSITIVE_SIGN_CHARACTER) {
// The first character is a positive sign.
if (inputStringLength == 1) {
// There are no digits.
// The string is not a valid representation of a long value.
return null;
}
} else {
// Store the (negative) digit (although we aren't sure yet if it's
// actually a digit).
value = -(firstCharacter - ZERO_CHARACTER);
if (value > DIGIT_MIN_VALUE || value < -DIGIT_MAX_VALUE) {
// The first character is not a digit (or a negative sign).
// The string is not a valid representation of a long value.
return null;
}
}
// Establish the "maximum" value (actually minimum since we're working
// with negatives).
final long rangeLimit = (signMultiplier == MULTIPLIER_FOR_POSITIVE_RESULT)
? -Long.MAX_VALUE
: Long.MIN_VALUE;
// Capture the maximum value that we can multiply by the radix without
// overflowing.
final long maxLongNegatedPriorToMultiplyingByRadix = rangeLimit / RADIX;
for (int currentCharacterPosition = SECOND_CHARACTER_POSITION;
currentCharacterPosition < inputStringLength;
currentCharacterPosition++) {
// Get the current digit (although we aren't sure yet if it's
// actually a digit).
long digit = stringToParse.charAt(currentCharacterPosition)
- ZERO_CHARACTER;
if (digit < DIGIT_MIN_VALUE || digit > DIGIT_MAX_VALUE) {
// The current character is not a digit.
// The string is not a valid representation of a long value.
return null;
}
if (value < maxLongNegatedPriorToMultiplyingByRadix) {
// The value will be out of range if we multiply by the radix.
// The string is not a valid representation of a long value.
return null;
}
// Multiply by the radix to slide all the previously parsed digits.
value *= RADIX;
if (value < (rangeLimit + digit)) {
// The value would be out of range if we "added" the current
// digit.
return null;
}
// "Add" the digit to the value.
value -= digit;
}
// Return the value (adjusting the sign if needed).
return value * signMultiplier;
}
}
You can use java.util.Scanner
Scanner sc = new Scanner(s);
if (sc.hasNextLong()) {
long num = sc.nextLong();
}
This does range checking etc, too. Of course it will say that "99 bottles of beer" hasNextLong(), so if you want to make sure that it only has a long you'd have to do extra checks.
This case is common for forms and programs where you have the input field and are not sure if the string is a valid number. So using try/catch with your java function is the best thing to do if you understand how try/catch works compared to trying to write the function yourself. In order to setup the try catch block in .NET virtual machine, there is zero instructions of overhead, and it is probably the same in Java. If there are instructions used at the try keyword then these will be minimal, and the bulk of the instructions will be used at the catch part and that only happens in the rare case when the number is not valid.
So while it "seems" like you can write a faster function yourself, you would have to optimize it better than the Java compiler in order to beat the try/catch mechanism you already use, and the benefit of a more optimized function is going to be very minimal since number parsing is quite generic.
If you run timing tests with your compiler and the java catch mechanism you already described, you will probably not notice any above marginal slowdown, and by marginal I mean it should be almost nothing.
Get the java language specification to understand the exceptions more and you will see that using such a technique in this case is perfectly acceptable since it wraps a fairly large and complex function. Adding on those few extra instructions in the CPU for the try part is not going to be such a big deal.
I think that's the only way of checking if a String is a valid long value. but you can implement yourself a method to do that, having in mind the biggest long value.
There are much faster ways to parse a long than Long.parseLong. If you want to see an example of a method that is not optimized then you should look at parseLong :)
Do you really need to take into account "digits" that are non-ASCII?
Do you really need to make several methods calls passing around a radix even tough you're probably parsing base 10?
:)
Using a regexp is not the way to go: it's harder to determine if you're number is too big for a long: how do you use a regexp to determine that 9223372036854775807 can be parsed to a long but that 9223372036854775907 cannot?
That said, the answer to a really fast long parsing method is a state machine and that no matter if you want to test if it's parseable or to parse it. Simply, it's not a generic state machine accepting complex regexp but a hardcoded one.
I can both write you a method that parses a long and another one that determines if a long can be parsed that totally outperforms Long.parseLong().
Now what do you want? A state testing method? In that case a state testing method may not be desirable if you want to avoid computing twice the long.
Simply wrap your call in a try/catch.
And if you really want something faster than the default Long.parseLong, write one that is tailored to your problem: base 10 if you're base 10, not checking digits outside ASCII (because you're probably not interested in Japanese's itchi-ni-yon-go etc.).
Hope this helps with the positive values. I used this method once for validating database primary keys.
private static final int MAX_LONG_STR_LEN = Long.toString(Long.MAX_VALUE).length();
public static boolean validId(final CharSequence id)
{
//avoid null
if (id == null)
{
return false;
}
int len = id.length();
//avoid empty or oversize
if (len < 1 || len > MAX_LONG_STR_LEN)
{
return false;
}
long result = 0;
// ASCII '0' at position 48
int digit = id.charAt(0) - 48;
//first char cannot be '0' in my "id" case
if (digit < 1 || digit > 9)
{
return false;
}
else
{
result += digit;
}
//start from 1, we already did the 0.
for (int i = 1; i < len; i++)
{
// ASCII '0' at position 48
digit = id.charAt(i) - 48;
//only numbers
if (digit < 0 || digit > 9)
{
return false;
}
result *= 10;
result += digit;
//if we hit 0x7fffffffffffffff
// we are at 0x8000000000000000 + digit - 1
// so negative
if (result < 0)
{
//overflow
return false;
}
}
return true;
}
Try to use this regular expression:
^(-9223372036854775808|0)$|^((-?)((?!0)\d{1,18}|[1-8]\d{18}|9[0-1]\d{17}|92[0-1]\d{16}|922[0-2]\d{15}|9223[0-2]\d{14}|92233[0-6]\d{13}|922337[0-1]\d{12}|92233720[0-2]\d{10}|922337203[0-5]\d{9}|9223372036[0-7]\d{8}|92233720368[0-4]\d{7}|922337203685[0-3]\d{6}|9223372036854[0-6]\d{5}|92233720368547[0-6]\d{4}|922337203685477[0-4]\d{3}|9223372036854775[0-7]\d{2}|922337203685477580[0-7]))$
It checks all possible numbers for Long.
But as you know in Java Long can contain additional symbols like +, L, _ and etc. And this regexp doesn't validate these values. But if this regexp is not enough for you, you can add additional restrictions for it.
Guava Longs.tryParse("string") returns null instead of throwing an exception if parsing fails. But this method is marked as Beta right now.
You could try using a regular expression to check the form of the string before trying to parse it?
A simple implementation to validate an integer that fits in a long would be:
public static boolean isValidLong(String str) {
if( str==null ) return false;
int len = str.length();
if (str.charAt(0) == '+') {
return str.matches("\\+\\d+") && (len < 20 || len == 20 && str.compareTo("+9223372036854775807") <= 0);
} else if (str.charAt(0) == '-') {
return str.matches("-\\d+") && (len < 20 || len == 20 && str.compareTo("-9223372036854775808") <= 0);
} else {
return str.matches("\\d+") && (len < 19 || len == 19 && str.compareTo("9223372036854775807") <= 0);
}
}
It doesn't handle octal, 0x prefix or so but that is seldom a requirement.
For speed, the ".match" expressions are easy to code in a loop.