NumberFormatException when parsing Hindi numbers - java

I have an Android application, and today I got a crash report which contains this:
This exception is triggered when the application tries to parse a numeric string provided by the user.
It is obvious that the problem is that the application cannot parse Hindi numbers! So, how can I solve this?

Regex
Using a regex would be better if you want to match any Unicode digits. The regex would be \\p{N}+ (or \\p{Nd}+ to restrict the match to decimal digits) and here's how to use it:
Matcher m=Pattern.compile("\\p{N}+").matcher(input);
if(m.find())
{
System.out.println(m.group());
}
Locale
To answer your question, you should use NumberFormat, as mentioned in the docs, and specify a Locale for it.
NumberFormat nf = NumberFormat.getInstance(new Locale("hi", "IN"));
nf.parse(input);

You can use Character.getNumericValue(char).
The good thing about this method is that it returns the numeric value of a digit from any script, which is what you need.
But for this to work properly, your application should implement locale support.
NumberFormat format = NumberFormat.getInstance(new Locale("hi","IN"));
Number parse = format.parse("१");
System.out.println(parse);
Prints 1.
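One caveat with NumberFormat.parse(String) is that it stops at the first character it cannot read, so "12abc" still yields 12. A minimal sketch of a stricter variant, which uses a ParsePosition to check that the whole input was consumed, might look like this. The class and method names are made up for illustration.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class StrictNumberParser {
    // Returns the parsed number, or null when any part of the input is not numeric.
    static Number parseStrict(String input, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number n = nf.parse(input, pos);
        // parse(String, ParsePosition) does not throw; it returns null on failure
        // and leaves pos.getIndex() at the first character it did not consume.
        if (n == null || pos.getIndex() != input.length()) {
            return null;
        }
        return n;
    }

    public static void main(String[] args) {
        Locale hindi = Locale.forLanguageTag("hi-IN");
        // "\u0967\u0968" is the Devanagari string for twelve
        System.out.println(parseStrict("\u0967\u0968", hindi));
        System.out.println(parseStrict("12abc", hindi)); // null, trailing text is rejected
    }
}
```

Returning null keeps the sketch short; in an application you may prefer to show a validation message to the user instead.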

Try this. It extracts the runs of digits and skips the non-numeric characters.
Pattern p = Pattern.compile("(\\d+)");
Matcher m = p.matcher(str); // str is input String
while(m.find()) {
System.out.println(m.group(1));
}
If you are dealing with a double (with decimal places), you can try this:
String text = "123.0114cc";
String numOnly = text.replaceAll("\\p{Alpha}","");
double numVal = Double.valueOf(numOnly);
System.out.println(numVal);

Use
BigDecimal bigDecimal = new BigDecimal(YOUR_VALUE);
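before applying the regex. The BigDecimal(String) constructor accepts plain integers such as 12 and decimals such as 12.35, and it recognizes any character for which Character.isDigit returns true, so localized digits are handled. It does not accept currency or percentage strings such as 12 $ or 12%; strip those symbols first.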

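You can use the following method, which receives a string and converts every Arabic-Indic digit (٠ to ٩) inside it to its ASCII equivalent. Note that these are the digits used with Arabic script, not the Devanagari digits (० to ९) used for Hindi.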
public static String convertAllIndianToArabic(String str)
{
for(int i=0; i<str.length(); i++)
{
if(str.charAt(i)=='٠')
str = str.substring(0, i)+"0"+str.substring(i+1);
else if(str.charAt(i)=='١')
str = str.substring(0, i)+"1"+str.substring(i+1);
else if(str.charAt(i)=='٢')
str = str.substring(0, i)+"2"+str.substring(i+1);
else if(str.charAt(i)=='٣')
str = str.substring(0, i)+"3"+str.substring(i+1);
else if(str.charAt(i)=='٤')
str = str.substring(0, i)+"4"+str.substring(i+1);
else if(str.charAt(i)=='٥')
str = str.substring(0, i)+"5"+str.substring(i+1);
else if(str.charAt(i)=='٦')
str = str.substring(0, i)+"6"+str.substring(i+1);
else if(str.charAt(i)=='٧')
str = str.substring(0, i)+"7"+str.substring(i+1);
else if(str.charAt(i)=='٨')
str = str.substring(0, i)+"8"+str.substring(i+1);
else if(str.charAt(i)=='٩')
str = str.substring(0, i)+"9"+str.substring(i+1);
}
return str;
}
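The same idea can be generalized so that it is not tied to one script. The sketch below relies on Character.digit(codePoint, 10), which returns the value of any Unicode decimal digit and -1 for everything else; the class and method names are hypothetical.

```java
class DigitNormalizer {
    // Replaces every Unicode decimal digit (Devanagari, Arabic-Indic, ...) with
    // its ASCII equivalent and copies all other characters unchanged.
    static String toAsciiDigits(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            int digit = Character.digit(cp, 10); // -1 when cp is not a decimal digit
            if (digit >= 0) {
                out.append((char) ('0' + digit));
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\u0967\u0968\u0969" is the Devanagari string for 123
        System.out.println(toAsciiDigits("\u0967\u0968\u0969")); // 123
    }
}
```

After normalizing, the result can be passed to the parsing code the application already uses.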

Related

Extract multiple dates (dd-MMM-yyyy format) from a string in java

I have searched everywhere for this but couldn't get a specific solution, and the documentation also didn't cover this. So I want to extract the start date and end date from this string "1-Mar-2019 to 31-Mar-2019". The problem is I'm not able to extract both the date strings.
I found the closest solution here but couldn't post a comment asking how to extract values individually due to low reputation: https://stackoverflow.com/a/8116229/10735227
I'm using a regex pattern to look for the occurrences and to extract both occurrences to 2 strings first.
Here's what I tried:
Pattern p = Pattern.compile("(\\d{1,2}-[a-zA-Z]{3}-\\d{4})");
Matcher m = p.matcher(str);
while(m.find())
{
startdt = m.group(1);
enddt = m.group(1); //I think this is wrong, don't know how to fix it
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
Output is:
startdt: 31-Mar-2019 enddt: 31-Mar-2019
Additionally I need to use DateFormatter to convert the string to date (adding a leading 0 before a single-digit day if required).
You can capture both dates simply by calling the find method twice; if the string contains only one date, only the first one is captured:
String str = "1-Mar-2019 to 31-Mar-2019";
String startdt = null, enddt = null;
Pattern p = Pattern.compile("(\\d{1,2}-[a-zA-Z]{3}-\\d{4})");
Matcher m = p.matcher(str);
if(m.find()) {
startdt = m.group(1);
if(m.find()) {
enddt = m.group(1);
}
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
Note that this could be used with a while(m.find()) and a List<String> to extract every date you can find.
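A minimal sketch of that loop, collecting every match into a list, could look like this. The class and method names are made up for illustration; the pattern is the one from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateExtractor {
    private static final Pattern DATE = Pattern.compile("\\d{1,2}-[a-zA-Z]{3}-\\d{4}");

    // Returns every substring that looks like d-MMM-yyyy, in order of appearance.
    static List<String> extractDates(String text) {
        List<String> dates = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            dates.add(m.group());
        }
        return dates;
    }

    public static void main(String[] args) {
        System.out.println(extractDates("1-Mar-2019 to 31-Mar-2019"));
        // [1-Mar-2019, 31-Mar-2019]
    }
}
```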
If your text may be messy, and you really need to use a regex to extract the date range, you may use
String str = "Text here 1-Mar-2019 to 31-Mar-2019 and tex there";
String startdt = "";
String enddt = "";
String date_rx = "\\d{1,2}-[a-zA-Z]{3}-\\d{4}";
Pattern p = Pattern.compile("(" + date_rx + ")\\s*to\\s*(" + date_rx + ")");
Matcher m = p.matcher(str);
if(m.find())
{
startdt = m.group(1);
enddt = m.group(2);
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
// => startdt: 1-Mar-2019 enddt: 31-Mar-2019
Also, consider this enhancement: match the date as whole word to avoid partial matches in longer strings:
Pattern.compile("\\b(" + date_rx + ")\\s*to\\s*(" + date_rx + ")\\b")
If the range can be expressed with - or to you may replace to with (?:to|-), or even (?:to|\\p{Pd}) where \p{Pd} matches any hyphen/dash.
You can simply use String::split
String range = "1-Mar-2019 to 31-Mar-2019";
String dts [] = range.split(" ");
System.out.println(dts[0]);
System.out.println(dts[2]);
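None of the snippets above cover the last part of the question, converting the extracted strings to dates and padding single-digit days. One possible sketch uses java.time (Java 8+; on older Android versions this is assumed to be available through desugaring or a backport). The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class DateRangeParser {
    // "d" accepts one or two day digits; Locale.ENGLISH makes "Mar" parse on any device.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.ENGLISH);
    // "dd" always prints two digits, which adds the leading zero.
    private static final DateTimeFormatter PADDED =
            DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH);

    static LocalDate parse(String date) {
        return LocalDate.parse(date, INPUT);
    }

    static String pad(String date) {
        return parse(date).format(PADDED);
    }

    public static void main(String[] args) {
        System.out.println(parse("31-Mar-2019")); // 2019-03-31
        System.out.println(pad("1-Mar-2019"));    // 01-Mar-2019
    }
}
```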

How can I format int numbers in java?

I am trying to apply the format #´###,### to a given int. I tried the DecimalFormat class, but it only allows one grouping separator symbol, and I need two: the acute accent for millions and the comma for thousands.
So in the end I can format values like 1,000, or millions in this way: 1´000,000
I always prefer to use String.format, but I am not sure if there is a Locale that would format numbers like that either. Here is some code that will do the job though.
// Not sure if you wanted to start with a number or a string. Adjust accordingly
String stringValue = "1000000";
float floatValue = Float.valueOf(stringValue);
// Format the string to a known format
String formattedValue = String.format(Locale.US, "%,.2f", floatValue);
// Split the string on the separator
String[] parts = formattedValue.split(",");
// Put the parts back together with the special separators
String specialFormattedString = "";
int partsRemaining = parts.length;
for(int i=0;i<parts.length;i++)
{
specialFormattedString += parts[i];
partsRemaining--;
if(partsRemaining > 1)
specialFormattedString += "´";
else if(partsRemaining == 1)
specialFormattedString += ",";
}
I found the link that @Roshan provided in the comments useful; this solution uses a regular expression and the replaceFirst method:
public static String audienceFormat(int number) {
String value = String.valueOf(number);
if (value.length() > 6) {
value = value.replaceFirst("(\\d{1,3})(\\d{3})(\\d{3})", "$1\u00B4$2,$3");
} else if (value.length() >= 4) {
// 4 to 6 characters: one comma before the last three digits; shorter values stay unchanged
value = value.replaceFirst("(\\d{1,3})(\\d{3})", "$1,$2");
}
return value;
}
I don't know if this solution has a performance impact. I am also a rookie with regex, so this code could probably be shortened.
Try this; these Locales produce output close to your required format.
List<Locale> locales = Arrays.asList(new Locale("it", "CH"), new Locale("fr", "CH"), new Locale("de", "CH"));
for (Locale locale : locales) {
DecimalFormat df = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
DecimalFormatSymbols dfs = df.getDecimalFormatSymbols();
dfs.setCurrencySymbol("");
df.setDecimalFormatSymbols(dfs);
System.out.println(String.format("%5s %15s %15s", locale, format(df.format(1000)), format(df.format(1_000_000))));
}
util method
private static String format(String str) {
int index = str.lastIndexOf('\'');
if (index > 0) {
return new StringBuilder(str).replace(index, index + 1, ",").toString();
}
return str;
}
output
it_CH 1,000.00 1'000,000.00
fr_CH 1,000.00 1'000,000.00
de_CH 1,000.00 1'000,000.00
Call df.setMaximumFractionDigits(0); to remove the fractions.
output
it_CH 1,000 1'000,000
fr_CH 1,000 1'000,000
de_CH 1,000 1'000,000
Maybe try this, with a "#" for each digit you want between the grouping separators.
String num = "1000500000.574";
String newnew = new DecimalFormat("#,###.##").format(Double.parseDouble(num));
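Another way to get the two separators is to let String.format do the grouping with commas and then swap every comma except the last one. This is only a sketch: the class name is made up, and it assumes that values of a billion or more should also use the acute accent for the higher groups, which the question does not specify.

```java
import java.util.Locale;

class MixedGrouping {
    // Formats n with ',' before the last three digits and '\u00B4' (acute accent)
    // before every higher group, e.g. 1´000,000.
    static String format(long n) {
        String grouped = String.format(Locale.US, "%,d", n); // e.g. 1,000,000
        int last = grouped.lastIndexOf(',');
        if (last < 0) {
            return grouped; // fewer than four digits, nothing to group
        }
        return grouped.substring(0, last).replace(',', '\u00B4') + grouped.substring(last);
    }

    public static void main(String[] args) {
        System.out.println(format(1000));    // 1,000
        System.out.println(format(1000000)); // 1´000,000
    }
}
```

Because the grouping itself comes from the formatter, negative numbers and short values need no special cases.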

Java, Format String with indexed whitespace

I have a String, "08000001066". This String, which is a telephone number, should be displayed in the correct format as "0800 000 1066".
One person suggested I should use this code block:
DecimalFormatSymbols phoneNumberSymbols = new DecimalFormatSymbols();
phoneNumberSymbols.setGroupingSeparator(' ');
DecimalFormat phoneNumberFormat = new DecimalFormat("####,###,###", phoneNumberSymbols);
This results in something close to what I want, but not exactly, because DecimalFormat requires a number (a double or a float), and a string with a leading zero cannot be parsed into one without losing the zero.
How would I format a String with something like DecimalFormat's ####,###,###?
String phoneNumber = "08000001066";
StringBuilder sb = new StringBuilder(phoneNumber)
.insert(4," ")
.insert(8," ");
String output = sb.toString();
Try this out, it relies on your input being the correct number of digits but works with leading zeroes.
String inputString = "08000001066";
java.text.MessageFormat phoneFormat = new java.text.MessageFormat("{0} {1} {2}");
String[] phoneNumberArray = {inputString.substring(0,4), inputString.substring(4,7), inputString.substring(7)};
System.out.println(phoneFormat.format(phoneNumberArray));
Pattern p = Pattern.compile("(\\d{4})(\\d{3})(\\d{4})");
String s = "08000001066";
Matcher m = p.matcher(s);
if(m.matches()) {
return String.format("%s %s %s", m.group(1), m.group(2), m.group(3));
}
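The regex variant can be condensed into a single replaceFirst call with group references. A minimal sketch, assuming the 4-3-4 layout from the question (the class and method names are hypothetical):

```java
class PhoneFormatter {
    // Inserts spaces after the 4th and 7th digit of an 11-digit number.
    // Input that is not exactly 11 digits is returned unchanged.
    static String format(String phoneNumber) {
        if (!phoneNumber.matches("\\d{11}")) {
            return phoneNumber;
        }
        return phoneNumber.replaceFirst("(\\d{4})(\\d{3})(\\d{4})", "$1 $2 $3");
    }

    public static void main(String[] args) {
        System.out.println(format("08000001066")); // 0800 000 1066
    }
}
```

Because the value stays a String throughout, the leading zero is preserved.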

Removing Dollar and comma from string

How can we remove the dollar sign ($) and all commas (,) from the same string? Would it be better to avoid regex?
String liveprice = "$123,456.78";
Do it like this:
NumberFormat format = NumberFormat.getCurrencyInstance(Locale.US); // fixed locale, so "$" and "," parse regardless of the device default
Number number = format.parse("$123,456.78");
System.out.println(number.toString());
output
123456.78
Try,
String liveprice = "$123,456.78";
String newStr = liveprice.replaceAll("[$,]", "");
replaceAll uses a regex; to avoid regex, chain consecutive replace calls instead.
String liveprice = "$1,23,456.78";
String newStr = liveprice.replace("$", "").replace(",", "");
Without regex, you can try this:
String output = "$123,456.78".replace("$", "").replace(",", "");
Here is more information in the Oracle JavaDocs:
liveprice = liveprice.replace("X", "");
Just use Replace instead
String liveprice = "$123,456.78";
String output = liveprice.replace("$", "");
output = output.replace(",", "");
Will this work?
String liveprice = "$123,456.78";
String newStr = liveprice.replace("$", "").replace(",","");
Output: 123456.78
Better One:
String liveprice = "$123,456.78";
String newStr = liveprice.replaceAll("[$,]", "");
In my case, @Prabhakaran's answer did not work; you can try this instead.
String salary = employee.getEmpSalary().replaceAll("[^\\d.]", "");
Float empSalary = Float.parseFloat(salary);
Is a replace really what you need?
public void test() {
String s = "$123,456.78";
StringBuilder t = new StringBuilder();
for ( int i = 0; i < s.length(); i++ ) {
char ch = s.charAt(i);
if ( Character.isDigit(ch)) {
t.append(ch);
}
}
}
This keeps only the digits of any decorated number. Note that the decimal point is dropped too, so "$123,456.78" becomes 12345678.
Example using Swedish Krona currency
String x="19.823.567,10 kr";
x=x.replace(".","");
x=x.replaceAll("\\s+","");
x=x.replace(",", ".");
x=x.replaceAll("[^0-9 , .]", "");
System.out.println(x);
This gives the output 19823567.10, which can now be used in any computation.
import java.text.NumberFormat
def currencyAmount = 9876543.21 //Default is BigDecimal
def currencyFormatter = NumberFormat.getInstance( Locale.US )
assert currencyFormatter.format( currencyAmount ) == "9,876,543.21"
There is no need for getCurrencyInstance() if the currency symbol is not required. (The snippet above is Groovy, not Java.)
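If the cleaned value is going to be used in calculations, it may be worth ending up with a BigDecimal rather than a float or double, since prices rarely tolerate rounding errors. The sketch below shows both routes discussed in this thread, parsing with a US currency format and stripping the symbols by hand. The class and method names are made up for illustration, and the input is assumed to be a US-formatted price.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class PriceParser {
    // Route 1: let a US currency format interpret "$" and ",".
    static BigDecimal parseUsd(String price) {
        DecimalFormat df = (DecimalFormat) NumberFormat.getCurrencyInstance(Locale.US);
        df.setParseBigDecimal(true); // parse() then returns a BigDecimal instead of a Double
        try {
            return (BigDecimal) df.parse(price);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a US currency amount: " + price, e);
        }
    }

    // Route 2: no regex, just remove the two symbols and parse what is left.
    static BigDecimal stripAndParse(String price) {
        return new BigDecimal(price.replace("$", "").replace(",", ""));
    }

    public static void main(String[] args) {
        System.out.println(parseUsd("$123,456.78"));      // 123456.78
        System.out.println(stripAndParse("$123,456.78")); // 123456.78
    }
}
```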
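I think that you could use a regex. For example (this is JavaScript syntax; the Java equivalent is replaceAll("\\D", ""), and either one also removes the decimal separator):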
"19.823.567,10 kr".replace(/\D/g, '')

Java: Find a specific pattern using Pattern and Matcher

This is the string that I have:
KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007
This is a weather report. I need to extract the following numbers from the report: 10/M13. It is temperature and dewpoint, where M means minus. So, the place in the String may differ and the temperature may be presented as M10/M13 or 10/13 or M10/13.
I have done the following code:
public String getTemperature (String metarIn){
Pattern regex = Pattern.compile(".*(\\d+)\\D+(\\d+)");
Matcher matcher = regex.matcher(metarIn);
if (matcher.matches() && matcher.groupCount() == 1) {
temperature = matcher.group(1);
System.out.println(temperature);
}
return temperature;
}
Obviously, the regex is wrong, since the method always returns null. I have tried tens of variations but to no avail. Thanks a lot if someone can help!
This will extract the String you seek, and it's only one line of code:
String tempAndDP = input.replaceAll(".*(?<![M\\d])(M?\\d+/M?\\d+).*", "$1");
Here's some test code:
public static void main(String[] args) throws Exception {
String input = "KLAS 282356Z 32010KT 10SM FEW090 M01/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
String tempAndDP = input.replaceAll(".*(?<![M\\d])(M?\\d+/M?\\d+).*", "$1");
System.out.println(tempAndDP);
}
Output:
M01/M13
The regex should look like:
M?\d+/M?\d+
For Java this will look like:
"M?\\d+/M?\\d+"
You might want to add a check for white space on the front and end:
"\\sM?\\d+/M?\\d+\\s"
But this will depend on where you think you are going to find the pattern, as it will not be matched if it is at the end of the string, so instead we should use:
"(^|\\s)M?\\d+/M?\\d+($|\\s)"
This specifies that if there isn't any whitespace at the end or front we must match the end of the string or the start of the string instead.
Example code used to test:
Pattern p = Pattern.compile("(^|\\s)M?\\d+/M?\\d+($|\\s)");
String test = "gibberish M130/13 here";
Matcher m = p.matcher(test);
if (m.find())
System.out.println(m.group().trim());
This returns: M130/13
Try:
Pattern regex = Pattern.compile(".*\\sM?(\\d+)/M?(\\d+)\\s.*");
Matcher matcher = regex.matcher(metarIn);
if (matcher.matches() && matcher.groupCount() == 2) {
temperature = matcher.group(1);
System.out.println(temperature);
}
Alternative for regex.
Sometimes a regex is not the only solution. It seems that in your case, you must get the 6th block of text. Each block is separated by a space character. So, what you need to do is count the blocks.
Considering that each block of text does NOT HAVE fixed length
Example:
String s = "KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
int spaces = 5;
int begin = 0;
while(spaces-- > 0){
begin = s.indexOf(' ', begin)+1;
}
int end = s.indexOf(' ', begin+1);
String result = s.substring(begin, end);
System.out.println(result);
Considering that each block of text does HAVE fixed length
String s = "KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145 T01001128 10100 20072 51007";
String result = s.substring(33, s.indexOf(' ', 33));
System.out.println(result);
Prettier alternative, as pointed by Adrian:
String result = rawString.split(" ")[5];
Note that split actually takes a regex pattern as its parameter.
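Once the group is located, the M prefix still has to be turned into a minus sign. The sketch below combines the whitespace-delimited pattern suggested above with that conversion and returns both values as integers. The class and method names are hypothetical, and it assumes whole-degree values as in the sample report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetarTemperature {
    // An optional M, digits, a slash, an optional M, digits, delimited by
    // whitespace or by the start/end of the report.
    private static final Pattern TEMP =
            Pattern.compile("(?:^|\\s)(M?)(\\d+)/(M?)(\\d+)(?=\\s|$)");

    // Returns {temperature, dewpoint}, or null when the report has no such group.
    static int[] parse(String metar) {
        Matcher m = TEMP.matcher(metar);
        if (!m.find()) {
            return null;
        }
        int temperature = Integer.parseInt(m.group(2));
        int dewpoint = Integer.parseInt(m.group(4));
        return new int[] {
            m.group(1).isEmpty() ? temperature : -temperature, // M means minus
            m.group(3).isEmpty() ? dewpoint : -dewpoint
        };
    }

    public static void main(String[] args) {
        int[] t = parse("KLAS 282356Z 32010KT 10SM FEW090 10/M13 A2997 RMK AO2 SLP145");
        System.out.println(t[0] + " / " + t[1]); // 10 / -13
    }
}
```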
