How can I parse the content of file to variables? - java

I have a text file which looks like this:
ABC=-1 Temp=2 Try=34 Message="some text" SYS=3
ABC=-1 Temp=5 Try=40 Message="some more and different text" SYS=6
and the pattern continues, but only the numeric values and the text inside the " " change.
NOTE: the Message= value could contain multiple quotes as well.
I want to store the values of ABC, Temp, Try and SYS in int variables
and Message in a String variable.
I am currently using:
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
int count = line.indexOf("ABC=");
if (count >= 0) {
int clear = line.charAt(count + 3);
}
}
scanner.close();
I thought of using the Scanner class to read the file line by line, but I am confused about how to split each line into the different variables.

First make a class that represents the data:
public static class MyData { // please pick a better name
final int abc;
final int temp;
final int tryNumber; // try is a keyword
final String message;
final int sys;
public MyData(int abc, int temp, int tryNumber, String message, int sys) {
this.abc = abc;
this.temp = temp;
this.tryNumber = tryNumber;
this.message = message;
this.sys = sys;
}
}
Then make a method that transforms a String into this class using Regex capture groups:
private static Pattern p =
Pattern.compile("ABC=([^ ]+) Temp=([^ ]+) Try=([^ ]+) Message=\"(.+)\" SYS=([^ ]+)");
private static MyData makeData(String input) {
int abc = 0, temp = 0, tryNumber = 0, sys = 0;
String message = "";
Matcher m = p.matcher(input);
if (!m.find()) return null; // line does not match the expected format
abc = Integer.parseInt(m.group(1));
temp = Integer.parseInt(m.group(2));
tryNumber = Integer.parseInt(m.group(3));
message = m.group(4);
sys = Integer.parseInt(m.group(5));
return new MyData(abc, temp, tryNumber, message, sys);
}
Then read the file using a scanner:
public static void main (String... args) throws Exception {
File file = new File("/path/to/your/file.txt");
List<MyData> dataList = new ArrayList<>();
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
MyData data = makeData(line);
if(data != null) dataList.add(data);
}
scanner.close();
}
Here's a completely working demo on ideone

You can use regex for this kind of parsing with a pattern of:
"ABC=([+-]?\\d+) Temp=([+-]?\\d+) Try=([+-]?\\d+) Message=\"(.+)\" SYS=([+-]?\\d+)"
Pattern Breakdown (Pattern Reference):
ABC= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 1
Temp= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 2
Try= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 3
Message= - literal string
\"(.+)\" captures a string in between double quotes in capture group 4
SYS= - literal string
([+-]?\\d+) - captures a positive or negative number in capture group 5
If the String matches the pattern you can extract your values like this:
public static void main(String[] args) throws Exception {
List<String> data = new ArrayList<String>() {{
add("ABC=-1 Temp=2 Try=34 Message=\"some text\" SYS=3");
add("ABC=-1 Temp=5 Try=40 Message=\"some more \"and\" different text\" SYS=6");
}};
String pattern = "ABC=([+-]?\\d+) Temp=([+-]?\\d+) Try=([+-]?\\d+) Message=\"(.+)\" SYS=([+-]?\\d+)";
int abc = 0;
int temp = 0;
int tryNum = 0;
String message = "";
int sys = 0;
for (String d : data) {
Matcher matcher = Pattern.compile(pattern).matcher(d);
if (matcher.matches()) {
abc = Integer.parseInt(matcher.group(1));
temp = Integer.parseInt(matcher.group(2));
tryNum = Integer.parseInt(matcher.group(3));
message = matcher.group(4);
sys = Integer.parseInt(matcher.group(5));
System.out.printf("%d %d %d %s %d%n", abc, temp, tryNum, message, sys);
}
}
}
Results:
-1 2 34 some text 3
-1 5 40 some more "and" different text 6
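As a side note on why (.+) copes with quotes inside the message: the quantifier is greedy, so it runs up to the last " that is still followed by " SYS=". Below is a minimal sketch comparing it with a reluctant (.+?); the class name GreedyMessageDemo and the helper extract are made up for illustration and are not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyMessageDemo {
    // Hypothetical helper: returns capture group 1 of the first match, or null if there is none.
    static String extract(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "ABC=-1 Temp=5 Try=40 Message=\"some more \"and\" different text\" SYS=6";
        // Greedy: backs off from the end of the line until the closing quote is followed by " SYS=".
        System.out.println(extract("Message=\"(.+)\" SYS=", line));
        // Reluctant and not anchored on SYS: stops at the first inner quote.
        System.out.println(extract("Message=\"(.+?)\"", line));
    }
}
```

The first call prints the whole message including the inner quotes; the second one stops right before "and", which is why the reluctant form is not a good fit here.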

If you are already using the indexOf approach, the following code will work
String a = "ABC=-1 Temp=2 Try=34 Message=\"some text\" SYS=3";
int abc_index = a.indexOf("ABC");
int temp_index = a.indexOf("Temp");
int try_index = a.indexOf("Try");
int message_index = a.indexOf("Message");
int sys_index = a.indexOf("SYS");
int length = a.length();
int abc = Integer.parseInt(a.substring(abc_index + 4, temp_index - 1));
int temp = Integer.parseInt(a.substring(temp_index + 5, try_index - 1));
int try_ = Integer.parseInt(a.substring(try_index + 4, message_index - 1));
String message = a.substring(message_index + 9, sys_index - 2);
int sys = Integer.parseInt(a.substring(sys_index + 4, length));
System.out.println("abc : " + abc);
System.out.println("temp : " + temp);
System.out.println("try : " + try_);
System.out.println("message : " + message);
System.out.println("sys : " + sys);
This will give you the following
abc : -1
temp : 2
try : 34
message : some text
sys : 3
This will work only if the string data you get has this exact syntax, ie, it contains ABC, Temp, Try, Message, and SYS. Hope this helps.
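If the fields might come in a different order, one possible variation is to look each key up on its own instead of relying on neighbouring indexes. This is only a sketch: FieldLookup, intField and message are hypothetical names, and it assumes that no key=number text appears inside the message before the real field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldLookup {
    // Finds KEY=<signed integer> anywhere in the line, independent of field order.
    static int intField(String line, String key) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(key) + "=([+-]?\\d+)").matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Missing field: " + key);
        }
        return Integer.parseInt(m.group(1));
    }

    // Takes everything between Message=" and the last double quote in the line.
    static String message(String line) {
        Matcher m = Pattern.compile("\\bMessage=\"(.*)\"").matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Missing field: Message");
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        String line = "SYS=3 Message=\"some text\" Temp=2 ABC=-1";
        System.out.println(intField(line, "ABC"));
        System.out.println(intField(line, "SYS"));
        System.out.println(message(line));
    }
}
```

A missing key throws an IllegalArgumentException here; returning a default value instead would be an equally reasonable choice.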

Related

Replace part of substring with specific characters based on delimiter

String s = "abc//jason:1234567#123.123.213.212/";
I want to replace all the substring before and after ":" delimiter with "......."
I want my final output to be :
"abc//.....:.......#123.123.213:212/"
I tried the following, but since there is a second ":" in the string it gets messed up. Is there a better way to get my output?
String [] headersplit;
headersplit = s.split(":");
If you only want to replace the symbols between "//" and "#", the algorithm is simple, provided that those delimiters are always present.
public class Main {
public static void main(String[] args) {
String s = "abc//jason:1234567#123.123.213.212/";
System.out.println(replaceSensitiveInfo(s));
}
static String replaceSensitiveInfo(String src) {
int slashes = src.indexOf("//");
int colon = src.indexOf(":", slashes);
int at = src.indexOf("#", colon);
StringBuilder sb = new StringBuilder(src);
sb.replace(slashes + 2, colon, ".".repeat(colon - slashes - 2));
sb.replace(colon + 1, at, ".".repeat(at - colon - 1));
return sb.toString();
}
}
Not the best way but it works for your example and should work for others:
String s = "abc//jason:1234567#123.123.213:212/";
String result = replaceSensitiveInfo(s);
private String replaceSensitiveInfo(String info){
StringBuilder sb = new StringBuilder(info);
String substitute = ".";
int start = sb.indexOf("//") + 2;
int end = sb.indexOf(":");
String firstReplace = substitute.repeat(end - start);
sb.replace(start, end, firstReplace);
int start2 = sb.indexOf(":") + 1;
int end2 = sb.indexOf("#");
String secondReplace = substitute.repeat(end2 - start2);
sb.replace(start2, end2, secondReplace);
return sb.toString();
}
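Both snippets above assume that "//", ":" and "#" are all present, and String.repeat needs Java 11 or newer. Here is a rough sketch of a variant that checks for missing delimiters and also runs on older Java versions. The name MaskCredentials and the decision to return the input unchanged when a delimiter is missing are my assumptions, not something from the question.

```java
class MaskCredentials {
    // Replaces every character between "//" and the first following '#' with '.',
    // except the first ':' that separates the two parts.
    static String mask(String src) {
        int slashes = src.indexOf("//");
        int colon = slashes < 0 ? -1 : src.indexOf(':', slashes + 2);
        int hash = colon < 0 ? -1 : src.indexOf('#', colon + 1);
        if (hash < 0) {
            return src; // assumed policy: leave input untouched if any delimiter is missing
        }
        char[] chars = src.toCharArray();
        for (int i = slashes + 2; i < hash; i++) {
            if (i != colon) {
                chars[i] = '.';
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(mask("abc//jason:1234567#123.123.213:212/"));
        // prints abc//.....:.......#123.123.213:212/
    }
}
```

Because the search for ":" and "#" starts after "//", the second ":" later in the string is never touched.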

How to use a regular expression to print repeating characters only once and non repeating characters in the same order as they appear in a String?

I'm writing a function to print the decimal representation of a rational number (given as a numerator and denominator). The repeating digits should be printed inside parentheses and the rest of the decimal part should stay as it is.
For example: 1) 2/3=0.(6)
2) 2/4=0.5(0)
3) 22/7=3.(142857)
For this I tried using a regular expression to capture the repeating digits of the decimal part, but my regular expression keeps one copy of the repeating digits and also the non-repeating digits after them.
Here is my code. Can someone help me with this?
div = ((double) num)/deno;
String str = String.valueOf(div);
String arr[] = str.split("\\.");
String wp = arr[0];
String dp = arr[1];
String repeated = dp.replaceAll("(.+?)\\1+", "$1");
System.out.println("repeated is " + repeated);
System.out.println(wp + "." + "(" + repeated + ")");
output I'm getting is:-
Input given 22/7
Integer part: 3
Decimal part: 142857142857143
repeating characters captured by regular expression- 142857143
final output-3.(142857143)
When you replace the repeating part, the trailing 143 is not replaced with an empty string, so it remains in the output.
You can use the Pattern class with the regex (\d+)+\1, like this:
public class Test
{
public static void main(String[] args) throws Exception
{
double[] nums = {2.0/3, 2.0/4, 22.0/7};
for(double d : nums)
print(d);
}
static void print(double div) {
String str = String.valueOf(div);
String arr[] = str.split("\\.");
String wp = arr[0];
String dp = arr[1];
String repeated = dp;
Pattern ptrn = Pattern.compile("(\\d+)+\\1");
Matcher m = ptrn.matcher(dp);
if(m.find()) {
repeated = m.group(1);
System.out.println(str + " -> "+ wp + "." + "(" + repeated + ")");
} else {
System.out.println(str + " -> "+ wp + "." + dp +"(0)");
}
}
}
Output:
0.6666666666666666 -> 0.(6)
0.5 -> 0.5(0)
3.142857142857143 -> 3.(142857)
Your regex is pretty close to working.
Alternative:
Matcher matcher = Pattern.compile("(.+?)\\1").matcher(decimalPart);
String repeated = matcher.find() ? matcher.group(1) : "0";
See alternative in context:
public static void main(String[] args) {
List<String> divisions = Arrays.asList("2/3", "2/4", "22/7");
List<String> quotientsAsString = getQuotientsAsString(divisions);
List<String> repeatedResult = getRepeatedResult(quotientsAsString);
printResult(divisions, quotientsAsString, repeatedResult);
}
private static void printResult(List<String> divisions, List<String> quotientsAsString,
List<String> repeatedResult) {
for (int i = 0; i < divisions.size(); i++) {
System.out.printf("%d) %s = %s => %s%n", (i + 1), divisions.get(i)
, quotientsAsString.get(i), repeatedResult.get(i));
}
}
private static List<String> getRepeatedResult(List<String> quotientsAsString) {
//Pre-compile regex before enter loop
Pattern dotSignPattern = Pattern.compile("\\.");
Pattern repeatedDecimalPattern = Pattern.compile("(.+?)\\1");
List<String> repeatedResult = new ArrayList<>();
for (String quotient : quotientsAsString) {
String[] quotientParts = dotSignPattern.split(quotient);
String integerPart = quotientParts[0];
String decimalPart = quotientParts[1];
// Pattern in context!!!
Matcher matcher = repeatedDecimalPattern.matcher(decimalPart);
String repeated = matcher.find() ? matcher.group(1) : "0";
String resultRepeated = String.format("%s.(%s)", integerPart, repeated);
String resultZeroRepeated = String.format("%s.%s(%s)", integerPart, decimalPart, repeated);
String result = repeated.equals("0") ? resultZeroRepeated : resultRepeated;
repeatedResult.add(result);
}
return repeatedResult;
}
private static List<String> getQuotientsAsString(List<String> divisions) {
//Pre-compile regex before enter loop
Pattern divSignPattern = Pattern.compile("/");
List<String> quotientsAsString = new ArrayList<>();
for (String div : divisions) {
String[] divParts = divSignPattern.split(div);
Double dividend = Double.valueOf(divParts[0]);
Double divisor = Double.valueOf(divParts[1]);
Double quotient = dividend / divisor;
quotientsAsString.add(String.valueOf(quotient));
}
return quotientsAsString;
}
Output:
1) 2/3 = 0.6666666666666666 => 0.(6)
2) 2/4 = 0.5 => 0.5(0)
3) 22/7 = 3.142857142857143 => 3.(142857)
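The regex answers above work on the text of a double, so they depend on how many digits the double happens to print and on rounding in the last digits. A different technique, not the one used above, is plain long division on integers: remember at which output position each remainder first appeared, and when a remainder comes back the digits since that position repeat. A minimal sketch, assuming a non-negative numerator and a positive denominator (RepeatingDecimal and toDecimal are names picked for illustration):

```java
import java.util.HashMap;
import java.util.Map;

class RepeatingDecimal {
    // Assumes num >= 0 and den > 0; a terminating decimal gets "(0)" appended,
    // following the format used in the question (2/4 -> 0.5(0)).
    static String toDecimal(long num, long den) {
        StringBuilder sb = new StringBuilder();
        sb.append(num / den).append('.');
        long rem = num % den;
        // remainder -> index in sb where the digit produced from it starts
        Map<Long, Integer> firstSeenAt = new HashMap<>();
        while (rem != 0 && !firstSeenAt.containsKey(rem)) {
            firstSeenAt.put(rem, sb.length());
            rem *= 10;
            sb.append(rem / den);
            rem %= den;
        }
        if (rem == 0) {
            return sb.append("(0)").toString();
        }
        int cycleStart = firstSeenAt.get(rem);
        sb.insert(cycleStart, "(");
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(2, 3));  // 0.(6)
        System.out.println(toDecimal(2, 4));  // 0.5(0)
        System.out.println(toDecimal(22, 7)); // 3.(142857)
    }
}
```

Since there are at most den - 1 distinct non-zero remainders, the loop always terminates, and it also handles a non-repeating prefix such as 1/6 = 0.1(6).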

Finding Pattern from one string to apply to another string - Java

So I have a string like this: <em>1234</em>56.70
it's basically a number where the em tags help identify what to highlight in the string
I need to first convert the string to an actual number with the current locale format. So I remove the em tags (replaceAll by emptyString) and then use the numberFormat java API to get a string like: $123,456.70
The problem with this is, I lost the highlight (em) tags. So I need to put it back in the string that is formatted, something like this: <em>$123,4</em>56.70
highlightValue = "<em>1234</em>56.70";
highlightValue = highlightValue.replaceAll("<em>", "").replaceAll("</em>", ""); // highlightValue is now 123456.70
highlightValue = numberFormat.convertToFormat(highlightValue, currencyCode); // highlightValue is now $123,456.70
highlightValue = someFunction(highlightValue); // this function needs to return <em>$123,4</em>56.70
I am not sure what approach to use. I was trying pattern matching but didn't know how to achieve it.
All help appreciated !
I am assuming that you want to highlight the number from the start up to some number of digits. This can be done.
In the initial string, count the number of digits that come before the closing tag. The opening tag will always be placed at the beginning; it is the closing tag you have to worry about. In the formatted string, count digits again, skipping any other symbols. When the required number of digits has been passed, insert the closing tag. You can either create a StringBuilder from the formatted string and insert the tag directly, or split the string into two substrings and join them with the tag in the middle.
Hope this helped.
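The steps described above could look roughly like this. It is a sketch under the same assumption (the highlight always starts at the beginning of the number); HighlightPrefix and highlight are hypothetical names, and NumberFormat.getCurrencyInstance(Locale.US) stands in for the asker's own numberFormat.convertToFormat, which is not shown in the question.

```java
import java.text.NumberFormat;
import java.util.Locale;

class HighlightPrefix {
    // Wraps the leading part of `formatted` that contains `digitCount` digits in <em> tags.
    // Non-digit symbols (currency sign, grouping separators) are skipped while counting.
    static String highlight(String formatted, int digitCount) {
        int seen = 0;
        int end = 0;
        while (end < formatted.length() && seen < digitCount) {
            if (Character.isDigit(formatted.charAt(end))) {
                seen++;
            }
            end++;
        }
        return "<em>" + formatted.substring(0, end) + "</em>" + formatted.substring(end);
    }

    public static void main(String[] args) {
        // Stand-in for the asker's formatter; 4 is the number of digits inside <em>1234</em>.
        String formatted = NumberFormat.getCurrencyInstance(Locale.US).format(123456.70);
        System.out.println(highlight(formatted, 4));
    }
}
```

For the formatted string $123,456.70 and four highlighted digits, the closing tag lands right after "$123,4".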
I took an approach where I count the digits in front of the tag and inside the tag, since I think formatting will not actually change the digits (assuming you don't add leading zeroes). After formatting, I insert the opening tag after as many digits as were in front of it, and the closing tag after as many digits as were in front of and inside it.
So this is the code:
public static void main(String[] args) {
String input1 = "<em>1234</em>56.70";
String result1 = formatString(input1, "em");
System.out.printf("input1 = %s%n", input1);
System.out.printf("result1 = %s%n", result1);
String input2 = "<em>8127</em>29.12";
String result2 = formatString(input2, "em");
System.out.printf("input2 = %s%n", input2);
System.out.printf("result2 = %s%n", result2);
}
private static String formatString(String input, String tagName) {
String tagOpening = String.format("<%s>", tagName);
int tagOpeningLength = tagOpening.length();
String tagClosing = String.format("</%s>", tagName);
int tagClosingLength = tagClosing.length();
int inputLength = input.length();
int tagOpeningPos = input.indexOf(tagOpening);
int tagClosingPos = input.indexOf(tagClosing, tagOpeningPos);
String beforeTag;
if(tagOpeningPos > 0)
beforeTag = input.substring(0, tagOpeningPos);
else
beforeTag = "";
int digitsInBeforeTag = countNumbers(beforeTag);
String tagValue;
if(tagOpeningPos + tagOpeningLength < tagClosingPos)
tagValue = input.substring(tagOpeningPos + tagOpeningLength, tagClosingPos);
else
tagValue = "";
int digitsInTagValue = countNumbers(tagValue);
String afterTag;
if((tagClosingPos + tagClosingLength) < inputLength)
afterTag = input.substring(tagClosingPos + tagClosingLength);
else
afterTag = "";
String valueToBeFormatted = beforeTag + tagValue + afterTag;
double value = Double.parseDouble(valueToBeFormatted);
NumberFormat nf = NumberFormat.getInstance(Locale.ENGLISH);
String formattedValue = nf.format(value);
int newEmOpeningPos = findSubstringWithThisManyNumbers(formattedValue, digitsInBeforeTag);
int newEmClosingPos = findSubstringWithThisManyNumbers(formattedValue, digitsInBeforeTag+digitsInTagValue);
StringBuilder result = new StringBuilder();
result.append(formattedValue.substring(0, newEmOpeningPos));
result.append(tagOpening);
result.append(formattedValue.substring(newEmOpeningPos, newEmClosingPos));
result.append(tagClosing);
result.append(formattedValue.substring(newEmClosingPos));
return result.toString();
}
private static int findSubstringWithThisManyNumbers(String input, int digitCount) {
int pos = 0;
int counter = 0;
for(char c : input.toCharArray()) {
if(counter >= digitCount)
break;
if(Character.isDigit(c))
counter++;
pos++;
}
return pos;
}
private static int countNumbers(String str) {
int result = 0;
for(char c : str.toCharArray())
if(Character.isDigit(c))
result++;
return result;
}
the output was
input1 = <em>1234</em>56.70
result1 = <em>123,4</em>56.7
input2 = <em>8127</em>29.12
result2 = <em>812,7</em>29.12
I don't know how practical this is, but anyway:
String highlightValue = "0<em>1234</em>56.70";
int startIndex = highlightValue.indexOf("<em>");
String startString = highlightValue.substring(0, startIndex);
String endString = highlightValue.substring(highlightValue.indexOf("</em>") + "</em>".length());
highlightValue = highlightValue.replaceAll("<em>", "").replaceAll("</em>", "");
highlightValue = numberFormat.convertToFormat(highlightValue, currencyCode);
// highlightValue is now $123,456.70
int endIndex = highlightValue.indexOf(endString);
highlightValue = startString + "<em>" + highlightValue.substring(0, endIndex) + "</em>" + endString;
System.out.println(highlightValue);
// 0<em>$123,4</em>56.70

Parsing a string that contains multiple symbols results in crash

I'm trying to split a string in Java that contains ",", ":" and "-".
For instance, if the input is 58,1:2-4, it should produce the following output:
Booknumber: 58
Chapter Number: 1
Verses = [2,3,4] (since 2-4 covers the values from 2 to 4)
Following is the code that I have tried,
private int getBookNumber() {
bookNumber = chapterNumber.split("[,]")[0];
return Integer.valueOf(bookNumber);
}
private int getChapterNumber() {
chapterNumber = sample.split("[:]")[0];
verseNumbers = sample.split("[:]")[1];
return Integer.valueOf(chapterNumber);
}
private List<Integer> getVerseNumbers(String bookValue) {
List<Integer> verseNumList = new ArrayList<>();
if (bookValue.contains("-")) {
//TODO parse - separated string
} else {
verseNumList.add(Integer.valueOf(bookValue));
}
return verseNumList;
}
I would invoke them in the following manner sequentially
int chapterNumber = getChapterNumber();
int bookNumber = getBookNumber();
List<Integer> verseNumbers = getVerseNumbers(this.verseNumbers);
But I'm getting Caused by: java.lang.NumberFormatException: Invalid int: "58 , 1 " in the line int chapterNumber = getChapterNumber();
Is there an efficient way to parse this string?
You should change getChapterNumber like this:
private int getChapterNumber() {
chapterNumber = sample.split("[:]")[0];
verseNumbers = sample.split("[:]")[1];
return Integer.valueOf(chapterNumber.split("[,]")[1]);
}
But the best would be to use matcher:
String line = "58,1:2-4";
Pattern pattern = Pattern.compile("(\\d+),(\\d+):(.*)");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
System.out.println("group 1: " + matcher.group(1));
System.out.println("group 2: " + matcher.group(2));
System.out.println("group 3: " + matcher.group(3));
}
Output:
group 1: 58
group 2: 1
group 3: 2-4
I might approach this using base string methods to avoid the heavy equipment which comes with a regex matcher:
String input = "58,1:2-4";
int commaIndex = input.indexOf(",");
int colonIndex = input.indexOf(":");
int bookNumber = Integer.valueOf(input.substring(0, commaIndex));
int chapterNumber = Integer.valueOf(input.substring(commaIndex+1, colonIndex));
String verseString = input.substring(colonIndex+1);
String[] verses = verseString.split("-");
int startVerse = Integer.valueOf(verses[0]);
int endVerse = Integer.valueOf(verses[1]);
int[] allVerses = new int[endVerse - startVerse + 1];
for (int i=0; i < allVerses.length; ++i) {
allVerses[i] = startVerse + i;
}
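Both answers above assume the verse part is always a range and that there are no spaces, while the exception in the question ("58 , 1 ") suggests the real input may contain whitespace. Here is a sketch that tolerates spaces and accepts either a single verse or a range. VerseReference and its members are names I made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VerseReference {
    // book , chapter : firstVerse [ - lastVerse ], with optional spaces around each part
    private static final Pattern REF = Pattern.compile(
            "\\s*(\\d+)\\s*,\\s*(\\d+)\\s*:\\s*(\\d+)\\s*(?:-\\s*(\\d+)\\s*)?");

    public final int book;
    public final int chapter;
    public final List<Integer> verses;

    private VerseReference(int book, int chapter, List<Integer> verses) {
        this.book = book;
        this.chapter = chapter;
        this.verses = verses;
    }

    static VerseReference parse(String input) {
        Matcher m = REF.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized reference: " + input);
        }
        int first = Integer.parseInt(m.group(3));
        // group(4) is null when there is no "-lastVerse" part
        int last = m.group(4) == null ? first : Integer.parseInt(m.group(4));
        List<Integer> verses = new ArrayList<>();
        for (int v = first; v <= last; v++) {
            verses.add(v);
        }
        return new VerseReference(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), verses);
    }

    public static void main(String[] args) {
        VerseReference ref = parse("58,1:2-4");
        System.out.println(ref.book + " " + ref.chapter + " " + ref.verses); // 58 1 [2, 3, 4]
    }
}
```

Parsing everything in one place also avoids the ordering problem in the question, where getBookNumber only works after getChapterNumber has run.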

Extract numbers from Strings, add them and convert back to string

I have a couple of similar strings. I want to extract the numbers from them, add the numbers, and convert the result back to the same string format.
And the logic should be generic, i.e., it should work for any given strings.
Example:
String s1 = "1/9"; String s2 = "12/4"; The total of the above two Strings should be "13/13" (String again)
I know how to extract numbers from any given String. I referred: How to extract numbers from a string and get an array of ints?
But I don't know how to put them up back again to the same String format.
Can anyone please help me with this?
Note: the string format can be anything, I have just taken an example for explanation.
Take a look at this:
public class StringTest {
public static void main(String[] args) {
String divider = "/";
String s1 = "1/9";
String s2 = "12/4";
String[] fragments1 = s1.split(divider);
String[] fragments2 = s2.split(divider);
int first = Integer.parseInt(fragments1[0]);
first += Integer.parseInt(fragments2[0]);
int second = Integer.parseInt(fragments1[1]);
second += Integer.parseInt(fragments2[1]);
String output = first + divider + second;
System.out.println(output);
}
}
The code prints:
13/13
Using a regex (and Markus' code)
public class StringTest {
public static void main(String[] args) {
String divider = "/"; // separator used when rebuilding the output
String s1 = "1/9";
String s2 = "12&4";
String[] fragments1 = s1.split("[^\\d]");
String[] fragments2 = s2.split("[^\\d]");
int first = Integer.parseInt(fragments1[0]);
first += Integer.parseInt(fragments2[0]);
int second = Integer.parseInt(fragments1[1]);
second += Integer.parseInt(fragments2[1]);
String output = first + divider + second;
System.out.println(output);
}
}
You should be able to get from here to joining back from an array. If you're getting super fancy, you'll need to use regular expression capture groups and store the captured delimiters somewhere.
First, split your strings into matches and non-matches:
public static class Token {
public final String text;
public final boolean isMatch;
public Token(String text, boolean isMatch) {
this.text = text;
this.isMatch = isMatch;
}
@Override
public String toString() {
return text + ":" + isMatch;
}
}
public static List<Token> tokenize(String src, Pattern pattern) {
List<Token> tokens = new ArrayList<>();
Matcher matcher = pattern.matcher(src);
int last = 0;
while (matcher.find()) {
if (matcher.start() != last) {
tokens.add(new Token(src.substring(last, matcher.start()), false));
}
tokens.add(new Token(src.substring(matcher.start(), matcher.end()), true));
last = matcher.end();
}
if (last < src.length()) {
tokens.add(new Token(src.substring(last), false));
}
return tokens;
}
Once this is done, you can create lists you can iterate over and process.
For example, this code:
Pattern digits = Pattern.compile("\\d+");
System.out.println(tokenize("1/2", digits));
...outputs:
[1:true, /:false, 2:true]
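If you don't need the token list itself, a similar effect can be had in one pass with Matcher.appendReplacement, which copies the non-matching text through and lets you substitute each match. The following is a sketch of that alternative, assuming both strings contain the same number of digit runs and that the format of the first string is the one to keep (SumByTemplate and sum are illustrative names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SumByTemplate {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Adds the i-th number of b to the i-th number of a and keeps a's non-digit text as is.
    static String sum(String a, String b) {
        Matcher ma = DIGITS.matcher(a);
        Matcher mb = DIGITS.matcher(b);
        StringBuffer out = new StringBuffer();
        while (ma.find()) {
            if (!mb.find()) {
                throw new IllegalArgumentException("Second string has fewer numbers than the first");
            }
            int total = Integer.parseInt(ma.group()) + Integer.parseInt(mb.group());
            // The replacement contains only digits, so no escaping of '$' or '\' is needed.
            ma.appendReplacement(out, Integer.toString(total));
        }
        ma.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sum("1/9", "12/4"));                           // 13/13
        System.out.println(sum("abc 11:4 xyz 10:9", "abc 9:2 xyz 100:11")); // abc 20:6 xyz 110:20
    }
}
```

Note that this pattern ignores signs, so a leading "-" is treated as separator text rather than as part of the number.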
A quick and dirty approach that does not rely on knowing which separator is used. You have to make sure that m1.group(2) and m2.group(2) (which represent the separator) are equal.
public static void main(String[] args) {
String s1 = "1/9";
String s2 = "12/4";
// (\\D+) keeps the separator group from swallowing digits of the second number
Matcher m1 = Pattern.compile("(\\d+)(\\D+)(\\d+)").matcher(s1);
Matcher m2 = Pattern.compile("(\\d+)(\\D+)(\\d+)").matcher(s2);
m1.matches(); m2.matches();
int sum1 = parseInt(m1.group(1)) + parseInt(m2.group(1));
int sum2 = parseInt(m1.group(3)) + parseInt(m2.group(3));
System.out.printf("%s%s%s\n", sum1, m1.group(2), sum2);
}
Consider function:
public String format(int first, int second, String separator){
return first + separator + second;
}
then:
System.out.println(format(6, 13, "/")); // prints "6/13"
Thanks @remus. Reading your logic I was able to build the following code. This code solves the problem for any given strings that have the same format.
public class Test {
public static void main(String[] args) {
ArrayList<Integer> numberList1 = new ArrayList<Integer>();
ArrayList<Integer> numberList2 = new ArrayList<Integer>();
ArrayList<Integer> outputList = new ArrayList<Integer>();
String str1 = "abc 11:4 xyz 10:9";
String str2 = "abc 9:2 xyz 100:11";
String output = "";
// Extracting numbers from the two similar string
Pattern p1 = Pattern.compile("-?\\d+");
Matcher m = p1.matcher(str1);
while (m.find()) {
numberList1.add(Integer.valueOf(m.group()));
}
m = p1.matcher(str2);
while (m.find()) {
numberList2.add(Integer.valueOf(m.group()));
}
// Numbers extracted. Printing them
System.out.println("List1: " + numberList1);
System.out.println("List2: " + numberList2);
// Adding the respective indexed numbers from both the lists
for (int i = 0; i < numberList1.size(); i++) {
outputList.add(numberList1.get(i) + numberList2.get(i));
}
// Printing the summed list
System.out.println("Output List: " + outputList);
// Splitting string to segregate numbers from text and getting the format
String[] template = str1.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
// building the string back using the summed list and format
int counter = 0;
for (String tmp : template) {
if (Test.isInteger(tmp)) {
output += outputList.get(counter);
counter++;
} else {
output += tmp;
}
}
// Printing the output
System.out.println(output);
}
public static boolean isInteger(String s) {
try {
Integer.parseInt(s);
} catch (NumberFormatException e) {
return false;
}
return true;
}
}
output:
List1: [11, 4, 10, 9]
List2: [9, 2, 100, 11]
Output List: [20, 6, 110, 20]
abc 20:6 xyz 110:20
