Converting string data from an array of data arranged in columns - java

I'm trying to convert a string array that is itself a part of another array fed into Java from an external file.
There are two parts to this question:
How do I convert the string's substring elements to doubles or ints?
How do I skip the header which is itself a part of the string?
I have the following piece of code that does NOT give me an error, but it does not give me any output either. The data is arranged in columns, so I'm not sure what delimiter to pass to split. I've tried \r, \n, ",", and " ", and nothing works.
str0 = year.split(",");
year = year.trim();
int[] yearData = new int[str0.length-1];
for(i = 0; i < str0.length-1; i++) {
yearData[i] = Integer.parseInt(str0[i]);
System.out.println(yearData[i]);
}

The code you provided does not work as written. Consider the following example instead, which uses a regular expression to find every number in a string. By adjusting the regular expression you can extract the substrings you need and skip the header. I hope it helps.
// requires java.util.regex.Pattern and java.util.regex.Matcher
// optional sign, then either digits with an optional fraction or a bare fraction such as .243
String regEx = "[+-]?(\\d+(\\.\\d*)?|\\.\\d+)";
String str = "256 is the square of 16 and -2.5 squared is 6.25 " + "and -.243 is less than 0.1234.";
Pattern pattern = Pattern.compile(regEx);
Matcher m = pattern.matcher(str);
while (m.find()) {
System.out.println(m.group()); // prints each number found, one per line
}

Try something like this:
year = year.trim(); // This should come before the split()...
str0 = year.split("[\\s,;]+"); // split() uses RegEx...
int[] yearData = new int[str0.length-1];
for(i = 0; i < str0.length-1; i++) {
yearData[i] = Integer.parseInt(str0[i]);
System.out.println(yearData[i]);
}
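To address the header part of the question, one option is to split the text into lines first and drop the first line before parsing. This is a minimal sketch, assuming the header occupies exactly the first line and every remaining token is an integer. The class name ColumnParser, the method parseColumn, and the sample data are made up for illustration and do not come from the question.

```java
import java.util.Arrays;

class ColumnParser {
    // Hypothetical helper: skips the first line (assumed to be the header)
    // and parses every remaining token as an int.
    static int[] parseColumn(String raw) {
        String[] lines = raw.trim().split("\\R"); // \R matches any line break (Java 8+)
        return Arrays.stream(lines)
                .skip(1) // drop the header line
                .flatMap(line -> Arrays.stream(line.trim().split("[\\s,;]+")))
                .filter(token -> !token.isEmpty()) // ignore blank lines
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        String year = "Year\n1990\n1991\n1992\n"; // sample data, not from the question
        System.out.println(Arrays.toString(parseColumn(year))); // [1990, 1991, 1992]
    }
}
```

Integer.parseInt throws NumberFormatException on any non-numeric token, so a header that spans more than one line would need a larger skip count.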

Related

How would I replace this function with a regex replace

I have a file name with this format yy_MM_someRandomString_originalFileName.
example:
02_01_fEa3129E_my Pic.png
I want replace the first 2 underscores with / so that the example becomes:
02/01/fEa3129E_my Pic.png
That can be done with replaceAll, but the problem is that files may contain underscores as well.
@Test
void test() {
final var input = "02_01_fEa3129E_my Pic.png";
final var formatted = replaceNMatches(input, "_", "/", 2);
assertEquals("02/01/fEa3129E_my Pic.png", formatted);
}
private String replaceNMatches(String input, String regex,
String replacement, int numberOfTimes) {
for (int i = 0; i < numberOfTimes; i++) {
input = input.replaceFirst(regex, replacement);
}
return input;
}
I solved this using a loop, but is there a pure regex way to do this?
EDIT: this way should be able to let me change a parameter and increase the amount of underscores from 2 to n.
You could use two capturing groups and refer to them in the replacement, so that each matched _ is replaced by /:
^([^_]+)_([^_]+)_
Replace with:
$1/$2/
For example:
String regex = "^([^_]+)_([^_]+)_";
String string = "02_01_fEa3129E_my Pic.png";
String subst = "$1/$2/";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
String result = matcher.replaceFirst(subst);
System.out.println(result);
Result
02/01/fEa3129E_my Pic.png
Your current solution has a few problems:
1. It is inefficient: each replaceFirst call starts from the beginning of the string, so it iterates over the same leading characters many times.
2. It has a bug: because of point 1, each pass starts from the beginning rather than from the last modified position, so it can replace a value that was inserted by a previous pass.
For instance, suppose we want to replace the first two characters of abc with X each, expecting abc -> XXc. After code like
String input = "abc";
input = input.replaceFirst(".", "X"); // replaces a with X -> Xbc
input = input.replaceFirst(".", "X"); // replaces X with X -> Xbc
we end up with Xbc instead of XXc, because the second replaceFirst replaces X with X instead of b with X.
To avoid this kind of problem, you can rewrite your code to use the Matcher#appendReplacement and Matcher#appendTail methods, which ensure that we iterate over the input only once and can replace each matched part with the value we want (the StringBuilder overloads used here require Java 9 or later; earlier versions take a StringBuffer):
private static String replaceNMatches(String input, String regex,
String replacement, int numberOfTimes) {
Matcher m = Pattern.compile(regex).matcher(input);
StringBuilder sb = new StringBuilder();
int i = 0;
while(i++ < numberOfTimes && m.find() ){
m.appendReplacement(sb, replacement); // replaces currently matched part with replacement,
// and writes replaced version to StringBuilder
// along with text before the match
}
m.appendTail(sb); //lets add to builder text after last match
return sb.toString();
}
Usage example:
System.out.println(replaceNMatches("abcdefgh", "[efgh]", "X", 2)); //abcdXXgh
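When the thing being replaced is a delimiter and the replacement is plain text, as in the original file-name example, another option is to skip replacement altogether: split with a limit and join the pieces back together. This is a different technique from the answers above, sketched under the assumption that n is not negative. LimitedReplace and replaceFirstN are hypothetical names.

```java
class LimitedReplace {
    // Hypothetical helper: replaces the first n matches of regex with a literal string.
    // split(regex, n + 1) returns at most n + 1 pieces, so only the first n
    // delimiters are consumed and the rest of the input stays in the last piece.
    static String replaceFirstN(String input, String regex, String replacement, int n) {
        return String.join(replacement, input.split(regex, n + 1));
    }

    public static void main(String[] args) {
        // prints 02/01/fEa3129E_my Pic.png
        System.out.println(replaceFirstN("02_01_fEa3129E_my Pic.png", "_", "/", 2));
    }
}
```

Because the input is split once, this version does not re-scan text that was already replaced. Unlike appendReplacement, the replacement is inserted literally, so group references such as $1 are not expanded.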

Parse out specific characters from java string

I have been trying to drop specific values from a String holding JDBC query results and column metadata. The format of the output is:
[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]
I am trying to get it into the following format:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
I have tried just dropping everything before the "=", but some of the "someVal" data has "=" in them. Is there any efficient way to solve this issue?
below is the code I used:
for(int i = 0; i < finalResult.size(); i+=modval) {
String resulttemp = finalResult.get(i).toString();
String [] parts = resulttemp.split(",");
//below is only for
for(int z = 0; z < columnHeaders.size(); z++) {
String replaced ="";
replaced = parts[z].replace("*=", "");
System.out.println("Replaced: " + replaced);
}
}
You don't need any splitting here!
You can use replaceAll() and the power of regular expressions to simply replace all occurrences of those unwanted characters, like in:
someString.replaceAll("[\\[\\]{}]", "")
When you apply that to your string, the result should look exactly as required.
You could use a regular expression to replace the square and curly brackets like this [\[\]{}]
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
System.out.println(s.replaceAll("[\\[\\]{}]", ""));
That would produce the following output:
I_Col1=someValue1, I_Col2=someVal2, I_Col3=someVal3
which is what you expect in your post.
A better approach, however, might be to match instead of replace, if you know the character set that can appear in the position of 'someValue'. Then you can design a regex that matches this particular string in such a way that, no matter what separates I_Col1=someValue1 from the rest of the String, you will be able to extract it :-)
EDIT:
With regard to the matching approach, given that the value following I_Col1= consists only of word characters (letters in either case, digits, and _), you could use this pattern: (I_Col\d=\w+),?
For example:
String s = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
Matcher m = Pattern.compile("(I_Col\\d=\\w+),?").matcher(s);
while (m.find())
System.out.println(m.group(1));
This will produce:
I_Col1=someValue1
I_Col2=someVal2
I_Col3=someVal3
You could use four chained calls to replaceAll on the string, one per character:
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\{", "").replaceAll("\\}", "").replaceAll("\\[", "").replaceAll("\\]", "");
Or you could use a single regexp with alternation if you want the code to be more compact:
String query = "[{I_Col1=someValue1, I_Col2=someVal2}, {I_Col3=someVal3}]";
String queryWithoutBracesAndBrackets = query.replaceAll("\\[|\\]|\\{|\\}", "");

How do i split a String into an array with split and regex?

The string that I am trying to split is the following:
"#1 Single" (2006)\t\t\t\t\t2006-????
The regex i am trying is:
(["#0-9 a-zA-Z]*\w") (\([0-9]*\w\)).*([0-9{4}]*\d-[\?0-9{4}]*)
This however takes the whole string and not parts.
How do i make it into an array?
array("\"#1 Single\"", "2006", "2006-????");
Your regex already groups the different parts you're interested in, so you should retrieve them separately and use them to populate the resulting array:
// assumes a Matcher matcher which has already matched the text with .find() or .matches()
int groupCount = 3; // for more complex cases, use matcher.groupCount()
String[] parts = new String[groupCount];
for (int groupIndex = 0; groupIndex < groupCount; groupIndex++) {
// capturing groups are numbered from 1; group 0 is the whole match
parts[groupIndex] = matcher.group(groupIndex + 1);
}
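Put together, the group-to-array idea could look like the sketch below. It assumes each line starts with a quoted title, followed by a four-digit year in parentheses, whitespace, and a year range. The pattern is a simplified stand-in written for this example, not the regex from the question, and GroupsToArray and parts are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsToArray {
    // Assumed line format: "title" (yyyy)<whitespace>yyyy-yyyy, where the
    // end year may consist of question marks.
    private static final Pattern LINE =
            Pattern.compile("^(\"[^\"]*\") \\((\\d{4})\\)\\s+(\\d{4}-[\\d?]{4})$");

    static String[] parts(String line) {
        Matcher matcher = LINE.matcher(line);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + line);
        }
        String[] parts = new String[matcher.groupCount()];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = matcher.group(i + 1); // group 0 is the whole match
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "\"#1 Single\" (2006)\t\t\t\t\t2006-????";
        System.out.println(Arrays.toString(parts(line)));
    }
}
```

Here the parentheses around the year sit outside the capturing group, so the second element is the bare year rather than "(2006)".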

Splitting string in between two characters in Java

I am currently attempting to interpret some code I wrote for something. The information I would like to split looks something like this:
{hey=yes}TEST
What I am trying to accomplish, is splitting above string in between '}' and 'T' (T, which could be any letter). The result I am after is (in pseudocode):
["{hey=yes}", "TEST"]
How would one go about doing so? I know basic regex, but have never gotten into using it to split strings in between letters before.
Update:
In order to split the string I am using the String.split method. Do tell if there is a better way to go about doing this.
You can use String's split method, as follows:
String str = "{hey=foo}TEST";
String[] split = str.split("(?<=})");
System.out.println(split[0] + ", " + split[1]);
It splits the string and prints this:
{hey=foo}, TEST
The pattern (?<=}) is a positive lookbehind: it matches the empty position right after a } character, so the string is split there and the } is kept. By default, if you split on a character itself, that character is removed from the result.
This other answer provides a complete explanation of all options when using the split method:
how-to-split-string-with-some-separator-but-without-removing-that-separator-in-j
Using a regexp for such a small task can be really slow if it is repeated thousands of times (e.g. when analysing Alfresco metadata for a lot of documents).
Look at this snippet:
String s = "{key=value}SOMETEXT";
String[] e = null;
long now = 0L;
now = new Date().getTime();
for (int i = 0; i < 3000000; i++) {
e = s.split("(?<=})");
}
System.out.println("Regexp: " + (new Date().getTime() - now));
now = new Date().getTime();
for (int i = 0; i < 3000000; i++) {
int idx = s.indexOf('}') + 1;
e = new String[] { s.substring(0, idx), s.substring(idx) };
}
System.out.println("IndexOf:" + (new Date().getTime() - now));
result is
Regexp: 2544
IndexOf:113
This means that the regexp version is over 20 times slower than the (simpler) substring version in this run. Keep it in mind: it can make the difference between efficient code and merely elegant (!) code.
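The indexOf approach from the benchmark can be wrapped in a small helper. One detail worth handling: in the loop above, an input without '}' makes idx equal 0 and silently produces an empty first element. The sketch below, with the hypothetical names BraceSplitter and splitAfterBrace, rejects such input instead; whether to throw or return the input unchanged is a design choice left to the caller.

```java
class BraceSplitter {
    // Hypothetical helper: splits right after the first '}' without using a regex.
    static String[] splitAfterBrace(String s) {
        int idx = s.indexOf('}');
        if (idx < 0) {
            throw new IllegalArgumentException("No '}' in input: " + s);
        }
        return new String[] { s.substring(0, idx + 1), s.substring(idx + 1) };
    }

    public static void main(String[] args) {
        String[] e = splitAfterBrace("{hey=yes}TEST");
        System.out.println(e[0] + ", " + e[1]); // {hey=yes}, TEST
    }
}
```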
If you're looking for a regex approach and also want some validation that input follows the expected syntax you probably want something like this:
public List<String> splitWithRegexp(String string)
{
Matcher matcher = Pattern.compile("(\\{.*\\})(.*)").matcher(string);
if (matcher.find())
return Arrays.asList(matcher.group(1), matcher.group(2));
else
throw new IllegalArgumentException("Input didn't match!");
}
The parentheses in the regexp define capturing groups, which you can access with matcher.group(n) calls. Group 0 is the whole match.

How to extract integers from a complicated string?

I am having a hard time figuring this out. Say I have a String like this
String s could equal
s = "{1,4,204,3}"
at another time it could equal
s = "&5,3,5,20&"
or it could equal at another time
s = "/4,2,41,23/"
Is there any way I could just extract the numbers out of this string and make a char array for example?
You can use regex for this sample:
String s = "&5,3,5,20&";
System.out.println(s.replaceAll("[^0-9,]", ""));
result:
5,3,5,20
It removes every character that is not a digit or a comma. If you want to extract the individual numbers, you can then call the split method, String[] sArray = s.split(",");, and iterate over the array to get each number between the commas.
You can use RegEx and extract all the digits from the string.
stringWithOnlyNumbers = str.replaceAll("[^\\d,]+","");
After this you can use split() with the delimiter ',' to get the numbers in an array.
I think split() with replace() must help you with that
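The replaceAll-then-split idea described in the answers above can be sketched end to end as follows. It produces an int[] rather than a char[], since a value such as 204 does not fit in a single char digit, and it assumes the numbers are non-negative integers separated by commas. NumberExtractor and extractInts are hypothetical names.

```java
import java.util.Arrays;

class NumberExtractor {
    // Hypothetical helper: strips everything except digits and commas,
    // then parses the comma-separated tokens.
    static int[] extractInts(String s) {
        String cleaned = s.replaceAll("[^0-9,]", "");
        return Arrays.stream(cleaned.split(","))
                .filter(token -> !token.isEmpty()) // tolerate stray commas
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(extractInts("{1,4,204,3}"))); // [1, 4, 204, 3]
        System.out.println(Arrays.toString(extractInts("&5,3,5,20&")));  // [5, 3, 5, 20]
    }
}
```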
Use regular expressions
String a = "asdf4sdf5323ki";
String regex = "([0-9]*)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(a);
while (matcher.find())
{
String group = matcher.group(1);
if (group.length() > 0)
{
System.out.println(group);
}
}
From your examples, if the string always follows the same pattern (one wrapper character at each end), something like the following would work. Exception handling is omitted here:
String[] sArr = s.split(",");
sArr[0] = sArr[0].substring(1);
sArr[sArr.length - 1] = sArr[sArr.length - 1].substring(0, sArr[sArr.length - 1].length() - 1);
Then convert the String[] to a char[] or int[] as needed.
You can use the Scanner class with , as the delimiter:
String s = "{1,4,204,3}";
Scanner in = new Scanner(s.substring(1, s.length() - 1)); // Will scan the 1,4,204,3 part
in.useDelimiter(",");
while(in.hasNextInt()){
int x = in.nextInt();
System.out.print(x + " ");
// do something with x
}
The above will print:
1 4 204 3
