Parsing string from the name - java

I am trying to parse a particular part of the name out of a filename.
Examples of the file names are:
xs_1234323_00_32
sf_12345233_99_12
fs_01923122_12_12
I used String parsedname = child.getName().substring(3, 10) to get the 1234323 out of the first line. Instead, how do I write it so that, for all three names above, it outputs only the middle number (between the first two _)? Something using split?

One-line solution:
String n = str.replaceAll("\\D+(\\d+).+", "$1");
Most efficient solution:
int i = str.indexOf('_');
int j = str.indexOf('_', i + 1);
String n = str.substring(i + 1, j);

String [] tokens = filename.split("_");
/* xs_1234323_00_32 would be
[0]=>xs [1]=> 1234323 [2]=> 00 [3] => 32
*/
String middleNumber = tokens[1];

You can try using split using the '_' delimiter.

The String.split method splits this string around matches of the given regular expression. So use it like this:
String[] output = input.split("_");
Here output[1] will be your desired result,
with input being, for example:
String input = "xs_1234323_00_32";

I would do this:
filename.split("_", 3)[1]
The second argument of split indicates the maximum number of pieces the string should be split into, in your case you only need 3. This will be faster than using the single-argument version of split, which will continue splitting on the delimiter unnecessarily.
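To compare the approaches above side by side, here is a minimal sketch run against the three file names from the question. The class and method names (MiddleNumberDemo, middleBySplit, middleByIndexOf) are made up for illustration, and both methods assume the name contains at least two underscores.

```java
class MiddleNumberDemo {
    // Text between the first and second '_' using split with a limit of 3.
    // The limit means the rest of the name after the second '_' is left unsplit.
    static String middleBySplit(String filename) {
        return filename.split("_", 3)[1];
    }

    // Same result with indexOf/substring, no regular expression involved.
    static String middleByIndexOf(String filename) {
        int first = filename.indexOf('_');
        int second = filename.indexOf('_', first + 1);
        return filename.substring(first + 1, second);
    }

    public static void main(String[] args) {
        String[] names = {"xs_1234323_00_32", "sf_12345233_99_12", "fs_01923122_12_12"};
        for (String name : names) {
            System.out.println(middleBySplit(name) + " / " + middleByIndexOf(name));
        }
    }
}
```

Both methods throw an exception on a name without two underscores, so real code would probably want to validate the input first.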

Related

How to get the desired character from the variable sized strings?

I need to extract the desired string which is attached to the word.
For example:
pot-1_Sam
pot-22_Daniel
pot_444_Jack
pot_5434_Bill
I need to get the names from the above strings, i.e. Sam, Daniel, Jack and Bill.
The thing is, if I use substring the position keeps changing because the length of the number varies. How can I achieve this using a regex?
Update:
Some strings have two underscore-separated parts before the name, like
pot_US-1_Sam
pot_RUS_444_Jack
Assuming you have a standard set of the above formats, it seems you do not need any regex; you can try using the lastIndexOf and substring methods.
String result = yourString.substring(yourString.lastIndexOf("_")+1, yourString.length());
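As a quick check, here is a small sketch of the lastIndexOf approach applied to the strings from the question, including the two from the update. The class and method names are invented for the example, and it assumes the name itself never contains an underscore.

```java
class LastTokenDemo {
    // Everything after the last '_'. If there is no '_' at all,
    // lastIndexOf returns -1 and the whole string comes back unchanged.
    static String afterLastUnderscore(String s) {
        return s.substring(s.lastIndexOf('_') + 1);
    }

    public static void main(String[] args) {
        String[] samples = {
            "pot-1_Sam", "pot-22_Daniel", "pot_444_Jack",
            "pot_5434_Bill", "pot_US-1_Sam", "pot_RUS_444_Jack"
        };
        for (String sample : samples) {
            System.out.println(afterLastUnderscore(sample));
        }
    }
}
```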
Your answer is:
String[] s = new String[4];
s[0] = "pot-1_Sam";
s[1] = "pot-22_Daniel";
s[2] = "pot_444_Jack";
s[3] = "pot_5434_Bill";
ArrayList<String> result = new ArrayList<String>();
for (String value : s) {
String[] splitedArray = value.split("_");
result.add(splitedArray[splitedArray.length-1]);
}
for(String resultingValue : result){
System.out.println(resultingValue);
}
You have 2 options:
Keep using the indexOf method to get the index of the last _ (This assumes that there is no _ in the names you are after). Once that you have the last index of the _ character, you can use the substring method to get the bit you are after.
Use a regular expression. The strings you have shown essentially have the pattern where in you have numbers, followed by an underscore which is in turn followed by the word you are after. You can use a regular expression such as \\d+_ (which will match one or more digits followed by an underscore) in combination with the split method. The string you are after will be in the last array position.
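The second option could look roughly like this. It is only a sketch: the class and method names are hypothetical, and it assumes every string has digits followed by an underscore right before the name.

```java
class DigitsUnderscoreSplitDemo {
    // Split on "one or more digits followed by '_'" and keep the last piece.
    static String nameAfterNumber(String s) {
        String[] parts = s.split("\\d+_");
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        String[] samples = {"pot-1_Sam", "pot_444_Jack", "pot_RUS_444_Jack"};
        for (String sample : samples) {
            System.out.println(nameAfterNumber(sample));
        }
    }
}
```

If a string contains no digits before the final underscore, the pattern does not match and the whole string is returned, so the first option is the more forgiving of the two.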
Use a string tokenizer based on '_' and get the last element. No need for REGEX.
Or use the split method on the string object like so :
String[] strArray = strValue.split("_");
String lastToken = strArray[strArray.length -1];
String[] s = {
"pot-1_Sam",
"pot-22_Daniel",
"pot_444_Jack",
"pot_5434_Bill"
};
for (String e : s)
System.out.println(e.replaceAll(".*_", ""));

Why does split() produce an extra empty element when the limit is set to -1?

I want to split the area code and the number that follows it out of a telephone number, without the brackets, so I did this:
String pattern = "[\\(?=\\)]";
String b = "(079)25894029".trim();
String c[] = b.split(pattern,-1);
for (int a = 0; a < c.length; a++)
System.out.println("c[" + a + "]::->" + c[a] + "\nLength::->"+ c[a].length());
Output:
c[0]::-> Length::->0
c[1]::->079 Length::->3
c[2]::->25894029 Length::->8
Expected Output:
c[0]::->079 Length::->3
c[1]::->25894029 Length::->8
So my question is: why does split() produce an extra blank at the start, e.g.
[, 079, 25894029]? Is this its normal behavior, or did I do something wrong here?
How can I get my expected outcome?
First, you have unnecessary escaping inside your character class. Your regex is the same as:
String pattern = "[(?=)]";
Now, you are getting an empty first element because ( is the very first character in the string, and a match at position 0 leaves an empty string in front of it.
To avoid that result use this code:
String str = "(079)25894029";
String[] toks = (Character.isDigit(str.charAt(0)) ? str : str.substring(1)).split("[(?=)]");
for (String tok: toks)
System.out.printf("<<%s>>%n", tok);
Output:
<<079>>
<<25894029>>
From the Java8 Oracle docs:
When there is a positive-width match at the beginning of this string
then an empty leading substring is included at the beginning of the
resulting array. A zero-width match at the beginning however never
produces such empty leading substring.
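A small sketch of what that passage means in practice, assuming Java 8 or later (the class and method names are invented for the example). A one-character match at index 0 yields an empty first element, while a zero-width match at index 0 does not.

```java
import java.util.Arrays;

class LeadingEmptyDemo {
    // Renders the result of input.split(regex) so that empty elements are visible.
    static String describe(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // '(' matches at index 0 with width 1: an empty leading element appears.
        System.out.println(describe("(079)25894029", "[()]")); // [, 079, 25894029]
        // The empty regex matches at index 0 with width 0: no leading element.
        System.out.println(describe("abc", ""));               // [a, b, c]
    }
}
```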
You can check whether the first element of the result is an empty string and, if it is, drop that element.
Your regex has problems, as does your approach - you can't solve it using your approach with any regex. The magic one-liner you seek is:
String[] c = b.replaceAll("^\\D+|\\D+$", "").split("\\D+");
This removes all leading/trailing non-digits, then splits on non-digits. This will handle many different formats and separators (try a few yourself).
See live demo of this:
String b = "(079)25894029".trim();
String[] c = b.replaceAll("^\\D+|\\D+$", "").split("\\D+");
System.out.println(Arrays.toString(c));
Producing this:
[079, 25894029]
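To try a few other formats, the one-liner can be wrapped in a helper like the one below. This is just a sketch: the names are made up, and the extra phone number formats are examples I picked, not ones from the question.

```java
import java.util.Arrays;

class DigitGroupsDemo {
    // Strip leading/trailing non-digits, then split on runs of non-digits.
    static String[] digitGroups(String phone) {
        return phone.trim().replaceAll("^\\D+|\\D+$", "").split("\\D+");
    }

    // Convenience wrapper for printing.
    static String show(String phone) {
        return Arrays.toString(digitGroups(phone));
    }

    public static void main(String[] args) {
        System.out.println(show("(079)25894029"));      // [079, 25894029]
        System.out.println(show("079-25894029"));       // [079, 25894029]
        System.out.println(show("+91 (079) 25894029")); // [91, 079, 25894029]
    }
}
```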

Java Character change of a String

I have Java String variables in a format that includes spaces, e.g. String id = "CA T 4443". I need to get my String value as id=CA4443, i.e. remove the T and the spaces. Can any Java expert help me concatenate these characters?
My value array:
CA T 4443
CB T 4562
CG T 6365
DA T 5552
CX T 9875
CS T 5454
RA T 2377
Second challenge:
CAF T 444352
CBAD T 4562
CG T 636535
DA T 555255
CX T 98755665
CS T 545455
RA T 237766
I need to get the following (only the first two letters and the last 4 digits):
CA4352
CB4562
CG6535
DA5255
CX5665
CS5455
RA7766
If it is always two letters, a space, a T, a space, and a number, then you could do:
String id = "CA T 4443";
String result = id.substring(0, 2) + id.substring(5, id.length());
Or you could just do :
String result = id.replace(" T ", "");
Just do this:
public static String getFormatedValue(String data) {
String[] split = data.split(" ", 3);
return split[0] + split[2];
}
This will take the first and last section, and skip the middle T section.
You have several things going on - it's hard to tell what's the best approach without seeing the raw data.
Easiest / most fragile: if you know for a fact that every line has exactly the same layout as "CA T 4443", you can just grab the characters at fixed positions with substring or directly from the char array. This will break if one line is longer, so it is probably safer to trim() the string before calling substring.
Or you can call split:
String id = "CA T 4443";
String[] split = id.split(" "); // gives ["CA", "T", "4443"]
This is a bit more flexible with lengths but depends on the formatting. Split on a regex for whitespace if your data is possibly dirty.
Or just grab the pieces through regex matching.
Depends on how normalized your data is.
EDIT: For the first challenge only
If it is always that the IDs have T between the two "segments" you are trying to concatenate, then following is my solution:
public static String makeID(String[] myValueArray){
String newID = "";
for (String s: myValueArray){
String[] previousID = s.split(" T ");
newID = newID + previousID[0] + previousID[1] + "\n";
}
return newID;
}
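For the second challenge (first two letters plus last four digits), one possible sketch is to take the first two and the last four characters of the trimmed string. The class and method names are hypothetical, and it assumes every value starts with at least two letters and ends with at least four digits, as in the sample data.

```java
class ShortIdDemo {
    // First two characters + last four characters of the trimmed input.
    // Assumes the input starts with >= 2 letters and ends with >= 4 digits.
    static String shortId(String raw) {
        String s = raw.trim();
        return s.substring(0, 2) + s.substring(s.length() - 4);
    }

    public static void main(String[] args) {
        String[] values = {"CAF T 444352", "CBAD T 4562", "CG T 636535", "CX T 98755665"};
        for (String value : values) {
            System.out.println(shortId(value));
        }
    }
}
```

This also handles the first challenge, since "CA T 4443" already has exactly two letters and four digits.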

Java: single line substring

I need to take the substring after "2:" up to the end of the line, since its content varies.
In this example that means I want to take the string "LOV_TYPE" from these 2 lines:
ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 2: LOV_TYPE
ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 3: AUDIT_LEVEL
I tried to use the substring(int startingIndex, int endingIndex) method. I can determine the first argument, which is the starting point, but I can't determine the end point.
You can use two substrings, one that gets the String after 2:, and then one that gets the string before the next new line.
string = string.substring(string.indexOf("2:") + 2);
string = string.substring(0, string.indexOf("ObjMgrSqlLog"));
If you need to get rid of the spaces on either end, you can then trim the string.
string = string.trim();
source:
String str = "ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 2: LOV_TYPE";
You can use regex
String out1 = str.replaceAll("^.*?.\\:.*[ ]", "");
or classic index-of
int lastCh = str.lastIndexOf(":");
String out2 = str.substring(++lastCh).trim();
output:
System.out.println(out1);
System.out.println(out2);
If you use str.substring(startingIndex), you will have the substring to the end of the string. It seems to be what you want. If you have extra spaces at the end of the string, you can always use a str.trim() to remove the spaces.
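When several log lines sit in one string, the end point can be found by searching for the next line break after the marker. Here is a minimal sketch; the class and method names are invented, and it assumes lines are separated by '\n'. Note that a short marker such as "3:" would also match inside the timestamp 13:19:58, so the example searches for the longer "Bind variable N:" text.

```java
class BindValueDemo {
    // Returns the text between the marker and the end of that line, trimmed,
    // or null if the marker does not occur in the log.
    static String valueAfterMarker(String log, String marker) {
        int start = log.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        int end = log.indexOf('\n', start);
        if (end < 0) {
            end = log.length(); // marker is on the last line
        }
        return log.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String log = "ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 2: LOV_TYPE\n"
                   + "ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 3: AUDIT_LEVEL";
        System.out.println(valueAfterMarker(log, "Bind variable 2:")); // LOV_TYPE
        System.out.println(valueAfterMarker(log, "Bind variable 3:")); // AUDIT_LEVEL
    }
}
```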
Use substring along with .length() to get the value of the length of the string. For example:
String original = "ObjMgrSqlLog Detail 4 2014-03-26 13:19:58 Bind variable 2: LOV_TYPE";
String newString = original.substring (62, original.length ());
System.out.print (newString);

Cutting / splitting strings with Java

I have a string as follows:
2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576
I want to extract the number: 872226816, so in this case I assume after the second comma start reading the data and then the following comma end the reading of data.
Example output:
872226816
String s = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576";
String number = s.split(",")[2];
Javadoc for String.split()
If the number you want will always be after the 2nd comma, you can do something like so:
String str = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,2138578944,-17809408,2147352576";
String[] line = str.split(",");
System.out.println(line[2]);
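If you would rather do it the way the question describes (start reading after the second comma, stop at the following comma), a sketch with indexOf could look like this. The names are made up for the example, and it assumes the line contains at least three commas.

```java
class ThirdFieldDemo {
    // Text between the second and third comma.
    static String thirdField(String line) {
        int firstComma = line.indexOf(',');
        int secondComma = line.indexOf(',', firstComma + 1);
        int thirdComma = line.indexOf(',', secondComma + 1);
        return line.substring(secondComma + 1, thirdComma);
    }

    public static void main(String[] args) {
        String line = "2012/02/01,13:27:20,872226816,-1174749184,2136678400,"
                    + "2138578944,-17809408,2147352576";
        String field = thirdField(line);
        long value = Long.parseLong(field); // convert if a number is needed
        System.out.println(field + " -> " + value);
    }
}
```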
