How to break a string into an array - java

I have a problem with parsing text. I have a transcript of an interview in which a tag marks which channel is talking (ch1, ch2). I need to break it into an array so that I can search for which channel says a specific word.
For example, this is part of the interview:
<ch1>Hello</ch1> <ch2>Hello</ch2> <ch1>How are you</ch1><ch2>I'm fine</ch2>
This is a string
String text = "<ch1>Hello</ch1> <ch2>Hello</ch2> <ch1>How are you</ch1><ch2>I'm fine</ch2>";
And I want this output:
String[] output = {"<ch1>Hello</ch1>", "<ch2>Hello</ch2>", ....};
Thanks for help.

You can use a regular expression with lookahead and lookbehind:
String dialogue = "<ch1>Hello</ch1> <ch2>Hello</ch2> <ch1>How are you</ch1><ch2>I'm fine</ch2>";
String[] statements = dialogue.split("(?<=</ch[12]>)\\s*(?=<ch[12]>)");
System.out.println(Arrays.asList(statements));
Output:
[<ch1>Hello</ch1>, <ch2>Hello</ch2>, <ch1>How are you</ch1>, <ch2>I'm fine</ch2>]
It's a bit hard to read due to the many < and >, but the pattern is like this:
split("(?<=endOfLastPart)inBetween(?=startOfNextPart)")
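To tie this back to the original goal (finding which channel says a specific word), here is a minimal sketch built on the same split. The class and method names (ChannelSearch, splitStatements, channelsSaying) are made up for illustration, and it assumes every statement is well formed, i.e. wrapped in a matching <ch1> or <ch2> tag pair. The search is a case-insensitive substring check, not a whole-word match.

```java
import java.util.ArrayList;
import java.util.List;

public class ChannelSearch {

    // Splits the transcript between a closing tag and the next opening tag.
    static String[] splitStatements(String dialogue) {
        return dialogue.trim().split("(?<=</ch[12]>)\\s*(?=<ch[12]>)");
    }

    // Returns the channels (e.g. "ch1") whose statements contain the given text.
    // Assumes each statement looks like <chN>...</chN>.
    static List<String> channelsSaying(String dialogue, String word) {
        List<String> channels = new ArrayList<>();
        for (String statement : splitStatements(dialogue)) {
            int tagEnd = statement.indexOf('>');
            String channel = statement.substring(1, tagEnd);
            String spoken = statement.substring(tagEnd + 1, statement.lastIndexOf("</"));
            boolean matches = spoken.toLowerCase().contains(word.toLowerCase());
            if (matches && !channels.contains(channel)) {
                channels.add(channel);
            }
        }
        return channels;
    }

    public static void main(String[] args) {
        String dialogue = "<ch1>Hello</ch1> <ch2>Hello</ch2> <ch1>How are you</ch1><ch2>I'm fine</ch2>";
        System.out.println(channelsSaying(dialogue, "hello")); // [ch1, ch2]
        System.out.println(channelsSaying(dialogue, "fine"));  // [ch2]
    }
}
```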

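In Java, arrays have no join method, so use String.join:
String.join("-<ch", text.split("<ch")).split("-")
Any string that does not occur in the text can be used instead of "-". Note that the resulting array starts with an empty string, because the text begins with "<ch".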

Related

How to split a string and get only the segment after the question mark

How do you split a string and get only the part after the question mark? For example, say you have the line: hello?myNameIs...
How could you get only what's after the question mark?
Many thanks
If all your use cases will be similar to the simple example you've stated, you could use a simple split.
The code:
String delim = "\\?";
String s = "hello?myNameIs";
String token[] = s.split(delim);
System.out.println(token[1]);
Gives the output:
myNameIs
Of course, you can tinker with it to solve your specific problems.
Try this one:
String sArray[] = src.split(Pattern.quote("?"));
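One caveat with indexing into the split result: if the input has no question mark, or nothing after it, token[1] does not exist and the access throws ArrayIndexOutOfBoundsException. A small sketch that avoids the regex altogether, using indexOf and substring (the class and method names are made up for illustration, and returning an empty string when there is no question mark is an assumption about the desired behaviour):

```java
public class AfterQuestionMark {

    // Returns everything after the first '?', or "" if the string has no '?'.
    static String afterFirstQuestionMark(String s) {
        int index = s.indexOf('?');
        return index < 0 ? "" : s.substring(index + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterFirstQuestionMark("hello?myNameIs")); // myNameIs
        System.out.println(afterFirstQuestionMark("a?b?c"));          // b?c
        System.out.println(afterFirstQuestionMark("noQuestionMark").isEmpty()); // true
    }
}
```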

Regex Java when we have specific text up to a pattern

As I haven't worked much with regex, can someone help me with the following:
(1) I want to remove some text, say Element.
(2) It may or may not be followed by a delimiter, say a double pipe (||).
I tried the code below, but it is not working the way I want:
String str = "String:abc||Element:abc||Value:abc"; // Sample text 1
String str1 = "String:abc||Element:abc"; // Sample text 2
System.out.println(str.replaceFirst("Element.*\\||", ""));
System.out.println(str1.replaceFirst("Element.*\\||", ""));
Required output in above cases:
String:abc||Value:abc //for the first case
String:abc //for the second case
Assuming that the text to remove (Element in this case) can vary, you can use Pattern.quote to escape it, as below:
String str = "String:abc||Element:abc||Value:abc"; // Sample text 1
String str1 = "String:abc||Element:abc"; // Sample text 2
String originalPattern = "Element";
String pattern = String.format("\\|{2}%s[^\\|]+", Pattern.quote(originalPattern));
System.out.println(str.replaceFirst(pattern, ""));
System.out.println(str1.replaceFirst(pattern, ""));
Your pattern is then generic, and its value is String.format("\\|{2}%s[^\\|]+", Pattern.quote(originalPattern))
Output:
String:abc||Value:abc
String:abc
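You put the escape in the wrong place. It should be:
Element(.*?\|\||.*$)
Escape each pipe, and use ? to make the quantifier non-greedy, so that you replace only as much of the string as needed, not everything. In a Java string literal each backslash must be doubled: "Element(.*?\\|\\||.*$)". Note that for the second sample this still leaves the || before Element in place.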
String text = "String:abc||Element:abc||Value:abc";
text = text.replaceAll("\\bElement\\b", "");
You might need replaceAll, which replaces every occurrence of Element in your string. Here \b is the word-boundary anchor in Java regular expressions, so only the whole word is matched. The match is case-sensitive, and this removes only the word itself, not the value or delimiter around it.
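Putting the ideas from these answers together, the following is a rough sketch of a helper that removes a named field together with one adjacent || delimiter, whether the field is first, in the middle, or last. The names FieldRemover and removeField are hypothetical. It assumes that values never contain a | character and that the field name does not also occur as part of another field's name or value.

```java
import java.util.regex.Pattern;

public class FieldRemover {

    // Removes the first "name..." field and one neighbouring "||" delimiter.
    // Assumption: field values contain no '|' characters.
    static String removeField(String text, String name) {
        String field = Pattern.quote(name) + "[^|]*";
        // Either "||field" (field in the middle or at the end),
        // or "field" optionally followed by "||" (field at the start or alone).
        String regex = "\\|\\|" + field + "|" + field + "(\\|\\|)?";
        return text.replaceFirst(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(removeField("String:abc||Element:abc||Value:abc", "Element")); // String:abc||Value:abc
        System.out.println(removeField("String:abc||Element:abc", "Element"));            // String:abc
        System.out.println(removeField("Element:abc||Value:abc", "Element"));             // Value:abc
    }
}
```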

Determine a number from text string

I want to extract a number from a given string.
Ex: I have many text strings, such as "3.2p" or "3.2px" or "xp3.2" or "p3.2x".
The result I want is the number contained in each of the strings above. Expected result: "3.2".
If anyone knows how, please help.
Thanks,
I would first remove all the non-numeric characters using a regex, then parse what remains.
String str = input.replaceAll("[^\\d.]", "");
float value = Float.parseFloat(str);
Use this:
String s = "ffffa32.334tccy";
s = s.replaceAll("[^\\d.]", "");
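Stripping every non-digit, non-dot character works for the samples in the question, but an input with more than one dot (for example "v1.2.3") would leave "1.2.3", which Float.parseFloat rejects with a NumberFormatException. As an alternative sketch, a Matcher can pick out the first number directly. The class and method names are made up for illustration, and the pattern assumes unsigned numbers with an optional fractional part.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberExtractor {

    // Digits, optionally followed by a dot and more digits (no sign, no exponent).
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    // Returns the first number found in the text.
    static double firstNumber(String text) {
        Matcher matcher = NUMBER.matcher(text);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No number in: " + text);
        }
        return Double.parseDouble(matcher.group());
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3.2p", "3.2px", "xp3.2", "p3.2x"}) {
            System.out.println(firstNumber(s)); // 3.2 each time
        }
    }
}
```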

The filter string - removing some chars

Is there a function in Java that removes unwanted characters, which I specify, from a string? If not, what is the most effective way to do it? I would like to implement it in Java.
EDIT:
For example, given:
String toRescue = "#8*";
String text = "ra#dada882da(*%";
after calling the function I want to get:
String text2 = "#88*";
You can use a regular expression, for example:
String text = "ra#dada882da(*%";
String text2 = text.replaceAll("[^#8*]", "");
After executing the above snippet, text2 will contain the string "#88*".
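If the set of characters to keep is only known at run time (like toRescue in the question), building a character class from it needs care, because characters such as ^, - and ] have a special meaning inside [...]. A simple sketch that sidesteps regex escaping by checking each character with indexOf; the names CharFilter and keepOnly are hypothetical:

```java
public class CharFilter {

    // Keeps only the characters of text that also occur in allowed, in order.
    static String keepOnly(String text, String allowed) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (allowed.indexOf(c) >= 0) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String toRescue = "#8*";
        String text = "ra#dada882da(*%";
        System.out.println(keepOnly(text, toRescue)); // #88*
    }
}
```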
The Java String class has many methods that can help you, such as
String.replace(char oldChar, char newChar);
String.split(String regex);
String.substring(int beginIndex);
These and many others are described in the Javadoc: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html

Split the string

abcd+xyz
I want to split the string and get the left and right components with respect to "+",
that is, I need to get abcd and xyz separately.
I tried the code below.
String org = "abcd+xyz";
String splits[] = org.split("+");
But I am getting null values for splits[0] and splits[1]...
Please help.
The string you send as an argument to split() is interpreted as a regex (documentation for split(String regex)). You should add an escape character before the + sign:
String splits[] = org.split("\\+");
You might also find the Summary of regular-expression constructs worth reading :)
"+" is a special character (a quantifier) in regular expressions.
So just do
String splits[] = org.split("\\+");
This will work.
The expression "+" means "one or more" in Java regular expressions.
split takes a regex as its argument, so the pattern you passed is not valid.
So use
String org = "abcd+xyz";
String splits[] = org.split("\\+");
regards!!
Try:
String splits[] = org.split("\\+");
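For completeness, a short runnable sketch comparing the escaped form with Pattern.quote, which turns any delimiter into a literal pattern so you do not have to remember which characters need escaping. The class and method names are made up for illustration. Passing a bare "+" to split is rejected with a PatternSyntaxException, since the quantifier has nothing to repeat.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitOnPlus {

    // Splits on the delimiter taken literally, whatever characters it contains.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String org = "abcd+xyz";
        System.out.println(Arrays.toString(org.split("\\+")));         // [abcd, xyz]
        System.out.println(Arrays.toString(splitLiteral(org, "+")));   // [abcd, xyz]
    }
}
```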
