How to extract specific substring from a bigger string java

I have the following string:
String n = "(.........)(......)(.......)(......) etc";
I want to write a method which will fill a List<String> with every substring of n that lies between ( and ). Thank you in advance!

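It can be done in one line:
String[] parts = input.replaceAll("^[^(]*\\(|\\)[^)]*$", "").split("\\)\\(");
The call to replaceAll() strips off the leading and trailing brackets (plus any other junk characters before the first bracket or after the last one), then you just split() on the ")(" pairs. The character classes [^(] and [^)] keep each match from running past the first opening bracket or starting before the last closing bracket, which a greedy .* would do.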

I'm not very familiar with the String methods, so I'm sure there's a way that it could be done without having to code it yourself, and just using some fancy method, but here you go:
Tested, works 100% perfect :)
String string = "(stack)(over)(flow)";
ArrayList<String> subStrings = new ArrayList<String>();
for (int c = 0; c < string.length(); c++) {
    if (string.charAt(c) == '(') {
        c++;
        String newString = "";
        for (; c < string.length() && string.charAt(c) != ')'; c++) {
            newString += string.charAt(c);
        }
        subStrings.add(newString);
    }
}

If the (...) pairs aren't nested, you can use a regular expression in Java. Take a look at the java.util.regex.Pattern class.

I made this regex version, but it's kind of lengthy. I'm sure it could be improved upon. (note: "n" is your input string)
Pattern p = Pattern.compile("\\((.*?)\\)");
Matcher matcher = p.matcher(n);
List<String> list = new ArrayList<String>();
while (matcher.find())
{
    list.add(matcher.group(1)); // 1 == stuff between the ()'s
}
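Since the question asks for a method, here is a minimal sketch that wraps the regex version above in one. The class name ParenExtractor and the method name extract are made up for illustration, and it assumes the (...) pairs are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenExtractor {
    // Reluctant group: captures the shortest run of characters between "(" and ")".
    private static final Pattern GROUP = Pattern.compile("\\((.*?)\\)");

    // Hypothetical helper: returns the contents of every (...) pair, in order.
    static List<String> extract(String n) {
        List<String> result = new ArrayList<>();
        Matcher m = GROUP.matcher(n);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("(stack)(over)(flow) etc")); // prints [stack, over, flow]
    }
}
```

Text outside the brackets (like the trailing " etc") is simply ignored, and a string without any brackets gives an empty list.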

This should work, as long as the string starts with ( and ends with ):
String in = "(bla)(die)(foo)";
in = in.substring(1, in.length() - 1);
String[] out = in.split(Pattern.quote(")("));

Related

java extract elements separated by comma in a String

I have a problem extracting a particular kind of element from a String.
The string looks like this:
String input = "[[2,3],[4,5],'hello',3,[3,[5,[6,7]]]],'hi',3";
I'm using the split method to split the string above into its three elements, but I cannot find a regex that considers only the commas outside the lists.
In a previous question it was suggested that I use this regex:
,(?![^\[]*[\]])
This regex works in some cases, but not in the case above.
I have tried different approaches but, honestly, I have not found a solution.

With a regex alone you would need recursion to match the nested brackets, and Java's standard regex library doesn't support recursion.
But you can achieve what you are trying to do by splitting on every comma and rejoining the pieces until the brackets balance, something like this:
String input = "[[2,3],[4,5],'hello',3,[3,[5,[6,7]]]],'hi',3";
String[] splited = input.split(",");
List<String> result = new ArrayList<String>();
int brackets = 0;
String aux = "";
for (String string : splited) {
    char[] word = string.toCharArray();
    // count the brackets
    for (char c : word) {
        if (c == '[') {
            brackets++;
        }
        else if (c == ']') {
            brackets--;
        }
    }
    aux = aux + string;
    // if all opened brackets are closed
    if (brackets == 0) {
        result.add(aux);
        aux = "";
    } else {
        aux = aux + ",";
    }
}
// the list 'result' contains 3 elements, each one a top-level element that was separated by a comma
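The same bracket-counting idea can also be done in one pass over the characters, without splitting and rejoining. This is only a sketch: the names TopLevelSplitter and splitTopLevel are made up, and it assumes the brackets are balanced and that quoted elements such as 'hello' contain no brackets or commas.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {
    // Splits on commas that sit at bracket depth 0 only.
    // Assumption: brackets are balanced; quotes get no special treatment.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : input.toCharArray()) {
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            }
            if (c == ',' && depth == 0) {
                // a comma outside every list ends the current element
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString()); // the last element has no trailing comma
        return parts;
    }

    public static void main(String[] args) {
        String input = "[[2,3],[4,5],'hello',3,[3,[5,[6,7]]]],'hi',3";
        for (String part : splitTopLevel(input)) {
            System.out.println(part);
        }
    }
}
```

For the input from the question this gives three elements: the outer list, 'hi' and 3.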

How to make substring() read a string until a certain character?

I want to create a substring say:
String s1 = "derp123";
s2 = s1.substring(0, *index where there's a number*);
Is there any way I can achieve this other than by using the replaceAll method?
I don't want to replace all the numbers in the String, I want to stop reading the string once the method detects a number 0-9.

For this, a regular expression is your friend.
I'll just show you the code for your example, but you should learn more about the power (and limitations) of regular expressions. See the javadoc of Pattern.
String s1 = "derp123";
Matcher m = Pattern.compile("^\\D*").matcher(s1);
if (m.find())
    System.out.println(m.group()); // prints: derp
Note that the replaceAll() method shown in comments is also using a regular expression.

Here is what you want.
String s1 = "derp123";
String patternStr = "[0-9]";
Matcher matcher = Pattern.compile(patternStr).matcher(s1);
if (matcher.find()) {
    System.out.println(s1.substring(0, matcher.start()));
}
This gives you the part of the string before the first digit.

Since the OP said he did not want to use replace (and therefore regex), this is my suggestion. Note that the regex version already given is much more elegant in my opinion; I'm only providing this to show that there are other ways to do it.
String s1 = "blahblah1234";
String s2 = s1.substring(0, firstNumberPos(s1));
System.out.println(s2);
And the firstNumberPos definition:
public static int firstNumberPos(String str) {
    for (int i = 0; i < str.length(); i++) {
        if (str.charAt(i) >= '0' && str.charAt(i) <= '9') {
            return i;
        }
    }
    return str.length();
}
Note that I didn't handle a null argument; you still have to check for that.
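One way to fold the null check into the loop-based approach is sketched below. The class and method names (DigitPrefix, beforeFirstDigit) are invented for illustration, and returning null for a null input is just one possible policy, not something the question specifies.

```java
class DigitPrefix {
    // Returns the part of s before its first ASCII digit, or all of s if it
    // contains no digit. Assumption: a null input is passed through as null.
    static String beforeFirstDigit(String s) {
        if (s == null) {
            return null;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                return s.substring(0, i);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstDigit("derp123")); // prints derp
    }
}
```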

How to replace the same character with a different value?

I'm facing a problem replacing a character in a string with its index.
E.g. I want to replace every '?' with its index in the string:
"a?ghmars?bh?" -> will be "a1ghmars8bh11".
Any help is truly appreciated.
P.S. I need to solve this assignment today so I can pass it to my instructor.
Thanks in advance.
So far I have managed to replace every ? with 0 through this piece of code:
public static void main(String[] args) {
    String name = "?tsds?dsds?";
    String myarray[] = name.split("");
    for (int i = 0; i < myarray.length; i++) {
        name = name.replace("?", String.valueOf(i++));
    }
    System.out.println(name);
}
Output:
0tsds0dsds0
It should be:
0tsds5dsds10

For simple replace operations, String.replaceAll is sufficient. For more complex operations, you have to partly retrace what this method does.
The documentation of String.replaceAll says that it is equivalent to
Pattern.compile(regex).matcher(str).replaceAll(repl)
and the linked documentation of Matcher.replaceAll in turn refers to the method appendReplacement, which Java's regex package provides publicly for exactly the purpose of supporting customized replace operations. Its documentation also gives a code example of the ordinary replaceAll operation:
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
Using this template, we can implement the desired operation as follows:
String name = "?tsds?dsds?";
Matcher m = Pattern.compile("?", Pattern.LITERAL).matcher(name);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, String.valueOf(m.start()));
}
m.appendTail(sb);
name = sb.toString();
System.out.println(name);
The differences are that we use a LITERAL pattern to inhibit the special meaning of ? in regular expressions (that’s easier to read than using "\\?" as pattern). Further, we specify a String representation of the found match’s location as the replacement (which is what your question was all about). That’s it.
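If you are on Java 9 or later, the same loop can be written more compactly with the Matcher.replaceAll overload that takes a function from MatchResult to the replacement string. This is a sketch under that version assumption; the class and method names (IndexReplace, replaceWithIndex) are made up.

```java
import java.util.regex.Pattern;

class IndexReplace {
    // LITERAL switches off the regex meaning of "?".
    private static final Pattern QUESTION_MARK = Pattern.compile("?", Pattern.LITERAL);

    // Replaces every '?' with its offset in the original string (requires Java 9+).
    static String replaceWithIndex(String s) {
        // MatchResult.start() is the position of the match in the original input,
        // so earlier multi-digit replacements do not shift later indexes.
        return QUESTION_MARK.matcher(s).replaceAll(mr -> String.valueOf(mr.start()));
    }

    public static void main(String[] args) {
        System.out.println(replaceWithIndex("?tsds?dsds?")); // prints 0tsds5dsds10
    }
}
```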

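I misread the question in my previous answer, sorry. This code replaces every "?" with its index:
String string = "a?ghmars?bh?das?";
while (string.contains("?"))
{
    Integer index = string.indexOf("?");
    string = string.replaceFirst("\\?", index.toString());
    System.out.println(string);
}
So from "a?ghmars?bh?das?" we get "a1ghmars8bh11das16". Note that each index is measured in the partially replaced string, so once a replacement is more than one digit long, the later indexes are shifted relative to the original string (the last "?" was originally at index 15).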

You are (more or less) replacing each target with the cardinal number of the occurrence (1 for 1st, 2 for 2nd, etc.) but you want the index.
Use a StringBuilder - you only need a few lines:
StringBuilder sb = new StringBuilder(name);
for (int i = name.length() - 1; i >= 0; i--)
    if (name.charAt(i) == '?')
        sb.replace(i, i + 1, i + "");
Note the counting down, not up: it lets a replacement be several digits long without disturbing the positions still to be processed. If you counted up, every replacement of two or more digits (when the index of "?" is 10 or more) would shuffle the rest of the string to the right and throw off the index of subsequent calls.

I think this may work; I have not checked it.
public class Stack {
    public static void main(String[] args) {
        String name = "?tsds?dsds?";
        for (int i = 0; i < name.length(); i++) {
            char a = name.charAt(i);
            switch (a) {
                case '?':
                    // print the position of the '?' instead of the character
                    System.out.print(i);
                    break;
                default:
                    System.out.print(a);
                    break;
            }
        }
    }
}

Check the code below:
String string = "a?ghmars?bh?das?";
for (int i = 0; i < string.length(); i++) {
    Character r = string.charAt(i);
    if (r.toString().equals("?"))
        System.out.print(i);
    else
        System.out.print(r);
}

Simplify & condense multiple editorial operations on an array. Java

I have some raw output that I want to clean up and make presentable, but right now I go about it in a very ugly and cumbersome way. I wonder if anyone might know a clean and elegant way to perform the same operation.
int size = charOutput.size();
for (int i = size - 1; i >= 1; i--)
{
    if (charOutput.get(i).compareTo(charOutput.get(i - 1)) == 0)
    {
        charOutput.remove(i);
    }
}
for (int x = 0; x < charOutput.size(); x++)
{
    if (charOutput.get(x) == '?')
    {
        charOutput.remove(x);
    }
}
String firstOne = Arrays.toString(charOutput.toArray());
String secondOne = firstOne.replaceAll(",","");
String thirdOne = secondOne.substring(1, secondOne.length() - 1);
String output = thirdOne.replaceAll(" ","");
return output;

ZouZou has the right code for fixing the final few calls in your code. I have some suggestions for the for loops. I hope I got them right...
These work after you get the String represented by charOutput, using a method such as the one suggested by ZouZou.
Your first block appears to remove all repeated letters. You can use a regular expression for that:
Pattern removeRepeats = Pattern.compile("(.)\\1{1,}");
// "(.)" matches any character and captures it in a group
// "\\1" gets converted to "\1", which is a reference to the first group, i.e. the character that "(.)" matched
// "{1,}" means "one or more"
// So the overall effect is "a character followed by one or more copies of itself"
To use:
removeRepeats.matcher(s).replaceAll("$1");
// This creates a Matcher that applies removeRepeats to the contents of s and replaces
// every matching run with "$1", a reference to the first captured group
// (i.e. "(.)", the first character of the run)
To remove the question marks, just do
Pattern removeQuestionMarks = Pattern.compile("\\?");
// Because "?" is a special symbol in regex, you have to escape it with a backslash
// But since the backslash is also a special symbol in Java string literals, you have to escape the backslash too.
And then to use it, do the same thing as above, except with replaceAll("");
And you're done!
If you really wanted to, you could do the whole cleanup, including the array formatting, with three regular expressions:
Pattern p0 = Pattern.compile("(\\[|\\]|\\,| )"); // removes brackets, commas, and spaces
Pattern p1 = Pattern.compile("(.)\\1{1,}"); // removes duplicate characters
Pattern p2 = Pattern.compile("\\?"); // removes question marks
String removeArrayCharacters = p0.matcher(charOutput.toString()).replaceAll("");
String removeDuplicates = p1.matcher(removeArrayCharacters).replaceAll("$1");
return p2.matcher(removeDuplicates).replaceAll("");

Use a StringBuilder and append each character you want; at the end, just return myBuilder.toString().
Instead of this:
String firstOne = Arrays.toString(charOutput.toArray());
String secondOne = firstOne.replaceAll(",","");
String thirdOne = secondOne.substring(1, secondOne.length() - 1);
String output = thirdOne.replaceAll(" ","");
return output;
Simply do:
StringBuilder sb = new StringBuilder();
for (Character c : charOutput) {
    sb.append(c);
}
return sb.toString();
Note that you are doing a lot of unnecessary work (by iterating through the list and removing some elements). What you can actually do is iterate just once and, if a character meets your requirements (it differs from the adjacent character and is not a question mark), append it to the StringBuilder directly.
This task could also be a job for a regular expression.
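The single-pass idea could look roughly like the sketch below. It assumes charOutput is a List<Character> with no null entries; the class and method names (CleanOutput, clean) are made up. It remembers the previous character instead of removing anything from the list.

```java
import java.util.Arrays;
import java.util.List;

class CleanOutput {
    // Collapses each run of equal adjacent characters to one and drops '?',
    // in a single pass and without modifying the list.
    static String clean(List<Character> charOutput) {
        StringBuilder sb = new StringBuilder();
        Character previous = null;
        for (Character c : charOutput) {
            boolean repeat = c.equals(previous); // false for the first element
            previous = c;
            if (!repeat && c != '?') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Character> raw = Arrays.asList('a', 'a', '?', '?', 'b', 'b', 'a');
        System.out.println(clean(raw)); // prints aba
    }
}
```

Because the comparison is made against the previous raw character (question marks included), the result should match removing duplicates first and question marks second, as in the original two loops.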

If you don't want to use regex, try this version to remove consecutive characters and '?':
StringBuilder sb = new StringBuilder();
int size = charOutput.size();
for (int i = 0; i < size; i++) {
    Character temp = (Character) charOutput.get(i);
    // keep only the last character of each run of repeats, and never '?'
    boolean lastOfRun = (i == size - 1) || !temp.equals(charOutput.get(i + 1));
    if (lastOfRun && !temp.equals('?'))
        sb.append(temp);
}
return sb.toString();

Parsing comma delimited text in Java

If I have an ArrayList that has lines of data that could look like:
bob, jones, 123-333-1111
james, lee, 234-333-2222
How do I delete the extra whitespace and get the same data back? I thought you could maybe split the string by "," and then use trim(), but I didn't know what the syntax of that would be or how to implement it, assuming that is an OK way to do it, because I'd want to put each field in an array. So in this case I'd have a [2][3] array, and then put it back in the ArrayList after removing the whitespace. But that seems like a funny way to do it, and not scalable if my list changed, like having an email on the end. Any thoughts? Thanks.
Edit:
Dumber question, so I'm still not sure how I can process the data, because I can't do this, right?
for (String s : myList) {
    String st[] = s.split(",\\s*");
}
since st[] will go out of scope after the foreach loop. And if I declare String st[] beforehand, I wouldn't know how big to create my array, right? Thanks.

You could just scan through the entire string and build a new string, skipping any whitespace that occurs after a comma. This would be more efficient than splitting and rejoining. Something like this should work:
String str = /* your original string from the array */;
StringBuilder sb = new StringBuilder();
boolean skip = true;
for (int i = 0; i < str.length(); i++) {
    char ch = str.charAt(i);
    if (skip && Character.isWhitespace(ch))
        continue;
    sb.append(ch);
    if (ch == ',')
        skip = true;
    else
        skip = false;
}
String result = sb.toString();

If you use a regex for your split, you can specify a comma followed by optional whitespace (\s matches spaces and also tabs, just in case).
String[] fields = mystring.split(",\\s*");
Depending on whether you want to parse each line separately or not, you may first want to split the text into an array of lines:
String[] lines = mystring.split("\\n");

Just split() on each line with the delimiter set as ',' to get an array of Strings with the extra whitespace, and then use the trim() method on the elements of the String array, perhaps as they are being used or in advance. Remember that the trim() method gives you back a new string object (a String object is immutable).
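A rough sketch of split() plus trim() that also gets around the scoping worry from the question's edit: instead of sizing a two-dimensional array up front, each line's String[] goes into a List that grows as needed. The names CsvLines and parse are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvLines {
    // One String[] per input line. Neither the number of lines nor the number
    // of fields per line has to be known in advance.
    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim(); // trim() returns a new String
            }
            rows.add(fields); // stored outside the loop, so it stays reachable
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("bob, jones, 123-333-1111",
                                            "james, lee, 234-333-2222");
        for (String[] row : parse(myList)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

An extra field at the end of a line, such as an email address, just makes that row's array one element longer.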

If I understood your problem, here is a solution:
ArrayList<String> tmp = new ArrayList<String>();
tmp.add("bob, jones, 123-333-1111");
tmp.add(" james, lee, 234-333-2222");
ArrayList<String> fixedStrings = new ArrayList<String>();
for (String i : tmp) {
    System.out.println(i);
    String[] data = i.split(",");
    String result = "";
    for (int j = 0; j < data.length - 1; ++j) {
        result += data[j].trim() + ", ";
    }
    result += data[data.length - 1].trim();
    fixedStrings.add(result);
}
System.out.println(fixedStrings.get(0));
System.out.println(fixedStrings.get(1));
I guess it could be changed so that it doesn't create a second ArrayList. But it's scalable, so if you get lines in the future like: "bob, jones , bobjones#gmail.com , 123-333-1111 " it will still work.

I've had a lot of success using this library.

Could be a bit more elegant, but it works...
ArrayList<String> strings = new ArrayList<String>();
strings.add("bob, jones, 123-333-1111");
strings.add("james, lee, 234-333-2222");
for (int i = 0; i < strings.size(); i++) {
    StringBuilder builder = new StringBuilder();
    for (String str : strings.get(i).split(",\\s*")) {
        builder.append(str).append(" ");
    }
    strings.set(i, builder.toString().trim());
}
System.out.println("strings = " + strings);

I would look into:
http://download.oracle.com/docs/cd/E17476_01/javase/1.4.2/docs/api/java/lang/String.html#split(java.lang.String)
or
http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/api/java/util/Scanner.html

You can use the String.split() method in Java, or you can use the split() method from the Google Guava library's Splitter class, as shown below:
static final Splitter MY_SPLITTER = Splitter.on(',')
        .trimResults()
        .omitEmptyStrings();

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
