Replacing substrings in a string in Java

I'm trying to replace multiple substrings in a string. For example, I have the following string wordlist:
one two three
I want to replace the \t tab characters with \r\n new line characters.
I define the separator variable as \n and the replacement variable as \r\n.
Then I use wordlist = wordlist.replaceAll(separator, replacement); to replace all the characters, but when I display the wordlist again, it gives me the following result:
onerntwornthree
I also tried splitting the wordlist by the substring separator into an array and then joining it again word by word into a new string separated by the replacement, but then it just gave me a result as
one\r\ntwo\r\nthree
Does anybody know how to solve this problem? In case you need it, here's the whole code:
System.out.print("Separator to replace: ");
separator = scanner.next( );
System.out.print("Replacement for separator: ");
replacement = scanner.next( );
wordlist = wordlist.replaceAll(separator, replacement);

Your input character for tab seems to be incorrect.
This code:
String wordlist = "one\ttwo\tthree";
wordlist = wordlist.replaceAll("\t", "\r\n");
System.out.println(wordlist);
gives this output:
one
two
three

What you probably want to do is split the string and then write the different lines one at a time to a PrintStream. That way you can use println.
Java is a platform independent language, and new lines are platform dependent. Making use of PrintStream.println will make sure your code is portable.

Why do you set the separator to \n? It should be \t, I assume.
The following code works fine on JDoodle:
String s = "one\ttwo\tthree";
s = s.replaceAll("\t","\r\n");
System.out.println(s);
EDIT
The reason this doesn't work is that you ask the user for the separator, and when they type \t, you get a two-character string (a backslash followed by the letter t), not a tab. Escape sequences are only interpreted by the compiler in string literals, not in text read at runtime.
You should run the input through StringEscapeUtils.unescapeJava first.
Thus:
Scanner sc = new Scanner(System.in);
String separator = sc.nextLine();
separator = StringEscapeUtils.unescapeJava(separator);
String s = "one\ttwo\tthree";
s = s.replaceAll(separator,"\r\n");
System.out.println(s);
If org.apache.commons.lang.StringEscapeUtils is not available, you can do this explicitly:
Scanner sc = new Scanner(System.in);
String separator = sc.nextLine();
separator = separator.replace("\\t", "\t"); // literal replace; as a regex, "\\t" would match a tab, not backslash + t
String s = "one\ttwo\tthree";
s = s.replaceAll(separator,"\r\n");
System.out.println(s);
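For what it's worth, the question's output onerntwornthree can be reproduced without a Scanner. As a regex, the typed text \t still matches a tab, but in the replacement string of replaceAll a backslash just escapes the next character, so the typed \r\n collapses to rn. The following is a minimal sketch, not a definitive fix: the class SeparatorDemo and the helpers unescape and replaceTyped are hypothetical names, and only the escapes \t, \n, \r and \\ are handled.

```java
class SeparatorDemo {
    // Hypothetical helper: turns the two-character sequences \t, \n, \r and \\
    // typed by a user into the single characters they stand for.
    static String unescape(String typed) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < typed.length(); i++) {
            char c = typed.charAt(i);
            if (c == '\\' && i + 1 < typed.length()) {
                char next = typed.charAt(++i);
                switch (next) {
                    case 't': out.append('\t'); break;
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case '\\': out.append('\\'); break;
                    default: out.append('\\').append(next); // unknown escape: keep as typed
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Literal (non-regex) replacement after unescaping both user inputs.
    static String replaceTyped(String text, String typedSeparator, String typedReplacement) {
        return text.replace(unescape(typedSeparator), unescape(typedReplacement));
    }

    public static void main(String[] args) {
        String wordlist = "one\ttwo\tthree";
        // Simulates the question: the user typed \t and \r\n at the prompts.
        System.out.println(wordlist.replaceAll("\\t", "\\r\\n")); // prints onerntwornthree
        // Unescape first, then replace literally: each word ends up on its own line.
        System.out.println(replaceTyped(wordlist, "\\t", "\\r\\n"));
    }
}
```

Using String.replace instead of replaceAll here means neither input is interpreted as a regex, so a user who types a dot or a dollar sign does not get surprising results.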

Related

string.replaceAll("\\n","") not working if string is taken as input from console and contains newline character

Case 1: Taking string input from scanner and replacing \n with -- (Not working)
Scanner sc = new Scanner(System.in);
String str = sc.nextLine();
str = str.replaceAll("\n", "--");
System.out.println(str);
input: "UY9Q3HGjqYE1aHNIG+Rju2hS3WAAEFlakOSGZWffabFpWkeQ\nz4g6mfKoGVR2\nF1QkiHRMZfL4mCvChAuL7gCT3d3SrmxD6lBnOiWiFTPUz4Q=\n"
Case 2: The same thing works if I directly assign the string with the same value as above.
String str = "UY9Q3HGjqYE1aHNIG+Rju2hS3WAAEFlakOSGZWffabFpWkeQ\nz4g6mfKoGVR2\nF1QkiHRMZfL4mCvChAuL7gCT3d3SrmxD6lBnOiWiFTPUz4Q=\n";
str = str.replaceAll("\n", "--");
PS: I have already tried using \n and line.separator.
String str = "Input\\nwith backslash and n";
str = str.replaceAll("\\\\n", "--");
System.out.println(str);
Output:
Input--with backslash and n
We need to escape the backslash twice: To tell the regular expression that a literal backslash is intended we need to put two backslashes. And to tell the Java compiler that we intend literal backslashes in the string, each of those two needs to be entered as two backslashes. So we end up typing four of them.
nextLine() reads one line, so the line cannot contain a newline character. So I have assumed that you were entering a backslash and an n as part of your input.
Less confusing solution
We don’t need to use any regular expression here, and doing that complicates the escaping business. So don’t.
String str = "Input\\nwith backslash\\nand n\\n";
str = str.replace("\\n", "--");
System.out.println(str);
Input--with backslash--and n--
The replace method replaces all occurrences of the literal string given (in spite of not having All in the method name). So now we only need one escape, the one for the Java compiler.
In a regular expression, a single backslash “\” is the escape character, so on its own it is an error. If you pass the Java literal "\\" (which the regex engine sees as one lone backslash), it throws a “java.util.regex.PatternSyntaxException: Unexpected internal error near index” exception.
Within a regular expression, a double backslash “\\” matches a single literal backslash, and each of those two must be doubled again in a Java string literal. So four backslashes “\\\\” are needed in the source code to match a single backslash in a String.
Please try replacing "\n" with "\\\\n" in the pattern:
Scanner sc = new Scanner(System.in);
String str = sc.nextLine();
str = str.replaceAll("\\\\n", "--");
System.out.println(str);
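To compare the options side by side, here is a small sketch under the same assumption as above, namely that the console input contains a backslash followed by the letter n rather than a real newline. The class and method names (BackslashNDemo, viaRegex, viaLiteral, viaQuote) are made up for illustration; Pattern.quote is the standard way to turn any literal text into a regex that matches it.

```java
import java.util.regex.Pattern;

class BackslashNDemo {
    // Regex version: four backslashes in source = one literal backslash in the pattern.
    static String viaRegex(String s) {
        return s.replaceAll("\\\\n", "--");
    }

    // Literal version: String.replace does not use regular expressions.
    static String viaLiteral(String s) {
        return s.replace("\\n", "--");
    }

    // Regex version with the literal text quoted for us.
    static String viaQuote(String s) {
        return s.replaceAll(Pattern.quote("\\n"), "--");
    }

    public static void main(String[] args) {
        String typed = "line1\\nline2\\n"; // what nextLine() returns when the user types \n
        System.out.println(viaRegex(typed));   // prints line1--line2--
        System.out.println(viaLiteral(typed)); // prints line1--line2--
        System.out.println(viaQuote(typed));   // prints line1--line2--
    }
}
```

None of the three touches a real newline character, which is why the same call behaves differently on a string literal containing "\n" and on text typed at the console.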

How to use split function when input is new line?

The task is to split the string and print how many words it contains.
Scanner in = new Scanner(System.in);
String st = in.nextLine();
String[] tokens = st.split("[\\W]+");
When I gave an empty line as input and printed the number of tokens, I got one, but I want zero. What should I do? Here the delimiters are all the symbols.
Short answer: To get the tokens in str (determined by whitespace separators), you can do the following:
String str = ... //some string
str = str.trim() + " "; //modify the string for the reasons described below
String[] tokens = str.split("\\s+");
Longer answer:
First of all, the argument to split() is the delimiter - in this case one or more whitespace characters, which is "\\s+".
If you look carefully at the Javadoc of String#split(String, int) (which is what String#split(String) calls), you will see why it behaves like this.
If the expression does not match any part of the input then the resulting array has just one element, namely this string.
This is why "".split("\\s+") would return an array with one empty string [""], so you need to append the space to avoid this. " ".split("\\s+") returns an empty array with 0 elements, as you want.
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.
This is why " a".split("\\s+") would return ["", "a"], so you need to trim() the string first to remove whitespace from the beginning.
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
Since String#split(String) calls String#split(String, int) with the limit argument of zero, you can add whitespace to the end of the string without changing the number of words (because trailing empty strings will be discarded).
UPDATE:
If the delimiter is "\\W+", it's slightly different because you can't use trim() for that:
String str = ...
str = str.replaceAll("^\\W+", "") + " ";
String[] tokens = str.split("\\W+");
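The split() behaviour quoted above can be checked directly, and an explicit empty check is an alternative to appending a space. This is only a sketch: WordCountDemo, countWords and countWordTokens are hypothetical names, and the empty check is a swapped-in technique rather than the trick described in the answer.

```java
class WordCountDemo {
    // Counts whitespace-separated words; an empty or blank line yields 0.
    static int countWords(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Same idea for the "\\W+" delimiter: strip leading non-word characters first.
    static int countWordTokens(String line) {
        String stripped = line.replaceAll("^\\W+", "");
        return stripped.isEmpty() ? 0 : stripped.split("\\W+").length;
    }

    public static void main(String[] args) {
        System.out.println("".split("\\s+").length);   // prints 1: no match, so the whole (empty) string is returned
        System.out.println(" ".split("\\s+").length);  // prints 0: only empty strings, all discarded
        System.out.println(" a".split("\\s+").length); // prints 2: empty leading substring plus "a"
        System.out.println(countWords(""));                      // prints 0
        System.out.println(countWords("  one two  three "));     // prints 3
        System.out.println(countWordTokens("...hello, world!")); // prints 2
    }
}
```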
Alternatively, read lines in a loop and treat an empty line as the end of input:
public static void main(String[] args) {
    Scanner in = new Scanner(System.in);
    String line = null;
    while (!(line = in.nextLine()).isEmpty()) {
        // logic
    }
    System.out.print("Empty Line");
}
Output:
Empty Line

java string split based on new line

I have following string
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
I want to break it on \n so that at the end I have two strings, aaaaaaaaa and bbbbbbbbbbb. I don't want the last one, as it contains only white space. So if I split it on the newline character using str.split(), the final array should have only two entries.
I tried below:
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n".replaceAll("\\s+", " ");
String[] split = str.split("\n+");
It ignores all the \n characters and gives a single string, aaaaaaaaa bbbbbbbbbbb.
Delete the call to replaceAll(), which is removing the newlines too. Just this will do:
String[] split = str.split("\n\\s*");
This will not split on just spaces - the split must start at a newline (followed by optional further whitespace).
Here's some test code using your sample input with edge case enhancement:
String str = "aaaaaaaaa\nbbbbbb bbbbb\n \n";
String[] split = str.split("\n\\s*");
System.out.println(Arrays.toString(split));
Output:
[aaaaaaaaa, bbbbbb bbbbb]
This should do the trick:
String str="aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
String[] lines = str.split("\\s*\n\\s*");
It will also remove all trailing and leading whitespace from all lines.
The \ns are removed by your first statement: \s matches \n
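Putting the two suggested patterns together, a small helper could look like the sketch below. LineSplitDemo and nonBlankLines are hypothetical names. The trim() call is an addition of mine: without it, whitespace before the first line's text would survive the split, because the pattern only removes whitespace adjacent to a newline.

```java
import java.util.Arrays;

class LineSplitDemo {
    // Splits text into lines, dropping blank lines and surrounding whitespace.
    static String[] nonBlankLines(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // "".split(...) would return one empty string
        }
        // Each split consumes a newline plus any whitespace around it,
        // so runs of blank or whitespace-only lines collapse into one separator.
        return trimmed.split("\\s*\n\\s*");
    }

    public static void main(String[] args) {
        String str = "aaaaaaaaa\n\n\nbbbbbbbbbbb\n \n";
        System.out.println(Arrays.toString(nonBlankLines(str))); // prints [aaaaaaaaa, bbbbbbbbbbb]
    }
}
```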

Convert a string to an array of strings

If I have:
Scanner input = new Scanner(System.in);
System.out.println("Enter an infixed expression:");
String expression = input.nextLine();
String[] tokens;
How do I scan the infix expression around spaces one token at a time, from left to right, and put it into an array of strings? Here a token is defined as an operand, operator, or parenthesis symbol.
Example: "3 + (9-2)" ==> tokens = [3][+][(][9][-][2][)]
String test = "13 + (9-2)";
List<String> allMatches = new ArrayList<String>();
Matcher m = Pattern.compile("\\d+|\\(|\\)|\\+|\\*|-|/")
.matcher(test);
while (m.find()) {
allMatches.add(m.group());
}
Can someone test this please?
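Here is a self-contained version of the same Matcher approach that can be compiled and run as is. It is a sketch, not a full lexer: the class name InfixTokenizer and the method tokenize are made up, the operators are folded into one character class, and any character the pattern does not recognise (letters, decimal points) is silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InfixTokenizer {
    // A token is either a run of digits or a single operator/parenthesis.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[-+*/()]");

    static List<String> tokenize(String expression) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(expression);
        while (m.find()) {
            tokens.add(m.group()); // spaces never match, so they are skipped
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("13 + (9-2)")); // prints [13, +, (, 9, -, 2, )]
        // If an array is required:
        String[] asArray = tokenize("3 + (9-2)").toArray(new String[0]);
        System.out.println(asArray.length); // prints 7
    }
}
```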
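I think it would be easiest to read the line into one string, and then split it on spaces. There is a handy String method, split, that does this for you. Note that this only separates tokens that have a space between them, so "(9-2)" stays a single token.
String[] tokens = expression.split(" ");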
It's probably overkill for your example, but in case it gets more complex, take a look at JavaCC, the Java Compiler Compiler. JavaCC allows you to create a parser in Java based on a grammar definition.
Be aware that it is not an easy tool to get started with. However, the grammar definition will be much easier to read than the corresponding regular expressions.
If tokens[] must be a String array, you can use this:
String ex="3 + (9-2)";
String tokens[];
StringTokenizer tok=new StringTokenizer(ex);
String line="";
while(tok.hasMoreTokens())line+=tok.nextToken();
tokens=new String[line.length()];
for(int i=1;i<line.length()+1;i++)tokens[i-1]=line.substring(i-1,i);
tokens can also be a char array:
String ex="3 + (9-2)";
char tokens[];
StringTokenizer tok=new StringTokenizer(ex);
String line="";
while(tok.hasMoreTokens())line+=tok.nextToken();
tokens=line.toCharArray();
This (IMHO elegant) single line of code works (tested):
String[] tokens = input.split("(?<=[^ ])(?<!\\B) *");
This regex also caters for input containing multi-digit numbers (e.g. 123), which would be split into separate characters were it not for the negative look-behind for a non-word boundary (?<!\\B).
The first look-behind (?<=[^ ]) prevents an initial blank string split at the start of the input.
The final part of the regex " *" assures spaces are consumed.

Replace new line/return with space using regex

Pretty basic question for someone who knows.
Instead of getting from
"This is my text.
And here is a new line"
To:
"This is my text. And here is a new line"
I get:
"This is my text.And here is a new line"
Any idea why?
L.replaceAll("[\\\t|\\\n|\\\r]","\\\s");
I think I found the culprit.
On the next line I do the following:
L.replaceAll( "[^a-zA-Z0-9|^!|^?|^.|^\\s]", "");
And this seems to be causing my issue.
Any idea why?
I am obviously trying to do the following: remove all non-chars, and remove all new lines.
\s is a shorthand for whitespace characters in a regex. It has no meaning in a replacement string, so you can't use it there. The replacement must contain exactly the character(s) that you want to insert. If that is a space, just use " " as the replacement.
The other thing is: why do you use three backslashes as the escape sequence? Two are enough in Java. And you don't need | (the alternation operator) in a character class; inside [...] it is just a literal | character.
L.replaceAll("[\\t\\n\\r]+"," ");
Remark
L is not changed. If you want to have a result you need to do
String result = L.replaceAll("[\\t\\n\\r]+"," ");
Test code:
String in = "This is my text.\n\nAnd here is a new line";
System.out.println(in);
String out = in.replaceAll("[\\t\\n\\r]+"," ");
System.out.println(out);
The new line separator is different for different OS-es - '\r\n' for Windows and '\n' for Linux.
To be safe, you can use regex pattern \R - the linebreak matcher introduced with Java 8:
String inlinedText = text.replaceAll("\\R", " ");
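A quick check of the \R approach, as a sketch: InlineDemo and inline are hypothetical names, and the + quantifier is my addition so that a run of consecutive line breaks becomes a single space. \R requires Java 8 or later.

```java
class InlineDemo {
    // Replaces every run of line breaks (\n, \r\n, \r, ...) with one space.
    // Strings are immutable, so the result has to be returned or assigned.
    static String inline(String text) {
        return text.replaceAll("\\R+", " ");
    }

    public static void main(String[] args) {
        String in = "This is my text.\n\nAnd here is a new line";
        System.out.println(inline(in)); // prints This is my text. And here is a new line
        System.out.println(inline("Windows\r\nline break")); // prints Windows line break
    }
}
```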
Try
L.replaceAll("(\\t|\\r?\\n)+", " ");
Depending on the system, a line break is either \r\n or just \n.
I found this.
String newString = string.replaceAll("\n", " ");
Although, as you have a double line, you will get a double space. I guess you could then do another replace all to replace double spaces with a single one.
If that doesn't work try doing:
string.replaceAll(System.getProperty("line.separator"), " ");
If I create lines in "string" by using "\n" I had to use "\n" in the regex. If I used System.getProperty() I had to use that.
Your regex is good, although I would replace the matches with the empty string:
String resultString = subjectString.replaceAll("[\t\n\r]", "");
You expect a space between "text." and "And" right?
I get that space when I try the regex by copying your sample
"This is my text. "
So all is well here. Maybe if you just replace it with the empty string it will work. I don't know why you replace it with \s. And the alternation | is not necessary in a character class.
You may also split the string first and then rejoin the pieces with a space:
String[] Larray = L.split("[\\n]+");
L = String.join(" ", Larray); // Java 8+; a loop doing L = L + " " + Larray[i] would add a leading space
This should take care of space, tab and newline:
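data = data.replaceAll("[ \t\n\r]+", " "); // use +, not *: with * the pattern also matches the empty string and inserts a space between every character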
