Java Regex from beginning to first char - java

How can I find any word from beginning of string to first char "~" using java?
Example:
Worddjjfdskfjsdkfjdsj ~ Word ~ Word
I want it to capture
Worddjjfdskfjsdkfjdsj

You can also do it without regex in a very simple way.
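First, use the String method indexOf() to find the index of the "~" character. Then use substring() to extract the part you are looking for. Note that indexOf() returns -1 when the character is absent, in which case substring(0, -1) throws a StringIndexOutOfBoundsException.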
Here is an example:
String stringToProcess = "hello~world";
int charIndex = stringToProcess.indexOf('~');
String finalString = stringToProcess.substring(0, charIndex);

You can use this regex to capture all characters from the start of the string (^) up to the first occurrence of ~:
^[^~]*
[^~]* is a negated character class with a quantifier: it matches zero or more characters other than ~.
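As a rough sketch of how that pattern could be applied in Java (the class and method names here are made up for illustration, and the trim() call is an extra assumption to drop the space before the tilde in the question's example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class PrefixBeforeTilde {
    // Matches everything from the start of the input up to, but not including, the first '~'.
    private static final Pattern PREFIX = Pattern.compile("^[^~]*");

    // Returns the text before the first '~', with surrounding whitespace trimmed.
    // If the input has no '~', the whole (trimmed) input is returned.
    static String prefixBeforeTilde(String input) {
        Matcher m = PREFIX.matcher(input);
        // The pattern can match the empty string, so find() succeeds for any input.
        return m.find() ? m.group().trim() : "";
    }

    public static void main(String[] args) {
        System.out.println(prefixBeforeTilde("Worddjjfdskfjsdkfjdsj ~ Word ~ Word"));
    }
}
```

Without the trim() call the result would keep the trailing space that precedes the first ~ in the sample input.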

Without regex it can be solved
Simply split your string by ~.
String str[] = "Worddjjfdskfjsdkfjdsj ~ Word ~ Word".split("~");
System.out.println(str[0]);

Here is a regular expression that you can use: ^(.*?)~.
However in your simple case you do not need regular expressions at all. Use indexOf() and substring():
int tilda = str.indexOf('~');
if (tilda >= 0) {
word = str.substring(0, tilda);
}

Related

Java- How to find non-alphabetical letters in a string? (Quick way)

In Java, given a string, like "abc#df" where the character '#' could be ANY other non-letter, like '%', '^', '&', etc. What would be the most efficient way to find that index? I know that a for loop would be kind of quick (depending on the string length), but what about any other quicker methods? A method that finds all index(es) of non-alphabetical letters or the closest one to a given index (like indexOf(string, startingIdx))
Thanks!
Use a for loop; the Character class can tell you whether each character is a letter (or another type). See: https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isAlphabetic(int)
You should probably use a regular expression:
Pattern patt = Pattern.compile("[^A-Za-z]");
Matcher mat = patt.matcher("avc#dgh");
boolean found = mat.find();
System.out.println(found ? mat.start() : -1);
You could use regex to split the string on anything that is not alphabetic:
String str = "abc#df";
String[] split = str.split("[^A-Za-z]");
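Then you can use the lengths of the strings in that array to find the indexes of the non-alphabetic chars:
int firstIndex = split[0].length();
And so on, adding 1 each time to step over the non-alphabetic char itself (a result equal to str.length() means there is no further non-alphabetic char):
int secondIndex = firstIndex + 1 + split[1].length();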
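For the "all indexes, or the closest one after a given index" part of the question, here is a minimal sketch built on Matcher. The class and method names are hypothetical, and it assumes "letter" means ASCII A-Z and a-z only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class NonLetterIndexes {
    // Assumption: only ASCII letters count as alphabetical.
    private static final Pattern NON_LETTER = Pattern.compile("[^A-Za-z]");

    // Collects the index of every non-letter at or after 'from'.
    // Matcher.find(int) throws IndexOutOfBoundsException if 'from' is
    // negative or greater than the string length.
    static List<Integer> nonLetterIndexes(String s, int from) {
        List<Integer> result = new ArrayList<>();
        Matcher m = NON_LETTER.matcher(s);
        // find(int) resets the matcher and searches from 'from';
        // the following find() calls continue after the previous match.
        boolean found = m.find(from);
        while (found) {
            result.add(m.start());
            found = m.find();
        }
        return result;
    }
}
```

The first element of the returned list, if any, is the closest non-letter at or after the starting index.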

How to search for substrings

I'm looking for patterns like "tip" and "top" in the string -- length-3, starting with 't' and ending with 'p'. The goal is to return a string where for all such words, the middle letter is gone. So for example, "tipXtap" yields "tpXtp".
So far, I've thought about using recursion, and the replace() method, but am not sure if that is the best way to approach this problem.
Here is my code thus far:
String result = "";
if(str.length() < 3)
return str;
for(int i = 0; i <= str.length() - 2; i++){
if(str.charAt(i) == 't' && str.charAt(i + 2) == 'p'){
str.replaceAll(str.substring(i + 1, i + 2), "");
}
return str;
}
return str;
Use this Java code:
String str = "tipXtap";
str = str.replaceAll("t.p", "tp");
This uses regular expressions and the String.replaceAll function. The . (dot) character is a regex metacharacter that matches any single character.
One way of doing this:
Convert the String to a char array.
Loop over the char array and, at each position, check whether the current char is 't' and the char two positions later is 'p'.
If the condition is true, remove the middle element. You will have to shift the remaining elements of the char array to close the gap.
Convert the char array back to a String and return it.
Hope this helps.
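The steps above might be sketched roughly like this. To keep it short, this version copies characters into a StringBuilder instead of shifting elements inside a char array, which is a swapped-in simplification; the class and method names are invented for the example:

```java
// Hypothetical helper class, not part of any library.
class MiddleLetterRemover {
    // Removes the middle character of every length-3 run that starts
    // with 't' and ends with 'p', scanning left to right.
    static String removeMiddle(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        int i = 0;
        while (i < str.length()) {
            boolean isPattern = i + 2 < str.length()
                    && str.charAt(i) == 't'
                    && str.charAt(i + 2) == 'p';
            if (isPattern) {
                // Keep the 't' and the 'p', skip the character between them.
                sb.append("tp");
                i += 3;
            } else {
                sb.append(str.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }
}
```

Skipping ahead by three after a hit means matches do not overlap, which is the same left-to-right behaviour as replaceAll("t.p", "tp").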
Here's a JavaScript solution to this problem using regular expressions:
foo = 'tipXtop'
foo.replace(/t\wp/g, 'tp')
The \w regex operator matches a word character like a-z, A-Z, 0-9 or _.
The g regex flag will match all instances of the regex in the string.

String replace method issue in java

My problem is to replace only the last occurrence of a character in the string with another character. When I used the String.replace(char1, char2), it replaces all the occurrences of the character in the string.
For example, I have an address string like
String str = "Addressline1,Addressline2,City,State,Country,";
I need to replace the occurrence of ',' at the end of the string with '.'.
My code to replace the character is
str = str.replace(str.charAt(str.lastIndexOf(",")),'.');
After replacing, the string looks like:
Addressline1.Addressline2.City.State.Country.
Is this a problem in the Java SDK? If so, how do I resolve it?
You should use String.replaceAll, which takes a regex:
str = str.replaceAll(",$", ".");
The $ anchors the match at the end of the String.
The Java replace function has a method declaration of:
public String replace(char oldChar, char newChar)
According to the docs replace will:
Return a new string resulting from replacing all occurrences of oldChar in this string with newChar.
So your code:
str.charAt(str.lastIndexOf(","))
Will clearly return the character ,. replace will then replace all instances of the oldChar , with the newChar .. This explains the behavior you were seeing.
The solution that @ScaryWombat beat me to is your best option:
str = str.replaceAll(",$", ".");
Since, in regular expression terms, $ denotes the end of a String.
Hope this helps!
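Note that ",$" only helps when the comma is the very last character. If the goal is to replace the last occurrence wherever it sits in the string, a plain lastIndexOf plus substring approach is one option. This is only a sketch, and replaceLast is a made-up name, not a JDK method:

```java
// Hypothetical helper class, not part of any library.
class ReplaceLast {
    // Replaces only the last occurrence of oldChar with newChar.
    // Returns the input unchanged when oldChar does not occur.
    static String replaceLast(String s, char oldChar, char newChar) {
        int idx = s.lastIndexOf(oldChar);
        if (idx < 0) {
            return s;
        }
        // Everything before the last occurrence, the new char, then the rest.
        return s.substring(0, idx) + newChar + s.substring(idx + 1);
    }
}
```

For the address example, replaceLast(str, ',', '.') changes only the trailing comma and leaves the other separators alone.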

Use regex to replace sequences in a string with modified characters

I am trying to solve a codingbat problem using regular expressions whether it works on the website or not.
So far, I have the following code which does not add a * between the two consecutive equal characters. Instead, it just bulldozes over them and replaces them with a set string.
public String pairStar(String str) {
Pattern pattern = Pattern.compile("([a-z])\\1", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(str);
if(matcher.find())
matcher.replaceAll(str);//this is where I don't know what to do
return str;
}
I want to know how I could keep using regex and replace the whole string. If needed, I think a recursive system could help.
This works:
while(str.matches(".*(.)\\1.*")) {
str = str.replaceAll("(.)\\1", "$1*$1");
}
return str;
Explanation of the regex:
The search regex (.)\\1:
(.) means "any character" (the .) and the brackets create a group - group 1 (the first left bracket)
\\1, which in regex is \1 (a java literal String must escape a backslash with another backslash) means "the first group" - this kind of term is called a "back reference"
So together (.)\1 means "any repeated character"
The replacement regex $1*$1:
The $1 term means "the content captured as group 1"
Recursive solution:
Technically, the solution called for on that site is a recursive solution, so here is recursive implementation:
public String pairStar(String str) {
if (!str.matches(".*(.)\\1.*")) return str;
return pairStar(str.replaceAll("(.)\\1", "$1*$1"));
}
FWIW, here's a non-recursive solution:
public String pairStar(String str) {
int len = str.length();
StringBuilder sb = new StringBuilder(len*2);
char last = '\0';
for (int i=0; i < len; ++i) {
char c = str.charAt(i);
if (c == last) sb.append('*');
sb.append(c);
last = c;
}
return sb.toString();
}
I don't know Java, but I believe it has a replace function for strings that works with regular expressions. Your match string would be
([a-z])\\1
And the replace string would be
$1*$1
After some searching I think you are looking for this,
str.replaceAll("([a-z])\\1", "$1*$1").replaceAll("([a-z])\\1", "$1*$1");
These are my own solutions.
Recursive solution (which is probably more or less the solution that the problem is designed for)
public String pairStar(String str) {
if (str.length() <= 1) return str;
else return str.charAt(0) +
(str.charAt(0) == str.charAt(1) ? "*" : "") +
pairStar(str.substring(1));
}
If you want to complain about substring, then you can write a helper function pairStar(String str, int index) which does the actual recursion work.
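One possible shape for that helper, as a sketch only (the class name is invented, and the exact base case is my assumption about how the index-based version would be written):

```java
// Hypothetical helper class, not part of any library.
class PairStarIndexed {
    static String pairStar(String str) {
        return pairStar(str, 0);
    }

    // Does the recursion by index, so the input is not re-sliced on every call.
    private static String pairStar(String str, int index) {
        // Base case: zero or one character left, nothing more to compare.
        if (index >= str.length() - 1) {
            return str.substring(index);
        }
        return str.charAt(index)
                + (str.charAt(index) == str.charAt(index + 1) ? "*" : "")
                + pairStar(str, index + 1);
    }
}
```

The recursion depth still grows with the length of the input, so a very long string could overflow the stack in either recursive version.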
Regex one-liner one-function-call solution
public String pairStar(String str) {
return str.replaceAll("(.)(?=\\1)", "$1*");
}
Both solutions have the same spirit. They both check whether the current character is the same as the next character. If they are the same, a * is inserted between the 2 identical characters. Then we move on to check the next character. This produces the expected output a*a*a*a from input aaaa.
The normal regex solution of "(.)\\1" has a problem: it consumes 2 characters per match. As a result, we fail to compare the 2nd character with the character after it. The look-ahead resolves this problem: it compares against the next character without consuming it.
This is similar to the recursive solution, where we compare the next character with str.charAt(0) == str.charAt(1), while calling the function recursively on the substring with only the current character removed, pairStar(str.substring(1)).

Remove end of line characters from end of Java String

I have a string which I'd like to remove the end of line characters from the very end of the string only using Java
"foo\r\nbar\r\nhello\r\nworld\r\n"
which I'd like to become
"foo\r\nbar\r\nhello\r\nworld"
(This question is similar to, but not the same as question 593671)
You can use s = s.replaceAll("[\r\n]+$", "");. This trims the \r and \n characters at the end of the string
The regex is explained as follows:
[\r\n] is a character class containing \r and \n
+ is one-or-more repetition of the preceding element
$ is the end-of-string anchor
References
regular-expressions.info/Anchors, Character Class, Repetition
Related topics
You can also use String.trim() to trim any whitespace characters from the beginning and end of the string:
s = s.trim();
If you need to check if a String contains nothing but whitespace characters, you can check if it isEmpty() after trim():
if (s.trim().isEmpty()) {
//...
}
Alternatively you can also see if it matches("\\s*"), i.e. zero-or-more of whitespace characters. Note that in Java, the regex matches tries to match the whole string. In flavors that can match a substring, you need to anchor the pattern, so it's ^\s*$.
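To make the whole-string versus substring difference concrete, here is a small sketch comparing String.matches with Matcher.find. The class and method names are made up for the example:

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class BlankCheck {
    private static final Pattern ANCHORED_BLANK = Pattern.compile("^\\s*$");
    private static final Pattern UNANCHORED = Pattern.compile("\\s*");

    // String.matches must match the entire string, so no anchors are needed.
    static boolean isBlankWithMatches(String s) {
        return s.matches("\\s*");
    }

    // Matcher.find looks for a match anywhere, so the pattern is anchored.
    static boolean isBlankWithFind(String s) {
        return ANCHORED_BLANK.matcher(s).find();
    }

    // Without anchors, find succeeds on any input, because \s* can match
    // the empty string at the very first position.
    static boolean unanchoredFind(String s) {
        return UNANCHORED.matcher(s).find();
    }
}
```

The last method is there only to show why the anchors matter when using find.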
Related questions
regex, check if a line is blank or not
how to replace 2 or more spaces with single space in string and delete leading spaces only
Wouldn't String.trim do the trick here?
i.e you'd call the method .trim() on your string and it should return a copy of that string minus any leading or trailing whitespace.
The Apache Commons Lang StringUtils.stripEnd(String str, String stripChars) will do the trick; e.g.
String trimmed = StringUtils.stripEnd(someString, "\n\r");
If you want to remove all whitespace at the end of the String:
String trimmed = StringUtils.stripEnd(someString, null);
Well, everyone gave some way to do it with regex, so I'll give a faster way instead:
public String replace(String val) {
for (int i=val.length()-1;i>=0;i--) {
char c = val.charAt(i);
if (c != '\n' && c != '\r') {
return val.substring(0, i+1);
}
}
return "";
}
Benchmark says it operates ~45 times faster than regexp solutions.
If you have Google's guava-libraries in your project (if not, you arguably should!) you'd do this with a CharMatcher:
String result = CharMatcher.anyOf("\r\n").trimTrailingFrom(input);
String text = "foo\r\nbar\r\nhello\r\nworld\r\n";
String result = text.replaceAll("[\r\n]+$", "");
"foo\r\nbar\r\nhello\r\nworld\r\n".replaceAll("\\s+$", "")
or
"foo\r\nbar\r\nhello\r\nworld\r\n".replaceAll("[\r\n]+$", "")
