Deleting content of every string after first empty space - java

How can I delete everything after the first space in a string that the user selects? I was reading "how to remove some words from a string in java". Can this help in my case?

You can use replaceAll with the regex \s.*, which matches everything from the first whitespace character onward:
String str = "Hello java word!";
str = str.replaceAll("\\s.*", "");
output
Hello
As Coffeehouse Coder mentions in a comment, this solution removes everything if the input starts with a space. To avoid that, trim the input first with String.trim(), which removes whitespace at the start and at the end.
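The trim-then-replace idea from this answer can be sketched as follows. The class and method names (FirstTokenDemo, beforeFirstWhitespace) are hypothetical and chosen only for illustration.

```java
class FirstTokenDemo {
    // Hypothetical helper: trims the input, then drops everything from the
    // first remaining whitespace character onward.
    static String beforeFirstWhitespace(String input) {
        return input.trim().replaceAll("\\s.*", "");
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstWhitespace("  Hello java word!  ")); // prints "Hello"
    }
}
```

Without the trim() call, the leading space would be the first match and the result would be an empty string.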

Assuming there is no space at the beginning of the string, follow these steps:
Split the string at space. It will create an array.
Get the first element of that array.
Hope this helps.
String str = "Example string";
String[] _arr = str.split("\\s");
String word = _arr[0];
Before using the code above, you need to handle multiple consecutive whitespace characters and whitespace at the beginning of the string.
I am not a native Java programmer, but I know that String has a split function.
The reference you cited in the question is a bit complex, while you can achieve the desired result very easily.
P.S. If you later decide to get the first two or three words, the splitting method is better (assuming you have already dealt with multiple whitespace characters); otherwise substring is better.
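A minimal sketch of the "first two or three words" case mentioned in the P.S., which also deals with leading and repeated whitespace. The names FirstWordsDemo and firstWords are assumptions made for this example, and the helper assumes n is not negative.

```java
import java.util.Arrays;

class FirstWordsDemo {
    // Hypothetical helper: returns the first n whitespace-separated words (n >= 0).
    static String firstWords(String text, int n) {
        // trim() removes leading whitespace; "\\s+" treats a run of whitespace
        // as a single separator. The limit n + 1 stops splitting after n words.
        String[] parts = text.trim().split("\\s+", n + 1);
        int count = Math.min(n, parts.length);
        return String.join(" ", Arrays.copyOf(parts, count));
    }

    public static void main(String[] args) {
        System.out.println(firstWords("  Example   string here ", 1)); // prints "Example"
        System.out.println(firstWords("  Example   string here ", 2)); // prints "Example string"
    }
}
```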

A simple way to do it can be:
System.out.println("Hello world!".split(" ")[0]);

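// Taking 'str' as your string
// Remove any leading (and trailing) whitespace first
str = str.trim();
int index = str.indexOf(" ");
// indexOf returns -1 when there is no space; keep the whole string in that case
String word = (index == -1) ? str : str.substring(0, index);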
This is just one method of many.
str = str.replaceAll("\\s+", " "); // This replaces one or more spaces with one space
String[] words = str.split("\\s");
String first = words[0];

The simplest solution, in my opinion, is to locate the index at which the user wants the string cut off and then call substring() from 0 to that index. Assign the result to a new string and you have the string they want.
If you want to replace the original string, assign the result of substring() back to the original variable.
Link to substring() method: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#substring(int,%20int)
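As a rough sketch of that approach, the helper below cuts a string at a user-supplied index. The names CutAtDemo and cutAt are hypothetical, and clamping the index into the valid range is an added assumption so that out-of-range input does not throw.

```java
class CutAtDemo {
    // Hypothetical helper: keeps the characters before 'index'.
    // The index is clamped to [0, text.length()] so substring() cannot throw.
    static String cutAt(String text, int index) {
        int end = Math.max(0, Math.min(index, text.length()));
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        String original = "Hello world!";
        // Assigning the result back "replaces" the original string.
        original = cutAt(original, 5);
        System.out.println(original); // prints "Hello"
    }
}
```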

There are already 5 perfectly good answers, so let me add a sixth one. Variety is the spice of life!
import java.util.regex.Matcher;
import java.util.regex.Pattern;
private static final Pattern FIRST_WORD = Pattern.compile("\\S+");
public static String firstWord(CharSequence text) {
Matcher m = FIRST_WORD.matcher(text);
return m.find() ? m.group() : "";
}
Advantages over the .split(...)[0]-type answers:
It directly does exactly what is being asked, i.e. "Find the first sequence of non-space characters." So the self-documentation is more explicit.
It is more efficient when called on multiple strings (e.g. for batch processing a large list of strings) because the regular expression is compiled only once.
It is more space-efficient because it avoids unnecessarily creating a whole array with references to each word when we only need the first.
It works without having to trim the string.
(I know this is probably too late to be of any use to the OP but I'm leaving it here as an alternative solution for future readers.)

This would be more efficient
String str = "Hello world!";
int spaceInd = str.indexOf(' ');
if(spaceInd != -1) {
str = str.substring(0, spaceInd);
}
System.out.println(String.format("[%s]", str));

Related

Best way to trim exactly one quote from each side of Java string

I want to be able to trim one quote from each side of a java string. Here are some examples.
"foo" -> foo
"foo\"" -> foo\"
"\"foo\"" -> \"foo\"
I'm currently using StringUtils.trim from common lang but when I end the string with a escaped quote, it trims that too because they are consecutive. I want to be able to trim exactly one quote.
I ended up using org.apache.commons.lang3.StringUtils.substringBetween and it works.
You may also use the substring() method and trim the first and last characters conditionally, although it's a bit long.
trimmedString = s.substring((s.charAt(0) == '"') ? 1 : 0, (s.charAt(s.length() - 1) == '"') ? s.length() - 1 : s.length());
I prefer to use this String method
public String[] split(String regex)
basically if you feed in the quotation mark then you will get an array of strings holding all of the chunks between your quotation marks.
String[] parts = originalString.split("\"");
String quoteReduced = parts[0];
for (int i = 1; i < (parts.length - 1); i++) {
quoteReduced = quoteReduced.concat(parts[i] + "\"");
}
quoteReduced = quoteReduced.concat("\"" + parts[parts.length - 1]);
While it may not be the most straight forward it is the way that I would get around this. The first piece and last piece could be included in the loop but would require an if statement.
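For comparison, here is a minimal sketch of the conditional substring() idea that also copes with empty and one-character strings. The names QuoteTrimDemo and stripOneQuote are hypothetical; the behavior for a string consisting of a single quote (returning an empty string) is an assumption, since the question does not cover that case.

```java
class QuoteTrimDemo {
    // Hypothetical helper: removes at most one leading and one trailing
    // double quote, leaving any further quotes untouched.
    static String stripOneQuote(String s) {
        int start = s.startsWith("\"") ? 1 : 0;
        // Only strip a trailing quote if it is not the same character
        // that was already counted as the leading quote.
        int end = (s.length() > start && s.endsWith("\"")) ? s.length() - 1 : s.length();
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(stripOneQuote("\"foo\""));     // prints foo
        System.out.println(stripOneQuote("\"foo\\\"\"")); // prints foo\"
    }
}
```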

Java - Changing multiple words in a string at once?

I'm trying to create a program that can abbreviate certain words in a string given by the user.
This is how I've laid it out so far:
Create a hashmap from a .txt file such as the following:
thanks,thx
your,yr
probably,prob
people,ppl
Take a string from the user
Split the string into words
Check the hashmap to see if that word exists as a key
Use hashmap.get() to return the key value
Replace the word with the key value returned
Return an updated string
It all works perfectly fine until I try to update the string:
public String shortenMessage( String inMessage ) {
String updatedstring = "";
String rawstring = inMessage;
String[] words = rawstring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");
for (String word : words) {
System.out.println(word);
if (map.containsKey(word) == true) {
String x = map.get(word);
updatedstring = rawstring.replace(word, x);
}
}
System.out.println(updatedstring);
return updatedstring;
}
Input:
thanks, your, probably, people
Output:
thanks, your, probably, ppl
Does anyone know how I can update all the words in the string?
Thanks in advance
updatedstring = rawstring.replace(word, x);
This keeps overwriting updatedstring with a copy of rawstring that contains only a single replacement, so only the last replacement survives.
You need to do something like
updatedstring = rawstring;
...
updatedstring = updatedstring.replace(word, x);
Edit:
That is the solution to the problem you are seeing but there are a few other problems with your code:
Your replacement won't work for words that you had to lowercase or strip characters from. You build the words array that you iterate over from an altered version of rawstring, then go back and try to replace those altered words in the original rawstring, where they may not exist. The replace call will not find the words you think you are replacing.
If you are doing global replacements, you could build a set of words instead of an array, since once a word has been replaced it shouldn't come up again.
You might want to replace the words one at a time, because a global replacement can cause odd bugs when a word in the replacement map is a substring of another word. Instead of using String.replace, make an array or list of words, iterate over it, replace each element if needed, and join the elements. In Java 8:
String.join(" ", elements);
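The word-by-word approach described in this answer might look roughly like the sketch below. AbbreviatorDemo and shorten are hypothetical names, the map is passed in as a parameter rather than read from a .txt file, and splitting on a single space is a simplifying assumption.

```java
import java.util.HashMap;
import java.util.Map;

class AbbreviatorDemo {
    // Hypothetical helper: replaces each word independently, then joins the result.
    static String shorten(String message, Map<String, String> abbreviations) {
        String[] words = message.split(" ");
        for (int i = 0; i < words.length; i++) {
            // Build the lookup key the way the question does: letters only, lowercased.
            String key = words[i].replaceAll("[^a-zA-Z]", "").toLowerCase();
            String shortForm = abbreviations.get(key);
            if (shortForm != null) {
                // Replace inside this one element only, so punctuation such as a
                // trailing comma is kept and other words are unaffected.
                words[i] = words[i].toLowerCase().replace(key, shortForm);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("thanks", "thx");
        map.put("your", "yr");
        map.put("probably", "prob");
        map.put("people", "ppl");
        System.out.println(shorten("thanks, your, probably, people", map)); // prints "thx, yr, prob, ppl"
    }
}
```

Because each array element is handled on its own, a short key can no longer be replaced inside an unrelated longer word.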

Finding multiple substrings using boundaries in Java

Alright, so here is my problem. Basically I have a string with 4 words in it, with each word separated by a #. What I need to do is use the substring method to extract each word and print it out. I am having trouble figuring out the parameters for it though. I can always get the first one right, but the following ones generally have problems.
Here is the first piece of the code:
word = format.substring( 0 , format.indexOf('#') );
Now from what I understand this basically means start at the beginning of the string, and end right before the #. So using the same logic, I tried to extract the second word like so:
wordTwo = format.substring ( wordlength + 1 , format.indexOf('#') );
//The plus one so I don't start at the #.
But with this I continually get errors saying it doesn't exist. I figured that the compiler was trying to read the first # before the second word, so I rewrote it like so:
wordTwo = format.substring (wordlength + 1, 1 + wordLength + format.indexOf('#') );
And with this it just completely screws it up, either not printing the second word or not stopping in the right place. If I could get any help on the formatting of this, it would be greatly appreciated. Since this is for a class, I am limited to using very basic methods such as indexOf, length, substring etc., so if you could refrain from using anything too complex that would be amazing!
If you have to use substring, then you need to use the variant of indexOf that takes a start index. This lets you look for the second # by starting the search after the first one, i.e.
wordTwo = format.substring ( wordlength + 1 , format.indexOf('#', wordlength + 1 ) );
There are however much better ways of splitting a string on a delimiter like this. You can use a StringTokenizer. This is designed for splitting strings like this. Basically:
StringTokenizer tok = new StringTokenizer(format, "#");
String word = tok.nextToken();
String word2 = tok.nextToken();
String word3 = tok.nextToken();
Or you can use the String.split method which is designed for splitting strings. e.g.
String[] parts = format.split("#");
String word = parts[0];
String word2 = parts[1];
String word3 = parts[2];
You can use split() for strings formatted like this.
For instance if you have string like,
String text = "Word1#Word2#Word3#Word4";
You can use delimiter as,
String delimiter = "#";
Then create an string array like,
String[] temp;
For splitting string,
temp = text.split(delimiter);
You can get words like this,
// temp[0] is "Word1"
// temp[1] is "Word2"
// temp[2] is "Word3"
// temp[3] is "Word4"
Use split() method to do this with "#" as the delimiter
String s = "hi#vivek#is#good";
String temp = new String();
String[] arr = s.split("#");
for(String x : arr){
temp = temp + x;
}
Or, if you want to extract each word, you already have it in arr:
arr[0] ---> First Word
arr[1] ---> Second Word
arr[2] ---> Third Word
I suggest that you have a look at the Javadoc for String before you proceed further.
Since this is your homework, I'll give you a couple of hints and maybe you can solve it yourself:
The signature of substring is public String substring(int beginIndex, int endIndex). As per the Javadoc for this method:
Returns a new string that is a substring of this string. The substring
begins at the specified beginIndex and extends to the character at
index endIndex - 1. Thus the length of the substring is
endIndex-beginIndex.
Note that if you have to use this method, you will need to shift your beginIndex and endIndex each time, because in your situation there are multiple words separated by #.
However if you look closely, there's another method in String class that might be helpful to you. That's the public String[] split(String regex) method. The javadoc for this one states:
Splits this string around matches of the given regular expression.
This method works as if by invoking the two-argument split method with
the given expression and a limit argument of zero. Trailing empty
strings are therefore not included in the resulting array.
The split() method looks pretty interesting for your case. You can split your String with the delimiter that you have as the parameter to this method, get the String array and work with that.
Hope this helps you to understand your problem and get started towards a solution :)
Since this is homework, it may be better to try to write it yourself. But I will give a clue.
Clue:
The indexOf method has another overload: int indexOf(int ch,
int fromIndex), which finds the first occurrence of the character ch in the
string, starting the search at fromIndex.
http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/String.html
From this clue, the program will look something like this:
Find the index of the first '#' from the start of the string.
Extract the word from 0th character to that index.
Find the index of the first '#' from the character AFTER the first '#'.
Extract the word from the character after the first '#' up to that index.
... Just do it until you get 4 words or the string ends.
Hope this helps.
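The steps in this clue can be sketched with only indexOf and substring doing the string work. HashWordsDemo and words are hypothetical names, and collecting the results in a List is a convenience added for this example; printing each word inside the loop would work just as well.

```java
import java.util.ArrayList;
import java.util.List;

class HashWordsDemo {
    // Hypothetical helper: extracts every '#'-separated word using
    // indexOf(int ch, int fromIndex) and substring only.
    static List<String> words(String format) {
        List<String> result = new ArrayList<>();
        int start = 0;                          // where the current word begins
        int hash = format.indexOf('#', start);  // next '#' at or after 'start'
        while (hash != -1) {
            result.add(format.substring(start, hash));
            start = hash + 1;                   // skip past the '#'
            hash = format.indexOf('#', start);
        }
        result.add(format.substring(start));    // last word has no trailing '#'
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Word1#Word2#Word3#Word4")); // prints [Word1, Word2, Word3, Word4]
    }
}
```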
I don't know why you're forced to use String#substring, but as others have mentioned, it seems like the wrong method for the kind of functionality you need.
String#split(String regex) is what you would use for such a problem, or, if your input sequence is something you don't control, I would suggest you look at the overloaded method String#split(String regex, int limit); this way you can impose a limit on the amount of matches you make, controlling your resulting array.
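To illustrate the limit parameter, here is a small sketch. SplitLimitDemo and firstAndRest are hypothetical names; the sample string is borrowed from another answer in this thread.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: splits off the first word and leaves the rest unsplit.
    static String[] firstAndRest(String text) {
        // A limit of 2 means the pattern is applied at most once,
        // so the array has at most two elements.
        return text.split("#", 2);
    }

    public static void main(String[] args) {
        String text = "Word1#Word2#Word3#Word4";
        System.out.println(Arrays.toString(text.split("#")));     // prints [Word1, Word2, Word3, Word4]
        System.out.println(Arrays.toString(firstAndRest(text)));  // prints [Word1, Word2#Word3#Word4]
    }
}
```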

Splitting strings based on a delimiter

I am trying to break apart a very simple collection of strings that come in the forms of
0|0
10|15
30|55
etc. Essentially, numbers that are separated by pipes.
When I use Java's string split function with .split("|"), I get somewhat unpredictable results: whitespace in the first slot, and sometimes the number itself isn't where I thought it should be.
Can anybody please help and give me advice on how I can use a reg exp to keep ONLY the integers?
I was asked to give the code trying to do the actual split. So allow me to do that in hopes to clarify further my problem :)
String temp = "0|0";
String[] splitString = temp.split("|");
results
\n
0
|
0
I am trying to get
0
0
only. Forever grateful for any help ahead of time :)
I still suggest using split(); it drops trailing empty strings by default. If you strip the non-numeric characters from the string and keep only pipes and numbers, you can easily use split() to get what you want. Alternatively, you can pass multiple delimiters to split() as a regex character class, and this should work:
String[] splited = yourString.split("[\\|\\s]+");
and the regex:
import java.util.regex.*;
Pattern pattern = Pattern.compile("\\d+(?=[\\|\\s]|$)"); // "|$" lets the last number in the string match too
Matcher matcher = pattern.matcher(yourString);
while (matcher.find()) {
System.out.println(matcher.group());
}
The pipe symbol is special in a regexp (it marks alternatives), you need to escape it. Depending on the java version you are using this could well explain your unpredictable results.
class t {
public static void main(String[] args)
{
String temp = "0|0";
String[] splitString = temp.split("\\|");
for (int i=0; i<splitString.length; i++)
System.out.println("splitString["+i+"] is " + splitString[i]);
}
}
outputs
splitString[0] is 0
splitString[1] is 0
Note that one backslash is the regexp escape character, but because a backslash is also the escape character in java source you need two of them to push the backslash into the regexp.
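An alternative to escaping by hand is java.util.regex.Pattern.quote, which turns any string into a literal pattern. The sketch below uses it to split a pair and parse the numbers; PipeSplitDemo and parsePair are hypothetical names, and the input is assumed to contain only integers and pipes.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PipeSplitDemo {
    // Hypothetical helper: splits on a literal '|' and parses each piece as an int.
    static int[] parsePair(String pair) {
        // Pattern.quote("|") produces a pattern that matches the pipe literally,
        // so no backslashes are needed in the source.
        String[] parts = pair.split(Pattern.quote("|"));
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i].trim());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parsePair("10|15"))); // prints [10, 15]
    }
}
```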
You can replace the whitespace with pipes and then split on the pipe.
String test = "0|0 10|15 30|55";
test = test.replace(" ", "|");
String[] result = test.split("\\|");
Hope this helps for you..
You can use StringTokenizer.
String test = "0|0";
StringTokenizer st = new StringTokenizer(test, "|"); // use '|' as the delimiter instead of the default whitespace
int firstNumber = Integer.parseInt(st.nextToken()); //will parse out the first number
int secondNumber = Integer.parseInt(st.nextToken()); //will parse out the second number
Of course you can always nest this inside of a while loop if you have multiple strings.
Also, you need to import java.util.* for this to work.
The pipe ('|') is a special character in regular expressions. It needs to be "escaped" with a '\' character if you want to use it as a regular character, unfortunately '\' is a special character in Java so you need to do a kind of double escape maneuver e.g.
String temp = "0|0";
String[] splitStrings = temp.split("\\|");
The Guava library has a nice class, Splitter, which is a much more convenient alternative to String.split(). The advantages are that you can choose to split the string on specific characters (like '|'), on specific strings, or with regexps, and you can choose what to do with the resulting parts (trim them, throw away empty parts, etc.).
For example you can call
Iterable<String> parts = Splitter.on('|').trimResults().omitEmptyStrings().split("0|0");
This should work for you:
([0-9]+)
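One way to apply that pattern is to let a Matcher find each run of digits instead of splitting at all. This is a sketch under the assumption that every number fits in an int; IntExtractDemo and extractInts are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntExtractDemo {
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Hypothetical helper: collects every run of digits as an Integer,
    // ignoring pipes, whitespace and any other separators.
    static List<Integer> extractInts(String text) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractInts("0|0 10|15 30|55")); // prints [0, 0, 10, 15, 30, 55]
    }
}
```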
Consider a scenario where we have read a line from a CSV or XLS file as a string and need to separate the columns into an array of strings based on delimiters.
Below is a code snippet that does this.
{ ...
....
BufferedReader reader = new BufferedReader(new FileReader("your file"));
String line = reader.readLine();
String[] splittedString = StringSplitToArray(line, "\"");
...
....
}
public static String[] StringSplitToArray(String stringToSplit, String delimiter)
{
StringBuffer token = new StringBuffer();
Vector tokens = new Vector();
char[] chars = stringToSplit.toCharArray();
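for (int i = 0; i < chars.length; i++) {
// a character that appears in the delimiter string ends the current token
if (delimiter.indexOf(chars[i]) != -1) {
if (token.length() > 0) {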
tokens.addElement(token.toString());
token.setLength(0);
i++;
}
} else {
token.append(chars[i]);
}
}
if (token.length() > 0) {
tokens.addElement(token.toString());
}
// convert the vector into an array
String[] preparedArray = new String[tokens.size()];
for (int i=0; i < preparedArray.length; i++) {
preparedArray[i] = (String)tokens.elementAt(i);
}
return preparedArray;
}
The code snippet above calls StringSplitToArray, which converts the line into a string array by splitting it on the delimiter passed to the method. The delimiter can be a comma (,) or a double quote (").
For more on this, follow this link : http://scrapillars.blogspot.in

Find a complete word in a string java

I am writing a piece of code in which I have to find only complete words. For example, if I have
String str = "today is tuesday";
and I'm searching for "t" then I should not find any word.
Can anybody tell how can I write such a program in java?
I use regexps for such tasks. In your case it should look something like this:
String str = "today is tuesday";
return str.matches(".*?\\bt\\b.*?"); // returns "false"
String str = "today is t uesday";
return str.matches(".*?\\bt\\b.*?"); // returns "true"
A short explanation:
. matches any character, *? is for zero or more times, \b is a word boundary.
More information on regexps can be found here or specifically for java here
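A slightly more general sketch of the same word-boundary idea, where the searched word is a parameter. WholeWordDemo and containsWholeWord are hypothetical names. Pattern.quote is used so that special characters in the word are taken literally; note that \b only behaves as expected when the word starts and ends with a letter, digit or underscore.

```java
import java.util.regex.Pattern;

class WholeWordDemo {
    // Hypothetical helper: true if 'word' occurs in 'text' with a word
    // boundary on both sides.
    static boolean containsWholeWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        // find() searches anywhere in the text, so no leading or trailing ".*" is needed.
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWholeWord("today is tuesday", "t"));   // prints false
        System.out.println(containsWholeWord("today is t uesday", "t"));  // prints true
    }
}
```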
String sentence = "Today is Tuesday";
Set<String> words = new HashSet<String>(
Arrays.asList(sentence.split(" "))
);
System.out.println(words.contains("Tue")); // prints "false"
System.out.println(words.contains("Tuesday")); // prints "true"
Each contains(word) query is O(1), so short of implementing your own sophisticated dictionary data structure, this is the fastest most practical solution if you have many words to look for in a text.
This uses String.split to separate the words of the sentence on the " " delimiter. Another possible variation, depending on how the problem is defined, is to use \b, the word boundary anchor. The problem is considerably more difficult if you must take every grammatical feature of natural languages into consideration (e.g. "can't" is split by \b into "can" and "t").
Case insensitivity can be easily introduced by using the traditional case normalization trick: split and hash sentence.toLowerCase() instead, and see if it contains(word.toLowerCase()).
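The case normalization trick could be sketched like this. WordSetDemo and containsWord are hypothetical names, and splitting on "\\s+" rather than a single space is an assumption made to tolerate repeated whitespace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class WordSetDemo {
    // Hypothetical helper: case-insensitive whole-word lookup via a HashSet.
    static boolean containsWord(String sentence, String word) {
        // Lowercase the sentence before splitting, and lowercase the query
        // before the lookup, so both sides are normalized the same way.
        Set<String> words = new HashSet<>(
            Arrays.asList(sentence.toLowerCase().split("\\s+"))
        );
        return words.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(containsWord("Today is Tuesday", "tuesday")); // prints true
        System.out.println(containsWord("Today is Tuesday", "Tue"));     // prints false
    }
}
```

When many words are looked up in the same text, building the set once and reusing it keeps each query at O(1).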
See also
regular-expressions.info -- Anchors
Wikipedia -- String searching algorithm
Wikipedia -- Patricia Trie
String[] tokens = str.split(" ");
for(String s: tokens) {
if ("t".equals(s)) {
// t exists
break;
}
}
String[] words = str.split(" ");
Arrays.sort(words);
Arrays.binarySearch(words, searchedFor);
String str = "today is tuesday";
StringTokenizer stringTokenizer = new StringTokenizer(str);
boolean exists = false;
while (stringTokenizer.hasMoreTokens()) {
if (stringTokenizer.nextToken().equals("t")) {
exists = true;
break;
}
}
Use a regex like "\bt\b".
You can do that with a regex that ends with a space.
I would recommend you use the "split" functionality for String with spaces as separators, then go through these elements one by one and make a direct comparison.
I would suggest using the regex pattern1 = ".*\bt\b.*" instead of pattern2 = ".*?\bt\b.*?". Pattern1 matches the complete string if 't' occurs in it as a word, whereas pattern2 only reaches as far as the "t" you are searching for and ignores the rest of the string. There is not much difference between the two approaches, and for your particular use case of returning true/false both work fine. The one I suggested will make it easier to adapt the regex if your use case changes.
