java strings with numbers - java

I have a group of strings in an ArrayList.
1) I want to remove all the strings that contain only numbers, and also strings like (0.75%) and $1.5; basically everything that does not contain any characters.
2) I want to remove all special characters from each string before I write it to the console.
"God should be printed as God.
&quot;Including should be printed as quotIncluding.
'find should be printed as find.

Java boasts a very nice Pattern class that makes use of regular expressions. You should definitely read up on that. A good reference guide is here.
I was going to post a coding solution for you, but styfle beat me to it! The only thing I was going to do differently was to use the Pattern and Matcher classes inside the for loop, like so:
Pattern p = Pattern.compile("[a-zA-Z]"); // compile once, outside the loop
for (int i = 0; i < myArray.size(); i++) {
    Matcher m = p.matcher(myArray.get(i));
    boolean match = m.find(); // true if the string contains at least one letter
    // more code to get the string you want
}
But that is too bulky. styfle's solution is succinct and easy.

When you say "characters," I'm assuming you mean only "a through z" and "A through Z." You probably want to use Regular Expressions (Regex) as D1e mentioned in a comment. Here is an example using the replaceAll method.
import java.util.ArrayList;
public class Test {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>(5);
        list.add("\"God");
        list.add("&quot;Including");
        list.add("'find");
        list.add("24No3Numbers97");
        list.add("w0or5*d;");
        for (String s : list) {
            s = s.replaceAll("[^a-zA-Z]", ""); // use whatever regex you wish
            System.out.println(s);
        }
    }
}
The output of this code is as follows:
God
quotIncluding
find
NoNumbers
word
The replaceAll method uses a regex pattern and replaces all the matches with the second parameter (in this case, the empty string).
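The question also asks to drop entries that contain no letters at all, such as (0.75%) or $1.5. A minimal sketch of that step, assuming "characters" means ASCII letters only; the class and method names (LetterFilter, clean) are hypothetical and not taken from the answers above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LetterFilter {
    // Strips every non-letter from each entry and drops entries that end up empty.
    // Assumption: only a-z and A-Z count as letters.
    static List<String> clean(List<String> input) {
        List<String> result = new ArrayList<>();
        for (String s : input) {
            String lettersOnly = s.replaceAll("[^a-zA-Z]", "");
            if (!lettersOnly.isEmpty()) {
                result.add(lettersOnly);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("\"God", "(0.75%)", "$1.5", "'find", "42");
        System.out.println(clean(raw)); // prints [God, find]
    }
}
```

Building a new list avoids removing elements from the ArrayList while iterating over it.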

Related

Android split string with regex not working

I'm trying to split a string at every "." or "?" and I use this regular expression:
(?<=(?!.[0-9])[?.])
In theory the code also prevents splitting if the point is followed by a number so things like 3.000 are not split and it also includes the point in the new string.
For example, if I have this text: "Hello. What's your favourite number? It's 3.560." I want to get this: "Hello.", "What's your favourite number?", "It's 3.560."
I've made a simple java program on my computer and it works exactly like I want:
String[] x = c.split("(?<=(?!.[0-9])[?.])");
for(String v : x){
System.out.println(v);
}
However when I use this same regex in my Android app it doesn't split anything...
x = txt.split("(?<=(?!.[0-9])[?.])");
//This, in my android app, returns an array with only one entry which is the whole string without splitting.
PS. Using (?<=[?.]) works so the problem must be in the (?!.[0-9]) part which is meant to exclude points followed by a number.
Use regex pattern
(?:(?<=[.])(?![0-9])|(?<=[?]))
str.split("(?:(?<=[.])(?![0-9])|(?<=[?]))");
Remember that outside a square bracket character class, . in a regular expression means any single character. You need \. to match a literal dot, which in turn means you need \\. in the string literal.
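The two points above (keep the lookbehind simple, and put the "not followed by a digit" test in a separate lookahead) can be sketched as follows. This is only an illustration run against the desktop JDK; it has not been checked on Android's regex engine, and the class name SentenceSplitter is hypothetical. It also swallows the whitespace after each sentence, which the original pattern did not do.

```java
import java.util.regex.Pattern;

class SentenceSplitter {
    // Split right after '.' or '?', unless the next character is a digit.
    // The trailing \s* also consumes the whitespace between sentences.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[.?])(?![0-9])\\s*");

    static String[] split(String text) {
        return BOUNDARY.split(text);
    }

    public static void main(String[] args) {
        for (String s : split("Hello. What's your favourite number? It's 3.560.")) {
            System.out.println(s);
        }
    }
}
```

Inside the character class [.?] the dot is literal, so no backslash is needed there.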
Try this.
public class Tester {
public static void main(String[] args){
String regex = "[?.][^\\d]";
String tester = "Testing 3.015 dsd . sd ? sds";
String[] arr = tester.split(regex);
for (String s : arr){
System.out.println(s);
}
}
}
Output:
Testing 3.015 dsd
sd
sds

Java- Extract part of a string between two special characters

I have been trying to figure out how to extract a portion of a string between two special characters, ' and ". I've been looking into regex, but frankly I cannot understand it.
Example in Java code:
String str="21*90'89\"";
I would like to pull out 89
In general I would just like to know how to extract part of a string between two specific characters please.
Also it would be nice to know how to extract part of the string from the beginning to a specific character like to get 21.
Try this regular expression:
'(.*?)"
As a Java string literal you will have to write it as follows:
"'(.*?)\""
Here is a more complete example demonstrating how to use this regular expression with a Matcher:
Pattern pattern = Pattern.compile("'(.*?)\"");
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
See it working online: ideone
If you'll always have a string like that (with 3 parts) then this is enough:
String str= "21*90'89\"";
String between = str.split("\"|'")[1];
Another option, if you can assure that your strings will always be in the format you provide, you can use a quick-and-dirty substring/indexOf solution:
str.substring(str.indexOf("'") + 1, str.indexOf("\""));
And to get the second piece of data you asked for:
str.substring(0, str.indexOf("*"));
This is what you are looking for:
// requires java.util.Arrays and java.util.regex.Pattern
public static void main(final String[] args) {
    final String str = "21*90'89\"";
    // Split on any of the three delimiters: *, ' or "
    final Pattern pattern = Pattern.compile("[\\*'\"]");
    final String[] result = pattern.split(str);
    System.out.println(Arrays.toString(result));
}
The program above prints:
[21, 90, 89]
I'm missing the simplest possible solution here:
str.replaceFirst(".*'(.*)\".*", "$1");
This solution is by far the shortest, however it has some drawbacks:
In case the string looks different, you get the whole string back without warning.
It's not very efficient, as the used regex gets compiled for each use.
I wouldn't use it except as a quick hack or if I could be really sure about the input format.
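Both drawbacks listed above can be avoided with a precompiled Pattern and an explicit "no match" result. A minimal sketch, assuming Java 8 or later for Optional; the class and method names (BetweenExtractor, between) are made up for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenExtractor {
    // Compiled once and reused: a single quote, a run of characters that are
    // not a double quote (captured), then a double quote.
    private static final Pattern BETWEEN = Pattern.compile("'([^\"]*)\"");

    // Returns the text between ' and ", or an empty Optional if the input
    // does not contain that shape.
    static Optional<String> between(String input) {
        Matcher m = BETWEEN.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(between("21*90'89\""));    // prints Optional[89]
        System.out.println(between("no delimiters")); // prints Optional.empty
    }
}
```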
String str = "abc#defg#lmn!tp?pqr*tsd";
String special = "!?##$%^&*()/<>{}[]:;'`~";
// Record the index of every special character found in str.
ArrayList<Integer> al = new ArrayList<Integer>();
for (int i = 0; i < str.length(); i++) {
    // indexOf avoids adding the same position twice when special contains a duplicate
    if (special.indexOf(str.charAt(i)) >= 0) {
        al.add(i);
    }
}
// Print the text between each pair of consecutive special characters.
for (int i = 0; i < al.size() - 1; i++) {
    int start = al.get(i);
    int end = al.get(i + 1);
    System.out.print(str.substring(start + 1, end));
    System.out.print(" ");
}
String str = "21*90'89\"";
String[] part = str.split("[*']"); // split on * or '
System.out.println(part[0] + "" + part[1]); // prints 2190

how to replace parts of string using regular expressions

I am not a beginner to regular expressions, but their use in perl seems a bit different than in Java.
Anyway, I basically have a dictionary of shorthand words and their definitions. I want to iterate over the words in the dictionary and replace them with their meanings. What is the best way to do this in Java?
I have seen String.replaceAll(), String.replace(), as well as the Pattern/Matcher classes. I wish to do a case insensitive replacement along the lines of:
word =~ s/\s?\Q$short_word\E\s?/ \Q$short_def\E /sig
While I am at it, do you think that it is best to extract all the words from the string and then apply my dictionary or just apply the dictionary to the string? I know that I need to be careful, because the shorthand words could match parts of other shorthand meanings.
Hopefully this all makes sense.
Thanks.
Clarification:
Dictionary is something like:
lol:laugh out loud, rofl:rolling on the floor laughing, ll:like lemons
string is:
lol, i am rofl
replaced text:
laugh out loud, i am rolling on the floor laughing
Notice how the ll wasn't added anywhere.
The danger is false positives inside of normal words. "fell" != "felike lemons"
One way is to split the text into words on whitespace (do multiple spaces need to be preserved?), then loop over the list applying the "if the dictionary contains the word, replace it, otherwise output the original" idea.
My output class would be a StringBuffer:
StringBuffer outputBuffer = new StringBuffer();
for (String s : split(inputText)) {
    outputBuffer.append(dictionary.containsKey(s) ? dictionary.get(s) : s);
}
Make your split method smart enough to return word delimiters also:
split("now is the time") -> now,<space>,is,<space>,the,<space><space>,time
Then you don't have to worry about conserving white space - the loop above will just append anything that isn't a dictionary word to the StringBuffer.
Here's a recent SO thread on retaining delimiters when regexing.
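A split that keeps its delimiters can be written with zero-width lookarounds. The sketch below is one possible reading of the idea above, with two labeled deviations: it splits at word/non-word boundaries rather than on whitespace only (so that "lol," still yields the token "lol"), and lookups are case-sensitive. The names TokenExpander, tokens and expand are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class TokenExpander {
    // Zero-width positions between a word character and a non-word character,
    // in either order. Nothing is consumed, so delimiters come back as tokens.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=\\w)(?=\\W)|(?<=\\W)(?=\\w)");

    static List<String> tokens(String text) {
        return Arrays.asList(BOUNDARY.split(text));
    }

    // Appends the dictionary value for known tokens and the token itself otherwise.
    static String expand(String text, Map<String, String> dictionary) {
        StringBuilder out = new StringBuilder(text.length());
        for (String token : tokens(text)) {
            String replacement = dictionary.get(token);
            out.append(replacement != null ? replacement : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new HashMap<>();
        dictionary.put("lol", "laugh out loud");
        dictionary.put("rofl", "rolling on the floor laughing");
        dictionary.put("ll", "like lemons");
        System.out.println(expand("lol, i am rofl", dictionary));
        // prints: laugh out loud, i am rolling on the floor laughing
    }
}
```

Because whole tokens are looked up, "fell" is left alone even though it contains "ll".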
If you insist on using regex, this would work (taking Zoltan Balazs' dictionary map approach):
Map<String, String> substitutions = loadDictionaryFromSomewhere();
int lengthOfShortestKeyInMap = 3; //Calculate
int lengthOfLongestKeyInMap = 3; //Calculate
StringBuffer output = new StringBuffer(input.length());
Pattern pattern = Pattern.compile("\\b(\\w{" + lengthOfShortestKeyInMap + "," + lengthOfLongestKeyInMap + "})\\b");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    String candidate = matcher.group(1);
    String substitute = substitutions.get(candidate);
    if (substitute == null) {
        substitute = candidate; // no match, use original
    }
    matcher.appendReplacement(output, Matcher.quoteReplacement(substitute));
}
matcher.appendTail(output);
// output now contains the text with substituted words
If you plan to process many inputs, pre-compiling the pattern is more efficient than using String.split(), which compiles a new Pattern each call.
(edit) Compiling all of the keys into a single pattern yields a more efficient approach, like so:
Pattern pattern = Pattern.compile("\\b(lol|rtfm|rofl|wtf)\\b");
// rest of the method unchanged, don't need the shortest/longest key stuff
This allows the regex engine to skip over any words that happen to be short enough but aren't in the list, saving you a lot of map accesses.
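The single-pattern idea can be combined with the case-insensitive matching the question asks for. A sketch under stated assumptions: the keys consist of word characters only (otherwise \b behaves differently), the map is non-empty, and the class name ShorthandExpander is invented for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShorthandExpander {
    private final Map<String, String> byLowerKey = new HashMap<>();
    private final Pattern pattern;

    ShorthandExpander(Map<String, String> entries) {
        if (entries.isEmpty()) {
            throw new IllegalArgumentException("dictionary must not be empty");
        }
        // Builds \b(?:key1|key2|...)\b with every key quoted as a literal.
        StringJoiner alternation = new StringJoiner("|", "\\b(?:", ")\\b");
        for (Map.Entry<String, String> e : entries.entrySet()) {
            byLowerKey.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
            alternation.add(Pattern.quote(e.getKey()));
        }
        pattern = Pattern.compile(alternation.toString(), Pattern.CASE_INSENSITIVE);
    }

    String expand(String input) {
        Matcher m = pattern.matcher(input);
        StringBuffer out = new StringBuffer(input.length());
        while (m.find()) {
            // Every match is one of the keys, so the lookup cannot return null.
            String replacement = byLowerKey.get(m.group().toLowerCase(Locale.ROOT));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With the dictionary from the question, new ShorthandExpander(map).expand("LOL, i am rofl") should return "laugh out loud, i am rolling on the floor laughing".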
The first thing that comes to my mind is this:
...
// e.g. lol -> laugh out loud
Map<String, String> dictionary;
ArrayList<String> originalText;
ArrayList<String> replacedText;
for (String string : originalText) {
    if (dictionary.containsKey(string)) {
        replacedText.add(dictionary.get(string));
    } else {
        replacedText.add(string);
    }
}
...
Or you could use a StringBuffer instead of replacedText.

String splitting

I have a string. What is the best way to put the things in between the $ signs into a list in Java?
String temp = "$abc$and$xyz$";
How can I get all the variables within $ signs as a list in Java?
[abc, xyz]
I can do it using StringTokenizer, but I want to avoid using it if possible.
Thanks.
Maybe you could think about calling String.split(String regex) ...
The pattern is simple enough that String.split should work here, but in the more general case, one alternative for StringTokenizer is the much more powerful java.util.Scanner.
String text = "$abc$and$xyz$";
Scanner sc = new Scanner(text);
while (sc.findInLine("\\$([^$]*)\\$") != null) {
System.out.println(sc.match().group(1));
} // abc, xyz
The pattern to find is:
\$([^$]*)\$
  \_____/     i.e. literal $, a sequence of anything but $ (captured in group 1)
     1        and another literal $
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
(…) is used for grouping. (pattern) is a capturing group and creates a backreference.
The backslash preceding the $ (outside of a character class definition) is used to escape the $, which has a special meaning as the end-of-line anchor. That backslash is doubled in a String literal ("\\" is a String of length one containing a backslash).
This is not a typical usage of Scanner (usually the delimiter pattern is set, and tokens are extracted using next), but it does show how you'd use findInLine to find an arbitrary pattern (ignoring delimiters), and then use match() to access the MatchResult, from which you can get individual group captures.
You can also use this Pattern in a Matcher find() loop directly.
Matcher m = Pattern.compile("\\$([^$]*)\\$").matcher(text);
while (m.find()) {
System.out.println(m.group(1));
} // abc, xyz
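Since the question asks for a list rather than printed output, the same find() loop can collect the captures. A small sketch; the class and method names (DollarFields, extract) are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarFields {
    // A literal $, a captured run of non-$ characters, and a closing literal $.
    private static final Pattern FIELD = Pattern.compile("\\$([^$]*)\\$");

    static List<String> extract(String text) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            fields.add(m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(extract("$abc$and$xyz$")); // prints [abc, xyz]
    }
}
```

Each match consumes both of its $ signs, which is why "and" (sitting between two matched pairs) is not returned.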
Related questions
Validating input using java.util.Scanner
Scanner vs. StringTokenizer vs. String.Split
Just try this one: temp.split("\\$");
I would go for a regex myself, like Riduidel said.
This special case is, however, simple enough that you can just treat the String as a character sequence, and iterate over it char by char, and detect the $ sign. And so grab the strings yourself.
On a side note, I would try to go for different demarcation characters, to make it more readable to humans. Use $ as start-of-sequence and something else as end-of-sequence, for instance. Or something like what I think the Bash shell uses: ${some_value}. As said, the computer doesn't care, but you debugging your string just might :)
As for an appropriate regex, something like (\\$.*\\$)* or so should do, though I'm no expert on regexes (see http://www.regular-expressions.info for nice info on regexes). Bear in mind that .* is greedy, so [^$]* or a reluctant .*? is needed if each match should stop at the next $.
Basically I'd ditto Khotyn as the easiest solution. I see you post on his answer that you don't want zero-length tokens at beginning and end.
That brings up the question: What happens if the string does not begin and end with $'s? Is that an error, or are they optional?
If it's an error, then just start with:
if (!text.startsWith("$") || !text.endsWith("$"))
return "Missing $'s"; // or whatever you do on error
If that passes, fall into the split.
If the $'s are optional, I'd just strip them out before splitting. i.e.:
if (text.startsWith("$"))
text=text.substring(1);
if (text.endsWith("$"))
text=text.substring(0,text.length()-1);
Then do the split.
Sure, you could make more sophisticated regex's or use StringTokenizer or no doubt come up with dozens of other complicated solutions. But why bother? When there's a simple solution, use it.
PS There's also the question of what result you want to see if there are two $'s in a row, e.g. "$foo$$bar$". Should that give ["foo","bar"] or ["foo","","bar"]? Khotyn's split will give the second result, with zero-length strings. If you want the first result, you should split("\\$+").
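The strip-then-split approach, treating runs of $ as a single separator, might look like the sketch below. It assumes the leading and trailing $ are optional and, as a small deviation from the snippets above, strips whole runs of $ at either end rather than a single character. DollarSplit and tokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DollarSplit {
    static List<String> tokens(String text) {
        // Remove any run of $ at the very start or very end of the text.
        String trimmed = text.replaceAll("^\\$+|\\$+$", "");
        if (trimmed.isEmpty()) {
            return new ArrayList<>(); // nothing but $ signs, or empty input
        }
        // One or more consecutive $ count as a single separator.
        return new ArrayList<>(Arrays.asList(trimmed.split("\\$+")));
    }

    public static void main(String[] args) {
        System.out.println(tokens("$abc$and$xyz$")); // prints [abc, and, xyz]
        System.out.println(tokens("$foo$$bar$"));    // prints [foo, bar]
    }
}
```

Unlike a paired-delimiter match, a plain split returns every piece, including "and".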
If you want a simple split function then use Apache Commons Lang which has StringUtils.split. The java one uses a regex which can be overkill/confusing.
You can also do it in a simple manner by writing your own code.
The following code will do the job for you:
import java.util.ArrayList;
import java.util.List;
public class MyStringTokenizer {
    public static void main(String[] args) {
        List<String> result = getTokenizedStringsList("$abc$efg$hij$");
        for (String token : result) {
            System.out.println(token);
        }
    }
    private static List<String> getTokenizedStringsList(String string) {
        List<String> tokenList = new ArrayList<String>();
        char[] in = string.toCharArray();
        int stringLength = in.length;
        for (int i = 0; i < stringLength;) {
            StringBuilder myBuilder = new StringBuilder();
            // Skip ahead to the next '$', then step past it.
            while (i < stringLength && in[i] != '$') {
                i++;
            }
            i++;
            // Collect characters up to the following '$'.
            while (i < stringLength && in[i] != '$') {
                myBuilder.append(in[i]);
                i++;
            }
            // Keep the token only if a closing '$' was found; this avoids a trailing empty token.
            if (i < stringLength) {
                tokenList.add(myBuilder.toString());
            }
        }
        return tokenList;
    }
}
You can use
String temp = "$abc$and$xyz$";
String[] array = temp.split(Pattern.quote("$"));
List<String> list = new ArrayList<String>();
for (int i = 0; i < array.length; i++) {
    if (!array[i].isEmpty()) { // the leading "$" produces an empty first element
        list.add(array[i]);
    }
}
Now the list contains abc, and, xyz.

Dividing a string into substring in JAVA

As part of my project I need to divide a string into two parts.
Below is the example:
String searchFilter = "(first=sam*)(last=joy*)";
where searchFilter is a string.
I want to split the above string into two parts,
first=sam* and last=joy*
so that I can then split these again into first, sam*, last and joy* as per my requirement.
I don't have much hands-on experience in Java. Can anyone help me achieve this? It would be very helpful.
Thanks in advance.
The most flexible way is probably to do it with regular expressions:
import java.util.regex.*;
public class Test {
    public static void main(String[] args) {
        // Create a regular expression pattern
        Pattern spec = Pattern.compile("\\((.*?)=(.*?)\\)");
        // Get a matcher for the searchFilter
        String searchFilter = "(first=sam*)(last=joy*)";
        Matcher m = spec.matcher(searchFilter);
        // While a "abc=xyz" pattern can be found...
        while (m.find()) {
            // ...print "abc" equals "xyz"
            System.out.println("\"" + m.group(1) + "\" equals \"" + m.group(2) + "\"");
        }
    }
}
Output:
"first" equals "sam*"
"last" equals "joy*"
Take a look at String.split(..) and String.substring(..), using them you should be able to achieve what you are looking for.
You can do this using split, substring, or StringTokenizer.
Here is a small piece of code that will solve your problem:
// Each character in the second argument is a delimiter.
StringTokenizer st = new StringTokenizer(searchFilter, "()=");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
It prints first, sam*, last and joy*, one per line.
I think you can do it in a lot of different ways; it depends on you.
Whether you use regular expressions or something else, have a look at https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html.
Anyway, I suggest:
int separatorIndex = searchFilter.indexOf(")(");
String filterFirst = searchFilter.substring(1, separatorIndex); // first=sam*
String filterLast = searchFilter.substring(separatorIndex + 2, searchFilter.length() - 1); // last=joy*
This (untested snippet) could do it:
// replace() takes a literal, so "(" needs no escaping; split() takes a regex, so ")" does.
String[] properties = searchFilter.replace("(", "").split("\\)");
for (String property : properties) {
    if (!property.equals("")) {
        String[] parts = property.split("=");
        // some method to store the filter properties
        storeKeyValue(parts[0], parts[1]);
    }
}
The idea behind it: first we get rid of the brackets, removing the opening brackets and using the closing brackets as split points for the filter properties. The resulting array is {"first=sam*", "last=joy*"}; split discards trailing empty strings, so the empty-string check is only a safeguard. Then for each property we split again on "=" to get the key/value pairs.
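If the key/value pairs should end up in a map, one way to sketch it is with a regular expression and a LinkedHashMap (which keeps insertion order). The names FilterParser and parse are invented for this example, and the pattern assumes keys and values contain no parentheses and keys contain no "=".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FilterParser {
    // "(" key "=" value ")" where the key has no '=' or parentheses
    // and the value has no parentheses.
    private static final Pattern TERM = Pattern.compile("\\(([^=()]+)=([^()]*)\\)");

    static Map<String, String> parse(String filter) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = TERM.matcher(filter);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("(first=sam*)(last=joy*)"));
        // prints {first=sam*, last=joy*}
    }
}
```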
