most efficient way to check if a string contains specific characters

I have a string that should contain only specific characters: {}()[]
I've created a validate method that checks if the string contains forbidden characters (by forbidden characters I mean everything that is not {}()[] )
Here is my code:
private void validate(String string) {
char [] charArray = string.toCharArray();
for (Character c : charArray) {
if (!"{}()[]".contains(c.toString())){
throw new IllegalArgumentException("The string contains forbidden characters");
}
}
}
I'm wondering if there are better ways to do it since my approach doesn't seem right.

Keeping your overall approach, I would personally modify it like this:
private static void validate(String str) {
for (char c : str.toCharArray()) {
if ("{}()[]".indexOf(c) < 0){
throw new IllegalArgumentException("The string contains forbidden characters");
}
}
}
The changes are as follows:
Not declaring a temporary variable for the char array.
Using indexOf to find a character instead of converting c to a String in order to use contains().
Looping over the primitive char, since you no longer need toString().
Not naming the parameter string, as this can cause confusion and is not good practice.
Note: contains() calls indexOf() internally, so this also technically saves you a method call on each iteration.
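To try the indexOf idea in isolation, here is a minimal self-contained sketch. The class name BracketCheck and the helper isValid are made up for this illustration; only String.indexOf(int) and String.charAt(int) come from the standard library. It uses charAt instead of toCharArray() so that no copy of the string is made, which is my own tweak and not part of the answer above.

```java
final class BracketCheck {
    // The only characters the input may contain.
    private static final String ALLOWED = "{}()[]";

    // Hypothetical helper: true when every char of str is one of ALLOWED.
    static boolean isValid(String str) {
        for (int i = 0; i < str.length(); i++) {
            // indexOf returns -1 when the character does not occur in ALLOWED.
            if (ALLOWED.indexOf(str.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    // Same contract as the validate method in the question.
    static void validate(String str) {
        if (!isValid(str)) {
            throw new IllegalArgumentException("The string contains forbidden characters");
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("{[()]}")); // true
        System.out.println(isValid("{a}"));    // false
    }
}
```

Splitting the check into a boolean helper is only a convenience for testing; the throwing version behaves like the original.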

I'd suggest using a Stream if you are on Java 8 or later.
This lets you skip converting each char to a String.
private void validate_stream(String str) {
if(str.chars().anyMatch(a -> !(a==125||a==123||a==93||a==91||a==41||a==40)))
throw new IllegalArgumentException("The string contains forbidden characters");
}
The numbers are the ASCII codes of the permitted characters; the negation makes anyMatch fire on any other character. You can replace the numbers with char literals if you want:
(a -> !(a=='{'||a=='}'||a=='['||a==']'||a=='('||a==')'))

I hope this works for you. I have added my code alongside yours.
I have used a regex character class that lists the permitted characters; inside it, \\ escapes the square brackets, which otherwise have a special meaning in regex. The matches method of String tries to match the whole string against the given regex pattern. Because the condition is negated (!), a string like "{}()[]as" satisfies the if condition and prints "not matched", whereas a string like "{}()[]" goes to the else branch. You can change this to throw an exception instead.
private static void validate(String string)
{
String pattern = "[{}()\\[\\]]*";
if(!string.matches(pattern)) {
System.out.println("not matched:"+string);
}
else {
System.out.println("data matched:"+string);
}
char [] charArray = string.toCharArray();
for (Character c : charArray) {
if (!"{}()[]".contains(c.toString())){
throw new IllegalArgumentException("The string contains forbidden characters");
}
}
}
All the brackets are metacharacters in regex, as described here:
http://tutorials.jenkov.com/java-regex/index.html
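If the regex route is taken, a character class is one way to say "only these six characters". The sketch below is an illustration under my own assumptions: the class name BracketPattern and the helper isValid are hypothetical, and precompiling the Pattern once is a choice I made so the regex is not recompiled on every call.

```java
import java.util.regex.Pattern;

final class BracketPattern {
    // Character class with the six permitted characters.
    // Inside a class only [ and ] need a backslash; * also accepts the empty string.
    private static final Pattern ONLY_BRACKETS = Pattern.compile("[{}()\\[\\]]*");

    // Hypothetical helper: true when the whole input consists of permitted characters.
    static boolean isValid(String str) {
        return ONLY_BRACKETS.matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("{}()[]"));   // true
        System.out.println(isValid("{}()[]as")); // false
    }
}
```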

Related

deleting special characters from a string

This is my first post here and I'm fairly new to Java, so my question is simple:
is there any instruction in Java that removes special characters from a string?
My string should contain only letters, so when the user enters a space, a period, or anything else that isn't a letter, it should be removed or ignored.
My idea was to build an array of characters and shift the letters to the left each time there is something that isn't a letter.
So I wrote this code, where x is my string:
char h[]=new char [d];
for (int f=0;f<l;f++)
{
h[f]=x.charAt(f);
}
int ii=0;
while (ii<l)
{
if(h[ii]==' '||h[ii]==','||h[ii]=='-'||h[ii]=='\\'||h[ii]=='('||h[ii]==')'||h[ii]=='_'||h[ii]=='\''||h[ii]=='/'||h[ii]==';'||h[ii]=='!'||h[ii]=='*'||h[ii]=='.')
{
for(int m=ii;m<l-1;m++)
{
h[m]=h[m+1];
}
d=d-1;
ii--;
}
ii++;
}
Well, this works: it removes the special characters, but I can't list every exception in the condition. I wonder if there is something easier :)

As others have said Strings in Java are immutable.
One way to catch all characters you do not want is to only allow the ones you want:
final String input = "some string . ";
final StringBuffer sb = new StringBuffer();
final String permittedCharacters = "1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
for (char c : input.toCharArray()){
if (permittedCharacters.indexOf(c)>=0){
sb.append(c);
}
}
final String endString = sb.toString();
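Since the question asks for letters only, the whitelist idea can also be written without spelling out the alphabet. This is a sketch, not the answer's code: the class LettersOnly and both method names are made up here. keepLetters relies on Character.isLetter, which also accepts non-ASCII letters; keepAsciiLetters uses a regex and keeps A-Z and a-z only.

```java
final class LettersOnly {
    // Keeps every character that Character.isLetter accepts (includes non-ASCII letters).
    static String keepLetters(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetter(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Regex variant: removes everything that is not an ASCII letter.
    static String keepAsciiLetters(String input) {
        return input.replaceAll("[^A-Za-z]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepLetters("some string . "));   // somestring
        System.out.println(keepAsciiLetters("a-b_c!1"));     // abc
    }
}
```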

Short answer: no, because String is immutable. But you can use StringBuffer (or StringBuilder) instead. This class has a deleteCharAt(int) method that can be useful.
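A rough sketch of how deleteCharAt could be used for this, assuming StringBuilder rather than StringBuffer (both have the method). The class and method names are hypothetical. The loop walks backwards, which is my own choice, so that removing a character never shifts an index that still has to be visited.

```java
final class DeleteInPlace {
    // Removes every non-letter from a mutable copy of the input.
    static String stripNonLetters(String input) {
        StringBuilder sb = new StringBuilder(input);
        // Iterate from the end: deleting at i only moves characters after i.
        for (int i = sb.length() - 1; i >= 0; i--) {
            if (!Character.isLetter(sb.charAt(i))) {
                sb.deleteCharAt(i);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripNonLetters("a, b.c")); // abc
    }
}
```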

Java Get first character values for a string

I have inputs like
AS23456SDE
MFD324FR
I need to get First Character values like
AS, MFD
The prefix length is not fixed at two or three characters, and the input can change. I need to get the leading characters before the first number.
Thank you.
Edit : This is what I have tried.
public static String getPrefix(String serial) {
StringBuilder prefix = new StringBuilder();
for(char c : serial.toCharArray()){
if(Character.isDigit(c)){
break;
}
else{
prefix.append(c);
}
}
return prefix.toString();
}

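Here is a nice one-line solution. It uses a regex to match the leading non-numeric characters in the string, and then replaces the whole input with this match. The "A" is prepended so that the pattern still matches when the input starts with a digit; substring(1) strips it again.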
public String getFirstLetters(String input) {
return new String("A" + input).replaceAll("^([^\\d]+)(.*)$", "$1")
.substring(1);
}
System.out.println(getFirstLetters("AS23456SDE"));
System.out.println(getFirstLetters("1AS123"));
Output:
AS
(empty)

A simple solution could be like this:
public static void main (String[]args) {
String str = "MFD324FR";
char[] characters = str.toCharArray();
for(char c : characters){
if(Character.isDigit(c))
break;
else
System.out.print(c);
}
}

Use the following function to get the required output:
public String getFirstChars(String str){
int zeroAscii = '0'; int nineAscii = '9';
String result = "";
for (int i=0; i< str.length(); i++){
int ascii = str.charAt(i);
// stop at the first digit; everything before it is the prefix
if(ascii >= zeroAscii && ascii <= nineAscii){
return result;
}else{
result = result + str.charAt(i);
}
}
return str;
}
Pass your string as the argument.

I think this can be done with a simple regex that matches digits, together with Java's String.split function. This regex-based approach may be more efficient than methods that use more complicated regexes.
Something as below will work
String inp = "ABC345.";
String beginningChars = inp.split("[\\d]+",2)[0];
System.out.println(beginningChars); // only if you want to print.
The regex I used, "[\\d]+", is already escaped for Java.
What does it do?
It matches one or more digits (\d). By default, \d in Java matches only the ASCII digits 0-9; it matches digits from other Unicode scripts only when the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.
What does String beginningChars = inp.split("[\\d]+",2)[0] do?
It applies this regex and separates the string into string arrays where ever a match is found. The [0] at the end selects the first result from that array, since you wanted the starting chars.
What is the second parameter to .split(regex,int) which I supplied as 2?
This is the Limit parameter. This means that the regex will be applied on the string till 1 match is found. Once 1 match is found the string is not processed anymore.
From the Strings javadoc page:
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
This will be efficient if your string is huge.
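To make the effect of the limit argument concrete, here is a small sketch that wraps the split call in a method. The class PrefixBySplit and the method prefix are names I made up; the split call itself is the one from the answer, written as "\\d+" without the redundant brackets.

```java
final class PrefixBySplit {
    // Returns everything before the first run of digits.
    static String prefix(String serial) {
        // Limit 2 means the pattern is applied at most once,
        // so the array has at most two entries: the prefix and the rest.
        return serial.split("\\d+", 2)[0];
    }

    public static void main(String[] args) {
        System.out.println(prefix("AS23456SDE")); // AS
        System.out.println(prefix("MFD324FR"));   // MFD
        System.out.println(prefix("ABC"));        // ABC (no digit, nothing to split on)
    }
}
```

When the input starts with a digit, the first array entry is the empty string, so the method returns "" in that case.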
An alternative regex that spells out the ASCII digits explicitly:
"[0-9]+"

public static void main(String[] args) {
String testString = "MFD324FR";
int index = 0;
for (Character i : testString.toCharArray()) {
if (Character.isDigit(i))
break;
index++;
}
System.out.println(testString.substring(0, index));
}
This prints the characters that come before the first digit.

when finding whether a character is in a string or not error occurs

I am trying to use the contains method to find out whether a character is in a string.
I get the error :
method contains in class String cannot be applied to given types
if(str.contains(ch))
required:CharSequence
found:char
code :
str1=rs.getString(1);
int len=str1.length();
while(i<len)
{
char ch=str1.charAt(i);
if (str.contains(ch))
continue;
else
str=str+str1.charAt(i);
i++;
}

if ( str.indexOf( ch ) != -1 ) should work.
String.contains only accepts a CharSequence, but one Character is not a CharSequence. The way above works for Characters, too. Another way, as other people have posted (but I want to explain a little bit more), would be to make your single Character into a CharSequence, for example by creating a String...
String x = "" + ch; // implicit conversion
String y = Character.valueOf(ch).toString(); // explicit conversion
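For what the question's loop seems to be doing (collecting each character only once), the indexOf check could look roughly like this. It is a sketch under that assumption; the class UniqueChars and the method unique are hypothetical names, and the database call from the question is replaced by a plain String parameter.

```java
final class UniqueChars {
    // Builds a string containing each character of source once, in first-seen order.
    static String unique(String source) {
        String result = "";
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            // indexOf(int) accepts a char directly, so no conversion to String is needed.
            if (result.indexOf(ch) == -1) {
                result += ch;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(unique("banana")); // ban
        // contains() needs a CharSequence, so the char is converted first.
        System.out.println("abc".contains(String.valueOf('b'))); // true
    }
}
```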

This is because String has no overloaded contains() method that takes a char.
Use the String.contains() method with a CharSequence instead, like this:
String ch = "b";
str.contains(ch);
Here CharSequence is an interface. A CharSequence is a readable sequence of char values. This interface provides uniform, read-only access to many different kinds of char sequences.
Its known implementations in the JDK are CharBuffer, Segment, String, StringBuffer, and StringBuilder.

Following is the declaration for java.lang.String.contains() method
public boolean contains(CharSequence s)
So you have to convert the character to a String before passing it to the function:
Character ch = 'c';
str.contains(ch.toString());//converts ch to string and passes to the function
or
str.contains(ch+"");//converts ch to string and passes to the function
Corrected code
str1=rs.getString(1);
int len=str1.length();
while(i<len)
{
char ch=str1.charAt(i);
if (!str.contains(ch+""))//changed line
str=str+ch;
i++;//always advance; a continue before this line would loop forever
}

Generate new word from wildcard [duplicate]

I'm trying to generate a word from a wildcard pattern and check whether this word is stored in the dictionary database. For example, "appl*" should return apply or apple. However, the problem comes in when I have 2 wildcards: "app**" will make words like appaa, appbb..appzz... instead of apple. The second if condition is just for a regular string that contains no wildcard "*".
public static boolean printWords(String s) {
String tempString, tempChar;
if (s.contains("*")) {
for (char c = 'a'; c <= 'z'; c++) {
tempChar = Character.toString(c);
tempString = s.replace("*", tempChar);
if (myDictionary.containsKey(tempString) == true) {
System.out.println(tempString);
}
}
}
if (myDictionary.containsKey(s) == true) {
System.out.println(s);
return true;
} else {
return false;
}
}

You're only using a single for loop over characters, and replacing all instances of * with that character. See the API for String.replace here. So it's no surprise that you're getting strings like Appaa, Appbb, etc.
If you want to actually use regular expressions, then you shouldn't be doing any String.replace or contains, etc. See Anubian's answer for how to handle your problem.
If you're treating this as a String exercise and don't want to use regular expressions, the easiest way to do what you're actually trying to do (try all combinations of letters for each wildcard) is to do it recursively. If there are no wild cards left in the string, check if it is a word and if so print. If there are wild cards, try each replacement of that wildcard with a character, and recursively call the function on the created string.
public static void printWords(String s){
int firstAsterisk = s.indexOf("*");
if(firstAsterisk == -1){ // doesn't contain asterisk
if (myDictionary.containsKey(s))
System.out.println(s);
return;
}
for(char c = 'a'; c <= 'z'; c++){
String s2 = s.substring(0, firstAsterisk) + c + s.substring(firstAsterisk + 1);
printWords(s2);
}
}
The base case relies on the indexOf function: when indexOf returns -1, it means that the given substring (in our case "*") does not occur in the string, so there are no more wildcards to replace.
The substring part recreates the original string with the first asterisk replaced by a character. So supposing that s = "abcd**ef" and c='z', we know that firstAsterisk = 4 (Strings are 0-indexed, and index 4 holds the first "*"). Thus,
String s2 = s.substring(0, firstAsterisk) + c + s.substring(firstAsterisk + 1);
= "abcd" + 'z' + "*ef"
= "abcdz*ef"
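Here is the recursive idea as a self-contained sketch that can be run on its own. The class WildcardWords, the method expand, and the three-word DICTIONARY are all stand-ins I invented; in the question the lookup goes through myDictionary.containsKey. Results are collected into a list instead of printed so they are easier to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WildcardWords {
    // Stand-in for the question's myDictionary; the contents are made up.
    private static final Set<String> DICTIONARY =
            new HashSet<>(Arrays.asList("apple", "apply", "ample"));

    // Returns every dictionary word that the wildcard string can expand to.
    static List<String> expand(String s) {
        List<String> found = new ArrayList<>();
        collect(s, found);
        return found;
    }

    private static void collect(String s, List<String> found) {
        int firstAsterisk = s.indexOf('*');
        if (firstAsterisk == -1) {
            // Base case: no wildcard left, so s is a concrete candidate word.
            if (DICTIONARY.contains(s)) {
                found.add(s);
            }
            return;
        }
        // Replace only the first wildcard, then recurse on the rest.
        for (char c = 'a'; c <= 'z'; c++) {
            collect(s.substring(0, firstAsterisk) + c + s.substring(firstAsterisk + 1), found);
        }
    }

    public static void main(String[] args) {
        System.out.println(expand("app**")); // [apple, apply]
    }
}
```

Note that this tries 26 candidates per wildcard, so the number of lookups grows as 26 to the power of the wildcard count.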

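The * character is not a wildcard in regex (it is a quantifier), but once each * is replaced with ., which matches any single character, you can treat the input string as a regular expression:
String regex = s.replace("*", ".");
for (String word : myDictionary.keySet()) {
if (word.matches(regex)) {
System.out.println(word);
}
}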
Let the libraries do the heavy lifting for you ;)

With your approach you have to check all possible combinations.
A better way would be to build a regex from your input string by replacing every * with . (a dot, which matches any single character).
Then you can loop over your myDictionary and check whether each entry matches the regex.
Something like this:
Set<String> dict = new HashSet<String>();
dict.add("apple");
String word = "app**";
Pattern pattern = Pattern.compile(word.replace('*', '.'));
for (String entry : dict) {
if (pattern.matcher(entry).matches()) {
System.out.println("matches: " + entry);
}
}
Take care if your input string already contains a .: you then have to escape it with a \. (The same goes for other special regex characters.)
See also
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html and
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html
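One possible way to handle the escaping is to quote every literal character with Pattern.quote and turn only * into a dot. This is a sketch, with the class WildcardPattern and the method toPattern being hypothetical names; it assumes, as in the question, that each * stands for exactly one character.

```java
import java.util.regex.Pattern;

final class WildcardPattern {
    // Converts a wildcard string such as "app**" into a compiled regex.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append('.'); // one arbitrary character
            } else {
                // Pattern.quote makes any regex metacharacter match literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        System.out.println(toPattern("app**").matcher("apple").matches()); // true
        System.out.println(toPattern("a.c*").matcher("abcd").matches());   // false
    }
}
```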

Exception "String must not end with a space" in Java

Need to write a method signature for a method called wordCount() that takes a String parameter, and returns the number of words in that String.
For the purposes of this question, a ‘word’ is any sequence of characters; it does not have to be a real English word. Words are separated by spaces.
For example: wordCount("Java") should return the value 1.
I have written some code, but the problem is in throwing exceptions. I get errors saying: "a string containing must not end with a space in java" and "a string containing must not start with a space in java"
my try:
int wordCount(String s){
if (s==null) throw new NullPointerException ("string must not be null");
int counter=0;
for(int i=0; i<=s.length()-1; i++){
if(Character.isLetter(s.charAt(i))){
counter++;
for(;i<=s.length()-1;i++){
if(s.charAt(i)==' '){
counter++;
}
}
}
}
return counter;
}

You're on the right track with your exception handling, but not quite there (as you've noticed).
Try the code below:
public int wordCount(final String sentence) {
// If sentence is null, throw IllegalArgumentException.
if(sentence == null) {
throw new IllegalArgumentException("Sentence cannot be null.");
}
// If sentence is empty, throw IllegalArgumentException.
if(sentence.equals("")) {
throw new IllegalArgumentException("Sentence cannot be empty.");
}
// If sentence ends with a space, throw IllegalArgumentException. "$" matches the end of a String in regex.
if(sentence.matches(".* $")) {
throw new IllegalArgumentException("Sentence cannot end with a space.");
}
// If sentence starts with a space, throw IllegalArgumentException. "^" matches the start of a String in regex.
if(sentence.matches("^ .*")) {
throw new IllegalArgumentException("Sentence cannot start with a space.");
}
int wordCount = 0;
// Do wordcount operation...
return wordCount;
}
Regular Expressions (or "regex" to the cool kids in the know) are fantastic tools for String validation and searching. The method above practices fail-fast implementation, that is that the method will fail before performing expensive processing tasks that will just fail anyway.
I'd suggest brushing up on both topics covered here, regex and exception handling. Some excellent resources to help you get started are included below:
You Don’t Know Anything About Regular Expressions: A Complete Guide
Understanding Java Exceptions
Debuggex - A wonderful tool to help understand and debug regex

I would use the String.split() method. This takes a regular expression which returns a string array containing the substrings. It is easy enough from there to get and return the length of the array.
This sounds like homework so I will leave the specific regular expression to you: but it should be very short, perhaps even one character long.

I would use String.split() to handle this scenario. It should be simpler and less error-prone than the pasted code. Make sure you check for empty strings. This helps with sentences that contain consecutive spaces (e.g. "This_sentence_has__two_spaces", with each underscore standing for a space).
public int wordCount(final String sentence) {
int wordCount = 0;
String trimmedSentence = sentence.trim();
String[] words = trimmedSentence.split(" ");
for (int i = 0; i < words.length; i++) {
if (words[i] != null && !words[i].equals("")) {
wordCount++;
}
}
return wordCount;
}
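A shorter variant of the same idea is to split on runs of whitespace, so no empty entries appear in the first place. This is a sketch under two assumptions of mine: any whitespace run (not just a single space) separates words, and a blank sentence counts as zero words instead of raising an error. The class name WordCounter is made up.

```java
final class WordCounter {
    static int wordCount(String sentence) {
        if (sentence == null) {
            throw new IllegalArgumentException("Sentence cannot be null.");
        }
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            // Splitting an empty string yields one empty element, so handle it here.
            return 0;
        }
        // "\\s+" treats consecutive whitespace as a single separator.
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(wordCount("Java"));           // 1
        System.out.println(wordCount("  two   words ")); // 2
    }
}
```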

I would use the Splitter from the Google Guava library. It works more correctly, because the standard String.split() gives a surprising result even in this simple case:
// there are only two words, but between 'a' and 'b' are two spaces
System.out.println("a  b".split(" ").length);// prints '3' because it sees an
// empty string between the two spaces
With Guava you can do just this:
Iterables.size(Splitter.on(" ").trimResults().omitEmptyStrings().split("same  two_spaces"));// 2
