Word By Word Comparison

Does Java have a built-in library for doing something like the following?
For instance, I have two Strings:
String a = "Yeahh, I love Java programming.";
String b = "love";
I want to check whether String b, which contains "love", is one of the tokens in String a. Is there a Java API for this?
I want something like:
a.contains(b) <--------- returns true
Is there anything like that? If there is no Java API for it, I will write my own algorithm.
Thanks in advance for any help.

The suggestions so far (indexOf, contains) are all fine if you just want to find substrings. However, given the title of your question, I assume you actually want to find words. For instance, if asked whether "She wore black gloves" contained "love" my guess is you'd want the answer to be no.
Regular expressions are probably the best way forward here, using a word boundary around the word in question:
import java.util.regex.*;
public class Test
{
    public static void main(String[] args)
    {
        System.out.println(containsWord("I love Java", "love"));
        System.out.println(containsWord("She wore gloves", "love"));
        System.out.println(containsWord("start match", "start"));
        System.out.println(containsWord("match at end", "end"));
    }
    public static boolean containsWord(String input, String word)
    {
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return pattern.matcher(input).find();
    }
}
Output:
true
false
true
true

String strToCheck = "check me";
int firstOccurrence = strToCheck.indexOf("me");
// -1 if there is no occurrence

The method contains(CharSequence s) exists in the String class,
so your a.contains(b) will work.

You can use the indexOf() function in the String library to check for it:
if(a.indexOf(b)>=0)
return true;
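To make the substring behaviour of contains and indexOf concrete, here is a minimal sketch. The class name SubstringDemo and the helper containsSubstring are made up for illustration; only String.contains and String.indexOf come from the JDK. Note that both calls look for substrings, so "gloves" counts as containing "love".

```java
// Hypothetical demo class; the names are not from the thread.
class SubstringDemo {
    // True when needle occurs anywhere in haystack, even inside a longer word.
    static boolean containsSubstring(String haystack, String needle) {
        return haystack.indexOf(needle) >= 0; // indexOf returns -1 when absent
    }

    public static void main(String[] args) {
        String a = "Yeahh, I love Java programming.";
        System.out.println(a.contains("love"));   // true
        System.out.println(a.indexOf("love"));    // 9
        System.out.println(a.indexOf("Python"));  // -1
        // Substring search also matches inside other words:
        System.out.println(containsSubstring("She wore black gloves", "love")); // true
    }
}
```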

Yes, there is: http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#contains(java.lang.CharSequence).

If you want to find whole words, it is quite easy: you just have to use split.
//Split to an array using space as delimiter
String[] arrayOfWords = a.split(" ");
And then you compare each element like this:
"love".equalsIgnoreCase(arrayOfWords[i]);
Something like that; you get the idea.
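The split idea can be sketched as a complete method. This is only an illustration under assumptions: the class name SplitWordCheck is invented, and it splits on runs of non-word characters ("\\W+") instead of a single space, so that a token such as "Yeahh," loses its comma before the comparison.

```java
// Hypothetical helper; the class and method names are illustrative only.
class SplitWordCheck {
    // Returns true if word appears as a whole token of input, ignoring case.
    static boolean containsWord(String input, String word) {
        // "\\W+" splits on any run of characters outside [a-zA-Z0-9_]
        for (String token : input.split("\\W+")) {
            if (token.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, SplitWordCheck.containsWord("She wore black gloves", "love") is false, because "gloves" is compared as a whole token.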

Related

How can I find whether a string starts with 's', 'r' or 'p' in Java?

Example: this is my string,
String sample = "s5656";
If the first character of the string is 's', 'p' or 'r', I should remove that character; otherwise I have to return the original string.
Is there an optimized way to do that, such as a regex or StringUtils in Apache Commons?

Why do you want to add a third-party jar for such a simple requirement? You can try the following:
String sample = "s5656";
if(sample.startsWith("s")||sample.startsWith("r")||sample.startsWith("p")){
    // do necessary
}else{
    // do necessary
}
String#startsWith()
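As a sketch of what the two "do necessary" branches could look like, the method below removes the first character when it is 's', 'r' or 'p' and otherwise returns the input unchanged. The names PrefixStripper and stripLeadingSpr are hypothetical, and the null/empty guard is an assumption about the desired behaviour.

```java
// Hypothetical helper; not part of the JDK or of the original answer.
class PrefixStripper {
    static String stripLeadingSpr(String sample) {
        // Assumption: null and empty inputs are returned as they are.
        if (sample == null || sample.isEmpty()) {
            return sample;
        }
        char first = sample.charAt(0);
        if (first == 's' || first == 'r' || first == 'p') {
            return sample.substring(1); // drop only the first character
        }
        return sample;
    }
}
```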

A simple regex could solve your problem:
public static void main(String[] args) {
    String s = "s5656s";
    System.out.println(s.replaceFirst("^[spr]", "")); // removes a leading s, p or r
}
O/P:
5656s
PS: the regex gives shorter, simpler code here, but it is less efficient. Use Ruchira's answer for longer but more efficient code. :)

^(s|p|r)
Try this. Use yourString.replaceAll() or replaceFirst() with an empty replacement string. Use the m (multiline) flag.
See demo.
http://regex101.com/r/dZ1vT6/49

I would go for the replaceAll function with the multiline modifier (?m).
String s = "s5656s\n" +
"r878dsjhj\n" +
"fshghg";
System.out.println(s.replaceAll("(?m)^[spr]", ""));
Output:
5656s
878dsjhj
fshghg

How to properly use java Pattern object to match string patterns

I wrote code that does several string operations, including checking whether a given string matches a certain regular expression. It ran just fine with 70,000 inputs, but it started to give me an out-of-memory error when I ran it iteratively for five-fold cross-validation. It may just be that I have to assign more memory, but I have a feeling that I have written inefficient code, so I wanted to double-check that I didn't make any obvious mistake.
static Pattern numberPattern = Pattern.compile("^[a-zA-Z]*([0-9]+).*");
public static boolean someMethod(String line) {
    String[] tokens = line.split(" ");
    for(int i=0; i<tokens.length; i++) {
        tokens[i] = tokens[i].replace(",", "");
        tokens[i] = tokens[i].replace(";", "");
        if(numberPattern.matcher(tokens[i]).find()) return true;
    }
    return false;
}
I also have many lines like the one below:
token.matches("[a-z]+[A-Z][a-z]+");
Which way is more memory efficient? Do they look efficient enough? Any advice is appreciated!
Edited:
Sorry, I had posted wrong code, which I intended to fix before posting this question but forgot at the last minute. The problem is that I have many similar-looking operations all over; aside from the fact that the example code did not make sense, I wanted to know whether the regexp comparison part was efficient.
Thanks for all of your comments, I'll look through them and modify the code following the advice!

Well, first of all, take a second look at your code: it will always return true! You are not reading the 'match' variable, just assigning values to it.
Second, String is immutable, so each time you split or replace you create new instances. Why don't you try to create a pattern that makes the matches you want while ignoring the commas and semicolons? I'm not sure, but I think it will take less memory.

Yes, this code is indeed inefficient, because you can return immediately once you've found that match = true (there is no point in continuing to loop).
Further, are you sure you need to break the line into tokens? Why not check the regex only once?
And last, if all the checks fail, you should return false (last line).

Instead of altering the text and splitting it, you can put it all in the regex.
// the \\b means the match must begin at a word boundary (start of the String or of a word)
static Pattern numberPattern = Pattern.compile("\\b[a-zA-Z,;]*[0-9,;]*[0-9]");
// return true if the string contains
// a number which might have letters in front
public static boolean someMethod(String line) {
    return numberPattern.matcher(line).find();
}

Aside from what @alfasin has mentioned in his answer, you should avoid duplicating code. Rewrite the following:
{
    tokens[i] = tokens[i].replace(",", "");
    tokens[i] = tokens[i].replace(";", "");
}
Into:
tokens[i] = tokens[i].replaceAll(",|;", "");
And please do this once, before the .split(), so that the operation doesn't have to be repeated within the loop:
String[] tokens = line.replaceAll(",|;", "").split(" ");
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
Edit: After staring at your code for a bit, I think I have a better solution, using regex ;)
public static boolean someMethod(String line) {
    return Pattern.compile("\\b[a-zA-Z]*\\d")
            .matcher(line.replaceAll(",|;", "")).find();
}
\b is a word boundary.
It asserts a position at the boundary of a word (for example the start of a line, or just after whitespace).
Code Demo STDOUT:
foo does not match
bar does not match
bar1 does match
foo baz bar bar1 lolz does match
password_01 does not match
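On the question's many token.matches("[a-z]+[A-Z][a-z]+") lines: String.matches compiles its regex on every call, so one option is to compile the pattern once and reuse it, as the question already does with numberPattern. A minimal sketch follows; the names TokenShapes, LOWER_UPPER_LOWER and hasInnerCapital are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical holder class for a reusable, precompiled pattern.
class TokenShapes {
    // Compiled once when the class is loaded, then shared by every call.
    private static final Pattern LOWER_UPPER_LOWER =
            Pattern.compile("[a-z]+[A-Z][a-z]+");

    // Same result as token.matches("[a-z]+[A-Z][a-z]+"), without recompiling.
    static boolean hasInnerCapital(String token) {
        return LOWER_UPPER_LOWER.matcher(token).matches();
    }
}
```

Whether this alone would cure the out-of-memory error is not something the thread establishes; it only removes the repeated compilation.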

Java : Omitting a Word in a String

I have a String from which I need to omit a particular word.
As shown below, the String may contain the word "Baci" or "BACI".
I have written the sample program shown below, which works fine, but I want to know if there is a better way to do it.
public class Test {
    public static void main(String args[]) {
        String str = "Mar 14 Baci WIC";
        if(str!=null&&!str.isEmpty())
        {
            if(str.contains("Baci") || str.contains("BACI"))
            {
                str = str.replaceAll("(?i) Baci", "");
            }
        }
        System.out.println(str);
    }
}

I think a better way here is to skip the additional check for the existence of "Baci", i.e. to drop the following if check:
if(str.contains("Baci") || str.contains("BACI"))

You could improve it a little by using the \b regexp (which matches a "word boundary") :
str = str.replaceAll("(?i) Baci\\b", "");
That way, your code will not replace "my bacil is..." with "myl is..."
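Putting the suggestions together (no contains check, case-insensitive match, word boundaries), a sketch could look like the following. The names WordRemover and removeBaci are hypothetical, and the leading \\s* plus the final trim() are assumptions about how the surrounding whitespace should be handled.

```java
// Hypothetical helper; the names are illustrative only.
class WordRemover {
    static String removeBaci(String str) {
        if (str == null) {
            return null;
        }
        // (?i)  : ignore case, so Baci, BACI and baci all match
        // \\s*  : also remove whitespace directly before the word
        // \\b   : word boundaries, so "bacil" is left alone
        return str.replaceAll("(?i)\\s*\\bBaci\\b", "").trim();
    }
}
```

For the question's input, WordRemover.removeBaci("Mar 14 Baci WIC") gives "Mar 14 WIC".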

Your second if condition is unnecessary, since replaceAll() will replace zero or more occurrences of the String without error.

You can call .toUpperCase() on your String and then only check contains("BACI"). Inside the if block, just call replace twice, once with Baci and once with BACI.
Thinking about it again, I think it's better to just call replace twice without checking whether your String contains the word. If there is nothing to replace, nothing gets replaced.
Hope this is useful!

java strings with numbers

I have a group of strings in an ArrayList.
1) I want to remove all the strings that contain only numbers, and also strings like (0.75%) or $1.5; basically everything that does not contain any letters.
2) I want to remove all special characters in each string before I write it to the console.
"God should be printed as God
&quot;Including should be printed as quoteIncluding
'find should be printed as find

Java boasts a very nice Pattern class that makes use of regular expressions. You should definitely read up on that. A good reference guide is here.
I was going to post a coding solution for you, but styfle beat me to it! The only thing I was going to do differently was within the for loop, where I would have used the Pattern and Matcher classes, like this:
// compile once, outside the loop; finds any string containing at least one letter
Pattern p = Pattern.compile("[a-zA-Z]");
for(int i = 0; i < myArray.size(); i++){
    Matcher m = p.matcher(myArray.get(i));
    boolean match = m.find();
    //more code to get the string you want
}
But that is too bulky. styfle's solution is succinct and easy.

When you say "characters," I'm assuming you mean only "a through z" and "A through Z." You probably want to use Regular Expressions (Regex) as D1e mentioned in a comment. Here is an example using the replaceAll method.
import java.util.ArrayList;
public class Test {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>(5);
        list.add("\"God");
        list.add("&quot;Including");
        list.add("'find");
        list.add("24No3Numbers97");
        list.add("w0or5*d;");
        for (String s : list) {
            s = s.replaceAll("[^a-zA-Z]",""); //use whatever regex you wish
            System.out.println(s);
        }
    }
}
The output of this code is as follows:
God
quotIncluding
find
NoNumbers
word
The replaceAll method uses a regex pattern and replaces all the matches with the second parameter (in this case, the empty string).
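The example above prints every entry, including ones that end up empty. To also cover the first part of the question (dropping entries such as "(0.75%)" or "$1.5" that contain no letters), one possible sketch is shown below. The names LetterFilter and clean are invented, and "characters" is again assumed to mean a-z and A-Z only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and behaviour are assumptions for illustration.
class LetterFilter {
    // Strips everything except letters and drops entries that become empty.
    static List<String> clean(List<String> input) {
        List<String> result = new ArrayList<>();
        for (String s : input) {
            String lettersOnly = s.replaceAll("[^a-zA-Z]", "");
            if (!lettersOnly.isEmpty()) {
                result.add(lettersOnly);
            }
        }
        return result;
    }
}
```

Building a new list avoids modifying the original ArrayList while iterating over it.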

Dividing a string into substring in JAVA

For my project I need to divide a string into two parts.
Below is an example:
String searchFilter = "(first=sam*)(last=joy*)";
where searchFilter is a string.
I want to split the above string into two parts,
first=sam* and last=joy*
so that I can then split these again into first, sam*, last and joy* as my requirement demands.
I don't have much hands-on experience in Java. Can anyone help me achieve this? It would be very helpful.
Thanks in advance

The most flexible way is probably to do it with regular expressions:
import java.util.regex.*;
public class Test {
    public static void main(String[] args) {
        // Create a regular expression pattern
        Pattern spec = Pattern.compile("\\((.*?)=(.*?)\\)");
        // Get a matcher for the searchFilter
        String searchFilter = "(first=sam*)(last=joy*)";
        Matcher m = spec.matcher(searchFilter);
        // While a "abc=xyz" pattern can be found...
        while (m.find())
            // ...print "abc" equals "xyz"
            System.out.println("\""+m.group(1)+"\" equals \""+m.group(2)+"\"");
    }
}
Output:
"first" equals "sam*"
"last" equals "joy*"

Take a look at String.split(..) and String.substring(..), using them you should be able to achieve what you are looking for.

You can do this using split, substring, or StringTokenizer.

I have a small piece of code that will solve your problem:
// each character of the second argument is a delimiter
StringTokenizer st = new StringTokenizer(searchFilter, "()=");
while(st.hasMoreTokens()){
    System.out.println(st.nextToken());
}
It will give the result you want.

I think you can do it in a lot of different ways; it depends on you.
For regexp and other options, look at https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html.
Anyway, I suggest:
int separatorIndex = searchFilter.indexOf(")(");
String filterFirst = searchFilter.substring(1, separatorIndex);
// skip both characters of ")(" and drop the final ")"
String filterLast = searchFilter.substring(separatorIndex + 2, searchFilter.length() - 1);

This (untested snippet) could do it:
// replace() takes a literal, and ")" must be escaped in the split regex
String[] properties = searchFilter.replace("(", "").split("\\)");
for (String property : properties) {
    if (!property.equals("")) {
        String[] parts = property.split("=");
        // some method to store the filter properties
        storeKeyValue(parts[0], parts[1]);
    }
}
The idea behind it: first we get rid of the brackets, removing the opening brackets and using the closing brackets as split points for the filter properties. The resulting array is {"first=sam*","last=joy*"}; split drops trailing empty Strings, so the check for "" is only a safeguard. Then for each property we split again on "=" to get the key/value pairs.
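If the key/value pairs need to be kept rather than printed, the regex approach from the first answer can fill a map. This is a sketch under assumptions: FilterParser and parse are invented names, the pattern assumes keys contain no "=" or brackets, and a LinkedHashMap is chosen only to preserve the order of the filter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for filters shaped like "(key=value)(key=value)".
class FilterParser {
    // group(1): key without '=' or brackets; group(2): value without brackets
    private static final Pattern PAIR =
            Pattern.compile("\\(([^=()]*)=([^()]*)\\)");

    static Map<String, String> parse(String searchFilter) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(searchFilter);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }
}
```

For the question's input, parse("(first=sam*)(last=joy*)") maps "first" to "sam*" and "last" to "joy*".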
