Java Regular expression with variable string - java

I want to count the occurrences of every word in an ArrayList within a String. So far, I am able to do it with a for loop, where I store all the possible combinations and run each of them through a Matcher, i.e.
for(String temp_keywords: keywords){
final_keywords_list.add(" "+ temp_keywords+ " ");
final_keywords_list.add(" "+ temp_keywords+".");
final_keywords_list.add(" "+ temp_keywords+ ",");
final_keywords_list.add(" "+ temp_keywords+ "!");
final_keywords_list.add(" "+ temp_keywords+ "/");
final_keywords_list.add(" "+ temp_keywords+ "?");
}
for (String temp_keywords : final_keywords_list) {
String add_space = temp_keywords.toLowerCase();
p = Pattern.compile(add_space);
m = p.matcher(handler_string);
int count = 0;
while (m.find()) {
count += 1;
}
}
However, I want to remove the manual addition of the combinations and use a regex instead. I've seen regex examples with fixed words, but how do I add a variable string to the regex? Sorry, I am a beginner Java learner.

Is this what you need?
String inputString = ....
String[] keywords = ....
StringBuilder sb = new StringBuilder();
for(String keyword: keywords)
sb.append("(?<= )").append(keyword).append("(?=[ .,!/?])").append("|");
sb.setLength(sb.length() - 1); //Removes trailing "|". Assumes keywords.length > 0.
Pattern p = Pattern.compile(sb.toString());
Matcher m = p.matcher(inputString);
int count = 0;
while (m.find())
count++;
It creates a single regex, compiles it, and then counts the matches.
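One thing the snippet above does not cover is a keyword that contains regex metacharacters (for example "c++"). A minimal sketch of the same idea, assuming the keywords arrive as a List<String>: Pattern.quote makes each keyword match literally, and the two lookarounds are written once around the whole alternation. The class and method names (KeywordCounter, countKeywords) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class KeywordCounter {

    // Counts keyword occurrences that follow a space and are followed by
    // a space or one of . , ! / ? (the same combinations as in the question).
    static int countKeywords(String text, List<String> keywords) {
        if (keywords.isEmpty()) {
            return 0; // an empty alternation would match at every position
        }
        // Pattern.quote makes each keyword match literally, even "c++" or "a.b".
        String alternation = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern p = Pattern.compile(
                "(?<= )(?:" + alternation + ")(?=[ .,!/?])",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("cat", "dog");
        System.out.println(countKeywords("the cat sat. the dog, a Cat!", keywords)); // prints 3
    }
}
```

Because of the lookbehind for a space, a keyword at the very start of the text is not counted, which matches the behaviour of the original loop.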

Related

How to separate the last word from all the remaining words in Java?

What is the easiest way to get the last word and all the remaining words if the user enters multiple whitespace characters?
String listOfWords = "This is a sentence";
String[] b = listOfWords.split("\\s+");
String lastWord = b[b.length - 1];
I expect output like lastWord = sentence
and firstWords = This is a
String listOfWords = "This is a sentence";
String lastWord = listOfWords.replaceFirst("^((.*\\s+)?)(\\S+)\\s*$", "$3");
String firstWords = listOfWords.replaceFirst("^((.*\\s+)?)(\\S+)\\s*$", "$2").trim();
Identify the last word as (\\S+)\\s*$ : non-spaces possibly followed by spaces at the end ($).
Does not work when there is no word
Works when there is exactly one word
Works when there are spaces at the end
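The three cases above can also be handled with one Matcher pass instead of two replaceFirst calls. This is only a sketch under my own assumptions: the pattern is a variation of the one above (not the same regex), the class and method names are hypothetical, and "no word" is reported as two empty strings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastWordSplitter {

    // Optional leading whitespace, an optional "everything before" group,
    // the last word, and optional trailing whitespace.
    private static final Pattern LAST_WORD =
            Pattern.compile("^\\s*(?:(.*\\S)\\s+)?(\\S+)\\s*$");

    // Returns {firstWords, lastWord}; both are "" when the input has no word.
    static String[] split(String sentence) {
        Matcher m = LAST_WORD.matcher(sentence);
        if (!m.matches()) {
            return new String[] {"", ""};
        }
        // Group 1 is null when the sentence consists of a single word.
        String firstWords = m.group(1) == null ? "" : m.group(1);
        return new String[] {firstWords, m.group(2)};
    }

    public static void main(String[] args) {
        String[] parts = split("This is a sentence  ");
        System.out.println("firstWords = " + parts[0]);
        System.out.println("lastWord = " + parts[1]);
    }
}
```

Whitespace between the first words is kept as typed; only the run in front of the last word and any leading or trailing whitespace is dropped.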
Here is a quick fix for you. Check the following code.
Input :
This is a sentence
Output :
First Words :This is a
Last Words :sentence
String test = "This is a sentence";
String first = test.substring(0, test.lastIndexOf(" "));
String last = test.substring(test.lastIndexOf(" ") + 1);
System.out.println("First Words :" + first);
System.out.print("Last Words :" + last);
Hope this solution works.
To add one more answer using regex to split the sentence at the last space:
String listOfWords = "This is a sentence";
String[] splited = listOfWords.split("\\s(?=[^\\s]+$)");
System.out.println(Arrays.toString(splited));
//output [This is a, sentence]
I am not saying it's a good solution, but you can get the result this way:
public static void main(String[] args) {
String listOfWords = " This is a sentence ";
listOfWords = listOfWords.trim().replaceAll("\\s+", " ");
String[] b = listOfWords.split("\\s+");
String lastWord = b[b.length - 1];
String firstWord = listOfWords.substring(0, listOfWords.length() - lastWord.length());
System.out.println(lastWord.trim());
System.out.println(firstWord.trim());
}
You can use
System.arraycopy(Object[] src, int srcStartIndex, Object[] dest, int dstStartIndex, int lengthOfCopiedIndices);
Please check this:
String listOfWords = "This is a sentence";
String[] b = listOfWords.split("\\s+");
String lastWord = b[b.length - 1];
String[] others = Arrays.copyOfRange(b, 0, b.length - 1);
//You can test with this
for(int i=0;i< others.length;i++){
System.out.println(others[i]);
}
String listOfWords = "This is a sentence";
String first=listOfWords.substring(0,listOfWords.lastIndexOf(' '));
String last=listOfWords.substring(listOfWords.lastIndexOf(' ')+1);
Hope this might help you.
You can use a regular expression for an exact match:
String listOfWords = "This is a sentence";
Pattern r = Pattern.compile("^(.+?)(\\s+)([^\\s]+?)$");
Matcher m = r.matcher(listOfWords);
while(m.find()){
System.out.println("Last word : "+ m.group(3));
System.out.println("Remaining words : "+ m.group(1));
}
Where the pattern "^(.+?)(\s+)([^\s]+?)$" works as follows:
^(.+?) - match as few characters as possible, spaces included, from the start (^) of the sentence
(\s+) - match one or more whitespace characters
([^\s]+?)$ - match the last word, i.e. the non-whitespace characters that run to the end ($)
Output:
Last word : sentence
Remaining words : This is a
One way I can think of is:
Trim the sentence using String#trim.
Using the String#lastIndexOf, find the position of the last whitespace in the sentence.
Split the substring until the last whitespace using \\s+ and join the resulting array using String#join.
Demo:
public class Main {
public static void main(String args[]) {
String sentence = " This is a sentence";
sentence = sentence.trim();
int index = sentence.lastIndexOf(" ");
if (index != -1) {
String allButLastWord = String.join(" ", sentence.substring(0, index).split("\\s+"));
System.out.println("First words: " + allButLastWord);
System.out.println("Last word: " + sentence.substring(index + 1));
} else {
System.out.println("Last word: " + sentence);
}
}
}
Output:
First words: This is a
Last word: sentence

How can I move the punctuation from the end of a string to the beginning?

I am attempting to write a program that reverses a string's order, even the punctuation. But when my backwards string prints, the punctuation mark at the end of the last word stays at the end of the word instead of being treated as an individual character.
How can I split the end punctuation mark from the last word so I can move it around?
For example:
When I type in : Hello my name is jason!
I want: !jason is name my Hello
instead I get: jason! is name my Hello
import java.util.*;
class Ideone
{
public static void main(String[] args) {
Scanner userInput = new Scanner(System.in);
System.out.print("Enter a sentence: ");
String input = userInput.nextLine();
String[] sentence= input.split(" ");
String backwards = "";
for (int i = sentence.length - 1; i >= 0; i--) {
backwards += sentence[i] + " ";
}
System.out.print(input + "\n");
System.out.print(backwards);
}
}
Manually rearranging Strings tends to become complicated in no time. It's usually better (if possible) to code what you want to do, not how you want to do it.
String input = "Hello my name is jason! Nice to meet you. What's your name?";
// this is *what* you want to do, part 1:
// split the input at each ' ', '.', '?' and '!', keep delimiter tokens
StringTokenizer st = new StringTokenizer(input, " .?!", true);
StringBuilder sb = new StringBuilder();
while(st.hasMoreTokens()) {
String token = st.nextToken();
// *what* you want to do, part 2:
// add each token to the start of the string
sb.insert(0, token);
}
String backwards = sb.toString();
System.out.print(input + "\n");
System.out.print(backwards);
Output:
Hello my name is jason! Nice to meet you. What's your name?
?name your What's .you meet to Nice !jason is name my Hello
This will be a lot easier to understand for the next person working on that piece of code, or your future self.
This assumes that you want to move every punctuation char. If you only want the one at the end of the input string, you'd have to cut it off the input, do the reordering, and finally place it at the start of the string:
String punctuation = "";
String input = "Hello my name is jason! Nice to meet you. What's your name?";
System.out.print(input + "\n");
if(input.substring(input.length() -1).matches("[.!?]")) {
punctuation = input.substring(input.length() -1);
input = input.substring(0, input.length() -1);
}
StringTokenizer st = new StringTokenizer(input, " ", true);
StringBuilder sb = new StringBuilder();
while(st.hasMoreTokens()) {
sb.insert(0, st.nextToken());
}
sb.insert(0, punctuation);
System.out.print(sb);
Output:
Hello my name is jason! Nice to meet you. What's your name?
?name your What's you. meet to Nice jason! is name my Hello
As in the other answers, you need to separate out the punctuation first, then reorder the words, and finally place the punctuation at the beginning.
You could take advantage of String.join() and Collections.reverse(), String.endsWith() for a simpler answer...
String input = "Hello my name is jason!";
String punctuation = "";
if (input.endsWith("?") || input.endsWith("!")) {
punctuation = input.substring(input.length() - 1, input.length());
input = input.substring(0, input.length() - 1);
}
List<String> words = Arrays.asList(input.split(" "));
Collections.reverse(words);
String reordered = punctuation + String.join(" ", words);
System.out.println(reordered);
The below code should work for you
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ReplaceSample {
public static void main(String[] args) {
String originalString = "TestStr?";
String updatedString = "";
String regex = "end\\p{Punct}+|\\p{Punct}+$";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(originalString);
while (matcher.find()) {
int start = matcher.start();
updatedString = matcher.group() + originalString.substring(0, start);
}
System.out.println("Original -->" + originalString + "\nReplaced -->" + updatedString);
}
}
You need to follow the below steps:
(1) Check for the ! character in the input
(2) If input contains ! then prefix it to the empty output string variable
(3) If input does not contain ! then create empty output string variable
(4) Split the input string and iterate in reverse order (you are already doing this)
You can refer to the code below:
public static void main(String[] args) {
Scanner userInput = new Scanner(System.in);
System.out.print("Enter a sentence: ");
String originalInput = userInput.nextLine();
String backwards = "";
String input = originalInput;
//Define your punctuation chars in an array
char[] punctuationChars = {'!', '?' , '.'};
//Remove the trailing punctuation char from the input
for(int i=0;i<punctuationChars.length;i++) {
if(input.charAt(input.length()-1) == punctuationChars[i]) {
input = input.substring(0, input.length()-1);
backwards = punctuationChars[i]+"";
break;
}
}
String[] sentence= input.split(" ");
for (int i = sentence.length - 1; i >= 0; i--) {
backwards += sentence[i] + " ";
}
System.out.print(originalInput + "\n");
System.out.print(input + "\n");
System.out.print(backwards);
}
Don't split by spaces; split by word boundaries. Then you don't need to care about punctuation or even putting spaces back, because you just reverse them too!
And it's only 1 line:
Arrays.stream(input.split("\\b"))
.reduce((a, b) -> b + a)
.ifPresent(System.out::println);
See live demo.
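To see why this works, it helps to spell the one-liner out as a loop. A rough sketch (the class and method names are mine, not from the answer): split("\\b") turns words, spaces, and punctuation runs into separate tokens, and the loop joins them back to front.

```java
class WordBoundaryReverse {

    // Splits at every word boundary, so words, spaces, and punctuation runs
    // each become their own token, then joins the tokens in reverse order.
    static String reverseTokens(String input) {
        String[] tokens = input.split("\\b");
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = tokens.length - 1; i >= 0; i--) {
            sb.append(tokens[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseTokens("Hello my name is jason!"));
        // An apostrophe also sits between two word boundaries,
        // so a contraction such as "What's" is split into three tokens.
        System.out.println(reverseTokens("What's up?"));
    }
}
```

One caveat of splitting on \b: contractions come out reversed piece by piece ("What's" becomes "s'What"), so this fits best when the input has no apostrophes or hyphens inside words.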

How to prepend "\n" to the last word of String?

I want to prepend "\n" to the last word of the string
for example
Hello friends 123
Here I want to add "\n" just before the word "123"
I tried the code below but have no idea what to do next
String sentence = "I am Mahesh 123";
String[] parts = sentence.split(" ");
String lastWord = "\n" + parts[parts.length - 1];
Try this
String sentence = "Hello friends 123456";
String[] parts = sentence.split(" ");
parts[parts.length - 1] = "\n" + parts[parts.length - 1];
StringBuilder builder = new StringBuilder();
for (String part : parts) {
builder.append(part);
builder.append(" ");
}
System.out.println(builder.toString());
Output will be:
Hello friends
123456
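Try the code below. A Java array cannot grow in place (parts[parts.length] is out of bounds), so make a copy that is one element longer first:
parts = java.util.Arrays.copyOf(parts, parts.length + 1);
parts[parts.length - 1] = parts[parts.length - 2];
parts[parts.length - 2] = "\n";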
Please try this.
String sentence = "I am Mahesh 123";
String[] parts = sentence.split(" ");
String string="";
for (int i =0;i<parts.length;i++)
{
if (i==parts.length-1)
{
string = string+"\n"+parts[i];
}
else
string = string + (i == 0 ? "" : " ") + parts[i]; // no leading space before the first word
}
Toast.makeText(Help.this, string, Toast.LENGTH_SHORT).show();
You want to add a line break before the last word of your string.
You can find the last space via lastIndexOf(); this gives you the index of that space in the String sentence.
You can use this small example here:
public class Main {
public static void main(String[] args) {
String sentence = "I am Mahesh 123";
int locationOfLastSpace = sentence.lastIndexOf(' ');
String result = sentence.substring(0, locationOfLastSpace) //before the last word
+ "\n"
+ sentence.substring(locationOfLastSpace).trim(); //the last word, trim just removes the spaces
System.out.println(result);
}
}
Note that StringBuilder is not used explicitly because the compiler already generates efficient code (a StringBuilder in older Java versions) for a single concatenation expression like this one.
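The lastIndexOf version above throws a StringIndexOutOfBoundsException when the sentence contains no space at all. As a hedged alternative (a different technique, not what the answer uses), a single replaceFirst can swap the whitespace in front of the last word for "\n" and leave a one-word sentence untouched. The class and method names are hypothetical.

```java
class NewlineBeforeLastWord {

    // Replaces the whitespace run in front of the last word with "\n" and
    // drops trailing whitespace. A sentence that has no whitespace in front
    // of its last word does not match and is returned unchanged.
    static String breakBeforeLastWord(String sentence) {
        return sentence.replaceFirst("\\s+(\\S+)\\s*$", "\n$1");
    }

    public static void main(String[] args) {
        System.out.println(breakBeforeLastWord("I am Mahesh 123"));
    }
}
```

In the replacement string, "\n" is a real newline character and $1 refers to the captured last word.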

java regex: capitalize words with certain number of characters

I am trying to capitalize the words in a string that have more than 5 characters.
I was able to retrieve the number of words that are longer than 5 characters using .length, and I could exclude the words that were longer than 5 characters, but I couldn't capitalize them.
Ex. input: "i love eating pie"
Ex. output: "i love Eating pie"
Here's my code:
public static void main(String[] args) {
String sentence = "";
Scanner input = new Scanner(System.in);
System.out.println("Enter a sentence: ");
sentence = input.nextLine();
String[] myString = sentence.split("\\s\\w{6,}".toUpperCase());
for (String myStrings : myString) {
System.out.println(sentence);
System.out.println(myStrings);
}
}
String sentence = "";
StringBuilder sb = new StringBuilder(sentence.length());
Scanner input = new Scanner(System.in);
System.out.println("Enter a sentence: ");
sentence = input.nextLine();
/*
* \\s (match whitespace character)
* (?<g1> (named group with name g1)
* \\w{6,}) (match a word of 6 or more characters) (end of g1)
* | (or)
* (?<g2> (named group with name g2)
* \\S+) (match any non-whitespace characters) (end of g2)
*/
Pattern pattern = Pattern.compile("\\s(?<g1>\\w{6,})|(?<g2>\\S+)");
Matcher matcher = pattern.matcher(sentence);
//check if the matcher found a match
while (matcher.find())
{
//get value from g1 group (null if not found)
String g1 = matcher.group("g1");
//get value from g2 group (null if not found)
String g2 = matcher.group("g2");
//if g1 is not null and is not an empty string
if (g1 != null && g1.length() > 0)
{
//get the first character of this word and uppercase it, then append it to the StringBuilder
sb.append(Character.toUpperCase(g1.charAt(0)));
//sanity check to stop us from getting IndexOutOfBoundsException
//check if g1 length is more than 1 and append the rest of the word to the StringBuilder
if(g1.length() > 1) sb.append(g1.substring(1, g1.length()));
//append a space
sb.append(" ");
}
//we only need to check if g2 is not null here
if (g2 != null)
{
//g2 did not qualify for capitalization (normally 5 characters or fewer), so just append it to the StringBuilder
sb.append(g2);
//append a space
sb.append(" ");
}
}
System.out.println("Original Sentence: " + sentence);
System.out.println("Modified Sentence: " + sb.toString());
Split the input sentence with a space as the delimiter and use the intiCap method if a word's length is greater than 5:
PS: System.out.print should be replaced with a StringBuilder.
String delim = " ";
String[] myString = sentence.split(delim);
for (int i = 0; i < myString.length; i++) {
if (i != 0) System.out.print(delim);
if (myString[i].length() > 5)
System.out.print(intiCap(myString[i]));
else
System.out.print(myString[i]);
}
private static String intiCap(String string) {
return Character.toUpperCase(string.charAt(0)) + string.substring(1);
}
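Following up on the PS above, here is one way the same split-and-capitalize idea could return a String instead of printing piece by piece. It is a sketch, not the answerer's code: it uses a stream with Collectors.joining rather than a StringBuilder, and the names LongWordCapitalizer, initCap, and capitalizeLongWords are my own.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class LongWordCapitalizer {

    // Upper-cases the first character of a non-empty word.
    static String initCap(String word) {
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    // Capitalizes every space-separated word longer than 5 characters
    // and joins the words back together with single spaces.
    static String capitalizeLongWords(String sentence) {
        return Arrays.stream(sentence.split(" "))
                .map(w -> w.length() > 5 ? initCap(w) : w)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(capitalizeLongWords("i love eating pie")); // prints "i love Eating pie"
    }
}
```

The length check also protects initCap from empty tokens, which split(" ") produces when the input has two spaces in a row.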
You can use the following (short and sweet :P):
Pattern p = Pattern.compile("(?=\\b\\w{6,})([a-z])\\w+");
Matcher m = p.matcher(sentence);
StringBuffer s = new StringBuffer();
while (m.find()){
m.appendReplacement(s, m.group(1).toUpperCase() + m.group(0).substring(1));
}
m.appendTail(s); // copy the text after the last match, otherwise it is dropped
System.out.println(s.toString());
See Ideone Demo

Counting number of time the articles "a","an" are being used in a text file

I'm trying to make a program that counts the number of words, lines, and sentences, and also the number of the articles 'a', 'and', 'the'.
So far I have the words, lines, and sentences. But I have no idea how I am going to count the articles. How can a program make the difference between 'a' and 'and'.
This is my code so far.
public static void main(String[]args) throws FileNotFoundException, IOException
{
FileInputStream file= new FileInputStream("C:\\Users\\nlstudent\\Downloads\\text.txt");
Scanner sfile = new Scanner(new File("C:\\Users\\nlstudent\\Downloads\\text.txt"));
int ch,sentence=0,words = 0,chars = 0,lines = 0;
while((ch=file.read())!=-1)
{
if(ch=='?'||ch=='!'|| ch=='.')
sentence++;
}
while(sfile.hasNextLine()) {
lines++;
String line = sfile.nextLine();
chars += line.length();
words += new StringTokenizer(line, " ,").countTokens();
}
System.out.println("Number of words: " + words);
System.out.println("Number of sentence: " + sentence);
System.out.println("Number of lines: " + lines);
System.out.println("Number of characters: " + chars);
}
}
How can a program make the difference between 'a' and 'and'.
You can use regex for this:
String input = "A and Andy then the are a";
Matcher m = Pattern.compile("(?i)\\b((a)|(an)|(and)|(the))\\b").matcher(input);
int count = 0;
while(m.find()){
count++;
}
//count == 4
'\b' is a word boundary, '|' is OR, and '(?i)' is the ignore-case flag. The full list of pattern constructs is in the java.util.regex.Pattern documentation, and it is worth learning about regex.
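If you need a separate total for each article rather than one combined count, the same word-boundary regex can feed a map. A minimal sketch, assuming the words to count are passed in as arguments; ArticleCounter and count are names I made up.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArticleCounter {

    // Counts whole-word, case-insensitive occurrences of each given word.
    static Map<String, Integer> count(String text, String... words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        StringBuilder alternation = new StringBuilder();
        for (String word : words) {
            String key = word.toLowerCase(Locale.ROOT);
            counts.put(key, 0); // every requested word appears in the result
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(key));
        }
        if (counts.isEmpty()) {
            return counts; // nothing to look for
        }
        Matcher m = Pattern
                .compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE)
                .matcher(text);
        while (m.find()) {
            counts.merge(m.group().toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("A and Andy then the are a", "a", "and", "the"));
    }
}
```

The \b on both sides is what keeps "a" from matching inside "and" or "Andy", and "the" from matching inside "then".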
The tokenizer will split each line into tokens. You can evaluate each token (a whole word) to see if it matches a string you expect. Here is an example that counts a, and, for, and the.
int a = 0, and = 0, the = 0, forCount = 0;
while (sfile.hasNextLine()) {
lines++;
String line = sfile.nextLine();
chars += line.length();
StringTokenizer tokenizer = new StringTokenizer(line, " ,");
words += tokenizer.countTokens();
while (tokenizer.hasMoreTokens()) {
String element = (String) tokenizer.nextElement();
if ("a".equals(element)) {
a++;
} else if ("and".equals(element)) {
and++;
} else if ("for".equals(element)) {
forCount++;
} else if ("the".equals(element)) {
the++;
}
}
}
