"Find Word Count" - My code doesn't work properly - java

"Find Word Count"- Instructions:
Given an input string (assume it's essentially a paragraph of text) and a
word to find, return the number of times the word is found in the input string.
The search should be case-insensitive and should ignore spaces, commas, full stops, quotes, tabs, etc. when matching the word.
=======================
My code doesn't work properly.
`
String input = " It can hardly be a coincidence that no language on" +
" Earth has ever produced the expression as pretty as an airport." +
" Airports are ugly. Some are very ugly. Some attain a degree of ugliness" +
" that can only be the result of a special effort. This ugliness arises " +
"because airports are full of people who are tired, cross, and have just " +
"discovered that their luggage has landed in Murmansk (Murmansk airport " +
"is the only known exception to this otherwise infallible rule), and architects" +
" have on the whole tried to reflect this in their designs. They have sought" +
" to highlight the tiredness and crossness motif with brutal shapes and nerve" +
" jangling colors, to make effortless the business of separating the traveller" +
" for ever from his or her luggage or loved ones, to confuse the traveller with" +
" arrows that appear to point at the windows, distant tie racks, or the current " +
"position of Ursa Minor in the night sky, and wherever possible to expose the " +
"plumbing on the grounds that it is functional, and conceal the location of the" +
"departure gates, presumably on the grounds that they are not.";
input = input.toLowerCase();
String whichWord = "be";
whichWord = whichWord.toLowerCase();
int lastIndex = 0;
int count = 0;
while(lastIndex != -1){
lastIndex = input.indexOf(whichWord,lastIndex);
if(lastIndex != -1){
count ++;
lastIndex += whichWord.length();
}
}
System.out.println(count);
`

Your code does not check for a complete word, so it matches both 'be' and 'because': indexOf counts every substring that contains 'be'. Try the regex-based solution below, which only counts whole words:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class WordCount {
public static void main(String[] args) {
String input = " It can hardly be a coincidence that no language on" +
" Earth has ever produced the expression as pretty as an airport." +
" Airports are ugly. Some are very ugly. Some attain a degree of ugliness" +
" that can only be the result of a special effort. This ugliness arises " +
"because airports are full of people who are tired, cross, and have just " +
"discovered that their luggage has landed in Murmansk (Murmansk airport " +
"is the only known exception to this otherwise infallible rule), and architects" +
" have on the whole tried to reflect this in their designs. They have sought" +
" to highlight the tiredness and crossness motif with brutal shapes and nerve" +
" jangling colors, to make effortless the business of separating the traveller" +
" for ever from his or her luggage or loved ones, to confuse the traveller with" +
" arrows that appear to point at the windows, distant tie racks, or the current " +
"position of Ursa Minor in the night sky, and wherever possible to expose the " +
"plumbing on the grounds that it is functional, and conceal the location of the" +
"departure gates, presumably on the grounds that they are not.";
input = input.toLowerCase();
String whichWord = "be";
whichWord = whichWord.toLowerCase();
int count = 0;
// \b is a zero-width word boundary, so adjacent matches such as "be be" are both counted
String regex = "\\b" + Pattern.quote(whichWord) + "\\b";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(input);
while(matcher.find()) {
count++;
}
System.out.println(count);
}
}
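As a hedged sketch of the same whole-word idea packaged as a reusable method: the class name WholeWordCounter and the method countWord are made up for illustration, and the sketch assumes that "word" means a run of characters delimited by regex word boundaries (\b).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class WholeWordCounter {
    // Counts whole-word, case-insensitive occurrences of word in text.
    static int countWord(String text, String word) {
        // Pattern.quote treats the word literally, even if it contains regex metacharacters.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "It can hardly be a coincidence, because it can only be the result.";
        System.out.println(countWord(text, "be"));
    }
}
```

Because \b consumes no characters, punctuation and whitespace around the word never need to be stripped first, and "because" is not counted.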

Related

Is there an alternative way to generate quote marks without using the escape sequence?

I am trying to get my output to display double quotation marks around the abbreviations and around their expansions. However, my class has not covered escape sequences yet, so I was wondering whether there is another way to accomplish this. The workbook does not accept my answer when I use an escape sequence.
I have tried an escape sequence and two single quotes ('' '') but neither worked. Perhaps I am missing something, as I am fairly new to the Java language. I am just trying to learn the most efficient way from a fundamental standpoint.
import java.util.Scanner;
public class TextMsgExpander {
public static void main(String[] args) {
Scanner scnr = new Scanner(System.in);
String txtMsg;
String BFF = "best friend forever";
String IDK = "I don't know";
String JK = "just kidding";
String TMI = "too much information";
String TTYL = "talk to you later";
System.out.println("Enter text: ");
txtMsg = scnr.nextLine();
System.out.println("You entered: " + txtMsg);
System.out.println();
if(txtMsg.contains("BFF")) {
txtMsg = txtMsg.replace("BFF", BFF);
System.out.println("Replaced BFF with " + BFF);
} // above line is where I tried escape sequence
if(txtMsg.contains("IDK")) {
txtMsg = txtMsg.replace("IDK", IDK);
System.out.println("Replaced IDK with " + IDK);
}
if(txtMsg.contains("JK")) {
txtMsg = txtMsg.replace("JK", JK);
System.out.println("Replaced JK with " + JK);
}
System.out.println();
System.out.println("Expanded: " + txtMsg);
return;
}
}
Your output
Enter text:
You entered: IDK how that happened. TTYL.
Replaced IDK with I don't know
Replaced TTYL with talk to you later
Expanded: I don't know how that happened. talk to you later.
Expected output
Enter text:
You entered: IDK how that happened. TTYL.
Replaced "IDK" with "I don't know".
Replaced "TTYL" with "talk to you later".
Expanded: I don't know how that happened. talk to you later.
Have you tried this:
\"example text\"
So you would have something like this:
System.out.println("Replaced \"BFF\" with " + "\"" + BFF + "\"");
or
System.out.println("Replaced \"BFF\" with \"" + BFF + "\"");
Normally it should work with escape characters.
Have you tried something like this:
System.out.println("\"These two backslashes are removed when I am printed\"");
I tested it and it worked for me.
If you cannot use \ escape sequences, for whatever reason, you can use the fact that an ' apostrophe doesn't need to be escaped in a "xx" string literal, and that a " double-quote doesn't need to be escaped in a 'x' character literal.
E.g. to print Replacing "foo" with 'bar' was easy, where foo and bar come from variables, you can do this:
String s = "Replacing " + '"' + foo + '"' + " with '" + bar + "' was easy";
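Applied to the text-message program from the question, that character-literal trick could look roughly like the sketch below. The class QuoteWithoutEscapes and the helper quoted are hypothetical names introduced only for this example.

```java
// Hypothetical helper, not part of the original program.
class QuoteWithoutEscapes {
    // Wraps text in double quotes using the char literal '"', which needs no backslash.
    static String quoted(String text) {
        // char + String is string concatenation, so the result is a String.
        return '"' + text + '"';
    }

    public static void main(String[] args) {
        String BFF = "best friend forever";
        System.out.println("Replaced " + quoted("BFF") + " with " + quoted(BFF) + ".");
    }
}
```

Keeping the quoting in one small method means each println stays readable and no backslash appears anywhere in the source.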

Java and if NULL

I have never done anything in Java before, so I really am a newbie, but while building my program I have run into a snag that I just can't figure out. I will try to explain and show it to the best of my ability.
Here is what I am building
The UI
Here is the code I have so far to make it work.
private void jButtonGenerateActionPerformed(java.awt.event.ActionEvent evt) {
String ObjectName = jtObjectName.getText();
String ObjectBase = jcbBaseNPC.getSelectedItem().toString();
String NPCName = jtNPCName.getText();
String MinLevel = jtMinLevel.getText();
if (MinLevel != null && MinLevel.isEmpty())
MinLevel = MinLevel.replace(MinLevel, "minLevel" + MinLevel);{
}
//Alignment Combo Box Start
String Alignment = jcbAlignment.getSelectedItem().toString();
if (Alignment.contains("Good")) {
Alignment = Alignment.replace("Good", "255");
}
if (Alignment.contains("Neutral")) {
Alignment = Alignment.replace("Neutral", "127");
}
if (Alignment.contains("Evil")) {
Alignment = Alignment.replace("Evil", "0");
}
//Alignment Combo Box End
// Print to Output Box
jaOutput.append("object" + " " + ObjectName + " " + "of" + " " + ObjectBase +
"\n\tproperties" + "\n\tname" +" " + "\"" + NPCName + "\"" + MinLevel
);
What I cannot figure out is how to check whether something has been entered in a String and, if so, add a prefix to it so the output looks like this:
minLevel 1100
maxLevel 1500
The only thing I will be entering is the numbers, so I need to add something like
minLevel + MinLevel
and if the field is empty, skip it altogether. If I add it to the append call and the field is empty, I will just get minLevel, and I can't have it like that.
Any tips would be great.
Thank you all
Donald
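One possible approach, offered as a sketch rather than a definitive fix: the if statement in the code above tests MinLevel.isEmpty(), which is true only when nothing was entered, so the condition is inverted, and the stray ;{ } after the replace call does nothing. A small helper can build the whole line only when the field has text. The names OptionalProperty and propertyLine are hypothetical, and the "\n\t" prefix is assumed from the format of the append call.

```java
// Hypothetical helper, not part of the original program.
class OptionalProperty {
    // Returns "\n\t<key> <value>" when value contains text, otherwise an empty string.
    static String propertyLine(String key, String value) {
        if (value == null || value.trim().isEmpty()) {
            return ""; // nothing entered: contribute nothing to the output
        }
        return "\n\t" + key + " " + value.trim();
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder("object guard of npc\n\tproperties");
        out.append(propertyLine("minLevel", "1100"));
        out.append(propertyLine("maxLevel", "")); // skipped because the field is empty
        System.out.println(out);
    }
}
```

In the button handler, the result of propertyLine("minLevel", jtMinLevel.getText()) could then be appended in place of MinLevel.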

How do i parse a string to get specific information using java?

Here are some lines from a file and I'm not sure how to parse it to extract 4 pieces of information.
11::American President, The (1995)::Comedy|Drama|Romance
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
13::Balto (1995)::Animation|Children's
14::Nixon (1995)::Drama
I would like to get the number, title, release date and genre.
A line can have multiple genres, so I would like to save each one in a variable as well.
I'm using the .split("::|\\|") method to parse it, but I'm not able to parse out the release date.
Can anyone help me?
The easiest way would be matching with a regex, something like this:
String x = "11::Title (2016)::Category";
Pattern p = Pattern.compile("^([0-9]+)::([a-zA-Z ]+)\\(([0-9]{4})\\)::([a-zA-Z]+)$");
Matcher m = p.matcher(x);
if (m.find()) {
System.out.println("Number: " + m.group(1) + " Title: " + m.group(2) + " Year: " + m.group(3) + " Categories: " + m.group(4));
}
(Please don't hold me to the exact syntax; this is off the top of my head.)
The first capture group is the number, the second is the title, the third is the year, and the fourth is the set of categories, which you can then split on '|'.
As written, the character classes only allow letters and spaces, so you will need to widen them for titles and categories that contain commas, colons, apostrophes or '|', but you should get the idea.
If you have multiple lines, split them into an ArrayList first and treat each one separately in a loop.
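As a hedged sketch of how the pattern might be loosened to cope with the sample lines from the question: the class MovieLineParser and the method parse are made-up names, and the sketch assumes every line has the shape number::title (year)::genres with a four-digit year.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser, not part of the original answer.
class MovieLineParser {
    // Groups: 1 = number, 2 = title (lazy, so it stops before " (year)"), 3 = year, 4 = genres.
    private static final Pattern LINE =
            Pattern.compile("^(\\d+)::(.+?)\\s*\\((\\d{4})\\)::(.+)$");

    // Returns {number, title, year, genres}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields = parse("12::Dracula: Dead and Loving It (1995)::Comedy|Horror");
        if (fields != null) {
            String[] genres = fields[3].split("\\|"); // '|' must be escaped in a regex
            System.out.println(fields[0] + " / " + fields[1] + " / " + fields[2]
                    + " / " + Arrays.toString(genres));
        }
    }
}
```

Returning null for a non-matching line lets the caller decide whether to skip or report malformed input.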
Try this
String[] s = {
"11::American President, The (1995)::Comedy|Drama|Romance",
"12::Dracula: Dead and Loving It (1995)::Comedy|Horror",
"13::Balto (1995)::Animation|Children's",
"14::Nixon (1995)::Drama",
};
for (String e : s) {
String[] infos = e.split("::|\\s*\\(|\\)::");
String number = infos[0];
String title = infos[1];
String releaseDate = infos[2];
String[] genres = infos[3].split("\\|");
System.out.printf("number=%s title=%s releaseDate=%s genres=%s%n",
number, title, releaseDate, Arrays.toString(genres));
}
output
number=11 title=American President, The releaseDate=1995 genres=[Comedy, Drama, Romance]
number=12 title=Dracula: Dead and Loving It releaseDate=1995 genres=[Comedy, Horror]
number=13 title=Balto releaseDate=1995 genres=[Animation, Children's]
number=14 title=Nixon releaseDate=1995 genres=[Drama]

More efficient way to make a string in a string of just words

I am making an application where I will be fetching tweets and storing them in a database. I will have a column for the complete text of the tweet and another where only the words of the tweet will remain (I need the words to calculate which words were most used later).
I currently do this with six different .replaceAll() calls, some of which may run more than once. For example, I have a for loop that removes every hashtag using replaceAll().
The problem is that I will be editing thousands of tweets that I fetch every few minutes, and I suspect that the way I am doing it is not very efficient.
My requirements, in this order (also written in the comments below):
Delete all usernames mentioned
Delete all RT (retweets flags)
Delete all hashtags mentioned
Replace all break lines with spaces
Replace all double spaces with single spaces
Delete all special characters except spaces
Here is a Short and Compilable Example:
public class StringTest {
public static void main(String args[]) {
String text = "RT @AshStewart09: Vote for Lady Gaga for \"Best Fans\""
+ " at iHeart Awards\n"
+ "\n"
+ "RT!!\n"
+ "\n"
+ "My vote for #FanArmy goes to #LittleMonsters #iHeartAwards"
+ " htt…";
String[] hashtags = {"#FanArmy", "#LittleMonsters", "#iHeartAwards"};
System.out.println("Before: " + text + "\n");
// Delete all usernames mentioned (may run multiple times)
text = text.replaceAll("@AshStewart09", "");
System.out.println("First Phase: " + text + "\n");
// Delete all RT (retweets flags)
text = text.replaceAll("RT", "");
System.out.println("Second Phase: " + text + "\n");
// Delete all hashtags mentioned
for (String hashtag : hashtags) {
text = text.replaceAll(hashtag, "");
}
System.out.println("Third Phase: " + text + "\n");
// Replace all break lines with spaces
text = text.replaceAll("\n", " ");
System.out.println("Fourth Phase: " + text + "\n");
// Replace all double spaces with single spaces
text = text.replaceAll(" +", " ");
System.out.println("Fifth Phase: " + text + "\n");
// Delete all special characters except spaces
text = text.replaceAll("[^a-zA-Z0-9 ]+", "").trim();
System.out.println("Finally: " + text);
}
}
Relying on replaceAll is probably the biggest performance killer as it compiles the regex again and again. The use of regexes for everything is probably the second most significant problem.
Assuming all usernames start with @, I'd replace
// Delete all usernames mentioned (may run multiple times)
text = text.replaceAll("@AshStewart09", "");
with a loop that copies everything until it finds an @, then checks whether the following characters match any of the listed usernames and, if so, skips them. For this lookup you could use a trie. A simpler method would be a replaceAll-like loop for the regex @\w+ together with a HashMap lookup.
// Delete all RT (retweets flags)
text = text.replaceAll("RT", "");
Here,
private static final Pattern RT_PATTERN = Pattern.compile("RT");
is a sure win. All the following parts could be handled similarly. Instead of
// Delete all special characters except spaces
text = text.replaceAll("[^a-zA-Z0-9 ]+", "").trim();
you could use Guava's CharMatcher. The method removeFrom does exactly what you did, but collapseFrom or trimAndCollapseFrom might be better.
According to the now closed question, it all boils down to
tweet = tweet.replaceAll("@\\w+|#\\w+|\\bRT\\b", "")
.replaceAll("\n", " ")
.replaceAll("[^\\p{L}\\p{N} ]+", " ")
.replaceAll(" +", " ")
.trim();
The second line seems to be redundant, as the third one removes \n too. Changing the first line's replacement to " " doesn't change the outcome and allows the replacements to be merged:
tweet = tweet.replaceAll("@\\w*|#\\w*|\\bRT\\b|[^@#\\p{L}\\p{N} ]+", " ")
.replaceAll(" +", " ")
.trim();
I've changed the usernames and hashtags part to also consume a lone @ or #, so that it doesn't need to be consumed by the special characters part. This is necessary for correct processing of strings like !@AshStewart09.
For maximum performance, you surely need a precompiled pattern. I'd also suggest again using Guava's CharMatcher for the second part. Guava is huge (2 MB, I guess), but you will surely find more useful things there. So in the end you can get
private static final Pattern PATTERN =
Pattern.compile("@\\w*|#\\w*|\\bRT\\b|[^@#\\p{L}\\p{N} ]+");
// CharMatcher.is and trimAndCollapseFrom take a char, so the space is a character literal
private static final CharMatcher CHAR_MATCHER = CharMatcher.is(' ');
tweet = PATTERN.matcher(tweet).replaceAll(" ");
tweet = CHAR_MATCHER.trimAndCollapseFrom(tweet, ' ');
You can merge everything that is replaced with nothing into one call to replaceAll, and everything that is replaced with a space into another, like so (also using a regex to find the hashtags and usernames, as this seems easier):
text = text.replaceAll("@\\w+|#\\w+|RT", "");
text = text.replaceAll("\n| +", " ");
text = text.replaceAll("[^a-zA-Z0-9 ]+", "").trim();
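For anyone who would rather not add a dependency, here is a JDK-only sketch of the precompiled-pattern idea discussed above. The class TweetCleaner and the method clean are hypothetical names, and the sketch assumes usernames start with @ and hashtags start with #.

```java
import java.util.regex.Pattern;

// Hypothetical cleaner, not part of the original answers.
class TweetCleaner {
    // Compiled once and reused; String.replaceAll compiles its regex on every call.
    // @ and # are excluded from the last character class so that "!@name"
    // is handled by the username alternative instead of leaving "name" behind.
    private static final Pattern NOISE =
            Pattern.compile("@\\w*|#\\w*|\\bRT\\b|[^@#\\p{L}\\p{N} ]+");
    private static final Pattern SPACES = Pattern.compile(" +");

    // Replaces usernames, hashtags, RT flags and special characters with spaces,
    // then collapses runs of spaces and trims the result.
    static String clean(String tweet) {
        String spaced = NOISE.matcher(tweet).replaceAll(" ");
        return SPACES.matcher(spaced).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("RT @user: Hello, #tag world!!\nBye"));
    }
}
```

Pattern instances are immutable and safe to share between threads, so the two static fields can serve every tweet in a batch.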

How do I extract words from a string?

I did some research, but I could only find answers about searching within a String. I am trying to extract a word from a String and put a word from another String in its place. Each situation is different, so I can't make it a constant. I'd really appreciate the help!
package rudolph;
import javax.swing.JOptionPane;
public class IfBlankWasBlank {
public static void main(String[] args) {
String charInput = "";
String thing = "";
String pun = "";
charInput = JOptionPane.showInputDialog(null,
"Enter the person or thing you'd like to make fun of:");
thing = JOptionPane.showInputDialog(null,
"Enter the thing that " + charInput +" is doing:");
pun = JOptionPane.showInputDialog(null,
"Enter the pun that " + charInput + " is doing:");
String msg = "If " + charInput + " was " + thing + ", then they'd be " + pun + ".";
JOptionPane.showMessageDialog(null, msg);
}
}
So if I enter Rudolph the Red Nosed Reindeer in the charInput String, energy efficient in the thing String, and LED in the pun String, I want the msg to be "If Rudolph the Red Nosed Reindeer was energy efficient, then they'd be Rudolph the LED Nosed Reindeer."
I know it's silly, but I'd like to know how to do it if I could! Thanks so much for the help!
You should use the String#split() method.
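A minimal sketch of how split could be used here. The question does not say how the word to replace is chosen, so this example simply assumes its position is known; WordSwap and replaceWordAt are hypothetical names.

```java
// Hypothetical helper, not part of the original program.
class WordSwap {
    // Replaces the word at wordIndex (0-based) in phrase with replacement.
    static String replaceWordAt(String phrase, int wordIndex, String replacement) {
        String[] words = phrase.trim().split("\\s+"); // split on runs of whitespace
        if (wordIndex < 0 || wordIndex >= words.length) {
            return phrase; // index out of range: leave the phrase unchanged
        }
        words[wordIndex] = replacement;
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String charInput = "Rudolph the Red Nosed Reindeer";
        String punned = replaceWordAt(charInput, 2, "LED");
        System.out.println("If " + charInput + " was energy efficient, then they'd be "
                + punned + ".");
    }
}
```

In the dialog-based program, the index (or the word to replace) could be collected with one more showInputDialog call.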
