Remove String of length 1 from a Collection excluding some values - java

I have an arraylist of type String with many words, and in some cases they are just single letters. Such as the letter "K".
I am essentially trying to remove all single instance characters, EXCEPT "A" and "I".
Here is the code/regex I was trying, to no avail:
//removing all single letters
ArrayList<String> newList2 = new ArrayList<String>();
for(String word : words) {
newList2.add(word.replace("[BCDEFGHJKLMOPQRSTUVWXYZ]", ""));
}
words = newList2;
Should I not use regex? Is there a better method, or is there a way I am not using regex correctly? From my understanding my implementation, if it even worked, would only replace it with an empty spot, not completely remove the element.. my goal is to remove the element entirely if it exists, perhaps by the .remove method... Not sure how to go about this. (JAVA)
(P.S, ideally I would also remove the "=" and other symbols if they are apparent, but characters is my gripe at the moment)

No need to use the Stream API for this. List#removeIf will suffice here:
list.removeIf(s -> s.length() == 1 && !List.of("A", "I").contains(s));
Note: this mutates the list in place.
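For anyone who wants to try this out, here is a minimal self-contained sketch. The class name SingleLetterFilter, the method removeSingleCharacters and the sample words are made up for illustration, and it assumes Java 9+ for Set.of and List.of. Because the test is only on length, a one-character symbol such as "=" is removed as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SingleLetterFilter {
    // Single-character words that should survive the cleanup (taken from the question).
    private static final Set<String> KEEP = Set.of("A", "I");

    // Removes every one-character element except those in KEEP; mutates the list in place.
    static void removeSingleCharacters(List<String> words) {
        words.removeIf(s -> s.length() == 1 && !KEEP.contains(s));
    }

    public static void main(String[] args) {
        // Copy into an ArrayList because List.of returns an immutable list.
        List<String> words = new ArrayList<>(List.of("I", "saw", "K", "A", "=", "dog"));
        removeSingleCharacters(words);
        System.out.println(words); // [I, saw, A, dog]
    }
}
```

The ArrayList copy matters: calling removeIf on the list returned by List.of would throw UnsupportedOperationException.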

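A solution with a loop (iterating backwards, so that removing an element does not cause the next one to be skipped):
for (int i = newList2.size() - 1; i >= 0; i--) {
    String word = newList2.get(i);
    // remove single-character words unless they are "A" or "I"
    if (word.length() == 1 && !word.equals("A") && !word.equals("I")) {
        newList2.remove(i);
    }
}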

Related

Java ArrayList - Replace a specific letter or character within an ArrayList of Strings?

I'm trying to replace/remove certain characters within a String ArrayList. Here is the code that I've already tried:
for (int i = 0; i < newList.size(); i++) {
if (newList.get(i).contains(".")) {
newList.get(i).replace(".", "");
}
}
I tried iterating through the ArrayList and, if any of the elements contained a ".", replacing it with an empty string. For example, if one of the words in my ArrayList was "Works.", then the resulting output would be "Works". Unfortunately, this code segment does not seem to work. I did something similar to remove any elements which contain numbers and it worked fine, but I'm not sure whether there's specific syntax or methods that need to be used when replacing just a single character in a String while still keeping the ArrayList element.
newList.get(i).replace(".", "");
This doesn't update the list element - you construct the replaced string, but then discard that string.
You could use set:
newList.set(i, newList.get(i).replace(".", ""));
Or you could use a ListIterator:
ListIterator<String> it = newList.listIterator();
while (it.hasNext()) {
String s = it.next();
if (s.contains(".")) {
it.set(s.replace(".", ""));
}
}
but a better way would be to use replaceAll, with no need to loop explicitly:
newList.replaceAll(s -> s.replace(".", ""));
There's no need to check contains either: replace won't do anything if the search string is not present.
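A small runnable sketch of the difference, in case it helps. The class name StripDots and the sample data are invented for the demo. Strings are immutable, so the bare replace call leaves the list untouched, while List.replaceAll stores the result back into each slot.

```java
import java.util.ArrayList;
import java.util.List;

class StripDots {
    // Replaces each element with a copy that has every "." removed.
    static void stripDots(List<String> list) {
        list.replaceAll(s -> s.replace(".", ""));
    }

    public static void main(String[] args) {
        List<String> newList = new ArrayList<>(List.of("Works.", "fine", "e.g."));
        newList.get(0).replace(".", ""); // result is discarded, the list is unchanged
        System.out.println(newList);     // [Works., fine, e.g.]
        stripDots(newList);
        System.out.println(newList);     // [Works, fine, eg]
    }
}
```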

How to find the longest string object in an arrayList

Here is my problem. I have to explain a lot, because it's quite complicated.
I have created an ArrayList<Word> that contains strings as objects. In my case, I have three classes, all working together, that are supposed to represent a dictionary.
The first class is called "Word". It has a constructor and simple methods like .length, one that adds 1 to the counter, and one that keeps track of the words that are repeated.
The second class is called "Wordlist", and uses more advanced methods like reading a file (a summary of 30,000 words). Some methods, for instance, add words to the ArrayList as objects; others search for a particular word. These tasks take parameters, with variables that I can use.
But I came upon a task in which I had to find the longest word (string) in the ArrayList, without any parameters. The method is called: public Word findLongest().
In the third class, I have the test case, where I use the methods. The most central part here is to read the file and add it to the ArrayList object. In what way can I use the method to find the longest word (string) in the ArrayList without any parameters?
Working with an ArrayList of objects is very confusing.
I am aware of the for (each : array) construct, but have no idea how to use it properly.
If your Word class provides a length() method which simply returns the length of the word represented by that object, then you can run through your ArrayList<Word> and find the longest like this:
private Word getLongestWordFromList(List<Word> listOfWords) {
Word longestWord = null;
for (Word word : listOfWords) {
if (longestWord == null || word.length() > longestWord.length()) {
longestWord = word;
}
}
return longestWord;
}
The for (Word word : listOfWords) pattern simply says: iterate through the List called "listOfWords" and store the current Word object in a variable called "word". Then you just check the length of each word to see whether it is longer than the longest already found. (If the longestWord variable is null then it means you haven't processed any words so far, so whatever is the first Word found will go into that variable, ready to be compared with the next.)
Note: if there is more than one word with the longest length then this method will simply return the first word which is found with that length. If you need to return a list of all words having the longest length then you'll need to modify this pattern to generate a List<Word> which contains all words of the longest length.
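The "all words of the longest length" variant mentioned in the note could look roughly like this. It is only a sketch: the nested Word class is a hypothetical stand-in for the asker's class (only length() is assumed), and findAllLongest is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

class LongestWords {
    // Hypothetical stand-in for the asker's Word class; only length() is assumed.
    static class Word {
        private final String text;
        Word(String text) { this.text = text; }
        int length() { return text.length(); }
        @Override public String toString() { return text; }
    }

    // Returns every word whose length equals the maximum, in list order.
    static List<Word> findAllLongest(List<Word> listOfWords) {
        List<Word> longest = new ArrayList<>();
        int maxLength = -1;
        for (Word word : listOfWords) {
            if (word.length() > maxLength) {
                maxLength = word.length();
                longest.clear();      // a longer word invalidates earlier candidates
                longest.add(word);
            } else if (word.length() == maxLength) {
                longest.add(word);
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        List<Word> words = List.of(new Word("cat"), new Word("horse"), new Word("zebra"), new Word("ox"));
        System.out.println(findAllLongest(words)); // [horse, zebra]
    }
}
```

An empty input simply yields an empty list, so no null check is needed on the result.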
Assuming your Word class has a method called length(), maybe something like this:
Word longest = arrayList.get(0);
for (int i=1; i<arrayList.size(); ++i) {
if (arrayList.get(i).length() > longest.length()) {
longest = arrayList.get(i);
}
}
For Java Stream enthusiasts one very declarative solution is:
private Word getLongestWordFromList(List<Word> listOfWords) {
return listOfWords.stream()
.filter(word -> word != null)
.max(Comparator.comparing(Word::length))
.orElse(null);
}
filter makes sure the stream skips over null values.
max compares the word lengths and returns an Optional<Word> with the largest length value. Note that instead of Word::length we could have also written word -> word.length().
orElse "unpacks" the optional value. If it didn't find anything (eg. because every value is null or listOfWords is empty), it returns null.

Java - Changing multiple words in a string at once?

I'm trying to create a program that can abbreviate certain words in a string given by the user.
This is how I've laid it out so far:
Create a hashmap from a .txt file such as the following:
thanks,thx
your,yr
probably,prob
people,ppl
Take a string from the user
Split the string into words
Check the hashmap to see if that word exists as a key
Use hashmap.get() to return the key value
Replace the word with the key value returned
Return an updated string
It all works perfectly fine until I try to update the string:
public String shortenMessage( String inMessage ) {
String updatedstring = "";
String rawstring = inMessage;
String[] words = rawstring.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");
for (String word : words) {
System.out.println(word);
if (map.containsKey(word) == true) {
String x = map.get(word);
updatedstring = rawstring.replace(word, x);
}
}
System.out.println(updatedstring);
return updatedstring;
}
Input:
thanks, your, probably, people
Output:
thanks, your, probably, ppl
Does anyone know how I can update all the words in the string?
Thanks in advance
updatedstring = rawstring.replace(word, x);
This keeps overwriting updatedstring with a copy of rawstring that contains only the single most recent replacement.
You need to do something like
updatedstring = rawstring;
...
updatedstring = updatedstring.replace(word, x);
Edit:
That is the solution to the problem you are seeing but there are a few other problems with your code:
Your replacement won't work for words that had to be lowercased or have characters removed. You build the words array from an altered version of rawstring, then go back and try to replace those altered words in the original rawstring, where they may not exist. This will not find the words you think you are replacing.
If you are doing global replacements, you could just create a set of words instead of an array, since once a word is replaced, it shouldn't come up again.
You might want to replace the words one at a time, because a global replacement can cause weird bugs when a word in the replacement map is a substring of another word. Instead of using String.replace, make an array/list of words, iterate over the words, replace the element in the list if needed, and join them. In Java 8:
String.join(" ", elements);
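Putting the word-by-word idea together, a rough sketch might look like the following. The class name Abbreviator and the abbreviations parameter (standing in for the asker's map field) are assumptions, and like the question's code it strips punctuation and lowercases everything, so the original formatting is lost. Map.of in the demo needs Java 9+.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class Abbreviator {
    // Replaces whole words found in the map and rejoins them with single spaces.
    // Simplification: punctuation is stripped and case is lowered, as in the question.
    static String shortenMessage(String inMessage, Map<String, String> abbreviations) {
        String[] words = inMessage.replaceAll("[^a-zA-Z ]", "").toLowerCase().trim().split("\\s+");
        List<String> elements = new ArrayList<>();
        for (String word : words) {
            // Keep the word itself when the map has no abbreviation for it.
            elements.add(abbreviations.getOrDefault(word, word));
        }
        return String.join(" ", elements);
    }

    public static void main(String[] args) {
        Map<String, String> map = Map.of("thanks", "thx", "your", "yr", "probably", "prob", "people", "ppl");
        System.out.println(shortenMessage("thanks, your, probably, people", map)); // thx yr prob ppl
        System.out.println(shortenMessage("Yours truly", map));                    // yours truly
    }
}
```

Because only whole words are looked up, "yours" is left alone instead of turning into "yrs", which is the sub-word bug described above.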

Word Count no duplicates

Here is my word count program in Java. I need to rework it so that something, something; something? something! and something count as one word. That means it should not count the same word twice, regardless of case and punctuation.
import java.util.Scanner;
public class WordCount1
{
public static void main(String[]args)
{
final int Lines=6;
Scanner in=new Scanner (System.in);
String paragraph = "";
System.out.println( "Please input "+ Lines + " lines of text.");
for (int i=0; i < Lines; i+=1)
{
paragraph=paragraph+" "+in.nextLine();
}
System.out.println(paragraph);
String word="";
int WordCount=0;
for (int i=0; i<paragraph.length()-1; i+=1)
{
if (paragraph.charAt(i) != ' ' || paragraph.charAt(i) !=',' || paragraph.charAt(i) !=';' || paragraph.charAt(i) !=':' )
{
word= word + paragraph.charAt(i);
if(paragraph.charAt(i+1)==' ' || paragraph.charAt(i) ==','|| paragraph.charAt(i) ==';' || paragraph.charAt(i) ==':')
{
WordCount +=1;
word="";
}
}
}
System.out.println("There are "+WordCount +" words ");
}
}
Since this is homework, here are some hints and advice.
There is a clever little method called String.split that splits a string into parts, using a separator specified as a regular expression. If you use it the right way, this will give you a one line solution to the "word count" problem. (If you've been told not to use split, you can ignore that ... though it is the simple solution that a seasoned Java developer would consider first.)
Format / indent your code properly ... before you show it to other people. If your instructor doesn't deduct marks for this, he / she isn't doing his job properly.
Use standard Java naming conventions. The capitalization of Lines is incorrect. It could be LINES for a manifest constant or lines for variable, but a mixed case name starting with a capital letter should always be a class name.
Be consistent in your use of white space characters around operators (including the assignment operator).
It is a bad idea (and completely unnecessary) to hard wire the number of lines of input that the user must supply. And you are not dealing with the case where he / she supplies fewer than 6 lines.
You should just remove punctuation and change to a single case before doing further processing. (Be careful with locales and Unicode.)
Once you have broken the input into words, you can count the number of unique words by passing them into a Set and checking the size of the set.
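If split is allowed, the hints above could be combined along these lines. This is just one possible sketch: DistinctWordCount and countDistinctWords are invented names, and the regex choice (keep letters, digits and whitespace, drop everything else) is an assumption that also turns "don't" into "dont".

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DistinctWordCount {
    static int countDistinctWords(String text) {
        // Lowercase with a fixed locale, then drop everything except letters, digits and whitespace.
        String cleaned = text.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{Nd}\\s]", "");
        Set<String> unique = new HashSet<>();
        for (String word : cleaned.trim().split("\\s+")) {
            if (!word.isEmpty()) { // splitting an empty string yields one empty token
                unique.add(word);
            }
        }
        return unique.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctWords("something, something; something? Something! and something")); // 2
    }
}
```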
Here You Go. This Works. Just Read The Comments And You Should Be Able To Follow.
import java.util.Arrays;
import java.util.HashSet;
import javax.swing.JOptionPane;
// Program Counts Words In A Sentence. Duplicates Are Not Counted.
public class WordCount
{
public static void main(String[]args)
{
// Initialize Variables
String sentence = "";
int wordCount = 1, startingPoint = 0;
// Prompt User For Sentence
sentence = JOptionPane.showInputDialog(null, "Please input a sentence.", "Input Information Below", 2);
// Remove All Punctuations. To Check For More Punctuations Just Add Another Replace Statement.
sentence = sentence.replace(",", "").replace(".", "").replace("?", "");
// Convert All Characters To Lowercase - Must Be Done To Compare Upper And Lower Case Words.
sentence = sentence.toLowerCase();
// Count The Number Of Words
for (int i = 0; i < sentence.length(); i++)
if (sentence.charAt(i) == ' ')
wordCount++;
// Initialize Array And A Count That Will Be Used As An Index
String[] words = new String[wordCount];
int count = 0;
// Put Each Word In An Array
for (int i = 0; i < sentence.length(); i++)
{
if (sentence.charAt(i) == ' ')
{
words[count] = sentence.substring(startingPoint,i);
startingPoint = i + 1;
count++;
}
}
// Put Last Word In Sentence In Array
words[wordCount - 1] = sentence.substring(startingPoint, sentence.length());
// Put Array Elements Into A Set. This Will Remove Duplicates
HashSet<String> wordsInSet = new HashSet<String>(Arrays.asList(words));
// Format Words In Hash Set To Remove Brackets, And Commas, And Convert To String
String wordsString = wordsInSet.toString().replace(",", "").replace("[", "").replace("]", "");
// Print Out None Duplicate Words In Set And Word Count
JOptionPane.showMessageDialog(null, "Words In Sentence:\n" + wordsString + " \n\n" +
"Word Count: " + wordsInSet.size(), "Sentence Information", 2);
}
}
If you know the marks you want to ignore (;, ?, !), you could do a simple String.replace to remove those characters from the word. You may want to use String.startsWith and String.endsWith to help.
Convert your values to lower case for easier matching (String.toLowerCase).
The use of a 'Set' is an excellent idea. If you want to know how many times a particular word appears you could also take advantage of a Map of some kind
You'll need to strip out the punctuation; here's one approach: Translating strings character by character
The above can also be used to normalize the case, although there are probably other utilities for doing so.
Now all of the variations you describe will be converted to the same string, and thus be recognized as such. As pretty much everyone else has suggested, a set would be a good tool for counting the number of distinct words.
Your real problem is that you want a distinct word count, so you should either keep track of which words you have already encountered, or delete them from the text entirely.
Let's say that you choose the first option and store the words you have already encountered in a List; then you can check against that list whether you have already seen that word.
List<String> encounteredWords = new ArrayList<String>();
// continue here once you have determined what the word is
if (!encounteredWords.contains(word.toLowerCase())) {
encounteredWords.add(word.toLowerCase());
wordCount++;
}
But Antimony made an interesting remark as well: he uses the defining property of a Set to get the distinct word count. A set can never contain duplicates, so if you just add more of the same word, the set won't grow in size.
Set<String> wordSet = new HashSet<String>();
// continue here once you have determined what the word is
wordSet.add(word.toLowerCase());
// continue here once you have scanned through all words
return wordSet.size();
remove all punctuations
convert all strings to lowercase OR uppercase
put those strings in a set
get the size of the set
As you parse your input string, store it word by word in a map data structure. Just ensure that "word", "word?" "word!" all are stored with the key "word" in the map, and increment the word's count whenever you have to add to the map.
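A sketch of the map idea, assuming whitespace-separated tokens and that anything other than letters and digits can be dropped from a token. The names WordFrequency and countWords are invented; a TreeMap is used only so that the printed order is predictable.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class WordFrequency {
    // Maps each normalized word to the number of times it occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.split("\\s+")) {
            // "word", "word?" and "Word!" all normalize to the key "word".
            String key = token.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{Nd}]", "");
            if (!key.isEmpty()) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("word word? Word! other")); // {other=1, word=3}
    }
}
```

The number of distinct words is then simply the size of the returned map.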

printing elements in a nice format

I have a simple, general question regarding a real small issue that bothers me:
I'm printing a list of elements on the fly, so I don't have prior knowledge about the number of printed elements. I want a simple format where the elements are separated by a comma (elem1, elem2...) or something similar. Now, if I use a simple loop like:
while(elements to check) {
if (elem should be printed) {
print elem . ","
}
}
I get a comma after the last element...
I know this sounds quite stupid, but is there a way to handle this?
Let's assume that "should be printed" means "at least one non-whitespace character". In Perl, the idiomatic way to write this would be (you'll need to adjust the grep to taste):
print join "," => grep { /\S/ } @elements;
The "grep" is like "filter" in other languages and the /\S/ is a regex matching one non-whitespace character. Only elements matching the expression are returned. The "join" should be self-explanatory.
perldoc -f join
perldoc -f grep
The approach of having all your data in an array and then doing
print join(',', @yourarray)
is a good one.
You can also, after looping for your concatenation
declare eltToPrint
while (LOOP on elt) {
eltToPrint .= elt.','
}
remove the last comma with a regex :
eltToPrint =~s/,$//;
ps : works also if you put the comma at the beginning
eltToPrint =~s/^,//;
Java (before Java 8) does not have a built-in join, but if you don't want to reinvent the wheel, you can use Guava's Joiner. It can skipNulls, or useForNull(something).
An object which joins pieces of text (specified as an array, Iterable, varargs or even a Map) with a separator. It either appends the results to an Appendable or returns them as a String. Example:
Joiner joiner = Joiner.on("; ").skipNulls();
return joiner.join("Harry", null, "Ron", "Hermione");
This returns the string "Harry; Ron; Hermione". Note that all input elements are converted to strings using Object.toString() before being appended.
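For comparison, on Java 8 or later the same result can be had from the standard library. This is only a sketch with an invented class name (JoinDemo) and helper (joinSkippingNulls); String.join and Collectors.joining are the standard calls.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class JoinDemo {
    // Joins the non-null elements with the given separator.
    static String joinSkippingNulls(List<String> parts, String separator) {
        return parts.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        // Arrays.asList is used because List.of rejects null elements.
        List<String> names = Arrays.asList("Harry", null, "Ron", "Hermione");
        System.out.println(joinSkippingNulls(names, "; ")); // Harry; Ron; Hermione
        System.out.println(String.join(", ", "a", "b", "c")); // a, b, c
    }
}
```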
Why not add a comma BEFORE each element (but the first one)? Pseudo-code:
is_first = true
loop element over element_array
BEGIN LOOP
if (! is_first)
print ","
end if
print element
is_first = false
END
print NEWLINE
I guess the simplest way is to create a new array containing only the elements from the original array that you need to print (i.e. a filter operation). Then print the newly created array, preferably using your language's built-in array/vector print/join function.
(In Perl)
@orig=("a","bc","d","ef","g");
@new_list=();
for $x(@orig){
push(@new_list,$x) if (length($x)==1);
}
print join(',',@new_list)."\n";
(In Java)
List<String> orig=Arrays.asList(new String[]{"a","bc","d","ef","g"});
List<String> new_list=new ArrayList<String>();
for(String x: orig){
if (x.length()==1)
new_list.add(x);
}
System.out.println(new_list);
You have several options depending on language.
e.g. in JavaScript just do:
var prettyString = someArray.join(', ');
in PHP you can implode()
$someArray = array('apple', 'orange', 'pear');
$prettyString = implode(",", $someArray);
If all else fails, you can either add the comma after every entry and trim the last one when done, or check in your while/foreach loop (bad for perf) whether this is the last item and add a comma only if it is not.
update: since you noted Java... you could create a method like this:
public static String join(String[] strings, String separator) {
StringBuffer sb = new StringBuffer();
for (int i=0; i < strings.length; i++) {
if (i != 0) sb.append(separator);
sb.append(strings[i]);
}
return sb.toString();
}
update 2: sounds like you really want this then if you are not outputting every element (pseudo-code):
first = true;
for(item in list){
if(item meets condition){
if(!first){
print ", ";
} else {
first = false;
}
print item;
}
}
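In Java 8+ the "first" flag can also be left to java.util.StringJoiner, which inserts the separator only between elements. A minimal sketch, assuming the condition can be expressed as a Predicate; ConditionalJoin and joinMatching are made-up names.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.function.Predicate;

class ConditionalJoin {
    // Joins only the items that satisfy the condition, with no trailing separator.
    static String joinMatching(Iterable<String> items, Predicate<String> shouldPrint) {
        StringJoiner joiner = new StringJoiner(", ");
        for (String item : items) {
            if (shouldPrint.test(item)) {
                joiner.add(item);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> orig = List.of("a", "bc", "d", "ef", "g");
        System.out.println(joinMatching(orig, s -> s.length() == 1)); // a, d, g
    }
}
```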
