How can I extract dictionary words from a stringPattern value such as birdantantcatbirdcat? - Java

I have an array, dataDic, containing {"ant","bird","cat"}.
dataDic holds the words that I want to search for in stringPattern.
I want to use dataDic to extract the matching words, in order, from stringPattern = birdantantcatbirdcat.
Ex1.
dataDic = {"ant","bird","cat"}
answer is {bird,ant,ant,cat,bird,cat}
Ex2.
dataDic = {"ant","cat"}
answer is {ant,ant,cat,cat}
This is my code:
`private static String stringTest="birdantantcatbirdcat";
private static List<String> dicListWord;
private static List<String> resultString = new ArrayList<>();
public static void main(String[] args) {
dicListWord = new ArrayList<>();
dicListWord.add("ant");
dicListWord.add("bird");
dicListWord.add("cat");
String[] data = stringTest.split("");
for (String dataDic:dicListWord) {
String [] wordList = dataDic.split("");
String foundWord = "";
for (String charTec:data) {
for (String dicWord:wordList) {
if(charTec.equals(dicWord)){
foundWord = foundWord.concat(charTec);
if(dataDic.equals(foundWord)){
resultString.add(foundWord);
foundWord = "";
}
}
}
}
}
for (String w1:data) {
for (String result:resultString) {
System.out.println(result);
}
}
}`
///////////////////////////////////////////////////////////////////////////////
The result I get when I run it is:
{ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,antbird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird,ant,ant,bird,bird}

Use a TreeMap to store the position of each match as the key and the matched word as the value as you scan the string for each dictionary word. A TreeMap is the right choice because it keeps its entries sorted by the natural ordering of its keys, and your requirement is that the words in the resulting list appear in the order in which they occur in the string.
Demo:
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
public class Main {
public static void main(String[] args) {
List<String> words = List.of("ant", "bird", "cat");
String str = "birdantantcatbirdcat";
System.out.println(getMatchingWords(words, str));
}
static List<String> getMatchingWords(List<String> words, String str) {
Map<Integer, String> map = new TreeMap<Integer, String>();
for (String word : words) {
Pattern pattern = Pattern.compile(word);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
map.put(matcher.start(), matcher.group());
}
}
return map.values().stream().collect(Collectors.toList());
}
}
Output:
[bird, ant, ant, cat, bird, cat]
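As a variation on the demo above, the same ordered result can be collected in a single left-to-right pass by joining the dictionary words into one regex alternation. This is only a sketch, not the answer's own method: the class name OrderedWordScan and the method scan are made up for illustration, and Pattern.quote is used on the assumption that dictionary words should be matched literally. Unlike the TreeMap version, matches found this way never overlap, and when two dictionary words could start at the same position the one listed first wins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
final class OrderedWordScan {

    // Returns the dictionary words found in text, in order of occurrence.
    static List<String> scan(List<String> words, String text) {
        List<String> found = new ArrayList<>();
        if (words.isEmpty()) {
            return found; // an empty alternation would match the empty string everywhere
        }
        // Quote each word so that regex metacharacters are treated literally.
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher matcher = Pattern.compile(alternation).matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "birdantantcatbirdcat";
        System.out.println(scan(List.of("ant", "bird", "cat"), text)); // [bird, ant, ant, cat, bird, cat]
        System.out.println(scan(List.of("ant", "cat"), text));         // [ant, ant, cat, cat]
    }
}
```

The second call corresponds to Ex2 in the question, where text that is not in the dictionary ("bird") is simply skipped.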

This is a word break problem and can be solved with a depth-first search. It is wise to check first whether the given string is breakable at all: that gives a better run time when a long string pattern cannot be segmented into dictionary words.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class P00140_Word_Break_II {
public static void main(String[] args) {
String input = "catsanddog";
List<String> wordDict = Arrays.asList("cat", "cats", "and", "sand", "dog");
P00140_Word_Break_II solution = new P00140_Word_Break_II();
List<String> results = solution.wordBreak(input, wordDict);
System.out.println(results);
String input1 = "birdantantcatbirdcat";
List<String> wordDict1 = Arrays.asList("ant","bird","cat");
List<String> results1 = solution.wordBreak(input1, wordDict1);
System.out.println(results1);
}
public List<String> wordBreak(String s, List<String> wordDict) {
Set<String> dict = new HashSet<>(wordDict);
List<String> result = new ArrayList<>();
if (s == null || s.length() == 0 || !isbreakable(s, dict)) {
return result;
}
helper(s, 0, new StringBuilder(), dict, result);
return result;
}
public void helper(String s, int start, StringBuilder item, Set<String> dict, List<String> results) {
if (start >= s.length()) {
results.add(item.toString());
return;
}
if (start != 0) {
item.append(" ");
}
for (int i = start; i < s.length(); i++) {
String temp = s.substring(start, i + 1);
if (dict.contains(temp)) {
item.append(temp);
helper(s , i+1 , item , dict , results);
item.delete(item.length() + start - i - 1 , item.length());
}
}
if(start!=0) item.deleteCharAt(item.length()-1);
}
private boolean isbreakable(String s, Set<String> dict) {
boolean[] dp = new boolean[s.length() + 1];
dp[0] = true;
for (int i = 1; i <= s.length(); i++) {
for (int j = 0; j < i; j++) {
String subString = s.substring(j, i);
if (dp[j] && dict.contains(subString)) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
}

Related

Fastest way to search several strings in a string

Below is my code to find the occurrences of all the substrings in a given single string
public static void main(String... args) {
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
Map<String, Integer> countMap = countWords(fullString, severalStringArray);
}
public static Map<String, Integer> countWords(String fullString, String[] severalStringArray) {
Map<String, Integer> countMap = new HashMap<>();
for (String searchString : severalStringArray) {
if (countMap.containsKey(searchString)) {
int searchCount = countMatchesInString(fullString, searchString);
countMap.put(searchString, countMap.get(searchString) + searchCount);
} else
countMap.put(searchString, countMatchesInString(fullString, searchString));
}
return countMap;
}
private static int countMatchesInString(String fullString, String subString) {
int count = 0;
int pos = fullString.indexOf(subString);
while (pos > -1) {
count++;
pos = fullString.indexOf(subString, pos + 1);
}
return count;
}
Assume the full string might be an entire file read into a string. Is the above an efficient way to search, or is there a better or faster way to do it?
Thanks
You could just form a regex alternation of words to search, and then do a single search against that regex:
public static int matchesInString(String fullString, String regex) {
int count = 0;
Pattern r = Pattern.compile(regex);
Matcher m = r.matcher(fullString);
while (m.find())
++count;
return count;
}
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
String regex = "\\b(?:" + String.join("|", severalStringArray) + ")\\b";
int count = matchesInString(fullString, regex);
System.out.println("There were " + count + " matches in the input");
This prints:
There were 8 matches in the input
Note that the regex pattern used in the above example was:
\b(?:one|two|three|four)\b
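The method above returns one combined total. If a count per word is needed, as in the question's countMap, the same single pass can feed a map. The following is a minimal sketch under that assumption; AlternationCount and its countWords signature are illustrative names, and Pattern.quote is added so that search strings containing regex metacharacters are matched literally.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
final class AlternationCount {

    // Counts whole-word occurrences of each search word in one pass over the text.
    static Map<String, Integer> countWords(String text, String... words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.put(word, 0); // words that never match still appear with a count of 0
        }
        if (words.length == 0) {
            return counts;
        }
        String regex = "\\b(?:"
                + Arrays.stream(words).map(Pattern::quote).collect(Collectors.joining("|"))
                + ")\\b";
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
        System.out.println(countWords(fullString, "one", "two", "three", "four"));
        // {one=2, two=1, three=2, four=3}
    }
}
```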
Regular expressions
Your problem can be solved using regex (regular expressions). Regular expressions are a tool that helps you match patterns in strings. A pattern can be a word or a set of characters.
Regular expressions in Java
In Java, two classes help you work with regular expressions: Pattern and Matcher.
Below is an example that searches for the word stackoverflow in the string stackoverflowXstackoverflowXXXstackoverflowXX in Java.
String pattern = "stackoverflow";
String stringToExamine = "stackoverflowXstackoverflowXXXstackoverflowXX";
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(stringToExamine);
Counting how many occurrences of a word there are in a given string
As written here, there are different solutions depending on your Java version:
Java 9+
long matches = matcherObj.results().count();
Older Java versions
int count = 0;
while (matcherObj.find())
count++;
Regular expressions in your problem
You use a method to count how many times a word occurs in a text (a string); you can modify it like this:
Java 9+
public static int matchesInString(String fullString, String pattern)
{
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(fullString);
// results().count() returns a long, so cast it to match the int return type
return (int) matcherObj.results().count();
}
Older Java versions
public static int matchesInString(String fullString, String pattern)
{
int count = 0;
Pattern patternObj = Pattern.compile(pattern);
Matcher matcherObj = patternObj.matcher(fullString);
while (matcherObj.find())
count++;
return count;
}
Actually, the fastest way is to scan the string once, split it into words as you go, and count the required words in a Map.
Keep it simple! A regular expression is too complicated and not efficient for this simple task. Let's solve it with a hammer!
public static void main(String... args) {
String str = "one is a good one. two is ok. three is three. four is four. five is not four";
Set<String> words = Set.of("one", "two", "three", "four");
Map<String, Integer> map = countWords(str, words);
}
public static Map<String, Integer> countWords(String str, Set<String> words) {
Map<String, Integer> map = new HashMap<>();
for (int i = 0, j = 0; j <= str.length(); j++) {
char ch = j == str.length() ? '\0' : str.charAt(j);
if (j == str.length() || !isWordSymbol(ch)) {
String word = str.substring(i, j);
if (!word.isEmpty() && words.contains(word))
map.put(word, map.getOrDefault(word, 0) + 1);
i = j + 1;
}
}
return map;
}
private static boolean isWordSymbol(char ch) {
return Character.isLetter(ch) || ch == '-' || ch == '_';
}
An implementation of the Trie tree that someone commented on. I don't know if it's fast or not.
static class Trie {
static final long INC_NODE_NO = 1L << Integer.SIZE;
private long nextNodeNo = 0;
private Node root = new Node();
private final Map<Long, Node> nodes = new HashMap<>();
public void put(String word) {
Node node = root;
for (int i = 0, len = word.length(); i < len; ++i)
node = node.put(word.charAt(i));
node.data = word;
}
public List<String> findPrefix(String text, int start) {
List<String> result = new ArrayList<>();
Node node = root;
for (int i = start, length = text.length(); i < length; ++i) {
if ((node = node.get(text.charAt(i))) == null)
break;
String v = node.data;
if (v != null)
result.add(v);
}
return result;
}
public Map<String, Integer> find(String text) {
Map<String, Integer> result = new HashMap<>();
for (int i = 0, length = text.length(); i < length; ++i)
for (String w : findPrefix(text, i))
result.compute(w, (k, v) -> v == null ? 1 : v + 1);
return result;
}
class Node {
final long no;
String data;
Node() {
this.no = nextNodeNo;
nextNodeNo += INC_NODE_NO;
}
Node get(int key) {
return nodes.get(no | key);
}
Node put(int key) {
return nodes.computeIfAbsent(no | key, k -> new Node());
}
}
}
public static void main(String args[]) throws IOException {
String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
Trie trie = new Trie();
for (String word : severalStringArray)
trie.put(word);
Map<String, Integer> count = trie.find(fullString);
System.out.println(count);
}
output:
{four=3, one=2, three=2, two=1}

Problems with sorting the characters of a word

I have a problem that I have been struggling with for some time.
I am given a word consisting of lowercase or uppercase letters of the English alphabet, and I have to sort its characters so that the characters that appear most often come first; characters that appear the same number of times are sorted lexicographically.
For example:
input:
Instructions
output:
iinnssttcoru
So far I have written this, but from here I do not know how to sort the characters and display them properly. Any tips?
public class Main {
public static void main(String[] args) throws IOException {
String testString = " ";
BufferedReader rd = new BufferedReader(new InputStreamReader(System.in));
testString = rd.readLine();
Map<Character, List<Character>> map = new HashMap<>();
for (int i = 0; i < testString.length(); i++) {
char someChar = testString.charAt(i);
if (someChar == ' ') {
continue;
}
char ch = testString.charAt(i);
List<Character> characters = map.getOrDefault(Character.toLowerCase(ch), new ArrayList<>());
characters.add(ch);
map.put(Character.toLowerCase(ch), characters);
}
List<Map.Entry<Character, List<Character>>> list = new ArrayList<>(map.entrySet());
}
}
You can add a TreeMap, counterAppear, whose key is the number of repetitions of a character and whose value is the list of characters that share that number of repetitions. Each list needs to be sorted before printing to ensure the required order. Using a TreeMap makes sure the map is sorted by key (the number of repetitions).
public static void main(String[] args) throws IOException {
String testString = " ";
BufferedReader rd = new BufferedReader(new InputStreamReader(System.in));
testString = rd.readLine();
Map<Character, List<Character>> map = new HashMap<>();
for (int i = 0; i < testString.length(); i++) {
char someChar = testString.charAt(i);
if (someChar == ' ') {
continue;
}
char ch = testString.charAt(i);
//Change to Optimize Code
Character keyCharacter = Character.toLowerCase(ch);
if (map.get(keyCharacter) == null) {
map.put(keyCharacter, new ArrayList<>());
}
List<Character> characters = map.get(keyCharacter);
characters.add(ch);
}
TreeMap<Integer, List<Character>> counterAppear = new TreeMap<>();
for (Map.Entry<Character, List<Character>> entry : map.entrySet()) {
Character character = entry.getKey();
int repeatCharTime = entry.getValue().size();
if (counterAppear.get(repeatCharTime) == null) {
counterAppear.put(repeatCharTime, new ArrayList<>());
}
List<Character> characters = counterAppear.get(repeatCharTime);
characters.add(character);
}
for (Integer repeatCharTime : counterAppear.descendingKeySet()) {
List<Character> keyCharacters = counterAppear.get(repeatCharTime);
Collections.sort(keyCharacters);
for (Character character : keyCharacters) {
for (int i = 0; i < repeatCharTime; i++) {
System.out.print(character);
}
}
}
}
Here's my solution:
import java.util.*;
public class Test
{
static void process(String s)
{
HashMap<Character,Integer> map = new HashMap<Character,Integer>();
for(Character c : s.toLowerCase().toCharArray())
{
Integer nb = map.get(c);
map.put(c, nb==null ? 1 : nb+1);
}
ArrayList<Map.Entry<Character,Integer>> list = new ArrayList<>(map.entrySet());
Collections.sort(list, (a,b) ->
{
int res = b.getValue().compareTo(a.getValue());
if(res!=0)
return res;
return a.getKey().compareTo(b.getKey());
});
for(Map.Entry<Character,Integer> e : list)
{
for(int i=0;i<e.getValue();i++)
System.out.print(e.getKey());
}
}
public static void main(String[] args)
{
process("Instructions");
}
}

How do you remove repetitions in string characters,and sort it?

I'm having a problem with this. It is supposed to take two strings and return the longest one, sorted alphabetically, with no repetitions.
For example, String x = "xbbacd" and String y = "ppacd"
would return "abcdx".
It also produces no output unless I add a System.out.println() call.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Collections;
public class MyClass {
public static String longest(String s1, String s2) {
// your code
HashSet<String> list1 = new HashSet<String>();
HashSet<String> list2 = new HashSet<String>();
for (char x : s1.toCharArray()) {
String y = Character.toString(x);
list1.add(y);
}
for (char q : s2.toCharArray()) {
String y = Character.toString(q);
list2.add(y);
}
ArrayList<String> arr1 = new ArrayList<String>();
ArrayList<String> arr2 = new ArrayList<String>();
for (String t : list1) {
arr1.add(t);
}
for (String z : list2) {
arr2.add(z);
}
Collections.sort(arr1);
Collections.sort(arr2);
String one = "";
if (arr1.size() > arr2.size()) {
for (String i : arr1) {
one = one + i;
}
} else {
for (String i : arr2) {
one = one + i;
}
}
// System.out.print(one);
return one;
}
public static void main(String[] args) {
MyClass a = new MyClass();
a.longest("adfafasf", "xvsdvwv");
}
}
If you check the string length in the beginning you do not need to process both strings, just the longer one. You save quite a bit of coding effort by only working on the longer string. You should also check on how to handle the edge cases of both strings being equal in length, null, empty strings, etc.
Try this:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Collections;
public class MyClass {
public static String longest (String s1, String s2) {
// your code
// check the strings themselves for null before asking for their lengths
if (s1 == null) return "String 1 is null";
if (s2 == null) return "String 2 is null";
// first determine which string is longer
int len1 = s1.length();
int len2 = s2.length();
String longerString = null;
if (len1 >= len2) {
longerString = s1;
} else {
longerString = s2;
}
// a set keeps only one copy of each character
HashSet<String> stringHash = new HashSet<String>();
for(char x : longerString.toCharArray() )
{
String y = Character.toString(x);
stringHash.add(y);
}
ArrayList<String> arr1 = new ArrayList<String>();
for(String t : stringHash){ arr1.add(t); }
Collections.sort(arr1);
String one = new String();
for(String i : arr1){ one = one + i; }
// System.out.print(one);
return one;
}
public static void main(String[ ] args) {
// longest is static, so no instance is needed; print the returned value
System.out.println(longest("adfafasf","xvsdvwv"));
}
}
You could (and most likely should) extract the 'distinct and sort' part of your method into a new method to reduce code-duplication. Your variable names are somewhat confusing (at least for me) as you name a set 'list' and a list 'arr'.
Regarding the missing output: you currently do not use the return value of the method. You want your last line to be System.out.println(longest("adfafasf", "xvsdvwv"));
Below is a refactored version of your code (the creation of the shorter string could be avoided, but the performance cost seems negligible for this problem):
public static String longest(
final String s1,
final String s2) {
////
final String ds1 = distinctSorted(s1);
final String ds2 = distinctSorted(s2);
return ds1.length() >= ds2.length() ? ds1 : ds2;
}
private static String distinctSorted(
final String s) {
////
final Set<Character> set = new HashSet<>();
for (final char c : s.toCharArray()) {
set.add(c);
}
final List<Character> list = new ArrayList<>(set);
Collections.sort(list);
final StringBuilder sb = new StringBuilder(list.size());
for (final char c : list) {
sb.append(c);
}
return sb.toString();
}
An alternative for the distinct and sort method:
private static String distinctSorted(
final String s) {
////
return s.chars().sorted().distinct()
.collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
.toString();
}

How to generate sentences using a Markov chain?

I'm trying to make a simple chatbot using a Markov chain. I've been able to successfully create the dictionary using patterns in input text, but I'm unable to figure out how to use it to generate sentences.
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
final class MarkovChain {
private static final BreakIterator sentenceIterator = BreakIterator.getSentenceInstance();
private static final BreakIterator wordIterator = BreakIterator.getWordInstance();
private static final Map<String, List<String>> dictionary = new TreeMap<>();
public static void addDictionary(String string) {
string = string.toLowerCase().trim();
for (final String sentence : splitSentences(string)) {
String lastWord = null, lastLastWord = null;
for (final String word : splitWords(sentence)) {
if (lastLastWord != null) {
final String key = lastLastWord + ' ' + lastWord;
List<String> value = dictionary.get(key);
if (value == null)
value = new ArrayList<>();
value.add(word);
dictionary.put(key, value);
}
lastLastWord = lastWord;
lastWord = word;
}
}
}
private static List<String> splitSentences(final String string) {
sentenceIterator.setText(string);
final List<String> sentences = new ArrayList<>();
for (int start = sentenceIterator.first(), end = sentenceIterator.next(); end != BreakIterator.DONE; start = end, end = sentenceIterator.next()) {
sentences.add(string.substring(start, end).trim());
}
return sentences;
}
private static List<String> splitWords(final String string) {
wordIterator.setText(string);
final List<String> words = new ArrayList<>();
for (int start = wordIterator.first(), end = wordIterator.next(); end != BreakIterator.DONE; start = end, end = wordIterator.next()) {
String word = string.substring(start, end).trim();
if (word.length() > 0 && Character.isLetterOrDigit(word.charAt(0)))
words.add(word);
}
return words;
}
}
How would I go about generating sentences from the dictionary?
Here is how I would change your code to make it possible to generate sentences. I added a Map<String, List<String>> singleWords that maps each word to the list of possible next words, along with code that fills this map in the loop iterating over the words of a sentence. In addition, I added dots on both sides of the word list in order to register the special states "before first word" and "after last word" (see addDots(...)).
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
final class MarkovChain {
private static final BreakIterator sentenceIterator = BreakIterator.getSentenceInstance();
private static final BreakIterator wordIterator = BreakIterator.getWordInstance();
private static final Map<String, List<String>> singleWords = new TreeMap<>();
private static final Map<String, List<String>> dictionary = new TreeMap<>();
public static void main(String[] args) throws Exception {
String text = new String(Files.readAllBytes(Paths.get("text.txt")), Charset.defaultCharset());
addDictionary(text);
StringBuilder output = new StringBuilder();
generateSentence(singleWords, dictionary, output, 5);
System.out.println(output.toString());
}
public static void addDictionary(String string) {
string = string.toLowerCase().trim();
for (final String sentence : splitSentences(string)) {
String lastWord = null, lastLastWord = null;
for (final String word : addDots(splitWords(sentence))) {
if (lastLastWord != null) {
final String key = lastLastWord + ' ' + lastWord;
List<String> value = dictionary.get(key);
if (value == null)
value = new ArrayList<>();
value.add(word);
dictionary.put(key, value);
}
if (lastWord != null) {
final String key = lastWord;
List<String> value = singleWords.get(key);
if (value == null)
value = new ArrayList<>();
value.add(word);
singleWords.put(key, value);
}
lastLastWord = lastWord;
lastWord = word;
}
}
}
private static List<String> splitSentences(final String string) {
sentenceIterator.setText(string);
final List<String> sentences = new ArrayList<>();
for (int start = sentenceIterator.first(), end = sentenceIterator.next(); end != BreakIterator.DONE; start = end, end = sentenceIterator.next()) {
sentences.add(string.substring(start, end).trim());
}
return sentences;
}
private static List<String> splitWords(final String string) {
wordIterator.setText(string);
final List<String> words = new ArrayList<>();
for (int start = wordIterator.first(), end = wordIterator.next(); end != BreakIterator.DONE; start = end, end = wordIterator.next()) {
String word = string.substring(start, end).trim();
if (word.length() > 0 && Character.isLetterOrDigit(word.charAt(0)))
words.add(word);
}
return words;
}
private static List<String> addDots(List<String> words) {
words.add(0, ".");
words.add(".");
return words;
}
public static void generateSentence(Map<String, List<String>> singleWords,
Map<String, List<String>> dictionary, StringBuilder target, int count) {
Random r = new Random();
for (int i = 0; i < count; i++) {
String w1 = ".";
String w2 = pickRandom(singleWords.get(w1), r);
while (w2 != null) {
target.append(w2).append(" ");
if (w2.equals("."))
break;
String w3 = pickRandom(dictionary.get(w1 + " " + w2), r);
w1 = w2;
w2 = w3;
}
target.append("\n");
}
}
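private static String pickRandom(List<String> alternatives, Random r) {
// no known continuation (for example, an empty input text): tell the caller to stop
if (alternatives == null || alternatives.isEmpty())
return null;
return alternatives.get(r.nextInt(alternatives.size()));
}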
}
I should mention that this approach is not optimized. To make it more efficient, I would count how often each next word occurs in the dictionary map and normalize the counts at the end to produce frequencies, something like Map<String, Map<String, Double>> dictionary, where the inner map points from a word to its frequency. That would require picking words differently from how it is done in my example, though.
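To illustrate what "picking words differently" could look like, here is a rough sketch of a count-based next-word table. It is an assumption-laden illustration rather than the answer's implementation: the class WeightedNextWord and its methods observe and pick are invented names, and it keeps raw integer counts instead of normalized Double frequencies, which gives the same selection probabilities without a separate normalization step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical class, not part of the original answer.
final class WeightedNextWord {

    // key: "word1 word2", value: next word -> number of times it followed the key
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Records that 'next' was seen after 'key' in the training text.
    void observe(String key, String next) {
        counts.computeIfAbsent(key, k -> new HashMap<>()).merge(next, 1, Integer::sum);
    }

    // Picks a next word with probability proportional to its count,
    // or returns null when the key was never observed.
    String pick(String key, Random random) {
        Map<String, Integer> options = counts.get(key);
        if (options == null || options.isEmpty()) {
            return null;
        }
        int total = 0;
        for (int count : options.values()) {
            total += count;
        }
        int target = random.nextInt(total); // uniform in [0, total)
        for (Map.Entry<String, Integer> entry : options.entrySet()) {
            target -= entry.getValue();
            if (target < 0) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("counts changed while picking");
    }

    public static void main(String[] args) {
        WeightedNextWord chain = new WeightedNextWord();
        chain.observe("the cat", "sat");
        chain.observe("the cat", "sat");
        chain.observe("the cat", "ran");
        // "sat" is twice as likely as "ran" to be printed here
        System.out.println(chain.pick("the cat", new Random()));
    }
}
```

Compared with storing every occurrence in a List<String>, this keeps one entry per distinct next word, which matters when the training text is large and repetitive.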

Finding repeated words on a string and counting the repetitions

I need to find repeated words on a string, and then count how many times they were repeated. So basically, if the input string is this:
String s = "House, House, House, Dog, Dog, Dog, Dog";
I need to create a new string list without repetitions and save somewhere else the amount of repetitions for each word, like such:
New String: "House, Dog"
New Int Array: [3, 4]
Is there a way to do this easily with Java? I've managed to separate the string using s.split() but then how do I count repetitions and eliminate them on the new string? Thanks!
You've got the hard work done. Now you can just use a Map to count the occurrences:
Map<String, Integer> occurrences = new HashMap<String, Integer>();
for ( String word : splitWords ) {
Integer oldCount = occurrences.get(word);
if ( oldCount == null ) {
oldCount = 0;
}
occurrences.put(word, oldCount + 1);
}
Using occurrences.get(word) will tell you how many times a word occurred. You can construct a new list by iterating through occurrences.keySet():
for ( String word : occurrences.keySet() ) {
//do something with word
}
Note that the order of what you get out of keySet is arbitrary. If you need the words to be sorted by when they first appear in your input String, you should use a LinkedHashMap instead.
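A small sketch of the LinkedHashMap variant, assuming the input is split on commas as in the question; the class name FirstSeenCounts and the method count are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
final class FirstSeenCounts {

    // Counts words while remembering the order in which each word first appeared.
    static Map<String, Integer> count(String input) {
        Map<String, Integer> occurrences = new LinkedHashMap<>();
        for (String word : input.split(",\\s*")) {
            if (!word.isEmpty()) {
                occurrences.merge(word, 1, Integer::sum);
            }
        }
        return occurrences;
    }

    public static void main(String[] args) {
        Map<String, Integer> occurrences = count("House, House, House, Dog, Dog, Dog, Dog");
        List<String> words = new ArrayList<>(occurrences.keySet());
        List<Integer> counts = new ArrayList<>(occurrences.values());
        System.out.println(words);  // [House, Dog]
        System.out.println(counts); // [3, 4]
    }
}
```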
Try this,
public class DuplicateWordSearcher {
@SuppressWarnings("unchecked")
public static void main(String[] args) {
String text = "a r b k c d se f g a d f s s f d s ft gh f ws w f v x s g h d h j j k f sd j e wed a d f";
List<String> list = Arrays.asList(text.split(" "));
Set<String> uniqueWords = new HashSet<String>(list);
for (String word : uniqueWords) {
System.out.println(word + ": " + Collections.frequency(list, word));
}
}
}
public class StringsCount{
public static void main(String args[]) {
String value = "This is testing Program testing Program";
String item[] = value.split(" ");
HashMap<String, Integer> map = new HashMap<>();
for (String t : item) {
if (map.containsKey(t)) {
map.put(t, map.get(t) + 1);
} else {
map.put(t, 1);
}
}
Set<String> keys = map.keySet();
for (String key : keys) {
System.out.println(key);
System.out.println(map.get(key));
}
}
}
As mentioned by others, use String::split(), followed by a map (HashMap or LinkedHashMap), and then merge your results. For completeness' sake, here is the code.
import java.util.*;
public class Genric<E>
{
public static void main(String[] args)
{
Map<String, Integer> unique = new LinkedHashMap<String, Integer>();
for (String string : "House, House, House, Dog, Dog, Dog, Dog".split(", ")) {
if(unique.get(string) == null)
unique.put(string, 1);
else
unique.put(string, unique.get(string) + 1);
}
String uniqueString = join(unique.keySet(), ", ");
List<Integer> value = new ArrayList<Integer>(unique.values());
System.out.println("Output = " + uniqueString);
System.out.println("Values = " + value);
}
public static String join(Collection<String> s, String delimiter) {
StringBuffer buffer = new StringBuffer();
Iterator<String> iter = s.iterator();
while (iter.hasNext()) {
buffer.append(iter.next());
if (iter.hasNext()) {
buffer.append(delimiter);
}
}
return buffer.toString();
}
}
The new string is Output = House, Dog
The int array (or rather list) is Values = [3, 4] (you can use List::toArray to get an array).
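Note that List::toArray on a List<Integer> yields an Integer[] of boxed values. If a primitive int[] is wanted, one possible sketch is shown here; the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
final class CountsToArray {

    // Unboxes each Integer into a primitive int array.
    static int[] toIntArray(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(3, 4);
        Integer[] boxed = values.toArray(new Integer[0]); // what List::toArray gives you
        int[] primitive = toIntArray(values);
        System.out.println(Arrays.toString(boxed));     // [3, 4]
        System.out.println(Arrays.toString(primitive)); // [3, 4]
    }
}
```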
Using java8
private static void findWords(String s, List<String> output, List<Integer> count){
String[] words = s.split(", ");
Map<String, Integer> map = new LinkedHashMap<>();
Arrays.stream(words).forEach(e->map.put(e, map.getOrDefault(e, 0) + 1));
map.forEach((k,v)->{
output.add(k);
count.add(v);
});
}
Also, use a LinkedHashMap if you want to preserve the order of insertion
private static void findWords(){
String s = "House, House, House, Dog, Dog, Dog, Dog";
List<String> output = new ArrayList<>();
List<Integer> count = new ArrayList<>();
findWords(s, output, count);
System.out.println(output);
System.out.println(count);
}
Output
[House, Dog]
[3, 4]
If this is homework, then all I can say is: use String.split() and HashMap<String,Integer>.
(I see you've found split() already. You're along the right lines then.)
It may help you somehow.
String st="I am am not the one who is thinking I one thing at time";
String []ar = st.split("\\s");
Map<String, Integer> mp= new HashMap<String, Integer>();
int count=0;
for(int i=0;i<ar.length;i++){
count=0;
for(int j=0;j<ar.length;j++){
if(ar[i].equals(ar[j])){
count++;
}
}
mp.put(ar[i], count);
}
System.out.println(mp);
Once you have got the words from the string it is easy.
From Java 10 onwards you can try the following code:
import java.util.Arrays;
import java.util.stream.Collectors;
public class StringFrequencyMap {
public static void main(String... args) {
String[] wordArray = {"House", "House", "House", "Dog", "Dog", "Dog", "Dog"};
var freq = Arrays.stream(wordArray)
.collect(Collectors.groupingBy(x -> x, Collectors.counting()));
System.out.println(freq);
}
}
Output:
{House=3, Dog=4}
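The two-argument groupingBy used above makes no guarantee about the type or ordering of the map it returns, so the printed key order should not be relied on in general. A sketch that pins the order to first appearance, assuming that is what is wanted, passes LinkedHashMap::new as the map factory (the class name OrderedFrequencyMap is illustrative):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
final class OrderedFrequencyMap {

    // Groups equal words and counts them, keeping keys in first-seen order.
    static Map<String, Long> frequencies(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] wordArray = {"House", "House", "House", "Dog", "Dog", "Dog", "Dog"};
        System.out.println(frequencies(wordArray)); // {House=3, Dog=4}
    }
}
```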
You can use a prefix tree (trie) data structure to store the words and keep track of each word's count in its prefix tree node.
#define ALPHABET_SIZE 26
// Structure of each node of prefix tree
struct prefix_tree_node {
prefix_tree_node() : count(0), child() {} // child() sets every child pointer to null
int count;
prefix_tree_node *child[ALPHABET_SIZE];
};
void insert_string_in_prefix_tree(string word)
{
prefix_tree_node *current = root;
for(unsigned int i=0;i<word.size();++i){
// Assuming it has only alphabetic lowercase characters
// Note ::::: Change this check or convert into lower case
const unsigned int letter = static_cast<int>(word[i] - 'a');
// Invalid alphabetic character: reject it (valid indexes are 0 to ALPHABET_SIZE - 1)
// Note :::: Change this condition depending on the scenario
if(letter >= ALPHABET_SIZE)
throw runtime_error("Invalid alphabetic character");
if(current->child[letter] == NULL)
current->child[letter] = new prefix_tree_node();
current = current->child[letter];
}
current->count++;
// Insert this string into Max Heap and sort them by counts
}
// Data structure for storing in Heap will be something like this
struct MaxHeapNode {
int count;
string word;
};
After inserting all the words, print each word and its count by taking entries from the max heap.
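Since the rest of this thread is in Java, the max-heap step can be sketched with java.util.PriorityQueue. This is an illustration under stated assumptions, not a port of the code above: a HashMap stands in for the trie's per-node counts, ties are broken alphabetically, and the names WordsByCount and mostFrequentFirst are invented. Entries are removed with poll() because iterating over a PriorityQueue directly does not visit elements in priority order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper class, not part of the original answer.
final class WordsByCount {

    // Returns "word=count" strings, highest count first.
    static List<String> mostFrequentFirst(String[] words) {
        // Stand-in for the trie: count each word with a plain map.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        // Max heap on count; alphabetical order breaks ties.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        heap.addAll(counts.entrySet());
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            Map.Entry<String, Integer> entry = heap.poll();
            result.add(entry.getKey() + "=" + entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = "House, House, House, Dog, Dog, Dog, Dog".split(", ");
        System.out.println(mostFrequentFirst(words)); // [Dog=4, House=3]
    }
}
```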
//program to find number of repeating characters in a string
//Developed by Subash<subash_senapati#ymail.com>
import java.util.Scanner;
public class NoOfRepeatedChar
{
public static void main(String []args)
{
//input through key board
Scanner sc = new Scanner(System.in);
System.out.println("Enter a string :");
String s1= sc.nextLine();
//formatting String to char array
String s2=s1.replace(" ","");
char [] ch=s2.toCharArray();
int counter=0;
//for-loop to compare each character with the whole character array
for(int i=0;i<ch.length;i++)
{
int count=0;
for(int j=0;j<ch.length;j++)
{
if(ch[i]==ch[j])
count++; //if character is matching with others
}
if(count>1)
{
boolean flag=false;
//for-loop to check whether the character is already referenced or not
for (int k=i-1;k>=0 ;k-- )
{
if(ch[i] == ch[k] ) //if the character is already referenced
flag=true;
}
if( !flag ) //if(flag==false)
counter=counter+1;
}
}
if(counter > 0) //if there is/are any repeating characters
System.out.println("Number of repeating characters in the given string is/are " +counter);
else
System.out.println("Sorry there is/are no repeating characters in the given string");
}
}
public static void main(String[] args) {
String s="sdf sdfsdfsd sdfsdfsd sdfsdfsd sdf sdf sdf ";
String st[]=s.split(" ");
System.out.println(st.length);
Map<String, Integer> mp= new TreeMap<String, Integer>();
for(int i=0;i<st.length;i++){
Integer count=mp.get(st[i]);
if(count == null){
count=0;
}
mp.put(st[i],++count);
}
System.out.println(mp.size());
System.out.println(mp.get("sdfsdfsd"));
}
If you pass a String argument it will count the repetition of each word
/**
* @param str the text to analyse, with words separated by single spaces
* @return a map from each word to the number of times it occurs
*/
public Map<String, Integer> findDuplicateString(String str) {
String[] stringArrays = str.split(" ");
Map<String, Integer> map = new HashMap<String, Integer>();
Set<String> words = new HashSet<String>(Arrays.asList(stringArrays));
int count = 0;
for (String word : words) {
for (String temp : stringArrays) {
if (word.equals(temp)) {
++count;
}
}
map.put(word, count);
count = 0;
}
return map;
}
output:
word1=2, word2=4, word3=1, ...
import java.util.HashMap;
import java.util.LinkedHashMap;
public class CountRepeatedWords {
public static void main(String[] args) {
countRepeatedWords("Note that the order of what you get out of keySet is arbitrary. If you need the words to be sorted by when they first appear in your input String, you should use a LinkedHashMap instead.");
}
public static void countRepeatedWords(String wordToFind) {
String[] words = wordToFind.split(" ");
HashMap<String, Integer> wordMap = new LinkedHashMap<String, Integer>();
for (String word : words) {
wordMap.put(word,
(wordMap.get(word) == null ? 1 : (wordMap.get(word) + 1)));
}
System.out.println(wordMap);
}
}
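The sentence used as test input above makes a point that is easy to show directly: the order in which the words come back depends on the map implementation. A minimal sketch, assuming Java 8 or later for `Map.merge`; the class `WordOrderDemo` and the method `count` are made-up names for illustration only.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class WordOrderDemo {
    // Hypothetical helper: counts words into whichever map the caller supplies.
    public static Map<String, Integer> count(String text, Map<String, Integer> target) {
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                target.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
            }
        }
        return target;
    }

    public static void main(String[] args) {
        String text = "pear apple pear fig apple pear";
        // LinkedHashMap keeps first-appearance order: {pear=3, apple=2, fig=1}
        System.out.println(count(text, new LinkedHashMap<>()));
        // TreeMap sorts the keys: {apple=2, fig=1, pear=3}
        System.out.println(count(text, new TreeMap<>()));
        // HashMap makes no ordering promise at all
        System.out.println(count(text, new HashMap<>()));
    }
}
```

`merge` also avoids the double `get` lookup in the ternary above.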
I hope this will help you
public void countInPara(String str) {
Map<Integer,String> strMap = new HashMap<Integer,String>();
List<String> paraWords = Arrays.asList(str.split(" "));
Set<String> strSet = new LinkedHashSet<>(paraWords);
int count;
for(String word : strSet) {
count = Collections.frequency(paraWords, word);
strMap.put(count, strMap.get(count)==null ? word : strMap.get(count).concat(","+word));
}
for(Map.Entry<Integer,String> entry : strMap.entrySet())
System.out.println(entry.getKey() +" :: "+ entry.getValue());
}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class DuplicateWord {
public static void main(String[] args) {
String para = "this is what it is this is what it can be";
List < String > paraList = new ArrayList < String > ();
paraList = Arrays.asList(para.split(" "));
System.out.println(paraList);
int size = paraList.size();
int i = 0;
Map < String, Integer > duplicatCountMap = new HashMap < String, Integer > ();
for (int j = 0; size > j; j++) {
int count = 0;
for (i = 0; size > i; i++) {
if (paraList.get(j).equals(paraList.get(i))) {
count++;
duplicatCountMap.put(paraList.get(j), count);
}
}
}
System.out.println(duplicatCountMap);
List < Integer > myCountList = new ArrayList < > ();
Set < String > myValueSet = new HashSet < > ();
for (Map.Entry < String, Integer > entry: duplicatCountMap.entrySet()) {
myCountList.add(entry.getValue());
myValueSet.add(entry.getKey());
}
System.out.println(myCountList);
System.out.println(myValueSet);
}
}
Input: this is what it is this is what it can be
Output:
[this, is, what, it, is, this, is, what, it, can, be]
{can=1, what=2, be=1, this=2, is=3, it=2}
[1, 2, 1, 2, 3, 2]
[can, what, be, this, is, it]
import java.util.HashMap;
import java.util.Scanner;
public class class1 {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
String inpStr = in.nextLine();
int key;
HashMap<String,Integer> hm = new HashMap<String,Integer>();
String[] strArr = inpStr.split(" ");
for(int i=0;i<strArr.length;i++){
if(hm.containsKey(strArr[i])){
key = hm.get(strArr[i]);
hm.put(strArr[i],key+1);
}
else{
hm.put(strArr[i],1);
}
}
System.out.println(hm);
}
}
Please use the code below. It is the simplest approach I could find; I hope you like it:
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;
public class MostRepeatingWord {
String mostRepeatedWord(String s){
String[] splitted = s.split(" ");
List<String> listString = Arrays.asList(splitted);
Set<String> setString = new HashSet<String>(listString);
int count = 0;
int maxCount = 1;
String maxRepeated = null;
for(String inp: setString){
count = Collections.frequency(listString, inp);
if(count > maxCount){
maxCount = count;
maxRepeated = inp;
}
}
return maxRepeated;
}
public static void main(String[] args)
{
System.out.println("Enter The Sentence: ");
Scanner s = new Scanner(System.in);
String input = s.nextLine();
MostRepeatingWord mrw = new MostRepeatingWord();
System.out.println("Most repeated word is: " + mrw.mostRepeatedWord(input));
}
}
package day2;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
public class DuplicateWords {
public static void main(String[] args) {
String S1 = "House, House, House, Dog, Dog, Dog, Dog";
String S2 = S1.toLowerCase();
// Split on a comma plus optional whitespace, so "house," and "house" count as the same word
String[] S3 = S2.split(",\\s*");
List<String> a1 = new ArrayList<String>();
HashMap<String, Integer> hm = new HashMap<>();
// Visit every word, including the last one
for (int i = 0; i < S3.length; i++) {
if(!a1.contains(S3[i]))
{
a1.add(S3[i]);
}
else
{
continue;
}
int Count = 0;
for (int j = 0; j < S3.length; j++)
{
if(S3[j].equals(S3[i]))
{
Count++;
}
}
hm.put(S3[i], Count);
}
System.out.println("Duplicate Words and their number of occurrences in String S1 : " + hm);
}
}
public class Counter {
private static final int COMMA_AND_SPACE_PLACE = 2;
private String mTextToCount;
private ArrayList<String> mSeparateWordsList;
public Counter(String mTextToCount) {
this.mTextToCount = mTextToCount;
mSeparateWordsList = cutStringIntoSeparateWords(mTextToCount);
}
private ArrayList<String> cutStringIntoSeparateWords(String text)
{
ArrayList<String> returnedArrayList = new ArrayList<>();
if(text.indexOf(',') == -1)
{
returnedArrayList.add(text);
return returnedArrayList;
}
int position1 = 0;
int position2 = 0;
while(position2 < text.length())
{
char c = ',';
if(text.toCharArray()[position2] == c)
{
String tmp = text.substring(position1, position2);
position1 += tmp.length() + COMMA_AND_SPACE_PLACE;
returnedArrayList.add(tmp);
}
position2++;
}
if(position1 < position2)
{
returnedArrayList.add(text.substring(position1, position2));
}
return returnedArrayList;
}
public int[] countWords()
{
if(mSeparateWordsList == null) return null;
HashMap<String, Integer> wordsMap = new HashMap<>();
for(String s: mSeparateWordsList)
{
int cnt;
if(wordsMap.containsKey(s))
{
cnt = wordsMap.get(s);
cnt++;
} else {
cnt = 1;
}
wordsMap.put(s, cnt);
}
return printCounterResults(wordsMap);
}
private int[] printCounterResults(HashMap<String, Integer> m)
{
int index = 0;
int[] returnedIntArray = new int[m.size()];
for(int i: m.values())
{
returnedIntArray[index] = i;
index++;
}
return returnedIntArray;
}
}
/* Count the occurrences of each word in a String using a TreeMap. A HashMap also works, but the words will not be displayed in sorted order. */
import java.util.*;
public class Genric3
{
public static void main(String[] args)
{
Map<String, Integer> unique = new TreeMap<String, Integer>();
String string1="Ram:Ram: Dog: Dog: Dog: Dog:leela:leela:house:house:shayam";
String string2[]=string1.split(":");
for (int i=0; i<string2.length; i++)
{
String string=string2[i];
unique.put(string,(unique.get(string) == null?1:(unique.get(string)+1)));
}
System.out.println(unique);
}
}
//program to count how many times each word occurs in a string
//Developed by Rahul Lakhmara
import java.util.*;
public class CountWordsInString {
public static void main(String[] args) {
String original = "I am rahul am i sunil so i can say am i";
// making String type of array
String[] originalSplit = original.split(" ");
// if word has only one occurrence
int count = 1;
// LinkedHashMap will store the word as key and number of occurrence as
// value
Map<String, Integer> wordMap = new LinkedHashMap<String, Integer>();
// Include the last word too; otherwise a word that appears only at the end is never stored
for (int i = 0; i < originalSplit.length; i++) {
for (int j = i + 1; j < originalSplit.length; j++) {
if (originalSplit[i].equals(originalSplit[j])) {
// Increment in count, it will count how many time word
// occurred
count++;
}
}
// if word is already present so we will not add in Map
if (wordMap.containsKey(originalSplit[i])) {
count = 1;
} else {
wordMap.put(originalSplit[i], count);
count = 1;
}
}
Set word = wordMap.entrySet();
Iterator itr = word.iterator();
while (itr.hasNext()) {
Map.Entry map = (Map.Entry) itr.next();
// Printing
System.out.println(map.getKey() + " " + map.getValue());
}
}
}
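public static void main(String[] args){
String string = "elamparuthi, elam, elamparuthi";
String[] s = string.replace(" ", "").split(",");
// A LinkedHashSet drops duplicates while keeping first-seen order.
// (A substring check such as ops.contains(word) would wrongly treat
// "elam" as a duplicate of "elamparuthi".)
java.util.Set<String> unique = new java.util.LinkedHashSet<>(java.util.Arrays.asList(s));
System.out.println(String.join(", ", unique));
}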
For strings without spaces, the following code can be used:
private static void findRecurrence(String input) {
final Map<String, Integer> map = new LinkedHashMap<>();
for(int i=0; i<input.length(); ) {
int pointer = i;
int startPointer = i;
boolean pointerHasIncreased = false;
for(int j=0; j<startPointer; j++){
if(pointer<input.length() && input.charAt(j)==input.charAt(pointer) && input.charAt(j)!=32){
pointer++;
pointerHasIncreased = true;
}else{
if(pointerHasIncreased){
break;
}
}
}
if(pointer - startPointer >= 2) {
String word = input.substring(startPointer, pointer);
if(map.containsKey(word)){
map.put(word, map.get(word)+1);
}else{
map.put(word, 1);
}
i=pointer;
}else{
i++;
}
}
for(Map.Entry<String, Integer> entry : map.entrySet()){
System.out.println(entry.getKey() + " = " + (entry.getValue()+1));
}
}
Passing input such as "hahaha", "ba na na" or "xxxyyyzzzxxxzzz" gives the desired output.
Hope this helps :
public static int countOfStringInAText(String stringToBeSearched, String masterString){
int count = 0;
while (masterString.indexOf(stringToBeSearched)>=0){
count = count + 1;
masterString = masterString.substring(masterString.indexOf(stringToBeSearched)+1);
}
return count;
}
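Note that the method above counts overlapping matches ("aa" is found twice in "aaa"), builds a new substring on every pass, and throws on an empty search string. One possible variant, sketched here with made-up names (`OccurrenceCounter`, `count`, `allowOverlap`), uses `String.indexOf(String, int)` to continue the search from an offset. Treating an empty search string as "no matches" is an assumption of this sketch.

```java
public class OccurrenceCounter {
    // Hypothetical helper: counts matches of needle in haystack.
    // With allowOverlap = false the search resumes after the end of each match.
    public static int count(String haystack, String needle, boolean allowOverlap) {
        if (needle.isEmpty()) {
            return 0; // assumption: an empty needle matches nothing
        }
        int count = 0;
        int step = allowOverlap ? 1 : needle.length();
        int from = haystack.indexOf(needle);
        while (from >= 0) {
            count++;
            from = haystack.indexOf(needle, from + step);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("birdantantcatbirdcat", "ant", false)); // 2
        System.out.println(count("aaa", "aa", true));                    // 2
        System.out.println(count("aaa", "aa", false));                   // 1
    }
}
```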
package string;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
public class DublicatewordinanArray {
public static void main(String[] args) {
String str = "This is Dileep Dileep Kumar Verma Verma";
DuplicateString(str);
}
public static void DuplicateString(String str) {
String word[] = str.split(" ");
Map < String, Integer > map = new HashMap < String, Integer > ();
for (String w: word)
if (!map.containsKey(w)) {
map.put(w, 1);
}
else {
map.put(w, map.get(w) + 1);
}
Set < Map.Entry < String, Integer >> entrySet = map.entrySet();
for (Map.Entry < String, Integer > entry: entrySet)
if (entry.getValue() > 1) {
System.out.printf("%s : %d %n", entry.getKey(), entry.getValue());
}
}
}
Using Java 8 streams collectors:
public static Map<String, Integer> countRepetitions(String str) {
return Arrays.stream(str.split(", "))
.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum)); // Integer::sum adds both partial counts, which also stays correct for parallel streams
}
Input: "House, House, House, Dog, Dog, Dog, Dog, Cat"
Output: {Cat=1, House=3, Dog=4}
Please try this; it may help you.
public static void main(String[] args) {
String str1="House, House, House, Dog, Dog, Dog, Dog";
String str2=str1.replace(",", "");
Map<String,Integer> map=findFrquenciesInString(str2);
Set<String> keys=map.keySet();
Collection<Integer> vals=map.values();
System.out.println(keys);
System.out.println(vals);
}
private static Map<String,Integer> findFrquenciesInString(String str1) {
String[] strArr=str1.split(" ");
Map<String,Integer> map=new HashMap<>();
for(int i=0;i<strArr.length;i++) {
int count=1;
for(int j=i+1;j<strArr.length;j++) {
// Compare string contents with equals(); != only compares references
if(strArr[i].equals(strArr[j]) && !"-1".equals(strArr[i])) {
strArr[j]="-1";
count++;
}
}
if(count>1 && !"-1".equals(strArr[i])) {
map.put(strArr[i], count);
strArr[i]="-1";
}
}
return map;
}
Since the introduction of streams has changed the way we write Java, here are a few ways of doing this with them:
String[] strArray = str.split(" ");
//1. All string value with their occurrences
Map<String, Long> counterMap =
Arrays.stream(strArray).collect(Collectors.groupingBy(e->e, Collectors.counting()));
//2. only duplicating Strings
Map<String, Long> temp = counterMap.entrySet().stream().filter(map->map.getValue() > 1).collect(Collectors.toMap(map -> map.getKey(), map -> map.getValue()));
System.out.println("test : "+temp);
//3. List of Duplicating Strings
List<String> masterStrings = Arrays.asList(strArray);
Set<String> duplicatingStrings =
masterStrings.stream().filter(i -> Collections.frequency(masterStrings, i) > 1).collect(Collectors.toSet());
Use Function.identity() inside Collectors.groupingBy and collect everything into a Map.
String a = "Gini Gina Gina Gina Gina Protijayi Protijayi ";
Map<String, Long> map11 = Arrays.stream(a.split(" ")).collect(Collectors
.groupingBy(Function.identity(),Collectors.counting()));
System.out.println(map11);
// output => {Gina=4, Gini=1, Protijayi=2}
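The two-argument `groupingBy` used above makes no promise about the order of the keys in the result. If the words should come back in the order they first appear, one option is the three-argument overload that takes a map factory. A minimal sketch; `GroupingOrderDemo` and `countInOrder` are names invented for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupingOrderDemo {
    // Hypothetical helper: counts words and keeps first-appearance order.
    public static Map<String, Long> countInOrder(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(
                        Function.identity(),   // the word itself is the key
                        LinkedHashMap::new,    // map factory: preserves insertion order
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countInOrder("Gini Gina Gina Gina Gina Protijayi Protijayi "));
        // prints {Gini=1, Gina=4, Protijayi=2}
    }
}
```

Passing `TreeMap::new` instead would give alphabetically sorted keys.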
In Python we can use collections.Counter()
import collections
a = "Roopa Roopi loves green color Roopa Roopi"
words = a.split()
wordsCount = collections.Counter(words)
for word,count in sorted(wordsCount.items()):
print('"%s" is repeated %d time%s.' % (word,count,"s" if count > 1 else "" ))
Output :
"Roopa" is repeated 2 times.
"Roopi" is repeated 2 times.
"color" is repeated 1 time.
"green" is repeated 1 time.
"loves" is repeated 1 time.
