Replacing abbreviations/slang with their full forms - Java

I am using a HashMap to store the full forms for abbreviations.
public class Test {
public static void main(String[] args) {
Map<String, String> slangs = new HashMap<String, String>();
slangs.put("lol", "laugh out loud");
slangs.put("r", " are ");
slangs.put("n", " and ");
slangs.put("idk", " I don't know ");
slangs.put("u", " you ");
Set set = slangs.entrySet();
Iterator i = set.iterator();
String sentence = "lol how are you";
StringBuilder sb = new StringBuilder();
for (String word : sentence.split(" ")) {
while(i.hasNext()) {
Map.Entry<String, String> me = (Map.Entry)i.next();
if (word.equalsIgnoreCase(me.getKey())) {
sb.append(me.getValue());
continue;
}
sb.append(word);
}
}
System.out.println(sb.toString());
}
}
The Output is:
lollollollaugh out loudlol
What is wrong here and how do I solve it?

Your output is garbled because sb.append(word) runs for every entry that does not match, and because the iterator, created once outside the for loop, is already exhausted after the first word. You are not supposed to iterate over the entries to find a match at all. Use get(Object key) or getOrDefault(Object key, V defaultValue) to look up the full form of an abbreviation. Iterating turns an O(1) lookup into an O(n) scan, which throws away the main benefit of keeping your key/value pairs in a Map. If you iterated only to get a case-insensitive match, store the keys in lower case and call get or getOrDefault with the lower-cased word.
So your loop should be something like:
for (String word : sentence.split(" ")) {
// Get the full form of the value of word in lower case otherwise use
// the word itself
sb.append(slangs.getOrDefault(word.toLowerCase(), String.format(" %s", word)));
}
Output:
laugh out loud how are you
Using the Stream API, it could simply be:
String result = Pattern.compile(" ")
.splitAsStream(sentence)
.map(word -> slangs.getOrDefault(word.toLowerCase(), word))
.collect(Collectors.joining(" "));
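Putting the pieces of this answer together, a complete program might look like the sketch below. The class name SlangExpander and the method expand are illustrative names, not from the original post, and the map values are stored without padding spaces (an assumption that differs from the question's map) so that the joining step alone decides where spaces go.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

final class SlangExpander {
    // Keys are lower case and values carry no padding spaces (assumption),
    // so Collectors.joining(" ") is the only place spaces are inserted.
    private static final Map<String, String> SLANGS = new HashMap<>();
    static {
        SLANGS.put("lol", "laugh out loud");
        SLANGS.put("r", "are");
        SLANGS.put("n", "and");
        SLANGS.put("idk", "I don't know");
        SLANGS.put("u", "you");
    }

    // Replaces every whitespace-separated word that is a known abbreviation;
    // unknown words are kept unchanged.
    static String expand(String sentence) {
        return Arrays.stream(sentence.trim().split("\\s+"))
                .map(word -> SLANGS.getOrDefault(word.toLowerCase(Locale.ROOT), word))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(expand("lol how r u"));     // laugh out loud how are you
        System.out.println(expand("IDK what u mean")); // I don't know what you mean
    }
}
```

Each word costs one hash lookup, and a word followed by punctuation (such as "u?") is left untouched because it is not a key in the map.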

Don't loop over the keys in the dictionary. Instead, just check whether the key is in the map and get the corresponding value. Also, don't forget to add the spaces back into the combined sentence.
for (String word : sentence.split(" ")) {
if (slangs.containsKey(word.toLowerCase())) {
sb.append(slangs.get(word.toLowerCase()));
} else {
sb.append(word);
}
sb.append(" ");
}
If you are using Java 8, you can also use String.join, Map.getOrDefault and Streams:
String s = String.join(" ", Stream.of(sentence.split(" "))
.map(word -> slangs.getOrDefault(word.toLowerCase(), word))
.toArray(n -> new String[n]));
This latter approach also has the benefit of not adding a space before the first or after the last word in the sentence.

I think you just need to check whether slangs contains the word.
Here is my code:
public class Test {
public static void main(String[] args) {
Map<String, String> slangs = new HashMap<String, String>();
slangs.put("lol", "laugh out loud");
slangs.put("r", " are ");
slangs.put("n", " and ");
slangs.put("idk", " I don't know ");
slangs.put("u", " you ");
String sentence = "lol how are you";
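String[] words = sentence.split(" ");
for (int i = 0; i < words.length; i++) {
String normalizedWord = words[i].trim().toLowerCase();
if (slangs.containsKey(normalizedWord)) {
// Replace only this word. Calling sentence.replace(word, ...) would also
// rewrite the same letters inside other words (e.g. the "u" in "you").
words[i] = slangs.get(normalizedWord).trim();
}
}
System.out.println(String.join(" ", words));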
}
}

Related

How do I remove the whitespace after each comma in an ArrayList?

I am struggling with removing spaces in an arraylist of strings.
Say the user input is as follows: "I walked my dog", my code outputs this:
[I, walked, my, dog]
I want it to have no spaces as such:
[I,walked,my,dog]
I tried to remove whitespace on each individual string before I add it to the arrayList, but the output still has the spaces.
Scanner input = new Scanner(System.in);
ArrayList<String> userWords = new ArrayList<String>();
ArrayList<String> SplituserWords = new ArrayList<String>();
System.out.println("Please enter your phrase: ");
userWords.add(input.nextLine());
for (int index = 0; index < userWords.size(); index++) {
String[] split = userWords.get(index).split("\\s+");
for (String word : split) {
word.replaceAll("\\s+","");
SplituserWords.add(word);
}
}
System.out.println(SplituserWords);
I suggest just taking advantage of the built-in Arrays#toString() method here:
String words = input.nextLine();
String output = Arrays.toString(words.split(" ")).replace(" ", "");
System.out.println(output); // [I,walked,my,dog]
When you write System.out.println(SplituserWords);, you implicitly call ArrayList#toString(), which builds the list's string representation and puts a space after each comma.
You can instead generate your own output string, for example with:
System.out.println("[" + String.join(",", SplituserWords) + "]");
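As a small runnable sketch of building the output yourself, the example below uses Collectors.joining with a prefix and suffix, which is equivalent to the String.join line above. The class name ListFormatDemo and the method compact are illustrative names, not from the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ListFormatDemo {
    // Formats a list of strings as [a,b,c] without modifying the elements.
    static String compact(List<String> words) {
        return words.stream().collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("I walked my dog".split("\\s+")));
        System.out.println(words);          // [I, walked, my, dog]  (ArrayList#toString)
        System.out.println(compact(words)); // [I,walked,my,dog]
    }
}
```

Unlike calling replace(" ", "") on the result of toString(), this leaves spaces inside an element (for example "New York") intact.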
If you insist on using a List, you can call toString() on it and strip the spaces from the result:
String input = "I walked my dog";
List<String> SplitUserWords = Arrays.asList(input.split(" "));
String output = SplitUserWords.toString().replace(" ", "");
System.out.println(output); //[I,walked,my,dog]
"I tried to remove whitespace on each individual string before I add it to the arrayList, but the output still has the spaces."
That won't work because that isn't the problem. The list implementation formats the output for you and inserts a space after each comma; it does this in its toString() method. To avoid having to call replace explicitly each time, you can override toString() when you create your List:
List<String> myList = new ArrayList<>(List.of("I","walked","my", "dog")) {
@Override
public String toString() {
// use super to call the overridden method to format the string
// then remove the spaces and return the new String
return super.toString().replace(" ", "");
}
};
System.out.println(myList);
myList.addAll(List.of("and", "then","fed", "my", "cat"));
System.out.println(myList);
prints
[I,walked,my,dog]
[I,walked,my,dog,and,then,fed,my,cat]
You can also subclass ArrayList as follows. Here I have added the three constructors that ArrayList provides. Note that this is a somewhat extreme solution and may not be worth it for occasionally reformatting the output. I included it for your consideration.
class MyArrayList<E> extends ArrayList<E> {
public MyArrayList() {
super();
}
public MyArrayList(int capacity) {
super(capacity);
}
public MyArrayList(Collection<? extends E> c) {
super(c);
}
@Override
public String toString() {
return super.toString().replace(" ", "");
}
}
And it would work like so.
MyArrayList<String> myList = new MyArrayList<>(List.of("This", "is", "a","test."));
System.out.println(myList);
prints
[This,is,a,test.]

adding Set<String> into a Java ArrayList

I am writing a programme that reads a word (a String) from a file. I am translating the string, but the translator returns a Set<String>.
What I want to be able to do is to store the original word and the translated word next to each other in an ArrayList.
while (sc.hasNext()) {
String word = sc.next();
System.out.println(word);
al.add(word); // String
al.add(translate(word)); // Set<String>
}
Now I want to access both the word and the translated word. Just to test, I am trying to print them, but this is where my code is broken:
for(int i=0;i<al.size();i++){
Object o = al.get(i);
Object p = al.get(i);
System.out.println("Value is "+o.toString());
System.out.println("Value is "+p.toSet<?>());
}
I would suggest not using a list. If you want to know which word was translated into which Set, use a Map (called a dictionary in some other languages):
Map<String, Set<String>> translations = new HashMap<>();
while (sc.hasNext()) {
String entry = sc.next();
translations.put(entry, translate(entry));
}
System.out.println(translations);
The problem
System.out.println("Value is "+p.toSet<?>());
This won't compile: java.lang.Object has no toSet() method, and toSet<?>() is not valid Java syntax.
Another solution
The best solution might be to store the words in a Map, with the original word as the key and its translations as the value. This only works if there are no conflicts (i.e. duplicate words), because a later entry replaces an earlier one with the same key.
Map<String, Set<String>> words = new HashMap<>();
while (sc.hasNext()) {
String word = sc.next();
Set<String> translated = translate(word);
words.put(word, translated);
}
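A self-contained sketch of the Map approach is shown below. The translate method here is a hypothetical stand-in for the asker's translator (it only returns the upper-case and reversed forms of the word), and the names TranslationDemo and translateAll are illustrative. A LinkedHashMap is used instead of a HashMap purely so that printing follows the reading order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;

final class TranslationDemo {
    // Hypothetical stand-in for the real translate(word): it returns the
    // upper-case and reversed forms so the example has something to store.
    static Set<String> translate(String word) {
        return new LinkedHashSet<>(Arrays.asList(
                word.toUpperCase(Locale.ROOT),
                new StringBuilder(word).reverse().toString()));
    }

    // Reads whitespace-separated words and maps each one to its translations.
    static Map<String, Set<String>> translateAll(Scanner sc) {
        Map<String, Set<String>> translations = new LinkedHashMap<>();
        while (sc.hasNext()) {
            String word = sc.next();
            translations.put(word, translate(word));
        }
        return translations;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> translations = translateAll(new Scanner("cat dog"));
        // Each entry gives typed access to the word and its Set<String>, no casts needed.
        for (Map.Entry<String, Set<String>> e : translations.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With the input "cat dog" this prints "cat -> [CAT, tac]" followed by "dog -> [DOG, god]".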

String sort in which each word in the string contains a number indicating its position in the result

I have to sort a string in which each word contains a number that gives the position of that word in the resulting string. The numbers range from 1 to 9, and the words in the string contain only valid consecutive numbers.
E.g.: "is2 this1 test4 3a"
What is the most efficient way to solve this? After splitting the string on spaces, how do I compare and arrange the words using the minimum number of loops?
Try this:
private final static String testString = "is2 this1 test4 3a";
public static void main(String[] args){
String[] splittedString = testString.split(" ");
Map<Integer, String> map = new TreeMap<Integer, String>();
for(String position: splittedString) {
map.put(Integer.parseInt(position.replaceAll("[^\\d.]" , "")), position) ;
}
Test the logic:
for(Map.Entry<Integer, String> entry : map.entrySet())
System.out.println(entry.getKey() + " " + entry.getValue());
...
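The key here is the TreeMap (Java's standard sorted Map implementation), which keeps its entries ordered by key, so no explicit sort is needed.
The rest is straightforward; the only subtle part is the regex, which strips the non-numeric characters from each word so that the position can be parsed as a number.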
Last step:
String[] result = map.values().toArray(new String[map.size()]);
System.out.println(Arrays.toString(result).replace(",",""));
Hope it helps!
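For reference, the fragments above can be combined into one runnable class, roughly as sketched here. The names PositionSort and sortByEmbeddedDigit are illustrative, the regex is simplified to "\\D" (strip every non-digit), and the sketch assumes every word contains exactly one number, as the question states.

```java
import java.util.Map;
import java.util.TreeMap;

final class PositionSort {
    // Orders the words by the number embedded in each of them.
    static String sortByEmbeddedDigit(String input) {
        Map<Integer, String> byPosition = new TreeMap<>();
        for (String word : input.trim().split("\\s+")) {
            // Strip every non-digit character; what remains is the position.
            int position = Integer.parseInt(word.replaceAll("\\D", ""));
            byPosition.put(position, word);
        }
        // TreeMap iterates in ascending key order, so values() is already sorted.
        return String.join(" ", byPosition.values());
    }

    public static void main(String[] args) {
        System.out.println(sortByEmbeddedDigit("is2 this1 test4 3a")); // this1 is2 3a test4
    }
}
```

Joining the values directly also avoids the square brackets that Arrays.toString adds to the printed result.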

Searching through a hash map for multiple keys in Java

I am trying to figure out how to search some user input for multiple keywords. The keywords come from a hash map called Synonym. Basically, I enter a sentence, and if the sentence contains one or more keywords or keyword synonyms, I want to call a parse-file method. So far I can only search for one keyword. I am stuck on taking a user input, which could be a long sentence or just one word containing the keyword(s), and searching the hash map keys for the matching words. For example, if the hash map is
responses.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
responses.put("current assignment", new String[] { "homework","current work" });
and the user inputs "what is the name of textbook that has the homework", I want to search a text file for textbook current assignment. Assume that the text file contains the sentence "The current assignment is in the second textbook name ralphy". I have most of my implementation done; the issue is dealing with more than one keyword. Can someone help me solve this?
Here is my code
private static HashMap<String, String[]> responses = new HashMap<String, String[]>(); // this
public static void parseFile(String s) throws FileNotFoundException {
File file = new File("data.txt");
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
final String lineFromFile = scanner.nextLine();
if (lineFromFile.contains(s)) {
// a match!
System.out.println(lineFromFile);
// break;
}
}
}
private static HashMap<String, String[]> populateSynonymMap() {
responses.put("test", new String[] { "test load", "quantity of test","amount of test" });
responses.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
responses.put("professor office", new String[] { "room", "post", "place" });
responses.put("day", new String[] { "time", "date" });
responses.put("current assignment", new String[] { "homework","current work" });
return responses;
}
public static void main(String args[]) throws ParseException, IOException {
/* Initialization */
HashMap<String, String[]> synonymMap = new HashMap<String, String[]>();
synonymMap = populateSynonymMap(); // populate the map
Scanner scanner = new Scanner(System.in);
String input = null;
/*End Initialization*/
System.out.println("Welcome To DataBase ");
System.out.println("What would you like to know?");
System.out.print("> ");
input = scanner.nextLine().toLowerCase();
String[] inputs = input.split(" ");
for (String ing : inputs) { // iterate over each word of the sentence.
boolean found = false;
for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
String key = entry.getKey();
String[] value = entry.getValue();
if (input.contains(key) || key.contains(input)|| Arrays.asList(value).contains(input)) {
found = true;
parseFile(entry.getKey());
}
}
}
}
Any help would be appreciated
I have answered a very similar question, "Understand two or more keys with Hashmaps", but I'll make my point clearer here. With the data structures you are currently using, consider the following two lists:
1) Input list --> Split the sentence into words (keeping their order) and store them in a list, for example [what,is,the,name,of,textbook,that,has,the,homework]
2) Keyword list --> All keys from the HashMap you are using, for example [test,textbook name,professor office]
Now set a criterion, for instance that a keyword is a phrase of at most 3 consecutive words from the sentence (for example 'name of textbook'). The point of the criterion is to limit the processing; otherwise you end up checking a lot of combinations of the input.
Once you have this, check what the input list and the keyword list have in common under that criterion. If you don't set a criterion, you can try all the combinations against the key set. Once you find one or more matches, output the synonym list, etc.
For example, check [name of textbook] against all the keys of your map.
If you want to check in the reverse direction, do the same by creating a list of synonyms and checking against it.
My two tips for tackling this problem:
1) Define a set of keywords and don't check against the value lists; a HashMap is not a good structure for that. Be prepared for some redundant data if you do this.
2) Decide how many consecutive words you want to look up in this key set, and preferably keep only distinct words.
Hope this helps!
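One possible reading of the approach described above is sketched below: build a reverse index from every keyword and synonym to its keyword, then look up each run of up to three consecutive input words in that index. The names KeywordFinder, buildIndex, findKeywords and MAX_PHRASE_WORDS are illustrative and not part of the original answer, and the three-word limit is the assumed criterion.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeywordFinder {
    private static final int MAX_PHRASE_WORDS = 3; // assumed phrase-length limit

    // Reverse index: every keyword and every synonym points at its keyword.
    static Map<String, String> buildIndex(Map<String, String[]> synonyms) {
        Map<String, String> index = new HashMap<>();
        for (Map.Entry<String, String[]> e : synonyms.entrySet()) {
            index.put(e.getKey(), e.getKey());
            for (String synonym : e.getValue()) {
                index.put(synonym, e.getKey());
            }
        }
        return index;
    }

    // Looks up every run of 1..MAX_PHRASE_WORDS consecutive words in the index.
    static Set<String> findKeywords(String sentence, Map<String, String> index) {
        String[] words = sentence.toLowerCase(Locale.ROOT).trim().split("\\s+");
        Set<String> found = new LinkedHashSet<>(); // distinct, in order of first match
        for (int start = 0; start < words.length; start++) {
            for (int len = 1; len <= MAX_PHRASE_WORDS && start + len <= words.length; len++) {
                String phrase = String.join(" ", Arrays.copyOfRange(words, start, start + len));
                String keyword = index.get(phrase);
                if (keyword != null) {
                    found.add(keyword);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, String[]> synonyms = new HashMap<>();
        synonyms.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
        synonyms.put("current assignment", new String[] { "homework", "current work" });
        Map<String, String> index = buildIndex(synonyms);
        System.out.println(findKeywords("what is the name of textbook that has the homework", index));
        // [textbook name, current assignment]
    }
}
```

Each matched keyword could then be passed to parseFile in turn. The reverse index holds every synonym as its own key, which is the redundant data mentioned in the first tip.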
You could use a single regex pattern per "dictionary entry" and test each pattern against your input. Depending on your performance requirements and the size of your dictionary and input, it might be a good solution.
If you're using Java 8, try this:
public static class DicEntry {
String key;
String[] syns;
Pattern pattern;
public DicEntry(String key, String... syns) {
this.key = key;
this.syns = syns;
pattern = Pattern.compile(".*(?:" + Stream.concat(Stream.of(key), Stream.of(syns))
.map(x -> "\\b" + Pattern.quote(x) + "\\b")
.collect(Collectors.joining("|")) + ").*");
}
}
public static void main(String args[]) throws ParseException, IOException {
// Initialization
List<DicEntry> synonymMap = populateSynonymMap();
Scanner scanner = new Scanner(System.in);
// End Initialization
System.out.println("Welcome To DataBase ");
System.out.println("What would you like to know?");
System.out.print("> ");
String input = scanner.nextLine().toLowerCase();
boolean found;
for (DicEntry entry : synonymMap) {
if (entry.pattern.matcher(input).matches()) {
found = true;
System.out.println(entry.key);
parseFile(entry.key);
}
}
}
private static List<DicEntry> populateSynonymMap() {
List<DicEntry> responses = new ArrayList<>();
responses.add(new DicEntry("test", "test load", "quantity of test", "amount of test"));
responses.add(new DicEntry("textbook name", "name of textbook", "text", "portfolio"));
responses.add(new DicEntry("professor office", "room", "post", "place"));
responses.add(new DicEntry("day", "time", "date"));
responses.add(new DicEntry("current assignment", "homework", "current work"));
return responses;
}
Sample output:
Welcome To DataBase
What would you like to know?
> what is the name of textbook that has the homework
textbook name
current assignment
Collect the keys that match by appending them to a string. In the given example, when the keyword "textbook" matches, store it in a variable temp. Continue the loop; when the keyword "current" matches, append it to temp, so temp now contains "textbook current". Similarly, continue and append the next keyword, "assignment".
Now temp contains "textbook current assignment".
At the end, call parseFile(temp).
This should work for single or multiple matches.
// The only limitation is that the keys must be given in an ordered sequence. If you want
// to evaluate all the possible combinations, it is better to add all the keys to a list
// and append them in the required combination.
// There might be corner cases I haven't thought of, but this might help or point to a better solution.
String temp = "";
//flag - used to indicate whether any word was found in the dictionary or not?
int flag = 0;
for (String ing : inputs) { // iterate over each word of the sentence.
boolean found = false;
for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
String key = entry.getKey();
String[] value = entry.getValue();
if (input.contains(key)) {
flag = 1;
found = true;
temp = temp +" "+ key;
}
else if (key.contains(input)) {
flag = 1;
found = true;
temp = temp +" "+ input;
}
else if (Arrays.asList(value).contains(input)) {
flag = 1;
found = true;
temp = temp +" "+ input;
}
}
}
if (flag == 1){
parseFile(temp);
}

Find any of the list element present in given string

I have a list of strings, say "abc", "bcd", "xyz", etc.,
and a string "xyzxxxxxxxxx", and I need to find out whether any of the list values is present in the string.
In C# we have the .Any function for this. Is there a way to do it in Java?
Without streams there is no one-liner for this, but you can loop over the list and use .contains():
String[] listOfStrings = { "abc", "bcd", "xyz" };
String s = "xyzxxxxxxxxx";
boolean found = false;
for (String temp : listOfStrings) {
if (s.contains(temp)) {
found = true; // at least one list element occurs in s
break;
}
}
You can also use indexOf() if you want to know the position of the occurrence.
You can look for a presence of a substring within a larger string by using the String.contains method:
String needle = "abc";
String haystack = "xyzxxxxxxxxx";
if (haystack.contains(needle)) {
// react accordingly
}
To expand that to your specific requirement, you can simply loop over all of your substrings and check each of them in turn. (Possibly short-circuiting early depending on what you want to do in that case where a match is found).
public class Main {
public static void main(String[] args) {
String[] strings = new String[]{"abc", "dfg"};
String ss = "abcd";
for(String s : strings) {
System.out.println(ss + " contains " + s + ": " + ss.contains(s));
}
}
}
This is an example that would do what you are trying to achieve:
public static void main(String[] args) {
List<String> list = Arrays.asList("abc","bcd", "xyz");
String search = "xyzxxxxxxxxx";
for (String s : list) {
if (search.contains(s)) {
System.out.println("Found " + s + " in " + search);
}
}
}
You could use something like:
String source = "xyzxxxxxxxxx";
List<String> strings = ...;
for (int i = 0; i < strings.size(); i++)
{
if (source.contains(strings.get(i)))
{
System.out.println("Match found at " + (i + 1));
break;
}
}
Loop through your list and use String.IndexOf in C# or String.indexOf in Java.
You could iterate over the list and call string.contains(listElement) each time.
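On Java 8 or later, the loop-based answers above can also be written with the Stream API, where Stream.anyMatch plays roughly the role of C#'s Any. The sketch below is one way to do it; the class name AnyMatchDemo and the methods containsAny and firstMatch are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class AnyMatchDemo {
    // True if at least one candidate occurs somewhere in the text.
    // anyMatch stops at the first hit, like a loop with break.
    static boolean containsAny(String text, List<String> candidates) {
        return candidates.stream().anyMatch(text::contains);
    }

    // The first candidate (in list order) that occurs in the text, if any.
    static Optional<String> firstMatch(String text, List<String> candidates) {
        return candidates.stream().filter(text::contains).findFirst();
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("abc", "bcd", "xyz");
        System.out.println(containsAny("xyzxxxxxxxxx", candidates));      // true
        System.out.println(firstMatch("xyzxxxxxxxxx", candidates).get()); // xyz
    }
}
```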
