transform short word to original word - java

I used a word-counting algorithm and, on closer inspection, noticed that it reported fewer words than the text actually contains, because it counts a contraction such as "it's" as one word. I tried to find a solution without success, so I wondered whether anything exists to transform a "short word" like "it's" into its "base words", in this case "it is".

Well, basically you need to provide a data structure that maps abbreviated terms to their corresponding long versions. However, this is not as simple as it sounds: for example, you won't want to transform "The client's car." into "The client is car."
To manage these cases, you will probably need a heuristic that has a deeper understanding of the language you are processing and the grammar rules it incorporates.
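The mapping idea above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the class name ContractionExpander and the entries in the table are hypothetical and hand-picked, not taken from any library. It lists whole contractions instead of rewriting every "'s", so a possessive such as "client's" is left untouched, but it cannot tell "it's" = "it is" from "it's" = "it has".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContractionExpander {
    // Hypothetical, hand-picked table; a real one would be much larger.
    private static final Map<String, String> CONTRACTIONS = new LinkedHashMap<>();
    static {
        CONTRACTIONS.put("it's", "it is");
        CONTRACTIONS.put("don't", "do not");
        CONTRACTIONS.put("can't", "cannot");
        CONTRACTIONS.put("I'm", "I am");
    }

    static String expand(String text) {
        String result = text;
        for (Map.Entry<String, String> e : CONTRACTIONS.entrySet()) {
            // \b keeps the match on whole words; Pattern.quote treats the key literally.
            String regex = "\\b" + Pattern.quote(e.getKey()) + "\\b";
            result = result.replaceAll(regex, Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The possessive "client's" is not in the table, so it stays as it is.
        System.out.println(expand("it's the client's car and I don't know it"));
    }
}
```

The match is case-sensitive, so "It's" at the start of a sentence would need its own entry.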

I just built this from scratch for the challenge. It seems to be working on my end. Let me know how it works for you.
public static void main(String[] args) {
    String s = "it's such a lovely day! it's really amazing!";
    System.out.println(convertText(s));
    // output: it is such a lovely day! it is really amazing!
}

public static String convertText(String text) {
    // Start from the full text so input without contractions is returned unchanged.
    String replaced = text;
    // String.split takes a regex String, not a char.
    String[] words = text.split(" ");
    for (String word : words) {
        if (word.contains("'s")) {
            // Keep the part before the apostrophe and append " is".
            String noContraction = word.substring(0, word.indexOf('\'')) + " is";
            // Replace in the accumulated result, not in the original text.
            replaced = replaced.replace(word, noContraction);
        }
    }
    return replaced;
}
I did this in C# and tried to convert it into Java. If you see any syntax errors, please point them out.

Related

How can I replace / edit only a specific word in Java?

I want a specific word to be replaced/edited. Unfortunately, other words that contain the word to be replaced are changed too.
Example:
String test = "I am a ool tool";
Now if I want to replace the word "ool" with something, "tool" is going to be changed as well. How can I solve this? I want ONLY "ool" to be edited; "tool" should stay as it is.
Here is some code:
public class StringMethoden {
    public static void main(String[] args) {
        String bsp = "I am a ool tool";
        if (bsp.matches("(.*)ool(.*)")) {
            bsp = bsp.replaceAll("ool", "test");
            System.out.println(bsp);
        } else {
            System.out.println("sentence does not contain 'ool' !");
        }
    }
}
Output: I am a test ttest
Word boundaries (\b) in Java RegEx make sure a certain point in the string is the start/end of a word.
bsp = bsp.replaceAll("\\bool\\b", "test");
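As a minimal sketch of the word-boundary approach, here it is wrapped in a reusable helper. The class and method names (WholeWordReplace, replaceWord) are made up for illustration. Pattern.quote and Matcher.quoteReplacement are added on the assumption that the word and its replacement should be treated as literal text, not as regex syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeWordReplace {
    // Replaces only whole-word occurrences of `word`; `word` is treated literally.
    static String replaceWord(String text, String word, String replacement) {
        String regex = "\\b" + Pattern.quote(word) + "\\b";
        return text.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "tool" is left alone because there is no word boundary before its "ool".
        System.out.println(replaceWord("I am a ool tool", "ool", "test")); // I am a test tool
    }
}
```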
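Here is a similar approach to the issue: we can also use the .contains() method in the following manner. Note that it only matches "ool" when a space precedes it, so it misses "ool" at the very start of the string and would still change a longer word that begins with "ool". Feel free to ask questions, if any.
String bsp = "I am a ool tool";
if (bsp.contains(" ool")) {
    bsp = bsp.replaceAll(" ool", " test");
    System.out.println(bsp);
} else {
    System.out.println("sentence does not contain 'ool' !");
}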
Output:
I am a test tool

Spell-Check: Find one-to-one token difference mapping between two strings

I recently stumbled over this question on an internet archive and am having some difficulty wrapping my head around it. I want to find the desired mapping between the differing tokens of two strings. The output should be a String-to-String map.
For example:
String1: hewlottpackardenterprise helped american raleways in N Y
String2: hewlett packard enterprise helped american railways in NY
Output:
hewlottpackardenterprise -> hewlett packard enterprise
hewlott -> hewlett
raleways -> railways
N Y -> NY
Note: I have been able to write an edit-distance method, which finds all types of edits (segregated by types, like deletion, substitution etc.) and can convert the first string to second by a convert method
What have I tried so far?
Approach 1: I began with a naive approach of splitting both strings by space, inserting the tokens of the first string into a hash map, and comparing the tokens of the other string against this hash map. However, this approach quickly fails because it misses relevant mappings.
Approach 2: I use my convert method to find the edit positions in the string and the types of edits. Using space edits, I'm able to create a mapping from hewlottpackardenterprise -> hewlett packardenterprise. However, the method just explodes as more and more things need to be split within the same word.
Appreciate any thoughts in this regard! Will clear any doubts in the comments.
public String returnWhiteSpaceEdittoken(EditDone e, List<String> testTokens) {
    int pos = e.pos, count = 0, i = 0;
    String resultToken = null;
    if (e.type.equals(DeleteEdit)) {
        for (i = 0; i < testTokens.size(); i++) {
            count += testTokens.get(i).length();
            if (count == pos) {
                break;
            }
            if (i != testTokens.size() - 1) {
                count++;
            }
        }
        resultToken = testTokens.get(i) + " " + testTokens.get(i + 1);
    } else if (e.type.equals(InsertEdit)) {
        for (i = 0; i < testTokens.size(); i++) {
            count += testTokens.get(i).length();
            if (count > pos) {
                break;
            }
            if (i != testTokens.size() - 1) {
                count++;
            }
        }
        String token = testTokens.get(i);
        resultToken = token.substring(count - token.length(), pos) + token.substring(pos, count);
    }
    return resultToken;
}
A pretty common way of handling problems like this is to find the longest common subsequence (or its dual, the shortest edit script) between the two strings and then post-process the output to get the specific format you want; in your case, the string map.
Wikipedia has a pretty decent introduction to the problem here: https://en.wikipedia.org/wiki/Longest_common_subsequence_problem
and a great paper "An O(ND) Difference Algorithm and Its Variations" by Myers can be found here. http://www.xmailserver.org/diff2.pdf
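One possible reading of this suggestion is sketched below. It rests on an assumption not stated in the answer: the LCS is computed over whitespace-separated tokens, not over characters, using the textbook dynamic-programming table and not Myers' O(ND) algorithm. Tokens found in both strings act as anchors, and each run of non-matching tokens between two anchors becomes one map entry. The names TokenDiff, diff and flush are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TokenDiff {
    // Maps each run of differing tokens in `a` to the run of tokens in `b`
    // that lies between the same pair of matching (LCS) tokens.
    static Map<String, String> diff(String a, String b) {
        String[] x = a.split(" ");
        String[] y = b.split(" ");
        // lcs[i][j] = length of the LCS of x[i..] and y[j..]
        int[][] lcs = new int[x.length + 1][y.length + 1];
        for (int i = x.length - 1; i >= 0; i--) {
            for (int j = y.length - 1; j >= 0; j--) {
                lcs[i][j] = x[i].equals(y[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        List<String> left = new ArrayList<>();
        List<String> right = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.length || j < y.length) {
            if (i < x.length && j < y.length && x[i].equals(y[j])) {
                flush(left, right, mapping); // an anchor closes the current gap
                i++;
                j++;
            } else if (j == y.length || (i < x.length && lcs[i + 1][j] >= lcs[i][j + 1])) {
                left.add(x[i++]);  // token only in the first string
            } else {
                right.add(y[j++]); // token only in the second string
            }
        }
        flush(left, right, mapping);
        return mapping;
    }

    private static void flush(List<String> left, List<String> right, Map<String, String> mapping) {
        if (!left.isEmpty() || !right.isEmpty()) {
            mapping.put(String.join(" ", left), String.join(" ", right));
        }
        left.clear();
        right.clear();
    }
}
```

For the two example strings this yields hewlottpackardenterprise -> hewlett packard enterprise, raleways -> railways and N Y -> NY. It does not produce the finer-grained hewlott -> hewlett entry; that would need a second, character-level pass inside each gap.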

how to end string in java when a double quote comes

I'm currently developing an online multilingual dictionary in JSP. For translation I'm using microsoft-translator-java-api, and for finding meanings I'm using the dictionary service at services.aonaware.com/DictService/DictService.
First I make a request to the services.aonaware.com/DictService/DictService dictionary service, and after parsing I get this output:
WORD
know v 1: be cognizant or aware of a fact or a specific piece of information; possess knowledge or information about; "I know that the President lied to the people"; "I want to know who is winning the game!"; "I know it's time" ...
Now I want the definition
be cognizant or aware of a fact or a specific piece of information; possess knowledge or information about
to be translated, while the example
"I know that the President lied to the people"
should stay the same. So I want to split the string whenever a double quote (") occurs. Any help?
public static void main(String args[])
{
String a = "a; \"b\". c";
System.out.println("Original string:"+a);
// split by "
System.out.println("Split by \"");
for (String string : a.split("\""))
{
System.out.println(string.replaceAll("[.;]", ""));
}
}
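As an alternative sketch of the same idea, the definition and the quoted examples can be pulled apart in one pass. It assumes that the definition is everything before the first double quote and that the examples are the quoted runs after it. The class and method names (DefinitionSplitter, definition, examples) are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DefinitionSplitter {
    // Text before the first double quote, with trailing semicolons and spaces trimmed.
    static String definition(String entry) {
        int quote = entry.indexOf('"');
        String head = quote < 0 ? entry : entry.substring(0, quote);
        return head.replaceAll("[;\\s]+$", "");
    }

    // Every run of text enclosed in double quotes, without the quotes.
    static List<String> examples(String entry) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(entry);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }
}
```

With this split, only the result of definition(...) would be sent to the translator, and the examples could be appended unchanged afterwards.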

How come my split("\n") isn't working

I've tried "\n\n" and "\r" and everything else, including replaceAll("\r\n", "n") and I still do not understand why it doesn't work. I've also tried "\w", "\n", "\n+" - I've basically tried everything under "My split("\n") doesn't work" on Google search.
I'm trying to split a word that contains a lot of "\n". I basically have two different classes. One generates this word, and the other obtains it through an instance of the first class and passes it to the split("\n") method. But whatever I do, the array still stays empty.
I've also tried word.split(System.getProperty("line.separator")) even though I didn't have a clue as to what it meant, but it also came up under one of the solutions to this problem.
Here's my Code:
//in Class A
public String getWord()
{
word = word +"\n" + horizontal;
return word;
}
//in Class B
classA a = new classA();
String grid = a.getWord();
String [] lines = grid.split("\n");
EDIT: Sorry, typo mistake, I'll just ask again later. I did actually put grid.split("\n") in my code. What now? The array really is empty. I did System.out.println(array.length) and it was 0. Also, I typed System.out.println("array is " + array) and it only gave me "array is" as output. I know I'm making a stupid mistake somewhere, and I know I can't expect people to answer my question if I don't know what info to provide.
I also wanted to add some stuff in the comments section here for the comfort of those sitting in front of their laptops...
word and horizontal are strings. Together they actually form a crossword puzzle.
See? Look!
LONDONPYVRAOMNDDEFSG
GCPZVBATHYXAZXEZIMOZ
NKDGBERLINCHPLTMHMSM
ZMUKPGCHRKDTYGIMRLHO
TVRWBXPRETORIAJBVKWT
OGIVSDFULULHQHAHEJNV
PNWEJHBAKBJZNBPARIS
PHKCZCYGTXEEXDUCPMXF
QIMQMABRASILIALJOFJQ
GXNXKTAHIQMMIFPSYDLI
CAIROYKZYSWEFPUZPKRG
BTNAUNIDQAYVYAPGWWIN
QXZMQSZBTCBEIJINGBSD
QWQRYTBPTKRBCJUOMJTV
SODHAMSTERDAMEMSLVAM
YQHEVNXQQJXCDZKEYQVT
NAIROBISVDNTCFJNYDEG
AKXVOIGYTZTJHGIAFIKZ
BAGHDADSADJTWOOMVRYT
YCPOBXQQMQKBTDMYPYWT
They're city names. At the end of this, I'm supposed to show the solution to the puzzle by changing cases. I know how to do this, but the problem is that I can't separate them into lines anymore. I don't know why. That's my only problem here. It seems to work for everyone except me.
Answers with clues will be appreciated? To delve into a dark and deep mystery...
It should be
grid.split("\n");
not
instance.split("\n")
Call grid.split("\n");
You can't split a class.
Better a.getWord().split("\n");
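A small sketch for checking what split actually returns may help here. It assumes Java 8 or later, where the regex \R matches \n, \r\n and \r alike. LineSplitDemo and lines are hypothetical names.

```java
class LineSplitDemo {
    // Splits on any line break; \R (Java 8+) matches \n, \r\n and \r.
    static String[] lines(String grid) {
        return grid.split("\\R");
    }

    public static void main(String[] args) {
        System.out.println(lines("LONDON\nPARIS\r\nCAIRO").length); // 3
        // split drops trailing empty strings, so a string made up only of
        // line breaks yields an empty array.
        System.out.println(lines("\n\n").length); // 0
    }
}
```

An array of length 0 therefore suggests that the string passed to split contained nothing but line breaks, which points at the code that builds the string, not at split itself.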
In your code there is no method named split, and you didn't call your getWord method inside System.out.println().
First class:
public class A {
    public String returnedWord = "";

    public String getWord(String word, String horizontal) {
        returnedWord = word + "\n" + horizontal;
        return returnedWord;
    }
}
The second class:
public class B {
    // static, so that the static main method can access them
    public static String word = "Hello";
    public static String horizontal = "World";

    public static void main(String[] args) {
        A a = new A();
        System.out.println(a.getWord(word, horizontal));
    }
}
you will get the output below :
Hello
World

Replacing words in a String java

Say I have a string,
String templatePhrase = "I have a string that needs changing";
I also have a method to replace words in any given String. Here is the method:
public String replace(String templatePhrase, String token, String wordToPut) {
return templatePhrase.replace(token, wordToPut);
}
Now say (for the sake of my actual task) I have all the words in my String str in a List named wordsInHashtags. I want to loop through all the words in wordsInHashtags and replace them with words from another List named replacement using the replace() method. Each time the loop iterates, the modified String should be saved so it will hold its replacement(s) for the next loop.
I will post my code if anyone would like to see it, but I think it would confuse more than help, and all I am interested in is a way to save the modified String for use in the next iteration of the loop.
I was just reading about strings in Beginning Java 2 the other day: String objects are immutable, so they basically can't be changed. As I understand it, StringBuffer objects were created to deal with such a circumstance. You could try:
StringBuffer templatePhrase = new StringBuffer("I have a string to be changed");
int start = templatePhrase.indexOf(token);
if (start >= 0) {
    templatePhrase.replace(start, start + token.length(), wordToPut);
}
String replacedString = templatePhrase.toString();
The last line uses toString() because a StringBuffer cannot be cast to a String.
public class Rephrase {
    public static void main(String[] args) {
        /*
         * Here is some code that might help to change a word in a string. Originally
         * this is a question from Absolute Java, 5th edition. The two variables below
         * can be set to whatever you want (from the keyboard or any other input
         * source); the algorithm never changes.
         */
        String sentence = "I hate you";
        String replaceWord = " hate";
        String replacementWord = "love";
        int hateIndex = sentence.indexOf(replaceWord);
        String fixed = sentence.substring(0, hateIndex) + " " + replacementWord + sentence.substring(hateIndex + replaceWord.length());
        System.out.println(fixed);
    }
}
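For the loop described in the question, a minimal sketch is to keep one String variable and reassign it on every iteration. Because replace() returns a new String, the reassignment is what carries each change into the next pass. The class name TemplateFiller and the parameter names are hypothetical, and the two lists are assumed to be aligned by index.

```java
import java.util.List;

class TemplateFiller {
    // Replaces tokens.get(i) with replacements.get(i), one pair per iteration.
    static String replaceAll(String templatePhrase, List<String> tokens, List<String> replacements) {
        String result = templatePhrase;
        for (int i = 0; i < tokens.size() && i < replacements.size(); i++) {
            // Strings are immutable: replace() returns a new String, so reassign it.
            result = result.replace(tokens.get(i), replacements.get(i));
        }
        return result;
    }
}
```

Calling it with the tokens "string" and "changing" and the replacements "sentence" and "editing" on the phrase from the question gives "I have a sentence that needs editing".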
