Auto completion function Java - java

I am working on a Displaykeyboard for disabled peaople and i am thinking about to adding a auto word completion function.
I found a example from oracle that works as i need it. Its the Another Example: TextAreaDemo. The problem is i dont really understand the search algorithm and the problem is when i add some word to the arraylist the search algorithm stops working properly.
String prefix = content.substring(w + 1).toLowerCase();
int n = Collections.binarySearch(words, prefix);
if (n < 0 && -n <= words.size()) {
String match = words.get(-n - 1);
if (match.startsWith(prefix)) {
// A completion is found
String completion = match.substring(pos - w);
// We cannot modify Document from within notification,
// so we submit a task that does the change later
SwingUtilities.invokeLater(
new CompletionTask(completion, pos + 1));
}
} else {
// Nothing found
mode = Mode.INSERT;
}
Is there a way to modify the example so it will work with any words?

Make sure you're not just adding the word to the end of the list and then using binarySearch(). It's documentation says the following
The list must be sorted into ascending order according to the natural
ordering of its elements (as by the sort(List) method) prior to making
this call.
Read more about it here: https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#binarySearch(java.util.List,%20T)

Related

Get parameters of nested functions

I am trying to implement a parser in Java to extract the arguments of some functions.
When I have a function like:
max(1, 2, 3)
I just simply can use a Regular Expresion to extract the args.
But all my functions are not like that. If I have some nested function, eg:
max(sum(1, max(1,2,sum(2,5)), 3, 5, mult(3,3))
I would like to obtain:
sum(1, max(1,2,sum(2,5))
3
5
mult(3,3)
I tried using a Regular Expression, but I asume the language is not regular. Another approach was splliting by ',', but did not work as well.
Is there any method for extracting the arguments of a function? I do not really know how this type of problem can be solved since there is no a pattern to use for extracting the arguments.
Any help or insight would be really appreciated. Thanks!!
Parsing a source code into an some abstract model is quite complex topic, depending on the language complexity.
But first step is usually tokenization, where you read one character at a time and detect full tokens (like variable names, function names, operators, literals etc).
Since you presented only very limited scope for the problem , you have very small set of tokens:
name of a function
( and ) to indicate method call
, to separate arguments
numbers
Reading one symbol at the time, you should be able to very easily detect when one token ends and the next one begins. Also your tokens are very distinct (i.e. you don't have to differentiate function name from variable name), you can very easily categorize them.
Once you have a token, you know the grammar (you have only function calls), you can easily build a syntax tree (where at the root you have top level function call with its arguments being children nodes).
From that structure you can easily fetch whichever parts you wish.
If you are more interested in how it works in the javac compiler, you can always check out its source code (it's open source after all):
https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java
https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavaTokenizer.java
However, that may be quite a long read.
Finally found a method that works:
public List<String> parseArgs(String l){
int startIdx = l.indexOf("(") + 1;
int endIdx = l.lastIndexOf(")") - 1;
int count = 0;
int argIdx = startIdx;
List<String> args = new ArrayList<>();
for (int i = startIdx; i < endIdx; i++) {
if (l.charAt(i) == '(')
count -= 1;
else if (l.charAt(i) == ')'){
count += 1;
}
else if (l.charAt(i) == ',' && count == 0){
args.add(l.substring(argIdx, i).trim());
argIdx = i + 1;
}
}
args.add(l.substring(argIdx, endIdx + 1).trim());
return args;
}
String l = "max(sum(1, max(1,2,sum(2,5))), 3, 5, mult(3,3))";
parseArgs(l).forEach(System.out::println);
//Prints
sum(1, max(1,2,sum(2,5)))
3
5
mult(3,3)

Dynamic addition & use of variable outside for loop- Selenium

Below is the scenario i am trying to automate:
Put all numerical values of the links in a Selenium Weblist & perform an addition and later to verify if the sum of count matches a fixed number.
The issue is that the numerical links returns a number engulfed in braces example:(20)(35)(16)(15)
I need to first trim these brackets & fetch only the numbers & then perform the addition i.e: 20+35+16+15
Later i need to assert the total against the number i.e: Assert.assertequals(sum,'86')
List<WebElement> lists=driver.findElements(By.cssSelector("span.ndocs"));
for (int i=0; i<lists.size(); ){
String trimmed_value=lists.get(i).getText();
trimmed_value=lists.get(i).getText().trim().substring(trimmed_value.indexOf("(") + 1);
trimmed_value=lists.get(i).getText().trim().substring(0, trimmed_value.indexOf(")"));
System.out.println(trimmed_value);
int numerical_value = Integer.parseInt(trimmed_value);
i++;
}
Till now i am able to get the elements, iterate them & able to remove the braces & get the numbers, I am stuck upon how to perform the addition operation & then do an Assert for the count.
Any help will be much appreciated here.
Try using below code.
Initialize a variable outside the method and add every trimmed_value to it as explained below.
import assertEquals(import static org.junit.Assert.assertEquals;)
int expected_value=86;
int numerical_value=0;
List<WebElement> lists = driver.findElements(By.cssSelector("span.ndocs"));
for (int i = 0; i < lists.size(); ) {
String trimmed_value = lists.get(i).getText();
trimmed_value = lists.get(i).getText().trim().substring(trimmed_value.indexOf("(") + 1);
trimmed_value = lists.get(i).getText().trim().substring(0, trimmed_value.indexOf(")"));
System.out.println(trimmed_value);
numerical_value =numerical_value+Integer.parseInt(trimmed_value);
i++;
}
assertEquals(expected_value, numerical_value);

Ektorp CouchDb: Query for pattern with multiple contains

I want to query multiple candidates for a search string which could look like "My sear foo".
Now I want to look for documents which have a field that contains one (or more) of the entered strings (seen as splitted by whitespaces).
I found some code which allows me to do a search by pattern:
#View(name = "find_by_serial_pattern", map = "function(doc) { var i; if(doc.serialNumber) { for(i=0; i < doc.serialNumber.length; i+=1) { emit(doc.serialNumber.slice(i), doc);}}}")
public List<DeviceEntityCouch> findBySerialPattern(String serialNumber) {
String trim = serialNumber.trim();
if (StringUtils.isEmpty(trim)) {
return new ArrayList<>();
}
ViewQuery viewQuery = createQuery("find_by_serial_pattern").startKey(trim).endKey(trim + "\u9999");
return db.queryView(viewQuery, DeviceEntityCouch.class);
}
which works quite nice for looking just for one pattern. But how do I have to modify my code to get a multiple contains on doc.serialNumber?
EDIT:
This is the current workaround, but there must be a better way i guess.
Also there is only an OR logic. So an entry fits term1 or term2 to be in the list.
#View(name = "find_by_serial_pattern", map = "function(doc) { var i; if(doc.serialNumber) { for(i=0; i < doc.serialNumber.length; i+=1) { emit(doc.serialNumber.slice(i), doc);}}}")
public List<DeviceEntityCouch> findBySerialPattern(String serialNumber) {
String trim = serialNumber.trim();
if (StringUtils.isEmpty(trim)) {
return new ArrayList<>();
}
String[] split = trim.split(" ");
List<DeviceEntityCouch> list = new ArrayList<>();
for (String s : split) {
ViewQuery viewQuery = createQuery("find_by_serial_pattern").startKey(s).endKey(s + "\u9999");
list.addAll(db.queryView(viewQuery, DeviceEntityCouch.class));
}
return list;
}
Looks like you are implementing a full text search here. That's not going to be very efficient in CouchDB (I guess same applies to other databases).
Correct me if I am wrong but from looking at your code looks like you are trying to search a list of serial numbers for a pattern. CouchDB (or any other database) is quite efficient if you can somehow index the data you will be searching for.
Otherwise you must fetch every single record and perform a string comparison on it.
The only way I can think of to optimize this in CouchDB would be the something like the following (with assumptions):
Your serial numbers are not very long (say 20 chars?)
You force the search to be always 5 characters
Generate view that emits every single 5 char long substring from your serial number - more or less this (could be optimized and not sure if I got the in):
...
for (var i = 0; doc.serialNo.length > 5 && i < doc.serialNo.length - 5; i++) {
emit([doc.serialNo.substring(i, i + 5), doc._id]);
}
...
Use _count reduce function
Now the following url:
http://localhost:5984/test/_design/serial/_view/complex-key?startkey=["01234"]&endkey=["01234",{}]&group=true
Will return a list of documents with a hit count for a key of 01234.
If you don't group and set the reduce option to be false, you will get a list of all matches, including duplicates if a single doc has multiple hits.
Refer to http://ryankirkman.com/2011/03/30/advanced-filtering-with-couchdb-views.html for the information about complex keys lookups.
I am not sure how efficient couchdb is in terms of updating that view. It depends on how many records you will have and how many new entries appear between view is being queried (I understand couchdb rebuilds the view's b-tree on demand).
I have generated a view like that that splits doc ids into 5 char long keys. Out of over 1K docs it generated over 30K results - id being 32 char long, simple maths really: (serialNo.length - searchablekey.length + 1) * docscount).
Generating the view took a while but the lookups where fast.
You could generate keys of multiple lengths, etc. All comes down to your records count vs speed of lookups.

A more efficient way of finding English words that are one letter off from eachother

I wrote a little program that tries to find a connection between two equal length English words. Word A will transform into Word B by changing one letter at a time, each newly created word has to be an English word.
For example:
Word A = BANG
Word B = DUST
Result:
BANG -> BUNG ->BUNT -> DUNT -> DUST
My process:
Load an English wordlist(consist of 109582 words) into a Map<Integer, List<String>> _wordMap = new HashMap();, key will be the word length.
User put in 2 words.
createGraph creates a graph.
calculate the shortest path between those 2 nodes
prints out the result.
Everything works perfectly fine, but I am not satisfied with the time it took in step 3.
See:
Completely loaded 109582 words!
CreateMap took: 30 milsecs
CreateGraph took: 17417 milsecs
(HOISE : HORSE)
(HOISE : POISE)
(POISE : PRISE)
(ARISE : PRISE)
(ANISE : ARISE)
(ANILE : ANISE)
(ANILE : ANKLE)
The wholething took: 17866 milsecs
I am not satisfied with the time it takes create the graph in step 3, here's my code for it(I am using JgraphT for the graph):
private List<String> _wordList = new ArrayList(); // list of all 109582 English words
private Map<Integer, List<String>> _wordMap = new HashMap(); // Map grouping all the words by their length()
private UndirectedGraph<String, DefaultEdge> _wordGraph =
new SimpleGraph<String, DefaultEdge>(DefaultEdge.class); // Graph used to calculate the shortest path from one node to the other.
private void createGraph(int wordLength){
long before = System.currentTimeMillis();
List<String> words = _wordMap.get(wordLength);
for(String word:words){
_wordGraph.addVertex(word); // adds a node
for(String wordToTest : _wordList){
if (isSimilar(word, wordToTest)) {
_wordGraph.addVertex(wordToTest); // adds another node
_wordGraph.addEdge(word, wordToTest); // connecting 2 nodes if they are one letter off from eachother
}
}
}
System.out.println("CreateGraph took: " + (System.currentTimeMillis() - before)+ " milsecs");
}
private boolean isSimilar(String wordA, String wordB) {
if(wordA.length() != wordB.length()){
return false;
}
int matchingLetters = 0;
if (wordA.equalsIgnoreCase(wordB)) {
return false;
}
for (int i = 0; i < wordA.length(); i++) {
if (wordA.charAt(i) == wordB.charAt(i)) {
matchingLetters++;
}
}
if (matchingLetters == wordA.length() - 1) {
return true;
}
return false;
}
My question:
How can I improve my algorithm inorder to speed up the process?
For any redditors that are reading this, yes I created this after seeing the thread from /r/askreddit yesterday.
Here's a starting thought:
Create a Map<String, List<String>> (or a Multimap<String, String> if you've using Guava), and for each word, "blank out" one letter at a time, and add the original word to the list for that blanked out word. So you'd end up with:
.ORSE => NORSE, HORSE, GORSE (etc)
H.RSE => HORSE
HO.SE => HORSE, HOUSE (etc)
At that point, given a word, you can very easily find all the words it's similar to - just go through the same process again, but instead of adding to the map, just fetch all the values for each "blanked out" version.
You probably need to run it through a profiler to see where most of the time is taken, especially since you are using library classes - otherwise you might put in a lot of effort but see no significant improvement.
You could lowercase all the words before you start, to avoid the equalsIgnoreCase() on every comparison. In fact, this is an inconsistency in your code - you use equalsIgnoreCase() initially, but then compare chars in a case-sensitive way: if (wordA.charAt(i) == wordB.charAt(i)). It might be worth eliminating the equalsIgnoreCase() check entirely, since this is doing essentially the same thing as the following charAt loop.
You could change the comparison loop so it finishes early when it finds more than one different letter, rather than comparing all the letters and only then checking how many are matching or different.
(Update: this answer is about optimizing your current code. I realize, reading your question again, that you may be asking about alternative algorithms!)
You can have the list of words of same length sorted, and then have a loop nesting of the kind for (int i = 0; i < n; ++i) for (int j = i + 1; j < n; ++j) { }.
And in isSimilar count the differences and on 2 return false.

Javabat substring counting

public boolean catDog(String str)
{
int count = 0;
for (int i = 0; i < str.length(); i++)
{
String sub = str.substring(i, i+1);
if (sub.equals("cat") && sub.equals("dog"))
count++;
}
return count == 0;
}
There's my code for catDog, have been working on it for a while and just cannot find out what's wrong. Help would be much appreciated!*/
EDIT- I want to Return true if the string "cat" and "dog" appear the same number of times in the given string.
One problem is that this will never be true:
if (sub.equals("cat") && sub.equals("dog"))
&& means and. || means or.
However, another problem is that your code looks like your are flailing around randomly trying to get it to work. Everyone does this to some extent in their first programming class, but it's a bad habit. Try to come up with a clear mental picture of how to solve the problem before you write any code, then write the code, then verify that the code actually does what you think it should do and that your initial solution was correct.
EDIT: What I said goes double now that you've clarified what your function is supposed to do. Your approach to solving the problem is not correct, so you need to rethink how to solve the problem, not futz with the implementation.
Here's a critique since I don't believe in giving code for homework. But you have at least tried which is better than most of the clowns posting homework here.
you need two variables, one for storing cat occurrences, one for dog, or a way of telling the difference.
your substring isn't getting enough characters.
a string can never be both cat and dog, you need to check them independently and update the right count.
your return statement should return true if catcount is equal to dogcount, although your version would work if you stored the differences between cats and dogs.
Other than those, I'd be using string searches rather than checking every position but that may be your next assignment. The method you've chosen is perfectly adequate for CS101-type homework.
It should be reasonably easy to get yours working if you address the points I gave above. One thing you may want to try is inserting debugging statements at important places in your code such as:
System.out.println(
"i = " + Integer.toString (i) +
", sub = ["+sub+"]" +
", count = " + Integer.toString(count));
immediately before the closing brace of the for loop. This is invaluable in figuring out what your code is doing wrong.
Here's my ROT13 version if you run into too much trouble and want something to compare it to, but please don't use it without getting yours working first. That doesn't help you in the long run. And, it's almost certain that your educators are tracking StackOverflow to detect plagiarism anyway, so it wouldn't even help you in the short term.
Not that I really care, the more dumb coders in the employment pool, the better it is for me :-)
choyvp obbyrna pngQbt(Fgevat fge) {
vag qvssrerapr = 0;
sbe (vag v = 0; v < fge.yratgu() - 2; v++) {
Fgevat fho = fge.fhofgevat(v, v+3);
vs (fho.rdhnyf("png")) {
qvssrerapr++;
} ryfr {
vs (fho.rdhnyf("qbt")) {
qvssrerapr--;
}
}
}
erghea qvssrerapr == 0;
}
Another thing to note here is that substring in Java's built-in String class is exclusive on the upper bound.
That is, for String str = "abcdefg", str.substring( 0, 2 ) retrieves "ab" rather than "abc." To match 3 characters, you need to get the substring from i to i+3.
My code for do this:
public boolean catDog(String str) {
if ((new StringTokenizer(str, "cat")).countTokens() ==
(new StringTokenizer(str, "dog")).countTokens()) {
return true;
}
return false;
}
Hope this will help you
EDIT: Sorry this code will not work since you can have 2 tokens side by side in your string. Best if you use countMatches from StringUtils Apache commons library.
String sub = str.substring(i, i+1);
The above line is only getting a 2-character substring so instead of getting "cat" you'll get "ca" and it will never match. Fix this by changing 'i+1' to 'i+2'.
Edit: Now that you've clarified your question in the comments: You should have two counter variables, one to count the 'dog's and one to count the 'cat's. Then at the end return true if count_cats == count_dogs.

Categories

Resources