So I was reading about recursive searching in a binary search algorithm, and I saw a line saying that with every computation where you don't find the result, you cut the array you are looking through in half and create a new array. Is it really necessary to make a new array with every computation instead of adjusting the start and end index of the array you started with?
Sure, you can just adjust the start and end indices; that is how it is usually implemented. What you are reading is a simplified description of the algorithm. The implementation can differ, as long as it still does the same work.
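A minimal sketch of what that looks like: the recursion narrows a `[lo, hi]` index range over the one original array, so no copies are ever made.

```java
public class BinarySearchDemo {
    // Recursive binary search that narrows the index range [lo, hi]
    // instead of allocating a new half-array on every call.
    static int search(int[] a, int target, int lo, int hi) {
        if (lo > hi) return -1;            // empty range: target is absent
        int mid = lo + (hi - lo) / 2;      // avoids int overflow of (lo + hi) / 2
        if (a[mid] == target) return mid;
        if (a[mid] < target) return search(a, target, mid + 1, hi);
        return search(a, target, lo, mid - 1);
    }

    public static void main(String[] args) {
        int[] sorted = {1, 3, 5, 7, 9};
        System.out.println(search(sorted, 7, 0, sorted.length - 1)); // prints 3
    }
}
```

The same range trick works iteratively too, which also removes the recursion overhead.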
What would be the time complexity? I just want to avoid this being O(n!). Would using depth first search be time complexity O(n^2), as for each letter it may have to go through all other letters worst case?
I guess I'm not sure if I'm thinking about this the right way.
When I say use depth first search, I mean starting depth first search from first the letter, and then starting from the second letter, etc.
Is that necessary?
Note:
The original problem is to find all possible words in a crossword/boggle board. I'm thinking of using the trie data structure to find if a word is in the dictionary, but am thinking about ways of generating the words themselves.
Following the discussion above, here is my answer:
Definition: a trieX is a sub trie, with words of length X only.
Since we have a trie with all words in the desired language, we can also get the appropriate trieX.
Since we have a trie with all words in the desired language, we can also get the appropriate trieX.
We say that the crossword puzzle has w words, so we create an array of length w where each entry is the root of a trieX, where X is the length of the relevant word. This gives us the list of possible words for each blank word.
Then we iterate over intersections between words and eliminate candidates that cannot be placed. When no more changes occur, we stop.
Two remarks:
1. In order to improve performance, we start by adding either long words, or VERY short ones. What counts as short or long? Have a look at this and this.
2. Elimination of words from the trieX's can also be done by checking dependencies between words (if THIS word is here, then THAT word can't be there, etc.). This is more complicated, so if anyone has ideas on how to do this easily - please share.
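As a building block for the above, here is a minimal trie sketch (restricted to lowercase a-z, an assumption for brevity) supporting both exact lookup and the prefix test you want when generating words by DFS over the board: as soon as the current path is not a prefix of any dictionary word, the whole branch can be abandoned.

```java
public class Trie {
    private final Trie[] children = new Trie[26]; // assumption: lowercase a-z only
    private boolean word;                         // true if a dictionary word ends here

    public void insert(String s) {
        Trie cur = this;
        for (char c : s.toCharArray()) {
            int i = c - 'a';
            if (cur.children[i] == null) cur.children[i] = new Trie();
            cur = cur.children[i];
        }
        cur.word = true;
    }

    // walk the trie along s; null means no word has s as a prefix
    private Trie node(String s) {
        Trie cur = this;
        for (char c : s.toCharArray()) {
            cur = cur.children[c - 'a'];
            if (cur == null) return null;
        }
        return cur;
    }

    public boolean contains(String s)  { Trie n = node(s); return n != null && n.word; }
    public boolean hasPrefix(String s) { return node(s) != null; }

    public static void main(String[] args) {
        Trie t = new Trie();
        t.insert("cat");
        t.insert("car");
        System.out.println(t.contains("cat") + " " + t.hasPrefix("ca") + " " + t.contains("ca"));
        // prints: true true false
    }
}
```

With this pruning, the per-letter DFS over the board only explores paths that are prefixes of real words, which in practice is far below the worst-case bound.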
I have to make a Java program which finds all repeating sub-strings of length n in a given String. The input string is extremely long and a brute-force approach takes too much time.
I already tried:
Presently I am finding each sub-string separately and checking for repetitions of that sub-string using the KMP algorithm. This too is taking too much time.
What is a more efficient approach for this problem?
1) You should look at using a suffix tree data structure.
Suffix Tree
This data structure can be built in O(N log N) time (I think even in O(N) time using Ukkonen's algorithm), where N is the length of the input string. It then allows many otherwise difficult tasks to be solved in O(M) time, where M is the length of the pattern.
So even though I didn't try your particular problem, I am pretty sure that with a suffix tree and a smart formulation of your problem, it can be solved in reasonable time.
2) A very good book on these (and related) subjects is this one:
Algorithms on Strings, Trees and Sequences
It's not really easy to read though unless you're well-trained in algorithms.
But OK, reading such things is the only way to get well-trained ;)
3) I suggest you have a quick look at this algorithm too.
Aho-Corasick Algorithm
Though I am not sure; this one might be somewhat off-topic with respect to your particular problem.
I am going to take @peter.petrov's suggestion and enhance it by explaining how one can actually use a suffix tree to solve the problem:
1. Create a suffix tree from the string, let it be `T`.
2. Find all nodes of depth `n` in the tree, let that set of nodes be `S`. This can be done using DFS, for example.
3. For each node `v` in `S` (renaming the node to avoid clashing with the length `n`), do the following:
3.1. Do a DFS and count the number of terminals `v` leads to. Let this number be `count`.
3.2. If `count > 1`, yield the substring associated with `v` (the path from the root to `v`), together with `count`.
Note that this algorithm considers every substring of length n (each node in S), and determines how many times each actually occurs as a substring by counting the number of terminals it leads to.
This means the complexity of the problem is O(Creation + Traversal): you first create the tree and then traverse it (it is easy to see that steps 2-3 visit each node in the tree at most once). Since the traversal is obviously "faster" than the creation of the tree, this leaves you with O(Creation), which, as @peter.petrov pointed out, is O(|S|) or O(|S| log |S|) depending on your algorithm of choice.
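The steps above can be sketched concretely. Note this sketch builds a naive suffix trie by inserting every suffix, which is O(N^2) rather than Ukkonen's O(N); it is meant to illustrate the depth-n node collection and terminal counting, not to be the production build step. The `$` terminator is the usual trick so that every suffix ends at its own leaf.

```java
import java.util.*;

public class RepeatedSubstrings {
    static class Node {
        Map<Character, Node> children = new TreeMap<>();
    }

    // Naive O(N^2) suffix trie: insert every suffix of s + "$".
    static Node buildSuffixTrie(String s) {
        Node root = new Node();
        String text = s + "$";               // unique terminator: each suffix ends at a leaf
        for (int i = 0; i < text.length(); i++) {
            Node cur = root;
            for (int j = i; j < text.length(); j++) {
                cur = cur.children.computeIfAbsent(text.charAt(j), c -> new Node());
            }
        }
        return root;
    }

    // Count the leaves (suffix terminals) reachable from a node.
    static int countLeaves(Node node) {
        if (node.children.isEmpty()) return 1;
        int total = 0;
        for (Node child : node.children.values()) total += countLeaves(child);
        return total;
    }

    // All substrings of length n occurring more than once, with their counts.
    static Map<String, Integer> repeated(String s, int n) {
        Map<String, Integer> result = new LinkedHashMap<>();
        collect(buildSuffixTrie(s), "", n, result);
        return result;
    }

    static void collect(Node node, String prefix, int n, Map<String, Integer> out) {
        if (prefix.length() == n) {          // node of depth n: step 2 of the answer
            int count = countLeaves(node);   // step 3.1: terminals this node leads to
            if (count > 1) out.put(prefix, count); // step 3.2: repeated, so yield it
            return;
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            if (e.getKey() == '$') continue; // never walk past the terminator
            collect(e.getValue(), prefix + e.getKey(), n, out);
        }
    }

    public static void main(String[] args) {
        System.out.println(repeated("banana", 2)); // prints {an=2, na=2}
    }
}
```

Swapping the naive build for a compressed suffix tree built with Ukkonen's algorithm keeps the traversal logic the same while making the whole thing linear in the input length.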
I'm trying to find a data structure to use in my Java project. What I'm trying to do is get the next greatest value below an arbitrary number from a set of numbers, or be notified if no such number exists.
Example 1)
My Arbitrary number is 7.0.
{3.1, 6.0, 7.13131313, 8.0}
The number I'd need to get from this set would be 6.0.
Example 2)
My arbitrary number is 1.0.
{2.0, 3.5555, 999.0}
A next highest number doesn't exist in the set, so I'd need to know it doesn't exist.
The best I can think of is indexing and comparing through an array, and going back 1 step once I go over my arbitrary number. In worst case scenarios though my time complexity would be O(n). Is there a better way?
If you can pre-process the list of values, then you can sort the list (O(N log N) time) and perform a binary search, which takes O(log N) for each value you want an answer for. Otherwise you can't do better than O(N).
You need to sort the numbers first.
Then you can do a simple binary search with the comparison adapted to your need: at every step, check whether the element is bigger than the input, and search the left or right half accordingly. At the end, the modified binary search can report both the immediate smaller and larger neighbours, which solves your problem easily. Complexity is O(log n).
I suggest that you look at either TreeSet.floor(...) or TreeSet.lower(...). One of those should satisfy your requirements, and both have O(logN) complexity ... assuming that you have already built the TreeSet instance.
If you only have a sorted array and don't want the overhead of building a TreeSet, then a custom binary search is the probably the best bet.
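A short sketch of the TreeSet approach. For "greatest value strictly below x", `lower` is the exact fit: it returns the greatest element strictly less than the argument, or null when no such element exists (your "be notified" case).

```java
import java.util.Arrays;
import java.util.TreeSet;

public class FloorDemo {
    public static void main(String[] args) {
        // Example 1: greatest element strictly below 7.0
        TreeSet<Double> set = new TreeSet<>(Arrays.asList(3.1, 6.0, 7.13131313, 8.0));
        System.out.println(set.lower(7.0));   // prints 6.0

        // Example 2: nothing below 1.0, so lower returns null
        TreeSet<Double> other = new TreeSet<>(Arrays.asList(2.0, 3.5555, 999.0));
        System.out.println(other.lower(1.0)); // prints null
    }
}
```

(`floor` differs only in being inclusive: `floor(7.0)` would return 7.0 if 7.0 itself were in the set.)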
Both your example sets look sorted...
If that is the case, then you need a binary search...
If it's not the case, then you need to visit every element exactly once, so it would take O(n) time.
I am writing minimax as part of a project, but its awfully hard to check that it is working correctly. If I could print a tree of what it does, it would be extremely useful.
Is there an easy way to print a tree of recursive calls, selecting whatever variables are important to the situation?
Keep track of recursion depth by means of a parameter (in minimax, you'd do that anyway). Then print depth * a small number of spaces, followed by the interesting variables in each call to obtain
player=1, move=...
player=2, move=...
player=1, move=...
...
player=2, move=...
You might also want to print the return value of each recursive call.
If you desperately want a pretty picture of a tree, post-process the output of the above and feed it to a tree-drawing package.
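To make the indentation idea concrete, here is a sketch. The game itself is hypothetical (moves just split a score range, an assumption purely for illustration); the point is the `depth`-driven indentation and the printed return values, which together render the recursion tree as plain text.

```java
public class MinimaxTrace {
    // Hypothetical minimax over a score range; each call prints one
    // indented line, so the console output reads as a recursion tree.
    static int minimax(int[] scores, int lo, int hi, int depth, boolean maximizing) {
        String indent = "  ".repeat(depth);              // two spaces per recursion level
        System.out.println(indent + "player=" + (maximizing ? 1 : 2)
                + ", range=[" + lo + "," + hi + "]");
        if (lo == hi) return scores[lo];                 // leaf: static evaluation
        int mid = (lo + hi) / 2;
        int left = minimax(scores, lo, mid, depth + 1, !maximizing);
        int right = minimax(scores, mid + 1, hi, depth + 1, !maximizing);
        int value = maximizing ? Math.max(left, right) : Math.min(left, right);
        System.out.println(indent + "returns " + value); // print each call's return value
        return value;
    }

    public static void main(String[] args) {
        minimax(new int[]{3, 5, 2, 9}, 0, 3, 0, true);
    }
}
```

In your real minimax you would print the move and evaluation instead of the range; the depth parameter you already carry is all the tree-printing machinery you need.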
How can I optimize the following:
final String[] longStringArray = {"1","2","3".....,"9999999"};
String searchingFor = "9999998";
for(String s : longStringArray)
{
if(searchingFor.equals(s))
{
//After 9999998 iterations finally found it
// Do the rest of stuff here (not relevant to the string/array)
}
}
NOTE: The longStringArray is only searched once per runtime & is not sorted & is different every other time I run the program.
I'm sure there is a way to improve the worst-case performance here, but I can't seem to find it...
P.S. I would also appreciate a solution for the case where searchingFor does not exist in longStringArray.
Thank you.
Well, if you have to use an array, and you don't know if it's sorted, and you're only going to do one lookup, it's always going to be an O(N) operation. There's nothing you can do about that, because any optimization step would be at least O(N) to start with - e.g. populating a set or sorting the array.
Other options though:
If the array is sorted, you could perform a binary search. This will turn each lookup into an O(log N) operation.
If you're going to do more than one search, consider using a HashSet<String>. This will turn each lookup into an O(1) operation (assuming few collisions).
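To make the trade-off concrete, here is a small sketch (the array is a shortened sample, an assumption for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LookupDemo {
    // One-off lookup on an unsorted array: a linear scan is unavoidable.
    static boolean containsOnce(String[] arr, String target) {
        return Arrays.asList(arr).contains(target);   // O(N)
    }

    // Repeated lookups: pay O(N) once to build the set, then O(1) per query.
    static Set<String> asSet(String[] arr) {
        return new HashSet<>(Arrays.asList(arr));
    }

    public static void main(String[] args) {
        String[] longStringArray = {"1", "2", "3", "9999998", "9999999"}; // shortened sample
        System.out.println(containsOnce(longStringArray, "9999998")); // prints true
        Set<String> set = asSet(longStringArray);
        System.out.println(set.contains("0")); // prints false: an absent value is simply not found
    }
}
```

Note that both approaches handle the "not in the array" case from the P.S. for free: `contains` just returns false.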
import org.apache.commons.lang.ArrayUtils;
ArrayUtils.indexOf(array, string);
ArrayUtils documentation
You can create a second array with the hash codes of the string and binary search on that.
You will have to sort the hash array and move the elements of the original array accordingly. This way you end up with extremely fast searching, but the array has to be kept ordered, so inserting new elements takes resources.
The best option would be implementing a binary tree or a B-tree; if you really have that much data and you have to handle inserts, it's worth it.
Arrays.asList(longStringArray).contains(searchingFor)