I could only think of approaches in which you would need to check each digit of the input value at least once to search for the digit.
As I understand it, converting the value to a string and using built-in functions like ".contains()" takes even longer.
Is there some algorithm or optimization I can use to know whether a particular digit is contained in the integer value in less than linear time, relative to the number of digits in the value?
The input value may have as many as a hundred thousand digits.
Update: I have to work on 15 separate queries, each containing one large, randomly generated integer of fewer than 10^6 digits. For each, I have to determine whether the value contains a particular digit (and keep a count up to 2 occurrences; anything more can be ignored).
I wish to know if there is a way to perform this act of "digit-searching" in the integer value in less than linear time.
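For the record, any digit of the input could be the one you are searching for, so in the worst case every digit must be examined: a single left-to-right scan is already optimal. What you can do is stop early once the count you care about is reached. A minimal sketch, assuming the value arrives as a decimal string:

```java
public class DigitSearch {
    // Count occurrences of `digit` in the decimal string `value`,
    // stopping as soon as `cap` occurrences have been seen.
    static int countUpTo(String value, char digit, int cap) {
        int count = 0;
        for (int i = 0; i < value.length(); i++) {
            if (value.charAt(i) == digit && ++count == cap) {
                break;  // early exit: further occurrences are ignored
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUpTo("9471747", '7', 2));  // prints 2
    }
}
```

The early exit helps on average for random input (a given digit tends to appear early in a long random number), but the worst case remains linear.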
Suppose I have n strings and I want to map each string to an integer in the range 0 to n-1, using a function such that whenever I call it with the string and n, it gives me the same, unique mapping on the fly. So if I have 4 strings "str1", "str2", "str3", "str4", then the mapping will be from 0-3 and unique.
I tried something like str.hashCode() % n; this gives me the same mapping each time, but it is not within the range 0 to n-1. I found something similar in PHP here:
https://madcoda.com/2014/04/how-to-hash-a-string-to-integer-with-a-range-php/
For the record
In Java, hash random strings down to an integer:
Math.abs(str.hashCode() % 7)
Result will be 0 inclusive to 6 inclusive.
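The `Math.abs(...)` wrapper is needed because `str.hashCode()` can be negative and Java's `%` keeps the sign of the dividend. `Math.floorMod` (Java 8+) expresses the same 0..n-1 mapping more directly. A minimal sketch, with `n` as the bucket count:

```java
public class StringBucket {
    // Map a string to a stable bucket in [0, n).
    static int bucket(String s, int n) {
        // floorMod is always in 0..n-1, even when hashCode() is negative
        return Math.floorMod(s.hashCode(), n);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"str1", "str2", "str3", "str4"}) {
            System.out.println(s + " -> " + bucket(s, 4));
        }
    }
}
```

Note that this mapping is stable but not guaranteed collision-free: two of the four strings may well land in the same bucket.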
Note:
If the input strings are really random and of the same length (for example, if the input is a whole lot of UUIDs), then the output here will be randomly balanced.
If the input is, say, many human names, it is very unlikely the output will be randomly balanced.
Note:
What the OP was asking literally in the headline is answered here.
In fact, what the OP was asking about in the body has no connection at all to hashing. (That would just be a lookup table, a regex, or similar.)
I am looking for a mechanism by which I can represent a set of strings as unique numbers, so that when I want to sort them I can sort by the numbers instead.
For eg, this is what I have in mind
I am keeping a fixed-length number of 20 digits
Each letter is represented by its 1-based position in the alphabet (a = 01, b = 02, ...)
cat - (03)(01)(20)(00)(00)(00)(00)(00)(00)(00) - 03012000000000000000
cataract - (03)(01)(20)(01)(18)(01)(03)(20)(00)(00) - 03012001180103200000
capital - (03)(01)(16)(09)(20)(01)(12)(00)(00)(00) - 03011609200112000000
So if I sort it based on the numbers, it will sort and say
capital, cat, cataract
Is this a good way of doing this?
Is there any other way for doing this so that I have more accuracy?
Thanks,
Sen
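The fixed-width scheme described above can be sketched as follows. It assumes lowercase a-z only and words of at most 10 characters (two digits per character, 20 digits total); the keys preserve lexicographic order only because every position is zero-padded to the same width:

```java
import java.util.Arrays;

public class SortKey {
    // Encode a lowercase word as a fixed-width 20-digit string:
    // two digits per letter (a=01 .. z=26), padded on the right with "00".
    static String key(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            sb.append(String.format("%02d", c - 'a' + 1));
        }
        while (sb.length() < 20) sb.append("00");
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = {"cat", "cataract", "capital"};
        Arrays.sort(words, (a, b) -> key(a).compareTo(key(b)));
        System.out.println(Arrays.toString(words));  // [capital, cat, cataract]
    }
}
```

Note that for a fixed lowercase alphabet this reproduces exactly what plain String.compareTo already does, so the numeric detour buys no extra accuracy.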
If your string length is fixed and your character set is fixed at, say, 100 different characters, you could treat each character of the string as a digit of a base-100 number and turn the string into a double.
If your set of strings is much smaller than the set of possible strings, you could hash them and, for collisions, define the sort order arbitrarily but consistently.
In a specific case I probably wouldn't recommend either of those, but as a SUPER GENERAL solution to what you stated, it works. And if you ask what seems to be a theoretical question, a theoretical answer seems appropriate.
I have this problem:
A positive integer is called a palindrome if its representation in the decimal system is the same when read from left to right and from right to left. For a given positive integer K of not more than 1000000 digits, write the value of the smallest palindrome larger than K to output. Numbers are always displayed without leading zeros.
Input
The first line contains integer t, the number of test cases. Integers K are given in the next t lines.
Output
For each K, output the smallest palindrome larger than K.
Example
Input:
2
808
2133
Output:
818
2222
My code converts the input to a string, evaluates either end of the string, making adjustments accordingly, and moves inwards. However, the problem allows values up to 10^6 digits long; if I try to parse such large numbers I get a number format exception, e.g.
Integer.parseInt(LARGENUMBER);
or
Long.parseLong(LARGENUMBER);
and LARGENUMBER is out of range. Can anyone think of a workaround, or how to handle such large numbers?
You could probably use the BigInteger class to handle large integers like this.
However, I wouldn't count on it being efficient at such massive sizes, because it still uses O(n^2) algorithms for multiplication and conversions.
Think of your steps that you do now. Do you see something that seems a little superfluous since you're converting the number to a string to process it?
While this problem talks about integers, it does so only to restrict the input and output characters and format. This is really a string-operations question with careful selection. Since this is the case, you really don't need to read the input in as integers, only as strings.
This makes validating the palindrome simple. The only thing you need to work out is choosing the next higher one.
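For what it's worth, one common way to choose the next higher palindrome purely on the digit string, with no big-integer arithmetic at all, is: mirror the left half onto the right; if the result does not exceed K, increment the middle with carry and mirror again (the all-nines input is the one special case). A sketch of that idea:

```java
public class NextPalindrome {
    // Smallest palindrome strictly greater than the digit string k.
    static String next(String k) {
        int n = k.length();
        // Special case: all nines -> "1 0...0 1", one digit longer.
        if (k.chars().allMatch(c -> c == '9')) {
            StringBuilder sb = new StringBuilder("1");
            for (int i = 0; i < n - 1; i++) sb.append('0');
            return sb.append('1').toString();
        }
        char[] m = k.toCharArray();
        // Mirror the left half onto the right half.
        for (int i = 0; i < n / 2; i++) m[n - 1 - i] = m[i];
        if (new String(m).compareTo(k) > 0) return new String(m);
        // Otherwise increment the middle digit(s) with carry, then mirror again.
        int i = (n - 1) / 2;
        while (m[i] == '9') m[i--] = '0';  // cannot run off the end: not all nines
        m[i]++;
        for (int j = 0; j < n / 2; j++) m[n - 1 - j] = m[j];
        return new String(m);
    }

    public static void main(String[] args) {
        System.out.println(next("808"));   // prints 818
        System.out.println(next("2133"));  // prints 2222
    }
}
```

Every step is O(n) in the number of digits, so a 10^6-digit K is no problem.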
I have an object with a String that holds a unique id .
(such as "ocx7gf" or "67hfs8")
I need to supply an implementation of int hashCode() which will obviously be unique.
How do I convert a string to a unique int in the easiest/fastest way?
Thanks.
Edit - OK, I already know that String.hashCode() is possible, but it is not recommended anywhere. If no other method is recommended either, should I use it anyway when my object is in a collection and I need the hash code? Should I concatenate it with another string to make it more successful?
No, you don't need to have an implementation that returns a unique value, "obviously", as obviously the majority of implementations would be broken.
What you want to do, is to have a good spread across bits, especially for common values (if any values are more common than others). Barring special knowledge of your format, then just using the hashcode of the string itself would be best.
With special knowledge of the limits of your id format, it may be possible to customise and result in better performance, though false assumptions are more likely to make things worse than better.
Edit: On good spread of bits.
As stated here and in other answers, being completely unique is impossible and hash collisions are possible. Hash-using methods know this and can deal with it, but it does impact upon performance, so we want collisions to be rare.
Further, hashes are generally re-hashed, so our 32-bit number may end up being reduced to, e.g., one in the range 0 to 22, and we want as good a distribution within that as possible, too.
We also want to balance this with not taking so long to compute our hash, that it becomes a bottleneck in itself. An imperfect balancing act.
A classic example of a bad hash method is one for a co-ordinate pair of X, Y ints that does:
return X ^ Y;
While this does a perfectly good job of mapping the 2^64 possible inputs onto 2^32 possible values, in real-world use it's quite common to have sets of coordinates where X and Y are equal ({0, 0}, {1, 1}, {2, 2} and so on), which all hash to zero, or matching pairs ({2, 3} and {3, 2}) which hash to the same number. We are likely better served by:
return ((X << 16) | (X >>> 16)) ^ Y;
Now, there are just as many possible inputs for which this is dreadful as for the former, but it tends to serve better in real-world cases.
Of course, it's a different job if you are writing a general-purpose class (no idea what the possible inputs are) than if you have a better idea of the purpose at hand. For example, if I were using Date objects but knew that they would all be dates only (time part always midnight) and only within a few years of each other, then I might prefer a custom hash code that used only the day, month, and lower digits of the year, over the standard one. The writer of Date, though, can't work from such knowledge and has to try to cater for everyone.
Hence, if I knew for instance that a given string will always consist of 6 case-insensitive characters in the range [a-z] or [0-9] (which yours seem to, though it isn't clear from your question that this is guaranteed), then I might use an algorithm that assigns a value from 0 to 35 (the 36 possible values for each character) to each character, and then walks through the string, each time multiplying the current value by 36 and adding the value of the next char.
Assuming a good spread in the ids, this would be the way to go, especially if I chose the order such that the least-significant digits of my hash matched the most frequently changing character in the id (if such a call could be made), hence surviving re-hashing to a smaller range well.
However, lacking such knowledge of the format for sure, I can't make that call with certainty, and I could well be making things worse (slower algorithm for little or even negative gain in hash quality).
One advantage you have is that since it's an ID in itself, then presumably no other non-equal object has the same ID, and hence no other properties need be examined. This doesn't always hold.
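The walk-through scheme described above (multiply by 36, add the next character's value) might look like this. It assumes the ids are drawn from [a-z0-9]; note that 36^6 slightly exceeds Integer.MAX_VALUE, so for six-character ids the arithmetic can wrap, which is harmless for a hash code:

```java
public class IdHash {
    // Base-36 rolling hash: each character contributes a value 0-35.
    static int idHash(String id) {
        int h = 0;
        for (char c : id.toCharArray()) {
            int v = (c <= '9') ? c - '0' : c - 'a' + 10;  // '0'-'9' -> 0-9, 'a'-'z' -> 10-35
            h = h * 36 + v;  // may wrap past Integer.MAX_VALUE; fine for hashing
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(idHash("ocx7gf"));
        System.out.println(idHash("67hfs8"));
    }
}
```

For ids of up to six base-36 characters this is collision-free in practice (distinct ids give distinct values, modulo the slight wrap at the top of the range); beyond that, collisions are unavoidable, as discussed above.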
You can't get a unique integer from a String of unlimited length. There are 4 billionish (2^32) unique integers, but an almost infinite number of unique strings.
String.hashCode() will not give you unique integers, but it will do its best to give you differing results based on the input string.
EDIT
Your edited question says that String.hashCode() is not recommended. This is not true, it is recommended, unless you have some special reason not to use it. If you do have a special reason, please provide details.
Looks like you've got a base-36 number there (a-z + 0-9). Why not convert it to an int using Integer.parseInt(s, 36)? Obviously, if there are too many unique IDs, it won't fit into an int, but in that case you're out of luck with unique integers and will need to get by using String.hashCode(), which does its best to be close to unique.
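A quick sketch of that suggestion, with the id values from the question. One caveat: not every 6-character base-36 string fits in an int (Integer.MAX_VALUE is "zik0zj" in base 36), so ids above that threshold will throw NumberFormatException; Long.parseLong(s, 36) extends the safe length:

```java
public class ParseId {
    public static void main(String[] args) {
        // A base-36 id parses directly to an int if it fits...
        int n = Integer.parseInt("ocx7gf", 36);
        System.out.println(n);
        // ...and the conversion round-trips, so distinct ids give distinct ints.
        System.out.println(Integer.toString(n, 36));  // prints ocx7gf
        // Long covers ids that overflow an int (up to 12 base-36 characters).
        System.out.println(Long.parseLong("zzzzzz", 36));  // prints 2176782335
    }
}
```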
Unless your strings are limited in some way or your integers hold more bits than the strings you're trying to convert, you cannot guarantee the uniqueness.
Let's say you have a 32-bit integer and a 64-character character set for your strings. That means six bits per character, which allows you to store five characters in an integer. More than that and it won't fit.
Represent each character of the string by a five-bit binary group, e.g. a by 00001, b by 00010, etc.; this gives 32 possible combinations per character. For example, with c = 00011, a = 00001 and t = 10100, "cat" would be written as 00011 00001 10100. Then convert this binary value to decimal: 000110000110100 is 3124, so "cat" maps to 3124. Similarly, you can get "cat" back from 3124 by converting it to binary first and mapping each five-bit group back to a letter.
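A minimal pack/unpack sketch of that five-bits-per-letter idea (a = 00001 .. z = 11010); with c = 00011, a = 00001 and t = 10100, "cat" packs to 3124. Note a 64-bit long can hold at most 12 such groups:

```java
public class FiveBitPack {
    // Pack a lowercase word into a long, five bits per letter (a=1 .. z=26).
    static long pack(String word) {
        long v = 0;
        for (char c : word.toCharArray()) {
            v = (v << 5) | (c - 'a' + 1);
        }
        return v;
    }

    // Reverse: split the value back into five-bit groups and map each to a letter.
    static String unpack(long v) {
        StringBuilder sb = new StringBuilder();
        while (v != 0) {
            sb.append((char) ('a' + (v & 31) - 1));
            v >>= 5;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(pack("cat"));    // prints 3124
        System.out.println(unpack(3124));   // prints cat
    }
}
```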
One way to do it is to assign each letter a value (a = 1, b = 2, and so on) and give each position of the string its own multiplier: everything in the first position (reading left to right) is multiplied by one prime, the next position by the next prime, and so on, such that the final position is multiplied by a prime larger than the number of possible symbols in that position (26+1 for a space, or 52+1 with capitals, and so on for other supported characters). If the number is mapped back to the first position (leftmost character), any number you generate from a unique string maps back to 1, or 6, or whatever the first letter is, giving a unique value.
Dog might be 30, 3(15), 101(7), or 782, while God is 33, 3(15), 101(4), or 482. More important than unique strings being generated, they can be useful in generation if the original digit is kept: 30(782) would be distinct from some 12(782), for the purpose of differentiating similar strings if you ever managed to exceed the unique possibilities. Dog would always be Dog, but it would never be Cat or Mouse.