This question already has answers here:
how can I get the String from hashCode
(4 answers)
Closed 3 years ago.
I need to somehow get the text back from its hash in Java.
I have this code:
String myString = new String("creashaks organzine");
int hashCode = myString.hashCode();
System.out.println("Hash:" + hashCode);
The result of this code will be 0.
But the hash of "pollinating sandboxes" string will also be 0.
So there are collisions, for example between "creashaks organzine" and "pollinating sandboxes", and I want to find collisions like this one.
Since I don't have enough reputation to add a comment, I will quote the solution from another question.
You know that several objects can have the same hashCode(), as mentioned in the Javadoc for Object.hashCode():
It is not required that if two objects are unequal according to the {@link java.lang.Object#equals(java.lang.Object)} method, then calling the {@code hashCode} method on each of the two objects must produce distinct integer results.
Since different objects can share the same hash code, you can't restore the original object from it. It's simply impossible.
how can I get the String from hashCode
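If the goal is just to see collisions, here is a minimal sketch of one well-known way to build them. It is my own illustration, not part of the quoted answer, and the class and method names are made up. It relies on "Aa" and "BB" both hashing to 2112: because the two blocks have the same length and the same hash, any sequence of such blocks hashes to the same value as any other sequence of the same length.

```java
import java.util.ArrayList;
import java.util.List;

public class CollisionBlocks {
    // Builds every string made of `blocks` two-character pieces, where each
    // piece is either "Aa" or "BB". All results share one hash code.
    static List<String> collisions(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints 8 different strings, each followed by the same hash code.
        for (String s : collisions(3)) {
            System.out.println(s + " -> " + s.hashCode());
        }
    }
}
```

With n blocks this gives 2^n distinct strings for a single hash code, which is another way to see why the hash cannot be turned back into "the" string.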
This is a very interesting thing. The specification at https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/String.html#hashCode() says that the hash code is calculated from the string content, but the example seems to show that this is not true for the first string:
class Main
{
    public static void main(String[] args)
    {
        String myString1 = "creashaks organzine";
        String myString2 = "crsomething else";
        String myString3 = "crsomething else";
        System.out.println("Hash1:" + myString1.hashCode());
        System.out.println("Hash2:" + myString2.hashCode());
        System.out.println("Hash3:" + myString3.hashCode());
    }
}
Outputs:
Hash1:0
Hash2:444616526
Hash3:444616526
But when I modify the string, then I get a different output:
String myString1 = "creashaks organzine...";
System.out.println("Hash1:" + myString1.hashCode());
Outputs:
Hash1:45678
So it seems that somebody tricked us by giving a very rare example string that produces exactly 0 as its hash code. Here you see that the hashCode is not unique, so you cannot use it safely to compare strings.
Coming back to your initial question: The hashCode is a number with reduced details, so you cannot calculate it back to the original string. This applies to all hash codes.
Cryptographic hashes (not String.hashCode()) are often stored in server-side databases instead of the real password strings. They can be compared but not reconstructed.
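To illustrate "compared but not reconstructed", here is a small sketch using a SHA-256 digest from the standard library. The class and method names are hypothetical, and this is only meant to show the comparison idea: real password storage normally adds a salt and uses a deliberately slow scheme, which this sketch leaves out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestCompare {
    // Returns the SHA-256 digest of the UTF-8 bytes of `text`.
    static byte[] sha256(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return md.digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if `candidate` produces the same digest as the stored one.
    static boolean matches(byte[] stored, String candidate) {
        return MessageDigest.isEqual(stored, sha256(candidate));
    }

    public static void main(String[] args) {
        byte[] stored = sha256("correct horse");
        System.out.println(matches(stored, "correct horse")); // true
        System.out.println(matches(stored, "wrong guess"));   // false
    }
}
```

Only the digest is kept, so a login attempt is checked by hashing the input again and comparing the two byte arrays.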
Hi, I want to convert a string to some unique number in Java.
Example: "Production-0-1" to 100021
"Process-23-30" to 12310
And every returned number has to be unique.
I don't want to use hashCode, as it can return duplicates: "Aa" and "BB" have the same hash code.
Let me know the math logic to create this if there is no method available.
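String random = "Production-0-1";
// Reads the UTF-8 bytes as one big two's-complement number.
// Needs java.math.BigInteger and java.nio.charset.StandardCharsets.
BigInteger numBig = new BigInteger(random.getBytes(StandardCharsets.UTF_8));
System.out.println(numBig);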
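Based on @markspace's comments, I tried the code above. It produces a distinct number for each distinct string (as long as the string is non-empty and does not start with a NUL character), but beware: the number grows by about 8 bits per byte of input, so a very large String yields a very large number that will not fit in an int or a long.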
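Unlike hashCode, this mapping can be reversed, which is what makes the numbers unique. A minimal sketch of the round trip, with hypothetical helper names toNumber and fromNumber, assuming the string is non-empty and does not start with a NUL character:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

public class StringNumber {
    // Interprets the UTF-8 bytes of `s` as a big-endian two's-complement number.
    // Throws NumberFormatException for the empty string.
    static BigInteger toNumber(String s) {
        return new BigInteger(s.getBytes(StandardCharsets.UTF_8));
    }

    // Turns the number back into bytes and decodes them as UTF-8.
    static String fromNumber(BigInteger n) {
        return new String(n.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        BigInteger n = toNumber("Production-0-1");
        System.out.println(n);
        System.out.println(fromNumber(n)); // Production-0-1
    }
}
```

The price is size: the number carries every byte of the input, so it is not a short ID like 100021.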
This question already has answers here:
String valueOf vs concatenation with empty string
(10 answers)
Closed 5 years ago.
I want to know the difference between two approaches. I'm working on some old code that sets primitive values on a String property by concatenating them with an empty String "".
obj.setSomeString("" + primitiveVariable);
But the linked question, Size of empty Java String, says that if you're creating a separate empty string for each instance, then obviously that will take more memory.
So I thought of using the valueOf method of the String class. I checked the documentation of String.valueOf(); it says: if the argument is null, then a string equal to "null"; otherwise, the value of obj.toString() is returned.
So which one is the better way?
obj.setSomeString("" + primitiveVariable);
obj.setSomeString(String.valueOf(primitiveVariable));
The process described above runs inside an iteration over a List with more than 600 elements, and the size is expected to increase in the future.
Writing "" is not going to create a new object each time. It refers to a String literal, which is shared. There is a difference (see How can a string be initialized using " "?).
Coming to your actual question,
From String concatenation docs
The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method.
So you are unnecessarily creating a StringBuilder object, which then gives you another String object.
However, valueOf directly gives you a String object. Just go for it.
Besides the performance, just think generally: why concatenate with an empty string when you actually want to convert the int to a String? :)
Q. So which one is the better way
A. obj.setSomeString(String.valueOf(primitiveVariable)) is usually the better way. It's neater and more idiomatic. Both calls pass the same String, but valueOf states the conversion explicitly, whereas the concatenation only converts as a side effect. That makes the concatenation more of a "hack" and less organized.
The other way to do it, for an int, is Integer.toString(primitiveVariable), which is basically the same as String.valueOf.
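A quick sketch to check that the three variants really produce equal strings for an int. The class and method names are only for illustration:

```java
public class ToStringWays {
    // Conversion as a side effect of concatenation.
    static String viaConcat(int v) {
        return "" + v;
    }

    // Explicit conversion through String.
    static String viaValueOf(int v) {
        return String.valueOf(v);
    }

    // Explicit conversion through Integer.
    static String viaInteger(int v) {
        return Integer.toString(v);
    }

    public static void main(String[] args) {
        int primitiveVariable = 42;
        System.out.println(viaConcat(primitiveVariable));  // 42
        System.out.println(viaValueOf(primitiveVariable)); // 42
        System.out.println(viaInteger(primitiveVariable)); // 42
    }
}
```

So the choice between them is about readability and a little overhead, not about the result.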
Java doc for method String#hashCode() says:
Returns a hash code for this string. The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)
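To make the formula concrete, here is a small sketch that evaluates it by hand and can be compared against String.hashCode(). It is written from the documented formula, not copied from the JDK source, and the class name is made up. It uses Horner's rule, which computes the same sum without explicit powers; int overflow simply wraps around, as the "int arithmetic" wording implies.

```java
public class ManualHash {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // overflow wraps, as with any int
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "hello";
        // Both numbers printed on this line are the same.
        System.out.println(hash(s) + " " + s.hashCode());
    }
}
```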
Questions:
Is it possible to have same hash code for two string objects having different values? If yes then please share some examples.
Is it possible to get String value back from its hash code?
I am not using it any where in code. I have just asked this question to know more about Java String class.
Is it possible to have same hash code for two string objects having different values? If yes then please share some examples.
Here is a small sample of randomly generated examples of short strings with identical hash codes:
String 1    String 2    Common hash code
--------    --------    ----------------
VTBHKIGV    FLXCLLII    -1242944431
FPESRBAH    GNFWMYVA    1778061647
UYDHRTXL    HGCNRCBE    1509241566
VXQMFMDE    YMYXDWKK    -1553987354
VGWBSYRX    JZNQSUXK    700334696
Since multiple strings can share the same hash code, restoring the original from the hash is not possible.
Is it possible to have same hash code for two string objects having different values?
yes, how can you map infinite string possibilities to int without it
Is it possible to get String value back from its hash code?
no, read 1
It's absolutely possible to have two different strings (or objects) with the same hash code. That's why we have collision handling. So in general it's not possible to get the string value back from the hash code. This is because there are only 2^32 possible hash codes but far more possible strings, so many strings must share each value. Collisions already occur for two-character strings such as "Aa" and "BB".
assume your string is 2 characters long
c1,c2
your hash is 31*c1 + c2
can you think of different values that will map to the same hash?
it is worse in longer strings
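One answer to the puzzle, as a sketch: raising c1 by 1 adds 31 to the hash, and lowering c2 by 31 takes it away again. The helper name partner is made up for this example, and it only handles the simple case where both adjusted characters stay in range.

```java
public class TwoCharCollision {
    // For a two-character string, returns a different string with the same
    // hash code: 31*(c1 + 1) + (c2 - 31) == 31*c1 + c2.
    static String partner(String s) {
        if (s.length() != 2) {
            throw new IllegalArgumentException("need exactly two characters");
        }
        char c1 = s.charAt(0);
        char c2 = s.charAt(1);
        if (c1 == Character.MAX_VALUE || c2 < 31) {
            throw new IllegalArgumentException("characters out of range for this trick");
        }
        return "" + (char) (c1 + 1) + (char) (c2 - 31);
    }

    public static void main(String[] args) {
        String p = partner("Aa");
        System.out.println(p);                                  // BB
        System.out.println("Aa".hashCode() + " " + p.hashCode()); // 2112 2112
    }
}
```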
Possible Duplicate:
hiding strings in Obfuscated code
I'm trying to hide some static Strings of my app a little in order to make it harder to decompile, so that constants like cipher algorithm names are harder to find in the obfuscated code.
I've considered things like:
String CONCAT= "concat"+"string";
String RAW_STRING= "raw_string";
String FROM_BYTES=new String("from_bytes".getBytes());
String FROM_CHARS=new String(new char[]{'f','r','o','m','_','c','h','a','r','s'});
String FROM_CHAR2=new String(new char[]{102,114,111,109,95,99,104,97,114,115,95,50});
The last two options seem to be "darker" than the raw option, but I imagine there are better ways of doing this.
How can I improve this? Thanks
For one, you shouldn't just write
String FROM_CHAR2=new String(new char[]{102,114,111,109,95,99,104,97,114,115,95,50});
It's a dead giveaway that the char array is actually a String.
You can do a combination of the following:
put your "String" in an int[] array
or even better, break your String into several int arrays
calculate/manipulate the array's values at various stages of the application, so that the values only become valid during a certain interval at runtime, making it unlikely that they are deciphered at a casual glance at the decompiled code
pass the array(s) back and forth, through local variables, back to instance variables, etc., before finally combining them into a single array to be passed to the String constructor
immediately set the String to null after use, just to reduce the amount of time the actual String exists at runtime
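A rough sketch of the first few points, under my own assumptions: the XOR mask, the class name, and the way the pieces are split are all invented for illustration. This only hides the text from a casual look at the decompiled constants; it is not encryption.

```java
import java.util.Arrays;

public class SplitString {
    // Hypothetical mask; any int works as long as encode and decode agree.
    private static final int MASK = 0x5A;

    // Offline step: turn the text into masked ints that do not look like chars.
    static int[] encode(String text) {
        int[] out = new int[text.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = text.charAt(i) ^ MASK;
        }
        return out;
    }

    // Runtime step: join the pieces in order and unmask them.
    static String decode(int[]... parts) {
        StringBuilder sb = new StringBuilder();
        for (int[] part : parts) {
            for (int v : part) {
                sb.append((char) (v ^ MASK));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] all = encode("AES");
        // In a real app the two pieces would live in different places.
        int[] first = Arrays.copyOfRange(all, 0, 1);
        int[] second = Arrays.copyOfRange(all, 1, all.length);
        System.out.println(decode(first, second)); // AES
    }
}
```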
I would prefer to set the value in the static (class) initializer using a decryption algorithm.
Something like
class ...
static String CONCAT; // must be static so the static initializer can assign it
static {
CONCAT = uncrypt ("ahgsdhagcf");
}
where uncrypt might be a really good decryption algorithm or, somewhat weaker, a Base64 decode.
In any case you need a simple program to encode your string first.
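A minimal sketch of the weaker variant, using java.util.Base64 from the standard library. The class name, the constant name, and the encode helper are my own; encode is the "simple program" you would run once to produce the literal, and "QUVT" is the Base64 form of "AES".

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Obfuscated {
    static final String CIPHER_NAME;

    static {
        // The literal was produced beforehand with encode("AES").
        CIPHER_NAME = uncrypt("QUVT");
    }

    // Run once, offline, to get the literal that goes into the source.
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Base64 decode; this is encoding, not encryption, so it only hides the
    // text from a plain string search.
    static String uncrypt(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("AES"));  // QUVT
        System.out.println(CIPHER_NAME);    // AES
    }
}
```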
Is there an equivalent to Java's String intern function in Go?
I am parsing a lot of text input that has repeating patterns (tags). I would like to be memory efficient about it and store pointers to a single string for each tag, instead of multiple strings for each occurrence of a tag.
No such function exists that I know of. However, you can make your own very easily using maps. The string type itself is a uintptr and a length. So, a string assigned from another string takes up only two words. Therefore, all you need to do is ensure that there are no two strings with redundant content.
Here is an example of what I mean.
type Interner map[string]string

func NewInterner() Interner {
    return Interner(make(map[string]string))
}

func (m Interner) Intern(s string) string {
    if ret, ok := m[s]; ok {
        return ret
    }
    m[s] = s
    return s
}
This code will deduplicate redundant strings whenever you do the following:
str = interner.Intern(str)
EDIT: As jnml mentioned, my answer could pin memory depending on the string it is given. There are two ways to solve this problem. Both should be inserted before m[s] = s in my previous example. The first copies the string twice, the second uses unsafe. Neither is ideal.
Double copy:
b := []byte(s)
s = string(b)
Unsafe (use at your own risk. Works with current version of gc compiler):
b := []byte(s)
s = *(*string)(unsafe.Pointer(&b))
I think that, for example, Pool and GoPool may fulfill your needs. That code solves one thing which Stephen's solution ignores. In Go, a string value may be a slice of a bigger string. There are scenarios where that doesn't matter and scenarios where it is a show stopper. The linked functions attempt to be on the safe side.