I am using a NoSQL database which doesn't allow equality conditions on attributes that are projected. For example, inequality operations such as select a from table where a > 10 and select a from table where b < 10 are allowed, but select a from table where a = 10 is not. Of course I need an equality in my case, so I want to turn an equality operation into an inequality operation.
So I need to retrieve a record by email. If I could, I would write select email from table where email = 'myemail@email.com', but this is not allowed, so I want to get the lexicographic value right before myemail@email.com and the value right after. The query would then look like this:
select email from table where email < [1 value above] and email > [1 value below]
This way the statement would still return myemail@email.com. I am having trouble working out how to accomplish this, though.
Usually, to compare strings I write "myemail@email.com".compare("myemail@email.ca") to see which one is bigger and which one is smaller. This method somehow compares the values lexicographically, but how? My question is: how do I get the lexicographic value right below a string and the lexicographic value right after it?
The string immediately after a string is easy. It's just
str + '\0'
This works because '\0' is the lowest possible char value.
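As a quick sanity check, here is a minimal sketch (the class and method names are made up for illustration) showing that appending '\0' gives a string that sorts directly after the original under String.compareTo:

```java
class SuccessorDemo {
    // Hypothetical helper: the smallest string strictly greater than s
    // under String.compareTo, which compares UTF-16 code units.
    static String successor(String s) {
        return s + '\0';
    }

    public static void main(String[] args) {
        String s = "foo";
        String next = successor(s);
        System.out.println(next.length());              // one char longer than s
        System.out.println(s.compareTo(next) < 0);      // s sorts before its successor
        System.out.println(next.compareTo("foo0") < 0); // the successor sorts before other extensions of s
    }
}
```

Nothing fits between s and s + '\0': a bigger string that starts with s needs at least one more char, and no char is smaller than '\0'.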
The string immediately before str is more tricky. If the string ends in '\0' you can just remove it. If the string doesn't end in '\0' you have serious issues. As an example, let's consider the string "foo".
Each of the following strings is below "foo" and each one is bigger than the last.
"fon" + Character.MAX_VALUE;
"fon" + Character.MAX_VALUE + Character.MAX_VALUE;
"fon" + Character.MAX_VALUE + Character.MAX_VALUE + Character.MAX_VALUE;
...
The largest String value less than "foo" is "fon" followed by something like 2^31 - 4 copies of Character.MAX_VALUE (this may not be right. I'm not sure what the largest possible length of a char[] is). However, you will not be able to store such a string in memory.
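To illustrate the chain above, here is a small sketch (padded is a hypothetical helper, not a library method) that checks that each longer candidate is bigger than the previous one and still below "foo":

```java
class PredecessorDemo {
    // Hypothetical helper: base followed by `copies` copies of Character.MAX_VALUE.
    static String padded(String base, int copies) {
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < copies; i++) {
            sb.append(Character.MAX_VALUE);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String target = "foo";
        for (int copies = 1; copies <= 3; copies++) {
            String shorter = padded("fon", copies);
            String longer = padded("fon", copies + 1);
            System.out.println(shorter.compareTo(longer) < 0); // the longer candidate is bigger
            System.out.println(longer.compareTo(target) < 0);  // but it is still below "foo"
        }
    }
}
```

Every line printed is true, and the loop could go on for as long as memory allows, which is why there is no practical "string right before".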
You should therefore try to find a different solution to your problem.
Assuming your alphabet is a-z0-9, and case-insensitive, you can treat your string as a base-36 number and simply increment/decrement the values using simple arithmetic.
Java's Long.valueOf method allows you to take a String with a given radix and convert it to its (base 10) Long equivalent. Once you have a Long instance, you can simply add 1 to get the next value.
public static String nextString(String str) {
    return Long.toString(Long.valueOf(norm(str), 36) + 1, 36);
}
To reverse the operation, you can use the Long.toString method, which takes a long instance and converts it to a String representation, with a specified radix. So you can represent your base-10 long as a base-36 number, which will include the letters a-z.
public static String prevString(String str) {
    return Long.toString(Long.valueOf(norm(str), 36) - 1, 36);
}
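You'll want to normalize your strings when using these methods. This filters out invalid characters, ensures that everything is lower-case, and guards against null pointer exceptions and the NumberFormatException an empty string would cause. Note that a long holds only about 12 base-36 digits, so longer inputs may still throw a NumberFormatException.
private static String norm(String str) {
    if (str == null) {
        return "0";
    }
    String cleaned = str.toLowerCase().replaceAll("[^a-z0-9]", "");
    // An empty string is not a valid number in any radix, so fall back to "0"
    return cleaned.isEmpty() ? "0" : cleaned;
}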
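Here is a self-contained sketch of the same idea. It is a variation rather than the code above: I swapped Long for java.math.BigInteger so that inputs longer than about 12 characters do not overflow, and the class name is invented for the example. Keep in mind that these are numeric neighbours. They match lexicographic order only among strings of the same length over a-z0-9, and leading zeros are dropped.

```java
import java.math.BigInteger;

class Base36Neighbours {
    // Same normalisation idea as above: keep only a-z and 0-9, fall back to "0".
    static String norm(String str) {
        if (str == null) {
            return "0";
        }
        String cleaned = str.toLowerCase().replaceAll("[^a-z0-9]", "");
        return cleaned.isEmpty() ? "0" : cleaned;
    }

    // Treat the string as a base-36 number and add one.
    static String nextString(String str) {
        return new BigInteger(norm(str), 36).add(BigInteger.ONE).toString(36);
    }

    // Treat the string as a base-36 number and subtract one.
    static String prevString(String str) {
        return new BigInteger(norm(str), 36).subtract(BigInteger.ONE).toString(36);
    }

    public static void main(String[] args) {
        System.out.println(nextString("abc")); // abd
        System.out.println(prevString("abc")); // abb
        System.out.println(nextString("az"));  // b0 (the carry works like in decimal)
        System.out.println(nextString("myemail@email.com")); // '@' and '.' are stripped first
    }
}
```

Because norm() strips '@' and '.', the results are neighbours of the normalized string, not of the original email address, so whether this is usable depends on how the values are stored in the database.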
I'm presently trying to understand a particular algorithm at the CodingBat platform.
Here's the problem presented by CodingBat:
*Suppose the string "yak" is unlucky. Given a string, return a version where all the "yak" are removed, but the "a" can be any char. The "yak" strings will not overlap.
Example outputs:
stringYak("yakpak") → "pak"
stringYak("pakyak") → "pak"
stringYak("yak123ya") → "123ya"*
Here's the official code solution:
public String stringYak(String str) {
    String result = "";
    for (int i=0; i<str.length(); i++) {
        // Look for i starting a "yak" -- advance i in that case
        if (i+2<str.length() && str.charAt(i)=='y' && str.charAt(i+2)=='k') {
            i = i + 2;
        } else { // Otherwise do the normal append
            result = result + str.charAt(i);
        }
    }
    return result;
}
I can't make sense of this line of code below. Following the logic, result would only return the character at the index, not the remaining string.
result = result + str.charAt(i);
To me it would make better sense if the code was presented like this below, where the substring function would return the letter of the index and the remaining string afterwards:
result = result + str.substring(i);
What am I missing? Any feedback from anyone would be greatly helpful and thank you for your valuable time.
String concatenation
In order to be on the same page, let's recap how string concatenation works.
When at least one of the operands in an expression with the plus sign + is an instance of String, the plus sign is interpreted as the string concatenation operator. The result of the expression is a new string created by appending the right operand (or its string representation) to the left operand (or its string representation).
String str = "allow";
char ch = 'h';
Object obj = new Object();
System.out.println(ch + str); // prints "hallow"
System.out.println("test " + obj); // prints "test java.lang.Object@16b98e56"
Explanation of the code-logic
That said, I guess you will agree that this statement concatenates a character at position i in the str to the resulting string and assigns the result of concatenation to the same variable result:
result = result + str.charAt(i);
The condition in the code provided by CodingBat first checks that the index i+2 is valid and then checks whether the characters at indices i and i+2 are equal to y and k respectively. If that is not the case, the character is appended to the resulting string. Otherwise it is discarded and the index is incremented by 2 in order to skip the whole group of characters that constitutes "yak" (where a can be an arbitrary symbol).
So the resulting string is constructed in the loop character by character.
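To watch the string being built one character at a time, here is a sketch of the same loop with a trace line added. The class name YakTrace and the use of StringBuilder are my own choices; the logic is otherwise the one from the question:

```java
class YakTrace {
    static String stringYak(String str) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            if (i + 2 < str.length() && str.charAt(i) == 'y' && str.charAt(i + 2) == 'k') {
                i = i + 2; // jump to the 'k'; the loop's i++ then moves past it
            } else {
                result.append(str.charAt(i)); // exactly one character per iteration
            }
            // Trace: index reached in this iteration and the result so far
            System.out.println("i=" + i + " result=\"" + result + "\"");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(stringYak("yakpak"));
    }
}
```

The rest of the string is never appended in one go. It is picked up by the later iterations, one charAt(i) at a time.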
Flavors of substring()
Method substring() is overloaded; there are two flavors of it.
A version that expects two arguments, the starting index (inclusive) and the ending index (exclusive): substring(int, int).
And you can use it to achieve the same result:
// an equivalent of result = result + str.charAt(i);
result = result + str.substring(i, i + 1);
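To see why the one-argument call from the question is not a drop-in replacement, here is a small sketch that runs the same append loop with both flavors. The class and method names are invented for the comparison:

```java
class SubstringFlavors {
    // What the loop builds if it appends str.substring(i) on every pass:
    // each iteration adds everything from index i to the end.
    static String appendRest(String str) {
        String result = "";
        for (int i = 0; i < str.length(); i++) {
            result = result + str.substring(i);
        }
        return result;
    }

    // The same loop with substring(i, i + 1), which adds a single character.
    static String appendOne(String str) {
        String result = "";
        for (int i = 0; i < str.length(); i++) {
            result = result + str.substring(i, i + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(appendRest("pak")); // pakakk
        System.out.println(appendOne("pak"));  // pak
    }
}
```

Since the loop visits every index anyway, appending the whole remainder each time duplicates characters.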
Another version of this method, which expects one argument, will not be useful here, because the result returned by str.substring(i) will not be a string containing a single character but a substring starting from the given index, i.e. encompassing all the characters until the end of the string, as the documentation of substring(int) states:
public String substring(int beginIndex)
Returns a string that is a substring of this string. The substring
begins with the character at the specified index and extends to the
end of this string.
Examples:
"unhappy".substring(2) returns "happy"
"Harbison".substring(3) returns "bison"
"emptiness".substring(9) returns "" (an empty string)
Side note:
This coding problem was introduced in order to practice the basics of loops and string operations. But actually the simplest way to solve this problem is to use the replaceAll() method, which expects a regular expression and a replacement string:
return str.replaceAll("y.k", "");
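A self-contained sketch of that one-liner (the class name is invented here). One caveat worth stating: by default the . in a Java regex matches any character except line terminators, so a "yak" whose middle character is a newline would be left alone:

```java
class YakRegex {
    static String stringYak(String str) {
        // "y.k" matches 'y', then any single non-line-terminator char, then 'k'
        return str.replaceAll("y.k", "");
    }

    public static void main(String[] args) {
        System.out.println(stringYak("yakpak"));   // pak
        System.out.println(stringYak("pakyak"));   // pak
        System.out.println(stringYak("yak123ya")); // 123ya
    }
}
```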
I'm a newbie to Java and trying to sort email addresses alphabetically using compareTo(), but the result is not what I expect. I put my question in the code; could you please advise?
public class SortTest {
    public static void main(String[] args) {
        String text1 = "customer1@example.com";
        String text2 = "customer10@example.com";
        System.out.println(text1.compareTo(text2)); // Result is 16. Why? I expect a negative number as result.
        String text3 = "customer1";
        String text4 = "customer10";
        System.out.println(text3.compareTo(text4)); // result is -1 which is correct.
    }
}
UPDATE:
I want to sort text1 and text2 above in ascending order; the expected order is "customer1@example.com", then "customer10@example.com". Could you advise how to achieve it?
Take a look at the javadoc:
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings -- that is, the value: this.length()-anotherString.length()
This explains the second example.
In the first one "0" is lexicographically before "@". You can simply check that by running:
"@".compareTo("0")
which results in the value 16.
Or another way:
(int) '@' // 64
(int) '0' // 48
So the difference is 16.
Edit: to compare the emails the way you want, you need some more logic, for example comparing only the login part (dropping the domain that follows the "@"): str1.split("@")[0].compareTo(str2.split("@")[0])
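A minimal sketch of that idea applied to sorting, assuming every address has at most one "@". EmailSort, localPart and sortByLocalPart are hypothetical names made up for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class EmailSort {
    // Everything before the first '@', or the whole string if there is none.
    static String localPart(String email) {
        int at = email.indexOf('@');
        return at < 0 ? email : email.substring(0, at);
    }

    // Returns a sorted copy, ordered by the local part only.
    static List<String> sortByLocalPart(List<String> emails) {
        List<String> sorted = new ArrayList<>(emails);
        Comparator<String> byLocalPart = Comparator.comparing(EmailSort::localPart);
        sorted.sort(byLocalPart);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> emails = Arrays.asList("customer10@example.com", "customer1@example.com");
        System.out.println(sortByLocalPart(emails));
    }
}
```

This is still a plain lexicographic comparison, so "customer10" would sort before "customer2". If you need numeric ordering of the trailing number, you would have to parse it out and compare it as a number.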
Take a look at the ASCII table - http://www.asciitable.com/index/asciifull.gif
The decimal value of '@' is 64,
while the decimal value of '0' is 48.
Now if you do (64-48) you get 16.
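The following sketch spells out how that number comes about. firstDifference is a hand-written illustration of the documented compareTo behaviour, not the JDK's actual source:

```java
class FirstDifference {
    // Difference of the chars at the first index where the strings differ,
    // or the difference of the lengths if one string is a prefix of the other.
    static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return a.charAt(i) - b.charAt(i);
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println((int) '@'); // 64
        System.out.println((int) '0'); // 48
        // Index 9 is the first difference: '@' in the first string, '0' in the second
        System.out.println(firstDifference("customer1@example.com", "customer10@example.com")); // 16
        System.out.println(firstDifference("customer1", "customer10")); // -1
    }
}
```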
The original question is like this:
public class test {
    public static void main(String[] args){
        int i = '1' + '2' + '3' + "";
        System.out.println(i);
    }
}
and this gives me an error:
Exception in thread "main" java.lang.Error: Unresolved compilation problem:
Type mismatch: cannot convert from String to int
then I changed the code like this:
public class test {
    public static void main(String[] args){
        int i = '1' + '2' + '3';
        System.out.println(i);
    }
}
the output is 150.
but when I write my code like this:
public class test {
    public static void main(String[] args){
        System.out.println('a'+'b'+'c'+"");
    }
}
the output becomes 294.
I wonder why.
The first one does not compile because you concatenate a String at the end, which makes the whole expression a String, and a String can't be assigned directly to an int.
The output of the second one is 150 because the ASCII values for the characters 1, 2 and 3 are 49, 50 and 51, which add up to 150.
The output of the last one is 294 because you are adding the char values from the ASCII table (97+98+99).
You can verify the values for a, b and c (or any other character) in an ASCII table.
Edit: To explain why the last one outputs the correct value instead of throwing an error: you first sum all the values as explained before, then convert the sum to a String by appending "" to it. The println method accepts a String, which is why there is no error.
The first one would work if you wrote int i = Integer.parseInt('1' + '2' + '3' + ""); since that parses the String "150" back into an int.
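A short sketch of that fix, plus the variant you probably want if the goal was the number 123. The class and method names are made up for the example:

```java
class ParseCharSum {
    static int parsedSum() {
        // '1' + '2' + '3' is the int 150; appending "" turns it into "150"
        return Integer.parseInt('1' + '2' + '3' + "");
    }

    static int parsedDigits() {
        // Starting with "" makes every + a concatenation, giving "123"
        return Integer.parseInt("" + '1' + '2' + '3');
    }

    public static void main(String[] args) {
        System.out.println(parsedSum());    // 150
        System.out.println(parsedDigits()); // 123
    }
}
```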
When you do this
int i = '1' + '2' + '3';
Java sums the ASCII codes of the given characters (49 + 50 + 51). The result is 150.
When you add the empty String, the + becomes string concatenation, so the whole expression has type String. A char can be implicitly widened to an int because both are primitive numeric types, but a String is a reference type and cannot be implicitly converted to an int. That's why you get an error.
When you do the println, the char values are first summed as ints, giving 294. The empty String is then appended to that int, which converts the sum to the String "294". The println(String) method then prints it, so the result is 294.
I hope this will help you to understand what is happening :)
The first is impossible because you can't convert String to int this way.
The second works because chars are kind of numbers, so adding chars is adding the numbers they really are. Char '1' is the number 49 (see ASCII table), so the sum is 49+50+51 which is 150.
The third works this way because + is a left-associative operator, which means that 'a'+'b'+'c'+"" should be read as (('a'+'b')+'c')+"". 'a' has ASCII code 97, 'b' 98 and 'c' 99, so you have 294+"". Java then converts the value to a String in order to concatenate the two strings. In the end you have the string 294. Modify your last code to System.out.println('a'+'b'+('c'+"")); and you will see that the result is 195c.
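Here is a small sketch that puts the different groupings side by side (GroupingDemo and its method names are invented for the illustration):

```java
class GroupingDemo {
    // (('a' + 'b') + 'c') + "": three ints are summed first, then converted
    static String leftToRight() {
        return 'a' + 'b' + 'c' + "";
    }

    // ('a' + 'b') + ('c' + ""): only the first two chars are summed
    static String lastGrouped() {
        return 'a' + 'b' + ('c' + "");
    }

    // (("" + 'a') + 'b') + 'c': a String comes first, so every + concatenates
    static String stringFirst() {
        return "" + 'a' + 'b' + 'c';
    }

    public static void main(String[] args) {
        System.out.println(leftToRight()); // 294
        System.out.println(lastGrouped()); // 195c
        System.out.println(stringFirst()); // abc
    }
}
```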
You must note that System.out.println is a method that is used to convert values (of different types) to their String representation. This is always possible as every int can be converted to a String representation of it, but not the converse; not every String is a representation of an int (so Java will not let you do it so simply).
First: [int i = '1' + '2' + '3' + "";]
If you concatenate an empty string value, the result becomes a String object, and a String object can't be converted to int.
Second: [int i = '1' + '2' + '3';]
The binary arithmetic operations on char promote to int. It's equal to:
[int i = 49 + 50 + 51] - total: 150.
Third: [System.out.println('a'+'b'+'c'+"");]
In this case you convert 'a' + 'b' + 'c' (that is, 294) to a String (+"") and then print the result as a String value, which works fine.
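To make the promotion to int visible, here is a hedged sketch (the names are hypothetical). It boxes the result of char + char to show its runtime type, and shows the cast you need to get back to a char:

```java
class PromotionDemo {
    // 'a' + 'b' has type int, so assigning it to Object boxes it as an Integer
    static String typeOfCharSum() {
        Object boxed = 'a' + 'b';
        return boxed.getClass().getSimpleName();
    }

    // c + offset is an int, so an explicit cast is needed to return a char
    static char shift(char c, int offset) {
        return (char) (c + offset);
    }

    public static void main(String[] args) {
        System.out.println(typeOfCharSum());  // Integer
        System.out.println(shift('a', 1));    // b
        System.out.println('1' + '2' + '3');  // 150
    }
}
```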
Can I somehow prepend a minus sign to a numeric String and convert it into an int?
For example:
If I have 2 Strings:
String x="-";
String y="2";
how can I get them converted to an int whose value is -2?
You will first have to concatenate both Strings, since - is not a valid integer by itself. It is, however, acceptable when it's used together with an integer value to denote a negative value.
Therefore this will print -2 the way you want it:
String x = "-";
String y = "2";
int i = Integer.parseInt(x + y);
System.out.println(i);
Note that x + y here concatenates two Strings; it is not an arithmetic operation.
Integer.valueOf("-") will throw a NumberFormatException because "-" by itself isn't a number. If you did "-1", however, you would receive the expected value of -1.
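A small sketch of both cases, with made-up helper names. isInt simply reports whether Integer.parseInt accepts the input:

```java
class SignedParse {
    // Concatenate sign and digits first, then parse the combined String.
    static int parseWithSign(String sign, String digits) {
        return Integer.parseInt(sign + digits);
    }

    // True if Integer.parseInt accepts the string, false if it throws.
    static boolean isInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWithSign("-", "2")); // -2
        System.out.println(isInt("-"));              // false: a lone sign is not a number
        System.out.println(isInt("-1"));             // true
    }
}
```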
If you're trying to get a character code, use the following:
(int) "-".charAt(0);
charAt() returns the char value at a specific index, which is a 16-bit Unicode (UTF-16) code unit that is, for all intents and purposes, an integer.
I have the following class:
public class Go {
    public static void main(String args[]) {
        System.out.println("G" + "o");
        System.out.println('G' + 'o');
    }
}
And this is the output:
Go
182
Why does my output contain a number?
In the second case it adds the unicode codes of the two characters (G - 71 and o - 111) and prints the sum. This is because char is considered as a numeric type, so the + operator is the usual summation in this case.
With the char constants in 'G' + 'o', the + operator adds the character codes and prints the sum; with the Strings in "G" + "o", it is the string concatenation operator and prints Go.
The plus in Java adds two numbers, unless one of the summands is a String, in which case it does string concatenation.
In your second case, you don't have Strings (you have char, and their Unicode code points will be added).
System.out.println("G" + "o");
System.out.println('G' + 'o');
In the first one, + acts as a concatenation operator and concatenates the two strings. But in the second case it acts as an addition operator and adds the ASCII (or, you can say, Unicode) values of those two characters.
This previous SO question should shed some light on the subject. In your case you basically end up adding their ASCII values: (71 for G) + (111 for o) = 182.
You will have to use String.valueOf(char c) to convert that character back to a string.
The "+" operator is defined for both int and String:
int + int = int
String + String = String
When adding char + char, the best match will be :
(char->int) + (char->int) = int
But ""+'a'+'b' will give you ab:
( (String) + (char->String) ) + (char->String) = String
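A runnable sketch of these rules, showing a few ways to get "Go" out of two chars. The class and method names are only for illustration:

```java
class GoDemo {
    // char + char: both operands are promoted to int and added
    static int charSum() {
        return 'G' + 'o';
    }

    // String + char + char: a String comes first, so each + concatenates
    static String viaEmptyString() {
        return "" + 'G' + 'o';
    }

    // Convert the first char to a String explicitly, then concatenate
    static String viaValueOf() {
        return String.valueOf('G') + 'o';
    }

    // Append the chars one by one without using + at all
    static String viaBuilder() {
        return new StringBuilder().append('G').append('o').toString();
    }

    public static void main(String[] args) {
        System.out.println(charSum());        // 182
        System.out.println(viaEmptyString()); // Go
        System.out.println(viaValueOf());     // Go
        System.out.println(viaBuilder());     // Go
    }
}
```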
+ is used for summation (adding two numbers) when both operands are numeric, and for concatenation when at least one operand is a String.
And we know that a char in Java always represents a numeric value.
That's why in your case it actually computes the sum of two numbers, (71+111)=182, and not the concatenation of characters G+o=Go.
If you change one of them to a String then it will concatenate the two,
such as System.out.println('G' + "o");
it will print Go as you expect.