I am developing a small application where I am given a string, e.g. "MRUGDQ", and a shift value, e.g. 3. I then shift each letter to the left by 3 and the result would be "JORDAN". For instance, M would be replaced by J, R would be replaced by O, and so on.
So this is the approach I was thinking of using, but I was wondering whether it is efficient and whether I can improve my solution.
Assumptions I make:
I am assuming my string will contain only capital A to Z letters or small a to z letters, and therefore the ASCII ranges are 65 to 90 and 97 to 122 respectively.
Pseudocode
get ascii value of char
(assume char happens to be between the capital letter range)
add the shift value to the char ascii value
if new ascii value <= 90
replace old letter by new letter
else
int diff = new ascii value - 90
while (new ascii value <= 90) {
decrement diff value by 1
increment new ascii value by 1
}
add remaining diff to 65 and set that as new ascii value
replace old letter by new letter
Do that for each letter in the string.
Please let me know if my approach is correct or if I can be more efficient.
I don't see much to improve, except for your handling of the case where the new char is out of range.
If you'd like to "roll" the overlapping amount back to the beginning of your range, then just calculate: (x % 90) + 64 (*) (with x being the ASCII value after adding the shift value).
Example:
'Y' (89) + 3 = '\' 92
92 % 90 = 2
2 + 64 = 'B' (66)
You need to start from 64, to avoid skipping over 'A', which has the value 65.
(*) The general formula is: (value % upper bound) + (lower bound - 1).
You could also use (value % upper bound + 1) + lower bound.
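Putting the wrap-around together, here's a minimal Java sketch of the left shift described in the question. The method name shiftLeft is my own invention; it works on the 0-25 letter index rather than on raw ASCII values, which avoids the off-by-one juggling around 64/65:

```java
public class Caesar {
    // Shift each capital letter left (towards 'A') by the given amount,
    // wrapping around from 'A' back to 'Z'.
    static String shiftLeft(String s, int shift) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            // Normalize to 0..25, subtract the shift, and wrap with a
            // double modulo so negative intermediate values stay in range.
            int idx = ((c - 'A' - shift) % 26 + 26) % 26;
            out.append((char) ('A' + idx));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft("MRUGDQ", 3)); // JORDAN
    }
}
```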
I'm learning Java through Introduction to Java Programming, 9th Edition by Daniel Liang. In chapter 9, "Strings", I've encountered this piece of code:
public static int hexCharToDecimal(char ch) {
    if (ch >= 'A' && ch <= 'F')
        return 10 + ch - 'A';
    else
        return ch - '0';
}
Can someone explain what just happened here? How is it possible to add and subtract chars and integers, and what's the meaning behind it?
From the Docs
The char data type is a single 16-bit Unicode character.
A char is represented by its code point value:
min: '\u0000' (or 0)
max: '\uffff' (or 65,535)
You can see all of the English alphabetic code points on an ASCII table.
Note that 0 == \u0000 and 65,535 == \uffff, as well as everything in between. They are corresponding values.
A char is actually just stored as a number (its code point value). We have syntax to represent characters like char c = 'A';, but it's equivalent to char c = 65; and 'A' == 65 is true.
So in your code, the chars are being represented by their decimal values to do arithmetic (whole numbers from 0 to 65,535).
For example, the char 'A' is represented by its code point 65 (decimal value in ASCII table):
System.out.print('A'); // prints A
System.out.print((int)('A')); // prints 65 because you cast it to an int
As a note, a short is a 16-bit signed integer, so even though a char is also 16-bits, the maximum integer value of a char (65,535) exceeds the maximum integer value of a short (32,767). Therefore, a cast to (short) from a char cannot always work. And the minimum integer value of a char is 0, whereas the minimum integer value of a short is -32,768.
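The char/short asymmetry can be seen directly (a small sketch):

```java
public class CharShort {
    public static void main(String[] args) {
        char c = '\uffff';           // 65,535: the maximum char value
        short s = (short) c;         // narrowing cast wraps into short's range
        System.out.println((int) c); // 65535
        System.out.println(s);       // -1: the bit pattern reinterpreted as signed
    }
}
```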
For your code, let's say that the char was 'D'. Note that 'D' == 68 since its code point is 68.
return 10 + ch - 'A';
This returns 10 + 68 - 65, so it will return 13.
Now let's say the char was 'Q' == 81.
if (ch >= 'A' && ch <= 'F')
This is false since 'Q' > 'F' (81 > 70), so it would go into the else block and execute:
return ch - '0';
This returns 81 - 48 so it will return 33.
Your function returns an int type, but if it were to instead return a char or have the int casted to a char afterward, then the value 33 returned would represent the '!' character, since 33 is its code point value. Look up the character in ASCII table or Unicode table to verify that '!' == 33 (compare decimal values).
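To see both branches of the book's method in action, here's a self-contained sketch (the wrapping Hex class and the main method are my additions for demonstration):

```java
public class Hex {
    // The method from the book: maps '0'-'9' and 'A'-'F' to 0-15.
    public static int hexCharToDecimal(char ch) {
        if (ch >= 'A' && ch <= 'F')
            return 10 + ch - 'A';   // e.g. 'D' -> 10 + 68 - 65 = 13
        else
            return ch - '0';        // e.g. '7' -> 55 - 48 = 7
    }

    public static void main(String[] args) {
        System.out.println(hexCharToDecimal('D')); // 13
        System.out.println(hexCharToDecimal('7')); // 7
    }
}
```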
This is because char is a primitive type which can be used as a numerical value. Every character in a string is encoded as a specific number (not entirely true in all cases, but good enough for a basic understanding of the matter) and Java allows you to use chars in such a way.
It probably allows this mostly for historical reasons, this is how it worked in C and they probably motivated it with "performance" or something like that.
If you think it's weird then don't worry, I think so too.
The other answer is actually incorrect on one point. ASCII is a specific encoding (an encoding is a specification that says, e.g., "65 = A, 66 = B, ..., 32 = space"), and that is not the one used in Java. A Java char is two bytes wide and is interpreted through Unicode's UTF-16 encoding.
Chars are in turn stored as integers (their code point values), so you can add and subtract them like integers; the result is again a code point value.
Regardless of how Java actually stores the char datatype, what's certain is this: the character 'A' subtracted from the character 'A' is represented as the null character, '\0'. In memory, this means every bit is 0. The amount of memory a char takes up may vary from language to language, but as far as I know the null character is the same in all of them: every bit equal to 0.
As an int value, a piece of memory with every bit equal to 0 represents the integer value of 0.
And as it turns out, when you do "character math", subtracting any alphabetical character from another alphabetical character of the same case flips bits in such a way that, interpreted as an int, they represent the distance between the two characters. Similarly, subtracting the char '0' from any other numeric char yields that char's digit value as an int, for basically the same reason.
'A' - 'A' = '\0'
'a' - 'a' = '\0'
'0' - '0' = '\0'
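These identities can be checked directly in Java (a minimal sketch):

```java
public class CharMath {
    public static void main(String[] args) {
        System.out.println('A' - 'A'); // 0: same character, zero distance
        System.out.println('E' - 'A'); // 4: distance between the two letters
        System.out.println('7' - '0'); // 7: digit value of the numeric char
    }
}
```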
I have one integer number, and my goal is to sum up the digits in that integer.
I tried with charAt(), but here's the weird part: when I check the characters by their index it works well,
but when I try to sum them up,
why is 2 + 2 equal to 100?
Scanner scanner = new Scanner(System.in);
int number = scanner.nextInt();
String string_number = Integer.toString(number);
System.out.println(string_number.charAt(0));
System.out.println(string_number.charAt(1));
System.out.println(string_number.charAt(0) + string_number.charAt(1));
input 22
Output
2
2
100
A char in Java is essentially its Unicode code point. And the code point of '2' is... 0x32, or 50!
And yes, 50 + 50 is 100...
Fortunately, the value of a decimal digit is guaranteed to be c - '0', so what you want is:
System.out.println((string_number.charAt(0) - '0') + (string_number.charAt(1) - '0'));
You are calculating the sum of the ASCII values of the characters.
The ASCII value of the character '2' is 50. You are therefore adding 50 and 50.
You need to convert the character to the number first.
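Putting the c - '0' conversion to work, a minimal digit-sum sketch (the helper name digitSum is my own):

```java
public class DigitSum {
    // Sum the decimal digits of a non-negative number by walking its
    // string form and converting each char with c - '0'.
    static int digitSum(String numberString) {
        int sum = 0;
        for (int i = 0; i < numberString.length(); i++) {
            sum += numberString.charAt(i) - '0';
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(digitSum("22"));   // 4
        System.out.println(digitSum("1234")); // 10
    }
}
```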
I'm trying to re-order some Excel columns using JExcel. I also need to find references to other cells and then re-map them to reference the correct cells. I feel like I've done a lot of the hard work, but I've hit a stumbling block.
I found this code on wikipedia, as linked to from SO:
public static String toBase26(int number){
    number = Math.abs(number);
    String converted = "";
    // Repeatedly divide the number by 26 and convert the
    // remainder into the appropriate letter.
    do {
        int remainder = number % 26;
        converted = (char)(remainder + 'A') + converted;
        number = (number - remainder) / 26;
    } while (number > 0);
    return converted;
}
But when I run the number 35 through it, this is what happens:
number = 35
remainder = 9
converted = (char)(9 + 'A') + "" = "J"
number = (35 - 9) / 26 = 1
1 > 0, so the loop runs again:
remainder = 1
(char)(1 + 'A') = 'B'
converted = (char)(1 + 'A') + "J" = "BJ"
Which is, in a way, expected, since base-10 35 is base-26 "19" (digits 1 and 9, mapped to B and J). But I actually want to refer to column AJ.
I can't untangle what change I need to make to get the right letters out. Whenever I try to work it out on paper, I end up ruining the previously extracted letters. For instance, I don't think subtracting 1 first would work, as I'd end up with a remainder of 8 the first time, which would be converted into I, unless I've missed something?
Any help on this would be greatly appreciated. I've looked around and wasted enough time on this; I just want some help to get it to work.
The stumbling block behind this 'hexavigesimal' system is that it has a zero, but the units column skips the zero and ranges only over A-Z. Consider the following conversion from decimal:
A 1 (0*26 + 1)
...
Z 26 (0*26 + 26)
AA 27 (1*26 + 1)
...
AZ 52 (1*26 + 26)
BA 53 (2*26 + 1)
...
BZ 78 (2*26 + 26)
CA 79 (3*26 + 1)
...
ZZ 702 (26*26 + 26)
AAA 703 (1*26*26 + 1*26 + 1)
See the problem? There are missing 'zeroes' in the hexavigesimal numbers:
00A 1
...
00Z 26
0AA 27
...
0AZ 52
0BA 53
...
0BZ 78
0CA 79
...
0ZZ 702 (26*26 + 26)
AAA 703 (1*26*26 + 1*26 + 1)
However, the units column does NOT have the zeroes ever!
Obviously we do not print these zeroes, but it should aid your understanding of what is going wrong.
Here's our algorithm. I wrote the algorithm under the assumption that decimal 0 = hexavigesimal A, 1 -> B, 25 -> Z, 26 -> AA and so on, because it makes it easier for me to wrap my head around. If this isn't the assumption you want, just subtract 1 before running the code :)
0. If number <= 0, return.
1. Modulo by 26. Convert 0-25 to 'A'-'Z'. // This is our units column.
Loop {
2. Divide the number by 26 (integer division rounding down).
3. If number <= 0, return.
4. Modulo by 26. Convert 0-25 to 'Z','A'-'Y'. // This is our next column (prepend to string output).
}
Example
Converting decimal 730 -> ABC hexavigesimal
Modulo of 730 by 26 = 2 -> 'C' for units column
Divide 730 by 26 = 28
Modulo 28 by 26 = 2 -> 'B' for tens column
Divide 28 by 26 = 1
Modulo 1 by 26 = 1 -> 'A' for hundreds column
Divide 1 by 26 = 0
The number is now 0, therefore return 'ABC'
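Following that algorithm, a corrected Java version for Excel-style column names might look like this. It's a sketch under the 0-based assumption (0 -> A, 25 -> Z, 26 -> AA, 35 -> AJ, matching the question's expectation for 35); the method name toColumnName is mine:

```java
public class Columns {
    // Convert a 0-based column index to an Excel-style column name:
    // 0 -> "A", 25 -> "Z", 26 -> "AA", 35 -> "AJ".
    static String toColumnName(int index) {
        StringBuilder name = new StringBuilder();
        while (index >= 0) {
            name.insert(0, (char) ('A' + index % 26));
            // Subtract 1 after dividing: every column beyond the units
            // place is effectively 1-based, since there is no "zero" letter.
            index = index / 26 - 1;
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(toColumnName(35)); // AJ
    }
}
```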
Here is a simple Python function to compute the hexavigesimal representation of a number (in an arbitrary base), where a is equal to 1 (not 0).
The tricky part of the problem is that at each step you're taking between 1 and the base off the remainder, so you need to account for that in your modulo. The code below accounts for it by subtracting 1 from the number each time. Then 0 becomes a very convenient end condition, because you cannot represent 0 in hexavigesimal (the wikipedia entry denotes it λ).
# Formats a number as a bijective base-N string.
def bijective(n, base):
    chars = ''
    while n != 0:
        chars = chr((n - 1) % base + 97) + chars
        n = (n - 1) // base  # integer (floor) division
    return chars

# Examples!
if __name__ == '__main__':
    base = 26
    for n in range(1, 2 * base * base):
        print('{}: {}'.format(n, bijective(n, base)))
See it in action on pythonanywhere.
I included a javascript version in this gist.
I am just trying to learn bitwise / shift operations.
I came across the program below but don't understand the AND condition part, (checker & (1 << val)). When will the final value be greater than 0? Can someone please explain what's happening there?
Sample input:
xyzz
Sample output:
8388608Value
0checker
0final value
16777216Value
8388608checker
0final value
33554432Value
25165824checker
0final value
33554432Value
58720256checker
33554432final value
public static boolean isUniqueChars(String str) {
    int checker = 0;
    for (int i = 0; i < str.length(); i++) {
        int val = str.charAt(i) - 'a';
        System.out.println((1 << val) + "Value");
        System.out.println((checker) + "checker");
        System.out.println(((checker & (1 << val))) + "final value\n");
        if ((checker & (1 << val)) > 0) {
            return false;
        } else {
            checker = checker | (1 << val);
        }
    }
    return true;
}
OK, just to make sure you know what's going on:
int val = str.charAt(i) - 'a';
Assuming the English alphabet, this is taking the char value of your (lowercase) letter and subtracting 97 (the char value of 'a') to produce a number between 0 and 25 inclusive. Don't try this function on uppercase characters; you'll get wrong results unless you add a .toLowerCase() after the .charAt(i).
1 << val is bit-shifting 1 val places to the left. For instance, for 'x' (120 - 97 = 23, so 1 << 23), the binary representation would be 00000000100000000000000000000000
OK, with me so far?
At the start, checker has all 0 bits, so it's 00000000000000000000000000000000
So... let's put in our numbers instead of our variables. For our x check, checker & (1 << val) becomes 00000000000000000000000000000000 & 00000000100000000000000000000000, which equals 00000000000000000000000000000000 because bit 23 isn't set in checker.
So, once x is processed, we add bit 23 to checker and move on to the next letter: y. This time, checker & (1 << val) becomes 00000000100000000000000000000000 & 00000001000000000000000000000000, which equals 00000000000000000000000000000000 because bit 24 isn't set in checker.
For the first z, checker & (1 << val) becomes 00000001100000000000000000000000 & 00000010000000000000000000000000, which equals 00000000000000000000000000000000 because bit 25 isn't set in checker.
For the second z, checker & (1 << val) becomes 00000011100000000000000000000000 & 00000010000000000000000000000000, which equals 00000010000000000000000000000000 (decimal 33554432, or 2^25) because bit 25 is set in checker, so the > 0 check is now true and the function returns false.
I think what your function does is check if all characters in the input string are different. It returns false iff the same (lower case) character appears more than once.
The checker variable serves as a kind of bit map that accumulates which characters have appeared so far. Data type int consists of 32 bits which is enough to assign one bit per each character (26).
The function loops over all characters of str. The row int val = str.charAt(i) - 'a'; assigns some sort of ordinal value to val depending on the character ('a' => 0, 'b' => 1, 'c' => 2, etc.).
The expression 1 << val assigns each val in the range of (0..25) to its bit position. Therefore, character 'a' is mapped to 1 << 0 == 1 == 00000001, character 'd' is mapped 1 << 3 == 00001000, and so on. Each character is assigned its unique bit mask with exactly one bit set and all other bits cleared.
The expression (checker & (1 << val)) is > 0 exactly iff the bit that is set in 1 << val is also set in checker (note that checker might have more than one bit set). If so, the currently iterated character has appeared earlier, and the function returns false. Otherwise, the bit mask of the current character is added via bitwise OR operator | to the checker that acts as an accumulator. If all characters have been looped over and no character has been met twice, the function returns true. Note that the function might ignore upper-case and other characters.
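Stripped of the debug printing, the same bitmask idea can be sketched like this (I've used != 0 instead of > 0, which also stays correct if the sign bit were ever involved):

```java
public class UniqueChars {
    // Returns true iff no lowercase letter appears twice in str.
    static boolean isUniqueChars(String str) {
        int checker = 0; // one bit per letter 'a'..'z'
        for (int i = 0; i < str.length(); i++) {
            int bit = 1 << (str.charAt(i) - 'a');
            if ((checker & bit) != 0) {
                return false; // this letter's bit was already set
            }
            checker |= bit; // record the letter
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUniqueChars("xyzz"));   // false
        System.out.println(isUniqueChars("jordan")); // true
    }
}
```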
I am trying to assign a number to a letter grade that a user inputs. The user will input a letter such as A, B, or C and then based on what they enter a value is stored in an integer.
I figured the easiest way to do this was to set up an array such as:
char[] grade = {'A','B','C','D','F'};
grade[0] = 4;
grade[1] = 3;
// ... as so on
So, whenever a user inputs 'A' for their grade, I use the 4 when I need to.
I am trying to figure out how to read an input (JOptionPane) and read the letter they enter to the corresponding value I have assigned it. How do I go about parsing the letter input based on my array?
I'm not sure whether I understood you right:
int grade(char input)
{
    return 5 - (input - 'A');
}
Think of it as a graph. In computer encodings, ASCII or UTF-8, the characters A-F are sequentially encoded, with A being the lowest: not 0 or 1, but 65.
5 |  *
4 |     *
3 |        *
2 |           *
1 |              *
0 +-----------------*--->
     A  B  C  D  E  F
     65 66 67 68 69 70
Drawing this graph, I noticed that you jump from D to F. Is that intentional? If not:
If we subtract from 5 the difference between input and 'A', we get 5 - 0 for 'A', 5 - 1 for 'B', and so on. Since we don't want to look up the number for 'A', we use 'A' directly, which is fine, since we can perform arithmetic on characters.
We could as well write
return 70 - input;
or
return 'F' - input;
The standard form of a linear equation is y = mx + n, where n is the y-intercept (70) and m is the gradient (-1, negative in our case).
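Checked directly (the class and method names here are my own, for illustration):

```java
public class GradeLine {
    // 'F' - input implements y = -x + 70 on the code points 65..70.
    static int gradeLinear(char input) {
        return 'F' - input;
    }

    public static void main(String[] args) {
        System.out.println(gradeLinear('A')); // 5
        System.out.println(gradeLinear('F')); // 0
    }
}
```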
It might be easier to just cast the character to an int. A char basically has an int value. Doing this:
int i = (int) 'A';
will yield 65. For a lowercase 'a' it would be 97. You could cast the char to int, then use that value to do bounds checking and some arithmetic. Sequential letters will yield sequential integers. This is safe since you're running on a JVM and don't have to take bizarro character-set orders for different platforms into account.
Apart from that, seeing how you have limited allowed inputs, a map could work well too:
Map<Character, Integer> grades = new HashMap<Character, Integer>();
grades.put('A', 4); // optionally also: grades.put('a', 4);
...
Type parameters, auto-boxing, and unboxing make this a lot more convenient these days.
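A fuller sketch of the map approach (the A=4, B=3 values come from the question; the remaining values extrapolate that pattern, and the getOrDefault fallback of -1 for unknown letters is my own choice):

```java
import java.util.HashMap;
import java.util.Map;

public class Grades {
    private static final Map<Character, Integer> GRADES = new HashMap<>();
    static {
        GRADES.put('A', 4);
        GRADES.put('B', 3);
        GRADES.put('C', 2);
        GRADES.put('D', 1);
        GRADES.put('F', 0);
    }

    // Look up a grade letter in either case; -1 flags an unrecognized letter.
    static int gradeValue(char input) {
        return GRADES.getOrDefault(Character.toUpperCase(input), -1);
    }

    public static void main(String[] args) {
        System.out.println(gradeValue('a')); // 4
        System.out.println(gradeValue('F')); // 0
    }
}
```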