Optimizing Project Euler #22

Thanks in advance.
I just solved Project Euler #22, a problem involving reading about 5,000 lines of text out of a file and determining the value of a specific name, based on the sum of that String's characters and its alphabetical position.
However, the code takes about 5-10 seconds to run, which is a bit annoying. What is the best way to optimize this code? I'm currently using a Scanner to read the file into a String. Is there another, more efficient way to do this? (I tried using a BufferedReader, but that was even slower)
public static int P22(){
String s = null;
try{
//create a new Scanner to read file
Scanner in = new Scanner(new File("names.txt"));
while(in.hasNext()){
//add the next line to the string
s+=in.next();
}
}catch(Exception e){
}
//this just filters out the quotation marks surrounding all the names
String r = "";
for(int i = 0;i<s.length();i++){
if(s.charAt(i) != '"'){
r += s.charAt(i);
}
}
//splits the string into an array, using the commas separating each name
String text[] = r.split(",");
Arrays.sort(text);
int solution = 0;
//go through each string in the array, summing its characters
for(int i = 0;i<text.length;i++){
int sum = 0;
String name = text[i];
for(int j = 0;j<name.length();j++){
sum += (int)name.charAt(j)-64;
}
solution += sum*(i+1);
}
return solution;
}

If you're going to use Scanner, why not use it for what it's supposed to do (tokenisation)?
Scanner in = new Scanner(new File("names.txt")).useDelimiter("[\",]+");
ArrayList<String> text = new ArrayList<String>();
while (in.hasNext()) {
text.add(in.next());
}
Collections.sort(text);
You do not need to strip quotes, or split on commas - Scanner does it all for you.
This snippet, including java startup time, executes in 0.625s (user time) on my machine. I suspect it should be a bit faster than what you were doing.
EDIT OP asked what the string passed to useDelimiter was. It's a regular expression. When you strip out the escaping required by Java to include a quote character into a string, it's [",]+ - and the meaning is:
[...] character class: match any of these characters, so
[",] match a quote or a comma
...+ one or more occurrence modifier, so
[",]+ match one or more of quotes or commas
Sequences that would match this pattern include:
"
,
,,,,
""",,,",","
and indeed ",", which is what we were going after here.
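For completeness, here is a minimal sketch of how the tokenising approach above might be carried through to the final score. The class and method names (NameScores, totalScore) are made up for illustration, and it assumes the input contains only upper-case names, quotes and commas with no stray whitespace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class, not part of the original answer.
class NameScores {
    // Sums the name scores for input in the names.txt format: "MARY","ANNA",...
    static long totalScore(Scanner in) {
        in.useDelimiter("[\",]+");           // runs of quotes and commas separate tokens
        List<String> names = new ArrayList<>();
        while (in.hasNext()) {
            names.add(in.next());
        }
        Collections.sort(names);

        long total = 0;
        for (int i = 0; i < names.size(); i++) {
            int letterSum = 0;
            for (char c : names.get(i).toCharArray()) {
                letterSum += c - 'A' + 1;    // 'A' counts as 1, 'B' as 2, ...
            }
            total += (long) (i + 1) * letterSum; // position is 1-based
        }
        return total;
    }
}
```

Called as NameScores.totalScore(new Scanner("\"MARY\",\"ANNA\",\"COLIN\"")), this returns 307 (ANNA 30 x 1, COLIN 53 x 2, MARY 57 x 3). For the real problem you would pass new Scanner(new File("names.txt")) instead.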

I suggest running your code with a profiler. It lets you see which part is really slow (I/O, computation, etc.). If I/O is slow, look into NIO: http://docs.oracle.com/javase/1.4.2/docs/guide/nio/.

Appending strings in a loop with '+', like you do here:
/* That's actually not the problem since there is only one line. */
while(in.hasNext()){
//add the next line to the string
s+=in.next();
}
is slow, because it has to create a new string and copy everything around in each iteration. Try using a StringBuilder,
StringBuilder sb = new StringBuilder();
while(in.hasNext()){
sb.append(in.next());
}
s = sb.toString();
But, you shouldn't really read the file contents into a String, you should create a String[] or an ArrayList<String> from the file contents directly,
int names = 5000; // use the correct number of lines in the file!
String[] sa = new String[names];
for(int i = 0; i < names; ++i){
sa[i] = in.next();
}
However, upon checking, it turns out that the file does not contain about 5000 lines, rather, it is all on a single line, so your big problem is actually
/* This one is the problem! */
String r = "";
for(int i = 0;i<s.length();i++){
if(s.charAt(i) != '"'){
r += s.charAt(i);
}
}
Use a StringBuilder for that. Or, make your Scanner read until the next ',' and read directly into an ArrayList<String> and just remove the double quotes from each single name in the ArrayList.
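As a rough sketch of the first suggestion, the quote-filtering loop could be rewritten around a StringBuilder like this. The class and method names are invented for the example.

```java
// Hypothetical helper, shown only to illustrate the StringBuilder variant.
class QuoteStripper {
    // Returns s with every double-quote character removed.
    static String stripQuotes(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '"') {
                sb.append(c);   // appends in place, no new String per character
            }
        }
        return sb.toString();
    }
}
```

Each append writes into the builder's internal buffer, so the loop does work proportional to the input length instead of copying the whole accumulated string on every character.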

5+ seconds is quite slow for this problem. My entire web application (600 Java classes) compiles in four seconds. The root of your problem is probably the allocation of a new String for every character in the file: r += s.charAt(i)
To really speed this up, you should not use Strings at all. Get the file size, and read the whole thing into a byte array in a single I/O call:
public class Names {
private byte[] data;
private class Name implements Comparable<Name> {
private int start; // index into data
private int length;
public Name(int start, int length) { ...; }
public int compareTo(Name arg0) {
...
}
public int score()
}
public Names(File file) throws Exception {
data = new byte[(int) file.length()];
new FileInputStream(file).read(data, 0, data.length);
}
public int score() {
SortedSet<Name> names = new ...
for (int i = 0; i < data.length; ++i) {
// find limits of each name, add to the set
}
// Calculate total score...
}
}

Depending on the application, StreamTokenizer is often measurably faster than Scanner. Examples comparing the two may be found here and here.
Addendum: Euler Project 22 includes deriving a kind of checksum of the characters in each token encountered. Rather than traversing the token twice, a custom analyzer could combine the recognition and calculation. The result would be stored in a SortedMap<String, Integer> for later iteration in finding the grand total.
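One way the addendum's idea might look, as a sketch only: a hand-written single pass (not StreamTokenizer itself) that builds each name and its letter sum together and stores them in a SortedMap. The names OnePassNames, scan and total are hypothetical, and the sketch assumes every name in the input is distinct, since a map keeps only one entry per key.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical single-pass analyzer; assumes distinct, upper-case names.
class OnePassNames {
    // Maps each name to the sum of its letter values (A=1 ... Z=26).
    static SortedMap<String, Integer> scan(String input) {
        SortedMap<String, Integer> sums = new TreeMap<>();
        StringBuilder name = new StringBuilder();
        int sum = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                name.append(c);
                sum += c - 'A' + 1;          // accumulate while recognising
            } else if (c == ',') {
                if (name.length() > 0) {
                    sums.put(name.toString(), sum);
                }
                name.setLength(0);
                sum = 0;
            }
            // quotes and any other characters are skipped
        }
        if (name.length() > 0) {
            sums.put(name.toString(), sum);  // last name has no trailing comma
        }
        return sums;
    }

    // Walks the map in alphabetical order and weights each sum by its rank.
    static long total(SortedMap<String, Integer> sums) {
        long total = 0;
        int position = 1;
        for (int letterSum : sums.values()) {
            total += (long) position++ * letterSum;
        }
        return total;
    }
}
```

For the sample input "MARY","ANNA","COLIN" the map holds ANNA=30, COLIN=53, MARY=57 and total returns 307.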

An obtuse solution which you may find interesting.
long start = System.nanoTime();
long sum = 0;
int runs = 10000;
for (int r = 0; r < runs; r++) {
FileChannel channel = new FileInputStream("names.txt").getChannel();
ByteBuffer bb = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
TLongArrayList values = new TLongArrayList();
long wordId = 0;
int shift = 63;
while (true) {
int b = bb.remaining() < 1 ? ',' : bb.get();
if (b == ',') {
values.add(wordId);
wordId = 0;
shift = 63;
if (bb.remaining() < 1) break;
} else if (b >= 'A' && b <= 'Z') {
shift -= 5;
long n = b - 'A' + 1;
wordId = (wordId | (n << shift)) + n;
} else if (b != '"') {
throw new AssertionError("Unexpected ch '" + (char) b + "'");
}
}
values.sort();
sum = 0;
for (int i = 0; i < values.size(); i++) {
long wordSum = values.get(i) & ((1 << 8) - 1);
sum += (i + 1) * wordSum;
}
}
long time = System.nanoTime() - start;
System.out.printf("%d took %.3f ms%n", sum, time / 1e6);
prints
XXXXXXX took 27.817 ms.

Related

Fixing a looping issue for removing letters in a String

So I'm making a program that removes duplicate letters in a string. The last step of it is updating the old string to the new string, and looping through the new string. I believe everything works besides the looping through the new string part. Any ideas what might be causing it to not work? It works as intended for one pass, and after that it won't step through the new string.
public class homework20_5 {
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
int i;
for (i = 0; i < kb.length(); i++) {
char temp = kb.charAt(i);
if(temp == kb.charAt(i+1)) {
kb = kb.replace(""+temp, "");
i = kb.length() + i;
}
}
System.out.println(kb);
}
}

Instead of using complex algorithms and loops like this, you can just use a LinkedHashSet, which keeps elements in insertion order like a list but won't allow any duplicate elements.
private static String removeDuplicateWords(String str) {
HashSet<Character> xChars = new LinkedHashSet<>();
for(char c: str.toCharArray()) {
xChars.add(c);
}
StringBuilder sb = new StringBuilder();
for (char c: xChars) {
sb.append(c);
}
return sb.toString();
}

So you actually want to remove all occurrences that appear more than once entirely and not just the duplicate appearances (while preserving one instance)?
"Yea that’s exactly right "
In that case your idea won't cut it because your duplicate letter detection can only detect continuous sequences of duplicates. A very simple way would be to use 2 sets in order to identify unique letters in one pass.
public class RemoveLettersSeenMultipleTimes {
public static void main(String []args){
String input = "abcabdgag";
Set<Character> lettersSeenOnce = lettersSeenOnceIn(input);
StringBuilder output = new StringBuilder();
for (Character c : lettersSeenOnce) {
output.append(c);
}
System.out.println(output);
}
private static Set<Character> lettersSeenOnceIn(String input) {
Set<Character> seenOnce = new LinkedHashSet<>();
Set<Character> seenMany = new HashSet<>();
for (Character c : input.toCharArray()) {
if (seenOnce.contains(c)) {
seenMany.add(c);
seenOnce.remove(c);
continue;
}
if (!seenMany.contains(c)) {
seenOnce.add(c);
}
}
return seenOnce;
}
}

There are a few problems here:
Problem 1
for (i = 0; i < kb.length(); i++) {
should be
for (i = 0; i < kb.length() - 1; i++) {
Because this
if (temp == kb.charAt(i+1))
will explode with a StringIndexOutOfBoundsException otherwise.
Problem 2
Delete this line:
i = kb.length() + i;
I don't understand what the intention is there, but nevertheless it must be deleted.
Problem 3
Rather than lots of code, there's a one-line solution:
String deduped = kb.replaceAll("[" + kb.replaceAll("(.)(?=.*\\1)|.", "$1") + "]", "");
This works by:
finding all dupe chars via kb.replaceAll("(.)(?=.*\\1)|.", "$1"), which in turn works by consuming every character, either capturing it as group 1 if it has a dupe later in the string or just consuming it if not
building a regex character class from the dupes, which is used to delete them all (replace with a blank)
Note that this fails if the input has no duplicates at all, because the resulting empty character class [] is not a valid regex.

Say you feed the program with the input "AAABBC", then the expected output should be "ABC".
In the for-loop, i starts at 0.
During the 1st iteration:
kb.replace removes every A, so kb becomes BBC (length 3); i becomes 3 + 0 = 3 and is then incremented to 4.
The loop condition i < kb.length() is now 4 < 3, which is false. Hence the for-loop ends after just one iteration.
So the problematic line of code is i = kb.length() + i; and also the loop condition keeps changing as the size of kb changes.
I would suggest using a while loop like the following example if you don't worry too much about the efficiency.
public static void main(String[] arg) {
String kb = "XYYYXAC";
int i = 0;
while (i < kb.length()) {
char temp = kb.charAt(i);
for (int j = i + 1; j < kb.length(); j++) {
char dup = kb.charAt(j);
if (temp == dup) {
kb = removeCharByIndex(kb, j);
j--;
}
}
i++;
}
System.out.println(kb);
}
private static String removeCharByIndex(String str, int index) {
return new StringBuilder(str).deleteCharAt(index).toString();
}
Output: XYAC
EDIT: I misunderstood your requirements. So looking at the above comments, you want all the duplicates and the target character removed. So the above code can be changed like this.
public static void main(String[] arg) {
String kb = "XYYYXAC";
int i = 0;
while (i < kb.length()) {
char temp = kb.charAt(i);
boolean hasDup = false;
for (int j = i + 1; j < kb.length(); j++) {
if (temp == kb.charAt(j)) {
hasDup = true;
kb = removeCharByIndex(kb, j);
j--;
}
}
if (hasDup) {
kb = removeCharByIndex(kb, i);
i--;
}
i++;
}
System.out.println(kb);
}
private static String removeCharByIndex(String str, int index) {
return new StringBuilder(str).deleteCharAt(index).toString();
}
Output: AC
Although, this is not the best and definitely not an efficient solution to this, I think you can get the idea of iterating the input string character by character and removing it if it has duplicates.

The following answer concerns only the transformation of XYYYXACX to ACX. If we wanted to have AC, it's a whole different answer. The other answers already speak about it, and I'll invite you to consult the contains method of String too.
We should consider avoiding -most of the time- modifying the things we iterate. Using a temporary variable could be a kind of solution. To use it, we could change our mindset. Instead of erasing the undesired letters, we can save the ones we want.
To identify the desired character, we need to test if all surrounding letters are different from the tested one. It'll be the opposite of what you did with if(temp == kb.charAt(i+1)) { like if(temp != kb.charAt(i+1)) {. But considering that the tested string will not change anymore, we will need to test the previous letter too as if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {.
As previously said, once we have identified the letter, we will keep the value with a temporary variable. That will lead to replace kb = kb.replace(""+temp, ""); by buffer = buffer + temp; if buffer is our temporary variable initialized with an empty string (Aka. String buffer = "";). In the end, we could override our base value with the temporary one.
At this step, we will have:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
String buffer = "";
int i;
for (i = 1; i < kb.length(); i++) {
char temp = kb.charAt(i);
if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {
buffer = buffer + temp;
}
}
kb = buffer;
System.out.println(kb);
}
That'll sadly not work, trying to access invalid indexes of our string. We should consider two particular behavior for the first and the last letter because they are close to only one letter. For these letters, we will have only one comparison. So, we can make them inside or outside the loop. For clarity, we will do it outside.
For the first letter, the check will look like if (kb.charAt(0) != kb.charAt(1)) {, and for the last one like if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {. The body of the condition will remain the same as the one in the loop.
Once done, we will reduce the scope of our loop to exclude these character with for (i = 1; i < (kb.length() - 1); i++) {.
Now we will have something working, but only for one iteration:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if(temp != kb.charAt(i-1) && temp != kb.charAt(i+1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
kb = buffer;
System.out.println(kb);
}
XYYYXACX will become XXACX.
Once said, our index problem can occur again if the string has only one letter. However, all of this would have been useless because obviously, we can't have a duplicate letter in this situation. As a fact, we should wrap the whole thing to ensure that we have at least two letters:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
if (kb.length() >= 2) {
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if (temp != kb.charAt(i - 1) && temp != kb.charAt(i + 1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
kb = buffer;
}
System.out.println(kb);
}
The last thing to do is repeat this treatment until no undesired letters remain. For this task, do { ... } while ( ... ) seems perfect. We can use the size of the string as the loop condition: when the temporary variable has the same length as the string from the previous iteration, we know that we have finished.
We need to perform this comparison before assigning the temporary variable to the base one. Otherwise, the two will always be the same.
In the end, the following thing should be a potential solution:
public static void main(String[] arg) {
Scanner scanner = new Scanner(System.in);
String kb = scanner.nextLine();
boolean modified;
do {
modified = false;
if (kb.length() >= 2) {
String buffer = "";
int i;
if (kb.charAt(0) != kb.charAt(1)) {
buffer = buffer + kb.charAt(0);
}
for (i = 1; i < (kb.length() - 1); i++) {
char temp = kb.charAt(i);
if (temp != kb.charAt(i - 1) && temp != kb.charAt(i + 1)) {
buffer = buffer + temp;
}
}
if (kb.charAt(kb.length() - 1) != kb.charAt(kb.length() - 2)) {
buffer = buffer + kb.charAt(kb.length() - 1);
}
modified = (kb.length() != buffer.length());
kb = buffer;
}
} while (modified);
System.out.println(kb);
}
Take note that this code is ugly for the sole purpose of the explanation. We should refactor this code. We can improve it a lot for the sake of brevity and, why not, performance.
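As one possible direction for that refactoring, here is a sketch that folds the first-letter, last-letter and short-string special cases into bounds checks inside a single loop. The class and method names are made up, and the behaviour is intended to match the step-by-step version above (a letter is dropped when it equals either neighbour, and passes repeat until nothing changes).

```java
// Hypothetical refactoring of the walkthrough above; names are illustrative.
class AdjacentRuns {
    // Repeatedly drops every letter that equals one of its neighbours.
    static String removeAdjacentRuns(String s) {
        String current = s;
        while (true) {
            StringBuilder kept = new StringBuilder();
            for (int i = 0; i < current.length(); i++) {
                char c = current.charAt(i);
                boolean samePrev = i > 0 && current.charAt(i - 1) == c;
                boolean sameNext = i + 1 < current.length()
                        && current.charAt(i + 1) == c;
                if (!samePrev && !sameNext) {
                    kept.append(c);          // keep letters with no equal neighbour
                }
            }
            if (kept.length() == current.length()) {
                return current;              // nothing was removed: finished
            }
            current = kept.toString();
        }
    }
}
```

With the input XYYYXACX the first pass yields XXACX, the second ACX, and the third changes nothing, so ACX is returned. Empty and one-letter strings pass through unchanged because the bounds checks make both neighbour tests false.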

Confusing Behavior with Array Initialization -- Processing (Java)

I'm encountering some confusing behavior when trying to create an array of a certain length. The length is obtained by a file read in the function getNumTerms. When I try to create an array of this size, my code runs in a strange order, my i values are skewed, and the code generally doesn't run as intended. When I instead create an array of a set integer amount, the code runs as intended without error.
double[] terms;
int numTerms = getNumTerms(lines[0]);
terms = new double[numTerms];
int i = 1;
for (i = 1; i<terms.length; i++){
//terms[i] = calculateTerm(T, lines[i]);
}
the above code runs incorrectly.
double[] terms;
int numTerms = getNumTerms(lines[0]);
int myNum = 200;
terms = new double[myNum];
int i = 1;
for (i = 1; i<terms.length; i++){
//terms[i] = calculateTerm(T, lines[i]);
}
the above code runs correctly
int getNumTerms(String line){
int i = 60;
int j = 0;
char[] word;
word = new char[4];
int numTerms;
int numTermLen = 0;
while(new String(word).compareTo("TERM") != 0){
for (j=0; j<4; j++){
word[j] = line.charAt(i + j);
}
i++;
}
j = i - 3;
while( new Character(line.charAt(j)).equals(' ') == false){
numTermLen++;
j--;
}
j++;
println("i in here: ", i);
numTerms = Integer.parseInt(line.substring(j, j + numTermLen));
return numTerms;
}
this is the function that I'm calling to read the numterms for the size of the array in the first example that doesn't work correctly.
When I use the function call to set the size of array terms[], i starts at some value like 380, and the iteration through array lines[] begins somewhere in the middle of the array.
When I use the integer myNum to set the size of array terms[], i starts at 1, and the iteration through array lines[] begins at the first line, as intended.
Any explanation is appreciated! I'm new to coding in java and am confused by the source of this error.
Thanks in advance.

Without seeing the text it's hard to deduce where in your getNumTerms the error occurs.
You can make use of String's indexOf() to find the index of "TERM" and substring() to extract the String containing the integer value.
As far as I understand the ideal string would have "TERM" followed by an integer then a space character. If these items are found and the value fits within 32 bits you should be able to use something like this:
String line = "LINE START TERM-1238847 LINE END";
int getTerm(String line){
int result = Integer.MAX_VALUE;
final String SEARCH_TOKEN = "TERM";
// look for TERM token and remember index
int termIndex = line.indexOf(SEARCH_TOKEN);
// handle not found
if(termIndex < 0){
System.err.println("error: " + SEARCH_TOKEN + " not found in line");
return result;
}
// move index by the size of the token
termIndex += SEARCH_TOKEN.length();
int spaceIndex = line.indexOf(' ',termIndex);
if(spaceIndex < 0){
System.err.println("error: no SPACE found after " + SEARCH_TOKEN);
return result;
}
// chop string, extracting between token end and first space encountered
String intString = line.substring(termIndex,spaceIndex);
// try to parse int handling error
try{
result = Integer.parseInt(intString);
}catch(Exception e){
System.err.println("error parsing integer from string " + intString);
e.printStackTrace();
}
return result;
}
System.out.println("parsed integer: " + getTerm(line));
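The loop in the question walks backwards from "TERM", which suggests the number may sit before the token rather than after it. If that is the case, a variant along these lines might fit better. This is a sketch under that assumption; the class name, method name and the sample line are invented.

```java
// Hypothetical variant: assumes the count is the last word before "TERM".
class TermCount {
    static int countBeforeTerm(String line) {
        int termIndex = line.indexOf("TERM");
        if (termIndex < 0) {
            throw new IllegalArgumentException("no TERM token in line");
        }
        // Everything before the token, without surrounding whitespace.
        String before = line.substring(0, termIndex).trim();
        // The last space-separated word is taken to be the number.
        int lastSpace = before.lastIndexOf(' ');
        return Integer.parseInt(before.substring(lastSpace + 1));
    }
}
```

For a made-up line such as "HEADER 200 TERM" this returns 200. Integer.parseInt throws a NumberFormatException if the word before the token is missing or is not an integer, so a caller reading real files would probably want to handle that.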

Is converting to String the most succinct way to remove the last comma in output in java?

So basically this is how my code looked like
public static void printPrime(int[] arr)
{
int len = arr.length;
for(int i = 0; i < len; i++)
{
int c = countFactor(arr[i]);
if(c == 2)
{
System.out.print(arr[i] + ",");
}
}
}
So the output did have the 'comma' in the end. I tried looking around to remove the last comma some answers say to print last element separately but that can only happen when output depends on the for loop and not the if condition.
But as you can see I don't know how many elements I am going to get from the if condition. Only two things I can think of, to add another loop or use String then substr to output.
So I converted it to String
public static void printPrime(int[] arr)
{
int len = arr.length;
String str = "";
for(int i = 0; i < len; i++)
{
int c = countFactor(arr[i]);
if(c == 2)
{
str = str + arr[i] + ",";
}
}
str = str.substring(0, str.length()-1);
System.out.println(str);
}
My question is about knowing the optimum way (converting to string then substringing it?) for similar questions or could there be a better way as well? That I seem to be missing.

You don't have to construct a string. Consider the following slight tweaks:
public static void printPrime(int[] arr)
{
int len = arr.length;
String sep = ""; // HERE
for(int i = 0; i < len; i++)
{
int c = countFactor(arr[i]);
if(c == 2)
{
System.out.print(sep); // HERE
sep = ",";
System.out.print(arr[i]);
}
}
}
Print the delimiter first, and store its value in a variable: the first time it's printed, it will print the empty string. Thereafter, it prints the comma.
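If building the text before printing is acceptable, java.util.StringJoiner (Java 8 and later) takes care of the separator bookkeeping. The sketch below is only an illustration: isPrime is a hypothetical stand-in for the question's countFactor(arr[i]) == 2 check, and the class and method names are invented.

```java
import java.util.StringJoiner;

// Illustrative sketch; isPrime stands in for the question's countFactor test.
class PrimeJoiner {
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Joins the primes in arr with commas; no leading or trailing comma.
    static String joinPrimes(int[] arr) {
        StringJoiner joiner = new StringJoiner(",");
        for (int value : arr) {
            if (isPrime(value)) {
                joiner.add(Integer.toString(value));
            }
        }
        return joiner.toString();
    }
}
```

joinPrimes(new int[] {2, 3, 4, 5, 6}) gives "2,3,5", and an empty array gives an empty string, so no special case is needed for zero matches.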

Whatever means you use should operate correctly for an empty array (length 0), a singleton array (length 1) and a long array (a large length).
Adding the comma then removing it requires special case handling for the empty array case. So you must have conditional code (an if statement) whatever you do.

Counting the letters (uppercase and lowercase) of a string

I have here a program that reads a paragraph and writes it into a file. After that, it should count the occurrences of each letter (case sensitive). However, it doesn't count the number of letter occurrences. I think I put the for loop in the wrong place.
import java.io.*;
import java.util.*;
public class Exercise1 {
public static int countLetters (String line, char alphabet) {
int count = 0;
for (int i = 0; i <= line.length()-1; i++) {
if (line.charAt(i) == alphabet)
count++;
}
return count;
}
public static void main(String[] args) throws IOException {
BufferedReader buffer = new BufferedReader (new InputStreamReader(System.in));
PrintWriter outputStream = null;
Scanner input = new Scanner (System.in);
int total;
try {
outputStream = new PrintWriter (new FileOutputStream ("par.txt"));
System.out.println("How many lines are there in the paragraph you'll enter?");
int lines = input.nextInt();
System.out.println("Enter the paragraph: ");
String paragraph = buffer.readLine();
outputStream.println(paragraph);
int j;
for (j = 1; j<lines; j++) {
paragraph = buffer.readLine();
outputStream.println(paragraph);
}
outputStream.close();
System.out.println("The paragraph is written to par.txt");
for (int k=1; k<lines; k++) {
paragraph = buffer.readLine();
total = countLetters (paragraph, 'A');
if (total != 0)
System.out.println("A: "+total);
//I'll do bruteforce here up to lowercase z
}
}
catch(FileNotFoundException e) {
System.out.println("Error opening the file par.txt");
}
}
}
Please help me fix the code. I'm new in programming and I need help. Thank you very much!

First, reading the first line separately and then entering the for loop for the rest is redundant. This is not a bug, but the code can be simplified.
// your code
String paragraph = buffer.readLine();
outputStream.println(paragraph);
int j;
for (j = 1; j<lines; j++) {
paragraph = buffer.readLine();
outputStream.println(paragraph);
}
You can just put them in the loop:
// better code
String paragraph;
int j;
for (j = 0; j<lines; j++) {
paragraph = buffer.readLine();
outputStream.println(paragraph);
}
Then your first problem comes from the way you read the lines:
// your code - not working
outputStream.close();
for (int k=1; k<lines; k++) {
paragraph = buffer.readLine();
total = countLetters (paragraph, 'A');
Consider what happened above:
The input is already DONE, the output is already written and stream is closed - up to here everything is good
Then when you try to count the number of characters, you do: paragraph = buffer.readLine(); - what does this code do? It waits for another user input (instead of reading what's been inserted)
To fix the problem above: you need to read from what's already been written - not asking for another input. Then instead of brute forcing every character one by one, you can just put them into a list and write a for loop.
So now, you want to read from the existing file that you already created (ie. reading what WAS inputted by the user):
BufferedReader fileReader = new BufferedReader(new FileReader(new File("par.txt")));
String allCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
String aLineInFile;
// Read the file that was written earlier (whose content comes from user input)
// This while loop will go through line-by-line in the file
while((aLineInFile = fileReader.readLine()) != null)
{
// For every line in the file, count number of occurrences of characters
// This loop goes through every character (a-z and A-Z)
for(int i = 0; i < allCharacters.length(); i++)
{
// For each single character, check the number of occurrences in the current line
char charToLookAt = allCharacters.charAt(i);
int numOfCharOccurancesInLine = countLetters (aLineInFile, charToLookAt);
System.out.println("For line: " + aLineInFile + ", Character: " + charToLookAt + " appears: " + numOfCharOccurancesInLine + " times " );
}
}
The above gives you the number of occurrences of every character in every line - now you just need to organize them to keep track of how many are in total for the whole file.
Code-wise, there might be better way to write this to have cleaner implementation, but the above is easy to understand (and I just wrote it very quickly).

Do everything in one loop:
for (j = 1; j<lines; j++) {
paragraph = buffer.readLine();
total = countLetters (paragraph, 'A');
if (total != 0)
System.out.println("A: "+total);
outputStream.println(paragraph);
}

You can use a HashMap to count each letter case-sensitively:
final Pattern patt = Pattern.compile("[A-Za-z]");
final HashMap<Character, Integer> tabChar = new HashMap<Character, Integer>(
52);
// replace : paragraph = buffer.readLine();
// Unless you use it outside, you can declare it 'final'
final char[] paragraph = "azera :;,\nApOUIQSaOOOF".toCharArray();
for (final Character c : paragraph ) {
if (Character.isLetter(c)) {
Integer tot = tabChar.get(c);
tabChar.put(c, (null == tot) ? 1 : ++tot);
}
}
Output :
{F=1, A=1, O=4, I=1, U=1, Q=1, S=1, e=1, a=3, r=1, p=1, z=1}
You can use final TreeSet<Character> ts = new TreeSet<>(tabChar.keySet()); to sort the characters and then look each one up with tabChar.get(c).

The previous answers would have solved your problem but another way of avoiding brute force might be to use a loop using ASCII character value.
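A sketch of what such a loop might look like, reusing the countLetters method from the question. The LetterCounts class and the countAll and record methods are hypothetical names, and returning a map instead of printing is just a choice made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch; class and helper names are not from the question.
class LetterCounts {
    // Same counting helper as in the question.
    static int countLetters(String line, char letter) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == letter) {
                count++;
            }
        }
        return count;
    }

    // Counts A-Z, then a-z, keeping only letters that occur at least once.
    static Map<Character, Integer> countAll(String text) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c = 'A'; c <= 'Z'; c++) {
            record(counts, text, c);
        }
        for (char c = 'a'; c <= 'z'; c++) {
            record(counts, text, c);
        }
        return counts;
    }

    private static void record(Map<Character, Integer> counts, String text, char c) {
        int n = countLetters(text, c);
        if (n != 0) {
            counts.put(c, n);
        }
    }
}
```

Because char values are consecutive for A to Z and for a to z, two short loops replace the 52 hand-written calls. For the text "Hello", countAll returns the entries H=1, e=1, l=2, o=1 in that order.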

java Run-length encoding

I have no idea how to start my assignment.
We have to write a run-length encoding program,
for example, the users enters this string:
aaaaPPPrrrrr
is replaced with
4a3P5r
Can someone help me get started with it?

Hopefully this will get you started on your assignment:
The fundamental idea behind run-length encoding is that consecutively occurring tokens like aaaa can be replaced by a shorter form 4a (meaning "the following four characters are an 'a'"). This type of encoding was used in the early days of computer graphics to save space when storing an image. Back then, video cards supported a small number of colors, and images commonly had the same color all in a row for significant portions of the image.
You can read up on it in detail on Wikipedia
http://en.wikipedia.org/wiki/Run-length_encoding
In order to run-length encode a string, you can loop through the characters in the input string. Have a counter that counts how many times you have seen the same character in a row. When you then see a different character, output the value of the counter and then the character you have been counting. If the value of the counter is 1 (meaning you only saw one of those characters in a row) skip outputting the counter.
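The description above could be sketched as follows. Treat it as one possible reading rather than the assignment's required form: the class and method names are invented, and it follows the "skip the counter when it is 1" rule from the paragraph above.

```java
// Illustrative sketch of the loop described above; names are hypothetical.
class RunLength {
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char current = text.charAt(i);
            int count = 1;
            // Count how many times the same character repeats in a row.
            while (i + count < text.length() && text.charAt(i + count) == current) {
                count++;
            }
            if (count > 1) {
                out.append(count);   // a run of length 1 is written without a counter
            }
            out.append(current);
            i += count;              // jump to the start of the next run
        }
        return out.toString();
    }
}
```

encode("aaaaPPPrrrrr") gives "4a3P5r", and encode("abbc") gives "a2bc". Leaving out the 1 keeps the output short, but it makes decoding ambiguous if the input itself can contain digits.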

public String runLengthEncoding(String text) {
String encodedString = "";
for (int i = 0, count = 1; i < text.length(); i++) {
if (i + 1 < text.length() && text.charAt(i) == text.charAt(i + 1))
count++;
else {
encodedString = encodedString.concat(Integer.toString(count))
.concat(Character.toString(text.charAt(i)));
count = 1;
}
}
return encodedString;
}
Try this one out.

This can easily and simply be done using a StringBuilder and a few helper variables to keep track of how many of each letter you've seen. Then just build as you go.
For example:
static String encode(String s) {
StringBuilder sb = new StringBuilder();
char[] word = s.toCharArray();
char current = word[0]; // We initialize to compare vs. first letter
// our helper variables
int index = 0; // tracks how far along we are
int count = 0; // how many of the same letter we've seen
for (char c : word) {
if (c == current) {
count++;
index++;
if (index == word.length)
sb.append(current + Integer.toString(count));
}
else {
sb.append(current + Integer.toString(count));
count = 1;
current = c;
index++;
}
}
return sb.toString();
}
Since this is clearly a homework assignment, I challenge you to learn the approach and not just simply use the answer as the solution to your homework. StringBuilders are very useful for building things as you go, thus keeping your runtime O(n) in many cases. Here using a couple of helper variables to track where we are in the iteration "index" and another to keep count of how many of a particular letter we've seen "count", we keep all necessary info for building our encoded string as we go.

Try this out:
private static String encode(String sampleInput) {
String encodedString = null;
//get the input to a character array.
// String sampleInput = "aabbcccd";
char[] charArr = sampleInput.toCharArray();
char prev=(char)0;
int counter =1;
//compare each element with its next element and
//if same increment the counter
StringBuilder sb = new StringBuilder();
for (int i = 0; i < charArr.length; i++) {
if(i+1 < charArr.length && charArr[i] == charArr[i+1]){
counter ++;
}else {
//System.out.print(counter + Character.toString(charArr[i]));
sb.append(counter + Character.toString(charArr[i]));
counter = 1;
}
}
return sb.toString();
}

Here is my solution in java
public String encodingString(String s){
StringBuilder encodedString = new StringBuilder();
List<Character> listOfChars = new ArrayList<Character>();
Set<String> removeRepeated = new HashSet<String>();
//Adding characters of string to list
for(int i=0;i<s.length();i++){
listOfChars.add(s.charAt(i));
}
//Getting the occurance of each character and adding it to set to avoid repeated strings
for(char j:listOfChars){
String temp = Integer.toString(Collections.frequency(listOfChars,j))+Character.toString(j);
removeRepeated.add(temp);
}
//Constructing the encodingString.
for(String k:removeRepeated){
encodedString.append(k);
}
return encodedString.toString();
}

import java.util.Scanner;
/**
* @author jyotiv
*
*/
public class RunLengthEncoding {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
System.out.println("Enter line to encode:");
Scanner s=new Scanner(System.in);
String input=s.nextLine();
int len = input.length();
int i = 0;
int noOfOccurencesForEachChar = 0;
char storeChar = input.charAt(0);
String outputString = "";
for(;i<len;i++)
{
if(i+1<len)
{
if(input.charAt(i) == input.charAt(i+1))
{
noOfOccurencesForEachChar++;
}
else
{
outputString = outputString +
Integer.toString(noOfOccurencesForEachChar+1) + storeChar;
noOfOccurencesForEachChar = 0;
storeChar = input.charAt(i+1);
}
}
else
{
outputString = outputString +
Integer.toString(noOfOccurencesForEachChar+1) + storeChar;
}
}
System.out.println("Encoded line is: " + outputString);
}
}
I have tried this one. It will work for sure.

