Java, string difference algorithm


I would like to get feedback on the code below. Is there any way to improve its performance? Do you know of input values that might produce wrong output? The idea of the code is to count the unique characters in s2 that do not appear in s1.
Ideone.com URL.
The code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

class Combine {
    public static void main(String[] args) throws IOException {
        BufferedReader bi = new BufferedReader(new InputStreamReader(System.in));
        String s1 = bi.readLine();
        String s2 = bi.readLine();
        String usedCharacters = "";
        for(int i = 0; i < s2.length(); i++) {
            String c = Character.toString(s2.charAt(i));
            if(!usedCharacters.contains(c) && !s1.contains(c))
                usedCharacters += c;
        }
        System.out.println(usedCharacters.length());
    }
}

I think this is fairly well optimized but you should probably check for null, as it will fail if you pass it null values.
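To illustrate the null check: BufferedReader.readLine() returns null once the input has ended, so a missing second line would make s2.length() throw a NullPointerException. A minimal sketch, assuming it is acceptable to treat a missing line as an empty string (the class and method names are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.IOException;

class NullSafeRead {
    // Returns the next line, or "" when the stream has ended
    // (readLine() signals end of input by returning null).
    static String readLineOrEmpty(BufferedReader in) throws IOException {
        String line = in.readLine();
        return line == null ? "" : line;
    }
}
```

In the question's main method this would replace the two bi.readLine() calls. Whether an empty string is the right fallback depends on what the program should do with incomplete input.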

I think the performance is good enough; any improvement would not make a noticeable difference.

I think you should try to iterate over the larger or the smaller String (measure the time overhead) - that's where you can save some CPU time.

If your input is relatively small (e.g. typed by a user on the console), then I see no performance problem with your solution (apart from the null checks suggested in another answer).
If the input is large (e.g. files of megabytes or more redirected on the command line), then I think your solution will yield O(n^2) run time, since it iterates through s2 and each contains call may in turn scan the whole of s1.
An alternative algorithm would be to sort the two input strings and then iterate across them to count the differences. That would result in O(n log n) performance.
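The sort-based alternative could look roughly like the sketch below. It is not taken from the question; the class and method names are hypothetical. Both strings are sorted as char arrays, then one pass over the sorted s2 skips repeats and advances a cursor through the sorted s1.

```java
import java.util.Arrays;

class SortedDiff {
    // Counts the distinct characters of s2 that do not occur in s1.
    static int countMissing(String s1, String s2) {
        char[] a = s1.toCharArray();
        char[] b = s2.toCharArray();
        Arrays.sort(a);
        Arrays.sort(b);
        int count = 0;
        int i = 0; // cursor into the sorted s1
        for (int j = 0; j < b.length; j++) {
            // Skip repeats in the sorted s2 so each character is considered once.
            if (j > 0 && b[j] == b[j - 1]) continue;
            // Move past everything in s1 that is smaller than the current character.
            while (i < a.length && a[i] < b[j]) i++;
            // Not found in s1 if we ran off the end or landed on a larger character.
            if (i == a.length || a[i] != b[j]) count++;
        }
        return count;
    }
}
```

For example, countMissing("hello", "world") counts w, r and d. The sorting dominates the cost; the merge pass itself is linear.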

Didn't try it but here's an idea:
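import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashSet;
import java.util.Set;

class Combine {
    public static void main(String[] args) throws IOException {
        BufferedReader bi = new BufferedReader(new InputStreamReader(System.in));
        String s1 = bi.readLine();
        String s2 = bi.readLine();
        // A char[] cannot be passed to the HashSet constructor, so add the characters one by one.
        Set<Character> unique = new HashSet<>();
        for (char c : s2.toCharArray()) unique.add(c);
        int count = 0;
        for (char c : unique) {
            // Count the characters of s2 that do not occur in s1.
            if (s1.indexOf(c) < 0) count++;
        }
        System.out.println(count);
    }
}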

Related

Separating an unknown amount of hyphens in java?

Good day, guys,
I'm working on a program which requires me to input a name (e.g. Patrick-Connor-O'Neill). The name can be composed of any number of names, so it is not restricted to 3 as in the example above. The point of the program is to return the initials, so in this case PCO. I'm writing to ask for a little clarification. I need to separate the names at the hyphens first, right? Then I need to take the first character of each name and print it out?
Anyway, my question is basically: how do I separate the string if I don't know how many names are entered? I get that if it's only two terms I would do:
final String s = "Before-After";
final String before = s.split("-")[0]; // "Before"
I did attempt to write the code, and all I have so far is:
import java.util.Scanner;
public class main {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
String input = scan.nextLine();
String[] x = input.split("-");
int u =0;
for(String i : x) {
String y = input.split("-")[u];
u++;
}
}
}
I'm taking a crash course in programming, so easy concepts are hard for me. Thanks for reading!

You don't need to split it a second time. String[] x = input.split("-"); already gives you an array of Strings, which you can iterate over with the enhanced for loop you already have. It should look like this:
String[] x = input.split("-");
String initials = "";
for (String name : x) {
initials += name.charAt(0);
}
System.out.println(initials);
Here are some Java Docs for the used methods
String#split
String#charAt
Assignment operator +=

You can do it without splitting the string by using String.indexOf to find the next -; then just append the subsequent character to the initials:
String initials = "" + input.charAt(0);
int next = -1;
while (true) {
next = input.indexOf('-', next + 1);
if (next < 0) break;
initials += input.charAt(next + 1);
}
(There are lots of edge cases not handled here; omitted to get across the main point of the approach).
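For reference, here is one way the indexOf approach might be extended to cover some of those edge cases (empty input, leading or trailing hyphens, consecutive hyphens). This is only a sketch; the class name Initials and the method name of are made up for illustration.

```java
class Initials {
    // Collects the first character of every non-empty hyphen-separated part.
    static String of(String input) {
        StringBuilder initials = new StringBuilder();
        int start = 0; // index where the current part begins
        while (start <= input.length()) {
            int next = input.indexOf('-', start);
            if (next < 0) next = input.length(); // no more hyphens: the part runs to the end
            // The part is non-empty only if the hyphen is not right at its start.
            if (next > start) initials.append(input.charAt(start));
            start = next + 1;
        }
        return initials.toString();
    }
}
```

With this version, Initials.of("Patrick-Connor-O'Neill") gives PCO, and empty parts such as those in "-a--b-" are simply skipped.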

In your for-each loop, append the first character of each element of the String array to an output String to get the initials:
String output = "";
for(String i : x) {
output = output + i.charAt(0);
}

This will help.
public static void main(String[] args) {
String output = "";
String input = "Patrick-Connor-O'Neil-Saint-Patricks-Day";
String[] brokenInput = input.split("-");
for (String temp : brokenInput) {
if (!temp.equals(""))
output = output + temp.charAt(0);
}
System.out.println(output);
}

You could totally try something like this (a little refactor of your code):
import java.util.Scanner;
public class main {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
String input = "";
System.out.println("What's your name?");
input = scan.nextLine();
String[] x = input.split("-");
int u =0;
for(String i : x) {
String y = input.split("-")[u];
u++;
System.out.println(y);
}
}
}
I think it's pretty easy and straightforward from here if you simply want to isolate the initials. If you are new to Java, make sure you use System.out a lot, since it helps with debugging.
Good coding.
EDIT: You can combine @Mohit Tyagi's answer with mine to get the full solution, if you are cheating :P

This might help
String test = "abs-bcd-cde-fgh-lik";
String[] splitArray = test.split("-");
StringBuffer stringBuffer = new StringBuffer();
for (int i = 0; i < splitArray.length; i++) {
stringBuffer.append(splitArray[i].charAt(0));
}
System.out.println(stringBuffer);
Using StringBuffer saves memory because, with String, a new object is created every time you modify it. In single-threaded code, StringBuilder provides the same methods without the synchronization overhead.

Improving this code for reversing the string and removing duplicate characters [duplicate]

I recently attended an interview where I was asked to write a program.
The problem was:
Take a string. "Hammer", for example.
Reverse it, and no character should be repeated.
So, the output will be "remaH".
This is the solution I gave:
public class Reverse {
public static void main(String[] args) {
String str = "Hammer";
String revStr = "";
for(int i=0; i<=str.length()-1;i++){
if(revStr.indexOf(str.charAt(i))==-1){
revStr = str.charAt(i)+revStr;
}
}
System.out.println(revStr);
}
}
How can I improve the above?

The problem is that String is an immutable object, so when you use operator+ to concatenate a char with the current result, you actually create a new string.
This results in creating strings of length 1+2+...+n, which gives you a total performance of O(n^2) (unless the compiler optimizes this for you).
Using a StringBuilder instead of concatenating strings will give you O(n) performance, and with much better constants as well.
Note that a StringBuilder offers an efficient append() implementation, so you need to append elements to it, and NOT add them at the head of your StringBuilder.
You should also reconsider the use of indexOf(). If a character cannot appear twice at all, consider using a Set<Character> to keep track of the 'used' characters. If it can appear twice, but not one after the other (for example "mam" is valid), there is really no need for indexOf() in the first place; just check the last character read.
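Putting those suggestions together, a sketch might look like the following. It assumes the first reading of the task (no character may appear twice anywhere in the output); the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class ReverseUnique {
    static String reverseUnique(String str) {
        Set<Character> seen = new HashSet<>();
        StringBuilder out = new StringBuilder();
        // Keep only the first occurrence of each character, appending at the end.
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            // Set.add returns false when the character was already present.
            if (seen.add(c)) out.append(c);
        }
        // A single O(n) reversal at the end replaces the repeated prepending.
        return out.reverse().toString();
    }
}
```

For "Hammer" this yields "remaH", the same result as the code in the question, but with one pass over the input plus one reversal.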

Here is a solution that uses no StringBuilder or intermediate String objects and just treats Strings as arrays of chars; this should be more efficient. Note that it only drops a character when it repeats the one immediately before it.
import java.util.Arrays;
public class Reverse {
public static void main(String[] args) {
String str = "Hammer";
String revStr = null;
char [] chars = str.toCharArray();
char [] reversedChars = new char[chars.length];
// copy first char
reversedChars[reversedChars.length - 1] = chars[0];
// process rest
int r = reversedChars.length - 2;
for(int i = 1 ; i < chars.length ; i++ ){
if(chars[i] != chars[i-1]){
reversedChars[r] = chars[i];
r--;
}
}
revStr = new String(Arrays.copyOfRange(reversedChars, r+1, reversedChars.length));
System.out.println(revStr);
}
}

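package com.in.main;

public class Reverse {
    public static void main(String[] args) {
        String str = "Hammer";
        StringBuilder revStr = new StringBuilder();
        // Start at the last valid index; str.length() itself is out of bounds.
        for (int i = str.length() - 1; i >= 0; i--) {
            // StringBuilder.indexOf takes a String, not a char.
            String c = String.valueOf(str.charAt(i));
            if (revStr.indexOf(c) == -1) {
                revStr.append(c);
            }
        }
        System.out.println(revStr);
    }
}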

Removing duplicate chars from a string passed as a parameter

I am a little confused about how to approach this problem. The userKeyword is passed as a parameter from a previous section of the code. My task is to remove any duplicate chars from the entered keyword (whatever it is). We have just finished while loops in class, so some hints regarding these would be appreciated.
private String removeDuplicates(String userKeyword){
String first = userKeyword;
int i = 0;
while(i < first.length())
{
if (second.indexOf(first.charAt(i)) > -1){
}
i++;
return "";
Here's an update of what I have tried so far - sorry about that.

This is the perfect place to use java.util.Set, a construct which is designed to hold unique elements. The example below works on whole words, but the same idea applies to individual characters. By trying to add each word to a set, you can check if you've seen it before, like so:
static String removeDuplicates(final String str)
{
final Set<String> uniqueWords = new HashSet<>();
final String[] words = str.split(" ");
final StringBuilder newSentence = new StringBuilder();
for(int i = 0; i < words.length; i++)
{
if(uniqueWords.add(words[i]))
{
//Word is unique
newSentence.append(words[i]);
if((i + 1) < words.length)
{
//Add the space back in
newSentence.append(" ");
}
}
}
return newSentence.toString();
}
public static void main(String[] args)
{
final String str = "Words words words I love words words WORDS!";
System.out.println(removeDuplicates(str)); //Words words I love WORDS!
}

Have a look at this answer.
You might not understand this, but it does the job (it cleverly uses a HashSet that doesn't allow duplicate values).
I think your teacher might be looking for a solution using loops however - take a look at William Morisson's answer for this.
Good luck!
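Since the question asks specifically about while loops, a loop-only version might look like the sketch below. It is one possible completion of the asker's method, not the only one, and the wrapping class name Keyword is made up.

```java
class Keyword {
    static String removeDuplicates(String userKeyword) {
        StringBuilder result = new StringBuilder();
        int i = 0;
        while (i < userKeyword.length()) {
            char c = userKeyword.charAt(i);
            // indexOf returns -1 when c has not been kept yet.
            if (result.indexOf(String.valueOf(c)) == -1) {
                result.append(c);
            }
            i++;
        }
        return result.toString();
    }
}
```

Each character is kept the first time it is seen, so "banana" becomes "ban".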

For future reference, StackOverflow normally requires you to post what you have, and ask for suggestions for improvement.
As it's not an active day and I am bored, I've done this for you. This code is pretty efficient and makes use of no advanced data structures. I did this so you could more easily understand it.
Please do try to understand what I'm doing. Learning is what StackOverflow is for.
I've added comments in the code to assist you in learning.
private String removeDuplicates(String keyword){
//stores whether a character has been encountered before.
//one slot per possible char value (0..Character.MAX_VALUE), hence the + 1.
//a hashset would likely use less memory.
boolean[] usedValues = new boolean[Character.MAX_VALUE + 1];
//Look into using a StringBuilder. Using += operator with strings
//is potentially wasteful.
String output = "";
//looping over every character in the keyword...
for(int i=0; i<keyword.length(); i++){
char charAt = keyword.charAt(i);
//characters are just numbers. if the value in usedValues array
//is true for this char's number, we've seen this char.
boolean shouldRemove = usedValues[charAt];
if(!shouldRemove){
output += charAt;
//now this character has been used in output. Mark that in
//usedValues array
usedValues[charAt] = true;
}
}
return output;
}
Example:
//output will be the alphabet.
System.out.println(removeDuplicates(
"aaaabcdefghijklmnopqrssssssstuvwxyyyyxyyyz"));

Sorting doubles in java

import java.io.*;
import java.util.*;
public class Main {
public static void main(String[] args) throws Exception {
BufferedReader in = new BufferedReader(new FileReader(new File(args[0])));
String line;
while ((line = in.readLine()) != null) {
StringTokenizer st = new StringTokenizer(line);
int len = st.countTokens();
Double[] seq = new Double[len];
for (int i = 0; i < len; i++)
seq[i] = Double.parseDouble(st.nextToken());
Arrays.sort(seq);
for (int i = 0; i < len; i++) {
if (i > 0) System.out.print(" ");
System.out.print(seq[i]);
} System.out.print("\n");
}
}
}
So I'm trying to solve this CodeEval problem (https://www.codeeval.com/open_challenges/91/) and my solution is not getting through all the test cases. I think my method of output is correct (spaces between numbers, trailing newline). I can't figure out what may be going on in the sorting or anywhere else.
The solution is apparently not correct when using floats either.

I also think this is a printing problem. The output seems to require exactly 3 decimal places on each number, based on the sample I/O. But, if you print out a double like 70.920 (one of the example inputs) it will display as 70.92.
double d = 70.920;
System.out.println(d);
System.out.printf("%.3f", d); // <-- try this
Output:
70.92
70.920
Notice how the second output is consistent with the format of the sample output whereas the first is not.
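Applied to the whole task, sorting one input line and printing every value with three decimals could be sketched as follows. The class and method names are hypothetical, and the sketch assumes each line holds at least one whitespace-separated number; Locale.ROOT is used so the decimal separator is always a dot.

```java
import java.util.Arrays;
import java.util.Locale;

class SortLine {
    // Sorts the numbers on one line and formats each with exactly 3 decimals.
    static String sortLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double[] seq = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            seq[i] = Double.parseDouble(tokens[i]);
        }
        Arrays.sort(seq);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < seq.length; i++) {
            if (i > 0) out.append(' ');
            out.append(String.format(Locale.ROOT, "%.3f", seq[i]));
        }
        return out.toString();
    }
}
```

For the input "3.5 1.25 2" this returns "1.250 2.000 3.500".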

You may be sorting correctly but printing incorrectly. Decimal numbers are represented approximately. The runtime attempts to show them in the short format, but that is not guaranteed.
All the examples they gave should work, but who knows what the test suite does in the background.

I would use double, not Double, and I would only sort the values after reading them all, not after every line.
Perhaps some inputs have more than one line?

Android Garbage Collector Slow Down

I'm a semi-experienced programmer, just not so much in Java. To help learn Java/Android I started working on a word builder application, something that takes 2-7 characters and finds all common words that can be made from them. Currently I have about 10,000 words split between 26 .txt files that are loaded based on which characters the user enters. Together it's ~10kb of data.
The logic was the easy part, but now the GC seems to be slowing everything down and I'm struggling to find ways to optimize due to my lack of Java experience. Below is the code that I'm almost positive the GC is constantly running on. I'd like to point out that with 2-4 characters the code below runs pretty quickly. Anything larger than that gets really slow.
public void readFile() throws IOException, NotFoundException
{
String dictionaryLine = new String(); //Current string from the .txt file
String currentString = new String(); //Current scrambled string
String comboStr = new String(); //Current combo string
int inputLength = myText.getText().length(); // length of the user input
//Loop through every "letter" dictionary
for(int z = 0; z < neededFiles.length - 1; z++)
{
if(neededFiles[z] == null)
break;
InputStream input = neededFiles[z];
InputStreamReader inputReader = new InputStreamReader(input);
BufferedReader reader = new BufferedReader(inputReader, inputLength);
//Loop through every line in the dictionary
while((dictionaryLine = reader.readLine()) != null)
{
Log.i(TAG, "dictionary: " + dictionaryLine);
//For every scrambled string...
for(int i = 0; i < scrambled.size(); i++)
{
currentString = scrambled.get(i).toString();
//Generate all possible combos from the scrambled string and populate 'combos'
generate(currentString);
//...lets find every possible combo from that current scramble
for(int j = 0; j < combos.size(); j++)
{
try
{
comboStr = combos.get(j).toString();
//If the input length is less than the current line, don't even compare
if(comboStr.length() < dictionaryLine.length() || comboStr.length() > dictionaryLine.length())
break;
//Add our match
if(dictionaryLine.equalsIgnoreCase(comboStr))
{
output.add(comboStr);
break;
}
}
catch(Exception error)
{
Log.d(TAG, error.getMessage());
}
}
combos.clear();
}
}
}
}
To help clarify, this code generates many, many log lines like the following:
GC_FOR_MALLOC freed 14000 objects / 510000 bytes in 100ms
I appreciate any help you can give, even if it's just Java best practices.

In general, you reduce garbage collection activity by creating and discarding fewer objects. There are a lot of places where objects can be generated:
Each line you are reading produces a String.
Strings are immutable, so more objects are likely being spawned in your generate() function.
If you are dealing with a lot of strings, consider a StringBuilder, which is mutable and so produces less garbage.
However, 100ms for garbage collection is not bad, especially on a phone.

Basically, you're in a bad way because, for each dictionary word, you're generating all possible combinations for all scrambled strings, yikes! If you have enough memory, just generate all the combos for all scrambled strings once and compare each one to every dictionary value.
However, if it must be assumed that there isn't enough memory for this, things get more complicated. What you can do is use a char[] to produce one scramble possibility, test it, rearrange the characters in the buffer, test again, and repeat until all possibilities are exhausted.
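One way to do that in-place rearrangement is the classic next-lexicographic-permutation step, sketched below. This is not the asker's generate() function, whose implementation is not shown; the class and method names are hypothetical. It only covers full-length rearrangements of the buffer, so shorter combinations would need extra handling.

```java
import java.util.Arrays;

class Permutations {
    // Rearranges buf into its next lexicographic permutation in place.
    // Returns false when buf already held the last permutation.
    static boolean next(char[] buf) {
        int i = buf.length - 2;
        // Find the rightmost position that is smaller than its successor.
        while (i >= 0 && buf[i] >= buf[i + 1]) i--;
        if (i < 0) return false;
        // Find the rightmost character larger than buf[i] and swap the two.
        int j = buf.length - 1;
        while (buf[j] <= buf[i]) j--;
        char t = buf[i]; buf[i] = buf[j]; buf[j] = t;
        // Reverse the tail so it becomes the smallest possible suffix.
        for (int l = i + 1, r = buf.length - 1; l < r; l++, r--) {
            t = buf[l]; buf[l] = buf[r]; buf[r] = t;
        }
        return true;
    }

    // Visits every distinct permutation of the letters using a single buffer.
    static int countPermutations(String letters) {
        char[] buf = letters.toCharArray();
        Arrays.sort(buf); // start from the smallest permutation
        int n = 0;
        do {
            n++; // a real program would test buf against the dictionary here
        } while (next(buf));
        return n;
    }
}
```

Because the same char[] is reused for every candidate, no new String has to be allocated per permutation, which is the point of the suggestion above.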
