Which solution is better, faster and readable? [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
The problem: "An isogram is a word that has no repeating letters, consecutive or non-consecutive. Implement a function that determines whether a string that contains only letters is an isogram. Assume the empty string is an isogram. Ignore letter case."
I wrote two solutions.
First solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
boolean isIsogram=true;
for (int i=(words.length-1); i>=0; i--){
for(int j=0; j<(i-1);j++){
if(words[i]==words[j]){
isIsogram=false;
}
}
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
Second solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
boolean isIsogram = (string.length() == string.toLowerCase().chars().distinct().count());
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
I tested both solutions, and here are the results:
input: "asd"
1) true time 0
2) true time 113
Which solution do you think is better?
My teacher told me the second solution is better, but the first one takes less time, so I am not sure which to prefer.

The right answer is actually to profile your specific problem and see what works for your requirements.
Readability is important. What approach is most readable is very subjective. Stream-oriented operations usually attack the problem from a more declarative approach rather than an imperative approach. Declarative code is usually much easier to read, but imperative code is often faster.
But how fast do you need to be? Even your (very flawed) benchmark shows only a difference of 100 milliseconds. That's faster than the threshold of human perception. If your code isn't too slow, then don't worry about making it faster. Worry about making it clear, maintainable, debuggable, and correct first.
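One reason the benchmark in the question is flawed is that each solution runs exactly once, so any one-time startup cost lands inside the measurement. A slightly fairer comparison, as a minimal sketch: the class and method names below are made up for illustration, the loop version is a corrected stand-in for the first solution, and a dedicated harness such as JMH would still be the better tool.

```java
import java.util.Locale;
import java.util.function.Predicate;

final class IsogramTiming {
    /** Nested-loop check, lower-casing first so that letter case is ignored. */
    static boolean loopVersion(String s) {
        char[] c = s.toLowerCase(Locale.ROOT).toCharArray();
        for (int i = 0; i < c.length; i++) {
            for (int j = i + 1; j < c.length; j++) {
                if (c[i] == c[j]) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Stream check: the string is an isogram if every char is distinct. */
    static boolean streamVersion(String s) {
        String lower = s.toLowerCase(Locale.ROOT);
        return lower.length() == lower.chars().distinct().count();
    }

    /** Runs the check warmup times untimed, then returns the average nanoseconds over runs (> 0) timed calls. */
    static long averageNanos(Predicate<String> check, String input, int warmup, int runs) {
        int sink = 0; // results are accumulated so the calls are not trivially dead code
        for (int i = 0; i < warmup; i++) {
            if (check.test(input)) sink++;
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            if (check.test(input)) sink++;
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) throw new IllegalStateException(); // never true; keeps sink in use
        return elapsed / runs;
    }
}
```

Calling averageNanos(IsogramTiming::loopVersion, "asd", 10_000, 10_000) and the same with streamVersion gives per-call figures that are far more comparable than two single runs.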
In any case, since this is a fun problem, I poked at it for a minute. You have 2^16 (65,536) possible char values in a String, so if you use a BitSet you can have a yes/no bit for each of them, and still fit the whole thing in 8K of memory.
Your question said to do case folding. That sounds like a simplification, but it's really not, unless your data is ASCII (in which case, you only need a 256-bit BitSet, or possibly only a 26-bit one!). If you can use the full range of Unicode characters, even the problem of reliably converting from upper case to lower case becomes almost impossible to do correctly. (Case conversion is ultimately locale-specific.)
So I'm going to assume you want to handle all possible char values, which won't handle UTF-16 surrogates (like you need for emoji) correctly, but should handle everything that's considered a "letter" in alphabetic languages.
Here's what I came up with:
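static boolean isIsogram(String text) {
    // One bit per possible char value (2^16 bits, about 8K of memory).
    java.util.BitSet bits = new java.util.BitSet(1 << 16);
    for (int i = 0; i < text.length(); i++) {
        int ch = text.charAt(i);
        if (bits.get(ch)) {
            // This char was seen before, so the text is not an isogram.
            return false;
        }
        bits.set(ch);
    }
    return true;
}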
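For the ASCII-only case mentioned above, the 26-bit idea fits in a single int. This is a minimal sketch under the assumption that the input contains only the letters A-Z and a-z (anything else is rejected); the class name is made up for illustration.

```java
final class AsciiIsogram {
    /** Returns true if no ASCII letter occurs twice, ignoring case. */
    static boolean isIsogram(String text) {
        int seen = 0; // bit k is set once the letter 'a' + k has been seen
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A')); // fold ASCII upper case to lower case
            }
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("not an ASCII letter: " + c);
            }
            int bit = 1 << (c - 'a');
            if ((seen & bit) != 0) {
                return false; // this letter was already seen
            }
            seen |= bit;
        }
        return true;
    }
}
```

Because only ASCII letters are accepted, the case folding here is plain arithmetic and avoids the locale issues described above.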

A few things on readability:
First codeblock:
I would have both counters count in the same direction; the loops still compare every pair of characters this way. It's not a terribly important change, but it saves the reader from doing mental math to check that the code produces the intended result (and it's easy to see that the code's time complexity is O(n^2)).
boolean isIsogram = true;
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
for (int i = 0; i < words.length; i++) {
    // Start at i + 1 so a character is never compared with itself.
    for (int j = i + 1; j < words.length; j++) {
        if (words[i] == words[j]) {
            isIsogram = false;
            break; //I agree with the first answer
        }
    }
    if (!isIsogram)
        break;
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:" + (finish-start) );
The second codeblock is quite readable, although I may be primed towards understanding the problem and so it might actually be more readable because of that. But the calls to compare distinct characters make complete sense in terms of the goal.

Related

Best way to implement while loop in java 8 or later

I have to remove all slashes (/) from the beginning of a string.
So I have written:
while (convertedUrl.startsWith("\\"))
{
convertedUrl = convertedUrl.substring(1);
}
The code above creates a new string on every substring call.
Is there a better way to write this in Java 8 or later?
How can I do it while keeping memory utilisation and performance in mind?
I would guess at:
int len = str.length();
int i=0;
for (; i<len && str.charAt(i) == '\\'; ++i) {
;
}
return str.substring(i);
I write str instead of convertedUrl because this should be in its own method.
It is unlikely this is a performance bottleneck, but the original code may run as slow as O(n^2) (depending on implementation).
Can you not simply use something like this, to replace all the "/" in one go?
convertedUrl = convertedUrl.replaceAll("\\/","");
I am sorry for the initial one, but I think this will do:
convertedUrl = convertedUrl.replaceFirst("^/*","");
OR this:
convertedUrl = convertedUrl.replaceAll("^/*","");
Both will get the job done, as they replace all the leading "/" chars.
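For comparison, here is a minimal sketch that puts the index-scan approach next to an anchored regex. It assumes the character to strip is the forward slash (the question's code tests for a backslash instead, so adjust the argument as needed); the class and method names are made up for illustration.

```java
final class LeadingSlashes {
    /** Scans past leading occurrences of ch and makes at most one substring call. */
    static String stripLeading(String s, char ch) {
        int i = 0;
        while (i < s.length() && s.charAt(i) == ch) {
            i++;
        }
        return i == 0 ? s : s.substring(i);
    }

    /** Regex variant: ^/+ matches only a run of slashes at the very start. */
    static String stripLeadingRegex(String s) {
        return s.replaceFirst("^/+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripLeading("///api/v1/", '/'));   // prints api/v1/
        System.out.println(stripLeadingRegex("///api/v1/"));   // prints api/v1/
    }
}
```

The scan version allocates nothing when there is no leading slash, while the regex version is shorter to write but compiles a pattern on every call.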

I need to decrypt a file efficiently [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 4 years ago.
I am trying to decrypt an encrypted file with an unknown key. The only thing I know about it is that the key is an integer x, 0 <= x < 10^10 (i.e. a maximum of 10 decimal digits).
public static String enc(String msg, long key) {
String ans = "";
Random rand = new Random(key);
for (int i = 0; i < msg.length(); i = i + 1) {
char c = msg.charAt(i);
int s = c;
int rd = rand.nextInt() % (256 * 256);
int s2 = s ^ rd;
char c2 = (char) (s2);
ans += c2;
}
return ans;
}
private static String tryToDecode(String string) {
String returnedString = "";
long key;
String msg = reader(string);
for (long i = 0; i <= 999999999; i++) {
System.out.println("decoding message with key + " + i);
key = i;
System.out.println("decoding with key: " + i + "\n" + enc(msg, key));
}
return returnedString;
}
I expect to find the plain text
The program works very slowly, is there any way to make it more efficient?
If you are on Java 8, you can use the parallel array operations it added to achieve this.
The best fit for you would be to use Spliterator:
public void spliterate() {
System.out.println("\nSpliterate:");
int[] src = getData();
Spliterator<Integer> spliterator = Arrays.spliterator(src);
spliterator.forEachRemaining( n -> action(n) );
}
public void action(int value) {
System.out.println("value:"+value);
// Perform some real work on this data here...
}
I am still not clear about your situation. Here are some tutorials and articles to help you figure out which of Java 8's parallel array operations will help you:
http://www.drdobbs.com/jvm/parallel-array-operations-in-java-8/240166287
https://blog.rapid7.com/2015/10/28/java-8-introduction-to-parallelism-and-spliterator/
First things first: You can't println billions of lines. This will take forever, and it's pointless - you won't be able to see the text as it scrolls by, and your buffer won't save billions of lines, so you couldn't scroll back up later even if you wanted to. If you prefer (and don't mind it being 2-3% slower than it otherwise would be), you can output once every hundred million keys, just so you can verify your program is making progress.
You can optimize things by not concatenating Strings inside the loop. Strings are immutable, so the old code was creating a rather large number of Strings, especially in the enc method. Normally I'd use a StringBuilder, but in this case a simple character array will meet our needs.
And there's one more thing we need to do that your current code doesn't do: Detect when we have the answer. If we assume that the message will only contain characters from 0-127 with no Unicode or extended ASCII, then we know we have a possible answer when the entire message contains only characters in this range. And we can also use this to further optimize, as we can then immediately discard any message that has a character outside of this range. We don't even have to finish decoding it and can move on to the next key. (If the message is of any length, the odds are that only one key will produce a decoded message with characters in that range - but it's not guaranteed, which is why I do not stop when I get to a valid message. You could probably do that, though.)
java.util.Random keeps only the low 48 bits of its seed, so every key in the 10-digit range (all of them are below 2^34) produces a different random sequence. The loop below stops at 4294967295 (2^32 - 1), so it covers only the lower part of the key space; raise the bound to 9999999999 if no readable message turns up.
private static String tryToDecode4(String msg) {
String returnedString = "";
for (long i=0; i<=4294967295L; i++)
{
if (i % 100000000 == 0) // This part is just to see that it's making progress. Remove if desired for a small speed gain.
System.out.println("Trying " + i);
char[] decoded = enc4(msg, i);
if (decoded == null)
continue;
returnedString = String.valueOf(decoded);
System.out.println("decoding with key: " + i + " " + returnedString);
}
return returnedString;
}
private static char[] enc4(String msg, long key) {
char[] ansC = new char[msg.length()];
Random rand = new Random(key);
for(int i=0;i<msg.length();i=i+1)
{
char c = msg.charAt(i);
int s = c;
int rd = rand.nextInt()%(256*256);
int s2 = s^rd;
char c2 = (char)(s2);
if (c2 > 127)
return null;
ansC[i] = c2;
}
return ansC;
}
This code finished running in a little over 3 minutes on my machine, with a message of "Hello World".
This code will not work well for very short messages (3-4 characters or less.) It will not work if the message contains Unicode or extended ASCII, although it could easily be modified to do so if you know the range of characters that might be in the message.
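Since every key is checked independently, the early-exit test described above also lends itself to a parallel search. The following is a minimal sketch, assuming the same XOR scheme as enc in the question and a plain-ASCII message; the class and method names are made up for illustration, and the key range is passed in so it can be tried on a small interval first.

```java
import java.util.Random;
import java.util.stream.LongStream;

final class KeySearch {
    /** Same XOR scheme as enc() in the question; applying it twice with one key restores the input. */
    static String xor(String msg, long key) {
        char[] out = new char[msg.length()];
        Random rand = new Random(key);
        for (int i = 0; i < out.length; i++) {
            out[i] = (char) (msg.charAt(i) ^ (rand.nextInt() % (256 * 256)));
        }
        return new String(out);
    }

    /** True if decoding with this key yields only chars in 0..127; stops at the first miss. */
    static boolean looksLikeAscii(String cipher, long key) {
        Random rand = new Random(key);
        for (int i = 0; i < cipher.length(); i++) {
            char c = (char) (cipher.charAt(i) ^ (rand.nextInt() % (256 * 256)));
            if (c > 127) {
                return false;
            }
        }
        return true;
    }

    /** Checks every key in from..to (inclusive) in parallel and returns the plausible ones in ascending order. */
    static long[] candidateKeys(String cipher, long from, long to) {
        return LongStream.rangeClosed(from, to)
                .parallel()
                .filter(key -> looksLikeAscii(cipher, key))
                .toArray();
    }
}
```

Each candidate can then be decoded with xor(cipher, key) and inspected by eye. As noted above, very short messages will produce many false candidates.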

Best way to convert integers to Strings in Java [duplicate]

This question already has answers here:
Java - Convert integer to string [duplicate]
(6 answers)
Closed 8 years ago.
So I have seen this question, which gives several ways to convert integers to Strings, but I am wondering if there is any difference between them.
If I want to just convert an integer i to a string, then is there a difference between these three ways (and are some faster than others)?
i+""
Integer.toString(i)
String.valueOf(i)
I would be inclined to use the second or third since the first one just seems weird to me. On the other hand, if I wanted to convert an integer i to a string and then concatenate it to another string, s, I could do:
i+s
Integer.toString(i)+s
String.valueOf(i)+s
Here I would be inclined to use the first one since I am concatenating anyway. But my question is: are there any standard practices that should be used here, and what exactly are the differences (if any) among these three methods?
The option 1 ""+i is actually interpreted by the compiler as option 1b new StringBuilder("").append(i).toString().
The second option String.valueOf(i) internally calls Integer.toString(i) and therefore has an overhead of one method call.
The third option Integer.toString(i) seems to be the fastest in my benchmark below.
In my tests (approximate average over multiple runs):
Option 1: ~64000 ms
Option 1b: ~64000 ms (same as option 1, due to equivalent compilation)
Option 2: ~86000 ms (due to additional method call)
Option 3: ~40000 ms
The simplistic benchmark code I used:
public static void main(String[] args) {
int i = 0;
String result;
long time;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = ""+i;
i++;
}
System.out.println("Option 1: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = new StringBuilder("").append(i).toString();
i++;
}
System.out.println("Option 1b: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = String.valueOf(i);
i++;
}
System.out.println("Option 2: " + (System.currentTimeMillis() - time));
i = 0;
time = System.currentTimeMillis();
while (i < Integer.MAX_VALUE) {
result = Integer.toString(i);
i++;
}
System.out.println("Option 3: " + (System.currentTimeMillis() - time));
}
As a conclusion, at least on my JVM (JDK 1.7.0_60 64bit) the option 3: Integer.toString(i) is the fastest and I'd recommend to use it.
The conclusion that ""+i is the fastest in one of the other posts is likely due to a flawed benchmark which enabled compilation into a constant.
A simple test revealed HUGE differences between them.
Allowing for start up and cool down between runs, then for every test, running to Integer.MAX_VALUE for every option, I got these results:
""+1 took 675 millis
String.valueOf(1) took 52244 millis
Integer.toString(1) took 53205 millis
Result: Whenever possible, use i+"" or ""+1.
Sense of social duty kicking in here:
My tests were (as I indicated) ""+1, String.valueOf(1), and Integer.toString(1). When I re-run my tests I find the same results. HOWEVER, when I use variables, as indicated by ASantos and VSchäfer I get times similar between the solutions.
Whew. I feel better now.
Good answers. I've tried it (just for fun) and got similar times.
""+i is interpreted as new StringBuilder("").append(i).toString(), whereas ""+1 is folded into a constant at compile time.
I suspect some of the benchmarks presented here were inlined.
Based on my tests I found these times:
Option 1 (K) ""+8 : 655 milliseconds
Option 1 (var) ""+i: 83462 milliseconds
Option 2 String.valueOf(i): 90685 milliseconds
Option 3 Integer.toString(i): 88764 milliseconds
Option 1 (K) is what I suspect some tests here were using.
Option 3 is slightly better than option 2. Maybe due to some optimization in the JVM?
I am using a MacBook Pro, the JVM is 1.7.0_21
This is the source code I used:
public class Main{
public static void main(String[] args) throws InterruptedException
{
// warm up the JVM
for (int i = 0; i < 1000; i++)
{
String aString = ""+i;
}
long now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = ""+8;
}
System.out.println("Option 1 (K) \"\"+8 "+(System.currentTimeMillis()-now));
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = ""+i;
}
System.out.println("Option 1 (var) \"\"+i "+(System.currentTimeMillis()-now));
Thread.sleep(1000);
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = String.valueOf(i);
}
System.out.println("Option 2 String.valueOf(i) "+(System.currentTimeMillis()-now));
Thread.sleep(1000);
now = System.currentTimeMillis();
for (int i = 0; i < Integer.MAX_VALUE; i++)
{
String aString = Integer.toString(i);
}
System.out.println("Option 3 Integer.toString(i) "+(System.currentTimeMillis()-now));
}
}
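The difference between the constant case and the variable case can be observed directly. This is a minimal sketch (class and method names are made up for illustration) that uses reference comparison on purpose: a constant expression such as "" + 1 is replaced by the literal "1" at compile time, while a concatenation with a variable operand creates a new String at run time.

```java
final class ConcatFolding {
    /** "" + 1 is a compile-time constant, so both sides are the same interned literal. */
    static boolean constantIsFolded() {
        return ("" + 1) == "1"; // identity comparison on purpose
    }

    /** With a variable operand the concatenation runs on every call and yields a new object. */
    static boolean variableIsFolded(int i) {
        return ("" + i) == "1"; // identity comparison on purpose
    }

    /** All three conversions produce equal text for the same value. */
    static boolean allAgree(int i) {
        String a = "" + i;
        String b = String.valueOf(i);
        String c = Integer.toString(i);
        return a.equals(b) && b.equals(c);
    }
}
```

A loop over ""+8 therefore measures little more than an assignment of an existing constant, which would explain the much smaller times reported for that variant.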
Differences:
int to String
//overhead, as you are appending an int to an empty string
""+i
//More generic (can be the same for float, double, long & int)
String.valueOf(i)
//Type safe: you know for sure that what you are converting is an int.
Integer.toString(i)
No, there is no standard way; it all depends on what you are trying to accomplish. If you want your code to be more robust, you might try String.valueOf(); if you want to be certain that the value is in fact an int, use Integer.toString() (the first option is not recommended).
There is no significant performance difference between these three.
The concatenation case is analogous to the comments above.
i+""; Tends to be computationally faster than the other two but as far as industry standards go, it tends to be frowned upon. I personally would use String.valueOf(i) but I believe Integer.toString(i) is also acceptable. i + "S" in the case of concatenation is also fine.
String.valueOf(i);
actually calls
Integer.toString(i);
One more thing: if you invoke toString() on a null reference, you will get a NullPointerException.
Moreover, the static valueOf() methods of the String class take different primitive parameters and offer more flexibility.
The source code of String.valueOf(int) is:
public static String valueOf(int i) {
return Integer.toString(i);
}
Since the String class is final, this call will probably be inlined by the JIT.
In the case of i + "" it will create a StringBuilder and call its append(int) method, which ultimately results in a call to Integer.getChars, the same method that Integer.toString uses to get the string representation of the integer. It also calls the append(String) method to append the empty string, and then toString to get the resulting String from the StringBuilder.

Strange performance drop after innocent changes to a trivial program

Imagine you want to count how many non-ASCII chars a given char[] contains. Imagine, the performance really matters, so we can skip our favorite slogan.
The simplest way is obviously
int simpleCount() {
int result = 0;
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
Then you think that many inputs are pure ASCII and that it could be a good idea to deal with them separately. For simplicity assume you write just this
private int skip(int i) {
for (; i < string.length; i++) {
if (string[i] >= 128) break;
}
return i;
}
Such a trivial method could be useful for more complicated processing, and here it can't do any harm, right? So let's continue with
int smartCount() {
int result = 0;
for (int i = skip(0); i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
It's the same as simpleCount. I'm calling it "smart" as the actual work to be done is more complicated, so skipping over ASCII quickly makes sense. If there's no ASCII prefix or only a very short one, it can cost a few cycles more, but that's all, right?
Maybe you want to rewrite it like this, it's the same, just possibly more reusable, right?
int smarterCount() {
return finish(skip(0));
}
int finish(int i) {
int result = 0;
for (; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
And then you run a benchmark on some very long random string and get this:
The parameters determine the ASCII to non-ASCII ratio and the average length of a non-ASCII sequence, but as you can see they don't matter. Trying different seeds and whatever doesn't matter. The benchmark uses caliper, so the usual gotchas don't apply. The results are fairly repeatable, the tiny black bars at the end denote the minimum and maximum times.
Does anybody have an idea what's going on here? Can anybody reproduce it?
Got it.
The difference is in whether the optimizer/CPU can predict the number of iterations of the for loop. If it is able to predict the number of repeats up front, it can skip the actual check of i < string.length. To do that, the optimizer needs to know up front how often the condition in the for loop will succeed, and therefore it must know the values of string.length and i.
I made a simple test by replacing string.length with a local variable that is set once in the setup method. Result: smarterCount has about the same runtime as simpleCount. Before the change, smarterCount took about 50% longer than simpleCount. smartCount did not change.
It looks like the optimizer loses the information about how many iterations it will have to do when a call to another method occurs. That's the reason why finish() immediately ran faster with the constant set, but not smartCount(), as smartCount() has no clue about what i will be after the skip() step. So I did a second test, where I copied the loop from skip() into smartCount().
And voilà, all three methods return within the same time (800-900 ms).
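The two changes described above (the length kept in a local variable, and the skip loop copied into the counting method) might look like the following sketch. The class name, the constructor, and the exact shape of the loops are assumptions made for illustration, and whether the timings change will depend on the JVM and its JIT compiler.

```java
final class NonAsciiCount {
    private final char[] string;

    NonAsciiCount(char[] string) {
        this.string = string;
    }

    /** Baseline from the question: reads the field and its length on every iteration. */
    int simpleCount() {
        int result = 0;
        for (int i = 0; i < string.length; i++) {
            result += string[i] >= 128 ? 1 : 0;
        }
        return result;
    }

    /** Skip loop and counting loop in one method, with the array and its length held in locals. */
    int hoistedCount() {
        final char[] s = string;
        final int n = s.length;
        int i = 0;
        while (i < n && s[i] < 128) {
            i++; // skip the pure-ASCII prefix
        }
        int result = 0;
        for (; i < n; i++) {
            result += s[i] >= 128 ? 1 : 0;
        }
        return result;
    }
}
```

Both methods return the same count for any input, so they can be swapped freely inside a benchmark harness.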
My tentative guess would be that this is about branch prediction.
This loop:
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
Contains exactly one branch, the backward edge of the loop, and it is highly predictable. A modern processor will be able to accurately predict this, and so fill its whole pipeline with instructions. The sequence of loads is also highly predictable, so it will be able to pre-fetch everything the pipelined instructions need. High performance results.
This loop:
for (; i < string.length; i++) {
if (string[i] >= 128) break;
}
Has a dirty great data-dependent conditional branch sitting in the middle of it. That is much harder for the processor to predict accurately.
Now, that doesn't entirely make sense, because (a) the processor will surely quickly learn that the break branch will usually not be taken, (b) the loads are still predictable, and so just as pre-fetchable, and (c) after that loop exits, the code goes into a loop which is identical to the loop which goes fast. So I wouldn't expect this to make all that much difference.

Javabat substring counting

public boolean catDog(String str)
{
int count = 0;
for (int i = 0; i < str.length(); i++)
{
String sub = str.substring(i, i+1);
if (sub.equals("cat") && sub.equals("dog"))
count++;
}
return count == 0;
}
That's my code for catDog. I have been working on it for a while and just cannot figure out what's wrong. Help would be much appreciated!
EDIT: I want to return true if the strings "cat" and "dog" appear the same number of times in the given string.
One problem is that this will never be true:
if (sub.equals("cat") && sub.equals("dog"))
&& means and. || means or.
However, another problem is that your code looks like you are flailing around randomly trying to get it to work. Everyone does this to some extent in their first programming class, but it's a bad habit. Try to come up with a clear mental picture of how to solve the problem before you write any code, then write the code, then verify that the code actually does what you think it should do and that your initial solution was correct.
EDIT: What I said goes double now that you've clarified what your function is supposed to do. Your approach to solving the problem is not correct, so you need to rethink how to solve the problem, not futz with the implementation.
Here's a critique since I don't believe in giving code for homework. But you have at least tried which is better than most of the clowns posting homework here.
you need two variables, one for storing cat occurrences, one for dog, or a way of telling the difference.
your substring isn't getting enough characters.
a string can never be both cat and dog, you need to check them independently and update the right count.
your return statement should return true if catcount is equal to dogcount, although your version would work if you stored the differences between cats and dogs.
Other than those, I'd be using string searches rather than checking every position but that may be your next assignment. The method you've chosen is perfectly adequate for CS101-type homework.
It should be reasonably easy to get yours working if you address the points I gave above. One thing you may want to try is inserting debugging statements at important places in your code such as:
System.out.println(
"i = " + Integer.toString (i) +
", sub = ["+sub+"]" +
", count = " + Integer.toString(count));
immediately before the closing brace of the for loop. This is invaluable in figuring out what your code is doing wrong.
Here's my ROT13 version if you run into too much trouble and want something to compare it to, but please don't use it without getting yours working first. That doesn't help you in the long run. And, it's almost certain that your educators are tracking StackOverflow to detect plagiarism anyway, so it wouldn't even help you in the short term.
Not that I really care, the more dumb coders in the employment pool, the better it is for me :-)
choyvp obbyrna pngQbt(Fgevat fge) {
vag qvssrerapr = 0;
sbe (vag v = 0; v < fge.yratgu() - 2; v++) {
Fgevat fho = fge.fhofgevat(v, v+3);
vs (fho.rdhnyf("png")) {
qvssrerapr++;
} ryfr {
vs (fho.rdhnyf("qbt")) {
qvssrerapr--;
}
}
}
erghea qvssrerapr == 0;
}
Another thing to note here is that substring in Java's built-in String class is exclusive on the upper bound.
That is, for String str = "abcdefg", str.substring( 0, 2 ) retrieves "ab" rather than "abc." To match 3 characters, you need to get the substring from i to i+3.
My code to do this:
public boolean catDog(String str) {
if ((new StringTokenizer(str, "cat")).countTokens() ==
(new StringTokenizer(str, "dog")).countTokens()) {
return true;
}
return false;
}
Hope this will help you
EDIT: Sorry this code will not work since you can have 2 tokens side by side in your string. Best if you use countMatches from StringUtils Apache commons library.
String sub = str.substring(i, i+1);
The above line only gets a 1-character substring, so instead of getting "cat" you'll get "c" and it will never match. Fix this by changing 'i+1' to 'i+3', and stop the loop at str.length() - 3 so the substring stays in bounds.
Edit: Now that you've clarified your question in the comments: You should have two counter variables, one to count the 'dog's and one to count the 'cat's. Then at the end return true if count_cats == count_dogs.
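The "string searches" route mentioned in one of the answers can be sketched with indexOf instead of a substring at every position. This is only an illustration: the class name and the countOccurrences helper are made up, and the helper assumes a non-empty search word.

```java
final class CatDog {
    /** Counts occurrences of a non-empty word in text by jumping between matches with indexOf. */
    static int countOccurrences(String text, String word) {
        int count = 0;
        int from = text.indexOf(word);
        while (from >= 0) {
            count++;
            from = text.indexOf(word, from + 1); // continue searching just past this match
        }
        return count;
    }

    /** True if "cat" and "dog" appear the same number of times. */
    static boolean catDog(String str) {
        return countOccurrences(str, "cat") == countOccurrences(str, "dog");
    }
}
```

Keeping two separate counts, as the answers suggest, makes the final comparison read exactly like the problem statement.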
