Best way to implement while loop in java 8 or later - java

I have to remove all slashes (/) from the beginning of a string,
so I have written:
while (convertedUrl.startsWith("\\"))
{
convertedUrl = convertedUrl.substring(1);
}
The code above creates a new string for each call to substring.
Is there a better way to write this code in Java 8 or later?
How can I do it with memory utilisation and performance in mind?

I would guess at:
int len = str.length();
int i=0;
for (; i<len && str.charAt(i) == '\\'; ++i) {
;
}
return str.substring(i);
I write str instead of convertedUrl because this should be in its own method.
It is unlikely this is a performance bottleneck, but the original code may run as slow as O(n^2) (depending on implementation).
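As a minimal sketch, the index scan above can be wrapped in its own method. The class name LeadingChars and the method name stripLeading are made up for illustration, and the character to strip is a parameter because the question's prose says slash while its code tests for a backslash.

```java
final class LeadingChars {
    // Returns str without its leading run of ch.
    // Scans once and allocates at most one new String.
    static String stripLeading(String str, char ch) {
        int len = str.length();
        int i = 0;
        while (i < len && str.charAt(i) == ch) {
            i++;
        }
        // Nothing to strip: hand back the original instance.
        return i == 0 ? str : str.substring(i);
    }
}
```

A call such as LeadingChars.stripLeading(convertedUrl, '/') would then replace the while loop from the question.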

Can you not simply use something like this, to replace all the "/" in one go?
convertedUrl = convertedUrl.replaceAll("\\/","");
Sorry about that first attempt, which removes every slash rather than only the leading ones. I think this will do:
convertedUrl = convertedUrl.replaceFirst("^/*","");
or this:
convertedUrl = convertedUrl.replaceAll("^/*","");
Both get the job done,
as each replaces all the leading "/" characters.
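To make the difference between the anchored and unanchored patterns concrete, here is a small sketch. The class and method names are hypothetical. The pattern is compiled once, which avoids recompiling the regex on every call, although for a single URL the difference is unlikely to matter.

```java
import java.util.regex.Pattern;

final class LeadingSlashRegex {
    // "^/+" matches one or more slashes, and only at the start of the input.
    private static final Pattern LEADING = Pattern.compile("^/+");

    // Removes only the leading slashes.
    static String stripLeadingSlashes(String url) {
        return LEADING.matcher(url).replaceFirst("");
    }

    // For contrast: removes every slash, which is what the unanchored pattern does.
    static String stripAllSlashes(String url) {
        return url.replace("/", "");
    }
}
```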

Related

Which solution is better, faster and readable? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
ok, problem: "An isogram is a word that has no repeating letters, consecutive or non-consecutive. Implement a function that determines whether a string that contains only letters is an isogram. Assume the empty string is an isogram. Ignore letter case."
I wrote two solutions.
First solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
boolean isIsogram=true;
for (int i=(words.length-1); i>=0; i--){
for(int j=0; j<i;j++){ // j<i, so the neighbouring pair (i-1, i) is compared too
if(words[i]==words[j]){
isIsogram=false;
}
}
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
Second solution:
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
boolean isIsogram = (string.length() == string.toLowerCase().chars().distinct().count());
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:"+ (finish-start) );
I have tested both solutions and here are the results:
input: "asd"
1) true time 0
2) true time 113
I would like to know which solution you think is better.
My teacher told me the second solution is better, but the first solution takes
less time, and I am not sure which is better.
The right answer is actually to profile your specific problem and see what works for your requirements.
Readability is important. What approach is most readable is very subjective. Stream-oriented operations usually attack the problem from a more declarative approach rather than an imperative approach. Declarative code is usually much easier to read, but imperative code is often faster.
But how fast do you need to be? Even your (very flawed) benchmark shows only a difference of 100 milliseconds. That's faster than the threshold of human perception. If your code isn't too slow, then don't worry about making it faster. Worry about making it clear, maintainable, debuggable, and correct first.
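As a rough illustration of why a single currentTimeMillis measurement is misleading, here is a sketch that warms the code up first and then averages over many calls. The class RoughTimer and its method are hypothetical, and this is still only a crude estimate; a dedicated harness such as JMH is the usual tool for real measurements.

```java
import java.util.function.BooleanSupplier;

final class RoughTimer {
    // Results are written here so the JIT cannot discard the work being timed.
    static volatile boolean sink;

    // Runs task `warmup` times untimed, then returns the mean
    // nanoseconds per call over `runs` timed calls.
    static long averageNanos(BooleanSupplier task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            sink = task.getAsBoolean();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = task.getAsBoolean();
        }
        return (System.nanoTime() - start) / runs;
    }
}
```

The first call of the stream version pays for class loading and lambda setup, which is likely a large part of the 113 ms in the question; a warm-up phase keeps that one-time cost out of the measurement.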
In any case, since this is a fun problem, I poked at it for a minute. You have 2^16 possible char values in a String, so if you use a BitSet you can have a yes/no bit for each of them, and still fit the whole thing in 8K of memory.
Your question said to do case folding. That sounds like a simplification, but it's really not, unless your data is ASCII (in which case, you only need a 256-bit BitSet, or possibly only a 26-bit one!). If you can use the full range of Unicode characters, even the problem of reliably converting from upper case to lower case becomes almost impossible to do correctly. (Case conversion is ultimately locale-specific.)
So I'm going to assume you want to handle all possible char values, which won't handle UTF-16 surrogates (like you need for emoji) correctly, but should handle everything that's considered a "letter" in alphabetic languages.
Here's what I came up with:
static boolean isIsogram(String text) {
java.util.BitSet bits = new java.util.BitSet(1 << 16);
for (int i = 0; i < text.length(); i++) {
int ch = (int) text.charAt(i);
if (bits.get(ch)) {
return false;
}
bits.set(ch);
}
return true;
}
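For the ASCII-only case mentioned above, the 26-bit idea fits in a single int. This is a sketch under the assumption that the input really contains only the letters A-Z and a-z, as the exercise states; the class name AsciiIsogram is made up, and anything else is rejected with an exception rather than guessed at.

```java
final class AsciiIsogram {
    // Bit k of `seen` is set once the letter ('a' + k) has appeared.
    static boolean isIsogram(String text) {
        int seen = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int k;
            if (c >= 'a' && c <= 'z') {
                k = c - 'a';
            } else if (c >= 'A' && c <= 'Z') {
                k = c - 'A'; // case folding for ASCII is just an offset
            } else {
                throw new IllegalArgumentException("not an ASCII letter: " + c);
            }
            int bit = 1 << k;
            if ((seen & bit) != 0) {
                return false; // letter seen before
            }
            seen |= bit;
        }
        return true;
    }
}
```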
A few things on readability:
First codeblock:
I would have both counters count in the same direction; the code still compares each pair of characters this way. It's not a terribly important change, but it saves the reader the mental math of checking that the code produces the intended result, and it makes it easy to see that the code's time complexity is O(n^2).
boolean isIsogram = true;
Scanner scanner = new Scanner(System.in);
String string = scanner.next();
long start = System.currentTimeMillis();
char words[] = string.toCharArray();
for (int i = 0; i < words.length; i++){
for(int j = i + 1; j < words.length; j++){ // start after i so a character is never compared with itself
if(words[i] == words[j]){
isIsogram = false;
break; //I agree with the first answer
}
}
if (!isIsogram)
break;
}
long finish = System.currentTimeMillis();
System.out.println(isIsogram + " time:" + (finish-start) );
The second code block is quite readable, although I may be primed by already understanding the problem, which might make it seem more readable than it is. But the calls that count distinct characters make complete sense in terms of the goal.
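The stream version can be pulled into a method of its own, which also makes it easy to test. This sketch assumes that lower-casing with Locale.ROOT is an acceptable way to ignore case for the exercise; the class name StreamIsogram is made up. It compares against the length of the folded string, since lower-casing can change the length for a few non-ASCII characters.

```java
import java.util.Locale;

final class StreamIsogram {
    // True when no char occurs twice, ignoring case.
    static boolean isIsogram(String text) {
        String folded = text.toLowerCase(Locale.ROOT);
        return folded.chars().distinct().count() == folded.length();
    }
}
```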

Strange performance drop after innocent changes to a trivial program

Imagine you want to count how many non-ASCII chars a given char[] contains. Imagine that performance really matters, so we can skip our favorite slogan.
The simplest way is obviously
int simpleCount() {
int result = 0;
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
Then you think that many inputs are pure ASCII and that it could be a good idea to deal with them separately. For simplicity assume you write just this
private int skip(int i) {
for (; i < string.length; i++) {
if (string[i] >= 128) break;
}
return i;
}
Such a trivial method could be useful for more complicated processing and here it can't do any harm, right? So let's continue with
int smartCount() {
int result = 0;
for (int i = skip(0); i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
It's the same as simpleCount. I'm calling it "smart" as the actual work to be done is more complicated, so skipping over ASCII quickly makes sense. If there's no ASCII prefix or only a very short one, it can cost a few cycles more, but that's all, right?
Maybe you want to rewrite it like this, it's the same, just possibly more reusable, right?
int smarterCount() {
return finish(skip(0));
}
int finish(int i) {
int result = 0;
for (; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
return result;
}
And then you run a benchmark on some very long random string and get this:
[benchmark chart]
The parameters determine the ASCII to non-ASCII ratio and the average length of a non-ASCII sequence, but as you can see they don't matter. Trying different seeds and whatever doesn't matter. The benchmark uses caliper, so the usual gotchas don't apply. The results are fairly repeatable, the tiny black bars at the end denote the minimum and maximum times.
Does anybody have an idea what's going on here? Can anybody reproduce it?
Got it.
The difference is in the possibility for the optimizer/CPU to predict the number of loops in for. If it is able to predict the number of repeats up front, it can skip the actual check of i < string.length. Therefore the optimizer needs to know up front how often the condition in the for-loop will succeed and therefore it must know the value of string.length and i.
I made a simple test, replacing string.length with a local variable that is set once in the setup method. Result: smarterCount has about the same runtime as simpleCount. Before the change, smarterCount took about 50% longer than simpleCount. smartCount did not change.
It looks like the optimizer loses the information about how many iterations it will have to do when a call to another method occurs. That's the reason why finish() immediately ran faster with the constant set, but not smartCount(), as smartCount() has no clue what i will be after the skip() step. So I did a second test, where I copied the loop from skip() into smartCount().
And voilà, all three methods return within the same time (800-900 ms).
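A sketch of the local-variable idea follows. It differs slightly from the test described above: instead of a value set once in a setup method, each method copies the field and its length into locals before looping. The class name NonAsciiCounter is hypothetical, and whether this changes anything depends on the JVM and its version, so it should be measured rather than assumed.

```java
final class NonAsciiCounter {
    private final char[] string;

    NonAsciiCounter(char[] string) {
        this.string = string;
    }

    // Index of the first non-ASCII char at or after i, or the array length if there is none.
    int skip(int i) {
        final char[] s = string;  // read the field once
        final int len = s.length; // the loop bound is now a plain local
        while (i < len && s[i] < 128) {
            i++;
        }
        return i;
    }

    // Counts non-ASCII chars from index i to the end.
    int finish(int i) {
        final char[] s = string;
        final int len = s.length;
        int result = 0;
        for (; i < len; i++) {
            result += s[i] >= 128 ? 1 : 0;
        }
        return result;
    }

    int smarterCount() {
        return finish(skip(0));
    }
}
```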
My tentative guess would be that this is about branch prediction.
This loop:
for (int i = 0; i < string.length; i++) {
result += string[i] >= 128 ? 1 : 0;
}
Contains exactly one branch, the backward edge of the loop, and it is highly predictable. A modern processor will be able to accurately predict this, and so fill its whole pipeline with instructions. The sequence of loads is also highly predictable, so it will be able to pre-fetch everything the pipelined instructions need. High performance results.
This loop:
for (; i < string.length; i++) {
if (string[i] >= 128) break;
}
Has a dirty great data-dependent conditional branch sitting in the middle of it. That is much harder for the processor to predict accurately.
Now, that doesn't entirely make sense, because (a) the processor will surely quickly learn that the break branch will usually not be taken, (b) the loads are still predictable, and so just as pre-fetchable, and (c) after that loop exits, the code goes into a loop which is identical to the loop which goes fast. So I wouldn't expect this to make all that much difference.

Common Substring of two strings

This particular interview-question stumped me:
Given two Strings S1 and S2. Find the longest Substring which is a Prefix of S1 and suffix of S2.
Through Google, I came across the following solution, but didn't quite understand what it was doing.
public String findLongestSubstring(String s1, String s2) {
List<Integer> occurs = new ArrayList<>();
for (int i = 0; i < s1.length(); i++) {
if (s1.charAt(i) == s2.charAt(s2.length()-1)) {
occurs.add(i);
}
}
Collections.reverse(occurs);
for(int index : occurs) {
boolean equals = true;
for(int i = index; i >= 0; i--) {
if (s1.charAt(index-i) != s2.charAt(s2.length() - i - 1)) {
equals = false;
break;
}
}
if(equals) {
return s1.substring(0,index+1);
}
}
return null;
}
My questions:
How does this solution work?
And how do you get to discovering this solution?
Is there a more intuitive / easier solution?
Part 2 of your question
Here is a shorter variant:
public String findLongestPrefixSuffix(String s1, String s2) {
for( int i = Math.min(s1.length(), s2.length()); ; i--) {
if(s2.endsWith(s1.substring(0, i))) {
return s1.substring(0, i);
}
}
}
I am using Math.min to find the length of the shortest String, as I don't need to and cannot compare more than that.
someString.substring(x, y) returns the String you get when reading someString beginning at index x and stopping just before index y. I go backwards from the biggest possible substring (as long as the shorter of s1 and s2) to the smallest possible substring, the empty string. This way, the first time my condition is true, it is for the biggest possible substring that fulfills it.
If you prefer you can go the other way round, but you have to introduce a variable saving the length of the longest found substring fulfilling the condition so far:
public static String findLongestPrefixSuffix(String s1, String s2) {
if (s1.equals(s2)) { // this part is optional and will
return s1; // speed things up if s1 is equal to s2
} //
int max = 0;
for (int i = 0; i <= Math.min(s1.length(), s2.length()); i++) { // <= so a match as long as the shorter string is found
if (s2.endsWith(s1.substring(0, i))) {
max = i;
}
}
return s1.substring(0, max);
}
For the record: you could start with i = 1 in the latter example for a tiny bit of extra performance. On top of this, you can use the starting value of i to specify the minimum length of the match you want. ;) If you write Math.min(s1.length(), s2.length()) - x, you can use x to limit the maximum length of the found substring. Both of these things are possible with the first solution, too, but the minimum length is a bit more involved. ;)
Part 1 of your question
In the part above Collections.reverse, the author of the code searches for all positions in s1 where the last letter of s2 occurs and saves these positions.
What follows is essentially what my algorithm does; the difference is that he doesn't check every substring, only those that end with the last letter of s2.
This is some sort of optimization to speed things up. If speed is not that important my naive implementation should suffice. ;)
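If the temporary strings created by substring in the naive loop are a concern, String.regionMatches can compare the prefix of s1 against the tail of s2 in place. This is a sketch of the same longest-first search, not the code from the question; the class name PrefixSuffix and method name longest are made up.

```java
final class PrefixSuffix {
    // Longest prefix of s1 that is also a suffix of s2; "" if there is none.
    static String longest(String s1, String s2) {
        for (int len = Math.min(s1.length(), s2.length()); len > 0; len--) {
            // Compare s2[s2.length() - len ..] with s1[0 .. len) without allocating.
            if (s2.regionMatches(s2.length() - len, s1, 0, len)) {
                return s1.substring(0, len);
            }
        }
        return "";
    }
}
```

Only the final result is allocated, and the loop stops at the first (longest) match, just as in the shorter variant above.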
Where did you find that solution? Was it written by a credible, well-respected coder? If you're not sure of that, then it might not be worth reading it. One could write really complex and inefficient code to accomplish something really simple, and it will not be worth understanding the algorithm.
Rather than trying to understand somebody else's solution, it might be easier to come up with it on your own. I think you understand the problem much better that way, and the logic becomes your own. Over time and practice the thought process will start to come more naturally. Practice makes perfect.
Anyway, I put a simpler implementation in Python here (spoiler alert!). I suggest you first figure out the solution on your own, and compare it to mine later.
Apache commons lang3, StringUtils.getCommonPrefix()
Java is really bad in providing useful stuff via stdlib. On the plus side there's almost always some reasonable tool from Apache.
I converted @TheMorph's answer to JavaScript. Hope this helps JS developers.
if (typeof String.prototype.endsWith !== 'function') {
String.prototype.endsWith = function(suffix) {
return this.indexOf(suffix, this.length - suffix.length) !== -1;
};
}
function findLongestPrefixSuffix(s2, s1) {
for( var i = Math.min(s1.length, s2.length); ; i--) {
if(s2.endsWith(s1.substring(0, i))) {
return s1.substring(0, i);
}
}
}
console.log(findLongestPrefixSuffix('abc', 'bcd')); // result: 'bc'

Java data type problem

I am working with a matrix in Java (another story :)).
I want to read a CSV file and store it in a variable. I will manipulate the values, then store them in a CSV file again. I used String as the data type, but if the CSV file has something like 500 columns, it kills my program's speed :(. I think this is not a good data type. Which data type can I use to temporarily store long text?
If my question is not clear, please ask and I will explain.
Thanks
P.S: I am reading one line and storing it in variable like this
String str;
str += read line by line from CSV;
here is the loop
String reduceM="";
for(int kk=0;kk<W2.getRowDimension();kk++){
for(int jj=0;jj<W2.getColumnDimension();jj++){
reduceM += Double.toString(reduceMatrix[kk][jj]);
}
System.out.println("\r\n");
}
Use a StringBuilder (or StringBuffer if you're using Java 1.4 or older):
StringBuilder builder = new StringBuilder();
for (int kk = 0; kk<W2.getRowDimension(); kk++) {
for(int jj = 0; jj < W2.getColumnDimension(); jj++) {
builder.append(reduceMatrix[kk][jj]);
}
}
This avoids creating a new (and increasingly long) string on each iteration of the two loops.
However, there are no commas or line-breaks in this code - I suspect you actually want something like this:
StringBuilder builder = new StringBuilder();
for (int kk = 0; kk < W2.getRowDimension(); kk++) {
for (int jj = 0; jj < W2.getColumnDimension(); jj++) {
builder.append(reduceMatrix[kk][jj])
.append(",");
}
builder.append("\n"); // Or whatever line terminator you want
}
Note that that will leave an extra comma at the end of each row - let me know if you want ideas of how to remove that.
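One way to avoid the trailing comma is to write the separator before every value except the first in a row. The sketch below assumes the matrix is available as a plain double[][]; the class name CsvRows and method name toCsv are made up for illustration.

```java
final class CsvRows {
    // Builds CSV text with commas between values and a newline after each row.
    static String toCsv(double[][] matrix) {
        StringBuilder builder = new StringBuilder();
        for (double[] row : matrix) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    builder.append(','); // separator goes before all but the first value
                }
                builder.append(row[j]);
            }
            builder.append('\n'); // or whatever line terminator you want
        }
        return builder.toString();
    }
}
```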
See this article for why this could make a huge improvement to your running time. (Note that it's an old article, talking about StringBuffer rather than StringBuilder - the latter is just an unsynchronized version of the former.)
Use the 'StringBuffer' class for concatenations. It is much more efficient.
Take a look at this article for an explanation: here
EDIT - Sorry I did not see this was already answered
Prefer StringBuilder, there is a big difference in performance compared to the string concatenation(+).
In addition to Skeet's great answer: don't write to System.out if it's not necessary, or if you want to write to the console, use a buffer or write once the loop has finished. Printing inside the loop slows your program down, because each call has to write to the output stream and usually flush it.

Javabat substring counting

public boolean catDog(String str)
{
int count = 0;
for (int i = 0; i < str.length(); i++)
{
String sub = str.substring(i, i+1);
if (sub.equals("cat") && sub.equals("dog"))
count++;
}
return count == 0;
}
There's my code for catDog. I have been working on it for a while and just cannot find out what's wrong. Help would be much appreciated!
EDIT: I want to return true if the strings "cat" and "dog" appear the same number of times in the given string.
One problem is that this will never be true:
if (sub.equals("cat") && sub.equals("dog"))
&& means and. || means or.
However, another problem is that your code looks like you are flailing around randomly trying to get it to work. Everyone does this to some extent in their first programming class, but it's a bad habit. Try to come up with a clear mental picture of how to solve the problem before you write any code, then write the code, then verify that the code actually does what you think it should do and that your initial solution was correct.
EDIT: What I said goes double now that you've clarified what your function is supposed to do. Your approach to solving the problem is not correct, so you need to rethink how to solve the problem, not futz with the implementation.
Here's a critique since I don't believe in giving code for homework. But you have at least tried which is better than most of the clowns posting homework here.
you need two variables, one for storing cat occurrences, one for dog, or a way of telling the difference.
your substring isn't getting enough characters.
a string can never be both cat and dog, you need to check them independently and update the right count.
your return statement should return true if catcount is equal to dogcount, although your version would work if you stored the differences between cats and dogs.
Other than those, I'd be using string searches rather than checking every position but that may be your next assignment. The method you've chosen is perfectly adequate for CS101-type homework.
It should be reasonably easy to get yours working if you address the points I gave above. One thing you may want to try is inserting debugging statements at important places in your code such as:
System.out.println(
"i = " + Integer.toString (i) +
", sub = ["+sub+"]" +
", count = " + Integer.toString(count));
immediately before the closing brace of the for loop. This is invaluable in figuring out what your code is doing wrong.
Here's my ROT13 version if you run into too much trouble and want something to compare it to, but please don't use it without getting yours working first. That doesn't help you in the long run. And, it's almost certain that your educators are tracking StackOverflow to detect plagiarism anyway, so it wouldn't even help you in the short term.
Not that I really care, the more dumb coders in the employment pool, the better it is for me :-)
choyvp obbyrna pngQbt(Fgevat fge) {
vag qvssrerapr = 0;
sbe (vag v = 0; v < fge.yratgu() - 2; v++) {
Fgevat fho = fge.fhofgevat(v, v+3);
vs (fho.rdhnyf("png")) {
qvssrerapr++;
} ryfr {
vs (fho.rdhnyf("qbt")) {
qvssrerapr--;
}
}
}
erghea qvssrerapr == 0;
}
Another thing to note here is that substring in Java's built-in String class is exclusive on the upper bound.
That is, for String str = "abcdefg", str.substring( 0, 2 ) retrieves "ab" rather than "abc." To match 3 characters, you need to get the substring from i to i+3.
My code to do this:
public boolean catDog(String str) {
if ((new StringTokenizer(str, "cat")).countTokens() ==
(new StringTokenizer(str, "dog")).countTokens()) {
return true;
}
return false;
}
Hope this will help you
EDIT: Sorry, this code will not work, since you can have two tokens side by side in your string. It is best to use countMatches from the StringUtils class in the Apache Commons library.
String sub = str.substring(i, i+1);
The above line only gets a 1-character substring, so instead of getting "cat" you'll get "c" and it will never match. Fix this by changing 'i+1' to 'i+3', and stop the loop at str.length() - 2 so the substring stays in bounds.
Edit: Now that you've clarified your question in the comments: You should have two counter variables, one to count the 'dog's and one to count the 'cat's. Then at the end return true if count_cats == count_dogs.
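For comparison, here is a sketch of the string-search approach mentioned in the answers, using only String.indexOf from the JDK rather than a library. The class name CatDogSearch and the helper countOccurrences are made up for illustration.

```java
final class CatDogSearch {
    // Counts occurrences of word in str by jumping from match to match.
    static int countOccurrences(String str, String word) {
        if (word.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        int count = 0;
        for (int from = str.indexOf(word); from >= 0; from = str.indexOf(word, from + 1)) {
            count++;
        }
        return count;
    }

    // True when "cat" and "dog" appear the same number of times.
    static boolean catDog(String str) {
        return countOccurrences(str, "cat") == countOccurrences(str, "dog");
    }
}
```

Two separate counts are kept, one per word, and compared at the end, which matches the advice given in the answers above.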
