The cheapest way to compare two sets of strings [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have two sets of strings:
set 1:
"hello"
"world"
"stackoverflow"
set 2:
"world"
"hello"
"stackoverflow"
Before I compare the contents, I already know that both sets contain only unique values, so I am not considering a Java Set for a uniqueness test.
So in Java, what would be the cheapest way to compare these two sets? By cheapest, I mean in terms of memory.
I know I can loop and call ArrayList.contains(); is there a better way?
I was also told that a Java HashSet consumes 5 times more resources than an ArrayList holding the same number of elements. Is that true?
UPDATE
I don't have a sample for you, since this is just an idea that came to my mind.
By two sets of strings, I literally mean sets, although they can also be stored in a Java ArrayList.
What I want is to compare these two sets of strings to find out whether they have the same contents. Of course, I know beforehand that both contain only unique values.
UPDATE
Sorry, this is not a practical problem I ran into. It is just an idea I am wondering about.

I have absolutely no idea about the performance of this, but just to add a possible solution...
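import java.util.Arrays;
String[] a = {"hello","world","stackoverflow"};
String[] b = {"world","hello","stackoverflow"};
// Sorting puts both arrays into the same order (and modifies them in place)
Arrays.sort(a);
Arrays.sort(b);
// Arrays.equals compares the arrays element by element
System.out.println(Arrays.equals(a,b) ? "same" : "different");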
Result:
same

Since your only concern is memory footprint, a basic nested loop over the arrays (or whatever collections these strings are stored in at the moment) would work perfectly fine and use only a fixed amount of extra space (for the loop counters).
You can shorten the code by using contains too. That call should also need only a fixed amount of memory, although you'd spend a little on the call stack frame.
Note that this trades run-time performance for memory usage. The time penalty is significant and will likely be noticeable for sets of more than 10-20 items. A more balanced approach is covered, for example, in "What is the fastest way to compare two sets in Java?"; the memory cost of appropriate data structures is generally less of a concern than the run-time cost.
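As a rough illustration of the constant-extra-memory approach described above, here is a minimal sketch. The class and method names (SetCompare, sameContents) are made up for this example, and it relies on the question's assumption that neither collection contains duplicates.

```java
import java.util.Arrays;
import java.util.List;

class SetCompare {
    // Returns true when both lists hold the same strings, ignoring order.
    // Assumes neither list contains duplicates, as stated in the question.
    static boolean sameContents(List<String> a, List<String> b) {
        if (a.size() != b.size()) {
            return false; // different sizes can never match
        }
        for (String s : a) {
            // contains() is a linear scan, so the whole check is O(n^2) in time
            if (!b.contains(s)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("hello", "world", "stackoverflow");
        List<String> second = Arrays.asList("world", "hello", "stackoverflow");
        System.out.println(sameContents(first, second) ? "same" : "different"); // prints "same"
    }
}
```

No additional collection is allocated, which is the point of the trade-off: the memory cost stays flat while the running time grows quadratically with the number of elements.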

Related

Are classes with identical values .equals()? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I'm using Point3D.
I have a list full of Point3Ds. At some point I need to create a NEW Point3D and check whether the list contains it.
Will the list ever report that it contains it, given that it is technically a different reference/object BUT has identical values?
I realize this shows a gap in my fundamental Java knowledge.
A List doesn't care what you put in it: put ten things in, get ten things out. Its contains() method compares elements with .equals(), not by reference.
A Set, however, will generally only hold unique values (determined using .equals() and hashCode(), and in some sets other things, e.g. a Comparator).
Normally I'd say "check the Javadoc", but in this case the Point3D Javadoc for equals has a copy/paste error:
equals
public boolean equals(Object obj)
Returns a hash code value for the point.
Overrides:
equals in class Object
Returns:
a hash code value for the point.
So, assuming it does what most people expect, then yes: two points created from identically valued doubles will compare as equal.
However, floating-point values can easily end up "very, very close" where you expect them to be identical (because of how values are represented in floating point), and a Point3D holds three doubles, so there are three chances for a small error. To be safe, you might treat two points as the same when they are within some small distance of each other (see the Javadoc for distance) rather than relying on exact matching.
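To make the difference between exact matching and distance-based matching concrete, here is a small sketch. Vec3 is a hypothetical stand-in written for this example, not the real Point3D class; its equals, hashCode and distance methods are assumptions about how such a value class typically behaves, and the helper containsWithin and the tolerance 1e-9 are likewise made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a 3D point class with value-based equality.
final class Vec3 {
    final double x, y, z;

    Vec3(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Vec3)) return false;
        Vec3 v = (Vec3) o;
        // Exact comparison of all three coordinates
        return Double.compare(x, v.x) == 0
            && Double.compare(y, v.y) == 0
            && Double.compare(z, v.z) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y, z);
    }

    // Euclidean distance to another point
    double distance(Vec3 o) {
        double dx = x - o.x, dy = y - o.y, dz = z - o.z;
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // True if any point in the list lies within eps of the target
    static boolean containsWithin(List<Vec3> points, Vec3 target, double eps) {
        for (Vec3 p : points) {
            if (p.distance(target) <= eps) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Vec3> points = new ArrayList<>();
        points.add(new Vec3(0.3, 0, 0));

        // A new object with identical values is found, because contains() uses equals()
        System.out.println(points.contains(new Vec3(0.3, 0, 0)));       // true
        // 0.1 + 0.2 is not exactly 0.3 in double arithmetic, so exact matching fails
        System.out.println(points.contains(new Vec3(0.1 + 0.2, 0, 0))); // false
        // A tolerance-based check still treats the two points as the same
        System.out.println(containsWithin(points, new Vec3(0.1 + 0.2, 0, 0), 1e-9)); // true
    }
}
```

The right tolerance depends on the scale of the coordinates in your application, so treat the value here as a placeholder.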

Should I put if(variable < 10) in own method? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I have a check in a loop where I have to test whether the number of occurrences is less than 10, which could be written as either
if(occ < 10){
}
or
if(checkIfOccurencyIsLessThan10(occ)){
    values.add(current+"0"+occ);
}
else{
    values.add(current+occ);
}
I'm reading Clean Code: A Handbook of Agile Software Craftsmanship, which says a method should do as little as possible and code should be broken up into smaller pieces. Is that necessary here? I'm trying to get a better grasp of how long a method should be and how much it should be doing.
It depends on whether this condition is spread across multiple pieces of code, and whether the check could change in the future to cover additional edge cases. If both of those things are true or could become true, then sure, extracting the check into its own function is wise. However, I would definitely rename the function so that it does not spell out its implementation, because that defeats the purpose of being able to swap out the conditional. Naming it something like occurenceNeedsZero is much more flexible: if you come up with other cases that need checking, you can add them to this function as well.
However, if your question is "should I always turn a simple conditional check such as x < 10 into its own function?", then I would say no. That would be overengineering, in my opinion. Functions should be used to 1) separate logical portions of code, 2) increase readability, or 3) extract small pieces of code that are spread across multiple locations and likely to change in the future, since that simplifies later refactoring.
There are probably more cases than those three, but those are the big ones.
It's better to store this 10 in a static final constant instead of hard-coding it.
If other places also need to check whether occ < 10, extract the check into a method. Otherwise it is unnecessary.
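Combining the suggestions from the answers above (a named constant plus a helper whose name describes intent rather than implementation), one possible sketch looks like this. The names OccurrenceFormatter, TWO_DIGIT_THRESHOLD, needsLeadingZero and format are invented for illustration, and the code assumes that current is a String and occ is non-negative, neither of which the question states.

```java
class OccurrenceFormatter {
    // Named constant instead of a hard-coded 10 (the name is an assumption)
    private static final int TWO_DIGIT_THRESHOLD = 10;

    // Describes the intent (padding is needed), not the comparison itself
    static boolean needsLeadingZero(int occ) {
        return occ < TWO_DIGIT_THRESHOLD;
    }

    // Builds the value that the question adds to its list
    static String format(String current, int occ) {
        return needsLeadingZero(occ) ? current + "0" + occ : current + occ;
    }

    public static void main(String[] args) {
        System.out.println(format("item", 7));  // item07
        System.out.println(format("item", 12)); // item12
    }
}
```

Whether the extra method is worth it still depends on the points raised above: if the check lives in exactly one place and is unlikely to change, the inline condition is arguably just as readable.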

What does it mean when there is a -- before an int value in Java? Also, what is a StringBuilder? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am trying to finish an assignment in my intro to Java course, and I have some questions. First off, what does it mean when there is a -- in FRONT of an int value? Also, what is a StringBuilder? I had some help but want to understand what I'm using in the code. Thanks.
The -- in front of a value simply means subtract 1 from it. Similarly, ++ in front of a value means add 1 to it.
If you write ++ before the variable, it is called the prefix operator; if you write it after, it is the postfix operator.
Prefix: ++a increments the value first and then uses it.
Postfix: a++ uses the current value first and then increments it; any later use sees the incremented value.
-- in front of a variable is a pre-decrement.
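The difference between the prefix and postfix forms is easiest to see by capturing both the value of the expression and the value of the variable afterwards. The following is a minimal sketch; the class and method names are made up for the example.

```java
class PrefixPostfixDemo {
    // Returns {value of the expression --x, value of x afterwards}
    static int[] preDecrement(int x) {
        int result = --x; // x is decremented first, then its new value is used
        return new int[] {result, x};
    }

    // Returns {value of the expression x--, value of x afterwards}
    static int[] postDecrement(int x) {
        int result = x--; // the old value is used first, then x is decremented
        return new int[] {result, x};
    }

    public static void main(String[] args) {
        int[] pre = preDecrement(5);
        int[] post = postDecrement(5);
        System.out.println("--x yields " + pre[0] + ", x is now " + pre[1]);   // --x yields 4, x is now 4
        System.out.println("x-- yields " + post[0] + ", x is now " + post[1]); // x-- yields 5, x is now 4
    }
}
```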
The Java StringBuilder class is used to create a mutable (modifiable) string.
A String is immutable, i.e. it cannot be changed once created; every time you change its value, a new String is created.
A StringBuilder, by contrast, is mutable, so its contents can change in place.
My experience is mostly with C#, not Java, but in C# strings cannot be changed. When you concatenate two strings like "hello" + "world", you do not change either string; you create a new one, and the old two still exist. If you need to do this many times (dozens or hundreds), it can use a lot of memory. A StringBuilder lets you conserve memory by appending characters to the same block of memory while you are building your string, and then you can turn the result into a normal string for passing around to other functions.
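In Java the same idea looks roughly like the sketch below: append to one StringBuilder inside the loop and convert to a String once at the end. The class and method names are invented for the example, and joining words with a space is just an arbitrary task chosen to show repeated appends.

```java
class StringBuilderDemo {
    // Joins the words with single spaces using one mutable buffer
    static String joinWithBuilder(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) {
                sb.append(' '); // separator between words, not before the first
            }
            sb.append(w); // modifies the builder in place, no new String per step
        }
        return sb.toString(); // one immutable String is created at the end
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(new String[] {"hello", "world"})); // hello world
    }
}
```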

Algorithm for finding a single number to represent two different numerical scales? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I'm looking for an algorithm to combine two values:
Where the first value indicates a positive result when the value is higher. For example, 0.98 is 'good' and 0.15 is 'bad'.
Where the second value indicates a positive result when the value is lower. For example, 10,000 is 'bad', whereas 1000 is 'good'.
I need a method of determining a value that can represent both of these scales with one number, so that I can sort my findings on my application from high to low accordingly. I'm not sure if anyone knows of such an algorithm, or any advice, but any help is greatly appreciated. Thank you.
P.S. I am aware I can 'negate' one of the two values so that they point in the same direction, but I'm not sure how this would work in Java.
EDIT: Sorry, to elaborate: I'm sorting images based on similarity to a user's input image. Each of the algorithms I use to compute a similarity value works on a different scale. The first returns a value between 0.00 and 1.00, where numbers closer to 1.00 indicate that the image is more similar to the original. The second returns values of 1000 and above, where higher values indicate that the image is less similar to the original. I need to combine these two values so that I can sort the resulting images by similarity, with the most similar image at the top of my list and the least similar at the bottom. Hopefully this clears up any confusion. Thanks again.
If your only goal is sorting, you need to come up with a function g(x,y) that represents the "goodness" of your pair of values. A pair (x1,y1) is better than (x2,y2) if and only if g(x1,y1) > g(x2,y2).
The function must represent what you consider "good". A simple example would be:
g(x,y) = x - y / 10000
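As a sketch of how such a goodness function could drive the sort in Java, the example below uses the formula from the answer as-is. The ScoredImage class, its field names and the sample values are hypothetical; whether dividing by 10000 weights the two measures appropriately depends on the real range of the second algorithm's output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for one image and its two scores
final class ScoredImage {
    final String name;
    final double similarity; // higher is better, between 0.0 and 1.0
    final double distance;   // lower is better, 1000 and above

    ScoredImage(String name, double similarity, double distance) {
        this.name = name;
        this.similarity = similarity;
        this.distance = distance;
    }
}

class GoodnessSort {
    // g(x, y) = x - y / 10000, the example formula from the answer
    static double goodness(double x, double y) {
        return x - y / 10000.0;
    }

    // Returns a new list ordered from highest to lowest goodness
    static List<ScoredImage> sortBestFirst(List<ScoredImage> images) {
        List<ScoredImage> sorted = new ArrayList<>(images);
        sorted.sort(Comparator
            .comparingDouble((ScoredImage img) -> goodness(img.similarity, img.distance))
            .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<ScoredImage> images = Arrays.asList(
            new ScoredImage("b", 0.15, 10000),
            new ScoredImage("a", 0.98, 1000),
            new ScoredImage("c", 0.50, 2000));
        for (ScoredImage img : sortBestFirst(images)) {
            System.out.println(img.name); // prints a, then c, then b
        }
    }
}
```

Subtracting the second value is the "negation" mentioned in the question: it flips the direction of that scale so that a larger combined number always means a better match.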

Regex best practice: multiple patterns or single one with combined expression? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
Think of the following scenario:
The application receives a list of regexes from a server (an HTTP GET returns a List in which each item is a regular expression);
User input text needs to be validated against these expressions;
The application runs on Android, so memory is an issue;
The list of expressions does not change frequently.
What would be better:
Cache several Pattern objects, each one containing a single regex returned from the server; or
Concatenate the regexes, (REGEX1)|(REGEX2)|(REGEX3)...|(REGEXN), and keep a single object in memory, refreshing it whenever a regex is added or removed, which doesn't happen very often?
I don't think there is a way to answer this question without a specific list of regexes and a list of inputs, because each regex/input combination is going to use a different amount of memory. Here is what my instincts tell me:
Evaluate the regexes one at a time. In the "OR" scenario, the engine must consider all of the OR'd expressions together, so I believe that would take more RAM.
Order the regexes by either (a) likelihood of matching, so that you can skip evaluating the rest of the regexes, or (b) early non-matching, so that a regex can be quickly discarded as never going to match (for example, "^a" only requires evaluating the first character of the string, whereas "a" requires searching the whole string for an "a").
Ultimately, only testing can really tell you what takes more time or memory, I'm afraid.
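For reference, the two options from the question can be sketched as follows. This is only an illustration: the class and method names are invented, it assumes that "valid" means the whole input matches at least one expression, and it wraps each regex in a non-capturing group (?:...) instead of the capturing groups shown in the question. Note that combining can change behavior if the individual regexes contain numbered backreferences, because group numbers shift.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RegexValidation {
    // Option 1: compile each regex once and cache the Pattern objects
    static List<Pattern> compileAll(List<String> regexes) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }
        return patterns;
    }

    // Tries the patterns in list order and stops at the first full match
    static boolean matchesAny(List<Pattern> patterns, String input) {
        for (Pattern p : patterns) {
            if (p.matcher(input).matches()) {
                return true;
            }
        }
        return false;
    }

    // Option 2: one Pattern built as (?:R1)|(?:R2)|...|(?:RN)
    static Pattern combine(List<String> regexes) {
        StringBuilder sb = new StringBuilder();
        for (String regex : regexes) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append("(?:").append(regex).append(')');
        }
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        List<String> regexes = Arrays.asList("\\d+", "[a-z]+");
        List<Pattern> cached = compileAll(regexes);
        Pattern combined = combine(regexes);

        System.out.println(matchesAny(cached, "123"));        // true
        System.out.println(combined.matcher("abc").matches()); // true
        System.out.println(matchesAny(cached, "a1"));          // false
    }
}
```

With the cached list, putting the most likely matches first lets the loop exit early, which is the ordering advice given above. Measuring both variants on the target device remains the only reliable way to compare their memory use.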
