Queue is an interface that implements many different classes. I am confused about why my textbook gives this sample code using Queue. I have also found this in an example on the Princeton website. Is this a customary way to provide code so it can be edited later to the programmer's preferred type of queue?
This is code taken from an algorithm for a Binary Search Symbol Table.
public Iterable<Key> keys(Key lo, Key hi) {
    Queue<Key> q = new Queue<Key>();
    for (int i = rank(lo); i < rank(hi); i++) {
        q.enqueue(keys[i]);
    }
    if (contains(hi)) {
        q.enqueue(keys[rank(hi)]);
    }
    return q;
}
So first: there are many classes that implement Queue; it is not that Queue is an interface that implements many different classes.
Second, yes, the code convention is that you use the interface wherever possible to make code more flexible in case a different implementation is desired. This is especially true of method signatures, but it is also good practice for variable and field declarations.
Third, it looks like a typo, or the book is using its own Queue class (the Princeton algs4 library defines one with an enqueue method). With the standard library the line would be Queue<Key> q = new LinkedList<Key>(); and the enqueue calls would become add, since java.util.Queue has no enqueue method.
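For comparison, here is a sketch of the same method written against java.util.Queue (assuming keys[], rank() and contains() exist in the surrounding class as in the book's code):
public Iterable<Key> keys(Key lo, Key hi) {
    // java.util.Queue declared against the interface, java.util.LinkedList as implementation
    Queue<Key> q = new LinkedList<Key>();
    for (int i = rank(lo); i < rank(hi); i++) {
        q.add(keys[i]);      // java.util.Queue uses add()/offer() instead of enqueue()
    }
    if (contains(hi)) {
        q.add(keys[rank(hi)]);
    }
    return q;
}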
See answer here: Why are variables declared with their interface name in Java?
Also, What does it mean to “program to an interface”?
I do this often for other interfaces where the specific implementation doesn't provide anything over the interface, like List.
List<String> strings = new ArrayList<>();
In that case I retain the most flexibility with the least amount of hassle.
Granted, you won't change implementations very often, but there are specific implementations where, for example, inserting takes O(n) time and lookup takes O(log n) time, or vice versa; in those cases you have to do some research into how you're using the interface.
Do you change the contents of the collection a lot, but read it sparingly? Or is it the other way around and do you insert just once and read often?
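As a small illustration, when you declare against the interface, switching the implementation later is a one-line change that no caller needs to know about (the names here are just examples):
// Declared against the interface; only the right-hand side changes
// if the access pattern calls for a different implementation.
List<String> strings = new ArrayList<>();    // fast random access, cheap append
// List<String> strings = new LinkedList<>(); // cheap insertion/removal at the ends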
Your question is very broad. This technique is called programming to an interface, and it's the result of years of accumulated good coding practice. It just makes things easier in so many ways, and it has changed how we write software today. If you're an eager learner and want to know more, check this article as a starter.
I'm wondering what the best way is in Java 8 to work with all the values of an enum, specifically when you need to get all the values and pass each one somewhere. For example, suppose we have the following enum:
public enum Letter {
    A, B, C, D;
}
I could of course do the following:
for (Letter l : Letter.values()) {
    foo(l);
}
But, I could also add the following method to the enum definition:
public static Stream<Letter> stream() {
    return Arrays.stream(Letter.values());
}
And then replace the for loop from above with:
Letter.stream().forEach(l -> foo(l));
Is this approach OK or does it have some fault in design or performance? Moreover, why don't enums have a stream() method?
I'd go for EnumSet. Because forEach() is also defined on Iterable, you can avoid creating the stream altogether:
EnumSet.allOf(Letter.class).forEach(x -> foo(x));
Or with a method reference:
EnumSet.allOf(Letter.class).forEach(this::foo);
Still, the old-school for loop feels a bit simpler:
for (Letter x : Letter.values()) {
    foo(x);
}
Three questions: three-part answer:
Is it okay from a design point of view?
Absolutely. Nothing wrong with it. If you need to do lots of iterating over your enum, the stream API is the clean way to go, and hiding the boilerplate behind a little method is fine. Although I’d consider OldCumudgeon’s version even better.
Is it okay from a performance point of view?
It most likely doesn’t matter. Most of the time, enums are not that big. Therefore, whatever overhead there is for one method or the other probably doesn’t matter in 99.9% of the cases.
Of course, there are the 0.1% where it does. In that case: measure properly, with your real-world data and consumers.
If I had to bet, I’d expect the for-each loop to be faster, since it maps more directly to the memory model, but don’t guess when talking about performance, and don’t tune before there is an actual need for tuning. Write your code so that it is correct first and easy to read second, and only then worry about the performance of one code style or the other.
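For example, a minimal JMH-style sketch of such a measurement (assuming JMH is on the classpath and that Letter has the stream() method from the question; the method names are just illustrative):
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.infra.Blackhole;

public class LetterIterationBenchmark {

    @Benchmark
    public void enhancedFor(Blackhole bh) {
        // Plain for-each over the values() array.
        for (Letter l : Letter.values()) {
            bh.consume(l);
        }
    }

    @Benchmark
    public void streamForEach(Blackhole bh) {
        // Stream-based variant from the question.
        Letter.stream().forEach(bh::consume);
    }
}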
Why aren’t Enums properly integrated into the Stream API?
If you compare Java’s Stream API to the equivalent in many other languages, it appears seriously limited. There are various pieces that are missing (reusable Streams and Optionals as Streams, for example). On the other hand, implementing the Stream API was certainly a huge change, and it was postponed multiple times for a reason. So I guess Oracle wanted to limit the changes to the most important use cases.
Enums aren’t used that much anyway. Sure, every project has a couple of them, but they’re nothing compared to the number of Lists and other Collections. Even when you have an Enum, in many cases you won’t ever iterate over it. Lists and Sets, on the other hand, are probably iterated over almost every time. I assume these were the reasons why Enums didn’t get their own adapter to the Stream world.
We’ll see whether more of this gets added in future versions. Until then, you can always use Arrays.stream.
My guess is that enums are limited in size (i.e. the size is not limited by the language but by usage) and thus they don't need a native stream API. Streams are very good when you have to manipulate, transform and re-collect the elements; these are not common use cases for an enum (usually you iterate over the enum values, but you rarely need to transform, map and collect them).
If you only need to perform an action over each element, perhaps you should expose only a forEach method:
public static void forEach(Consumer<Letter> action) {
    Arrays.stream(Letter.values()).forEach(action);
}

// example of usage
Letter.forEach(e -> System.out.println(e));
I think the shortest code to get a Stream of enum constants is Stream.of(Letter.values()). It's not as nice as Letter.values().stream() but that's an issue with arrays, not specifically enums.
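For instance (with foo as in the question), the two forms are interchangeable:
Stream.of(Letter.values()).forEach(l -> foo(l));     // varargs overload of Stream.of
Arrays.stream(Letter.values()).forEach(l -> foo(l)); // equivalent, via Arrays.stream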
Moreover, why don't enums have a stream() method?
You are right that the nicest possible call would be Letter.stream(). Unfortunately, a class cannot have two methods with the same signature, so it would not be possible to implicitly add a static stream() method to every enum (the way every enum gets an implicitly added static values() method): it would break every existing enum that already has a static or instance method called stream() taking no parameters.
Is this approach OK?
I think so. The drawback is that stream is a static method, so there is no way to avoid code duplication; it would have to be added to every enum separately.
I've got a fairly complicated project which heavily uses Java's multithreading. In an answer to one of my previous questions I described an ugly hack that is supposed to overcome the inherent inability to iterate over Java's ConcurrentHashMap in parallel. Although it works, I don't like ugly hacks, and I've had a lot of trouble trying to introduce the proposed proof of concept into the real system. Trying to find an alternative solution, I encountered Scala's ParHashMap, which claims to implement a foreach method that seems to operate in parallel. Before I start learning a new language to implement a single feature, I'd like to ask the following:
1) Is foreach method of Scala's ParHashMap scalable?
2) Is it simple and straightforward to call Java's code from Scala and vice versa? I'll just remind that the code is concurrent and uses generics.
3) Is there going to be a performance penalty for switching a part of codebase to Scala?
For reference, this is my previous question about parallel iteration of ConcurrentHashMap:
Scalable way to access every element of ConcurrentHashMap<Element, Boolean> exactly once
EDIT
I have implemented the proof of concept, in probably very non-idiomatic Scala, but it works just fine. AFAIK it is IMPOSSIBLE to implement a corresponding solution in Java given the current state of its standard library and any available third-party libraries.
import scala.collection.parallel.mutable.ParHashMap

class Node(value: Int, id: Int) {
  var v = value
  var i = id
  override def toString(): String = v toString
}

object testParHashMap {
  def visit(entry: Tuple2[Int, Node]) {
    entry._2.v += 1
  }

  def main(args: Array[String]) {
    val hm = new ParHashMap[Int, Node]()
    for (i <- 1 to 10) {
      var node = new Node(0, i)
      hm.put(node.i, node)
    }
    println("========== BEFORE ==========")
    hm.foreach{ println }
    hm.foreach{ visit }
    println("========== AFTER ==========")
    hm.foreach{ println }
  }
}
I come to this with some caveats:
Though I can do some things, I consider myself relatively new to Scala.
I have only read about but never used the par stuff described here.
I have never tried to accomplish what you are trying to accomplish.
If you still care what I have to say, read on.
First, here is an academic paper describing how the parallel collections work.
On to your questions.
1) When it comes to multi-threading, Scala makes life so much easier than Java. The abstractions are just awesome. The ParHashMap you get from a par call will distribute the work to multiple threads. I can't say how that will scale for you without a better understanding of your machine, configuration, and use case, but done right (particularly with regard to side effects) it will be at least as good as a Java implementation. However, you might also want to look at Akka to have more control over everything. It sounds like that might be more suitable to your use case than simply ParHashMap.
2) It is generally simple to convert between Java and Scala collections using JavaConverters and the asJava and asScala methods. I would suggest though making sure that the public API for your method calls "looks Java" since Java is the least common denominator. Besides, in this scenario, Scala is an implementation detail, and you never want to leak those anyway. So keep the abstraction at a Java level.
3) I would guess there will actually be a performance gain with Scala--at runtime. However, you will find much slower compile time (which can be worked around. ish). This Stack Overflow post by the author of Scala is old but still relevant.
Hope that helps. That's quite a problem you got there.
Since Scala compiles to the same bytecode as Java, doing the same thing in both languages is perfectly possible, no matter the task. There are, however, some things that are easier to solve in Scala; whether that is worth learning a new language is a different question, especially since Java 8 will include exactly what you are asking for: simple parallel execution of functions on lists.
But even now you can do this in Java; you just need to write yourself what Scala already ships with.
final ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
//...
// toArray() without an argument returns Object[], so pass a typed array instead.
@SuppressWarnings("unchecked")
final Entry<String, String>[] elements = myMap.entrySet().toArray(new Entry[0]);
final AtomicInteger index = new AtomicInteger(elements.length);
for (int i = Runtime.getRuntime().availableProcessors(); i > 0; --i) {
    executor.submit(new Runnable() {
        public void run() {
            int myIndex;
            // Each thread atomically claims the next unprocessed index.
            while ((myIndex = index.decrementAndGet()) >= 0) {
                process(elements[myIndex]);
            }
        }
    });
}
The trick is to pull those elements into a temporary array so that the threads can take out elements in a thread-safe way. Obviously some caching here, instead of re-creating the Runnables and the array each time, is encouraged, because creating the Runnables might already take longer than the actual task.
It is also possible to copy the elements into a (reusable) LinkedBlockingQueue instead and have the threads poll/take from it. This however adds more overhead and is only reasonable for tasks that require at least some calculation time.
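A sketch of that variant (with process() and executor as above; error handling omitted) might look like this:
// Reusable queue variant: copy the entries in, let each worker drain it.
final BlockingQueue<Entry<String, String>> queue =
        new LinkedBlockingQueue<Entry<String, String>>(myMap.entrySet());
for (int i = Runtime.getRuntime().availableProcessors(); i > 0; --i) {
    executor.submit(new Runnable() {
        public void run() {
            Entry<String, String> entry;
            while ((entry = queue.poll()) != null) { // poll() returns null once drained
                process(entry);
            }
        }
    });
}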
I don't know how Scala actually does it, but given that it needs to run on the same JVM, it will be doing something similar in the background; it just happens to be easily accessible in the standard library.
When you're designing the API for a code library, you want it to be easy to use well, and hard to use badly. Ideally you want it to be idiot proof.
You might also want to make it compatible with older systems that can't handle generics, like .Net 1.1 and Java 1.4. But you don't want it to be a pain to use from newer code.
I'm wondering about the best way to make things easily iterable in a type-safe way... Remembering that you can't use generics so Java's Iterable<T> is out, as is .Net's IEnumerable<T>.
You want people to be able to use the enhanced for loop in Java (for Item i : items), and the foreach / For Each loop in .Net, and you don't want them to have to do any casting. Basically you want your API to be now-friendly as well as backwards compatible.
The best type-safe option that I can think of is arrays. They're fully backwards compatible and they're easy to iterate in a typesafe way. But arrays aren't ideal because you can't make them immutable. So, when you have an immutable object containing an array that you want people to be able to iterate over, to maintain immutability you have to provide a defensive copy each and every time they access it.
In Java, doing (MyObject[]) myInternalArray.clone(); is super-fast. I'm sure that the equivalent in .Net is super-fast too. If you have like:
class Schedule {
    private Appointment[] internalArray;

    public Appointment[] appointments() {
        return (Appointment[]) internalArray.clone();
    }
}
people can do like:
for (Appointment a : schedule.appointments()) {
    a.doSomething();
}
and it will be simple, clear, type-safe, and fast.
But they could do something like:
for (int i = 0; i < schedule.appointments().length; i++) {
    Appointment a = schedule.appointments()[i];
}
And then it would be horribly inefficient because the entire array of appointments would get cloned twice for every iteration (once for the length test, and once to get the object at the index). Not such a problem if the array is small, but pretty horrible if the array has thousands of items in it. Yuk.
Would anyone actually do that? I'm not sure... I guess that's largely my question here.
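For comparison, the usage I'd hope for is to take the copy once and reuse it:
// One defensive copy, reused for the whole loop.
Appointment[] appointments = schedule.appointments();
for (int i = 0; i < appointments.length; i++) {
    appointments[i].doSomething();
}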
You could call the method toAppointmentArray() instead of appointments(), and that would probably make it less likely that anyone would use it the wrong way. But it would also make it harder for people to find when they just want to iterate over the appointments.
You would, of course, document appointments() clearly, to say that it returns a defensive copy. But a lot of people won't read that particular bit of documentation.
Although I'd welcome suggestions, it seems to me that there's no perfect way to make it simple, clear, type-safe, and idiot-proof. Have I failed if a minority of people are unwittingly cloning arrays thousands of times, or is that an acceptable price to pay for simple, type-safe iteration for the majority?
NB I happen to be designing this library for both Java and .Net, which is why I've tried to make this question applicable to both. And I tagged it language-agnostic because it's an issue that could arise for other languages too. The code samples are in Java, but C# would be similar (albeit with the option of making the Appointments accessor a property).
UPDATE: I did a few quick performance tests to see how much difference this made in Java. I tested:
1. cloning the array once, and iterating over it using the enhanced for loop
2. iterating over an ArrayList using the enhanced for loop
3. iterating over an unmodifiable ArrayList (from Collections.unmodifiableList) using the enhanced for loop
4. iterating over the array the bad way (cloning it repeatedly in the length check and when getting each indexed item).
The relative speeds (doing multiple repeats and taking the median) were roughly:
Objects    1) clone + loop    2) ArrayList    3) unmodifiable list    4) repeated cloning
10         1,000              1,300           1,300                   5,000
100        1,300              4,900           6,300                   85,500
1000       6,400              51,700          56,200                  7,000,300
10000      68,000             445,000         651,000                 655,180,000
Rough figures for sure, but enough to convince me of two things:
1. Cloning, then iterating is definitely not a performance issue. In fact it's consistently faster than using a List. (This is why Java's enum.values() method returns a defensive copy of an array instead of an immutable list.)
2. If you repeatedly call the method, repeatedly cloning the array unnecessarily, performance becomes more and more of an issue the larger the arrays in question. It's pretty horrible. No surprises there.
clone() is fast, but not what I would describe as super fast.
If you don't trust people to write loops efficiently, I would not let them write a loop at all (which also avoids the need for a clone()):
interface AppointmentHandler {
    public void onAppointment(Appointment appointment);
}

class Schedule {
    public void forEachAppointment(AppointmentHandler ah) {
        for (Appointment a : internalArray)
            ah.onAppointment(a);
    }
}
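A caller then passes the behaviour in, for example with an anonymous class (doSomething() as in the question):
schedule.forEachAppointment(new AppointmentHandler() {
    public void onAppointment(Appointment appointment) {
        appointment.doSomething();
    }
});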
Since you can't really have it both ways, I would suggest that you create a pre-generics and a generics version of your API. Ideally the underlying implementation can be mostly the same, but the fact is, if you want it to be easy to use for anyone on Java 1.5 or later, they will expect the use of generics, Iterable and all the newer language features.
I think the usage of arrays should be non-existent. It does not make for an easy to use API in either case.
NOTE: I have never used C#, but I would expect the same holds true.
As far as failing a minority of the users, those that would call the same method to get the same object on each iteration of the loop would be asking for inefficiency regardless of API design. I think as long as that's well documented, it's not too much to ask that the users obey some semblance of common sense.
I read an article on Joel On Software about the idea of using higher order functions to greatly simplify code through the use of map and reduce. He mentioned that this was difficult to do in Java. The article: http://www.joelonsoftware.com/items/2006/08/01.html
The example from the article below, loops through an array, and uses the function fn that was passed as an argument on each element in the array:
function map(fn, a)
{
    for (i = 0; i < a.length; i++)
    {
        a[i] = fn(a[i]);
    }
}
This would be invoked similar to the below in practice:
map( function(x){return x*2;}, a );
map( alert, a );
Ideally I'd like to write a map function to work on arrays, or Collections of any type if possible.
I have been looking around on the Internet, and I am having a difficult time finding resources on the subject. Firstly, are anonymous functions possible in Java? Is this possible to do in another way? Will it be available in a future version of Java? If possible, how can I do it?
I imagine that if this is not possible in Java, there is some kind of 'pattern'/technique that people use to achieve the same effect, as anonymous functions are a very powerful tool in the software world. The only similar question I was able to find was this: Java generics - implementing higher order functions like map and it makes absolutely no sense to me.
Guava provides map (but it's called transform instead, and is in utility classes like Lists and Collections2). It doesn't provide fold/reduce, however.
In any case, the syntax for using transform feels really clunky compared to using map in Scheme. It's a bit like trying to write with your left hand, if you're right-handed. But, this is Java; what do you expect. :-P
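For instance, doubling a list with Lists.transform looks roughly like this (pre-Java 8 anonymous class style), which is exactly the clunkiness I mean:
List<Integer> numbers = Arrays.asList(1, 2, 3);
// Lists.transform returns a lazy view: the Function is applied on each access.
List<Integer> doubled = Lists.transform(numbers, new Function<Integer, Integer>() {
    public Integer apply(Integer x) {
        return x * 2;
    }
});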
Looks like this one?
How can I write an anonymous function in Java?
P.S: try Functional Java. Maybe it could give you hints.
Single method anonymous classes provide a similar, but much more verbose, way of writing an anonymous function in Java.
For example, you could have:
// Note: plain java.lang.Iterable has no map method; this assumes a functional
// collection type (such as those Guava provides) with a Function<F, T> callback.
Iterable<Source> foos = ...;
Iterable<Destination> mappedFoos = foos.map(new Function<Source, Destination>()
{
    public Destination apply(Source item) { return ... }
});
For an example of a Java library with a functional style, see Guava
interface Func<V, A> {
    V call(A a);
}

static <V, A> List<V> map(Func<V, A> func, List<A> as) {
    List<V> vs = new ArrayList<V>(as.size());
    for (A a : as) {
        vs.add(func.call(a));
    }
    return vs;
}
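Used like the article's example, doubling every element:
// Equivalent of map(function(x){ return x*2; }, a) from the article.
List<Integer> doubled = map(new Func<Integer, Integer>() {
    public Integer call(Integer a) {
        return a * 2;
    }
}, Arrays.asList(1, 2, 3));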
Paguro has an open-source implementation of higher order functions. Initial tests show it to be 98% as fast as the native Java forEach loop. The operations it supports are applied lazily without modifying the underlying collection, and the output goes to type-safe versions of the immutable (and sometimes mutable) Clojure collections. Transformable is built into Paguro's unmodifiable and immutable collections and interfaces. To use a raw java.util collection as input, just wrap it with the xform() function.
Suppose we have classes Gallery and Image. There can be many images in one gallery.
Gallery should have some method which returns the number of nested images. My suggestions:
int getImagesCount ();
int countImages ();
int imagesCount ();
I saw examples of each of these 3 suggestions in different APIs (of course with another noun, or even without one, like the size() method in the Collections API).
What would you prefer and why? (One of my thoughts against countImages() is that the name can make the user think that the method does some complex calculation.)
My suggestion:
public interface Gallery {
    int getNumberOfImages();
    //...
}
I agree that countXX() gives the impression that calling this method triggers some sort of calculation, so I wouldn't use it here either.
I would rule out countImages () too, for the same reason.
Other than that, if there is a more or less consistent naming convention in the codebase Gallery belongs to, I would recommend adhering to it. In absence of a clear project convention, I personally would lean towards size().
If you are thinking of having one method that returns the size and another one that returns the nth element, my recommendation in general is: don't do it! Use the power of the collections API: create a method that returns a List<Image>.
That way the user of the API can more easily do things like iterating over the images (using the enhanced for-loop instead of having to juggle with indices), sorting them, combining them in other collections,...
It is really easy to implement:
If the images are internally already represented as a list, you can just return that (optionally wrapped with Collections.unmodifiableList).
And even if you don't store them as a list, exposing them as List<Image> is as easy as writing a small subclass of AbstractList by just overriding size() and get(int), the two methods you were going to implement anyways.
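A sketch of that, assuming the images are held internally in an array (the field name is illustrative):
public List<Image> getImages() {
    // Read-only List view backed by the internal array; no copying needed.
    return new AbstractList<Image>() {
        @Override
        public int size() {
            return images.length;
        }

        @Override
        public Image get(int index) {
            return images[index];
        }
    };
}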
The problem with size() is that it's not necessarily clear whether you're returning the number of elements (images in this case) or the size in bytes/kilobytes/megabytes of the collection. For the same reason, I tend to prefer getImageDimensions() to getImageSize(). In your case, I like countImages() because it's clear and concise, but it doesn't follow naming conventions if, for example, there are methods like getImage(int). Therefore, I would go with int getImagesCount(), or better, int getNumberOfImages() or int getNumImages().