Is it an alright practice to reuse the same variable to keep making new Object instances (like case 2)?
Some pseudo code:
//case 1:
class main {
List<Foo> bar = new ArrayList;
public static main(String[] args) {
Foo baz = new Foo(params);
Foo baz1 = new Foo(differentParams);
Foo baz2 = new Foo..
}
}
class Foo {
Foo(params..) {
main.bar.add(this);
}
}
//case 2:
class main {
List<Foo> bar = new ArrayList;
public static main(String[] args) {
Foo baz = new Foo(params);
baz = new Foo(differentParams);
baz = new Foo..
}
}
class Foo {
Foo(params..) {
main.bar.add(this);
}
}
I am wondering if case 2 is an okay design practice. Since in Foo I am storing the instance of the class in the list in main, I will never need the direct object variable baz1 during runtime, but will be iterating through the list to apply some logic to each object.
So my question is: is it an alright practice to reuse the same variable to keep making new Object instances (like case 2)? Conventional practice would suggest keeping a separate variable for each object as you make it (like case 1).
This question came up in my mind when thinking about memory and whether or not doing case 2 would save more memory compared to case 1 since you are not declaring a new variable each time.
On Android, the memory is severely limited, IIRC an app can use only 20 MB. One variable may take 4 bytes, and you save two variables, i.e., 8 bytes. This means that when you write 250,000 such classes, you save 2 MB, i.e., 10% of available memory.
Are you going to write a quarter million of such classes?
I hope, you are not.
But then, it doesn't matter as local variables exist only when the method is called and nobody cares about 8 bytes.
Moreover, your code gets processed by an optimizing compiler, which doesn't transform your code line by line. So it's possible that there's no difference in the generated code for your two versions.
So please, don't try optimizing irrelevant details.
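To see that the choice between the two cases changes nothing about which objects stay alive, here is a minimal sketch. `ReuseDemo` and `Item` are hypothetical names (not the `Foo` from the question), and the list is filled explicitly rather than from the constructor.

```java
import java.util.ArrayList;
import java.util.List;

class ReuseDemo {
    // Hypothetical stand-in for the question's Foo.
    static class Item {
        final String label;
        Item(String label) { this.label = label; }
    }

    // Case 1: one local variable per object.
    static List<Item> separateVariables() {
        List<Item> items = new ArrayList<>();
        Item a = new Item("a");
        Item b = new Item("b");
        items.add(a);
        items.add(b);
        return items;
    }

    // Case 2: the same local variable is reassigned.
    static List<Item> reusedVariable() {
        List<Item> items = new ArrayList<>();
        Item item = new Item("a");
        items.add(item);
        item = new Item("b"); // the first Item is still referenced by the list
        items.add(item);
        return items;
    }

    public static void main(String[] args) {
        System.out.println(separateVariables().size()); // 2
        System.out.println(reusedVariable().size());    // 2
    }
}
```

Both methods end up with the same two objects in the list; only the number of local variable slots differs.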
Write proper code instead. As you've been told, "a constructor shouldn't have side effects, and shouldn't escape this".
I'd suggest something like this:
class Main { // class names are always UpperCamelCase
    private List<Foo> bar = new ArrayList<>(); // "private"
    public static void main(String[] args) {
        new Main().go(); // leave the static context ASAP
    }
    private void go() {
        bar.add(new Foo(params));
        bar.add(new Foo(differentParams));
        bar.add(new Foo(....));
    }
}
No magic there. The constructor constructs and does nothing else.
Everything gets added explicitly... this may get verbose, but then you'd just write a method encapsulating the common operations.
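Such a helper method might look like the sketch below. `Widget`, `WidgetApp` and `addWidget` are made-up names for illustration; the only point is that creating and registering happen together in one explicit place, outside the constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Foo: the constructor only stores its argument.
class Widget {
    final String name;
    Widget(String name) { this.name = name; }
}

class WidgetApp {
    private final List<Widget> widgets = new ArrayList<>();

    // Bundles "create, then register" in one place.
    Widget addWidget(String name) {
        Widget widget = new Widget(name);
        widgets.add(widget);
        return widget;
    }

    int count() { return widgets.size(); }

    public static void main(String[] args) {
        WidgetApp app = new WidgetApp();
        app.addWidget("first");
        app.addWidget("second");
        System.out.println(app.count()); // 2
    }
}
```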
I want to modify a list of already created objects in a stream. I came up with three approaches that may do that, but I am not sure about their performance and possible downsides.
Return same object - no time wasted creating a new object, but the object is mutable
Create new object - the parameter is not modified, but for a huge object creation is time consuming
Modify parameter - can only use forEach, no parallel usage
The code is below, with explanatory comments.
public class Test {
public static void main(String[] args) {
//Already created objects
List<Foo> foos0 = Arrays.asList(new Foo("A"));
//However I need to apply some modification on them, that is dependent on themselves
//1. Returning same object
List<Foo> foos1 = foos0.stream().map(Test::modifyValueByReturningSameObject).collect(Collectors.toList());
//2. Creating new object
List<Foo> foos2 = foos0.stream().map(Test::modifyValueByCreatingNewObject).collect(Collectors.toList());
//3. Modifying param
foos0.stream().forEach(Test::modifyValueByModifyingParam);
}
//Lets imagine that all methods below are somehow dependent on param Foo
static Foo modifyValueByReturningSameObject(Foo foo) {
foo.setValue("fieldValueDependentOnParamFoo");
return foo;
}
static Foo modifyValueByCreatingNewObject(Foo foo) {
Foo newFoo = new Foo("fieldValueDependentOnParamFoo");
return newFoo;
}
static void modifyValueByModifyingParam(Foo foo) {
foo.setValue("fieldValueDependentOnParamFoo");
return;
}
}
public class Foo {
public String value;
public Foo(String value) {
this.value = value;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
}
So the question is: which is the most stream-like approach?
EDIT:
By stream approach I mean the one with the most advantages in terms of performance.
EDIT2:
1. Which is functional approach?
2. Which is best in sense of performance?
The Javadoc states that streams should avoid side effects:
Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards.
So, you should prefer the solution where you create new objects instead of modifying existing ones.
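A minimal sketch of that side-effect-free style, assuming a small immutable value class. `Label`, `withSuffix` and `addSuffix` are hypothetical names and not part of the question's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamCopyDemo {
    // Hypothetical immutable value class; not the Foo from the question.
    static final class Label {
        private final String value;
        Label(String value) { this.value = value; }
        String getValue() { return value; }
        // Returns a modified copy instead of mutating this instance.
        Label withSuffix(String suffix) { return new Label(value + suffix); }
    }

    static List<Label> addSuffix(List<Label> labels, String suffix) {
        return labels.stream()
                .map(label -> label.withSuffix(suffix)) // no side effects
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Label> original = Arrays.asList(new Label("A"), new Label("B"));
        List<Label> changed = addSuffix(original, "!");
        System.out.println(original.get(0).getValue()); // A
        System.out.println(changed.get(0).getValue());  // A!
    }
}
```

Because the mapping function touches no shared state, the same pipeline could be switched to a parallel stream without further changes.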
In your case the different approaches will most likely make no difference in performance.
Reason: optimization. The JIT may not actually allocate the new objects and may access fields directly. It might (and will, if analysis suggests it) even skip a whole chain of calls and replace it with a precalculated value. The Java runtime even uses a profiler to find and optimize hotspots...
Also: regarding performance, it is in general (particular cases may differ) more important to create a simple structure and help the runtime make the right assumptions.
So if you hide what you are doing behind unnecessary manual "optimization" that conceals optimization possibilities from the runtime (lots of branches/decisions, unnecessary pinning, chains of "unknown" methods ...), you might end up with a slower result.
For clarity and to avoid side effects (see also the other answer), I would rather use the version that creates new instances.
What sort of optimizations would the Java runtime perform on the following snippet of code? The bytecode doesn't reveal any optimization; however, I feel that Java should take the last value of the for loop without running the entire loop, since String is a rudimentary Java class.
NOTE. this question was asked on a class test; however, I couldn't provide enough evidence to back my claim.
public class Main {
public static void main(String[] args) {
String str = null;
for (long i = 0; i < 10000000000L; i++) {
str = new String("T");
}
System.out.println(str);
}
}
While I can't speak to exactly what the JIT compiler is doing, the optimization you are asking it to do (to determine that it is safe to skip the loop body entirely) is actually extremely difficult, and so I highly doubt it is done. This is true regardless of String being a "rudimentary Java class".
To understand better, first let's assume that instead of String, we are creating instances of an arbitrary class Foo. It would only be safe to skip the creation of all those Foo objects if we knew two things: that calling new Foo() didn't have any observable side effects; and that no references to Foo "escaped" the loop body.
An observable side effect would be something like setting the value of a static member (e.g. if the Foo class kept a static count of all the times Foo() had been called). An example of a reference escaping would be if the this variable inside of Foo() was passed somewhere else.
Note that it isn't enough to just look at Foo(), you need to look at Foo's superclass' constructor (and all the way up the chain to Object). And then you need to look at all the code that gets executed upon initialization of each of those objects. And then look at all the code that gets called by that code. That would be a tremendous amount of analysis to do "just-in-time".
public class Foo extends Bazz{
static int count = 0;
public Foo(){
// Implicit call to Bazz() has side effect
count++; // side effect
Bazz.onNewFoo(this); // reference escaping
}
Bazz bazz = new Bazz(); // side effect
{
Bazz.onNewBazz(this.bazz); // reference escaping
}
}
class Bazz{
static int count = 0;
static List<Foo> fooList = new LinkedList<>();
static List<Bazz> bazzList = new LinkedList<>();
static void onNewFoo(Foo foo){
fooList.add(foo);
}
static void onNewBazz(Bazz bazz){
bazzList.add(bazz);
}
public Bazz(){
count++;
}
}
You might think we should just let javac do this analysis and optimization for us. The problem is that there is no way to guarantee that the version of Foo() that was on the classpath at compile time will be the same as the one on the classpath at run time. (Which is a very valuable feature of Java: it allows me to move my application from Glassfish to Tomcat without recompiling.) So we can't trust analysis done at compile time.
Finally, realize that String is no different from Foo. We'd still need to run that analysis, and there is no way to do that analysis in advance (which is why I can upgrade my JRE without recompiling my apps).
I had some confusion about inner classes and lambda expressions, and I tried to ask a question about that, but then another doubt arose, and it's probably better to post another question than to comment on the previous one.
Straight to the point: I know (thank you Jon) that something like this won't compile
public class Main {
public static void main(String[] args) {
One one = new One();
F f = new F(){ //1
public void foo(){one.bar();} //compilation error
};
one = new One();
}
}
class One { void bar() {} }
interface F { void foo(); }
due to how Java manages closures, because one is not [effectively] final and so on.
But then, how come this is allowed?
public class Main {
public static void main(String[] args) {
One one = new One();
F f = one::bar; //2
one = new One();
}
}
class One { void bar() {} }
interface F { void foo(); }
Is not //2 equivalent to //1? Am I not, in the second case, facing the risks of "working with an out-of-date variable"?
I mean, in the latter case, after one = new One(); is executed, f still has an out-of-date copy of one (i.e. it references the old object). Isn't this the kind of ambiguity we're trying to avoid?
A method reference is not a lambda expression, although they can be used in the same way. I think that is what is causing the confusion. Below is a simplification of how Java works; it is not how it really works, but it is close enough.
Say we have a lambda expression:
Runnable f = () -> one.bar();
This is the equivalent of an anonymous class that implements Runnable:
Runnable f = new Runnable() {
public void run() {
one.bar();
}
};
Here the same rules apply as for an anonymous class (or method local class). This means that one needs to be effectively final for it to work.
On the other hand, the method reference:
Runnable f = one::bar;
Is more like:
Runnable f = new MethodHandle(one, one.getClass().getMethod("bar"));
With MethodHandle being:
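public class MethodHandle implements Runnable {
    private final Object object;
    private final java.lang.reflect.Method method;
    public MethodHandle(Object object, java.lang.reflect.Method method) {
        this.object = object; // the object itself is stored, not the variable
        this.method = method;
    }
    @Override
    public void run() {
        try {
            method.invoke(object);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}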
In this case, the object assigned to one is assigned as part of the method handle created, so one itself doesn't need to be effectively final for this to work.
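This behaviour can be checked with a small experiment. The sketch below uses a hypothetical class `Named` (a variant of `One` that remembers a name) so that the result shows which object the method reference is bound to.

```java
import java.util.function.Supplier;

class BoundReferenceDemo {
    // Hypothetical variant of One that remembers a name.
    static class Named {
        final String name;
        Named(String name) { this.name = name; }
        String describe() { return name; }
    }

    static String callAfterReassignment() {
        Named one = new Named("first");
        Supplier<String> f = one::describe; // the receiver is evaluated here
        one = new Named("second");          // legal: 'one' need not be effectively final
        return f.get();                     // still calls the first object
    }

    public static void main(String[] args) {
        System.out.println(callAfterReassignment()); // first
    }
}
```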
Your second example is simply not a lambda expression. It's a method reference. In this particular case, it chooses a method from a particular object, which is currently referenced by the variable one. But the reference is to the object, not to the variable one.
This is the same as the classical Java case:
One one = new One();
One two = one;
one = new One();
two.bar();
So what if one changed? two references the object that one used to be, and can access its method.
Your first example, on the other hand, is an anonymous class, which is a classical Java structure that can refer to local variables around it. The code refers to the actual variable one, not the object to which it refers. This is restricted for the reasons that Jon mentioned in the answer you referred to. Note that the change in Java 8 is merely that the variable has to be effectively final. That is, it still can't be changed after initialization. The compiler simply became sophisticated enough to determine which cases will not be confusing even when the final modifier is not explicitly used.
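If the surrounding code really needs to reassign the variable, a common workaround is to copy the current reference into a second variable that is never reassigned and use that one in the lambda or anonymous class. A minimal sketch, with `Named` and `captured` as hypothetical names:

```java
import java.util.function.Supplier;

class EffectivelyFinalDemo {
    // Hypothetical variant of One that remembers a name.
    static class Named {
        final String name;
        Named(String name) { this.name = name; }
        String describe() { return name; }
    }

    static String run() {
        Named one = new Named("first");
        Named captured = one; // never reassigned, so effectively final
        Supplier<String> f = () -> captured.describe();
        one = new Named("second"); // allowed: the lambda does not mention 'one'
        return f.get() + "/" + one.describe();
    }

    public static void main(String[] args) {
        System.out.println(run()); // first/second
    }
}
```

This makes the "snapshot" explicit in the source, which is exactly what the method reference does implicitly.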
The consensus appears to be that this is because when you do it using an anonymous class, one refers to a variable, whereas when you do it using a method reference, the value of one is captured when the method handle is created. In fact, I think that in both cases one is a value rather than a variable. Let's consider anonymous classes, lambda expressions and method references in a bit more detail.
Anonymous classes
Consider the following example:
static Supplier<String> getStringSupplier() {
final Object o = new Object();
return new Supplier<String>() {
@Override
public String get() {
return o.toString();
}
};
}
public static void main(String[] args) {
Supplier<String> supplier = getStringSupplier();
System.out.println(supplier.get()); // Use o after the getStringSupplier method returned.
}
In this example, we are calling toString on o after the method getStringSupplier has returned, so when it appears in the get method, o cannot refer to a local variable of the getStringSupplier method. In fact it is essentially equivalent to this:
static Supplier<String> getStringSupplier() {
final Object o = new Object();
return new StringSupplier(o);
}
private static class StringSupplier implements Supplier<String> {
private final Object o;
StringSupplier(Object o) {
this.o = o;
}
@Override
public String get() {
return o.toString();
}
}
Anonymous classes make it look as if you are using local variables, when in fact the values of these variables are captured.
In contrast to this, if a method of an anonymous class references the fields of the enclosing instance, the values of these fields are not captured, and the instance of the anonymous class does not hold references to them; instead the anonymous class holds a reference to the enclosing instance and can access its fields (either directly or via synthetic accessors, depending on the visibility). One advantage is that an extra reference to just one object, rather than several, is required.
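The difference is observable: a field read through the enclosing instance reflects later changes. The following sketch uses hypothetical names (`EnclosingFieldDemo`, `counter`) to show it.

```java
import java.util.function.Supplier;

class EnclosingFieldDemo {
    private int counter = 0; // not final, yet usable from the anonymous class

    Supplier<Integer> counterSupplier() {
        return new Supplier<Integer>() {
            @Override
            public Integer get() {
                return counter; // read through the enclosing instance on each call
            }
        };
    }

    void increment() { counter++; }

    public static void main(String[] args) {
        EnclosingFieldDemo demo = new EnclosingFieldDemo();
        Supplier<Integer> supplier = demo.counterSupplier();
        System.out.println(supplier.get()); // 0
        demo.increment();
        System.out.println(supplier.get()); // 1
    }
}
```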
Lambda expressions
Lambda expressions also close over values, not variables. The reason given by Brian Goetz here is that
idioms like this:
int sum = 0;
list.forEach(e -> { sum += e.size(); }); // ERROR
are fundamentally serial; it is quite difficult to write lambda bodies
like this that do not have race conditions. Unless we are willing to
enforce -- preferably at compile time -- that such a function cannot
escape its capturing thread, this feature may well cause more trouble
than it solves.
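The usual way to express such a sum without mutating a captured local is a reduction. A minimal sketch, assuming the list elements are themselves lists (the names `SumDemo` and `totalSize` are made up here):

```java
import java.util.Arrays;
import java.util.List;

class SumDemo {
    // Sums the sizes without touching any variable outside the pipeline.
    static int totalSize(List<List<String>> lists) {
        return lists.stream()
                .mapToInt(List::size)
                .sum();
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(totalSize(lists)); // 3
    }
}
```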
Method references
The fact that method references capture the value of the variable when the method handle is created is easy to check.
For example, the following code prints "a" twice:
String s = "a";
Supplier<String> supplier = s::toString;
System.out.println(supplier.get());
s = "b";
System.out.println(supplier.get());
Summary
So in summary, lambda expressions and method references close over values, not variables. Anonymous classes also close over values in the case of local variables. In the case of fields, the situation is different: the anonymous class holds a reference to the enclosing instance and reads the field through it, so the fields do not have to be final and later changes to them are visible.
In view of this, the question is, why do the rules that apply to anonymous classes and lambda expressions not apply to method references, i.e. why are you allowed to write o::toString when o is not effectively final? I do not know the answer to that, but it does seem to me to be an inconsistency. I guess it's because you can't do as much harm with a method reference; examples like the one quoted above for lambda expressions do not apply.
No. In your first example you define the implementation of F inline and try to access the local variable one.
In the second example you basically define your functional object to be the call of bar() on the object one.
Now this might be a bit confusing. The benefit of this notation is that you can define a method (most of the time it is a static method or in a static context) once and then reference the same method wherever a matching functional interface is expected:
Consumer<String> printer = System.out::println; // same as msg -> System.out.println(msg)
Can you explain the difference between:
public class Test {
public static final Person p;
static {
p = new Person();
p.setName("Josh");
}
}
and
public class Test {
public static final Person p = initPerson();
private static Person initPerson() {
Person p = new Person();
p.setName("Josh");
return p;
}
}
I have always used the second one, but is there any difference from a static initializer block?
There are of course technical differences (you could invoke the static method multiple times within your class if you wanted, you could invoke it via reflection, etc) but, assuming you don't do any of that trickery, you're right -- the two approaches are effectively identical.
I also prefer the method-based approach, since it gives a nice name to the block of code. But it's almost entirely a stylistic choice.
As Marko points out, the method-based approach also serves to separate the two concerns of creating the Person and assigning it to the static variable. With the static block, those two things are combined, which can hurt readability if the block is non-trivial. But with the method approach, the method is responsible solely for creating the object, and the static variable's initialization is responsible solely for taking that method's result and assigning it to the variable.
Taking this a bit further: if I have two static fields, and one depends on the other, then I'll declare two methods, and have the second method take the first variable as an explicit argument. I like to keep my static initialization methods entirely free of state, which makes it much easier to reason about which one should happen when (and what variables it assumes have already been created).
So, something like:
public class Test {
public static final Person p = initPerson();
public static final String pAddress = lookupAddress(p);
/* implementations of initPerson and lookupAddress omitted */
}
It's very clear from looking at that, that (a) you don't need pAddress to initialize p, and (b) you do need p to initialize pAddress. In fact, the compiler would give you a compilation error ("illegal forward reference") if you tried them in reverse order and your static fields were non-final:
public static String pAddress = lookupAddress(p); // ERROR
public static Person p = initPerson();
You would lose that clarity and safety with static blocks. This compiles just fine:
static {
pAddress = p.findAddressSomehow();
p = new Person();
}
... but it'll fail at run time, since at p.findAddressSomehow(), p has its default value of null.
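The default-value behaviour can be observed without crashing. The sketch below is an illustration with hypothetical names (`StaticOrderDemo`, `sawNull`), and it uses a `StringBuilder` in place of `Person` to stay self-contained.

```java
class StaticOrderDemo {
    static String address;        // non-final, as in the example above
    static StringBuilder person;  // stand-in for Person
    static boolean sawNull;

    static {
        // 'person' still holds its default value at this point.
        sawNull = (person == null);
        person = new StringBuilder("Josh");
        address = person + "'s address"; // safe: person has been assigned by now
    }

    public static void main(String[] args) {
        System.out.println(sawNull); // true
        System.out.println(address); // Josh's address
    }
}
```

Nothing in the block's syntax tells the reader which of the two assignments must come first; with one initializer method per field, the method parameters do.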
A static method (second example) is executed every time you call it. A static init block (first example) is only called once on initializing the class.
http://docs.oracle.com/javase/tutorial/java/javaOO/initial.html
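The difference between "once per class initialization" and "on every call" can be made visible with two counters. This is only a sketch; `Config`, `initRuns` and `helperCalls` are made-up names.

```java
class InitOnceDemo {
    static class Config {
        static int initRuns = 0;
        static final String NAME;

        static {
            initRuns++;   // executed once, when Config is initialized
            NAME = "Josh";
        }

        static int helperCalls = 0;

        static String helper() { // executed on every call
            helperCalls++;
            return NAME;
        }
    }

    public static void main(String[] args) {
        Config.helper();
        Config.helper();
        new Config(); // creating instances does not rerun the static block
        System.out.println(Config.initRuns);    // 1
        System.out.println(Config.helperCalls); // 2
    }
}
```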
The advantage of private static methods is that they can be reused later if you need to reinitialize the class variable.
This does not apply to final variables, because a final variable can only be assigned once.
initPerson has to be called explicitly (here the field initializer does that), whereas the static block runs automatically when the Test class is initialized.
The static before a method means that you can call it on the class name itself. For example, if initPerson were not private and you wanted to create a Person object outside the class, you could write
Person p = Test.initPerson();
However, neither approach has a real advantage over the other, as you can access the object p outside the class in both cases.
public class Test {
int value = 100;
public Test() {
}
}
And
public class Test {
int value;
public Test() {
value = 100;
}
}
Are equivalent, right? Is there a reason why I'd prefer to do one over the other? Obviously if the constructor takes parameters that are later given to the fields is a reason:
public class Test {
int value;
public Test(int value) {
this.value = value;
}
}
Or perhaps I need to do some special calculation.
But if I don't do that, is there another good reason?
Well, it all really depends on how you plan on using this. I'm going to assume that you don't plan to make value static and that it's just there for internal purposes.
First, let's look at the bytecode.
D:\eclipse\workspace\asdf\bin>javap -c A.class
Compiled from "A.java"
public class A {
int value;
public A();
Code:
0: aload_0
1: invokespecial #10 // Method java/lang/Object."<init>":()V
4: aload_0
5: bipush 100
7: putfield #12 // Field value:I
10: return
}
D:\eclipse\workspace\asdf\bin>javap -c B.class
Compiled from "B.java"
public class B {
int value;
public B();
Code:
0: aload_0
1: invokespecial #10 // Method java/lang/Object."<init>":()V
4: aload_0
5: bipush 100
7: putfield #12 // Field value:I
10: return
}
D:\eclipse\workspace\asdf\bin>
Guess what? Exactly the same! Why? Because you can't USE value until you make an object by using the new keyword.
The Oracle docs state that:
As you have seen, you can often provide an initial value for a field
in its declaration:
public class BedAndBreakfast {
// initialize to 10
public static int capacity = 10;
// initialize to false
private boolean full = false;
}
This works well when the initialization value is available and the initialization can be put on
one line. However, this form of initialization has limitations because
of its simplicity. If initialization requires some logic (for example,
error handling or a for loop to fill a complex array), simple
assignment is inadequate. Instance variables can be initialized in
constructors, where error handling or other logic can be used. To
provide the same capability for class variables, the Java programming
language includes static initialization blocks.
So now you have confirmation that the whole point of doing it in the constructor is when you are doing something complex, like initializing an array. Otherwise, feel free to do it right there where you declare the field.
If you WERE to use static then you are obviously doing two different things. It's almost like a check to see if someone has ever created an instance of this object or not. Your variable would be 0 until someone creates an object and then it would be 100 afterward.
Field initialization code is copied into each constructor... if you had multiple constructors and wanted the field initialized with the same value in each (or even just most) then it would be better to initialize at declaration and override the value in the constructor.
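A minimal sketch of that advice, with a hypothetical class `DefaultsDemo`: the default is written once at the declaration, and only the constructor that needs a different value overrides it.

```java
class DefaultsDemo {
    int value = 100; // applied no matter which constructor is used

    DefaultsDemo() { }

    DefaultsDemo(int value) {
        this.value = value; // runs after the field initializer and overrides it
    }

    public static void main(String[] args) {
        System.out.println(new DefaultsDemo().value);  // 100
        System.out.println(new DefaultsDemo(7).value); // 7
    }
}
```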
Well, it depends.
With the second case, value would be populated with its default value of 0, only to be reassigned at instantiation with 100. In the first case, value is just instantly given the value of 100.
Semantically, this would help a programmer - they would see that this particular value means something a little more than just it being arbitrary (although, it should be a constant value somewhere).
Programmatically, there's no pain if a primitive is set to some initial value. It means that there's something in there for you to use, and if your program depends on there being a non-negative or false value, by George it will work.
Things get more explicit when dealing with object references. Take, for instance, these two classes:
public class Foo {
List<String> elements;
public Foo() {
}
public Foo(String... items) {
elements = new ArrayList<>();
for(String item : items) {
elements.add(item);
}
}
}
public class Bar {
List<String> elements = new ArrayList<>();
public Bar() {
}
public Bar(String... items) {
for(String item : items) {
elements.add(item);
}
}
}
There are intentionally no-arg constructors to hammer home the point - for Foo, if I attempt to use elements, then I'm in a bit of trouble if I don't use the appropriate constructor - elements is null!* I could then just instantiate it whenever I needed it, but I would very much want to avoid destroying a potentially newed and populated list.
That means a lot of code looking something like this:
if(elements == null) {
elements = new ArrayList<>();
}
...then I have to worry about it being thread safe. Sheesh, talk about a hassle.
With Bar, I'm guaranteed that at instantiation, there is an instance of a list in elements, so I don't have to worry about it being null.**
This is known as eager instantiation. You really don't want to live without that object, so why wait until you think you need it (or lazily instantiate)?
*: The default value for all reference types is null.
**: You do have to worry about that being overwritten, but that's a concern outside of the scope of this question.
public class Test {
int value = 100;
public Test() {
}
}
This works well when the initialization value is available and you can declare and initialize the field on one line. However, this form of initialization has limitations because of its simplicity. If initialization requires some logic (for example, error handling, validation or a condition), simple assignment is inadequate. When you use constructor initialization, you can do error handling or other logic. To provide the same capability for class variables, the Java programming language includes static initialization blocks. There are also two other ways to initialize instance variables:
initializer blocks
{
// initialization
}
final methods
class Foo {
    int age = initAge();
    protected final int initAge() {
        // initialization code
    }
}
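To see how these mechanisms interact, the sketch below records the order in which a field initializer that calls a final method, an initializer block and the constructor body run. `InitOrderDemo`, the `log` field and the return value 42 are assumptions made for the illustration.

```java
class InitOrderDemo {
    final StringBuilder log = new StringBuilder();

    int age = initAge();                 // field initializer calling a final method

    {
        log.append("block;");            // instance initializer block
    }

    InitOrderDemo() {
        log.append("constructor;");      // constructor body runs last
    }

    // final, so a subclass cannot override it and observe a half-built object
    protected final int initAge() {
        log.append("initAge;");
        return 42; // hypothetical value
    }

    public static void main(String[] args) {
        InitOrderDemo demo = new InitOrderDemo();
        System.out.println(demo.log); // initAge;block;constructor;
        System.out.println(demo.age); // 42
    }
}
```

Initializers and initializer blocks run in textual order, which is why `log` is declared first here.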
If you're not doing any calculation and not taking any parameters, there's no difference between the two versions, whether or not you initialize the variable inside the constructor.
If you declare it as in the first version:
public class Test {
int value = 100;
public Test() {
}
}
It is the more readable format, as you assign the value directly and there is no need to look into the constructor.
It is also better if you have more than one constructor: you don't have to repeat the initializations (and you cannot forget them).
Whenever an object is created, memory for its fields is allocated first and filled with default values; only then do the field initializers and the constructor body run.
Currently in your example, there is just one field and you are deciding which way of initializing is better than the other.
But, if you increase the complexity by initializing a lot of fields (say 30 or 40), then it does make a lot of difference.
In this situation, consider what Joshua Bloch has to say on initializing through constructors.
Here is a summary:
The telescoping constructor pattern works, but it is hard to write
client code when there are many parameters, and harder still to read
it.
The solution is a form of Builder pattern where instead of making
the desired object directly, the client calls a constructor (or
static factory) with all of the required parameters and gets a
builder object.
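A compact sketch of such a builder follows. The class `Account` and its fields are invented for illustration and are not taken from Bloch's book; the pattern is what matters: required parameters go into the builder's constructor, optional ones get named setter-like methods.

```java
class Account {
    private final String owner;    // required
    private final int limit;       // optional
    private final boolean premium; // optional

    private Account(Builder builder) {
        this.owner = builder.owner;
        this.limit = builder.limit;
        this.premium = builder.premium;
    }

    static class Builder {
        private final String owner;
        private int limit = 0;           // defaults live in one place
        private boolean premium = false;

        Builder(String owner) { this.owner = owner; }

        Builder limit(int limit) { this.limit = limit; return this; }

        Builder premium(boolean premium) { this.premium = premium; return this; }

        Account build() { return new Account(this); }
    }

    String describe() { return owner + ":" + limit + ":" + premium; }

    public static void main(String[] args) {
        Account account = new Account.Builder("Josh").limit(500).build();
        System.out.println(account.describe()); // Josh:500:false
    }
}
```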
I'm not talking about bytecode, but the two can differ semantically (if you have multiple constructors).
The field will always be initialized to 100 if you define it as below, no matter which constructor is called:
int field = 100;
Otherwise you have to initialize the field in each constructor.
Your class may have just a single constructor now, but think about it: will there be another constructor in future releases of your class?