How to set and get static variables from spark? - java

I have a class as this:
public class Test {
    private static String name;
    public static String getName() {
        return name;
    }
    public static void setName(String name) {
        Test.name = name;
    }
    public static void print() {
        System.out.println(name);
    }
}
Inside my Spark driver, I'm setting the name like this and calling the print() command:
public final class TestDriver {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("TestApp");
        // ...
        // ...
        Test.setName("TestName");
        Test.print();
        // ...
    }
}
However, I'm getting a NullPointerException. How do I pass a value to the global variable and use it?

OK, there are basically two ways to get a value known to the driver over to the executors:
Put the value inside a closure, which is serialized and shipped to the executors along with the task. This is the most common approach and is simple and elegant. Sample and doc here.
Create a broadcast variable with the data. This is good for large immutable data that you want to be sent only once, and also when the same data is used over and over. Sample and doc here.
There is no need for static variables in either case. But if you DO want static values available in your executor JVMs, you need to do one of these:
If the values are fixed, or the configuration is available on the executor nodes (it lives inside the jar, etc.), you can use a lazy val, which guarantees initialization only once.
You can call mapPartitions() with code that uses one of the two options above, then store the values in your static variable/object. mapPartitions is guaranteed to run only once per partition (much better than once per line), which makes it a good fit for this kind of thing (initializing DB connections, etc.).
Hope this helps!
P.S.: As for your exception: I just don't see it in that code sample; my bet is that it is occurring elsewhere.
Edit for extra clarification: The lazy val solution is simply Scala, no Spark involved...
object MyStaticObject
{
    lazy val MyStaticValue = {
        // Call a database, read a file included in the Jar, do expensive initialization computation, etc
        4
    }
}
Since each Executor corresponds to a JVM, once the classes are loaded MyStaticObject will be initialized. The lazy keyword guarantees that the MyStaticValue variable will only be initialized the first time it is actually requested, and hold its value ever since.
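For readers working in Java rather than Scala, a rough counterpart of the lazy val above is the initialization-on-demand holder idiom. This is only a sketch: the class layout, the INIT_COUNT counter and the computeValue() helper are assumptions added for illustration, and no Spark is involved.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java counterpart of the Scala object above; all names are illustrative.
final class MyStaticObject {
    // Counts how often the expensive initializer ran (for demonstration only).
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    private MyStaticObject() {}

    // The nested class is not initialized until getMyStaticValue() is first called.
    private static final class Holder {
        static final int VALUE = computeValue();
    }

    private static int computeValue() {
        // Stand-in for: call a database, read a file bundled in the jar, etc.
        INIT_COUNT.incrementAndGet();
        return 4;
    }

    static int getMyStaticValue() {
        return Holder.VALUE;
    }

    public static void main(String[] args) {
        System.out.println(getMyStaticValue()); // prints 4
        System.out.println(getMyStaticValue()); // prints 4 again, without recomputing
        System.out.println(INIT_COUNT.get());   // prints 1
    }
}
```

Each JVM that loads the class runs the initializer at most once, on first use, which is the same per-executor behaviour described for the Scala version.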

The copy of your class in your driver process isn't the copy in your executors. They aren't in the same ClassLoader, or even the same JVM, or even on the same machine. Setting a static variable on the driver does nothing to the other copies, hence you find it null remotely.

I would like to add one more approach. It only makes sense if you have a few variables that can be passed at runtime as arguments.
Spark configuration: --conf "spark.executor.extraJavaOptions=-DcutomField=${value}"
Then, when you need the data in a transformation, you can call System.getProperty("cutomField");
You can find more details here.
Note: this approach does not make sense when there is a significant number of variables. In those cases, I would prefer @Daniel Langdon's approaches.
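A minimal sketch of the reading side, assuming the property name from the answer above (spelled exactly as it is there). The class name ExecutorProperty and the "unset" fallback are hypothetical; in a real job the lookup would sit inside a transformation running on the executor.

```java
// Illustrative only: ExecutorProperty and the "unset" fallback are assumptions.
class ExecutorProperty {
    static String customField() {
        // Returns the fallback when the JVM was started without -DcutomField=...
        return System.getProperty("cutomField", "unset");
    }

    public static void main(String[] args) {
        System.out.println(customField());
    }
}
```

Using the two-argument getProperty avoids a null when the option was not passed to that JVM.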

I would like to add one more point to DanielL's answer.
When you declare a variable with the static keyword, the JVM initializes it during class loading. So if you build a jar, the initial values of static fields set in a Java/Scala class are stored in the jar, and workers can use them directly. However, if you change the value of a static field in the driver program, workers still only see the initial value compiled into the jar; your changed value is not reflected. To change it, you would need to redistribute a new jar or copy the class manually to all executors.

Related

How do I remember method so I can use it back when server restarts?

I am writing a Raspberry Pi program that executes tasks at a given time. I wrote a TaskManager that keeps all tasks in a synchronized Map (awaitingTasks) and manages them. One of its methods is
addInTimeTask(...)
public static int addInTimeTask(Callable task, DateTime time) {
    synchronized (awaitingTasks) {
        final int id = assignNewId();
        awaitingTasks.put(id, scheduledThreadPool.schedule(new CallableTask(task, new CallableTask.MyCallback() {
            @Override
            public void onFinish() {
                awaitingTasks.remove(id);
            }
        }), TimeDiffCalc.secToDate(time), TimeUnit.SECONDS));
        return id;
    }
}
As you can see, a task (I am thinking of making it a class if it gets more attributes) has its own ID, a date, and the method that it executes.
I want to handle the situation where the server restarts and all scheduled tasks simply disappear. I was thinking about storing tasks in a database. I can store the task ID and date, but how do I determine which method a given task should execute?
I like the flexibility of this approach because I can make any method executable at a given time.
For example, here is a method from the RGBLed class (which has multiple methods that can be scheduled this way):
public int lightLed(final LedColor color, DateTime dateTime) {
    return TaskManager.addInTimeTask(new Callable<Void>() {
        public Void call() throws Exception {
            // here is the code that lights the LED
            return null;
        }
    }, dateTime);
}
What came to mind was to assign an ID to every method and then look up the method by ID, but I don't think that is possible.
I bet there have been many questions about similar problems, but I simply cannot find them. I could not phrase the question properly (so please edit it).
Thanks!
You are facing two problems. The one that you describe can be fixed "easily": you know that you want to call specific methods.
Methods have names. Names are ... strings. So you could simply store that name as a string, and when you have some object in front of you, you can use Java reflection to invoke that particular method.
The other problem is that persisting your objects might not be that easy. If I read your examples right, you are mostly dealing with anonymous inner classes. And yes, objects of such classes can be serialized too, but not as "easily" or "without thought" as objects of normal classes (see here for example).
So, my suggestions:
Don't use inner classes; use ordinary classes instead (although that might affect the "layout" of existing code to a great degree), and serialize objects of those classes.
Together with the serialized object, remember the method name (and probably the arguments you need) so you can call the method by name.
It would probably make sense to create a class specifically for that purpose, containing two fields: the actual object to be serialized and the name of the method to call on it.
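The "store the name, invoke by reflection" idea can be sketched roughly as follows. Everything here is hypothetical: StoredTask, LedActions and lightRed are illustrative names, the two strings stand in for what would be written to the database, and the sketch only covers no-argument methods on classes with a no-argument constructor.

```java
import java.lang.reflect.Method;

// What would be persisted: the target class name and the method name, both plain strings.
class StoredTask {
    final String className;
    final String methodName;

    StoredTask(String className, String methodName) {
        this.className = className;
        this.methodName = methodName;
    }

    // Recreates the target object and invokes the named no-argument method on it.
    Object run() {
        try {
            Class<?> type = Class.forName(className);
            Object target = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName);
            method.invoke(target);
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot run " + className + "#" + methodName, e);
        }
    }

    public static void main(String[] args) {
        StoredTask task = new StoredTask(LedActions.class.getName(), "lightRed");
        LedActions target = (LedActions) task.run();
        System.out.println(target.calls); // prints 1
    }
}

// Hypothetical class with methods that can be scheduled.
class LedActions {
    int calls = 0;

    public void lightRed() {
        calls++; // stand-in for the code that actually drives the LED
    }
}
```

After a restart, the stored strings are enough to rebuild the task; method arguments, if any, would have to be persisted alongside them.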

How to block/bulk reset static variables using Java 7

I am parsing a text file looking for syntax matches. To complete this task, I created a Variable class with static variables in it. Then I parse a file and assign the parsed information to the Variable class's static fields.
Variable.name = "the parsed information";
Then I created a BaseLine class to compare against the Variable fields and check whether a certain condition is met.
For example:
if (BaseLine.x.equals(Variable.x)) // do whatever.
Variable Class:
public class Variable {
    static String name;
    static String userID;
    static Integer age;
}
BaseLine Class:
public class BaseLine {
    static String name;
    static String userID;
    static Integer age;
}
Using JavaFX, I move between scenes to load a file, parse it, assign the parsed variables, and then compare them with my BaseLine class. Everything works as expected until I hit the back button to return to the original scene and load a new file. The issue is: how can I reset the variables in my Variable class in bulk, instead of one by one inside my controller's initialize method? I want to do this to ensure that I don't carry over any variable from the file I parsed before hitting the back button. What is the correct way to do this?
I was able to get what I wanted by resetting the variables inside my controller's initialize method, but that seems like a lengthy process: I have over 100 variables (ints/sets/strings...) to reset.
Here is what I did to reset the static variables inside the controller's initialize method.
@Override
public void initialize(URL url, ResourceBundle rb) {
    Variable.name = null;
    Variable.setName.clear();
    Variable.age = null;
"I was able to get what I am looking for when I reset the variables inside my initialize controller, but it seems to be a lengthy process to do for OOP I have over 100 variables (int/sets/strings...) to reset."
If you have hundreds of static variables, you are not doing OOP properly. In proper OO design, your application's state should be held in instance variables and accessed via instance methods. Static variables should be kept to an absolute minimum. (You can eliminate them entirely if you can use a dependency injection (DI) framework ...)
The bad news is that there is no >>good<< way to reset a large number of static variables. There are a couple of >>bad<< ways, e.g. reflection, or messing around with classloaders, but you would just be replacing one problem (clunky code) with a worse one (complex, fragile code). IMO.
But the good news is that if you fix your design / implementation to be properly OO, then you won't have this nasty problem of resetting the variables. And a whole bunch of other things will be easier too, like writing unit tests.
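As a rough illustration of keeping the parsed state in an instance instead of in static fields: the sketch below uses hypothetical names (ParserController, ParsedFile, reset()), with field names borrowed from the question. Resetting everything then amounts to replacing one object.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical controller that owns the parsed state instead of reading static fields.
class ParserController {
    private ParsedFile current = new ParsedFile();

    ParsedFile current() {
        return current;
    }

    // "Bulk reset": drop the old object and start with a fresh one.
    void reset() {
        current = new ParsedFile();
    }

    public static void main(String[] args) {
        ParserController controller = new ParserController();
        controller.current().name = "from the first file";
        controller.reset();
        System.out.println(controller.current().name); // prints null
    }
}

// Hypothetical replacement for the static-only Variable class.
class ParsedFile {
    String name;
    String userID;
    Integer age;
    final Set<String> names = new HashSet<>();
}
```

The old ParsedFile becomes unreachable after reset() and is left to the garbage collector, so no field has to be cleared by hand.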
I am not aware of a way to reset all static variables in one go. You would have to do it one by one.
I think your problem lies somewhere else, and I believe you should refactor your code here (if possible).
How about making all your variables standard (non-static) fields, setting and getting them as you normally would, and when you are done, just creating a new object and letting the old one be collected by the garbage collector?
Edit:
You could perhaps use reflection, though I am not entirely sure that would work.
Something like:
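Field[] fields = MyClass.class.getDeclaredFields();
for (Field field : fields) {
    // isRightName(...) is a placeholder for whatever filter selects the fields to reset
    if (Modifier.isStatic(field.getModifiers()) && isRightName(field.getName())) {
        field.setAccessible(true);
        // For a static field the instance argument is ignored, so pass null.
        // Setting null only works for non-final reference fields; set() throws the checked IllegalAccessException.
        field.set(null, null);
    }
}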

pattern for getting around final limitation of Java closure

I'm trying to write a very simple piece of code and can't figure out an elegant solution to do it:
int count = 0;
jdbcTemplate.query(readQuery, new RowCallbackHandler() {
    @Override
    public void processRow(ResultSet rs) throws SQLException {
        realProcessRow(rs);
        count++;
    }
});
This obviously doesn't compile. The two solutions that I'm aware of both stink:
I don't want to make count a class field, because it's really a local variable that I just need for logging purposes.
I don't want to make count an array, because that is plain ugly.
This is just silly; there has got to be a reasonable way to do it?
A third possibility is to use a final-mutable-int-object, for example:
final AtomicInteger count = new AtomicInteger(0);
....
count.incrementAndGet();
Apache Commons Lang also has a MutableInt class, I believe, but I have not used it.
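A self-contained sketch of the AtomicInteger approach follows. To keep it runnable without Spring, RowHandler and query(...) are hypothetical stand-ins for RowCallbackHandler and jdbcTemplate.query(...); only the counting pattern is the point.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RowCounter {
    // Hypothetical stand-in for Spring's RowCallbackHandler.
    interface RowHandler {
        void processRow(String row);
    }

    // Hypothetical stand-in for jdbcTemplate.query(...): calls the handler once per row.
    static void query(List<String> rows, RowHandler handler) {
        for (String row : rows) {
            handler.processRow(row);
        }
    }

    static int countRows(List<String> rows) {
        // The reference is final, but the integer it holds can still be incremented.
        final AtomicInteger count = new AtomicInteger(0);
        query(rows, new RowHandler() {
            @Override
            public void processRow(String row) {
                count.incrementAndGet();
            }
        });
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countRows(Arrays.asList("a", "b", "c"))); // prints 3
    }
}
```

The counter stays a local variable of the enclosing method, which addresses the objection to turning it into a class field.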
You seem to already be aware of the solutions (they are different though); and you are probably aware of the reasons (it cannot capture local variables by reference because the variable might not exist by the time the closure is run, so it must capture by value (have multiple copies); it is bad to have the same variable refer to different copies in different scopes that each can be changed independently, so they cannot be changed).
If your closure does not need to share state back to the enclosing scope, then a field in the class is the right thing to do. I don't understand what your objection is. If the closure needs to be able to be called multiple times and it needs to increment each time, then it needs to maintain state in the object. A field (instance variable) properly expresses the storing of state in an object. The field can be initialized with the captured value from the outside scope.
If your closure needs to share state back to the enclosing scope (which is not a very common situation), then using a mutable structure (like an array) is the right thing to do, because it avoids the problem of the lifetime of the local variable.
I typically make count a class field but add a comment that it is only a field because it is used by an inner closure, Runnable etc...

static variable lose its value

I have a helper class with this static variable, which is used for passing data between two classes.
public class Helper {
    public static String paramDriveMod; // this is the static variable in the first class
}
This variable is used in the following method of the second class:
public void USB_HandleMessage(char[] USB_RXBuffer) {
    int type = USB_RXBuffer[2];
    MESSAGES ms = MESSAGES.values()[type];
    switch (ms) {
        case READ_PARAMETER_VALUE: // read parameter values
            switch (prm) {
                case PARAMETER_DRIVE_MODE: // parameter drive mode
                    Helper.paramDriveMod = (Integer.toString(((USB_RXBuffer[4] << 8) & 0xff00)));
                    System.out.println(Helper.paramDriveMod + "drive mode is selectd ");
                    // here it shows the value that I need
            }
    }
    // let's say this is the end of the switch and the method
}
The following is a method of a third class that uses the method above:
public void buttonSwitch(int value) throws InterruptedException {
    boolean bool = true;
    int c = 0;
    int delay = (int) Math.random();
    while (bool) {
        int param = 3;
        PARAMETERS prm = PARAMETERS.values()[param];
        switch (value) {
            case 0:
                value = 1;
                while (c < 5) {
                    Thread.sleep(delay);
                    protocol.onSending(3, prm.PARAMETER_DRIVE_MODE.ordinal(), dataToRead, dataToRead.length); // read drive mode
                    System.out.println(Helper.paramDriveMod + " drive mode is ..........in wile loop"); // here it shows null value
                }
                // break; ?
        }
    }
    // let's say this is the end of the switch and the method
}
What is the reason that this variable loses its value?
Could I suggest that to pass data between classes, you use separate objects instead of a global variable?
It's not at all clear how you expect the code in protocolImpl to get executed - as templatetypedef mentions, you haven't shown valid Java code in either that or the param class (neither of which follows Java naming conventions).
A short but complete example would really help, but in general I would suggest you avoid using this pattern in the first place. Think in terms of objects, not global variables.
As I understand it, a class (not just an instance, but the entire class object) can be garbage collected just like any other unreferenced object; a static variable in that class won't prevent the GC from collecting your class.
I just came here because I think I'm seeing this behavior in a singleton and I wanted to see if anyone else had noticed it (I've never had to research the problem before, and this knowledge is about a decade old from the back of my brain, so I'm unsure of its reliability at this point).
Going to continue researching now.
Just found this question, check the accepted answer: it looks like it's unlikely that a static will be lost due to GC, but possible.
Are static fields open for garbage collection?
A variable never "loses" its value. Either you set it to null somewhere or it was never assigned, but the code shown here is not enough to tell what is going on. The only place here where you set it is this line:
Helper.paramDriveMod =(Integer.toString(((USB_RXBuffer[4]<< 8)&0xff00)));
Integer.toString() never returns null, so I would assume that this line never gets hit, and you see null because paramDriveMod is never initialized with any other value.
Don't use static variables unless you really have to. You can use a getter and setter instead.
Could it be that you are confusing static with final? Static variables' values can change. Final variables' values cannot.
The execution flow is not shown. Maybe the third snippet:
while (c < 5) {
    Thread.sleep(delay);
    protocol.onSending(3, prm.PARAMETER_DRIVE_MODE.ordinal(), dataToRead, dataToRead.length); // read drive mode
    System.out.println(Helper.paramDriveMod + " drive mode is ..........in wile loop"); // here it shows null value
is executed before the second snippet:
switch (ms)
{
    case READ_PARAMETER_VALUE: // read parameter values
        switch (prm) {
            case PARAMETER_DRIVE_MODE: // parameter drive mode
                Helper.paramDriveMod = (Integer.toString(((USB_RXBuffer[4] << 8) & 0xff00)));
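The ordering explanation can be reproduced in isolation. This is a simplified sketch, not the asker's code: DriveModeDemo, handleMessage and describe are hypothetical names, and the field only mirrors Helper.paramDriveMod.

```java
class DriveModeDemo {
    // Mirrors Helper.paramDriveMod from the question: static and never explicitly initialized.
    static String paramDriveMod;

    // Stand-in for the message handler that eventually assigns the field.
    static void handleMessage(int value) {
        paramDriveMod = Integer.toString(value);
    }

    static String describe() {
        return paramDriveMod + " drive mode";
    }

    public static void main(String[] args) {
        System.out.println(describe()); // read before any write: prints "null drive mode"
        handleMessage(256);
        System.out.println(describe()); // read after the handler ran: prints "256 drive mode"
    }
}
```

A static reference field defaults to null, so any read that happens before the handler has run shows null even though nothing was ever "lost".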

Static method in Java

Looking through some Java code, and this just does not seem right. To me, it looks like every time you call projects, you will get a new HashMap, so this statement is always false:
projects.get(soapFileName) != null
Seems like it should have a backing field
public static HashMap<String,WsdlProject> projects = new HashMap<String,WsdlProject>();
public Object[] argumentsFromCallSoapui(CallT call, Vector<String> soapuiFiles, HashMap theDPLs, int messageSize)
{
    try {
        for (String soapFileName : soapuiFiles) {
            System.out.println("Trying " + soapFileName);
            WsdlProject project;
            if (projects.get(soapFileName) != null) {
                project = projects.get(soapFileName);
            } else {
                project = new WsdlProject(soapFileName);
                projects.put(soapFileName, project);
            }
        }
    } ...
}
Nope. In Java that static variable only gets initialized once.
So, this line will only get called once.
public static HashMap<String,WsdlProject> projects = new HashMap<String,WsdlProject> ();
The projects variable will be initialized once, when the class first loads.
Generally, static maps of this sort are a bad idea: they often turn into memory leaks, as you hold entries long past their useful life.
In this particular case, I'd also worry about thread safety. If you have multiple threads calling this method (which is likely in code dealing with web services), you'll need to synchronize access to the map or you could corrupt it.
And, in a general stylistic note, it's a good idea to define variables using the least restrictive class: in this case, the interface Map, rather than the concrete class HashMap.
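One possible way to address the thread-safety and typing points, sketched under assumptions: Project is a hypothetical stand-in for WsdlProject, and ProjectCache, projectFor and CREATED are illustrative names. The memory-leak concern is not addressed here, since entries are never evicted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ProjectCache {
    // Hypothetical stand-in for WsdlProject so the sketch compiles without soapUI.
    static final class Project {
        final String fileName;

        Project(String fileName) {
            this.fileName = fileName;
        }
    }

    // Counts constructions, for demonstration only.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Declared as the Map interface, backed by a thread-safe implementation.
    private static final Map<String, Project> PROJECTS = new ConcurrentHashMap<>();

    // Returns the cached project, creating it at most once per file name.
    static Project projectFor(String soapFileName) {
        return PROJECTS.computeIfAbsent(soapFileName, name -> {
            CREATED.incrementAndGet();
            return new Project(name);
        });
    }

    public static void main(String[] args) {
        Project first = projectFor("a.xml");
        Project second = projectFor("a.xml");
        System.out.println(first == second); // prints true
        System.out.println(CREATED.get());   // prints 1
    }
}
```

computeIfAbsent replaces the separate get/put pair from the original method, so two threads asking for the same file cannot both create a project.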
You don't call projects - it's a field, not a method.
As it's a static field, it will be initialized exactly once (modulo the same type being loaded in multiple classloaders).
If you add a static initialiser block, you'll be able to see that statics are initialised just once, the first time the class is loaded:
public class Hello {
    static { System.out.println("Hello static World!"); }
    ...
}
You won't get a new HashMap every time you invoke a method on projects, if that's what you are referring to. A new HashMap is created once, and all instances of the class share that single HashMap.
