Passing Java array to Scala

Although I've been using Scala for a while and have mixed it with Java before, I've run into a problem.
How can I pass a Java array to Scala? I know that going from Scala to Java is fairly straightforward, but going from Java to Scala seems less so.
Should I declare my method in Scala?
Here is a small example of what I'm trying to achieve:
Scala:
def sumArray(ar: Array[Int]) = ...
Java:
RandomScalaClassName.sumArray(new int[]{1,2,3});
Is this possible?

Absolutely!
The Array[T] class in Scala is mapped directly to the Java type T[]. They both have exactly the same representation in bytecode.
At least, this is the case in 2.8. Things were a little different in 2.7, with lots of array boxing involved, but ideally you should be working on 2.8 nowadays.
So yes, it'll work exactly as you've written it.
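To make the "same representation" point concrete, here is a minimal Java sketch that prints the JVM-level type names of a primitive array and a boxed array. Scala's Array[Int] corresponds to the first of these (int[]). The class name ArrayDescriptorDemo and the Java method sumArray are hypothetical stand-ins added for illustration; sumArray only mirrors the signature a Java caller would see, it is not the Scala method itself.

```java
public class ArrayDescriptorDemo {
    // Returns the JVM-level name of an object's runtime class.
    static String jvmName(Object o) {
        return o.getClass().getName();
    }

    // Hypothetical Java stand-in with the signature that a Scala
    // `def sumArray(ar: Array[Int]): Int` presents to Java callers.
    static int sumArray(int[] ar) {
        int total = 0;
        for (int v : ar) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(jvmName(new int[]{1, 2, 3}));     // [I
        System.out.println(jvmName(new Integer[]{1, 2, 3})); // [Ljava.lang.Integer;
        System.out.println(sumArray(new int[]{1, 2, 3}));    // 6
    }
}
```

The two names differ, which is why a boxed Integer[] is not interchangeable with int[] on either side of the language boundary.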

Yes, it is totally possible and in fact very easy. The following code will work as expected.
// TestArray.scala
object TestArray {
  def test(array: Array[Int]) = array.foreach(println _)
}

// Sample.java
public class Sample {
    public static void main(String[] args) {
        int[] x = {1, 2, 3, 4, 5, 6, 7};
        TestArray.test(x);
    }
}
Use the following commands to compile and run:
$ scalac TestArray.scala
$ javac -cp .:/opt/scala-2.8.0/lib/scala-library.jar Sample.java
$ java -cp .:/opt/scala-2.8.0/lib/scala-library.jar Sample

Related

Groovy script compiles to a class

From this answer, I learnt that every Groovy script compiles to a class that extends the groovy.lang.Script class.
Below is a test Groovy script written for a Jenkins pipeline in the Jenkins editor.
node('worker_node') {
    print "***1. DRY principle***"
    def list1 = [1,2,3,4]
    def list2 = [10,20,30,40]
    def factor = 2
    def applyFactor = {e -> e * factor}
    print(list1.each(applyFactor))
    print(list2.each(applyFactor))
    print "***2. Higher order function***"
    def foo = { value, f -> f(value *2) }
    foo(3, {print "Value is $it"})
    foo(3) {
        print "Value is $it"
    }
}
How can I compile this Groovy script to see the generated class (as source code)?

The class generated is bytecode, not source code. The source code is the Groovy script.
If you want to see something similar to what the equivalent Java source code would look like, use groovyc to compile the script as usual, and then use a Java decompiler to produce Java source (this question's answers list a few).
That's subject to the usual caveats on decompiled code, of course. High-level information is lost in the process of compiling. Decompilers have to guess a bit to figure out the best way to represent what might have been in the original source. For instance, what was a for loop in the original code may end up being decompiled as a while loop instead.
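The for/while caveat can be illustrated with a small sketch. The two methods below (the class name LoopForms is made up for this example) compute the same sum; a compiler will typically emit very similar bytecode for both, so a decompiler has no reliable way to tell which form the author wrote. You can compare the output of javap -c LoopForms for the two methods to check this on your own JDK.

```java
public class LoopForms {
    // Sum written with a for loop.
    static int sumWithFor(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // The same computation written with a while loop.
    static int sumWithWhile(int[] values) {
        int total = 0;
        int i = 0;
        while (i < values.length) {
            total += values[i];
            i++;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumWithFor(data));   // 10
        System.out.println(sumWithWhile(data)); // 10
    }
}
```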

Groovy in a Jenkins pipeline is a domain-specific language; it is not plain Groovy.
However, if you remove the node(){ } wrapper, your script appears to be plain Groovy, and you can run it in groovyConsole or compile it to a class with groovyc.
Just download a stable Groovy binary distribution and extract it.
If you have Java 7 or Java 8 on your computer, you can run groovyConsole and try your code there.
With Ctrl+T you can see the actual class code generated for your script.

pyspark: call a custom java function from pyspark. Do I need Java_Gateway?

I wrote the following MyPythonGateway.java so that I can call my custom java class from Python:
import py4j.GatewayServer;

public class MyPythonGateway {
    public String findMyNum(String input) {
        return MyUtility.parse(input).getMyNum();
    }

    public static void main(String[] args) {
        // expose this object as the Py4J entry point
        GatewayServer server = new GatewayServer(new MyPythonGateway());
        server.start();
    }
}
and here is how I used it in my Python code:
from py4j.java_gateway import JavaGateway

def main():
    gateway = JavaGateway()  # connect to the JVM
    myObj = gateway.entry_point.findMyNum("1234 GOOD DAY")
    print(myObj)

if __name__ == '__main__':
    main()
Now I want to use MyPythonGateway.findMyNum() function from PySpark, not just a standalone python script. I did the following:
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
print(myNum)
However, I got the following error:
... line 43, in main:
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
File "/home/edamameQ/spark-1.5.2/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 726, in __getattr__
py4j.protocol.Py4JError: Trying to call a package.
So what did I miss here? I don't know whether I should run MyPythonGateway as a separate Java application to start a gateway server when using PySpark. Please advise. Thanks!
Below is exactly what I need:
def f(row):
    # call MyUtility.java, i.e.:
    # x = MyUtility.parse(row).getMyNum()
    # return x
    pass

input.map(f)
What would be the best way to approach this? Thanks!

First of all, the error you see usually means the class you're trying to use is not accessible, so most likely it is a CLASSPATH issue.
Regarding the general idea, there are two important issues:
You cannot access SparkContext inside an action or transformation, so using the PySpark gateway won't work (see "How to use Java/Scala function from an action or a transformation?" for some details). If you want to use Py4J from the workers, you'll have to start a separate gateway on each worker machine.
You really don't want to pass data between Python and the JVM this way. Py4J is not designed for data-intensive tasks.

In PySpark, before calling the method
myNum = sparkcontext._jvm.myPackage.MyPythonGateway.findMyNum("1234 GOOD DAY")
you have to import the MyPythonGateway Java class and create an instance of it, as follows:
from py4j.java_gateway import java_import
java_import(sparkcontext._jvm, "myPackage.MyPythonGateway")
myPythonGateway = sparkcontext._jvm.MyPythonGateway()
myPythonGateway.findMyNum("1234 GOOD DAY")
Also specify the jar containing myPackage.MyPythonGateway with the --jars option of spark-submit.

If inputs is an RDD, for example, the following might work. It opens its own gateway connection per partition, because you can't access the JVM variable (attached to the Spark context) inside the executor in an RDD map function, and to my knowledge there is no equivalent of @transient lazy val in PySpark.
import py4j.java_gateway

def pythonGatewayIterator(iterator):
    results = []
    # one gateway connection per partition
    jvm = py4j.java_gateway.JavaGateway().jvm
    mygw = jvm.myPackage.MyPythonGateway()
    for value in iterator:
        results.append(mygw.findMyNum(value))
    return results

inputs.mapPartitions(pythonGatewayIterator)

All you need to do is compile a jar and add it to the PySpark classpath with the --jars or --driver-class-path spark-submit options. Then access the class and method with the code below:
sc._jvm.com.company.MyClass.func1()
where sc is the Spark context.
Tested with Spark 2.3. Keep in mind that you can call a JVM class method only from the driver program, not from an executor.
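A call of the form sc._jvm.<package>.<Class>.<method>(...) resolves most directly against a public static method; an instance method needs an object first, which is why another answer instantiates MyPythonGateway before calling findMyNum. Below is a minimal sketch of what the Java side could look like. The class name NumFinder and the digit-extraction logic are hypothetical stand-ins (the asker's MyUtility is not shown), and the package declaration is omitted to keep the sketch in a single file; in a real jar it would live in a package such as com.company.

```java
public class NumFinder {
    // Hypothetical stand-in for the asker's parsing logic: returns the leading
    // run of digits in the input, or an empty string if there is none.
    public static String findMyNum(String input) {
        int end = 0;
        while (end < input.length() && Character.isDigit(input.charAt(end))) {
            end++;
        }
        return input.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(findMyNum("1234 GOOD DAY")); // 1234
    }
}
```

Once the jar is on the driver classpath, the driver-side call would follow the same pattern as the line above, using whatever package the class was compiled into.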

Java Constructors in a Ruby Script

I'm trying to figure out how to add constructor parameters to my JRuby Script. I have had it working before with the following code.
class Man < NpcCombat
  def attackScripts attacker, victim
    return [BasicAttack.meleeAttack(attacker, victim, AttackStyle::Mode::MELEE_ACCURATE, 2, Weapon::FISTS)]
  end
end
However, the Java class NpcCombat now takes an integer constructor parameter, i.e. NpcCombat(int). I'm trying to figure out how to change my Ruby script accordingly, but it's not working.

I've never used JRuby, but based on plain Ruby I imagine adding an initialize method that calls the super constructor should work:
class Man < NpcCombat
  def initialize(num)
    super(num)
  end
  ...
end
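For comparison, this is the same constructor chaining written in Java, which is what the Ruby initialize/super pair above is meant to mirror. It is only a sketch: the NpcCombat here is a hypothetical stand-in with a made-up field (the real class and the meaning of its int parameter are not shown in the question), and ConstructorChainDemo exists just to keep everything in one file.

```java
public class ConstructorChainDemo {
    // Hypothetical stand-in for the Java base class from the question.
    static class NpcCombat {
        private final int value;

        NpcCombat(int value) {
            this.value = value;
        }

        int getValue() {
            return value;
        }
    }

    // The subclass constructor forwards its argument to the superclass,
    // just as super(num) does inside Ruby's initialize.
    static class Man extends NpcCombat {
        Man(int value) {
            super(value);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Man(7).getValue()); // 7
    }
}
```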

How to call a java function from python/numpy?

It is clear to me how to extend Python with C++, but what if I want to write a function in Java to be used with NumPy?
Here is a simple scenario: I want to compute the average of a numpy array using a Java class. How do I pass the numpy vector to the Java class and gather the result?
Thanks for any help!

I spent some time on my own question and would like to share my answer as I feel there is not much information on this topic on stackoverflow. I also think Java will become more relevant in scientific computing (e.g. see WEKA package for data mining) because of the improvement of performance and other good software development features of Java.
In general, it turns out that using the right tools it is much easier to extend Python with Java than with C/C++!
Overview and assessment of tools to call Java from Python
JCC (http://pypi.python.org/pypi/JCC): because it lacks proper documentation, this tool is useless.
Py4J: requires starting the Java process before using Python. As remarked by others, this is a possible point of failure. Moreover, not many usage examples are documented.
JPype: although development seems to be dead, it works well and there are many examples of it on the web (e.g. see http://kogs-www.informatik.uni-hamburg.de/~meine/weka-python/ for using data mining libraries written in Java). Therefore I decided to focus on this tool.
Installing JPype on Fedora 16
I am using Fedora 16. Since there are some issues when installing JPype on Linux, I describe my approach here.
Download JPype, then modify the setup.py script by providing the JDK path on line 48:
self.javaHome = '/usr/java/default'
then run:
sudo python setup.py install
After successful installation, check this file:
/usr/lib64/python2.7/site-packages/jpype/_linux.py
and remove or rename the method getDefaultJVMPath() into getDefaultJVMPath_old(), then add the following method:
def getDefaultJVMPath():
    return "/usr/java/default/jre/lib/amd64/server/libjvm.so"
Alternative approach: make no change to _linux.py, but never use the method getDefaultJVMPath() (or methods that call it). Wherever you would call getDefaultJVMPath(), provide the path to the JVM directly. Note that there are several candidate paths; for example, on my system I also have the following, referring to different versions of the JVM (it is not clear to me whether the client or server JVM is better suited):
/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/client/libjvm.so
/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre/lib/x86_64/server/libjvm.so
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/libjvm.so
Finally, add the following line to ~/.bashrc (or run it each time before opening a python interpreter):
export JAVA_HOME='/usr/java/default'
(The above directory is in reality just a symbolic link to my last version of JDK, which is located at /usr/java/jdk1.7.0_04).
Note that all the tests shipped in the download directory, i.e. JPype-0.5.4.2/test/testsuite.py, will fail, so you can ignore them.
To see if it works, test this script in python:
import jpype
jvmPath = jpype.getDefaultJVMPath()
jpype.startJVM(jvmPath)
# print a random text using a Java class
jpype.java.lang.System.out.println ('Berlusconi likes women')
jpype.shutdownJVM()
Calling Java classes from Python, also using NumPy
Let's start implementing a Java class containing some functions which I want to apply to numpy arrays. Since there is no concept of state, I use static functions so that I do not need to create any Java object (creating Java objects would not change anything).
/**
 * Cookbook to pass numpy arrays to Java via Jpype
 * @author Mannaggia
 */
package test.java;

public class Average2 {

    public static double compute_average(double[] the_array){
        // compute the average
        double result=0;
        int i;
        for (i=0;i<the_array.length;i++){
            result=result+the_array[i];
        }
        return result/the_array.length;
    }

    // multiplies array by a scalar
    public static double[] multiply(double[] the_array, double factor) {
        int i;
        double[] the_result= new double[the_array.length];
        for (i=0;i<the_array.length;i++) {
            the_result[i]=the_array[i]*factor;
        }
        return the_result;
    }

    /**
     * Matrix multiplication.
     */
    public static double[][] mult_mat(double[][] mat1, double[][] mat2){
        // find sizes
        int n1=mat1.length;
        int n2=mat2.length;
        int m1=mat1[0].length;
        int m2=mat2[0].length;
        // check that we can multiply
        if (n2 !=m1) {
            //System.err.println("Error: The number of columns of the first argument must equal the number of rows of the second");
            //return null;
            throw new IllegalArgumentException("Error: The number of columns of the first argument must equal the number of rows of the second");
        }
        // if we can, then multiply
        double[][] the_results=new double[n1][m2];
        int i,j,k;
        for (i=0;i<n1;i++){
            for (j=0;j<m2;j++){
                // initialize
                the_results[i][j]=0;
                for (k=0;k<m1;k++) {
                    the_results[i][j]=the_results[i][j]+mat1[i][k]*mat2[k][j];
                }
            }
        }
        return the_results;
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // test case
        double an_array[]={1.0, 2.0,3.0,4.0};
        double res=Average2.compute_average(an_array);
        System.out.println("Average is =" + res);
    }
}
The name of the class is a bit misleading, as we do not only aim at computing the average of a numpy vector (using the method compute_average), but also multiply a numpy vector by a scalar (method multiply), and finally, the matrix multiplication (method mult_mat).
After compiling the above Java class we can now run the following Python script:
import numpy as np
import jpype
jvmPath = jpype.getDefaultJVMPath()
# we need to specify the classpath used by the JVM
classpath='/home/mannaggia/workspace/TestJava/bin'
jpype.startJVM(jvmPath,'-Djava.class.path=%s' % classpath)
# numpy array
the_array=np.array([1.1, 2.3, 4, 6,7])
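# build a JArray; note that we need to specify the Java double type using the jpype.JDouble wrapper
the_jarray2=jpype.JArray(jpype.JDouble, the_array.ndim)(the_array.tolist())
# look up the Java package declared in Average2.java (testPkg was otherwise undefined)
testPkg=jpype.JPackage('test.java')
Class_average2=testPkg.Average2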
res2=Class_average2.compute_average(the_jarray2)
np.abs(np.average(the_array)-res2) # ok perfect match!
# now try to multiply an array
res3=Class_average2.multiply(the_jarray2,jpype.JDouble(3))
# convert to numpy array
res4=np.array(res3) #ok
# matrix multiplication
the_mat1=np.array([[1,2,3], [4,5,6], [7,8,9]],dtype=float)
#the_mat2=np.array([[1,0,0], [0,1,0], [0,0,1]],dtype=float)
the_mat2=np.array([[1], [1], [1]],dtype=float)
the_mat3=np.array([[1, 2, 3]],dtype=float)
the_jmat1=jpype.JArray(jpype.JDouble, the_mat1.ndim)(the_mat1.tolist())
the_jmat2=jpype.JArray(jpype.JDouble, the_mat2.ndim)(the_mat2.tolist())
res5=Class_average2.mult_mat(the_jmat1,the_jmat2)
res6=np.array(res5) #ok
# other test
the_jmat3=jpype.JArray(jpype.JDouble, the_mat3.ndim)(the_mat3.tolist())
res7=Class_average2.mult_mat(the_jmat3,the_jmat2)
res8=np.array(res7)
res9=Class_average2.mult_mat(the_jmat2,the_jmat3)
res10=np.array(res9)
# test error due to invalid matrix multiplication
the_mat4=np.array([[1], [2]],dtype=float)
the_jmat4=jpype.JArray(jpype.JDouble, the_mat4.ndim)(the_mat4.tolist())
res11=Class_average2.mult_mat(the_jmat1,the_jmat4)
jpype.java.lang.System.out.println ('Goodbye!')
jpype.shutdownJVM()

I consider Jython to be one of the best options - which makes it seamless to use java objects in python. I actually integrated weka with my python programs, and it was super easy. Just import the weka classes and call them as you would in java within the python code.
http://www.jython.org/

I'm not sure about numpy support, but the following might be helpful:
http://pypi.python.org/pypi/JCC/

Include Perl in Java

Is there any way to execute Perl code without having to use Runtime.getRuntime().exec("..."), that is, to interpret it inside the Java application?

I've been looking into this myself recently. The most promising thing I've found thus far is the Inline::Java module on CPAN. It allows calling Java from Perl but also (via some included Java classes) calling Perl from Java.

this looks like what you're asking for

Inline::Java provides an embedded Perl interpreter in a class. You can use this to call Perl code from your Java code.
Graciliano M. Passos' PLJava also provides an embedded interpreter.
Don't use JPL (Java Perl Lingo)--the project is dead and has been removed from modern perls.

Inline::Perl is the accepted way. But there's also Jerl which may be run from a JAR.
Here's an example without using the VM wrapper (which is not so fun).
Here are some examples using the jerlWrapper class to make it easier to code:
import jerlWrapper.perlVM;

public final class HelloWorld {
    /* keeping it simple */
    private static String helloWorldPerl = "print 'Hello World '.$].\"\n\";";

    public static void main(String[] args) {
        perlVM helloJavaPerl = new perlVM(helloWorldPerl);
        helloJavaPerl.run();
    }
}
or
import jerlWrapper.perlVM;

public final class TimeTest {
    /* The (ugly) way to retrieve time within perl, with all the
     * extra addition to make it worth reading afterwards.
     * Note: localtime returns a zero-based month and a one-based day,
     * so the month (not the day) needs the + 1.
     */
    private static String testProggie = new String(
        "my ($sec, $min, $hr, $day, $mon, $year) = localtime;"+
        "printf(\"%02d/%02d/%04d %02d:%02d:%02d\n\", "+
        " $mon + 1, $day, 1900 + $year, $hr, $min, $sec);"
    );

    public static void main(String[] args) {
        perlVM helloJavaPerl = new perlVM(testProggie);
        boolean isSuccessful = helloJavaPerl.run();
        if (isSuccessful) {
            System.out.print(helloJavaPerl.getOutput());
        }
    }
}

I could have sworn it was easy as pie using the Java Scripting API.
But apparently it's not on the list of existing implementations...
So, maybe this helps instead :
java and perl
edit: i said "maybe"

No, I don't believe this exists. While there have been several languages ported to the JVM (JRuby, Jython etc) Perl is not yet one of them.

In the future, the standard way to use any scripting language is through the java Scripting Support introduced in JSR 223. See the scripting project homepage for a list of scripting languages supported at the moment. Unfortunately, Perl isn't on there yet :-(
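If you want to check what your own JVM offers through JSR 223, a small sketch like the following lists the installed script engines and asks for one by name. The class name ScriptEngineCheck is made up for this example, and "perl" is only a guess at the short name a Perl engine would register under if one were installed; on a stock JDK the lookup is expected to come back empty.

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import java.util.ArrayList;
import java.util.List;

public class ScriptEngineCheck {
    // True if some installed JSR 223 engine answers to the given short name.
    static boolean hasEngine(String name) {
        return new ScriptEngineManager().getEngineByName(name) != null;
    }

    // Language names reported by every engine factory the manager can discover.
    static List<String> installedLanguages() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.add(factory.getLanguageName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Installed languages: " + installedLanguages());
        // "perl" is an assumed engine name, used here only for illustration.
        System.out.println("Perl engine present: " + hasEngine("perl"));
    }
}
```

getEngineByName returns null when no matching engine is found, so the check is safe to run even when the list of engines is empty.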
