Getting R to use newer versions of java - java

This question is related to this other question.
I am trying to use RNetLogo with R and get the following error.
nl.path <- "/Applications/NetLogo 5.1.0"
NLStart(nl.path)
Error in .jnew("nlcon/Preprocess") :
java.lang.UnsupportedClassVersionError: nlcon/Preprocess : Unsupported major.minor version 51.0
From what I understood in this other question, the problem is that R is using an old version of Java which is incompatible with RNetLogo.
I installed Java 8.0 hoping to solve the problem, but as far as I can tell, despite Java 8.0 being installed on my computer (Mac OS X Mavericks), R does not pick it up and keeps trying to use the old version of Java.
So my question is : How can I get R to use Java 8.0 instead of any older version?
In the terminal console, I get
java -version :
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
Thanks in advance for your help,
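A note on the error message in the question: the "51.0" is the class-file format version that nlcon/Preprocess was compiled for. Class-file major versions go up by one per Java release (50 is Java 6, 51 is Java 7, 52 is Java 8), so the Java 6 runtime shown by java -version cannot load it. A minimal sketch of that mapping follows; the function name is hypothetical and only here for illustration.

```python
import re

# Class-file major versions increase by one per Java release:
# 49 -> Java 5, 50 -> Java 6, 51 -> Java 7, 52 -> Java 8, ...
_MAJOR_OFFSET = 44


def required_java_release(error_message):
    """Return the Java release a class file needs, or None if not found.

    `error_message` is the text of an UnsupportedClassVersionError.
    """
    match = re.search(r"major\.minor version (\d+)\.\d+", error_message)
    if match is None:
        return None
    return int(match.group(1)) - _MAJOR_OFFSET


message = "nlcon/Preprocess : Unsupported major.minor version 51.0"
print(required_java_release(message))  # prints 7
```

So any Java 7 or newer runtime would satisfy this particular class, provided R actually loads that runtime.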

It seems that on Mac OS X you can have multiple Java versions installed at the same time.
Use the command below in the terminal to check which JDK versions you have.
/usr/libexec/java_home -V
You can follow the instructions below to set up the correct Java path:
How To Set $JAVA_HOME Environment Variable On Mac OS X
In a nutshell, do:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)

Unfortunately, none of these seem to help on a Mac. Windows and Linux solutions are not relevant because the files are in different places.
If you just update to Java 1.8 (Java 8 for Oracle) in the standard way as prompted by the Java preferences pane, you just get the Java Runtime Environment (JRE). If you run...
/usr/libexec/java_home -V
...it still shows only java 1.6, and...
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
...throws an error saying it can't find a version 1.8.
To get the Mac to even recognize a newer version of Java, it seems you must install JDK 8 (not the JRE). At that point, the Mac recognizes that a new Java virtual machine is available, and the export command succeeds. (Note that the new 1.8 JVM is in a DIFFERENT place: /Library/Java instead of /System/Library/Java for 1.6.) BUT this still does no good for R.
I've tried putting the export JAVA_HOME... command into my .profile and my .bash_profile. Then sourcing both. Works fine, but has no effect on R AFAICT. I've launched R via the standard Mac R GUI, from RStudio, and from the terminal and it is only recognizing Java 1.6. So RNetLogo still does not work.
I will try to find RNetLogo 1.0-0 in the archive and test that. If it works, I suggest that 1.0-1 be rolled back until this Java problem is solved.
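To see at a glance which JDK bundles sit in the two locations mentioned above, something like the following sketch can help. It only lists directory names ending in .jdk and does not replace /usr/libexec/java_home -V; the function name is hypothetical.

```python
import os

# The two directories named in the answer above:
# Apple's Java 6 lives under /System/Library, newer JDKs under /Library.
DEFAULT_ROOTS = (
    "/System/Library/Java/JavaVirtualMachines",
    "/Library/Java/JavaVirtualMachines",
)


def list_jdk_bundles(roots=DEFAULT_ROOTS):
    """Map each existing root directory to the sorted .jdk bundles inside it."""
    found = {}
    for root in roots:
        if os.path.isdir(root):
            found[root] = sorted(
                name for name in os.listdir(root) if name.endswith(".jdk")
            )
    return found


# On a machine without these directories this simply prints {}.
print(list_jdk_bundles())
```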

I use this line on windows :
options(java.home="C:/Program Files/Java/jre7/")
You probably have to change the 7 for an 8 and find the proper path on the mac.

This supposedly works (originally from this blog entry):
1) Download and install Apple’s Java version 1.6.
2) Reconfigure your Java installation by using sudo R CMD javareconf -n.
3) Reinstall rJava from source with: install.packages('rJava', type='source').
Please acknowledge Will Lowe at conjugateprior.org for the original post and solution.

Try linking libjvm.dylib to /usr/lib:
sudo ln -f -s $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/lib
-f flag is added to force overwriting existing file/link

EDIT: I don't know if anyone is still struggling with this, but with rJava 0.9-9, the 'partial fix' below no longer works. What does work, and completely, is the final solution offered here: https://github.com/s-u/rJava/issues/86
Copying from there, many thanks to Gregory R. Warnes:
uninstall existing rJava versions by running the following in the Terminal:
Rscript -e 'remove.packages("rJava")'
sudo Rscript -e 'remove.packages("rJava")'
add the following to /Users/<userid>/.bashrc:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)/jre
(e.g., type > vim /Users/<userid>/.bashrc in the Terminal, then 'i', add the line above, then ':wq' to save and quit)
close and re-start all Terminal, R and RStudio windows
type the following in the Terminal window:
sudo ln -sf $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/local/lib
in a new R session, re-install rJava from source:
install.packages("rJava", repos="http://rforge.net", type="source")
OLD 'PARTIAL FIX' BELOW:
Okay. I have been working on this problem all morning, and I have a partial fix.
I tried the solution suggested by Guilherme Kenji Chihaya above, but even after sudo R CMD javareconf -n and install.packages('rJava', type='source'), R insists on using Java 1.6 (and is happy to do so).
HOWEVER, R studio throws an error after re-installing rJava:
library(rJava)
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: dyn.load(file, DLLpath = DLLpath, ...)
error: unable to load shared object '/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rJava/libs/rJava.so':
dlopen(/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rJava/libs/rJava.so, 6): Library not loaded: @rpath/libjvm.dylib
Referenced from: /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rJava/libs/rJava.so
Reason: image not found
Error: package or namespace load failed for ‘rJava’
Googling this led me to this post: http://andrewgoldstone.com/blog/2015/02/03/rjava/, with a working solution. Set the following in the Terminal:
alias r="DYLD_FALLBACK_LIBRARY_PATH=/Library/Java/JavaVirtualMachines/jdk1.8.0_11.jdk/Contents/Home/jre/lib/server/: open -a r"
And start R from the Terminal. Then, magically, in R:
> library(rJava)
> .jinit()
> .jcall("java/lang/System", "S", "getProperty", "java.runtime.version")
[1] "1.8.0_11-b12"
However, this only works when starting R from the Terminal. I haven't been able to get R to automatically recognise the right "DYLD_FALLBACK_LIBRARY_PATH" in any way.

On Ubuntu there is an alternatives command that I use for this purpose.
alternatives --install /usr/bin/java java /usr/java/jdk1.8*/jre/bin/java 200000
alternatives --install /usr/bin/javaws javaws /usr/java/jdk1.8*/jre/bin/javaws 200000
alternatives --install /usr/bin/javac javac /usr/java/jdk1.8*/bin/javac 200000
alternatives --install /usr/bin/jar jar /usr/java/jdk1.8*/bin/jar 200000
After installing the alternatives, use the following command to change your version, then select your newer Java:
alternatives --config java
If this is not available, you should first find out where your new Java actually is:
locate *jdk1.8*
Then find out which java binary you are running: which java returns its path. This is the old Java binary, so remove it and link the new Java binary in the same place. For example: ln -s /path/to/java1.8*/bin/java /usr/bin/java
In addition, you need to update your CLASSPATH environment variable, which the VM reads.
For example:
export CLASSPATH=/usr/java/jdk1.8*/jre/lib (you can add this line to your .bashrc file to make the setting permanent).

In Debian-based installations R uses /etc/R/Makeconf settings for building libraries. One of the setting there is JAVA_HOME. Try setting the correct path there and reinstall the package.

Related

JDK is installed on mac but i'm getting "The operation couldn’t be completed. Unable to locate a Java Runtime that supports apt." sudo apt update

I'm trying to run the command sudo apt update on my terminal in MacOS
I'm getting this message in response: The operation couldn’t be completed. Unable to locate a Java Runtime that supports apt. Please visit http://www.java.com for information on installing Java.
I saw a similar question here, however even though I made sure to install the JDK like the solution suggested I'm still getting the same response.
I also tried pasting
export PATH="$HOME/.jenv/bin:$PATH"
eval "$(jenv init -)"
export JAVA_HOME="$HOME/.jenv/versions/`jenv version-name`"
into my .zshrc.save file, with no luck.
When I run java -version in the terminal this is what I get back:
java version "15.0.2" 2021-01-19
Java(TM) SE Runtime Environment (build 15.0.2+7-27)
Java HotSpot(TM) 64-Bit Server VM (build 15.0.2+7-27, mixed mode, sharing)
About 20 years ago, Java shipped with a tool called apt: the Annotation Processing Tool. It became obsolete not long afterwards.
What those update-node-js instructions are talking about is a completely unrelated tool: the Advanced Package Tool, which manages software installation on Debian and Ubuntu, both Linux distros. You do not want to run this on a Mac, so the instructions you found are of no use to you: they describe how to update node-js on Linux, and your machine isn't Linux.
Search around for answers involving brew, which is the go-to equivalent of apt on Mac. And forget about Java entirely: this has nothing to do with Java, and the name clash is pure coincidence.
Install Homebrew on your Mac:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
For the system Java wrappers to find this JDK, symlink it with
sudo ln -sfn /usr/local/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
If you need to have openjdk first in your PATH, run:
echo 'export PATH="/usr/local/opt/openjdk/bin:$PATH"' >> ~/.profile
For compilers to find openjdk you may need to set:
export CPPFLAGS="-I/usr/local/opt/openjdk/include"
The commands below worked for me.
First, install Homebrew:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Then set JAVA_HOME to the Java home bundled with Android Studio (if you have Android Studio). If not, take the path of your own Java installation and export it as JAVA_HOME instead.
export JAVA_HOME=/Applications/Android\ Studio.app/Contents/jre/Contents/Home

Issue installing Java 8 on Mac OS X

I am having an issue trying to upgrade to Java 8 from Java 6 on my Mac running Mac OS X 10.10.5, with Java 8 seemingly not getting recognized.
I installed Java 8 via the .dmg installer: jre-8u66-macosx-x64.dmg, yet when I enter: java -version, it reports:
java version "1.6.0_65".
Yet, I noticed under the Java Panel via System Preference, the Java Runtime Environment Settings Panel is displaying 1.8.0_102.
From poking around I have noticed:
1) Java 8 seems to have installed into: /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk
2) Java 6 seems to have been installed into:
/System/Library/Java/JavaVirtualMachines/1.6.0.jdk
I then noticed a post on StackOverflow recommending to use "brew" to install Java instead of the official Mac installer, which is reportedly broken. When I went to install brew with:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
I got this error message: Illegal variable name.
And then I read that "brew" is broken on versions of El Capitan and above.
So, what do I need to do to get this upgrade to Java 8 to work???
Appreciate any help with this! Thanks!
Your java command points to a stub binary that uses the currently configured version:
$ ls -la /usr/bin/java
lrwxr-xr-x 1 root wheel 74 Feb 5 2015 /usr/bin/java -> /System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands/java
All you need to do is update your JAVA_HOME (I've added this to my ~/.profile):
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
There's a topic that describes this in depth: Need help understanding Oracle's Java on Mac
Set your JAVA_HOME to Java 8.
As for the brew problem, I guess you ran the installer in csh/tcsh. Switch to bash and the script should install brew.
I don't know about the brew problem. It looks like you have two JDKs: the one at /System/Library/Java/JavaVirtualMachines/1.6.0.jdk is on your $PATH, but the new one you installed is not yet. I would recommend one of the following:
i. either delete the old one and then set the $PATH for the new one, or
ii. just set the $PATH for the new one, but try to keep both JDKs in the same location.
In both cases you want $JAVA_HOME to yield the path of your latest JDK, which should then work.
Here is a link that might be helpful for setting the $PATH:
https://cloudlink.soasta.com/t5/CloudTest-Knowledge-Base/Adding-JDK-Path-in-Mac-OS-X-Linux-or-Windows/ta-p/43867
In plain terms, you have to find the file called .bash_profile on your Mac and set the $PATH in it for your new JDK, or whichever JDK you want to work with. This may take a little research if you haven't done it before, but it's not very difficult. Hope this helps.

How does one configure rJava on OSX to select the right JVM -- .jinit() failing

I installed rJava by calling install.packages("rJava") -- no problems seen
However when I call:
library(rJava)
.jinit()
I get:
JavaVM: requested Java version ((null)) not available. Using Java at "" instead.
JavaVM: Failed to load JVM: /bundle/Libraries/libserver.dylib
JavaVM FATAL: Failed to load the jvm library.
Error in .jinit() : JNI_GetCreatedJavaVMs returned -1
I'm running OSX:
Darwin MBP-2 14.5.0 Darwin Kernel Version 14.5.0: Tue Sep 1 21:23:09 PDT 2015; root:xnu-2782.50.1~1/RELEASE_X86_64 x86_64
I have the following Sun JDK's installed:
$ ls /Library/Java/JavaVirtualMachines/
jdk1.7.0_79.jdk jdk1.8.0_65.jdk
Which Java is on my PATH:
$ which java
/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home//bin/Java
I also have JavaHome defined as:
$ echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home/
The answer at https://stackoverflow.com/a/36045290/4351357 helped resolve this issue for me on MacOS Sierra Version 10.12. In particular, it provided the code in command 1, below.
The three commands I used are (the first two in the Terminal, the third inside R):
sudo ln -s $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/local/lib
sudo R CMD javareconf
install.packages("rJava",type='source')
This solution, from "Cannot load R xlsx package on Mac OS 10.11":
sudo R CMD javareconf
install.packages("rJava",type='source')
Worked for me
While attempting to run this test on OSX 10.11.5 (El Capitan)
http://www.r-bloggers.com/connecting-r-to-an-oracle-database-with-rjdbc/
I kept getting that same error. I attempted to just do the install as suggested by Tim Child. The thing I noticed was that my version of R Studio (Version 0.99.896 ) kept complaining about installing the 1.6 Legacy Java for OSX.
I installed the legacy Java from the website https://support.apple.com/kb/DL1572?locale=en_US
Then ran a simpler test in R Studio.
library(rJava)
.jinit()
print(.jcall("java/lang/System", "S", "getProperty", "java.version"))
My results:
> library(rJava)
> .jinit()
> print(.jcall("java/lang/System", "S", "getProperty", "java.version"))
[1] "1.6.0_65"
Happy to see that, I ran my other test (the one that started me down this path). I still could not get it to use the most current Java version.
I did one more re-install of the package rJava
install.packages("rJava",type='source')
and things are working OK, using Java 1.6 for now, but at least I can get some work done. Hope someone comes across a better fix :)
I struggled with this issue for a couple of hours. There was a really good thread with some background on rJava here:
https://groups.google.com/forum/#!topic/r-sig-mac/eFSDrjphgGs
The following steps ended up working for me:
(1) Upgrade to the latest JDK
(2) Set the following environment variables in my ~/.bash_profile:
export JAVA_HOME=$(/usr/libexec/java_home)
export JAVA_CPPFLAGS=$(/usr/libexec/java_home)/include
(3) Re-install rJava from source as root:
sudo R CMD INSTALL rJava_0.9-8.tar.gz
(4) Re-configure Java as root:
sudo R CMD javareconf
Then I could properly install other libraries that depended on the correct configuration of rJava.
This helped me:
sudo ln -s /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre/lib/server/libjvm.dylib /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/lib/libserver.dylib

Pyspark: Exception: Java gateway process exited before sending the driver its port number

I'm trying to run pyspark on my MacBook Air. When I try starting it up I get the error:
Exception: Java gateway process exited before sending the driver its port number
when sc = SparkContext() is being called upon startup. I have tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked here:
Spark + Python - Java gateway process exited before sending the driver its port number?
but the question has never been answered. Please help! Thanks.
One possible reason is that JAVA_HOME is not set because Java is not installed.
I encountered the same issue. It says
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
SparkContext._ensure_initialized()
File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway()
File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number
at sc = pyspark.SparkConf(). I solved it by running
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
which is from https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04
this should help you
One solution is adding pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
There is a change in python/pyspark/java_gateway.py which requires PYSPARK_SUBMIT_ARGS to include pyspark-shell if the variable is set by the user.
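If you set the variable from Python rather than from the shell, the same rule can be applied with a small helper. This is only a sketch: the helper name is made up, and it splits on whitespace, so it would mangle arguments that contain quoted spaces.

```python
import os


def with_pyspark_shell(submit_args):
    """Return `submit_args` with a trailing 'pyspark-shell' if it is missing."""
    parts = submit_args.split()
    if "pyspark-shell" not in parts:
        parts.append("pyspark-shell")
    return " ".join(parts)


# Set this before the first SparkContext is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = with_pyspark_shell("--master local[2]")
print(os.environ["PYSPARK_SUBMIT_ARGS"])  # prints: --master local[2] pyspark-shell
```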
Had this error message running pyspark on Ubuntu, got rid of it by installing the openjdk-8-jdk package
from pyspark import SparkConf, SparkContext
sc = SparkContext(conf=SparkConf().setAppName("MyApp").setMaster("local"))
^^^ error
Install Open JDK 8:
apt-get install openjdk-8-jdk-headless -qq
On MacOS
Same on Mac OS, I typed in a terminal:
$ java -version
No Java runtime present, requesting install.
I was prompted to install Java from Oracle's download site, chose the MacOS installer, clicked on jdk-13.0.2_osx-x64_bin.dmg, and after that checked that Java was installed:
$ java -version
java version "13.0.2" 2020-01-14
EDIT To install JDK 8 you need to go to https://www.oracle.com/java/technologies/javase-jdk8-downloads.html (login required)
After that I was able to start a Spark context with pyspark.
Checking if it works
In Python:
from pyspark import SparkContext
sc = SparkContext.getOrCreate()
# check that it really works by running a job
# example from http://spark.apache.org/docs/latest/rdd-programming-guide.html#parallelized-collections
data = range(10000)
distData = sc.parallelize(data)
distData.filter(lambda x: not x&1).take(10)
# Out: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Note that you might need to set the environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON and they have to be the same Python version as the Python (or IPython) you're using to run pyspark (the driver).
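One way to keep the two variables in step, assuming you start the driver from an ordinary Python script or notebook, is to point both at the interpreter that is already running. The helper name below is hypothetical; whether this is needed at all depends on your setup.

```python
import os
import sys


def pin_pyspark_python(executable=sys.executable):
    """Point PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON at the same interpreter."""
    os.environ["PYSPARK_PYTHON"] = executable
    os.environ["PYSPARK_DRIVER_PYTHON"] = executable
    return executable


# Call this before creating the SparkContext.
pin_pyspark_python()
print(os.environ["PYSPARK_PYTHON"] == os.environ["PYSPARK_DRIVER_PYTHON"])
```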
I use Mac OS. I fixed the problem!
Below is how I fixed it.
JDK8 seems to work fine. (https://github.com/jupyter/jupyter/issues/248)
So I checked my JDK /Library/Java/JavaVirtualMachines, I only have jdk-11.jdk in this path.
I downloaded JDK8 (I followed the link).
Which is:
brew tap caskroom/versions
brew cask install java8
After this, I added
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
export JAVA_HOME="$(/usr/libexec/java_home -v 1.8)"
to my ~/.bash_profile file. (You should check your own jdk1.8 directory name.)
It works now!
Hope this helps :)
I will repost how I solved it here just for future references.
How I solved my similar problem
Prerequisite:
anaconda already installed
Spark already installed (https://spark.apache.org/downloads.html)
pyspark already installed (https://anaconda.org/conda-forge/pyspark)
Steps I did (NOTE: set the folder path accordingly to your system)
set the following environment variables.
SPARK_HOME to 'C:\spark\spark-3.0.1-bin-hadoop2.7'
set HADOOP_HOME to 'C:\spark\spark-3.0.1-bin-hadoop2.7'
set PYSPARK_DRIVER_PYTHON to 'jupyter'
set PYSPARK_DRIVER_PYTHON_OPTS to 'notebook'
add 'C:\spark\spark-3.0.1-bin-hadoop2.7\bin;' to PATH system variable.
Move the Java installation folder directly under C: (previously Java was installed under Program Files, so I re-installed it directly under C:),
so my JAVA_HOME now looks like this: 'C:\java\jdk1.8.0_271'
Now it works!
Had the same issue with my IPython notebook (IPython 3.2.1) on Linux (Ubuntu).
What was missing in my case was setting the master URL in the $PYSPARK_SUBMIT_ARGS environment like this (assuming you use bash):
export PYSPARK_SUBMIT_ARGS="--master spark://<host>:<port>"
e.g.
export PYSPARK_SUBMIT_ARGS="--master spark://192.168.2.40:7077"
You can put this into your .bashrc file. You can find the correct URL in the log for the Spark master (the location of this log is reported when you start the master with sbin/start-master.sh).
After spending hours and hours trying many different solutions, I can confirm that Java 10 SDK causes this error. On Mac, please navigate to /Library/Java/JavaVirtualMachines then run this command to uninstall Java JDK 10 completely:
sudo rm -rf jdk-10.jdk/
After that, please download JDK 8 then the problem will be solved.
I had the same error with PySpark, and setting JAVA_HOME to Java 11 worked for me (it was originally set to 16). I'm using MacOS and PyCharm.
You can check your current Java version by doing echo $JAVA_HOME.
Below is what worked for me. On my Mac I used the following homebrew command, but you can use a different method to install the desired Java version, depending on your OS.
# Install Java 11 (I believe 8 works too)
$ brew install openjdk@11
# Set JAVA_HOME by assigning the path where your Java is
$ export JAVA_HOME=/usr/local/opt/openjdk@11
Note: If you installed using homebrew and need to find the location of the path, you can do $ brew --prefix openjdk@11 and it should return a path like this: /usr/local/opt/openjdk@11
At this point, I could run my PySpark program from the terminal - however, my IDE (PyCharm) still had the same error until I globally changed the JAVA_HOME variable.
To update the variable, first check whether you're using the zsh or bash shell by running echo $SHELL on the command line. For zsh, you'll edit the ~/.zshenv file and for bash you'll edit the ~/.bash_profile.
# open the file
$ vim ~/.zshenv
OR
$ vim ~/.bash_profile
# once inside the file, set the variable with your Java path, then save and close the file
export JAVA_HOME=/usr/local/opt/openjdk@11
# test if it was set successfully
$ echo $JAVA_HOME
/usr/local/opt/openjdk@11
After this step, I could run PySpark through my PyCharm IDE as well.
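The reason the terminal worked before the IDE did is that environment variables are inherited from the parent process: an export typed in one shell is only seen by programs started from that shell. The sketch below shows the same mechanism from Python by handing a child process its own JAVA_HOME. The helper name and the /path/to/jdk value are placeholders, not real locations.

```python
import os
import subprocess
import sys


def run_with_java_home(command, java_home):
    """Run `command` in a child process whose JAVA_HOME is `java_home`."""
    env = dict(os.environ)  # copy, so the parent's environment stays unchanged
    env["JAVA_HOME"] = java_home
    env["PATH"] = os.path.join(java_home, "bin") + os.pathsep + env.get("PATH", "")
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Demonstration: the child sees the value passed in (placeholder path).
child = [sys.executable, "-c", "import os; print(os.environ['JAVA_HOME'])"]
result = run_with_java_home(child, "/path/to/jdk")
print(result.stdout.strip())  # prints /path/to/jdk
```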
Spark is very picky about the Java version you use. It is highly recommended that you use Java 1.8 (the open source AdoptOpenJDK 8 works well too).
After installing it, set JAVA_HOME in your bash environment if you use Mac/Linux:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
export PATH=$JAVA_HOME/bin:$PATH
There are many valuable hints here, however, none solved my problem completely so I will show the procedure that worked for me working in an Anaconda Jupyter Notebook on Windows:
Download and install java and pyspark in directories without blank spaces.
[maybe unnecessary] In the anaconda prompt, type where conda and where python and add the paths of the .exe files' directories to your Path variable using the Windows environmental variables tool. Add also the variables JAVA_HOME and SPARK_HOME there with their corresponding paths.
Even doing so, I had to set these variables manually from within the Notebook along with PYSPARK_SUBMIT_ARGS (use your own paths for SPARK_HOME and JAVA_HOME):
import os
os.environ["SPARK_HOME"] = r"C:\Spark\spark-3.2.0-bin-hadoop3.2"
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[3] pyspark-shell"
os.environ["JAVA_HOME"] = r"C:\Java\jre1.8.0_311"
Install findspark from the notebook with !pip install findspark.
Run import findspark and findspark.init()
Run from pyspark.sql import SparkSession and spark = SparkSession.builder.getOrCreate()
Some useful links:
https://towardsdatascience.com/installing-apache-pyspark-on-windows-10-f5f0c506bea1
https://sparkbyexamples.com/pyspark/pyspark-exception-java-gateway-process-exited-before-sending-the-driver-its-port-number/
https://www.datacamp.com/community/tutorials/installing-anaconda-windows
I got the same Java gateway process exited......port number exception even though I set PYSPARK_SUBMIT_ARGS properly. I'm running Spark 1.6 and trying to get pyspark to work with IPython4/Jupyter (OS: ubuntu as VM guest).
While I got this exception, I noticed an hs_err_*.log was generated and it started with:
There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 715849728 bytes for committing reserved memory.
So I increased the memory allocated for my ubuntu via VirtualBox Setting and restarted the guest ubuntu. Then this Java gateway exception goes away and everything worked out fine.
If you are trying to run Spark without the Hadoop binaries, you might encounter the above-mentioned error. One solution is to:
1) download hadoop separately.
2) add hadoop to your PATH
3) add hadoop classpath to your SPARK install
The first two steps are trivial, the last step can be best done by adding the following in the $SPARK_HOME/conf/spark-env.sh in each spark node (master and workers)
### in conf/spark-env.sh ###
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
for more info also check: https://spark.apache.org/docs/latest/hadoop-provided.html
After spending a good amount of time with this issue, I was able to solve this. I own MacOs Catalina, working on Pycharm in an Anaconda environment.
Spark currently supports only Java8. If you install Java through command line, it will by default install the latest Java10+ and would cause all sorts of troubles. To solve this, follow the below steps -
1. Make sure you have Homebrew, else install Homebrew
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
2. Install X-code
xcode-select --install
3. Install Java8 through the official website (not through terminal)
https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
4. Install Apache-Spark
brew install apache-spark
5. Install Pyspark and Findspark (if you have anaconda)
conda install -c conda-forge findspark
conda install -c conda-forge/label/gcc7 findspark
conda install -c conda-forge pyspark
Voilà! This should let you run PySpark without any issues.
I got the same Exception: Java gateway process exited before sending the driver its port number in Cloudera VM when trying to start IPython with CSV support with a syntax error:
PYSPARK_DRIVER_PYTHON=ipython pyspark --packages com.databricks:spark-csv_2.10.1.4.0
will throw the error, while:
PYSPARK_DRIVER_PYTHON=ipython pyspark --packages com.databricks:spark-csv_2.10:1.4.0
will not.
The difference is that last colon in the last (working) example, separating the Scala version number from the package version number.
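A quick sanity check for this kind of typo, as a sketch: a --packages entry is expected to have three colon-separated parts (group, artifact, version). The function name is hypothetical and the check is deliberately shallow; it does not verify that the package exists.

```python
def is_package_coordinate(coordinate):
    """True if `coordinate` has the shape group:artifact:version."""
    parts = coordinate.split(":")
    return len(parts) == 3 and all(part.strip() for part in parts)


broken = "com.databricks:spark-csv_2.10.1.4.0"
working = "com.databricks:spark-csv_2.10:1.4.0"
print(is_package_coordinate(broken))   # prints False
print(is_package_coordinate(working))  # prints True
```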
In my case this error came from a script that had been running fine before, so I figured it might be due to my Java update. I had been using Java 1.8 but had accidentally updated to Java 1.9. When I switched back to Java 1.8 the error disappeared and everything ran fine.
For those who get this error for the same reason but do not know how to switch back to an older Java version on Ubuntu:
run
sudo update-alternatives --config java
and make the selection for java version
I figured out the problem on a Windows system. The installation directory for Java must not have blanks in the path, as in C:\Program Files. I re-installed Java in C:\Java. I set JAVA_HOME to C:\Java and the problem went away.
I got this error because I was running low on disk space.
Had the same issue; installing Java with the lines below solved it!
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
I have the same error.
My troubleshooting steps were:
Check out Spark source code.
Follow the error message. In my case: pyspark/java_gateway.py, line 93, in launch_gateway.
Check the code logic to find the root cause then you will resolve it.
In my case the issue was that PySpark had no permission to create some temporary directory, so I just ran my IDE with sudo.
I have the same error in running pyspark in pycharm.
I solved the problem by adding JAVA_HOME in pycharm's environment variables.
I had the same exception and tried everything, setting and resetting all the environment variables. In the end the issue came down to a space in the appName property of the Spark session, that is, in SparkSession.builder.appName("StreamingDemo").getOrCreate(). As soon as I removed the space from the string given to appName, it was resolved. I was using pyspark 2.7 with Eclipse on Windows 10.
For Linux (Ubuntu 18.04) with a JAVA_HOME issue, a key is to point it to the master folder:
Set Java 8 as default by: sudo update-alternatives --config java. If Java 8 is not installed, install by: sudo apt install openjdk-8-jdk.
Set JAVA_HOME environment variable as the master java 8 folder. The location is given by the first command above removing jre/bin/java. Namely: export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/". If done on the command line, this will be relevant only for the current session (ref: export command on Linux). To verify: echo $JAVA_HOME.
In order to have this permanently set, add the export line above to a file that runs before you start your IDE/Jupyter/python interpreter, for example .bashrc. This file loads when a bash shell is started interactively (ref: .bashrc).
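The "remove jre/bin/java" step can be sketched as a small string operation. The function name is hypothetical, and the input is assumed to be the path that update-alternatives reports for the java binary.

```python
def java_home_from_binary(java_binary):
    """Strip a trailing jre/bin/java or bin/java from the binary's path."""
    for suffix in ("/jre/bin/java", "/bin/java"):
        if java_binary.endswith(suffix):
            return java_binary[: -len(suffix)]
    return None


binary = "/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java"
print(java_home_from_binary(binary))  # prints /usr/lib/jvm/java-8-openjdk-amd64
```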
The error occurred because Java is not installed on the machine.
Spark is written in Scala, which runs on the JVM.
Try installing Java and then run the pyspark statements.
It will work.
This usually happens if you do not have Java installed on your machine.
Go to the command prompt and check your Java version:
type: java -version
You should get output something like this:
java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
If not, go to Oracle and download the JDK.
Check this video on how to download Java and add it to the build path.
https://www.youtube.com/watch?v=f7rT0h1Q5Wo
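If you want to check the version from a script rather than by eye, the output shown above can be parsed as in this sketch. The function name is hypothetical. Note that java -version writes to stderr, so that is the stream to capture if you run it through subprocess.

```python
import re


def java_feature_version(version_output):
    """Extract the feature release (6, 8, 11, ...) from `java -version` text."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (e.g. "1.8.0_241").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


print(java_feature_version('java version "1.8.0_241"'))  # prints 8
```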
Step:1
Check the Java version from the terminal.
java -version
If you see bash: java: command not found, it means you don't have Java installed on your system.
Step:2
Install Java using the following command,
sudo apt-get install default-jdk
Step:3
Now check the Java version; you'll see the version that was installed.
java -version
result:
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)
Step:4
Now run the pyspark code;
the error should no longer appear.
I ran into this problem, and it was actually not due to the JAVA_HOME setting. I assume you are using Windows, with Anaconda as your Python tooling. Please check whether you can open a command prompt at all. I could not run Spark because cmd itself was crashing. After fixing that, Spark worked well on my PC.
Worked hours on this. My problem was with Java 10 installation. I uninstalled it and installed Java 8, and now Pyspark works.
For me, the answer was to add two 'Content Roots' in 'File' -> 'Project Structure' -> 'Modules' (in IntelliJ):
YourPath\spark-2.2.1-bin-hadoop2.7\python
YourPath\spark-2.2.1-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip
This is an old thread but I'm adding my solution for those who use mac.
The issue was with the JAVA_HOME. You have to include this in your .bash_profile.
Check your java -version. If you downloaded the latest Java but it doesn't show up as the latest version, then you know that the path is wrong. Normally, the default path is export JAVA_HOME= /usr/bin/java.
So try changing the path to:
/Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/bin/java
Alternatively you could also download the latest JDK.
https://www.oracle.com/technetwork/java/javase/downloads/index.html and this will automatically point /usr/bin/java to the latest version. You can confirm this by doing java -version again.
Then that should work.
Make sure that both your Java directory (as found in your path) AND your Python interpreter reside in directories with no spaces in them. These were the cause of my problem.
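A small check for this, as a sketch: look for spaces in the relevant environment variables before starting Spark. The function name and the default variable list are assumptions; add whichever variables hold your Python interpreter path.

```python
import os


def variables_with_spaces(names=("JAVA_HOME", "SPARK_HOME"), environ=None):
    """Return the named variables whose values contain a space."""
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in names
        if " " in environ.get(name, "")
    }


# Example with a made-up environment instead of the real one.
example = {
    "JAVA_HOME": r"C:\Program Files\Java\jdk1.8.0_271",
    "SPARK_HOME": r"C:\spark",
}
print(variables_with_spaces(environ=example))
```

In the example only JAVA_HOME is reported, because its path goes through Program Files.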

"No Java runtime" error in OS X 10.10 using Oracle's 1.8.0 JVM

I've tried 3 different Macs running OS X 10.10, R 3.1.2, Java 1.8.0_25, and rJava 0.9-7. In all three cases rJava installs from source without error, but after running .jinit() it fails to detect Java and prompts to install Java 6 from Apple.
Something similar happens with Netlogo 5.1.0:
I've spent hours researching online, but haven't yet found a solution and tried various things like manually setting JAVA_HOME and LD_LIBRARY_PATH to no avail.
R CMD javareconf
Java interpreter : /usr/bin/java
Java version : 1.8.0_25
Java home path : /Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home/jre
Java compiler : /usr/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/bin/jar
...
~ % echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home/jre
~ % echo $LD_LIBRARY_PATH
/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home/jre/lib/server
~ % R CMD INSTALL rJava_0.9-7.tar.gz
~ % R
library('rJava')
.jinit()
No Java runtime present, requesting install.
I suspect this has to do with Oracle, because rJava loads the correct JVM, but Oracle's code attempts to fall back to Apple Java or something like that.
Any ideas?
These apps rely on Apple's old JVM and support, which is gone on Yosemite. Sun's JDK and JRE do not include the same "bridge" code to run these apps, so OS X will still ask you to install the old Apple-provided JRE so they can be run.
Until these apps are changed not to rely on Apple's JRE, there is nothing else you can do other than install the legacy Java support.
Following the link from sdza , I found the solution. You need to install Apple's Java, which will then actually let you use Oracle's Java as well. Just install this: https://support.apple.com/kb/DL1572 and it should magically start working.
