R: JDBC() not finding Java drivers path when connecting to Teradata

I'm trying to connect to Teradata through RStudio, but for some reason the JDBC function has problems recognizing the path where the Java drivers sit. See the code below:
library(RODBC)
library(RJDBC)
library(rJava)
# both Java drivers definitely exist
file.exists('/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/tdgssconfig.jar')
[1] TRUE
file.exists('/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/terajdbc4.jar')
[1] TRUE
But when I paste those paths into the JDBC call...
# allow more elaborated error messages to appear
.jclassLoader()$setDebug(1L)
drv = JDBC("com.teradata.jdbc.TeraDriver","/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/tdgssconfig.jar;/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/terajdbc4.jar")
... I get the following error:
RJavaClassLoader: added '/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/tdgssconfig.jar;/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/terajdbc4.jar' to the URL class path loader
WARNING: the path '/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/tdgssconfig.jar;/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/terajdbc4.jar' does NOT exist, it will NOT be added to the internal class path!
RJavaClassLoader: added '/Library/Frameworks/R.framework/Versions/3.4/Resources/library/RJDBC/java/RJDBC.jar' to the URL class path loader
RJavaClassLoader: adding Java archive file '/Library/Frameworks/R.framework/Versions/3.4/Resources/library/RJDBC/java/RJDBC.jar' to the internal class path
RJavaClassLoader@3d4eac69.findClass(com.teradata.jdbc.TeraDriver)
- URL loader did not find it: java.lang.ClassNotFoundException: com.teradata.jdbc.TeraDriver
RJavaClassLoader.findClass("com.teradata.jdbc.TeraDriver")
- trying class path "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/java"
Directory, can get '/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/java/com/teradata/jdbc/TeraDriver.class'? NO
- trying class path "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/RJDBC/java/RJDBC.jar"
JAR file, can get 'com/teradata/jdbc/TeraDriver'? NO
ClassNotFoundException
Error in .jfindClass(as.character(driverClass)[1]) : class not found
Running the same code in R, rather than RStudio, returns the same error.
Also, re-installing RJDBC package (as suggested here) didn't solve the issue.
Can anyone explain why this is happening? Thanks for help.
Here's my session info:
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] devtools_1.13.4 RJDBC_0.2-7 rJava_0.9-9 DBI_0.8 RODBC_1.3-15
[6] dplyr_0.7.4 readr_1.1.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 bindr_0.1 magrittr_1.5 hms_0.3 R6_2.2.2
[6] rlang_0.1.6 httr_1.3.1 tools_3.4.1 git2r_0.19.0 withr_2.1.1.9000
[11] yaml_2.1.16 assertthat_0.2.0 digest_0.6.15 tibble_1.4.2 bindrcpp_0.2
[16] curl_3.0 memoise_1.1.0 glue_1.2.0 compiler_3.4.1 pillar_1.1.0
[21] pkgconfig_2.0.1

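That's a mistake in the path: you have inadvertently pasted the two paths together into one string (note the semicolon between them), so the whole string is treated as a single file name, which does not exist (see the WARNING in your log). You probably intended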
drv <- JDBC("com.teradata.jdbc.TeraDriver",
c("/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/tdgssconfig.jar",
"/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/terajdbc4.jar"))
Note that you can probably make your life easier by simply using
drv <- JDBC("com.teradata.jdbc.TeraDriver", Sys.glob("/Users/KULMAK/Documents/TeraJDBC__indep_indep.16.10.00.03/*.jar"))
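To see the difference for yourself, here is a minimal base R sketch. It does not touch the Teradata jars: the two files are empty stand-ins created in a temporary directory, and the names `jar_dir`, `jars`, `joined` and `split_paths` are only illustrative. It shows that the semicolon-joined string is not an existing file, while the split vector (the shape passed to JDBC() above) is.

```r
# Create two empty stand-in "jar" files in a temporary directory
# (hypothetical files for illustration, not real drivers).
jar_dir <- tempfile("jars")
dir.create(jar_dir)
jars <- file.path(jar_dir, c("tdgssconfig.jar", "terajdbc4.jar"))
file.create(jars)

# Joining with ";" gives ONE string, which is not a path to any file.
joined <- paste(jars, collapse = ";")
joined_exists <- file.exists(joined)

# Splitting on ";" recovers one path per jar, each of which exists.
split_paths <- strsplit(joined, ";", fixed = TRUE)[[1]]
split_exist <- file.exists(split_paths)

print(joined_exists)  # FALSE
print(split_exist)    # TRUE TRUE
```

A character vector like `split_paths`, or the result of Sys.glob(), is what the classPath argument should receive.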

This worked for me. Just make sure that both jars are located in the referenced directory.
library(RJDBC)
drv <- RJDBC::JDBC(driverClass = "com.teradata.jdbc.TeraDriver", classPath = Sys.glob("~/drivers/teradata/*"))
conn <- dbConnect(drv,'jdbc:teradata://<server>/<db>',"un","pw")
result.df<- dbGetQuery(conn,"select * from table")

Related

Why is the CLASSPATH failing for Python but working for RazorSQL?

On Windows Server 2016, we are trying to connect over JDBC with a Jython script, but it gives the following error:
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException:
com.microsoft.sqlserver.jdbc.SQLServerDriver
RazorSQL, on the same machine, connects without error using these settings:
Driver Class: com.microsoft.sqlserver.jdbc.SQLServerDriver
Driver Location: \Program Files (x86)\RazorSQL\drivers\sqlserver\sqljdbc.jar
As a result, we set the CLASSPATH to the same location with this command:
set CLASSPATH=C:\Program Files (x86)\RazorSQL\drivers\sqlserver\sqljdbc.jar
...but when running the code below - we still get the same ClassNotFound error.
This is our Python code:
import jaydebeapi

jclassname = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
database = "our_database_name"
db_elem = ";databaseName={}".format(database) if database else ""
host = "###.##.###.###"  # ip address
port = "1433"
user = "user_name"
password = "password"
url = (
    "jdbc:sqlserver://{host}:{port}{db_elem}"
    ";user={user};password={password}".format(
        host=host, port=port, db_elem=db_elem,
        user=user, password=password)
)
print url

driver_args = [url]
jars = None
libs = None
db = jaydebeapi.connect(jclassname, driver_args, jars=jars,
                        libs=libs)
This is how we are running our Python script:
C:\jython2.7.0\bin\jython.exe C:\path_to_our_script.py
How is it that RazorSQL connects fine, but somehow Python cannot? How do we get rid of this CLASSPATH error?
You have to load the JARs at runtime using the system Classloader.
Please refer to this answer.
The following code snippet has been taken from this Gist.
def loadJar(jarFile):
    '''load a jar at runtime using the system Classloader (needed for JDBC)
    adapted from http://forum.java.sun.com/thread.jspa?threadID=300557
    Author: Steve (SG) Langer Jan 2007 translated the above Java to Jython
    Reference: https://wiki.python.org/jython/JythonMonthly/Articles/January2007/3
    Author: seansummers@gmail.com simplified and updated for jython-2.5.3b3+
    >>> loadJar('jtds-1.3.1.jar')
    >>> from java import lang, sql
    >>> lang.Class.forName('net.sourceforge.jtds.jdbc.Driver')
    <type 'net.sourceforge.jtds.jdbc.Driver'>
    >>> sql.DriverManager.getDriver('jdbc:jtds://server')
    jTDS 1.3.1
    '''
    from java import io, net, lang
    u = io.File(jarFile).toURL() if type(jarFile) <> net.URL else jarFile
    m = net.URLClassLoader.getDeclaredMethod('addURL', [net.URL])
    m.accessible = 1
    m.invoke(lang.ClassLoader.getSystemClassLoader(), [u])

if __name__ == '__main__':
    import doctest
    doctest.testmod()
Also look at - https://wiki.python.org/jython/JythonMonthly/Articles/January2007/3

JVM Error While Writing Data Frame to Oracle Database using parLapply

I want to parallelize my data writing process. I am writing a data frame to Oracle Database. This data has 4 million rows and 8 columns. It takes 6.5 hours without parallelizing.
When I try to go parallel, I get the error
Error in checkForRemoteErrors(val) :
7 nodes produced errors; first error: No running JVM detected. Maybe .jinit() would help.
I know this error, and I can solve it when I work in a single process. But I do not know how to tell the other worker nodes of the cluster where Java is located. Here is my code:
Sys.setenv(JAVA_HOME='C:/Program Files/Java/jre1.8.0_181')
library(rJava)
library(RJDBC)
library(DBI)
library(compiler)
library(dplyr)
library(data.table)
jdbcDriver = JDBC("oracle.jdbc.OracleDriver", classPath = "C:/Program Files/directory/ojdbc6.jar", identifier.quote = "\"")
jdbcConnection = dbConnect(jdbcDriver, "jdbc:oracle:thin:@//XXXXX", "YYYYY", "ZZZZZ")
By using Sys.setenv(JAVA_HOME='C:/Program Files/Java/jre1.8.0_181') I solve the same problem for single core. But when I go parallel
library(parallel)
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
clusterExport(cl, varlist = list("jdbcConnection", "brand3.merge.u"))
clusterEvalQ(cl, .libPaths("C:/Users/onur.boyar/Documents/R/win-library/3.5"))
clusterEvalQ(cl, library(RJDBC))
clusterEvalQ(cl, library(rJava))
parLapply(cl, 1:length(brand3.merge.u$CELL_PH_NUM), function(x) dbSendUpdate(jdbcConnection, "INSERT INTO xxnvdw.an_cust_analytics VALUES(?,?,?,?,?,?,?,?)", brand3.merge.u[x, 1], brand3.merge.u[x,2], brand3.merge.u[x,3],brand3.merge.u[x,4],brand3.merge.u[x,5],brand3.merge.u[x,6],brand3.merge.u[x,7],brand3.merge.u[x,8]))
#brand3.merge.u is my data frame that I try to write.
I get the above error and I do not know how to set my Java location for other nodes.
I want to use parLapply since it is faster than foreach. Any help would be appreciated. Thanks!
JAVA_HOME environment variable
If the problem really is with the location of Java, you could set the environment variable in your .Renviron file, which is likely located at ~/.Renviron. Add the following line to that file and it will be picked up by all R sessions run by your user:
JAVA_HOME='C:/Program Files/Java/jre1.8.0_181'
Alternatively, you can just add that location to your PATH environment variable.
JVM Initialization via rJava
On the other hand, the error message may simply mean that no JVM has been initialized on the workers, which you can solve by calling .jinit() there. A minimal example:
library(parallel)
cl <- makeCluster(detectCores())
parallel::parLapply(cl, 1:5, function(x) {
  rJava::.jinit()
  rJava::.jnew(class = "java/lang/Integer", x)$toString()
})
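The same idea extends to anything a worker needs exactly once, such as a JVM or a database connection: set it up on each worker with clusterEvalQ() instead of exporting it from the master session (as far as I know, a live connection object created in the master is not usable in another process). Below is a minimal sketch using only the parallel package. The variable `worker_resource` is a made-up stand-in for whatever each worker would really create, for example by calling rJava::.jinit() and dbConnect() in that place.

```r
library(parallel)

cl <- makeCluster(2)

# One-time setup on every worker. In the setting of this question, this is
# where each worker would start its own JVM and open its own connection;
# a plain string stands in for that resource here (assumption, illustration only).
clusterEvalQ(cl, {
  worker_resource <- paste("resource of process", Sys.getpid())
  NULL  # avoid sending anything large or unserializable back to the master
})

# The function runs on the workers and finds worker_resource in the
# worker's own global environment, not in the master session.
res <- parLapply(cl, 1:4, function(i) worker_resource)

stopCluster(cl)
print(unique(unlist(res)))
```

Each worker reports its own process id, which shows that the setup ran separately in every worker process.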
Working around Java use
This was not specifically asked, but you can also work around the need for Java dependency using ODBC drivers, which for Oracle should be accessible here:
con <- DBI::dbConnect(
odbc::odbc(),
Driver = "[your driver's name]",
...
)

apache PIG with datafu: Cannot resolve UDF's

I'm trying the quickstart from here: http://datafu.incubator.apache.org/docs/datafu/getting-started.html
I have tried nearly everything, so I'm sure the fault must be mine somewhere. So far I have tried:
exporting PIG_HOME, CLASSPATH, PIG_CLASSPATH
starting pig with -cpdatafu-pig-incubating-1.3.0.jar
registering datafu-pig-incubating-1.3.0.jar locally and in hdfs => both successful (at least no error shown)
Nothing helped.
Trying this on pig:
register datafu-pig-incubating-1.3.0.jar
DEFINE Median datafu.pig.stats.StreamingMedian();
data = load '/user/hduser/numbers.txt' using PigStorage() as (val:int);
data2 = FOREACH (GROUP data ALL) GENERATE Median(data);
or directly
data2 = FOREACH (GROUP data ALL) GENERATE datafu.pig.stats.StreamingMedian(data);
I get this name-resolve error:
2016-06-04 17:22:22,734 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1070: Could not resolve datafu.pig.stats.StreamingMedian using imports: [, java.lang., org.apache.pig.builtin.,
org.apache.pig.impl.builtin.] Details at logfile:
/home/hadoop/pig_1465053680252.log
When I look into the datafu-pig-incubating-1.3.0.jar it looks OK, everything in place. I also tried some Bag functions, same error then.
I think it's kind of a noob-error which I just don't see (as I did not find particular answers for datafu in SO or google), so thanks in advance for shedding some light on this.
The Pig script itself looks correct. The only thing that could break is that, when datafu was registered, some of its class dependencies could not be met.
Try running locally (pig -x local) and check the detailed log.
Also check the version of Pig: it should be newer than 0.14.0.

Postgres DB can't connect to R with RJDBC

I've been trying to query data from a PostgreSQL DB via R. I've tried skinning the cat with a few different packages (RODBC, RJDBC, DBI, RPostgres, etc), but I seem to keep getting driver errors. Oddly, I never have trouble using the same drivers/URL's and settings to connect to Postgres from SQLWorkbench/J.
I've tried using postgresql-9.2-1002.jdbc4.jar and postgresql-9.3-1100.jdbc41.jar, as well as the generic "PostgreSQL" driver in R. The two jar files are, respectively, (i) the driver I use all the time with SQLWorkbench/J and (ii) a slightly newer version of that same driver. Yet, when I try to use it...
drv_custom <- JDBC(driverClass = "org.postgresql.Driver", classPath="/Users/xxxx/postgresql-9.3-1100.jdbc41.jar")
I get an error:
Error in .jfindClass(as.character(driverClass)[1]) : class not found
OK, so next I try it with the generic driver:
drv_generic <- dbDriver("PostgreSQL")
and strangely, it doesn't want me to enter a username:
>con <- dbConnect(drv=drv_generic, "jdbc:postgresql://xxx.xxxxx.com", port=xxxx, uid="xxxx", password="xxxx")
>Error in postgresqlNewConnection(drv, ...) : unused argument (uid = "xxxx")
so I try it without user/uid:
con <- dbConnect(drv_generic, "jdbc:postgresql://padb-01.jiwiredev.com:5439", password="paraccel")
and get an error....
Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (could not connect jdbc:postgresql://padb-01.xxx.com:5439@local on dbname "jdbc:postgresql://xxxx.xxxx.com:5439")
Apparently the syntax is wrong?
Then I circle back to trying the "custom" driver (either one of the .jar files from earlier) but without the driverClass specified.
drv_custom1 <- JDBC( classPath="/Users/xxxx/postgresql-9.2-1002.jdbc4.jar")
con <- dbConnect(drv=drv_custom1, "jdbc:postgresql://xxx.xxx.com", port=5439, uid="paraccel", pwd="paraccel")
and get this error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
RcallMethod: attempt to call a method of a NULL object.
I tried it again with a slight alteration to the syntax:
con <- dbConnect(drv=drv_custom1, url="jdbc:postgresql://xxxx.xxxx.com", port=xxxx, uid="xxxx", pwd="xxxxx",dsn="xxxx")
and got the same error. I tried a number of other variations/approaches as well. I think part of my confusion comes from the fact that the documentation is handled in a very piecemeal way between packages like DBI and those that build upon it like RJDBC, so that when I look at documentation such as ?dbConnect many of the options I need to specify are not even mentioned, and I've been working based off of miscellaneous Google search results related to these packages/errors.
One thread I found suggested trying
.jaddClassPath( "xxxxx/postgresql-9.2-1002.jdbc4.jar" )
first, but that didn't seem to help.
I also tried using
x <- PostgreSQL(max.con = 16, fetch.default.rec = 500, force.reload = FALSE)
to no avail and I experimented with RODBC as the driver.
UPDATE:
I tried using an older version of the driver (jdbc3 instead of jdbc4), restarting R, and detaching all unnecessary packages.
I was able to load the driver
> drv_custom <- JDBC(driverClass = "org.postgresql.Driver", classPath="/xxxxx/xxxxx/postgresql-9.3-1100.jdbc3.jar")
but I still can't connect...
> con <- dbConnect(drv=drv_custom, "jdbc:postgresql://xxxxx.xxxxx.com", port=5439, uid="xxxxx", pwd="xxxxx")
Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) :
Unable to connect JDBC to jdbc:postgresql://xxxxx.xxxx.com
This works for me:
library(RJDBC)
drv <- JDBC("org.postgresql.Driver","C:/R/postgresql-9.4.1211.jar")
con <- dbConnect(drv, url="jdbc:postgresql://host:port/dbname", user="<user name>", password="<password>")
The trick was to include the port and dbname in the url. For some reason, jdbc:postgresql does not seem to read that information from the dbConnect parameters.
If you are not sure what the dbname is, try postgres.
If you are not sure what the port is, try 5432 (the PostgreSQL default).
So a typical call would look like:
con <- dbConnect(drv, url="jdbc:postgresql://10.10.10.10:5432/postgres", user="<user name>", password="<password>")
You can get the jar file from https://jdbc.postgresql.org/
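If you build the URL in several places, a small helper keeps the host, port and dbname in the format shown above. This is only a sketch: `pg_jdbc_url` is a made-up name, not a function from RJDBC or DBI, and the defaults are simply the guesses mentioned in this answer.

```r
# Hypothetical helper (not part of RJDBC or DBI): assemble a PostgreSQL
# JDBC URL of the form jdbc:postgresql://host:port/dbname.
pg_jdbc_url <- function(host, port = 5432, dbname = "postgres") {
  sprintf("jdbc:postgresql://%s:%d/%s", host, as.integer(port), dbname)
}

url <- pg_jdbc_url("10.10.10.10")
print(url)  # "jdbc:postgresql://10.10.10.10:5432/postgres"
```

The resulting string can then be passed as the url argument of dbConnect(), for example dbConnect(drv, url = url, user = "<user name>", password = "<password>").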
It took some troubleshooting on IRC, but here's what had to happen:
I needed to clear the workspace, detach RODBC and RJDBC, then restart
Load RPostgreSQL
Use only the generic driver
Modify the syntax to
con <- dbConnect(drv=drv_generic, "xxx.xxx.com", port=vvvv, user="ffff", password="ffff", dbname="ggg")
Note: removing the jdbc:postgresql: part was key. Those would've been necessary with RJDBC, but RJDBC turned out to be an unnecessarily painful route.

Error writing xlsx in R: Could not initialize class sun.java2d.Disposer

I'm using the xlsx package to write Excel files in R:
addPicture('trend_indirect.png' ,sheet1)
addDataFrame(df.ssis_duplmonth ,sheet1, startRow=22)
addDataFrame(df.ssis_dupltrans ,sheet1, startRow=35)
addDataFrame(df.ssis_duplmonth_dir, sheet2, startRow=22)
addDataFrame(df.ssis_dupltrans_dir, sheet2, startRow=55)
saveWorkbook(wb, file="SSIS_import_controls.xlsx")
At this point I get the following error:
> addDataFrame(df.ssis_duplmonth ,sheet1, startRow=22)
Error in .jcall("RJavaTools", "Z", "hasField", .jcast(x, "java/lang/Object"), :
java.lang.NoClassDefFoundError: Could not initialize class sun.java2d.Disposer
R version 2.15.2, 32bit.
Thanks
Edit: I can't really make it reproducible as probably the issue is in my settings but I get the error when I run this:
library('xlsx')
df.test <- iris[1:5, ]
wb <- createWorkbook()
sheet1 <- createSheet(wb, 'Indirect Sales')
addPicture('trend_indirect.png' ,sheet1)
addDataFrame(df.test ,sheet1, startRow=22)
saveWorkbook(wb, file="stack_test.xlsx")
The image is just a simple ggplot graph saved in png. Thanks
Try installing libxtst. That solved a similar problem for me.
I also installed fontconfig and libcups in the course of solving my issue, in case it wasn't libxtst that fixed it.
I had the same exception, but while running a Java program on Ubuntu 12.
I installed libxtst6 and added this Java parameter to my JAVA_OPTS variable: -Djava.awt.headless=true
After that it worked fine.
