I am getting "Failed to specify server's Kerberos principal name" when attempting to connect to HBase with JanusGraph in a Kerberized Hadoop cluster
First off, a little environment info:
OS: 7.6.1810
Java: 1.8.0_191-b12
Spark: 2.3.2.3.1.0.78-4
YARN: 2.5.0
HBase: 2.0.2.3.1.0.78-4
Hadoop: 3.1.1.3.1.0.78-4
Kerberos: 5 version 1.15.1
JanusGraph: 0.4.0
I did kinit and tested with the bundled Gremlin console to make sure the graph.properties for the env works. It was able to connect, create a simple test graph, add some vertices, and retrieve the stored data after a restart. So cool, the bundled copy works.
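Roughly what that verification looked like (a sketch from memory; substitute your own principal):
kinit devuser@HDPDEV.example.com
cd /home/devuser/janusgraph-0.4.0-hadoop2
./bin/gremlin.sh
gremlin> graph = org.janusgraph.core.JanusGraphFactory.open('conf/janusgraph-hbase.properties')
gremlin> g = graph.traversal()
gremlin> g.addV('testVertex').property('name', 'check').iterate()
gremlin> graph.tx().commit()
gremlin> g.V().values('name')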
For laziness/simplicity I decided to load the spark-shell with the JanusGraph libs. While attempting to connect to the same graph, it started throwing Kerberos errors.
My first thought was a Hadoop/Spark lib/conf conflict (pretty typical), so I built out a very simple, barebones Java app to see if that would work. It got the same errors as Spark.
Spark Invocations -
First attempt:
spark-shell \
--conf spark.driver.userClassPathFirst=true \
--conf spark.executor.userClassPathFirst=true \
--jars /etc/hadoop/conf/core-site.xml,/etc/hbase/conf/hbase-site.xml,groovy-console-2.5.6.jar,javax.servlet-api-3.1.0.jar,netty-buffer-4.1.25.Final.jar,RoaringBitmap-0.5.11.jar,groovy-groovysh-2.5.6-indy.jar,javax.ws.rs-api-2.0.1.jar,netty-codec-4.1.25.Final.jar,activation-1.1.jar,groovy-json-2.5.6-indy.jar,jaxb-api-2.2.2.jar,netty-common-4.1.25.Final.jar,airline-0.6.jar,groovy-jsr223-2.5.6-indy.jar,jaxb-impl-2.2.3-1.jar,netty-handler-4.1.25.Final.jar,antlr-2.7.7.jar,groovy-swing-2.5.6.jar,jbcrypt-0.4.jar,netty-resolver-4.1.25.Final.jar,antlr-3.2.jar,groovy-templates-2.5.6.jar,jboss-logging-3.1.2.GA.jar,netty-transport-4.1.25.Final.jar,antlr-runtime-3.2.jar,groovy-xml-2.5.6.jar,jcabi-log-0.14.jar,noggit-0.6.jar,aopalliance-repackaged-2.4.0-b34.jar,gson-2.2.4.jar,jcabi-manifests-1.1.jar,objenesis-2.1.jar,apacheds-i18n-2.0.0-M15.jar,guava-18.0.jar,jcl-over-slf4j-1.7.25.jar,ohc-core-0.3.4.jar,apacheds-kerberos-codec-2.0.0-M15.jar,hadoop-annotations-2.7.7.jar,je-7.5.11.jar,org.apache.servicemix.bundles.commons-csv-1.0-r706900_3.jar,api-asn1-api-1.0.0-M20.jar,hadoop-auth-2.7.7.jar,jersey-client-1.9.jar,oro-2.0.8.jar,api-util-1.0.0-M20.jar,hadoop-client-2.7.7.jar,jersey-client-2.22.2.jar,osgi-resource-locator-1.0.1.jar,asm-3.1.jar,hadoop-common-2.7.7.jar,jersey-common-2.22.2.jar,paranamer-2.6.jar,asm-5.0.3.jar,hadoop-distcp-2.7.7.jar,jersey-container-servlet-2.22.2.jar,picocli-3.9.2.jar,asm-analysis-5.0.3.jar,hadoop-gremlin-3.4.1.jar,jersey-container-servlet-core-2.22.2.jar,protobuf-java-2.5.0.jar,asm-commons-5.0.3.jar,hadoop-hdfs-2.7.7.jar,jersey-core-1.9.jar,py4j-0.10.7.jar,asm-tree-5.0.3.jar,hadoop-mapreduce-client-app-2.7.7.jar,jersey-guava-2.22.2.jar,pyrolite-4.13.jar,asm-util-5.0.3.jar,hadoop-mapreduce-client-common-2.7.7.jar,jersey-json-1.9.jar,reflections-0.9.9-RC1.jar,astyanax-cassandra-3.10.2.jar,hadoop-mapreduce-client-core-2.7.7.jar,jersey-media-jaxb-2.22.2.jar,reporter-config-base-3.0.0.jar,astyanax-cassandra-all-shaded-3.10.2.jar,hadoop-mapreduce-client-jobclient-2.7.7.jar,jersey-server-1.9.jar,reporter-config3-3.0.0.jar,astyanax-core-3.10.2.jar,hadoop-mapreduce-client-shuffle-2.7.7.jar,jersey-server-2.22.2.jar,scala-library-2.11.8.jar,astyanax-recipes-3.10.2.jar,hadoop-yarn-api-2.7.7.jar,jets3t-0.7.1.jar,scala-reflect-2.11.8.jar,astyanax-thrift-3.10.2.jar,hadoop-yarn-client-2.7.7.jar,jettison-1.3.3.jar,scala-xml_2.11-1.0.5.jar,audience-annotations-0.5.0.jar,hadoop-yarn-common-2.7.7.jar,jetty-6.1.26.jar,servlet-api-2.5.jar,avro-1.7.4.jar,hadoop-yarn-server-common-2.7.7.jar,jetty-sslengine-6.1.26.jar,sesame-model-2.7.10.jar,avro-ipc-1.8.2.jar,hamcrest-core-1.3.jar,jetty-util-6.1.26.jar,sesame-rio-api-2.7.10.jar,avro-mapred-1.8.2-hadoop2.jar,hbase-shaded-client-2.1.5.jar,jffi-1.2.16-native.jar,sesame-rio-datatypes-2.7.10.jar,bigtable-hbase-1.x-shaded-1.11.0.jar,hbase-shaded-mapreduce-2.1.5.jar,jffi-1.2.16.jar,sesame-rio-languages-2.7.10.jar,caffeine-2.3.1.jar,hibernate-validator-4.3.0.Final.jar,jline-2.14.6.jar,sesame-rio-n3-2.7.10.jar,cassandra-all-2.2.13.jar,high-scale-lib-1.0.6.jar,jna-4.0.0.jar,sesame-rio-ntriples-2.7.10.jar,cassandra-driver-core-3.7.1.jar,high-scale-lib-1.1.4.jar,jnr-constants-0.9.9.jar,sesame-rio-rdfxml-2.7.10.jar,cassandra-thrift-2.2.13.jar,hk2-api-2.4.0-b34.jar,jnr-ffi-2.1.7.jar,sesame-rio-trig-2.7.10.jar,checker-compat-qual-2.5.2.jar,hk2-locator-2.4.0-b34.jar,jnr-posix-3.0.44.jar,sesame-rio-trix-2.7.10.jar,chill-java-0.9.3.jar,hk2-utils-2.4.0-b34.jar,jnr-x86asm-1.0.2.jar,sesame-rio-turtle-2.7.10.jar,chill_2.11-0.9.3.jar,hppc-0.7.1.jar,joda-time-2.8.2.jar,sesame-util-2.7.10.jar,commons-cli-1.3.1.jar,htrace-core-3.1.0-incubating.jar,jsch-0.1.54.jar,sigar-1.6.4.jar,commons-codec-1.7.jar,htrace-core4-4.2.0-incubating.jar,json-20090211_1.jar,slf4j-api-1.7.12.jar,commons-collections-3.2.2.jar,httpasyncclient-4.1.2.jar,json-simple-1.1.jar,slf4j-log4j12-1.7.12.jar,commons-configuration-1.10.jar,httpclient-4.4.1.jar,json4s-ast_2.11-3.5.3.jar,snakeyaml-1.11.jar,commons-crypto-1.0.0.jar,httpcore-4.4.1.jar,json4s-core_2.11-3.5.3.jar,snappy-java-1.0.5-M3.jar,commons-httpclient-3.1.jar,httpcore-nio-4.4.5.jar,json4s-jackson_2.11-3.5.3.jar,solr-solrj-7.0.0.jar,commons-io-2.3.jar,httpmime-4.4.1.jar,json4s-scalap_2.11-3.5.3.jar,spark-core_2.11-2.4.0.jar,commons-lang-2.5.jar,ivy-2.3.0.jar,jsp-api-2.1.jar,spark-gremlin-3.4.1.jar,commons-lang3-3.3.1.jar,jackson-annotations-2.6.6.jar,jsr305-3.0.0.jar,spark-kvstore_2.11-2.4.0.jar,commons-logging-1.1.1.jar,jackson-core-2.6.6.jar,jts-core-1.15.0.jar,spark-launcher_2.11-2.4.0.jar,commons-math3-3.2.jar,jackson-core-asl-1.9.13.jar,jul-to-slf4j-1.7.16.jar,spark-network-common_2.11-2.4.0.jar,commons-net-1.4.1.jar,jackson-databind-2.6.6.jar,junit-4.12.jar,spark-network-shuffle_2.11-2.4.0.jar,commons-pool-1.6.jar,jackson-datatype-json-org-2.6.6.jar,kryo-shaded-4.0.2.jar,spark-tags_2.11-2.4.0.jar,commons-text-1.0.jar,jackson-jaxrs-1.9.13.jar,leveldbjni-all-1.8.jar,spark-unsafe_2.11-2.4.0.jar,compress-lzf-1.0.0.jar,jackson-mapper-asl-1.9.13.jar,libthrift-0.9.2.jar,spatial4j-0.7.jar,concurrentlinkedhashmap-lru-1.3.jar,jackson-module-paranamer-2.6.6.jar,log4j-1.2.16.jar,stax-api-1.0-2.jar,crc32ex-0.1.1.jar,jackson-module-scala_2.11-2.6.6.jar,logback-classic-1.1.3.jar,stax-api-1.0.1.jar,curator-client-2.7.1.jar,jackson-xc-1.9.13.jar,logback-core-1.1.3.jar,stax2-api-3.1.4.jar,curator-framework-2.7.1.jar,jamm-0.3.0.jar,lucene-analyzers-common-7.0.0.jar,stream-2.7.0.jar,curator-recipes-2.7.1.jar,janusgraph-all-0.4.0.jar,lucene-core-7.0.0.jar,stringtemplate-3.2.jar,disruptor-3.0.1.jar,janusgraph-berkeleyje-0.4.0.jar,lucene-queries-7.0.0.jar,super-csv-2.1.0.jar,dom4j-1.6.1.jar,janusgraph-bigtable-0.4.0.jar,lucene-queryparser-7.0.0.jar,thrift-server-0.3.7.jar,ecj-4.4.2.jar,janusgraph-cassandra-0.4.0.jar,lucene-sandbox-7.0.0.jar,tinkergraph-gremlin-3.4.1.jar,elasticsearch-rest-client-6.6.0.jar,janusgraph-core-0.4.0.jar,lucene-spatial-7.0.0.jar,unused-1.0.0.jar,exp4j-0.4.8.jar,janusgraph-cql-0.4.0.jar,lucene-spatial-extras-7.0.0.jar,uuid-3.2.jar,findbugs-annotations-1.3.9-1.jar,janusgraph-es-0.4.0.jar,lucene-spatial3d-7.0.0.jar,validation-api-1.1.0.Final.jar,gbench-0.4.3-groovy-2.4.jar,janusgraph-hadoop-0.4.0.jar,lz4-1.3.0.jar,vavr-0.9.0.jar,gmetric4j-1.0.7.jar,janusgraph-hbase-0.4.0.jar,lz4-java-1.4.0.jar,vavr-match-0.9.0.jar,gprof-0.3.1-groovy-2.4.jar,janusgraph-lucene-0.4.0.jar,metrics-core-3.0.2.jar,woodstox-core-asl-4.4.1.jar,gremlin-console-3.4.1.jar,janusgraph-server-0.4.0.jar,metrics-core-3.2.2.jar,xbean-asm6-shaded-4.8.jar,gremlin-core-3.4.1.jar,janusgraph-solr-0.4.0.jar,metrics-ganglia-3.2.2.jar,xercesImpl-2.9.1.jar,gremlin-driver-3.4.1.jar,javapoet-1.8.0.jar,metrics-graphite-3.2.2.jar,xml-apis-1.3.04.jar,gremlin-groovy-3.4.1.jar,javassist-3.18.0-GA.jar,metrics-json-3.1.5.jar,xmlenc-0.52.jar,gremlin-server-3.4.1.jar,javatuples-1.2.jar,metrics-jvm-3.2.2.jar,zookeeper-3.4.6.jar,gremlin-shaded-3.4.1.jar,javax.inject-1.jar,minlog-1.3.0.jar,zstd-jni-1.3.2-2.jar,groovy-2.5.6-indy.jar,javax.inject-2.4.0-b34.jar,netty-3.10.5.Final.jar,groovy-cli-picocli-2.5.6.jar,javax.json-1.0.jar,netty-all-4.1.25.Final.jar
Second attempt (fewer libs):
spark-shell \
--conf spark.driver.userClassPathFirst=true \
--conf spark.executor.userClassPathFirst=true \
--jars /etc/hadoop/conf/core-site.xml,/etc/hbase/conf/hbase-site.xml,gremlin-core-3.4.1.jar,gremlin-driver-3.4.3.jar,gremlin-shaded-3.4.1.jar,groovy-2.5.7.jar,groovy-json-2.5.7.jar,javatuples-1.2.jar,commons-lang3-3.8.1.jar,commons-configuration-1.10.jar,janusgraph-core-0.4.0.jar,hbase-shaded-client-2.1.5.jar,janusgraph-hbase-0.4.0.jar,high-scale-lib-1.1.4.jar
Java attempt:
java \
-cp /etc/hadoop/conf/core-site.xml:/etc/hbase/conf/hbase-site.xml:hbase-shaded-client-2.1.5.jar:janusgraph-hbase-0.4.0.jar:janusgraph-core-0.4.0.jar:commons-lang3-3.8.1.jar:gremlin-driver-3.4.3.jar:groovy-2.5.7.jar:javatuples-1.2.jar:commons-configuration-1.10.jar:gremlin-core-3.4.1.jar:gremlin-shaded-3.4.1.jar:groovy-json-2.5.7.jar:high-scale-lib-1.1.4.jar:Janusgraph_Ingestion.jar:../janusgraph-0.4.0-hadoop2/lib/commons-lang-2.5.jar:../janusgraph-0.4.0-hadoop2/lib/slf4j-api-1.7.12.jar:../janusgraph-0.4.0-hadoop2/lib/slf4j-log4j12-1.7.12.jar:../janusgraph-0.4.0-hadoop2/lib/log4j-1.2.16.jar:../janusgraph-0.4.0-hadoop2/lib/guava-18.0.jar:../janusgraph-0.4.0-hadoop2/lib/commons-logging-1.1.1.jar:../janusgraph-0.4.0-hadoop2/lib/commons-io-2.3.jar:../janusgraph-0.4.0-hadoop2/lib/htrace-core4-4.2.0-incubating.jar \
Entry
The code executed in the spark-shell (the Java app does the equivalent):
import org.janusgraph.core.JanusGraphFactory;
val g = JanusGraphFactory.open("/home/devuser/janusgraph-0.4.0-hadoop2/conf/janusgraph-hbase.properties").traversal()
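For reference, the Java attempt's Entry class (from the -cp above) boils down to the same open call. A minimal sketch; the count is just a sanity check I added here:

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class Entry {
    public static void main(String[] args) {
        // Fails inside open() with the Kerberos errors shown below
        JanusGraph graph = JanusGraphFactory.open(
                "/home/devuser/janusgraph-0.4.0-hadoop2/conf/janusgraph-hbase.properties");
        GraphTraversalSource g = graph.traversal();
        System.out.println("vertices: " + g.V().count().next()); // sanity check, never reached
        graph.close();
    }
}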
I also tried adding the following before attempting to open the graph:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
// Tell the Hadoop security layer this is a Kerberos cluster
val conf = new Configuration();
conf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);
// Log in from the ticket cache populated by kinit
UserGroupInformation.loginUserFromSubject(null);
Including the graph connection config for completeness:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=hbase
storage.hostname=hosta.example.com:2181,hostb.example.com:2181,hostc.example.com:2181
storage.hbase.table=JgraphTest
storage.hbase.ext.zookeeper.znode.parent=/hbase-secure
storage.batch-loading=false
java.security.krb5.conf=/etc/krb5.conf
storage.hbase.ext.hbase.security.authentication=kerberos
storage.hbase.ext.hbase.security.authorization=true
storage.hbase.ext.hadoop.security.authentication=kerberos
storage.hbase.ext.hadoop.security.authorization=true
storage.hbase.ext.hbase.regionserver.kerberos.principal=hbase/_HOST@HDPDEV.example.com
ids.block-size=10000
ids.renew-timeout=3600000
storage.buffer-size=10000
ids.num-partitions=10
ids.partition=true
schema.default=none
cache.db-cache=true
cache.db-cache-clean-wait=20
cache.db-cache-time=180000
cache.db-cache-size=0.5
Expected result: a usable traversal object.
Actual result below:
19/10/18 11:40:30 TRACE NettyRpcConnection: Connecting to hostb.example.com/192.168.1.101:16000
19/10/18 11:40:30 DEBUG AbstractHBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos principal name is null
19/10/18 11:40:30 TRACE AbstractRpcClient: Call: IsMasterRunning, callTime: 4ms
19/10/18 11:40:30 DEBUG RpcRetryingCallerImpl: Call exception, tries=7, retries=16, started=8197 ms ago, cancelled=false, msg=java.io.IOException: Call to hostb.example.com/192.168.1.101:16000 failed on local exception: java.io.IOException: Failed to specify server's Kerberos principal name, details=, see https://s.apache.org/timeout, exception=org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Call to hostb.example.com/192.168.1.101:16000 failed on local exception: java.io.IOException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.hbase.client.ConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionImplementation.java:1175)
at org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveMasterService(ConnectionImplementation.java:1234)
at org.apache.hadoop.hbase.client.ConnectionImplementation.getMaster(ConnectionImplementation.java:1223)
at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:57)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3089)
at org.apache.hadoop.hbase.client.HBaseAdmin.getHTableDescriptor(HBaseAdmin.java:569)
at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:529)
at org.janusgraph.diskstorage.hbase.HBaseAdmin1_0.getTableDescriptor(HBaseAdmin1_0.java:105)
at org.janusgraph.diskstorage.hbase.HBaseStoreManager.ensureTableExists(HBaseStoreManager.java:726)
at org.janusgraph.diskstorage.hbase.HBaseStoreManager.getLocalKeyPartition(HBaseStoreManager.java:537)
at org.janusgraph.diskstorage.hbase.HBaseStoreManager.getDeployment(HBaseStoreManager.java:376)
at org.janusgraph.diskstorage.hbase.HBaseStoreManager.getFeatures(HBaseStoreManager.java:418)
at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:51)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:79)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
at $line22.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
at $line22.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)
at $line22.$read$$iw$$iw$$iw$$iw.<init>(<console>:37)
at $line22.$read$$iw$$iw$$iw.<init>(<console>:39)
at $line22.$read$$iw$$iw.<init>(<console>:41)
at $line22.$read$$iw.<init>(<console>:43)
at $line22.$read.<init>(<console>:45)
at $line22.$read$.<init>(<console>:49)
at $line22.$read$.<clinit>(<console>)
at $line22.$eval$.$print$lzycompute(<console>:7)
at $line22.$eval$.$print(<console>:6)
at $line22.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:76)
at org.apache.spark.repl.Main$.main(Main.scala:56)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Call to hostb.example.com/192.168.1.101:16000 failed on local exception: java.io.IOException: Failed to specify server's Kerberos principal name
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:221)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)
at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)
at org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92)
at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:307)
at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1377)
at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:329)
at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:315)
at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:929)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection.saslNegotiate(NettyRpcConnection.java:197)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$800(NettyRpcConnection.java:71)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:273)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:261)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:306)
at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:341)
at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:633)
at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.hbase.security.AbstractHBaseSaslRpcClient.<init>(AbstractHBaseSaslRpcClient.java:99)
at org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClient.<init>(NettyHBaseSaslRpcClient.java:43)
at org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.<init>(NettyHBaseSaslRpcClientHandler.java:70)
at org.apache.hadoop.hbase.ipc.NettyRpcConnection.saslNegotiate(NettyRpcConnection.java:194)
... 18 more
Well, I feel like an idiot. The answer turned out to be really simple: the bundled Gremlin console works fine with just storage.hbase.ext.hbase.regionserver.kerberos.principal set, but when using the libs outside of it, storage.hbase.ext.hbase.master.kerberos.principal is needed as well. With that set, things are working; on to the next set of problems I made for myself, lol.
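For anyone else who hits this, the relevant lines end up looking like the following (in my cluster the master principal mirrors the regionserver one; check your hbase-site.xml if yours differs):

storage.hbase.ext.hbase.regionserver.kerberos.principal=hbase/_HOST@HDPDEV.example.com
storage.hbase.ext.hbase.master.kerberos.principal=hbase/_HOST@HDPDEV.example.com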
I'm trying to run SSVD on some tfidf-vectors in Mahout. When I run it in Java code as follows (with the Mahout 0.6 jars), it works fine:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver;

public class SSVDRunner { // wrapper class, name arbitrary

    // same paths as in the CLI run below
    private static final String vectorOutputPath = "/path/to/folder/test_lsa/v_out";
    private static final String ssvdOutputPath = "/path/to/folder/test_lsa/svd_out";

    public static void main(String[] args) throws IOException {
        runSSVDOnSparseVectors(vectorOutputPath
                + "/tfidf-vectors/part-r-00000", ssvdOutputPath, 1, 0, 30000, 1);
    }

    private static void runSSVDOnSparseVectors(String inputPath, String outputPath,
                                               int rank, int oversampling, int blocks,
                                               int reduceTasks) throws IOException {
        Configuration conf = new Configuration();
        // get number of reduce tasks from config?
        SSVDSolver solver = new SSVDSolver(conf, new Path[] { new Path(
                inputPath) }, new Path(outputPath), blocks, rank, oversampling,
                reduceTasks);
        solver.setcUHalfSigma(true);
        solver.setcVHalfSigma(true);
        solver.run();
    }
}
I decided to convert it to a bash script and just use the CLI command instead, but when I do, I get the following error (tried this on versions 0.5 and 0.7; neither worked. I could try 0.6, but I don't think it's a version thing):
[username#hostname lsa]$ $MAHOUT/mahout ssvd -i $H/test_lsa/v_out/tfidf-vectors -o $H/test_lsa/svd_out -k 1 -p 0 -r 30000 -t 1
Running on hadoop, using /usr/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/lib/mahout-distribution-0.7/mahout-examples-0.7-job.jar
12/07/23 15:00:47 INFO common.AbstractJob: Command line arguments: {--abtBlockHeight=[200000], --blockHeight=[30000], --broadcast=[true], --computeU=[true], --computeV=[true], --endPhase=[2147483647], --input=[/path/to/folder/test_lsa/v_out/tfidf-vectors], --minSplitSize=[-1], --outerProdBlockHeight=[30000], --output=[/path/to/folder/test_lsa/svd_out], --oversampling=[0], --pca=[false], --powerIter=[0], --rank=[1], --reduceTasks=[100], --startPhase=[0], --tempDir=[temp], --uHalfSigma=[false], --vHalfSigma=[false]}
12/07/23 15:00:49 INFO input.FileInputFormat: Total input paths to process : 100
Exception in thread "main" java.io.IOException: Q job unsuccessful.
at org.apache.mahout.math.hadoop.stochasticsvd.QJob.run(QJob.java:230)
at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:377)
at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.run(SSVDCli.java:141)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.main(SSVDCli.java:171)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
I'm running this in distributed mode on a cluster. I've read that a Q job failure can have something to do with block size, but mine is greater than p+k. I also realize I'm using a ridiculously small input (4 vectors), but like I said, it works in the Java code. I'm mostly baffled as to why it would work in Java but not in the CLI; I'm pretty sure I'm passing all of the same arguments to the function. I could always package the Java code into a jar and call it from the bash script, but that would be pretty hacky...
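(If it came to that, the hacky route would just be something like the below, assuming the class above is packaged as ssvd-runner.jar; both names are made up:)

hadoop jar ssvd-runner.jar SSVDRunner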
The log for the job says:
2012-07-23 15:00:55,413 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2012-07-23 15:00:55,417 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6ce53220
2012-07-23 15:00:55,638 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2012-07-23 15:00:55,697 ERROR org.apache.mahout.common.IOUtils: new m can't be less than n
java.lang.IllegalArgumentException: new m can't be less than n
at org.apache.mahout.math.hadoop.stochasticsvd.qr.GivensThinSolver.adjust(GivensThinSolver.java:109)
at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.cleanup(QRFirstStep.java:233)
at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.close(QRFirstStep.java:89)
at org.apache.mahout.common.IOUtils.close(IOUtils.java:128)
at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.cleanup(QJob.java:158)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-07-23 15:00:55,731 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-07-23 15:00:55,733 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalArgumentException: new m can't be less than n
at org.apache.mahout.math.hadoop.stochasticsvd.qr.GivensThinSolver.adjust(GivensThinSolver.java:109)
at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.cleanup(QRFirstStep.java:233)
at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.close(QRFirstStep.java:89)
at org.apache.mahout.common.IOUtils.close(IOUtils.java:128)
at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.cleanup(QJob.java:158)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-07-23 15:00:55,736 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
Thanks in advance for the help.
I actually think this was because some of the sequence files in tfidf-vectors were empty, because I was using too many reducers. This seems like a bug to me.
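If you want to check your own input for the same problem, here's a rough sketch (a hypothetical helper, written against the Hadoop 1.x-era SequenceFile API) that flags part files containing zero records:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class EmptyPartFinder {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // args[0] is the vectors directory, e.g. /path/to/folder/test_lsa/v_out/tfidf-vectors
        for (FileStatus status : fs.listStatus(new Path(args[0]))) {
            if (!status.getPath().getName().startsWith("part-")) continue;
            SequenceFile.Reader reader = new SequenceFile.Reader(fs, status.getPath(), conf);
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            boolean hasRecords = reader.next(key, value); // false => header only, no vectors
            reader.close();
            if (!hasRecords) {
                System.out.println("empty: " + status.getPath()); // candidate for removal
            }
        }
    }
}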