apache PIG with datafu: Cannot resolve UDF's - java

I'm trying the quickstart from here: http://datafu.incubator.apache.org/docs/datafu/getting-started.html
I tried nearly everything, but I'm sure it must be my fault somewhere. I tried already:
exporting PIG_HOME, CLASSPATH, PIG_CLASSPATH
starting pig with -cpdatafu-pig-incubating-1.3.0.jar
registering datafu-pig-incubating-1.3.0.jar locally and in hdfs => both succesful (at least no error shown)
nothing helped
Trying this on pig:
register datafu-pig-incubating-1.3.0.jar
DEFINE Median datafu.pig.stats.StreamingMedian();
data = load '/user/hduser/numbers.txt' using PigStorage() as (val:int);
data2 = FOREACH (GROUP data ALL) GENERATE Median(data);
or directly
data2 = FOREACH (GROUP data ALL) GENERATE datafu.pig.stats.StreamingMedian(data);
I get this name-resolve error:
2016-06-04 17:22:22,734 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1070: Could not resolve datafu.pig.stats.StreamingMedian using imports: [, java.lang., org.apache.pig.builtin.,
org.apache.pig.impl.builtin.] Details at logfile:
/home/hadoop/pig_1465053680252.log
When I look into the datafu-pig-incubating-1.3.0.jar it looks OK, everything in place. I also tried some Bag functions, same error then.
I think it's kind of a noob-error which I just don't see (as I did not find particular answers for datafu in SO or google), so thanks in advance for shedding some light on this.

Pig script is proper, the only thing that could break is that while registering datafu there were some class dependencies that coudn't been met.
Try to run locally (pig -x local) and see a detailed log.
Check also the version of pig - it should be newer than 0.14.0.

Related

Error while emitting RecordedSimulation in Gatling Tool

Getting below error when I am trying to run gatling.sh file. From my understanding getting compilation issue in RecorderSimulation.scala file while doing gatling.sh. Please see the below Error and help me
JAVA = "java"
11:51:37.496 [ERROR] i.g.c.ZincCompiler$ - Error while emitting
RecordedSimulation
Method too large: RecordedSimulation.<init> ()V
11:51:37.520 [ERROR] i.g.c.ZincCompiler$ - one error found
11:51:37.531 [ERROR] i.g.c.ZincCompiler$ - Compilation crashed
sbt.internal.inc.CompileFailed: null
at sbt.internal.inc.AnalyzingCompiler.compile(AnalyzingCompiler.scala:122)
at sbt.internal.inc.AnalyzingCompiler.compile(AnalyzingCompiler.scala:95)
at sbt.internal.inc.MixedAnalyzingCompiler.$anonfun$compile$4(MixedAnalyzingCompiler.scala:91)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at sbt.internal.inc.MixedAnalyzingCompiler.timed(MixedAnalyzingCompiler.scala:186)
at sbt.internal.inc.MixedAnalyzingCompiler.$anonfun$compile$3(MixedAnalyzingCompiler.scala:82)
at sbt.internal.inc.MixedAnalyzingCompiler.$anonfun$compile$3$adapted(MixedAnalyzingCompiler.scala:77)
at sbt.internal.inc.JarUtils$.withPreviousJar(JarUtils.scala:215)
at sbt.internal.inc.MixedAnalyzingCompiler.compileScala$1(MixedAnalyzingCompiler.scala:77)
at sbt.internal.inc.MixedAnalyzingCompiler.compile(MixedAnalyzingCompiler.scala:146)
at sbt.internal.inc.IncrementalCompilerImpl.$anonfun$compileInternal$1(IncrementalCompilerImpl.scala:343)
at sbt.internal.inc.IncrementalCompilerImpl.$anonfun$compileInternal$1$adapted(IncrementalCompilerImpl.scala:343)
at sbt.internal.inc.Incremental$.doCompile(Incremental.scala:120)
at sbt.internal.inc.Incremental$.$anonfun$compile$4(Incremental.scala:100)
at sbt.internal.inc.IncrementalCommon.recompileClasses(IncrementalCommon.scala:180)
at sbt.internal.inc.IncrementalCommon.cycle(IncrementalCommon.scala:98)
at sbt.internal.inc.Incremental$.$anonfun$compile$3(Incremental.scala:102)
at sbt.internal.inc.Incremental$.manageClassfiles(Incremental.scala:155)
at sbt.internal.inc.Incremental$.compile(Incremental.scala:92)
at sbt.internal.inc.IncrementalCompile$.apply(Compile.scala:75)
at sbt.internal.inc.IncrementalCompilerImpl.compileInternal(IncrementalCompilerImpl.scala:348)
at sbt.internal.inc.IncrementalCompilerImpl.$anonfun$compileIncrementally$1(IncrementalCompilerImpl.scala:301)
at sbt.internal.inc.IncrementalCompilerImpl.handleCompilationError(IncrementalCompilerImpl.scala:168)
at sbt.internal.inc.IncrementalCompilerImpl.compileIncrementally(IncrementalCompilerImpl.scala:248)
at sbt.internal.inc.IncrementalCompilerImpl.compile(IncrementalCompilerImpl.scala:74) at io.gatling.compiler.ZincCompiler$.doCompile(ZincCompiler.scala:211)
at io.gatling.compiler.ZincCompiler$.delayedEndpoint$io$gatling$compiler$ZincCompiler$1(ZincCompiler.scala:216)
at io.gatling.compiler.ZincCompiler$delayedInit$body.apply(ZincCompiler.scala:39)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1$adapted(App.scala:80)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.App.main(App.scala:80)
at scala.App.main$(App.scala:78)
at io.gatling.compiler.ZincCompiler$.main(ZincCompiler.scala:39)
at io.gatling.compiler.ZincCompiler.main(ZincCompiler.scala)
Choose a simulation number:
[0] computerdatabase.BasicSimulation
[1] computerdatabase.advanced.AdvancedSimulationStep01
[2] computerdatabase.advanced.AdvancedSimulationStep02
[3] computerdatabase.advanced.AdvancedSimulationStep03
[4] computerdatabase.advanced.AdvancedSimulationStep04
Please suggest solution
This error is due to the size of the scala file being too large. You need to keep number of requests in a method to under 100. I faced the same issue and resolved it by splitting the script into two methods.
You've managed to generate some piece of code that's too large with regard to the Scala language spec.
Please make sure you're using latest Gatling version (3.3.1 as of now) and upgrade if not.
If problem still happens, please reach out on Gatling community mailing list and provide your code.

Using toUpperCase with Correct Locale

I'm trying to fix errors which are reported by forbiddenapis. I had that line:
paramMap.put(Config.TITLEBOOST.toUpperCase(), titleBoost);
So, its been reported as error as usual. I've tried that:
paramMap.put(Config.TITLEBOOST.toUpperCase(Locale.getDefault()), titleBoost);
and that:
paramMap.put(Config.TITLEBOOST.toUpperCase(Locale.ROOT), titleBoost);
also that:
paramMap.put(Config.TITLEBOOST.toUpperCase(Locale.ENGLISH), titleBoost);
However none of them fixed the error:
[forbiddenapis] Forbidden method invocation:
java.lang.String#toUpperCase() [Uses default locale]
What I miss?
Double-check that the bytecode you are analyzing is actually your most recent build output, and that you're looking at the same line forbiddenapis is :) . This looks to me like your source/bytecode/analysis are falling out of sync — the relevant rule shouldn't flag an error on String.toUpperCase(Locale).
Disclaimer: I haven't used forbiddenapis myself --- I wrote this answer based on the repo and on a blog post I found.

Run an ABCL code that uses cl-cppre

With reference to my previous question,
Executing a lisp function from Java
I was able to call lisp code from Java using ABCL.
But the problem is, the already existing lisp code uses CL-PPCRE package.
I can not compile the code as it says 'CL-PPCRE not found'.
I have tried different approaches to add that package,
including
1) how does one compile a clisp program which uses cl-ppcre?
2)https://groups.google.com/forum/#!topic/cl-ppcre/juSfOhEDa1k
Doesnot work!
Other thing is, that executing (compile-file aima.asd) works perfectly fine although it does also require cl-pprce
(defpackage #:aima-asd
(:use :cl :asdf))
(in-package :aima-asd)
(defsystem aima
:name "aima"
:version "0.1"
:components ((:file "defpackage")
(:file "main" :depends-on ("defpackage")))
:depends-on (:cl-ppcre))
The final java code is
interpreter.eval("(load \"aima/asdf.lisp\")");
interpreter.eval("(compile-file \"aima/aima.asd\")");
interpreter.eval("(compile-file \"aima/defpackage.lisp\")");
interpreter.eval("(in-package :aima)");
interpreter.eval("(load \"aima/aima.lisp\")");
interpreter.eval("(aima-load 'all)");
The error message is
Error loading C:/Users/Administrator.NUIG-1Z7HN12/workspace/aima/probability/domains/edit-nets.lisp at line 376 (offset 16389)
#<THREAD "main" {3A188AF2}>: Debugger invoked on condition of type READER-ERROR
The package "CL-PPCRE" can't be found.
[1] AIMA(1):
Can anyone help me?
You need to load cl-ppcre before you can use it. You can do that by using (asdf:load-system :aima), provided that you put both aima and cl-ppcre into locations that your ASDF searches.
I used QuickLisp to add cl-ppcre (because nothing else worked for me).
Here is what I did
(load \"~/QuickLisp.lisp\")")
(quicklisp-quickstart:install)
(load "~/quicklisp/setup.lisp")
(ql:quickload :cl-ppcre)
First 2 lines are only a one time things. Once quickLisp is installed you can start from line 3.

Owl2Java Code generation issue

I'm using the owl2java plugin to generate Java code from an Ontology file. But I'm always getting de same error.
Exception in thread "main" com.hp.hpl.jena.ontology.ConversionException: Cannot convert node http://www.w3.org/2002/07/owl#bottomObjectProperty to TransitiveProperty
at com.hp.hpl.jena.ontology.impl.TransitivePropertyImpl$1.wrap(TransitivePropertyImpl.java:66)
at com.hp.hpl.jena.enhanced.EnhNode.convertTo(EnhNode.java:142)
at com.hp.hpl.jena.enhanced.EnhNode.convertTo(EnhNode.java:22)
at com.hp.hpl.jena.enhanced.Polymorphic.asInternal(Polymorphic.java:54)
at com.hp.hpl.jena.enhanced.EnhNode.viewAs(EnhNode.java:92)
at com.hp.hpl.jena.enhanced.EnhGraph.getNodeAs(EnhGraph.java:135)
at com.hp.hpl.jena.ontology.impl.OntModelImpl$SubjectNodeAs.map1(OntModelImpl.java:3040)
at com.hp.hpl.jena.ontology.impl.OntModelImpl$SubjectNodeAs.map1(OntModelImpl.java:3033)
at com.hp.hpl.jena.util.iterator.Map1Iterator.next(Map1Iterator.java:35)
at com.hp.hpl.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:68)
at com.hp.hpl.jena.util.iterator.UniqueExtendedIterator.nextIfNew(UniqueExtendedIterator.java:61)
at com.hp.hpl.jena.util.iterator.UniqueExtendedIterator.hasNext(UniqueExtendedIterator.java:69)
at com.hp.hpl.jena.util.iterator.NiceIterator.asList(NiceIterator.java:185)
at com.hp.hpl.jena.util.iterator.NiceIterator.toList(NiceIterator.java:159)
at de.incunabulum.owl2java.core.generator.OwlReader.handleProperties(OwlReader.java:862)
at de.incunabulum.owl2java.core.generator.OwlReader.generateJModel(OwlReader.java:457)
at de.incunabulum.owl2java.core.JenaGenerator.generate(JenaGenerator.java:65)
at onto.main.main(main.java:99)
I have no idea about what I'm doing wrong. Any Ideas?
Thanks you a lot.
I looked at the top line on your exception, and see com.hp.hpl.jena.ontology.impl.TransitivePropertyImpl.
Googling for that leads to a version of the source code. It may not be exactly the same version as you're using, but is probably close enough to be informative. Reading the code leads to these questions:
Does your Model have a profile? It must.
Does the profile support Transitivity? It must.
Are you combining Transitive with something else that it's incompatible with?

error 1200 mismatched input 'as' expecting SEMI_COLON when using DayExtractor in Pig

I'm trying to follow this tutorial to analyze Apache access log files using Pig:
http://venkatarun-n.blogspot.com/2013/01/analyzing-apache-logs-with-pig.html
And i'm stuck with this Pig script:
grpd = GROUP logs BY DayExtractor(dt) as day;
When i execute that in grunt terminal, i get the following error:
ERROR 1200: mismatched input 'as' expecting
SEMI_COLON Failed to parse: mismatched input 'as'
expecting SEMI_COLON
Function DayExtractor is defined from piggybank.jar in this manner:
DEFINE DayExtractor
org.apache.pig.piggybank.evaluation.util.apachelogparser.DateExtractor('yyyy-MM-dd');
Ideas anyone?
I've been searching for awhile about this. Any help would be greatly be appreciated.
I am not sure how the author of the blog post got it to work, but as far as I know, you cannot use as in GROUP BY in pig. Also, I don't think you cannot use UDFs in GROUP BY. May be the author had a different version of pig that supported such operations. To get the same effect, you can split it into two steps:
logs_day = FOREACH logs GENERATE ....., DayExtractor(dt) as day;
grpd = GROUP logs_day BY day;

Categories

Resources