I have a Jython UDF I've written that works fine in local mode but
bombs out when I run it on my cluster.
I'm running 0.8.0, and my stack trace and environment variables are below.
java.io.IOException: Deserialization error: could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments
'[src/apachelogs.py, extract_apache_log]'
at org.apache.pig.impl.util.ObjectSerializer
Hi all,
I've been using Apache Pig to do some ETL work, but ran into a weird
problem today when trying Python UDFs.
I borrowed an example from
http://sundaycomputing.blogspot.com/2011/01/python-udfs-from-pig-scripts.html
It worked well in local mode, but not in MapReduce mode.
Since my team has already been using Pig for quite a while, it's
really hard to drop it, so please, if anyone could help.
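For context, the kind of UDF involved looks roughly like this. This is a hypothetical sketch, not the blog's actual code: the `outputSchema` stub below stands in for Pig's Jython decorator so the function runs as plain Python, and the schema string and field names are assumptions.

```python
import re

# Stand-in for Pig's @outputSchema decorator (a no-op here), so the
# function below can be exercised as ordinary Python.
def outputSchema(schema):
    def wrap(func):
        return func
    return wrap

# Common Log Format: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

@outputSchema("log:tuple(host:chararray,time:chararray,"
              "request:chararray,status:int,size:int)")
def extract_apache_log(line):
    # Return (host, time, request, status, size), or None if the line
    # does not match the Common Log Format.
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    host, _ident, _user, time, request, status, size = m.groups()
    return (host, time, request, int(status),
            0 if size == '-' else int(size))
```

Testing a UDF like this as plain Python first helps separate parsing bugs from the cluster-side deserialization problem above.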
Hi,
I am new to Pig.
Can somebody help me set up debug mode in Eclipse, in easy steps?
I would really appreciate the help.
Thanks
Hi everyone,
My Pig script generates the following -- results are stored in part-m-00000 to part-m-00004 files.
-bash-4.1$ hadoop dfs -ls /scratch/ItemIds
Found 7 items
-rw-r--r-- 1 userid supergroup 0 2013-12-23 11:13 /scratch/ItemIds/_SUCCESS
drwxr-xr-x - userid supergroup 0 2013-12-23 11:12 /scratch/ItemIds/_logs
-rw-r--r-- 1 userid supergroup 276019 2013
Hi all,
New to Pig. The simplest "load" and then "dump" statements do not work
in MapReduce mode :-(
Here is the error information I get. I am wondering what "Queue 'default'
does not exist" means.
2009-11-22 21:53:25,317 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2009-11-22 21:53:25,317 [main] INFO
org.apache
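For what it's worth, that message usually means the cluster's scheduler is configured with named queues and none of them is called "default", so the job has to be submitted to a queue that actually exists. A sketch of the workaround (the queue name below is a placeholder, not something from the original post):

```
-- at the top of the Pig script; 'myqueue' is a placeholder for a
-- queue your cluster admin has actually configured
set mapred.job.queue.name 'myqueue';
```

Which queues exist depends on the cluster's scheduler configuration, so the right value has to come from whoever administers the cluster.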
Hey All,
To isolate an issue that I am facing, I have written a very basic custom UDF.
When running the MapReduce job I observe the error below.
Surprisingly, I am not getting any error when I run it in local mode or MapReduce mode, but when I run it from my app using the PigServer Java API I get the error below.
java.io.IOException: Deserialization error: Cannot instantiate: myudf.UPPER
Hi,
I am a newbie to Pig.
Can you guide me or point me to easy steps to set up debug mode in
Eclipse?
I really appreciate your help.
Thanks
You received this message because you are subscribed to the Google Groups "CDH Users" group.
Hello,
Some questions about the HBase/Hive integration:
I am running a Cloudera Hadoop/HBase cluster and want to access HBase tables from Hive.
This actually works fine with the CLI, like
sudo -u hdfs hive --auxpath /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar,/usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar,/usr/lib/hive/lib/zookeeper-3.3.1.jar
The ZooKeeper quorum is defined in the hive-site.xml.
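For reference, the quorum entry in hive-site.xml looks like the following (the host names are placeholders, not values from the original post):

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```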
Hi,
We recently updated our Hadoop from CDH2 to CDH3b4, and had problems
using some old Python UDFs. Running in local mode still works, but in
Hadoop mode, it gives errors like "could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments...".
Has anyone seen a similar error with Python UDFs on this Hadoop distribution?
We are using Pig 0.8.0. Thanks!
Regards
Shawn
Hello!
I created a custom UDF and I need it to have different outputSchema columns (in one
case, 2 chararray fields; in the other, 3 chararray fields for
output) depending on a UDF input parameter (mode_1 = 'one'):
..FLATTEN(myUDF('int, char', 'one'))
I overrode getOutputSchema:
@Override
public Schema getOutputSchema(Schema input) {
    Schema schema = new Schema();
    // check mode_1 parameter 'one' -