I have a Jython UDF I've written that works fine in local mode but
bombs out when I run it on my cluster.
I'm running 0.8.0, and my stack trace and environment variables are below.
java.io.IOException: Deserialization error: could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments
'[src/apachelogs.py, extract_apache_log]'
at org.apache.pig.impl.util.ObjectSerializer
Hi all,
I've been using Apache Pig to do some ETL work, but ran into a weird
problem today when trying Python UDFs.
I borrowed an example from
http://sundaycomputing.blogspot.com/2011/01/python-udfs-from-pig-scripts.html
It worked well in local mode, but not in MapReduce mode.
Since my team has already been using Pig for quite a while, it's
really hard to drop it, so please, if anyone could help.
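For context, the kind of UDF involved looks roughly like this. This is a hypothetical sketch, not the blog's actual code: the `outputSchema` stub below stands in for Pig's Jython decorator so the function runs as plain Python, and the schema string and field names are assumptions.

```python
import re

# Stand-in for Pig's @outputSchema decorator (a no-op here), so the
# function below can be exercised as ordinary Python.
def outputSchema(schema):
    def wrap(func):
        return func
    return wrap

# Common Log Format: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

@outputSchema("log:tuple(host:chararray,time:chararray,"
              "request:chararray,status:int,size:int)")
def extract_apache_log(line):
    # Return (host, time, request, status, size), or None if the line
    # does not match the Common Log Format.
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    host, _ident, _user, time, request, status, size = m.groups()
    return (host, time, request, int(status),
            0 if size == '-' else int(size))
```

Testing a UDF like this as plain Python first helps separate parsing bugs from the cluster-side deserialization problem above.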
Hi,
I am new to Pig.
Can somebody help me set up debug mode in Eclipse, in easy steps?
I would really appreciate the help.
Thanks
Hi everyone,
My Pig script generates the following -- results are stored in part-m-00000 to part-m-00004 files.
-bash-4.1$ hadoop dfs -ls /scratch/ItemIds
Found 7 items
-rw-r--r-- 1 userid supergroup 0 2013-12-23 11:13 /scratch/ItemIds/_SUCCESS
drwxr-xr-x - userid supergroup 0 2013-12-23 11:12 /scratch/ItemIds/_logs
-rw-r--r-- 1 userid supergroup 276019 2013
Hi all,
New to Pig. The simplest "load" and then "dump" statements do not work
in MapReduce mode :-(
Here is the error information I get. I am wondering what "Queue 'default'
does not exist" means.
2009-11-22 21:53:25,317 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2009-11-22 21:53:25,317 [main] INFO
org.apache
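For what it's worth, that message usually means the cluster's scheduler is configured with named queues and none of them is called "default", so the job has to be submitted to a queue that actually exists. A sketch of the workaround (the queue name below is a placeholder, not something from the original post):

```
-- at the top of the Pig script; 'myqueue' is a placeholder for a
-- queue your cluster admin has actually configured
set mapred.job.queue.name 'myqueue';
```

Which queues exist depends on the cluster's scheduler configuration, so the right value has to come from whoever administers the cluster.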
Hey All,
To isolate an issue that I am facing, I have written a very basic custom UDF.
When running the MapReduce job I observe the error below.
Surprisingly, I am not getting any error when I run it in local mode or MapReduce mode, but when I run it from my app using the PigServer Java API I get the error below.
java.io.IOException: Deserialization error: Cannot instantiate: myudf.UPPER
Hi,
I am a newbie to Pig.
Can you guide me or point me to easy steps to set up debug mode in
Eclipse?
I really appreciate your help.
Thanks
You received this message because you are subscribed to the Google Groups "CDH Users" group.
Hello,
Some questions about the HBase/Hive integration:
I am running a Cloudera Hadoop/HBase cluster and want to access HBase tables from Hive.
This actually works fine with the CLI, like
sudo -u hdfs hive --auxpath /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar,/usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar,/usr/lib/hive/lib/zookeeper-3.3.1.jar
The ZooKeeper quorum is defined in the hive-site.xml.
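For reference, the quorum entry in hive-site.xml looks like the following (the host names are placeholders, not values from the original post):

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```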
Hi,
We recently updated our Hadoop from CDH2 to CDH3b4, and had problems
using some old Python UDFs. Running in local mode still works, but in
Hadoop mode, it gives errors like "could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments...".
Has anyone seen a similar error with Python UDFs on this Hadoop distribution?
We are using Pig 0.8.0. Thanks!
Regards
Shawn
Hello!
I created a custom UDF and I need it to have different outputSchema columns (in one
case, 2 chararray fields; in the other, 3 chararray fields for
output) depending on a UDF input parameter (mode_1 = 'one'):
..FLATTEN(myUDF('int, char', 'one'))
I overrode getOutputSchema:
@Override
public Schema getOutputSchema(Schema input) {
    Schema schema = new Schema();
    // check mode_1 parameter 'one' -