I have a simple query with grouping, something similar to the one below:
SELECT col1, col2, col3, min(date), count(*)
FROM tblX
WHERE partitionDate="20141107"
GROUP BY col1, col2, col3;
When I run this query through WebHCat everything works fine. But when I try
to run it from the Hive shell I get an error like this:
Setting job diagnostics to REDUCE capability required is more than the
supported max container capability in the cluster. Killing the Job.
reduceResourceReqt: 21238 maxContainerCapability:8192
I tried setting this in Hive: SET hive.exec.reducers.max92
but it doesn't change anything. What did I do wrong?
asked Nov 7 2014 at 06:26 in Hive-User by Ja Sam

3 Answers

Which scheduler are you using? FairScheduler can cause issues like this.
Could you also share your environment details? Tuning the parameters in the
YARN/MapReduce configs might help, but please do so carefully, as a careless
change can throw other things out of balance.
regards
Devopam
answered Nov 7 2014 at 06:34 by Devopam Mittra
I don't use any scheduler. Anyway, this error happens only when we try to run
this query from Hive. In WebHCat everything works fine.
answered Nov 7 2014 at 06:41 by Ja Sam
I found the problem. The yarn-site.xml on the namenode had a different
configuration from the yarn-site.xml on the datanodes.
I still don't know why, but this is easy to fix.
answered Nov 7 2014 at 23:33 by Ja Sam
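
For anyone hitting the same mismatch, a quick way to spot it is to diff the properties of the two yarn-site.xml files. Below is a minimal sketch in Python, not something taken from the thread: the helper names (load_properties, diff_properties) and the sample XML values are hypothetical and only illustrate the idea. The maxContainerCapability in the error typically reflects yarn.scheduler.maximum-allocation-mb, so that is the property used in the sample.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(a, b):
    """Return {name: (value_in_a, value_in_b)} for properties that differ.

    A value of None means the property is missing on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical sample contents standing in for the two yarn-site.xml files.
# The values are illustrative only; they are not the asker's real settings.
NAMENODE_XML = """<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>8192</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>24576</value></property>
</configuration>"""

DATANODE_XML = """<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>24576</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>24576</value></property>
</configuration>"""

if __name__ == "__main__":
    differences = diff_properties(
        load_properties(NAMENODE_XML), load_properties(DATANODE_XML)
    )
    for name, (nn_value, dn_value) in differences.items():
        print(f"{name}: namenode={nn_value} datanode={dn_value}")
```

To use it on a real cluster, read the actual file contents from each host (for example with open(path).read()) in place of the sample strings.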