I have executed the Hive query below:

create table table_llv_N_C as select
table_line_c_passed.id from table_line_n_passed join table_line_c_passed on

and got the following error:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row (tag=1)
{"key":{"joinkey0":"12"},"value":{"_col2":"."},"alias":1} at
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:258) ...
7 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
could only be replicated to 0 nodes instead of minReplication (=1). There
are 2 datanode(s) running and no node(s) are excluded in this operation.
The root cause may be a lack of disk space in the HDFS cluster. The disk-space details are:
hdfs dfs -df -h
Filesystem               Size     Used    Available  Use%
hdfs://x.y.ab.com:8020   159.7 G  21.9 G  110.7 G    14%
table_line_n_passed has 4,767,409 rows and is 1.1 G in size; similarly, table_line_c_passed has 4,717,082 rows and is 1.0 G in size.
Does Hive really require that much space (more than the available 110 G of free space) to process this data? How can I calculate how much free space is required before running a query? Is there any way to run the query within the available free space?
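For reference, a back-of-the-envelope estimate I have been using (the replication and overhead factors below are my own assumptions for illustration, not Hive's actual accounting) multiplies the input size by the HDFS replication factor and a safety factor for intermediate shuffle and output data; a skewed join key can of course inflate the real usage far beyond this:

```python
# Rough scratch-space estimate for a Hive join (illustrative assumptions only).
def estimate_scratch_gb(input_gb, replication=3, overhead=3.0):
    """input_gb:     total size of the joined tables in GB
    replication:     HDFS block replication factor (assumed 3)
    overhead:        assumed factor for shuffle/intermediate/output copies"""
    return input_gb * replication * overhead

# The two tables above total roughly 1.1 G + 1.0 G = 2.1 G.
needed = estimate_scratch_gb(2.1)
print(f"~{needed:.1f} G of free HDFS space")  # ~18.9 G under these assumptions
```

Even with this pessimistic factor the estimate is well under the 110 G reported free, which is part of why the failure is confusing.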
Do I need to set any property or value in the Hive configuration?
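For example, I have seen settings like the following suggested for shrinking intermediate data (a sketch only; whether they help here likely depends on the cluster and Hive version, and the codec choice is just an example):

```sql
-- Compress map-phase intermediate output
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
-- Compress the final CTAS output as well
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
```

Is something along these lines the right direction, or is another property relevant?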
PS: If I use LIMIT 10000 in the above query, it runs fine.
with regards
krish

asked Mar 10 2015 at 03:17 in Hive-User by krish

0 Answers