Hi all,
 I am writing a Hive client that runs a Hive query through the Hive JDBC driver.
 Since the amount of data is huge, I would really like to see the progress while
the query is running.
 Is there any way I can get the job progress?
Thanks
Haijia
asked Sep 14 2012 at 19:17 in Hive-User by Haijia Zhou

4 Answers

I'm not familiar with JDBC, but it seems Thrift can't do this.
answered Sep 17 2012 at 04:17 by MiaoMiao
One thing you can try: when you run a Hive query, it in turn runs MapReduce jobs on the server. There you can capture the map and reduce completion percentages and send them to the client side to drive a progress bar or other feedback.
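As a rough sketch of the capturing step, the helper below pulls the stage name and the map and reduce percentages out of a single progress line. It assumes the server-side Hive output contains lines shaped like `Stage-1 map = 45%,  reduce = 10%`; the class name `StageProgressParser` and the sample line are made up for illustration, and how the lines reach your code (log tailing, a wrapper around the CLI, etc.) is left open.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hive: parses one progress line. */
class StageProgressParser {
    // Assumed line shape: "<timestamp> Stage-1 map = 45%,  reduce = 10%"
    private static final Pattern PROGRESS = Pattern.compile(
            "(Stage-\\d+)\\s+map\\s*=\\s*(\\d+)%,\\s*reduce\\s*=\\s*(\\d+)%");

    /** Immutable holder for the values found on one line. */
    static final class Progress {
        final String stage;
        final int mapPercent;
        final int reducePercent;

        Progress(String stage, int mapPercent, int reducePercent) {
            this.stage = stage;
            this.mapPercent = mapPercent;
            this.reducePercent = reducePercent;
        }
    }

    /** Returns the parsed progress, or empty if the line is not a progress line. */
    static Optional<Progress> parse(String line) {
        Matcher m = PROGRESS.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Progress(
                m.group(1),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3))));
    }

    public static void main(String[] args) {
        // Sample line invented for this example.
        String line = "2012-09-17 05:40:12,345 Stage-1 map = 45%,  reduce = 10%";
        parse(line).ifPresent(p -> System.out.println(
                p.stage + " map=" + p.mapPercent + "% reduce=" + p.reducePercent + "%"));
    }
}
```

Lines that do not match are simply skipped, so the same method can be fed every line of output without filtering first.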
Regards
∞ Shashwat Shriparv
answered Sep 17 2012 at 05:40 by shashwat shriparv
The JDBC driver uses Thrift, so if Thrift can't, then JDBC can't.
This can be surprisingly difficult to do. Hive can split a query into several Hadoop jobs; some will run in parallel and some will run in sequence.
I've used Oracle in the past (10 and 11) and could never find out there either how long a large job would take, which leads me to suspect it's not a trivial thing to do.
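To make the difficulty concrete, here is a deliberately naive estimate, assuming you somehow know the total number of jobs, how many have finished, and the map and reduce percentages of the one currently running. The class `NaiveQueryProgress` and its parameters are hypothetical. It treats every job as equally expensive and strictly sequential, which is exactly what does not hold once jobs differ in size or run in parallel.

```java
/** Hypothetical helper: a naive overall-progress estimate for a multi-job query. */
class NaiveQueryProgress {

    /**
     * @param totalJobs     number of jobs the query was split into (assumed known)
     * @param finishedJobs  number of jobs already completed
     * @param mapPercent    map progress of the currently running job, 0..100
     * @param reducePercent reduce progress of the currently running job, 0..100
     * @return estimated fraction complete in the range 0.0..1.0
     */
    static double estimate(int totalJobs, int finishedJobs, int mapPercent, int reducePercent) {
        if (totalJobs <= 0 || finishedJobs < 0 || finishedJobs > totalJobs) {
            throw new IllegalArgumentException("inconsistent job counts");
        }
        if (finishedJobs == totalJobs) {
            return 1.0;
        }
        // Assumption: map and reduce phases each count for half of a job.
        double currentJob = (mapPercent + reducePercent) / 200.0;
        // Assumption: all jobs weigh the same and run one after another.
        return (finishedJobs + currentJob) / totalJobs;
    }

    public static void main(String[] args) {
        // Second of two jobs, maps done, reduces not started.
        System.out.println(estimate(2, 1, 100, 0));
    }
}
```

A first job that scans the whole table and a second that only sorts a small aggregate would both count as half the work here, so the number can sit at 50% for most of the run.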
answered Sep 17 2012 at 12:31 by Bennie Schut
Thanks a lot for all the answers and suggestions. It looks like one hacky workaround is to check the Hadoop task status, but for my project that is far too costly.
answered Sep 17 2012 at 13:24 by Haijia Zhou