
Partitioning To Speed Up Processing?

I have a number of queries that result in the sequence Filter > Project > Aggregate, and I wonder whether partitioning the input table makes sense.
Does Aggregate benefit from a partitioned input? If so, which partitioning would be most useful with respect to the aggregations?
Do Filter and Project preserve the partitioning of their inputs?
Thanks,
Gerhard
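
To make the question concrete, here is a minimal plain-Python sketch of the idea being asked about. It is a toy model, not Spark's API and not a statement about what Spark's planner actually does. The rows, the column names, the `amount >= 4` filter and the character-sum hash are all hypothetical, chosen only for illustration. It assumes the input is hash-partitioned on the grouping key, and shows that in that case Filter and Project can run row by row inside each partition, and each partition can finish its own aggregation without merging with the others.

```python
from collections import defaultdict

# Hypothetical input rows: (customer, region, amount). Names are illustrative only.
ROWS = [
    ("alice", "eu", 10),
    ("bob", "us", 5),
    ("alice", "eu", 7),
    ("carol", "us", 3),
    ("bob", "us", 8),
    ("carol", "eu", 4),
]


def hash_partition(rows, key_index, num_partitions):
    """Place each row in a partition chosen by a deterministic hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Toy hash (sum of character codes); Python's built-in hash() is
        # salted for strings, so it would not be stable between runs.
        bucket = sum(ord(c) for c in row[key_index]) % num_partitions
        partitions[bucket].append(row)
    return partitions


def filter_project_aggregate(partition):
    """Apply Filter > Project > Aggregate to a single partition."""
    totals = defaultdict(int)
    for customer, _region, amount in partition:
        if amount >= 4:                 # Filter: row-wise, rows never change partition
            totals[customer] += amount  # Project (customer, amount), then sum per customer
    return dict(totals)


def run(rows, num_partitions=3):
    """Partition on the grouping key, then aggregate each partition independently."""
    partitions = hash_partition(rows, key_index=0, num_partitions=num_partitions)
    per_partition = [filter_project_aggregate(p) for p in partitions]
    result = {}
    for partial in per_partition:
        # Partitioning on the grouping key means no key occurs in two partitions,
        # so the partial results are already final and can simply be combined.
        assert not (result.keys() & partial.keys())
        result.update(partial)
    return result, per_partition


if __name__ == "__main__":
    totals, _ = run(ROWS)
    print(totals)
```

If the rows were instead partitioned on some other column (for example `region`), the same customer could appear in several partitions and the partial sums would have to be merged across partitions, which is the cost the question is trying to avoid.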

asked Mar 10 2016 at 11:54

Gerhard Fiedler



Group Spark-user
