DataSet With Array Member

Hi everyone. I have the following class:
case class DistinctValues(statType: Int, dataType: Int, _id: Int, values: Array[(String, Long)], category: String) extends Serializable {
I think this class won't work if DistinctValues.values.length would need to exceed Int.MaxValue.
Moreover, I instantiate this class like this:
      .mapGroups { (cdid, itr) =>
        val arr = itr.toArray
        val values = arr.flatMap(x => Array(x._2._1)).toArray.distinct
        val category = arr.head._2._2
        DistinctValues(GV.ACC_STAT, GV.STRING_TYPE, cdid, values, category)
      }
I am also concerned that the code above won't work if itr.length exceeds Int.MaxValue.
How should I fix the DistinctValues class?
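
To make the second concern concrete, here is a minimal sketch (not a confirmed fix) of building the distinct values in a single pass over the iterator, without the intermediate itr.toArray. The helper name distinctValuesOf is hypothetical, and the element shape (key, ((name, count), category)) is only inferred from the snippet above. It uses plain Scala collections so it can be tried without Spark. Note that it only avoids holding every row of the group at once: the distinct values still end up in one JVM array, which is indexed by Int and so cannot hold more than Int.MaxValue elements.

```scala
import scala.collection.mutable

// Hypothetical helper: consumes the group's iterator exactly once.
// The element type is an assumption inferred from the question's code.
def distinctValuesOf(
    itr: Iterator[(Int, ((String, Long), String))]
): (Array[(String, Long)], String) = {
  // LinkedHashSet drops duplicates and keeps first-seen order.
  val seen = mutable.LinkedHashSet.empty[(String, Long)]
  var category: Option[String] = None
  itr.foreach { case (_, (value, cat)) =>
    // Keep the category of the first row, like arr.head._2._2.
    if (category.isEmpty) category = Some(cat)
    seen += value
  }
  val cat = category.getOrElse(throw new NoSuchElementException("empty group"))
  (seen.toArray, cat)
}

// Usage with made-up sample rows for one group:
val rows = Iterator(
  (1, (("a", 1L), "cat")),
  (1, (("a", 1L), "cat")),
  (1, (("b", 2L), "cat"))
)
val (values, category) = distinctValuesOf(rows)
```

Inside mapGroups this would be called as distinctValuesOf(itr), assuming the iterator really has the element type shown here.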

asked Apr 5 2016 at 01:03

JH P




Group Spark-user
