Spark: from secondary sort to multi-column sort

Sample data:
1 5 6 9
1 5 6 7
1 5 6 8
2 4 7 5
3 6 3 3
1 5 3 3
1 5 2 4
2 4 3 7
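Requirement: sort by the first column; where the first column is equal, sort by the second column, and so on through the remaining columns. Columns are compared as integers.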
Scala implementation:


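// Sort key wrapping all columns of one row; rows are compared column by column.
class SeveralSortKey(val arr: Array[String]) extends Ordered[SeveralSortKey] with Serializable {
  // Implement compare from Ordered: the first differing column decides the order.
  override def compare(that: SeveralSortKey): Int = {
    val n = math.min(this.arr.length, that.arr.length)
    var i = 0
    var result = 0
    while (i < n && result == 0) {
      // Integer.compare avoids the overflow that plain subtraction can cause.
      result = Integer.compare(this.arr(i).toInt, that.arr(i).toInt)
      i += 1
    }
    // Identical rows compare as 0; if one row is a prefix of the other, the shorter row sorts first.
    if (result != 0) result else Integer.compare(this.arr.length, that.arr.length)
  }
}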

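import org.apache.spark.{SparkConf, SparkContext}

object SortDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("soft").setMaster("local")
    val sc = new SparkContext(conf)
    val lines = sc.textFile("f://sort.txt")
    // Key each line by its split columns so that sortByKey uses SeveralSortKey.compare.
    val pairs = lines.map(line => (new SeveralSortKey(line.split(" ")), line))
    val sortedPairs = pairs.sortByKey()
    val sortedLines = sortedPairs.map(_._2)
    // collect() brings the rows to the driver so they print in sorted order.
    sortedLines.collect().foreach(println)
    sc.stop()
  }
}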
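
To check the order SortDemo should produce for the sample data, the same column-by-column rule can be sketched on plain Scala collections, with no Spark involved. The names LexSortSketch, compareRows and sortLines are illustrative and not part of the original code; the sketch assumes every row holds integers separated by single spaces.

```scala
object LexSortSketch {
  // Hypothetical helper: compare two rows column by column;
  // the first differing column decides the order.
  def compareRows(a: Array[Int], b: Array[Int]): Int =
    a.iterator
      .zip(b.iterator)
      .map { case (x, y) => Integer.compare(x, y) }
      .find(_ != 0)
      // No differing column: identical rows give 0, a shorter row sorts first.
      .getOrElse(Integer.compare(a.length, b.length))

  // The sample data from the top of the post.
  val sample: Seq[String] = Seq(
    "1 5 6 9", "1 5 6 7", "1 5 6 8", "2 4 7 5",
    "3 6 3 3", "1 5 3 3", "1 5 2 4", "2 4 3 7"
  )

  // Parse each line, sort by the parsed columns, return the original lines.
  def sortLines(lines: Seq[String]): Seq[String] =
    lines
      .map(line => (line.split(" ").map(_.toInt), line))
      .sortWith((x, y) => compareRows(x._1, y._1) < 0)
      .map(_._2)

  def main(args: Array[String]): Unit =
    sortLines(sample).foreach(println)
}
```

For the sample data this prints the rows starting with 1 5 first (ordered by the third and fourth columns), then the two rows starting with 2 4, then 3 6 3 3.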

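Note: a compare that reads arr(i+1) after finding column i equal fails on two identical rows with an ArrayIndexOutOfBoundsException, because the lookup runs past the last column. compare must return 0 when every column matches.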
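
The failure on identical rows can be reproduced outside Spark. The sketch below is a minimal illustration, assuming rows of equal length that are already parsed to integers; DuplicateRowSketch, peekAhead and firstDifference are hypothetical names introduced only for this example. peekAhead mirrors the look-ahead pattern that reads column i + 1, while firstDifference falls back to 0 when no column differs.

```scala
import scala.util.Try

object DuplicateRowSketch {
  // Mirrors the failing pattern: when column i is equal it peeks at column i + 1.
  def peekAhead(a: Array[Int], b: Array[Int]): Int = {
    var result = -1
    var i = 0
    var done = false
    while (i < a.length && !done) {
      if (a(i) != b(i)) {
        result = a(i) - b(i)
        done = true
      } else {
        // Out of bounds once i is the last column, which happens only for identical rows.
        result = a(i + 1) - b(i + 1)
      }
      i += 1
    }
    result
  }

  // Safe variant: the first differing column decides, and identical rows give 0.
  def firstDifference(a: Array[Int], b: Array[Int]): Int =
    a.indices.iterator
      .map(i => Integer.compare(a(i), b(i)))
      .find(_ != 0)
      .getOrElse(0)

  def main(args: Array[String]): Unit = {
    val row = Array(1, 5, 6, 9)
    // true: the look-ahead version throws on identical rows.
    println(Try(peekAhead(row, row.clone())).isFailure)
    // 0: the safe version treats identical rows as equal.
    println(firstDifference(row, row.clone()))
  }
}
```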
