Arrays.sort(): Source Code Analysis and Typical Usage Examples (Java source)

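Contents

Custom sorting

Ascending sort of a one-dimensional array with an anonymous inner class
Sorting a two-dimensional array by its first dimension with a lambda expression
Sorting a two-dimensional array by both dimensions

Source code

Source code behind Comparator's compare
Arrays.sort()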

Custom sorting

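Algorithm problems often require sorting arrays, custom objects, or collections. Java provides Arrays.sort() for arrays and Collections.sort() for lists. Sorting custom objects requires writing your own comparator: pass it to Arrays.sort() for an object array, or to Collections.sort() for a list of objects. Both methods sort in ascending order by default; a custom comparator can produce descending order instead.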
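As a minimal sketch of the paragraph above, the same comparator can be handed to both Arrays.sort() and Collections.sort(). The Student class, its fields, and the helper method names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class CustomObjectSortDemo {
    // Hypothetical value class, used only for this illustration.
    static class Student {
        final String name;
        final int score;

        Student(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    // Orders students by ascending score.
    static final Comparator<Student> BY_SCORE = new Comparator<Student>() {
        @Override
        public int compare(Student s1, Student s2) {
            return Integer.compare(s1.score, s2.score);
        }
    };

    // Object array: sort a copy with Arrays.sort and return the names in order.
    static String[] sortedNamesFromArray(Student[] students) {
        Student[] copy = students.clone();
        Arrays.sort(copy, BY_SCORE);
        String[] names = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            names[i] = copy[i].name;
        }
        return names;
    }

    // Object list: sort a copy with Collections.sort and return the names in order.
    static List<String> sortedNamesFromList(List<Student> students) {
        List<Student> copy = new ArrayList<>(students);
        Collections.sort(copy, BY_SCORE);
        List<String> names = new ArrayList<>();
        for (Student s : copy) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Student[] arr = {
            new Student("Ann", 90), new Student("Bob", 75), new Student("Cid", 82)
        };
        System.out.println(Arrays.toString(sortedNamesFromArray(arr))); // [Bob, Cid, Ann]
        System.out.println(sortedNamesFromList(Arrays.asList(arr)));    // [Bob, Cid, Ann]
    }
}
```

Both helpers sort a copy so that the caller's array or list keeps its original order.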

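The Comparator interface is how a custom ordering is expressed. Implementing it means overriding the compare method, int compare(T o1, T o2), which returns a primitive int. For ascending order, return a negative number (for example -1) when o1 is less than o2, 0 when they are equal, and a positive number (for example 1) when o1 is greater than o2. For descending order, reverse the signs: return a positive number when o1 is less than o2, 0 when they are equal, and a negative number when o1 is greater than o2.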

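In compare(int o1, int o2), return o1 - o2 gives ascending order and return o2 - o1 gives descending order. (The subtraction can overflow when the two values are far apart; Integer.compare(o1, o2) avoids that.) To see why the sign decides the order, we can step into the source code.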
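The sign convention can be checked with a short sketch. It uses Integer.compare rather than subtraction, and the last two lines of main show the overflow case where subtraction reports the wrong sign. The class and constant names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

class CompareSignDemo {
    // Ascending: negative when a < b, zero when equal, positive when a > b.
    static final Comparator<Integer> ASCENDING = (a, b) -> Integer.compare(a, b);
    // Descending: the same comparison with the arguments swapped.
    static final Comparator<Integer> DESCENDING = (a, b) -> Integer.compare(b, a);

    public static void main(String[] args) {
        Integer[] arr = {12, 15, 32, 16, 20, 25};

        Arrays.sort(arr, ASCENDING);
        System.out.println(Arrays.toString(arr)); // [12, 15, 16, 20, 25, 32]

        Arrays.sort(arr, DESCENDING);
        System.out.println(Arrays.toString(arr)); // [32, 25, 20, 16, 15, 12]

        // Subtraction wraps around when the operands are far apart.
        int a = Integer.MIN_VALUE;
        int b = 1;
        System.out.println(a - b > 0);                  // true: wrongly claims a > b
        System.out.println(Integer.compare(a, b) < 0);  // true: correctly reports a < b
    }
}
```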

Here are a few examples.
Ascending sort of a one-dimensional array with an anonymous inner class

import java.util.Arrays;
import java.util.Comparator;

public class ArraysDemo1 {
    public static void main(String[] args) {
        // Array of Integer wrapper objects (a Comparator cannot be used with int[])
        Integer[] arr = {12, 15, 32, 16, 20, 25};
        // Pass the reference-type array and an anonymous class implementing Comparator.
        // i1 - i2 sorts ascending; i2 - i1 would sort descending.
        Arrays.sort(arr, new Comparator<Integer>() {
            @Override
            public int compare(Integer i1, Integer i2) {
                return i1 - i2;
            }
        });
        // Print the array: [12, 15, 16, 20, 25, 32]
        System.out.println(Arrays.toString(arr));
    }
}

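Sorting a two-dimensional array by its first dimension with a lambda expression

Suppose we have a two-dimensional array nums = [[5, 0], [4, 1], [6, 2]], where the first element of each inner array is a value and the second is its index. To get nums = [[4, 1], [5, 0], [6, 2]], we can do the following: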

import java.util.Arrays;

public class ArraysDemo2 {
    public static void main(String[] args) {
        int[][] nums = {{5, 0}, {4, 1}, {6, 2}};
        // Implement Comparator's compare method; a lambda keeps it concise
        Arrays.sort(nums, (o1, o2) -> o1[0] - o2[0]);
        for (int i = 0; i < nums.length; i++) {
            System.out.println("value: " + nums[i][0] + ", index: " + nums[i][1]);
        }
        /*
         * value: 4, index: 1
         * value: 5, index: 0
         * value: 6, index: 2
         */
    }
}

Sorting a two-dimensional array by both dimensions

int[][] intervals = { {2, 3}, {4, 5}, {6, 7}, {8, 9}, {1, 10} };
Arrays.sort(intervals, new Comparator<int[]>() {
    @Override
    public int compare(int[] o1, int[] o2) {
        // Equal first elements: fall back to the second element
        if (o1[0] == o2[0]) return o1[1] - o2[1];
        return o1[0] - o2[0];
    }
});
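One possible alternative for a two-key sort is to chain the key extractors from the Comparator interface (Java 8 and later). This is only a sketch: the class and constant names are hypothetical, and the sample data was changed to include a tie on the first element so that the second key actually matters.

```java
import java.util.Arrays;
import java.util.Comparator;

class TwoKeySortDemo {
    // Compare by the first element, then by the second when the first elements tie.
    static final Comparator<int[]> BY_FIRST_THEN_SECOND =
            Comparator.<int[]>comparingInt(row -> row[0]).thenComparingInt(row -> row[1]);

    public static void main(String[] args) {
        // Illustrative data: {2, 3} and {2, 1} tie on the first element.
        int[][] intervals = { {2, 3}, {4, 5}, {2, 1}, {8, 9}, {1, 10} };
        Arrays.sort(intervals, BY_FIRST_THEN_SECOND);
        System.out.println(Arrays.deepToString(intervals));
        // [[1, 10], [2, 1], [2, 3], [4, 5], [8, 9]]
    }
}
```

Because comparingInt compares the extracted keys with Integer.compare, this form does not rely on subtraction.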

Source code

Source code behind Comparator's compare

public static <T> void sort(T[] a, Comparator<? super T> c) {
    if (c == null) {
        sort(a);
    } else {
        if (LegacyMergeSort.userRequested)
            legacyMergeSort(a, c);
        else
            TimSort.sort(a, 0, a.length, c, null, 0, 0);
    }
}

private static <T> void legacyMergeSort(T[] a, Comparator<? super T> c) {
    T[] aux = a.clone();
    if (c==null)
        mergeSort(aux, a, 0, a.length, 0);
    else
        mergeSort(aux, a, 0, a.length, 0, c);
}

private static void mergeSort(Object[] src, Object[] dest,
                              int low, int high, int off,
                              Comparator c) {
    int length = high - low;

    // Insertion sort on smallest arrays
    if (length < INSERTIONSORT_THRESHOLD) {
        for (int i=low; i<high; i++)
            for (int j=i; j>low && c.compare(dest[j-1], dest[j])>0; j--)
                swap(dest, j, j-1);
        return;
    }

    // Recursively sort halves of dest into src
    int destLow  = low;
    int destHigh = high;
    low  += off;
    high += off;
    int mid = (low + high) >>> 1;
    mergeSort(dest, src, low, mid, -off, c);
    mergeSort(dest, src, mid, high, -off, c);

    // If list is already sorted, just copy from src to dest.  This is an
    // optimization that results in faster sorts for nearly ordered lists.
    if (c.compare(src[mid-1], src[mid]) <= 0) {
        System.arraycopy(src, low, dest, destLow, length);
        return;
    }

    // Merge sorted halves (now in src) into dest
    for(int i = destLow, p = low, q = mid; i < destHigh; i++) {
        if (q >= high || p < mid && c.compare(src[p], src[q]) <= 0)
            dest[i] = src[p++];
        else
            dest[i] = src[q++];
    }
}

The key part:

if (length < INSERTIONSORT_THRESHOLD) {
    for (int i=low; i<high; i++)
        for (int j=i; j>low && c.compare(dest[j-1], dest[j])>0; j--)
            swap(dest, j, j-1);
    return;
}

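You can see that compare is called here, and whenever it returns a value greater than 0, the element is swapped with the one before it. Take ascending order as the example: compare does return o1 - o2, which here means return dest[j-1] - dest[j]. When dest[j-1] > dest[j], the two are swapped; when dest[j-1] <= dest[j], they stay where they are, so the array ends up in ascending order. Note that this branch belongs to the legacy merge sort, which runs only when LegacyMergeSort.userRequested is set; by default the call goes to TimSort, which reads the sign of compare in the same way.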
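To see the effect in isolation, here is a small sketch that reuses the loop shape of the insertion-sort branch quoted above. It is not JDK code: the class and method names are hypothetical, and it sorts the whole array instead of a low..high range.

```java
import java.util.Arrays;
import java.util.Comparator;

class InsertionSortDemo {
    // Swap neighbours while compare(previous, current) > 0,
    // mirroring the loop condition in the legacy mergeSort branch.
    static <T> void insertionSort(T[] dest, Comparator<? super T> c) {
        for (int i = 0; i < dest.length; i++) {
            for (int j = i; j > 0 && c.compare(dest[j - 1], dest[j]) > 0; j--) {
                T tmp = dest[j];
                dest[j] = dest[j - 1];
                dest[j - 1] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        Integer[] asc = {12, 15, 32, 16, 20, 25};
        insertionSort(asc, (o1, o2) -> o1 - o2);   // positive when previous > current
        System.out.println(Arrays.toString(asc));  // [12, 15, 16, 20, 25, 32]

        Integer[] desc = {12, 15, 32, 16, 20, 25};
        insertionSort(desc, (o1, o2) -> o2 - o1);  // positive when previous < current
        System.out.println(Arrays.toString(desc)); // [32, 25, 20, 16, 15, 12]
    }
}
```

With o2 - o1 the comparison is positive whenever the earlier element is the smaller one, so smaller elements are pushed toward the end and the result is descending.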
Arrays.sort()

In the JDK, Arrays.sort() is implemented with an algorithm known as dual-pivot quicksort.
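Dual-pivot quicksort:
Quicksort is a divide-and-conquer algorithm: the original problem is split into subproblems that are solved recursively. One element is chosen as the pivot, and a single pass partitions the data into two independent parts, one holding the elements smaller than the pivot and the other holding the elements larger than it. Each part is then quicksorted in the same way, and the recursion continues until the whole sequence is ordered.
Dual-pivot quicksort (DualPivotQuicksort), as the name suggests, uses two pivot elements, pivot1 and pivot2, with pivot1 ≤ pivot2. They split the sequence into three segments: x < pivot1, pivot1 ≤ x ≤ pivot2, and x > pivot2, and each segment is then sorted recursively. This algorithm is usually faster than classic quicksort, which is why Arrays.java uses it to sort arrays of primitive types.
Below we take the sorting of an int array in JDK 1.8 as an example to walk through the dual-pivot quicksort it uses: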
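1. Check whether the array length is greater than 286. If it is, merge sort is used (JDK 1.8 still falls back to quicksort when the array is not highly structured); otherwise go to step 2.

// Use Quicksort on small arrays
if (right - left < QUICKSORT_THRESHOLD) {
    sort(a, left, right, true);
    return;
}
// Merge sort
......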

2. Check whether the array length is less than 47. If it is, insertion sort is used directly; otherwise go to step 3.

// Use insertion sort on tiny arrays
if (length < INSERTION_SORT_THRESHOLD) {
    // Insertion sort
    ......
}
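The two threshold checks in steps 1 and 2 can be summarised in a small sketch. The helper strategyFor and its string labels are hypothetical; only the two constants and the shape of the comparisons come from the snippets above. The merge sort branch is a simplification, since the JDK takes it only for arrays that are already highly structured.

```java
class ThresholdDemo {
    // Values of the JDK 1.8 constants named in the snippets above.
    static final int QUICKSORT_THRESHOLD = 286;
    static final int INSERTION_SORT_THRESHOLD = 47;

    // Hypothetical helper: names the strategy that steps 1 and 2 select
    // for a whole array of the given length.
    static String strategyFor(int length) {
        int left = 0;
        int right = length - 1;
        if (right - left >= QUICKSORT_THRESHOLD) {
            return "merge";      // step 1: length greater than 286
        }
        if (length < INSERTION_SORT_THRESHOLD) {
            return "insertion";  // step 2: length less than 47
        }
        return "quicksort";      // step 3 onward: dual-pivot quicksort
    }

    public static void main(String[] args) {
        for (int length : new int[] {10, 46, 47, 286, 287}) {
            System.out.println(length + " -> " + strategyFor(length));
        }
    }
}
```

Because the first check compares right - left (which is length - 1) against 286, an array of exactly 286 elements still takes the quicksort path.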

3. Approximate 1/7 of the array length with the formula length/8 + length/64 + 1.

// Inexpensive approximation of length / 7
int seventh = (length >> 3) + (length >> 6) + 1;
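The approximation works because 1/8 + 1/64 = 9/64 ≈ 0.1406, which is close to 1/7 ≈ 0.1429, and two shifts are cheaper than a division. A quick check with a hypothetical helper seventh (the sample lengths are arbitrary):

```java
class SeventhDemo {
    // Same expression as in the JDK snippet above.
    static int seventh(int length) {
        return (length >> 3) + (length >> 6) + 1;
    }

    public static void main(String[] args) {
        for (int length : new int[] {47, 100, 1000}) {
            System.out.println(length + ": approx=" + seventh(length)
                    + ", exact=" + (length / 7));
        }
        // 47: approx=6, exact=6
        // 100: approx=14, exact=14
        // 1000: approx=141, exact=142
    }
}
```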

4. Pick five evenly spaced points; the spacing was determined empirically.

/*
 * Sort five evenly spaced elements around (and including) the
 * center element in the range. These elements will be used for
 * pivot selection as described below. The choice for spacing
 * these elements was empirically determined to work well on
 * a wide variety of inputs.
 */
int e3 = (left + right) >>> 1; // The midpoint
int e2 = e3 - seventh;
int e1 = e2 - seventh;
int e4 = e3 + seventh;
int e5 = e4 + seventh;

5. Sort these five elements with insertion sort.

// Sort these elements using insertion sort
if (a[e2] < a[e1]) { int t = a[e2]; a[e2] = a[e1]; a[e1] = t; }

if (a[e3] < a[e2]) { int t = a[e3]; a[e3] = a[e2]; a[e2] = t;
    if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
}
if (a[e4] < a[e3]) { int t = a[e4]; a[e4] = a[e3]; a[e3] = t;
    if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
        if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
    }
}
if (a[e5] < a[e4]) { int t = a[e5]; a[e5] = a[e4]; a[e4] = t;
    if (t < a[e3]) { a[e4] = a[e3]; a[e3] = t;
        if (t < a[e2]) { a[e3] = a[e2]; a[e2] = t;
            if (t < a[e1]) { a[e2] = a[e1]; a[e1] = t; }
        }
    }
}

6. Take a[e2] and a[e4] as pivot1 and pivot2. Because step 5 sorted the five samples, pivot1 <= pivot2 always holds. Define two pointers, less and great: less scans right from the left end until it finds the first element not less than pivot1, and great scans left from the right end until it finds the first element not greater than pivot2.

/*
 * Use the second and fourth of the five sorted elements as pivots.
 * These values are inexpensive approximations of the first and
 * second terciles of the array. Note that pivot1 <= pivot2.
 */
int pivot1 = a[e2];
int pivot2 = a[e4];

/*
 * The first and the last elements to be sorted are moved to the
 * locations formerly occupied by the pivots. When partitioning
 * is complete, the pivots are swapped back into their final
 * positions, and excluded from subsequent sorting.
 */
a[e2] = a[left];
a[e4] = a[right];

/*
 * Skip elements, which are less or greater than pivot values.
 */
while (a[++less] < pivot1);
while (a[--great] > pivot2);
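Steps 3 to 6 can be condensed into a sketch that only reports which two values would become the pivots. This is not the JDK code: pickPivots is a hypothetical helper, it copies the five samples instead of sorting them in place, and it uses Arrays.sort on the copy as a stand-in for the hand-unrolled insertion sort. It assumes the whole array is being sorted and that its length is at least 47, as step 2 guarantees.

```java
import java.util.Arrays;

class PivotPickDemo {
    // Returns {pivot1, pivot2} for the whole array without modifying it.
    static int[] pickPivots(int[] a) {
        if (a.length < 47) {
            throw new IllegalArgumentException("sketch assumes length >= 47");
        }
        int left = 0;
        int right = a.length - 1;
        int length = a.length;

        // Step 3: cheap approximation of length / 7.
        int seventh = (length >> 3) + (length >> 6) + 1;

        // Step 4: five evenly spaced sample positions around the midpoint.
        int e3 = (left + right) >>> 1;
        int e2 = e3 - seventh;
        int e1 = e2 - seventh;
        int e4 = e3 + seventh;
        int e5 = e4 + seventh;

        // Step 5 (stand-in): sort a copy of the five samples.
        int[] samples = { a[e1], a[e2], a[e3], a[e4], a[e5] };
        Arrays.sort(samples);

        // Step 6: the second and fourth sorted samples are the pivots.
        return new int[] { samples[1], samples[3] };
    }

    public static void main(String[] args) {
        int[] a = new int[100];
        for (int i = 0; i < a.length; i++) {
            a[i] = 99 - i; // descending input
        }
        System.out.println(Arrays.toString(pickPivots(a))); // [36, 64]
    }
}
```

For the descending input of length 100 the sample positions are 21, 35, 49, 63, and 77, so the pivots land near the one-third and two-thirds marks of the value range.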

7. Next, pointer k scans from less to great (the loop initialises k to less - 1 and pre-increments it), moving elements smaller than pivot1 to the left of less and elements larger than pivot2 to the right of great. Note that after the inner while loop we know a[great] is not greater than pivot2, but its relation to pivot1 still has to be checked: if it is smaller than pivot1, it must be moved to the left of less; otherwise it only needs to be placed at k.

/*
 * Partitioning:
 *
 *   left part           center part                   right part
 * +--------------------------------------------------------------+
 * |  < pivot1  |  pivot1 <= && <= pivot2  |    ?    |  > pivot2  |
 * +--------------------------------------------------------------+
 *               ^                          ^       ^
 *               |                          |       |
 *              less                        k     great
 *
 * Invariants:
 *
 *              all in (left, less)   < pivot1
 *    pivot1 <= all in [less, k)     <= pivot2
 *              all in (great, right) > pivot2
 *
 * Pointer k is the first index of ?-part.
 */
outer:
for (int k = less - 1; ++k <= great; ) {
    int ak = a[k];
    if (ak < pivot1) { // Move a[k] to left part
        a[k] = a[less];
        /*
         * Here and below we use "a[i] = b; i++;" instead
         * of "a[i++] = b;" due to performance issue.
         */
        a[less] = ak;
        ++less;
    } else if (ak > pivot2) { // Move a[k] to right part
        while (a[great] > pivot2) {
            if (great-- == k) {
                break outer;
            }
        }
        if (a[great] < pivot1) { // a[great] <= pivot2
            a[k] = a[less];
            a[less] = a[great];
            ++less;
        } else { // pivot1 <= a[great] <= pivot2
            a[k] = a[great];
        }
        /*
         * Here and below we use "a[i] = b; i--;" instead
         * of "a[i--] = b;" due to performance issue.
         */
        a[great] = ak;
        --great;
    }
}

8. Move the element at less - 1 to the front and the element at great + 1 to the end, then put pivot1 and pivot2 at less - 1 and great + 1 respectively.

// Swap pivots into their final positions
a[left]  = a[less  - 1]; a[less  - 1] = pivot1;
a[right] = a[great + 1]; a[great + 1] = pivot2;

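9. At this point every element to the left of pivot1 (at less - 1) is smaller than pivot1 and every element to the right of pivot2 (at great + 1) is larger than pivot2. Sort these two parts recursively in the same way.

// Sort left and right parts recursively, excluding known pivots
sort(a, left, less - 2, leftmost);
sort(a, great + 2, right, false);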

10. For the center part: if it is larger than 4/7 of the array, the cause is most likely duplicate elements, so move less right to the first element not equal to pivot1 and move great left to the first element not equal to pivot2, then recursively sort the part between less and great.

/*
 * If center part is too large (comprises > 4/7 of the array),
 * swap internal pivot values to ends.
 */
if (less < e1 && e5 < great) {
    /*
     * Skip elements, which are equal to pivot values.
     */
    while (a[less] == pivot1) {
        ++less;
    }

    while (a[great] == pivot2) {
        --great;
    }
}
......
// Sort center part recursively
sort(a, less, great, false);

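Other blog posts describe the idea of the algorithm as follows:
Algorithm steps

1. For very small arrays (length less than 47), use insertion sort.
2. Choose two elements P1 and P2 as pivots; for example, the first element and the last element.
3. P1 must not be greater than P2; otherwise swap the two. The array is now divided into four parts:
(1) Part 1: elements less than P1.
(2) Part 2: elements greater than or equal to P1 and less than or equal to P2.
(3) Part 3: elements greater than P2.
(4) Part 4: elements not yet compared.
Before the comparisons begin, every element except the pivots is in part 4; when the comparisons finish, part 4 is empty.
4. Take an element a[K] from part 4, compare it with the two pivots, and place it in part 1, 2, or 3.
5. Move the pointers L, K, and G (the boundaries between the parts) accordingly.
6. Repeat steps 4 and 5 until part 4 is empty.
7. Swap P1 with the last element of part 1, and swap P2 with the first element of part 3.
8. Recursively sort parts 1, 2, and 3.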
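The eight steps above can be sketched as a compact dual-pivot quicksort. This is a simplified illustration, not the JDK implementation: it takes the first and last elements as pivots, as the list suggests, instead of the five-sample selection, and it omits the duplicate handling of the center part. The class name and the names l, k, and g are hypothetical stand-ins for L, K, and G.

```java
import java.util.Arrays;

class DualPivotSketch {
    static final int INSERTION_SORT_THRESHOLD = 47;

    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts a[left..right], both bounds inclusive.
    static void sort(int[] a, int left, int right) {
        // Step 1: small ranges are handled by insertion sort.
        if (right - left + 1 < INSERTION_SORT_THRESHOLD) {
            for (int i = left + 1; i <= right; i++) {
                int v = a[i];
                int j = i - 1;
                while (j >= left && a[j] > v) {
                    a[j + 1] = a[j];
                    j--;
                }
                a[j + 1] = v;
            }
            return;
        }
        // Steps 2 and 3: first and last elements are the pivots, p1 <= p2.
        if (a[left] > a[right]) {
            swap(a, left, right);
        }
        int p1 = a[left];
        int p2 = a[right];
        // a[left+1..l-1] < p1, a[l..k-1] in [p1, p2], a[g+1..right-1] > p2,
        // and a[k..g] is the part not yet compared.
        int l = left + 1;
        int g = right - 1;
        int k = l;
        // Steps 4 to 6: classify a[k] until the unknown part is empty.
        while (k <= g) {
            if (a[k] < p1) {
                swap(a, k, l);
                l++;
            } else if (a[k] > p2) {
                while (a[g] > p2 && k < g) {
                    g--;
                }
                swap(a, k, g);
                g--;
                if (a[k] < p1) {
                    swap(a, k, l);
                    l++;
                }
            }
            k++;
        }
        // Step 7: move the pivots to their final positions.
        l--;
        g++;
        swap(a, left, l);
        swap(a, right, g);
        // Step 8: recursively sort the three parts, excluding the pivots.
        sort(a, left, l - 1);
        sort(a, l + 1, g - 1);
        sort(a, g + 1, right);
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = new int[200];
        for (int i = 0; i < data.length; i++) {
            data[i] = (i * 37) % 101; // deterministic, unsorted, with duplicates
        }
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data);
        System.out.println(Arrays.equals(data, expected)); // true
    }
}
```

Choosing the end elements as pivots keeps the sketch short, but it degrades on already sorted input, which is one reason the JDK samples five interior points instead.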