Adam Hacks

CS Review: Merge Sort

Merge sort is among the divide and conquer algorithms. That is to say, it divides the input array into half, calls itself on both halves. This process is done recursively over the entire array. When we’re done, all the individually sorted pieces are merged back together to form the fully sorted array.

A Closer Look at the Functionality

Merge sort can be a bit hard to wrap your head around when you first hear it, so let’s take a look at how the process looks on an example array. Consider that you have the following, unsorted array of integers:

Unsorted example

If we apply merge sort to this data, it’s going to recursively split the array in half until we have done this over the entire length of the data. At this point, it will then start merging the arrays together in the correct, sorted order. The process looks something like this:

The merge sort process

I personally think things make a lot more sense when you work through a diagram of the algorithm, such as this one. What makes things even more clear for me, however, is simply seeing the implementation!

Implementation

The merge sort algorithm will make use of two methods. One is the recursive merge sort method and the other is a merge method which, as the name suggests, will merge the halves of the array back together to form the completely sorted structure.

For our merge sort method, we will take in the array and the length of the array. Since it’s recursive, we will need a base case, which in this case is simply to check if the length of the array is one (which therefor means that we can’t possibly split the data up any more than we already have). Otherwise, it will happily go about splitting up the arrays and calling merge on them.

The merge method simply takes in our two sub-arrays and the original array that we are sorting and merges them together.

Let’s take a look at the algorithm in Java:

 1package com.hackeradam.sorting;
 2public class Main {
 3    public static void merge(int[] dest, int[] left, int[] right) {
 4        int szLeft = left.length;
 5        int szRight = right.length;
 6        // Index counters for the various arrays
 7        int leftIdx = 0, rightIdx = 0, destIdx = 0;
 8        // Perform the initial merging
 9        while (leftIdx < szLeft && rightIdx < szRight) {
10            if (left[leftIdx] <= right[rightIdx])
11                dest[destIdx++] = left[leftIdx++];
12            else
13                dest[destIdx++] = right[rightIdx++];
14        }
15        // Handle any remaining items in left
16        while (leftIdx < szLeft)
17            dest[destIdx++] = left[leftIdx++];
18        // Handle any remaining items in right
19        while (rightIdx < szRight)
20            dest[destIdx++] = right[rightIdx++];
21    }
22    public static void mergeSort(int[] arr, int length) {
23        if (length  == 1) // Base case for the recursive call
24            return;
25        // Find the midpoint of the array
26        int middle = length / 2;
27        // Reserve space for the left and right portions that we will split out
28        int[] left = new int[middle];
29        int[] right = new int[length - middle];
30        // Perform the split
31        for (int i = 0; i < middle; i++)
32            left[i] = arr[i];
33        for (int i = middle; i < length; i++)
34            right[i - middle] = arr[i];
35        // Recursive calls
36        mergeSort(left, middle);
37        mergeSort(right, length - middle);
38        // Merge the halves together
39        merge(arr, left, right);
40    }
41    public static void printArray(int arr[]) {
42        for (int i=0; i < arr.length; i++) {
43            System.out.print(arr[i] + " ");
44        }
45        System.out.println();
46    }
47    public static void main(String... args) {
48        int items[] = {5, 2, 88, 100, 3, 0, -1, 20, 30, 7, 99};
49        printArray(items);
50        System.out.println("After merge sort: ");
51        mergeSort(items, items.length);
52        printArray(items);
53    }
54}

Complexity

The merge sort algorithm, as is typical with the divide and conquer algorithms, is quite a fast sort algorithm. In face, it has a worst, average, and best case time complexity of O(n log n).

One thing to take note of is the fact that merge sort requires an amount of additional space that is equal to the size of the unsorted array (which should be intuitively obvious). That is to say, that merge sort requires O(n) of additional storage, where n is the size of the array we are sorting. As such, it’s not exactly a recommended algorithm to use for sorting extremely large datasets.

Merge sort is the preferred method for sorting linked lists due to the way the memory functions in a linked list as compared to an array.

#Computer-Science #Programming #Algorithms