I store my dishes at home in a stack, with the smallest dishes on top, and the largest dishes at the bottom. After enough time spent taking some out and putting them back, they tend to get out of order. I want to sort them, but swapping two dishes has an interesting caveat that regular arrays don’t. I can remove a dish from any index in the stack, but removing one requires slightly lifting up all the dishes above it, to create enough give to slide the dish out. In simpler terms, there’s a cost to extracting a dish at index $i$ that is proportional to its position in the stack (assuming position is increasing from the top of the stack). Inserting a dish has a similar cost dependent on position. Given this model, what then is the optimal sorting algorithm that minimizes the cost of removing and inserting dishes?

A bunch of plates stacked on top of each other

Model

Given an array of integers $A$ of length $n$ (representing the sizes of the dishes), we define two operations - $\text{pop}(A, i)$ and $\text{insert}(A, i, x)$. $\text{pop}$ removes the element $A[i]$ from the array and returns it, automatically moving every element of $A[i+1:n]$ one index lower to close the gap. Similarly, $\text{insert}$ moves every element of $A[i:n]$ one index higher and places $x$ at index $i$.

Both $\text{pop}$ and $\text{insert}$ have a cost associated with them given by $\text{cost}_{\text{pop}}(i)$ and $\text{cost}_{\text{insert}}(i)$. For our dish sorting scenario, we have both $\text{cost}_{\text{pop}}(i) = \text{cost}_{\text{insert}}(i) = i - 1$, since we have to lift up the stack of dishes above index $i$, and the difficulty of doing this corresponds to how many dishes there are, which is $i - 1$.

Given this model, sort $A$ in non-decreasing order while only using the $\text{pop}$ and $\text{insert}$ operations to modify the array, and minimizing the total cost of these operations.
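To make the code later in this post runnable, here is a minimal sketch of the two operations in Python, with a global running total to account for costs. The helper names (pop, insert, total_cost) and the 0-indexed cost convention ($\text{cost}(i) = i$ at 0-indexed position $i$, equivalent to $i - 1$ in the 1-indexed notation above) are my own assumptions, not part of the problem statement.

from typing import List, TypeVar

T = TypeVar("T")

total_cost = 0  # running total of all pop/insert costs

def pop(arr: List[T], i: int) -> T:
    """Remove and return arr[i]; costs i (0-indexed), i.e. the number of dishes above it."""
    global total_cost
    total_cost += i
    return arr.pop(i)

def insert(arr: List[T], i: int, x: T) -> None:
    """Insert x at index i, shifting arr[i:] one place up; costs i (0-indexed)."""
    global total_cost
    total_cost += i
    arr.insert(i, x)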

Algorithms

It is instructive to see how known sorting algorithms fare under this cost model before trying to do better. Let’s start with what seems like an obvious first choice - insertion sort.

Insertion Sort

Insertion sort builds up a sorted list by taking elements one at a time and inserting each into its correct place in the already sorted prefix; once all elements have been processed, the list is sorted. Given the invariant that $A[1:i]$ is sorted, insertion sort looks at $A[i + 1]$ and finds where it should be inserted in $A[1:i]$, which for non-decreasing order is the smallest index $j$ such that $A[j] > A[i + 1]$. Python code for insertion sort is -

def insertion_sort(arr: List[T]) -> None:
    i = 1
    while i < len(arr):
        # bubble arr[i] down into the sorted prefix arr[0:i] by repeated swaps
        j = i
        while j > 0 and arr[j - 1] > arr[j]:
            arr[j - 1], arr[j] = arr[j], arr[j - 1]
            j -= 1
        i += 1

We do not need to repeatedly swap elements since we get the shifting of elements for free in our problem setup. We simply need to find the right index to insert our element. We can rewrite insertion sort using our allowed operations as follows -

def insertion_sort(arr: List[T]) -> None:
    i = 1
    while i < len(arr):
        # find the index j where arr[i] belongs in the sorted prefix arr[0:i]
        j = i
        while j > 0 and arr[j - 1] > arr[i]:
            j -= 1
        if j != i:
            x = pop(arr, i)
            insert(arr, j, x)
        i += 1
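As a quick sanity check, using the hypothetical cost-tracking helpers sketched in the Model section (with their 0-indexed costs and total_cost starting at 0), running this version on a small array might look like -

arr = [3, 1, 2]
insertion_sort(arr)
print(arr, total_cost)  # [1, 2, 3] with a total pop + insert cost of 4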

What’s the expected cost of using insertion sort? For every index $i$ in $[2, n]$, we locate an insertion index in $[1, i]$ (where index $i$ means the element is already in place). Let $X_i$ be the cost of inserting element $A[i]$. We can assume all $i$ insertion indices are equally likely, so each occurs with probability $\frac{1}{i}$, and the cost of inserting at index $r$ is $\text{cost}_{\text{insert}}(r) = r - 1$. Ignoring the matching pop cost (which only adds another quadratic term), the expected cost is -

$\begin{aligned} E[X_i] &= \frac{i - 1}{i} + \frac{i - 2}{i} + \cdots + \frac{1}{i} + \frac{0}{i} \\ &= \frac{1}{i} \sum_{r = 0}^{i - 1} r \\ &= \frac{1}{i} \cdot \frac{(i - 1)i}{2} \\ &= \frac{i - 1}{2} \end{aligned}$

We do this for all indices $i \in [2, n]$, so the total expected cost is

$\begin{aligned} \sum_{i = 2}^{n} E[X_i] &= \sum_{i = 2}^{n} \frac{i - 1}{2} \\ &= \frac{1}{2} \sum_{k = 1}^{n - 1} k \\ &= \frac{n(n - 1)}{4} \\ &= \mathcal{O}(n^2) \end{aligned}$
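A quick simulation backs this up. The sketch below instruments the standard insertion sort to count only the insertion costs (as in the derivation above) and compares the average against $\frac{n(n-1)}{4}$; the function name and parameters are made up for illustration.

import random

def average_insert_cost(n: int, trials: int = 2000) -> float:
    # average total insertion cost over random permutations of length n
    total = 0
    for _ in range(trials):
        arr = list(range(n))
        random.shuffle(arr)
        for i in range(1, n):
            j = i
            while j > 0 and arr[j - 1] > arr[i]:
                j -= 1
            total += j              # inserting at 0-indexed position j costs j
            arr.insert(j, arr.pop(i))
    return total / trials

print(average_insert_cost(20), 20 * 19 / 4)  # both should land close to 95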

Selection Sort

Selection sort is similar to insertion sort in that it maintains a sorted sub-list and grows it by looking for elements in the unsorted sub-list. However, where insertion sort grabs the next available element and looks for the correct place to insert it, selection sort grabs the next available place and looks for the correct element to place into it. Python code for selection sort is -

def selection_sort(arr: List[T]) -> None:
    n = len(arr)
    for i in range(n):
        # find the smallest element in the unsorted suffix arr[i:]
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]

We can modify this for our purpose by replacing the swap with a pop and an insert.

def selection_sort(arr: List[T]) -> None:
    n = len(arr)
    for i in range(n):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:  # skip the (costly) no-op move
            x = pop(arr, min_index)
            insert(arr, i, x)

What’s the expected cost of using selection sort? For every index $i$ in $[1, n - 1]$, we locate the element to be placed there among $[i + 1, n]$. Following the insertion sort analysis closely, let $X_i$ be the cost incurred when the element at index $r \in [i + 1, n]$ is selected to be inserted at index $i$. We again assume that every index is equiprobable, so the expected cost is -

$\begin{aligned} E[X_i] &= \frac{i}{i} + \frac{i + 1}{i} + \cdots + \frac{n - 1}{i} \\ &= \frac{1}{i} \sum_{r = i + 1}^{n} (r - 1) \\ &= \frac{(n - i)(n + i - 1)}{2i} \end{aligned}$

We do this for all indices $i \in [1, n - 1]$. The total expected cost is -

$\begin{aligned} \sum_{i = 1}^{n-1} E[X_i] &= \sum_{i = 1}^{n-1} \frac{(n - i)(n + i - 1)}{2i} \\ &= \frac{1}{4}(n-1)(2nH_{n-1} - n + 2) \\ &= \mathcal{O}(n^2 \log n) \end{aligned}$

where $H_n$ is the $n^{\text{th}}$ harmonic number. The sum was solved using Wolfram Alpha because I really didn’t know how to solve it myself, and the growth rate of $H_n$ is $\mathcal{O}(\log n)$.1
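For anyone who (like me) wants a sanity check on that closed form, a short exact-arithmetic comparison of the direct sum and the Wolfram Alpha result is sketched below; the function names are made up for illustration.

from fractions import Fraction

def direct_sum(n: int) -> Fraction:
    return sum(Fraction((n - i) * (n + i - 1), 2 * i) for i in range(1, n))

def closed_form(n: int) -> Fraction:
    h = sum(Fraction(1, k) for k in range(1, n))  # H_{n-1}
    return Fraction(n - 1, 4) * (2 * n * h - n + 2)

for n in (2, 5, 10, 50):
    assert direct_sum(n) == closed_form(n)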


How do we improve upon this? We can make a few observations after sitting with the problem a little -

  1. For every element, we should operate on it at most once, i.e. exactly one pop and one insert. Intuitively, if we operate on an element more than once, we could probably have operated on it just once and chosen a different index to insert it into instead.

  2. Operating on the least number of elements possible will reduce total cost. Since each element adds a cost for the pop and insert operations, minimizing the number of elements we operate on will reduce the overall cost.

  3. The least number of elements to be moved can be found by leaving the longest non-decreasing subsequence of the array in place and moving everything else.

LIS Sort

The above observations led to the following algorithm that I dub LIS sort. It uses the elements in the LIS as “anchors”, and only sorts elements not in the LIS because those in the LIS are already in sorted order.

1. Find the longest non-decreasing subsequence (LIS) of the array
2. Sort all the elements of the array that do not belong to the LIS in non-decreasing order
3. For each element in the LIS in non-increasing order, do
    3a. Find all non-LIS elements > the LIS element and insert them at 1 + the (current) index of the LIS element, in non-increasing order
4. Find all non-LIS elements <= the smallest LIS element and insert them at index 0 in non-increasing order
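Here is a minimal Python sketch of these steps, assuming the pop/insert helpers from the Model section. The helper lnds_indices (a simple $\mathcal{O}(n^2)$ DP for the longest non-decreasing subsequence) and the use of bisect_left to locate "1 + the index of the largest LIS element smaller than $x$" are my own choices for illustration, not part of the algorithm statement above.

from bisect import bisect_left
from typing import List

def lnds_indices(a: List[int]) -> List[int]:
    # indices of one longest non-decreasing subsequence, via a simple O(n^2) DP
    n = len(a)
    if n == 0:
        return []
    length = [1] * n
    prev = [-1] * n
    for i in range(n):
        for j in range(i):
            if a[j] <= a[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    i = max(range(n), key=lambda k: length[k])
    out: List[int] = []
    while i != -1:
        out.append(i)
        i = prev[i]
    return out[::-1]

def lis_sort(arr: List[int]) -> None:
    keep = set(lnds_indices(arr))
    # pop every non-LIS element, from the back so earlier indices stay valid
    moved = [pop(arr, i) for i in range(len(arr) - 1, -1, -1) if i not in keep]
    # arr now holds only LIS elements, already in non-decreasing order; insert the
    # removed elements largest-first, each just after the last element of arr that
    # is strictly smaller than it (index 0 if there is none)
    for x in sorted(moved, reverse=True):
        insert(arr, bisect_left(arr, x), x)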

Proof of optimality

WIP because I’m not sure this is optimal and I lack facility with proofs! Ideally, we could run a planner to brute force the optimal solution and measure how LIS sort does in comparison.
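As a starting point for that comparison, here is a minimal sketch of such a planner: Dijkstra's algorithm over array states, where each move is a pop at index $i$ followed by an insert at index $j$, with 0-indexed cost $i + j$ (matching $\text{cost}_{\text{pop}} + \text{cost}_{\text{insert}}$ in the 1-indexed notation above). The function name optimal_cost is made up, and this is only practical for very small arrays since the state space grows factorially.

import heapq
from typing import Dict, List, Tuple

def optimal_cost(arr: List[int]) -> int:
    start: Tuple[int, ...] = tuple(arr)
    goal = tuple(sorted(arr))
    dist: Dict[Tuple[int, ...], int] = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, state = heapq.heappop(heap)
        if state == goal:
            return cost
        if cost > dist[state]:
            continue  # stale heap entry
        n = len(state)
        for i in range(n):
            rest = state[:i] + state[i + 1:]
            for j in range(n):
                if j == i:
                    continue  # re-inserting where we popped is a costly no-op
                nxt = rest[:j] + (state[i],) + rest[j:]
                new_cost = cost + i + j  # pop at index i, then insert at index j
                if new_cost < dist.get(nxt, float("inf")):
                    dist[nxt] = new_cost
                    heapq.heappush(heap, (new_cost, nxt))
    return dist[goal]

Running LIS sort against optimal_cost on many small random arrays would at least give empirical evidence for or against optimality.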

Empirical Analysis

I plotted the costs of using bubble sort, selection sort, insertion sort, quick sort and LIS sort on $20$ different arrays of lengths ranging from $10$ to $10^4$. The repository for reproducing the analysis can be found here.

A line graph plotting the cost of each sorting algorithm against the length of the array

As can be seen from the graph above, LIS sort and insertion sort are both clearly better than the other sorting algorithms. LIS sort is slightly better than insertion sort for smaller arrays, but they seem to approach the same costs at larger array sizes.

1. Boas, R. P., & Wrench, J. W. (1971). Partial Sums of the Harmonic Series. The American Mathematical Monthly, 78(8), 864–870. https://doi.org/10.1080/00029890.1971.11992881