Analysis of Recursive, List-Based Insertion Sort

Described here is a series of exercises that analyze a recursive implementation of a list-based insertion sort. There are a number exercises; their difficulty varies significantly. Please send me (Steve Vegdahl) any questions or comments you have: vegdahl@up.edu.

Objectives

á      To give students experience analyzing the time complexity and correctness of a recursive algorithm.

á      To give students experience applying mathematical induction to a problem involving data structures.

These exercises give students practice with mathematical induction. For many, it will be the first time they have seen an induction proof on a data structure, as opposed to one that is purely numerically based.

The exercises are engaging to students that are strong mathematically because they find mathematical induction to be fun, and they see it being applied to computer science rather than to a purely mathematical problem. It is instructive (perhaps not as engaging) to students who are weaker mathematically, because many portions of the problem are accessible to them.

Definition of Insertion Sort

 

The following definition of the insertionSort function is used, along with its helper-function insert:

insertionSort(lst) =
     if lst = [] then []
     else insert(lst.head, insertionSort(lst.tail));
where
     insert(el, lst) =
          if lst = [] then [el]
          else if el <= lst.head then el:lst
          else lst.head:insert(el,lst.tail);

Preliminaries

In class, I start by describing the programming notation (for those who have not seen it), and by explaining that the goal is to produce of sorted copy of the list—the original list is not modified. We do several by-hand examples of the program being run on small lists, so that students become comfortable with the algorithm and notation. (Some may not have seen functional programming before.)

Because most of the exercises use induction—and because students vary widely their comfort with induction—we review what mathematical induction is and why it works. I typically explain it in two different ways:

á      An induction proof is a proof that a proof exists. If I prove the base case (say, for 0) and the induction case. I can prove it true for N using the base case and N copies of the induction case.

á      An induction proof is a special case of proof by contradiction. Once I prove the base case the induction case, if you claim that the conjecture is not true for the base case and all larger values, I will ask you Òwhat is the smallest value for which it does not holdÓ:

o   If you tell me the base-case number, my base-case proof shows you wrong.

o   If you tell me any larger number, my induction-case proof shows you wrong.

Overview of the Exercises

We do several exercises. Some are done in class; others are assigned as required homework; the most difficult is assigned as extra credit.

1.     Analyze the time-complexity of insertionSort, as a function of list length. We start by analyzing insert, making a hypothesis regarding its complexity by expanding out a few terms; then we prove it by induction. The same is then done for insertionSort, using result from insert.

2.     Prove the partial correctness of insert and inserionSort, using recursive definitions for Òis sortedÓ and Òis permutation ofÓ (see below). In this proof, we assume Òis permutation ofÓ is an equivalence relation.

á      Note: the complexity result from Step 1 implies termination.

3.     Prove from the recursive definition that Òis permutation ofÓ is an equivalence relation.

Definition of ÒIs SortedÓ

We use the following recursive definition of Òis sortedÓ:

á      Any list whose length is 0 or 1 is sorted.

á      A list with length greater than one is sorted if and only if

o   its first element is less than or equal to its second element, and

o   its tail (all but its first element) is sorted.

Definition of ÒIs Permutation OfÓ

 

List A is considered to be permutation of List B if and only if one of the following is true:

o   Lists A and B are both empty.

o   Lists A and B are both non-empty, have equal first elements, and the tail of A is a permutation of the tail of List B

o   Lists A and B are both non-empty, have unequal first elements, and there exists a list C such that

o   A.tail is a permutation of (B.head):C, and

o   B.tail is a permutation of (A.head):C

(Intuitively, C is a list containing all of the elements—in some order—except A.head and B.head.)

Difficulty of the Exercises

Table 1 is a list of the sub-tasks of the above three steps, giving an estimate of the difficulty of each.

Table 1: Estimated difficulty of the various exercises

step

task

difficulty

1

identify the two execution paths through insertionSort and the three execution paths through insert

low

write recurrences to express the worst-case complexity for insert and insertionSort, as a function of list length

medium: many students are not comfortable with recurrences

expand out the recurrences for insert and insertionSort; arrive at a hypothesis regarding a closed-form solution

low-to-medium: does not require complex math, but most students are not used to doing this

using induction to prove the complexity result for insert, and then for insertionSort

medium: not conceptually difficult, but many students are spooked by induction

2

prove by induction that the result of insert is always a sorted list, assuming that its input list is sorted

medium: the three cases together make it a bit complicated, and this is the first such proof theyÕve seen

prove by induction that the result of insertionSort is always a sorted list

low: most of the work is done by the proof for insert

prove by induction that the result of insert(el,lst) is a permutation of el:lst.

medium-to-high: the third-branch case requires that students come up with an appropriate Òlist cÓ (for the third bullet-item in the definition of permutation)

prove by induction that the result of insertionSort(lst) is a permuation of lst

high: the recursive case requires several transformation steps

3

prove by induction that the permutation relation (as defined above) is reflexive

low: a very straightforward induction proof

prove by induction that the permutation relation is symmetric

medium: relatively straightforward, but requires more thought than for the reflexive part

prove by induction that the permutation relation is transitive

very high: involves three lists, X, Y and Z, and has subcases for all combinations of the respective heads of these lists being equal and unequal to one another. The case where the three heads are all unequal requires the three applications of the third bullet-item in the definition of ÒpermutationÓ, each with their own ÔCÕ.

My ÒDivision of LaborÓ

When I teach this, I generally do parts 1 and 2 interactively with the students in class, while Part 3 is left as a homework exercise:

á      Òpermutation is reflexiveÓ: required

á      Òpermutation is symmetricÓ: required

á      Òpermutation is transitiveÓ: extra credit

o   perhaps 20% of my students attempt the ÒtransitiveÓ proof, with perhaps 30% of those getting it (mostly) right.

This is certainly not the only way to approach it pedagogically, but I believe that most students need to see a few proofs of this type before doing one on their own.