Cycle sort

Cycle sort
Example of cycle sort sorting a list of random numbers.
Class Sorting algorithm
Data structure Array
Worst-case performance Θ(n2)
Best-case performance Θ(n2)
Average performance Θ(n2)
Worst-case space complexity Θ(n) total, Θ(1) auxiliary

Cycle sort is an in-place, unstable sorting algorithm, a comparison sort that is theoretically optimal in terms of the total number of writes to the original array, unlike any other in-place sorting algorithm. It is based on the idea that the permutation to be sorted can be factored into cycles, which can individually be rotated to give a sorted result.

Unlike nearly every other sort, items are never written elsewhere in the array simply to push them out of the way of the action. Each value is either written zero times, if it's already in its correct position, or written one time to its correct position. This matches the minimal number of overwrites required for a completed in-place sort.

Minimizing the number of writes is useful when making writes to some huge data set is very expensive, such as with EEPROMs like Flash memory where each write reduces the lifespan of the memory.

Algorithm

The following algorithm finds cycles and rotates them, giving a sorted result. Arrays are zero-indexed.

# Sort an array in place and return the number of writes.
procedure cycleSort(array):
  writes = 0

  # Loop through the array to find cycles to rotate.
  for cycleStart from 0 to length(array) - 2, inclusive:
    item = array[cycleStart]

    # Find where to put the item.
    pos = cycleStart
    for i from cycleStart + 1 to length(array) - 1, inclusive:
      if array[i] < item:
        pos += 1

    # If the item is already there, this is not a cycle.
    if pos == cycleStart:
      continue

    # Otherwise, put the item there or right after any duplicates.
    while item == array[pos]:
      pos += 1
    array[pos], item = item, array[pos]
    writes += 1

    # Rotate the rest of the cycle.
    while pos != cycleStart:
      # Find where to put the item.
      pos = cycleStart
      for i from cycleStart + 1 to length(array) - 1, inclusive
        if array[i] < item:
          pos += 1

      # Put the item there or right after any duplicates.
      while item == array[pos]:
        pos += 1
      array[pos], item = item, array[pos]
      writes += 1

  return writes

Situation-specific optimizations

When the array contains only duplicates of a relatively small number of items, a constant-time perfect hash function can greatly speed up finding where to put an item1, turning the sort from Θ(n2) time to Θ(n + k) time, where k is the total number of hashes. The array ends up sorted in the order of the hashes, so choosing a hash function that gives you the right ordering is important.

Before the sort, create a histogram, sorted by hash, counting the number of occurrences of each hash in the array. Then create a table with the cumulative sum of each entry in the histogram. The cumulative sum table will then contain the position in the array of each element. The proper place of elements can then be found by a constant-time hashing and cumulative sum table lookup rather than a linear search.

External links

^ "Cycle-Sort: A Linear Sorting Method", The Computer Journal (1990) 33 (4): 365-367.

This article is issued from Wikipedia - version of the 7/25/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.