Counting Sort

Counting sort (like bucket sort) is a sorting algorithm which takes advantage of knowing the maximum value within the array to be sorted. If the maximum value is not known, an initial pass of the data will be necessary to find the largest value (this pass will be O(n)). It uses this to create an array (referenced by the values within the array to be sorted) that will serve to count the number of values of a certain number and, after this, the future positions of the values on a third array, which is the sorted permutation of the second array (i.e., Ai <= Aj, i < j). This algorithm is stable and it is O(n+k), where k is the maximum value within the array. In order for this algorithm to have any semblance of efficiency, k must be comparatively small. The values within the original array have to be known and non-negative. If there are some negative numbers in the array, an initial pass of O(n) is necessary to identify the most negative number, and then the absolute value of this number can be added to all numbers in the list for purposes of calculating which array bin to increment. Note that the first array that is used in the counting sort has a length of the number of possible elements between 0 and the maximum known number. This is not a problem for small ranges of numbers, but large ranges of numbers will eat up more memory (or even hard drive space) than current systems have available. For instance, if one is ordering a list of numbers whose range is between 0 and 100, the counting sort may be the best possible sort. However, if one is trying to order a list of names alphabetically, the array would have to be of length 27n (using a numeric representation of letters, where 27 accounts for all the letters of the alphabet plus "space" to pad out names which are not of the maximum length), where n is the length of the largest name in characters. It is important to note that the length of the sorting array is independent of the number of items in the list to be sorted and dependent solely on the total possible values in the search space. Even a sort of a list of names where the maximum name length is 8 would require prohibitive amounts of RAM for array storage.

Sample implementations

C

/* end is the last index + 1 */ void  csort(int array[], const int end, const int max, const int min) {   int i,j;   /* offsets of +/-1 allow for zero */   const int range = max-min+1;   int count[range],       scratch[end];    for(i=0; i<range; i++)     count[i] = 0;    for(i=0; i<end; i++) {     int c = array[i]+1-min;     count[c]++;   }    for(i=1; i<range; i++)     count[i] += count[i-1];    for(i=0; i<end; i++) {     int c = array[i]-min;     int s = count[c];     scratch[s] = array[i];     count[c]++;   }    for(i=0; i<end; i++)     array[i] = scratch[i]; }

References:

  • Cormen, et al. Introduction to Algorithms 2nd. ed. 2001. The MIT Press.
  • Seward, Harold H. Information sorting in the application of electronic digital computers to business operations Masters thesis. MIT 1954.

 

<< PreviousWord BrowserNext >>
whitsun
william fox talbot
martinmas
gbp
group 15 element
carbon group
perry
dexter's laboratory
cow and chicken
red guards
heightfield
martic township, pennsylvania
maytown, pennsylvania
st. benet's abbey
how hill
new jersey state highway 23
hoveton great broad
loddon, england
new jersey state highway 24
new jersey state highway 29
new jersey state highway 26
michael mckean
david lander
vote
hyperplane
tourist attraction
chris elliott
university of massachusetts boston
bogosort
ad maiorem dei gloriam
geneseo (town), new york
lunenburg, nova scotia (town)
autaugaville, alabama
andy kaufman
billingsley, alabama
prattville, alabama
bay minette, alabama
daphne, alabama
elberta, alabama
fairhope, alabama
foley, alabama
gulf shores, alabama
loxley, alabama
orange beach, alabama