How to calculate standard deviation

When I started this blog last month, I thought “Standard Deviation” was a snappy title. Of course, I also knew about standard deviation as a statistical tool, but I didn’t expect the overlap to make Google send 50+ visitors a month here looking for implementations of the standard deviation formula.

So as a “public service”, here is some code to calculate standard deviation in Java and Ruby.

(disclaimer: no warranties as to correctness, particularly to the nth decimal place, don’t use this to run your homemade nuclear reactor or air traffic control system, blah blah, etc, etc :-))

The algorithm I’ll be using is “borrowed” from Wikipedia’s entry on Algorithms to calculate variance. Specifically, I’ll be using a variant of algorithm II, which is sourced from Knuth, except that we’ll calculate the standard deviation for the population rather than for a sample.

As you should probably know, standard deviation is defined as the square root of the variance. If you didn’t know this, maybe you should go read about standard deviation first.
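For reference, here is a sketch of the definition alongside the running update that the code in this post performs. The symbols are labels chosen for illustration and do not come from the post: N is the population size, μ the mean, M_k the running mean after k values, and S_k the running sum of squared deviations.

```latex
\[
\begin{aligned}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2, \qquad \sigma = \sqrt{\sigma^2} \\
M_k &= M_{k-1} + \frac{x_k - M_{k-1}}{k}, \qquad M_0 = 0 \\
S_k &= S_{k-1} + (x_k - M_{k-1})(x_k - M_k), \qquad S_0 = 0 \\
\sigma^2 &= \frac{S_N}{N} \ \text{(population)}, \qquad s^2 = \frac{S_N}{N-1} \ \text{(sample)}
\end{aligned}
\]
```

In the code, `delta` corresponds to x_k − M_{k−1}, `mean` to M_k, and `s` to S_k.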

    /**
     * Calculates the variance of a population in a single pass.
     *
     * @param population an array, the population
     * @return the variance (NaN if the array is empty)
     */
    public static double variance(double[] population) {
        long n = 0;        // number of values seen so far
        double mean = 0;   // running mean
        double s = 0.0;    // running sum of squared deviations

        for (double x : population) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            s += delta * (x - mean);
        }
        // if you want to calculate std deviation
        // of a sample change this to (s / (n - 1))
        return s / n;
    }

    /**
     * @param population an array, the population
     * @return the standard deviation
     */
    public static double standard_deviation(double[] population) {
        return Math.sqrt(variance(population));
    }

example usage:

    double[] arr = { 1, 3, 24, 17, 12, 6, 14 };
    System.out.printf("%f", standard_deviation(arr));
    // prints 7.596992

and the same thing in Ruby:

    # calculate the variance of a population
    # accepts: an array, the population
    # returns: the variance
    def variance(population)
      n = 0
      mean = 0.0
      s = 0.0
      population.each do |x|
        n += 1
        delta = x - mean
        mean += delta / n
        s += delta * (x - mean)
      end
      # if you want to calculate std deviation
      # of a sample change this to "s / (n - 1)"
      s / n
    end

    # calculate the standard deviation of a population
    # accepts: an array, the population
    # returns: the standard deviation
    def standard_deviation(population)
      Math.sqrt(variance(population))
    end

example usage:

    puts standard_deviation([1, 3, 24, 17, 12, 6, 14])
    # prints 7.59699188589047 (newer Ruby versions may show more digits)
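The comments in the code mention dividing by n - 1 to get the standard deviation of a sample. Here is a minimal sketch of that variant, using the same update step. The names `count_and_squared_deviations`, `sample_variance` and `sample_standard_deviation` are made up for this example, and raising an error for fewer than two values is an assumed policy rather than anything the post specifies.

```ruby
# Runs the same single-pass update as the post's variance method.
# Returns [n, s]: the count and the running sum of squared deviations.
# (Hypothetical helper, named for this sketch only.)
def count_and_squared_deviations(values)
  n = 0
  mean = 0.0
  s = 0.0
  values.each do |x|
    n += 1
    delta = x - mean
    mean += delta / n
    s += delta * (x - mean)
  end
  [n, s]
end

# Sample variance: divide by n - 1 instead of n.
# Assumption: fewer than two values is treated as an error,
# because n - 1 would be zero or negative.
def sample_variance(values)
  n, s = count_and_squared_deviations(values)
  raise ArgumentError, "need at least two values" if n < 2
  s / (n - 1)
end

# Sample standard deviation: square root of the sample variance.
def sample_standard_deviation(values)
  Math.sqrt(sample_variance(values))
end

puts sample_standard_deviation([1, 3, 24, 17, 12, 6, 14])
```

For the example data the squared deviations sum to 404, so the sample variance is 404 / 6 and the result is about 8.2057, a little larger than the population figure.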

If you found this at all useful, (or have spotted a bug), please leave a comment to that effect…

14 Responses to “How to calculate standard deviation”

  1. Sandeep Thukral Says:

    I have copied your Java code for use in my performance testing client. I hope this works for long ‘population’ also.

  2. Andy Says:

    Thanks for the code. I could’ve figured it out but you saved me some time!

  3. Brian Says:

    Thanks … this was very helpful. I tried it in Ruby and it worked for my simple tests.

    It looks like your code is a variant of Algorithm III rather than Algorithm II (at least in the current post http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance)

    I also implemented the other algorithm and am glad to send it along if interested.

  4. Daniel Says:

    Thanks, this was exactly what I was looking for when I googled “calculate std”, and in Ruby as well so I didn’t even have to rewrite it :)

  5. gehwokka Says:

    my search was “ruby calculate standard deviation” and boom! this page popped up. Thanks a bunch for putting this up. You’ve saved me a bunch of work!

  6. Kaz Says:

    As one of those 50+ visitors per month, thank you Warren. :)

  7. neil Says:

    You’re the man! Thank you very much.

  8. kudjo Says:

    i found it very useful, warren u are a life saver, thank u

  9. Dharmendra Says:

    Thank you , I found it useful to understand. I translated it into PL/SQL

  10. mischa Says:

    Perfect google result.

    Thanks

  11. Angy Says:

    Thanx, clean and clear!!

  12. Brian Egge Says:

    Thanks for the code. I added:


    values = []
    $<.each do |l|
    values << l.to_f
    end

    puts ["count", "mean", "stddev"].join("\t")
    puts [values.size(), mean(values), standard_deviation(values)].join("\t")

    So I can pipe in a list of values from the command line.

    $ du -s * | cut -f1 | stddev
    count mean stddev
    9 2048.44444444444 5625.49061783575

  13. Roger Says:

    Are you sure about this implementation?
    It looks to me like the calculation of the mean would be sensitive to the order of the data.

    try

    {1,2,3} vs {3,2,1}

    x m=0,s=0
    1 d=1,m=1,s=1
    2 d=1,m=1.5,s=1.5
    3 d=1.5,m=2,s=3

    sd=1

    x m=0,s=0
    3 d=3,m=3,s=0
    2 d=-1,m=2.5,s=1
    1 d=-2,m=1.5,s=2

    sd=0.666

  14. Alex Says:

    Cheers, mate! This saved me some time.
    @Roger: code works fine here.
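One way to check the order question raised in comment 13 is to run the update step from the post over both orderings and print every intermediate value. This is only a sketch: `trace_variance` is a made-up helper that copies the loop from the Ruby `variance` method and records each step.

```ruby
# Same update step as the post's variance method, but keeping every
# intermediate [x, delta, mean, s] so the trace can be compared by hand.
# Returns [steps, population_variance]. (Hypothetical helper.)
def trace_variance(values)
  n = 0
  mean = 0.0
  s = 0.0
  steps = values.map do |x|
    n += 1
    delta = x - mean
    mean += delta / n
    s += delta * (x - mean)
    [x, delta, mean, s]
  end
  [steps, s / n]
end

# Print the trace for both orderings from comment 13.
[[1, 2, 3], [3, 2, 1]].each do |values|
  steps, variance = trace_variance(values)
  steps.each { |x, d, m, s| puts "x=#{x} d=#{d} m=#{m} s=#{s}" }
  puts "variance=#{variance} sd=#{Math.sqrt(variance)}"
end
```

Both orderings finish with s = 2.0, so the population variance is 2/3 and the standard deviation is about 0.816 either way. The intermediate values of `mean` and `s` differ along the way, but the final result does not depend on the order for this data.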
