
- Wikipedia: Algorithms for calculating variance

Welford's algorithm is more numerically stable than either the two-pass or the online simple sum-of-squares collectors suggested in other responses. The stability only really matters when you have lots of values that are close to each other, because they lead to what the floating-point literature calls "catastrophic cancellation".
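To make the comparison concrete, here is a minimal sketch of the Welford update next to a simple sum-of-squares collector. The class and method names are made up for illustration and are not taken from the `OnlineNormalEstimator` source; treat it as a sketch of the general technique rather than that implementation.

```java
// Illustrative sketch only; class and method names are hypothetical,
// not those of the OnlineNormalEstimator mentioned in this answer.
final class WelfordSketch {
    private long n = 0;        // number of values seen so far
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations from the mean

    // Fold one value into the running statistics (Welford's update).
    void add(double x) {
        n++;
        double delta = x - mean;   // deviation from the old mean
        mean += delta / n;
        m2 += delta * (x - mean);  // old deviation times new deviation
    }

    double mean() { return mean; }

    // Divides by N.
    double populationVariance() { return n > 0 ? m2 / n : Double.NaN; }

    // Divides by N-1.
    double sampleVariance() { return n > 1 ? m2 / (n - 1) : Double.NaN; }

    // Simple sum-of-squares collector, shown for contrast. The final
    // subtraction is where cancellation can occur for large, close values.
    static double naivePopulationVariance(double[] xs) {
        double sum = 0.0, sumSq = 0.0;
        for (double x : xs) {
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / xs.length;
        return sumSq / xs.length - mean * mean;
    }
}
```

Feeding `1e9 + 4`, `1e9 + 7`, `1e9 + 13`, `1e9 + 16` through `add` gives a sample variance of 30, because every intermediate quantity stays small. The naive collector has to subtract two numbers near 1e18 for the same input, so it can lose most of its significant digits.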

You might also want to brush up on the difference between dividing the sum of squared deviations by the number of samples (N) and dividing it by N-1 in the variance calculation. Dividing by N-1 gives an unbiased estimate of the variance from the sample, whereas dividing by N underestimates the variance on average (because it doesn't take into account the variance between the sample mean and the true mean).
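As a small illustration of the two divisors, the sketch below computes both estimates from the same sum of squared deviations. The class name, method names, and data are invented for the example. For independent samples, the divide-by-N estimate has an expected value of (N-1)/N times the true variance, which is why it comes out too small on average.

```java
// Illustrative sketch only; names and data are hypothetical.
final class VarianceDivisors {
    // Two-pass sum of squared deviations from the sample mean.
    static double sumSquaredDeviations(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        double mean = sum / xs.length;
        double ss = 0.0;
        for (double x : xs) ss += (x - mean) * (x - mean);
        return ss;
    }

    // Divide by N: biased low when the mean is estimated from the same sample.
    static double divideByN(double[] xs) {
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Divide by N-1: unbiased estimate of the variance.
    static double divideByNMinusOne(double[] xs) {
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the sum of squared deviations is 32, so dividing by N gives 4 and dividing by N-1 gives 32/7, roughly 4.57.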

I wrote two blog entries on the topic which go into more detail, including how to delete previous values online:

- Computing Sample Mean and Variance Online in One Pass
- Deleting Values in Welford's Algorithm for Online Mean and Variance
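One way to delete a value is to run the update step algebraically in reverse. The sketch below does that; it is my own illustration with hypothetical names, it may not match the presentation in the blog entry, and it assumes the value being removed really was added earlier (nothing checks this).

```java
// Illustrative sketch only; names are hypothetical and the removal step
// is simply the algebraic inverse of the add step.
final class RemovableWelfordSketch {
    private long n = 0;        // number of values currently included
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations

    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Undo an earlier add(x). Assumes x was previously added; not verified.
    void remove(double x) {
        if (n == 0) throw new IllegalStateException("nothing to remove");
        if (n == 1) {          // removing the only value resets the state
            n = 0;
            mean = 0.0;
            m2 = 0.0;
            return;
        }
        double oldMean = mean - (x - mean) / (n - 1); // mean before x was added
        m2 -= (x - oldMean) * (x - mean);             // inverse of the m2 update
        mean = oldMean;
        n--;
    }

    long count() { return n; }
    double mean() { return mean; }
    double sampleVariance() { return n > 1 ? m2 / (n - 1) : Double.NaN; }
}
```

Adding 2, 4, 4, 4, 5, 5, 7, 9 and then removing 9 should leave the same mean and variance as adding only the first seven values, up to rounding. Rounding error can build up over many removals, so a long-running estimator may want to be rebuilt from scratch occasionally.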

You can also take a look at my Java implementation; the javadoc, source, and unit tests are all online:

- Javadoc:
`stats.OnlineNormalEstimator`

- Source:
`stats.OnlineNormalEstimator.java`

- JUnit Source:
`test.unit.stats.OnlineNormalEstimatorTest.java`

- LingPipe Home Page

