Another common mistake in least squares fitting

May 10, 2010 Filed under Blog, Featured, Potpourri, Resources, Writing

On p. 121 of Eloquent Science, I spend a page discussing the misuses of linear correlation. Turns out I didn’t cover all of them.

Mark Hibberd writes:

I think your Figure 11.10 [to the right] clearly shows a very common mistake of inappropriately using a standard least squares fit. The fit given (y = -13.2 + 0.42 x) assumes that there is no uncertainty in the x values, so all of the scatter is attributed to y. Eyeballing the data, it is clear that the line is not a good fit.

If you swap the axes and redo the standard least squares fit, you get a fit that would be shown in the figure as y = 36.0 + 1.89 x, which is even worse. (I digitised the figure to do the fitting.)

The correct method is to use a bivariate fit, which allows for uncertainty in both x and y. If we assume equal uncertainty in both x and y values, we get y = -2.1 + 0.75 x.

This method is well explained by Cantrell, C.A. (2008) “Technical Note: Review of methods for linear least-squares fitting of data and application to atmospheric chemistry problems” Atmos. Chem. Phys., 8, 5477–5487. He also includes a very useful spreadsheet in supplemental material available with the paper, which I strongly recommend to all scientists who fit data.

Note that a useful warning sign of problems is when fitting x vs. y and y vs. x gives very different standard least-squares fits.

Thanks, Mark, for the advice. For what it’s worth, we weren’t expecting a great fit to the data in that figure, no matter the method. Nevertheless, we should have fitted the regression line appropriately.
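For readers who want to try Mark's check on their own data, here is a minimal Python sketch that computes all three lines: the standard fit of y on x, the standard fit of x on y (re-expressed as y against x so it can be drawn on the same axes), and a bivariate fit assuming equal uncertainty in x and y. The last one uses the textbook closed form for orthogonal (equal-variance Deming) regression, which is not necessarily the procedure in Cantrell's spreadsheet. The function name `three_fits` and the synthetic data are my own illustration; they are not the digitised points from Figure 11.10.

```python
import numpy as np

def three_fits(x, y):
    """Return (intercept, slope) for three straight-line fits of y against x.

    Hypothetical helper for illustration. Assumes x and y are correlated
    (the centred cross-product sxy must be nonzero).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    dx, dy = x - xbar, y - ybar
    sxx, syy, sxy = dx @ dx, dy @ dy, dx @ dy

    slopes = {
        # Standard least squares of y on x: all scatter attributed to y.
        "y_on_x": sxy / sxx,
        # Standard least squares of x on y, rewritten as y = a + b x.
        "x_on_y": syy / sxy,
        # Orthogonal fit: equal uncertainty assumed in x and y.
        "orthogonal": (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2))
        / (2 * sxy),
    }
    # Every one of these lines passes through the centroid (xbar, ybar).
    return {name: (ybar - b * xbar, b) for name, b in slopes.items()}

# Synthetic example (NOT the book's data): a 1:1 relationship with
# equal-sized noise added to both coordinates.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 100.0, 200)
x = truth + rng.normal(0.0, 15.0, truth.size)
y = truth + rng.normal(0.0, 15.0, truth.size)

for name, (a, b) in three_fits(x, y).items():
    print(f"{name:>10}: y = {a:.2f} + {b:.2f} x")
```

For positively correlated data, the orthogonal slope always falls between the two standard least-squares slopes, so a wide gap between those two is exactly the warning sign Mark describes. If the uncertainties in x and y are not equal, this simple formula no longer applies and a weighted method is needed.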
