Instability of ordination results under changes in input data order: explanations and remedies

by Jari Oksanen & Peter R. Minchin

Journal of Vegetation Science 8, 447-454; 1997.

Abstract.

We demonstrate that correspondence analysis (CA) and detrended correspondence analysis (DCA) ordinations produced by the popular program CANOCO are unstable under reordering of the species and sites in the input data matrix. In CA, the main cause of the instability is the use of insufficiently stringent convergence criteria in the power algorithm that is used to estimate the eigenvalues. The use of stricter criteria produces CA ordinations that are acceptably stable. The divisive classification program TWINSPAN uses CA based on a similar algorithm, but with extremely lax convergence criteria, and is thus susceptible to extreme instability. We detected an order-dependent programming error in the non-linear rescaling procedure that forms part of DCA. When this bug is corrected, much of the instability in DCA disappears. The stability of DCA solutions is further enhanced by the use of strict convergence criteria. Although much of the instability in our trials occurred on the third and fourth axes, it should not be assumed that published two-dimensional ordinations are sufficiently accurate. Data sets that have pairs of almost equal eigenvalues among the first three axes could suffer from marked instability in the first two dimensions. We strongly recommend that a debugged, strict version of CANOCO be released. Meanwhile, users can check the stability of their CA and DCA ordinations using the software that we have made available on the World Wide Web. An accurate program for CA, a debugged, strict version of DECORANA (for DCA) and a strict version of TWINSPAN are also available from our site.

Keywords: algorithm; correspondence analysis; clustering; detrended correspondence analysis; eigenanalysis; non-linear rescaling; numerical method; precision; tolerance.
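
The role of the convergence criterion described in the abstract can be illustrated with a minimal sketch of a power algorithm for the first CA axis, written here as reciprocal averaging in plain Python. This is not the code of CANOCO, DECORANA or TWINSPAN. The function name `ca_first_axis`, the parameters `tol` and `max_iter`, the stopping rule (largest change in any site score) and the small abundance table are all assumptions made for illustration. The starting scores are taken from the row positions, so they change when the input is reordered; with a strict tolerance the converged scores should nevertheless agree across orderings up to a sign flip.

```python
def ca_first_axis(table, tol=1e-10, max_iter=10000):
    """First CA axis of a sites-by-species table via reciprocal averaging.

    Illustrative sketch only; names and stopping rule are assumptions.
    Returns (eigenvalue estimate, site scores, iterations used).
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    grand = sum(row_tot)

    def standardize(x):
        # Centre and scale site scores, weighting each site by its row total.
        mean = sum(w * v for w, v in zip(row_tot, x)) / grand
        x = [v - mean for v in x]
        norm = (sum(w * v * v for w, v in zip(row_tot, x)) / grand) ** 0.5
        return [v / norm for v in x], norm

    # Starting scores depend on row position, hence on input order.
    x, _ = standardize([float(i) for i in range(n_rows)])
    eig = 0.0
    for iteration in range(1, max_iter + 1):
        # Species scores: weighted averages of site scores.
        u = [sum(table[i][j] * x[i] for i in range(n_rows)) / col_tot[j]
             for j in range(n_cols)]
        # Site scores: weighted averages of species scores.
        new_x = [sum(table[i][j] * u[j] for j in range(n_cols)) / row_tot[i]
                 for i in range(n_rows)]
        # The shrinkage removed by rescaling estimates the eigenvalue.
        new_x, eig = standardize(new_x)
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:  # strict criterion: stop only when scores settle
            break
    return eig, x, iteration


# Hypothetical abundance table (6 sites x 5 species), invented for this sketch.
table = [
    [5, 3, 1, 0, 0],
    [3, 5, 2, 1, 0],
    [1, 3, 5, 2, 1],
    [0, 1, 3, 5, 2],
    [0, 0, 1, 3, 5],
    [0, 0, 0, 2, 6],
]
row_order = [3, 0, 5, 1, 4, 2]
col_order = [2, 4, 0, 3, 1]
shuffled = [[table[i][j] for j in col_order] for i in row_order]

eig_a, x_a, _ = ca_first_axis(table)
eig_b, x_b, _ = ca_first_axis(shuffled)

# Put the shuffled solution back into the original site order.
x_b_orig = [0.0] * len(table)
for pos, i in enumerate(row_order):
    x_b_orig[i] = x_b[pos]
# The sign of an axis is arbitrary, so align it before comparing.
if sum(a * b for a, b in zip(x_a, x_b_orig)) < 0:
    x_b_orig = [-v for v in x_b_orig]

max_diff = max(abs(a - b) for a, b in zip(x_a, x_b_orig))
print(eig_a, eig_b, max_diff)
```

Rerunning the comparison with a much looser tolerance, for example `tol=1e-2`, is one way to see how a lax criterion lets the two orderings stop at different approximations of the same axis.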