capscale {vegan} | R Documentation |

Constrained Analysis of Principal Coordinates (CAP) is an ordination method
similar to Redundancy Analysis (`rda`), but it allows
non-Euclidean dissimilarity indices, such as Manhattan or
Bray–Curtis distance. Despite this non-Euclidean feature, the analysis
is strictly linear and metric. If called with Euclidean distance,
the results are identical to `rda`, but `capscale`
will be much less efficient. Function `capscale` is a
constrained version of metric scaling, a.k.a. principal coordinates
analysis, which is based on the Euclidean distance but can be used,
and is more useful, with other dissimilarity measures. The function
can also perform unconstrained principal coordinates analysis,
optionally using extended dissimilarities.

    capscale(formula, data, distance = "euclidean", sqrt.dist = FALSE,
             comm = NULL, add = FALSE, dfun = vegdist, metaMDSdist = FALSE,
             na.action = na.fail, ...)

`formula` |
Model formula. The function can be called only with the
formula interface. Most usual features of `formula` hold,
especially as defined in `cca` and `rda`. The
LHS must be either a community data matrix or a dissimilarity matrix,
e.g., from
`vegdist` or `dist`.
If the LHS is a data matrix, function `vegdist` (or the function given
in `dfun`) will be used to find the dissimilarities. The RHS defines the
constraints. The constraints can be continuous variables or factors,
they can be transformed within the formula, and they can have
interactions as in a typical `formula`. The RHS can have a
special term `Condition` that defines variables to be
"partialled out" before constraints, just like in `rda`
or `cca`. This allows the use of partial CAP. |

`data` |
Data frame containing the variables on the right hand side of the model formula. |

`distance` |
The name of the dissimilarity (or distance) index if
the LHS of the `formula` is a data frame instead of a
dissimilarity matrix. |

`sqrt.dist` |
Take square roots of dissimilarities. See the discussion of
negative eigenvalues below. |

`comm` |
Community data frame which will be used for finding
species scores when the LHS of the `formula` was a
dissimilarity matrix. This is not used if the LHS is a data
frame. If this is not supplied, the "species scores" are the axes
of initial metric scaling (`cmdscale`) and may be
confusing. |

`add` |
Logical indicating if an additive constant should be
computed, and added to the non-diagonal dissimilarities such
that all eigenvalues are non-negative in the underlying
Principal Co-ordinates Analysis (see `cmdscale`
for details). This implements "correction method 2" of
Legendre & Legendre (1998, p. 434). The negative eigenvalues are
caused by using semi-metric or non-metric dissimilarities with
the basically metric `cmdscale`. They are harmless and
ignored in `capscale`, but you can also avoid the warnings with
this option. |

`dfun` |
Distance or dissimilarity function used. Any function
returning standard `"dist"` and taking the index name as the
first argument can be used. |

`metaMDSdist` |
Use `metaMDSdist` similarly as in
`metaMDS`. This means automatic data transformation and
using extended flexible shortest path dissimilarities (function
`stepacross`) when there are many dissimilarities based on
no shared species. |

`na.action` |
Handling of missing values in constraints or
conditions. The default (`na.fail`) is to stop
with missing values. Choices `na.omit` and
`na.exclude` delete rows with missing values, but
differ in representation of results. With `na.omit` only
non-missing site scores are shown, but `na.exclude` gives
`NA` for scores of missing observations. Unlike in
`rda`, no WA scores are available for missing
constraints or conditions. |

`...` |
Other parameters passed to `rda` or to
`metaMDSdist`. |
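
As a rough illustration of the `comm` argument, the sketch below passes a precomputed Bray–Curtis dissimilarity matrix on the LHS and supplies the community data separately, so that species scores refer to species and not to the axes of the initial metric scaling. It assumes a reasonably recent *vegan* with the `varespec` and `varechem` example data; the object names `d`, `mod` and `sp` are illustrative only.

```r
library(vegan)
data(varespec)
data(varechem)

# Dissimilarities computed beforehand; any "dist" object would do
d <- vegdist(varespec, method = "bray")

# The LHS is a dissimilarity matrix, so `comm` supplies the community
# data used for finding species scores
mod <- capscale(d ~ N + P + K + Condition(Al), data = varechem,
                comm = varespec)

# One row of species scores per column of the community data
sp <- scores(mod, display = "species")
dim(sp)
```

If `comm` is omitted in such a call, the "species scores" are the metric scaling axes, as described for the `comm` argument above.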

Canonical Analysis of Principal Coordinates (CAP) is simply a
Redundancy Analysis of results of Metric (Classical) Multidimensional
Scaling (Anderson & Willis 2003). Function `capscale` uses two steps:
(1) it ordinates the dissimilarity matrix using `cmdscale`
and (2) analyses these results using `rda`.
If the user supplied a community data frame instead
of dissimilarities, the function will find the needed dissimilarity
matrix using `vegdist` with the specified `distance`.
However, the method will accept dissimilarity
matrices from `vegdist`, `dist`, or any
other method producing similar matrices. The constraining variables can be
continuous or factors or both, they can have interaction terms,
or they can be transformed in the call. Moreover, there can be a
special term `Condition` just like in `rda` and `cca`,
so that "partial" CAP can be performed.
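
The two steps can be sketched in base R as follows. This is only a minimal illustration under stated assumptions, not the code of `capscale`: the data are simulated (`comm`, `env`, `x1` and `x2` are hypothetical), redundancy analysis is written out as a regression on centred constraints followed by a principal components analysis of the fitted values, and species scores and the scaling conventions of `capscale` are ignored.

```r
set.seed(42)
n <- 12

# Hypothetical toy data: 12 sites, 5 species, 2 constraints
comm <- matrix(rpois(n * 5, lambda = 4), nrow = n)
env  <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

# Step 1: metric scaling of the dissimilarities. cmdscale() keeps only
# axes with positive eigenvalues and scales them by sqrt(eigenvalue).
# It warns when requested axes are dropped, hence suppressWarnings().
d    <- dist(comm, method = "manhattan")
pco  <- suppressWarnings(cmdscale(d, k = n - 1, eig = TRUE))
axes <- scale(pco$points, center = TRUE, scale = FALSE)

# Step 2: redundancy analysis of the axes, i.e. multivariate regression
# on the centred constraints followed by PCA of the fitted values
X          <- scale(model.matrix(~ x1 + x2, data = env)[, -1], scale = FALSE)
fitted_axes <- qr.fitted(qr(X), axes)

constrained_eig <- svd(fitted_axes)$d^2 / (n - 1)
total_inertia   <- sum(axes^2) / (n - 1)

# Constrained eigenvalues (at most one per constraint) and their share
# of the inertia carried by the positive eigenvalues
constrained_eig[1:2]
sum(constrained_eig[1:2]) / total_inertia
```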

The current implementation differs from the method suggested by Anderson & Willis (2003) in three major points which actually make it similar to distance-based redundancy analysis (Legendre & Anderson 1999):

- Anderson & Willis used the orthonormal solution of `cmdscale`, whereas `capscale` uses axes weighted by corresponding eigenvalues, so that the ordination distances are the best approximations of original dissimilarities. In the original method, later "noise" axes are just as important as first major axes.
- Anderson & Willis take only a subset of axes, whereas `capscale` uses all axes with positive eigenvalues. The use of subset is necessary with orthonormal axes to chop off some "noise", but the use of all axes guarantees that the results are the best approximation of original dissimilarities.
- Function `capscale` adds species scores as weighted sums of (residual) community matrix (if the matrix is available), whereas Anderson & Willis have no fixed method for adding species scores.
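
The difference between eigenvalue-weighted and orthonormal axes in the first point can be illustrated with a small base R sketch. The data are simulated and the names `weighted` and `orthonormal` are hypothetical; Euclidean distances are used here only so that all eigenvalues are positive and the reconstruction is exact.

```r
set.seed(1)
# Hypothetical toy data: 10 sites, 4 variables
x <- matrix(rnorm(40), nrow = 10)
d <- dist(x)                       # Euclidean, so no negative eigenvalues

pco         <- cmdscale(d, k = 4, eig = TRUE)
weighted    <- pco$points          # axes scaled by sqrt(eigenvalue)
orthonormal <- sweep(weighted, 2, sqrt(pco$eig[1:4]), "/")

# Distances among the weighted axes reproduce the original dissimilarities
all.equal(as.numeric(dist(weighted)), as.numeric(d))

# Orthonormal axes all have unit length, so minor axes weigh as much as
# major ones
round(crossprod(orthonormal), 10)
```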

With these definitions, function `capscale` with Euclidean
distances will be identical to `rda` in eigenvalues and
in site, species and biplot scores (except for possible sign
reversal).
However, it makes no sense to use `capscale` with
Euclidean distances, since direct use of `rda` is much more
efficient. Even with non-Euclidean dissimilarities, the
rest of the analysis will be metric and linear.
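
The stated equivalence can be checked along the following lines. This is a sketch that assumes a reasonably recent *vegan*, the `varespec` and `varechem` example data, and that the constrained eigenvalues are stored in the `CCA$eig` component of the result; only the constrained eigenvalues are compared.

```r
library(vegan)
data(varespec)
data(varechem)

mod_rda <- rda(varespec ~ N + P + K, data = varechem)
mod_cap <- capscale(varespec ~ N + P + K, data = varechem,
                    distance = "euclidean")

# Constrained eigenvalues of the two analyses should agree
all.equal(as.numeric(mod_rda$CCA$eig), as.numeric(mod_cap$CCA$eig))
```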

The function can also be used to perform ordinary metric scaling,
a.k.a. principal coordinates analysis, by using a formula with only a
constant on the right hand side, such as `comm ~ 1`. With
`metaMDSdist = TRUE`, the function can do automatic data
standardization and use extended dissimilarities using function
`stepacross` similarly as in non-metric multidimensional
scaling with `metaMDS`.

The function returns an object of class `capscale` which is
identical to the result of `rda`. At the moment,
`capscale` does not have specific methods, but it uses
`cca` and `rda` methods `plot.cca`, `scores.rda` etc. Moreover, you
can use `anova.cca` for permutation tests of
"significance" of the results.
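
A brief sketch of how the inherited methods are typically reached through their generics; it assumes a reasonably recent *vegan* and the `varespec` and `varechem` example data, and the object names are illustrative.

```r
library(vegan)
data(varespec)
data(varechem)

mod <- capscale(varespec ~ N + P + K, data = varechem, distance = "bray")

# The result also inherits from the classes used by rda and cca methods
class(mod)

# Site scores for the first two axes via the scores generic
site_scores <- scores(mod, display = "sites", choices = 1:2)
head(site_scores)

plot(mod)     # plotting method shared with cca and rda results
anova(mod)    # permutation test of the constraints
```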

The function produces negative eigenvalues with many
dissimilarity indices. The non-Euclidean component of inertia is
given under the title `Imaginary`, and the negative eigenvalues
are listed after unconstrained eigenvalues with prefix `NEG`.
The total inertia is the sum of all eigenvalues, including negative
ones (Gower 1985). No ordination scores are given for negative
eigenvalues. If the negative eigenvalues are disturbing, you can
use argument `add = TRUE` passed to `cmdscale`, or,
preferably, a distance measure that does not cause these warnings.
Alternatively, after square root transformation of distances
(argument `sqrt.dist = TRUE`) many indices do not produce
negative eigenvalues.
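
The origin of negative eigenvalues, and the effect of the two remedies, can be seen in a small base R sketch that calls `cmdscale` directly. The four-point dissimilarity matrix is a hypothetical example constructed for illustration: three points at mutual distance 2 and a fourth at distance 1 from each, which satisfies the triangle inequality but cannot be represented in Euclidean space.

```r
# Hypothetical 4-point example: metric, but not Euclidean
m <- matrix(2, 4, 4)
m[4, ] <- m[, 4] <- 1
diag(m) <- 0
d <- as.dist(m)

# Raw dissimilarities: the last eigenvalue is negative
eig_raw  <- cmdscale(d, k = 2, eig = TRUE)$eig
# Additive constant, as with add = TRUE
eig_add  <- cmdscale(d, k = 2, eig = TRUE, add = TRUE)$eig
# Square roots of dissimilarities, as with sqrt.dist = TRUE
eig_sqrt <- cmdscale(as.dist(sqrt(m)), k = 2, eig = TRUE)$eig

round(cbind(raw = eig_raw, add = eig_add, sqrt = eig_sqrt), 3)
```

In this example both remedies remove the negative eigenvalue, but the square root transformation is not guaranteed to do so for every dissimilarity index.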

The inertia is named after the dissimilarity index as defined in the
dissimilarity data, or as `unknown distance` if such
information is missing. Function `rda` usually divides
the ordination scores by number of sites minus one. In this way, the
inertia is variance instead of sum of squares, and the eigenvalues sum
up to variance. Many dissimilarity measures are in the range 0 to 1,
so they have already made a similar division. If the largest original
dissimilarity is less than or equal to 4 (allowing for
`stepacross`), this division is undone in `capscale`
and original dissimilarities are used. Keyword `mean` is added to
the inertia in cases where division was made, e.g. in Euclidean and
Manhattan distances. Inertia is based on squared index, and keyword
`squared` is added to the name of distance, unless data were
square root transformed (argument `sqrt.dist = TRUE`). If an
additive constant was used, keyword `euclidified` is added to
the name of inertia (argument `add = TRUE`).

Jari Oksanen

Anderson, M.J. & Willis, T.J. (2003). Canonical analysis of principal
coordinates: a useful method of constrained ordination for
ecology. *Ecology* 84, 511–525.

Gower, J.C. (1985). Properties of Euclidean and non-Euclidean
distance matrices. *Linear Algebra and its Applications* 67, 81–97.

Legendre, P. & Anderson, M. J. (1999). Distance-based redundancy
analysis: testing multispecies responses in multifactorial ecological
experiments. *Ecological Monographs* 69, 1–24.

Legendre, P. & Legendre, L. (1998). *Numerical Ecology*. 2nd English
Edition. Elsevier.

`rda`, `cca`, `plot.cca`, `anova.cca`, `vegdist`, `dist`, `cmdscale`.

    library(vegan)
    data(varespec)
    data(varechem)
    ## Basic Analysis
    vare.cap <- capscale(varespec ~ N + P + K + Condition(Al), varechem,
                         distance = "bray")
    vare.cap
    plot(vare.cap)
    anova(vare.cap)
    ## Avoid negative eigenvalues with additive constant
    capscale(varespec ~ N + P + K + Condition(Al), varechem,
             distance = "bray", add = TRUE)
    ## Avoid negative eigenvalues by taking square roots of dissimilarities
    capscale(varespec ~ N + P + K + Condition(Al), varechem,
             distance = "bray", sqrt.dist = TRUE)
    ## Principal coordinates analysis with extended dissimilarities
    capscale(varespec ~ 1, distance = "bray", metaMDSdist = TRUE)

[Package *vegan* version 1.16-32 Index]