Austin has strongly criticized the use of symmetric Gaussian response
functions in gradient analysis (Austin 1976, 1980, 1987; Austin
& Gaywood 1994). The Gaussian response function has been the
only model in which the parameters can be estimated effectively
(Austin *et al.* 1984; ter Braak & Looman 1986; Jongman
*et al.* 1987). Although the evidence for the Gaussian model
can be criticized strongly (Austin 1976, 1980), all the methods
proposed to test the significance of skewness of the response
have had some problems: Visual judgement of skewness (Økland
1986) is subjective; third degree polynomials (Austin *et al.*
1990) have irrealistic shapes; beta function (Austin *et al.*
1994) confounds the location of the maximum and the skewness (Oksanen
1997); and smoothing by generalized additive models (Hastie &
Tibshirani 1990; Yee & Mitchell 1991) must be judged subjectively.
However, Huisman, Olff and Fresco (1993) have proposed a set of
hierarchical models which include a skewed response and a symmetric
response. They assume that these models cannot be fitted using
maximum likelihood (Huisman *et al.* 1993). However, this
can be done, and so it is possible to (1) use more realistic error
distribution for the observation than the originally implied normal
distribution, and (2) it is possible to test statistically whether
a response is significantly skewed, or whether symmetric
[*Fig: jpeg, 13K*] or monotonic response could be used.

The program HOF fits the model of Huisman, Olff and Fresco to the vegetation data using maximum likelihood estimation with Poisson error. The program runs under MS-DOS, it reads in CEP or CANOCO formatted files, and reports results both on the screen and in a log file on hard disk. This paper gives an overview of the theory behind the program, and an introduction to its use and limitations.

A set of five hierarchic models was originally proposed (Huisman
*et al.* 1993). Hierarchic means that a simpler model has
(1) fewer parameters than the more complex model, and (2) can
be derived from the more complex models by fixing one of the parameters.
The most complex model proposed defines a skewed response function.
It can be written in slightly modified form:

Here *m* is the expected value which is dependent on the
known values of the gradient (*x*) and maximum possible value
(*M*), and the four estimated parameters (*a, b, c, d*)
of the function. This model can be simplified as follows
[*Fig: gif, 5.6K*]:

Model Estimated / fixed parameters No. of estimated parameters V: Skewed4 IV: Symmetricabcd3 II: Monotoneabcb2 I: Flatab-inf 01a0 -inf 0

Here estimated parameters are in bold and fixed parameters in
ordinary print showing their value. Originally the authors proposed
one more model (III) with parameters (*a, b, c, 0*), but
since it uses the same number of parameters as model IV we did
not use it: It cannot be tested statistically against model IV
and its shape is less interesting than that of model IV (model
IV gives a symmetric response resembling the Gaussian response,
and model III a monotone response resembling model II but converging
to a plateau below *M* instead of *M*).

Originally only a least squares method was proposed, and it was
said that maximum likelihood cannot be used with these models
(Huisman *et al.* 1993). However, the use of maximum likelihood
is fairly straightforward (cf. Oksanen *et al.* 1990), though
it may be computationally demanding.

Least squares fitting can be seen as a maximum likelihood estimation
using normal error. In least squares we compare fitted values
(*mu*) with the observed values (*y*) trying to minimize

So the task is to find parameter estimates that give the fitted
values (*mu*) that give the least squares fit to the observed
values (*y*). The density function of normal distribution
for one indexed observation is given by:

Here *sigma *is a scaling parameter which is constant for
all observations. To find the most likely values of *mu*
we can write this into a log-likelihood function (*LL*):

Maximum likelihood estimate of *mu* is the value that maximizes
*LL*. The value of the function is dependent only on the
indexed variables *y* and *mu*. To maximize *LL*
we have to minimize the least squares (*y-mu*)^2. So the
least squares give the maximum likelihood estimate with normal
error.

Other error distributions can be used analogously. For instance,
the probability function of Poisson distribution is:

In maximum likelihood estimation with Poisson error we have to
find the parameters that give the values of *mu* that maximize
that function.

In maximum likelihood estimation we can (1) use log-likelihood
(*LL*) instead of original likelihood, and (2) we can omit
the term *y*! since it is not dependent on the fitted values
and does not influence the estimation. In that case we get the
quasilikelihood (*qLL*) estimates:

Quasilikelihood function will give the same estimates as the original
likelihood function. However, the value of *LL* and *qLL*
evaluated at these estimates of *mu* will be different.

Maximum likelihood estimates maximize the likelihood function
*LL* or equivalently minimize -*LL*. Maximum likelihood
function used in the HOF model is non-linear in parameters, and
so the parameters must be estimated using iterative methods. There
are several alternative methods for non-linear minimization. The
one used here is a quasi-Newton method known as the Broyden-Fletcher-Goldfarb-Shanno
variant of the Davidon-Fletcher-Powell method (Press *et al.*
1992). The method is much more effective than the traditional
steepest descent method. The maximum likelihood estimates of the
parameters are found by minimizing 2*qLL*. No convergence
to a global optimum can be guaranteed in iterative fitting. When
convergence has been found, the minimization is started again
at that point to see whether the program can get anywhere.

If the model fits the data, -2*LL* follows Chi-squared distribution
(Lindgren 1976). This means that the differences of -2*LL*
from two different models follow the Chi-squared distribution
as well. Consequently, if we have two models where one has one
more parameter than the other, and the difference in -2*LL*
is larger than 3.84, the *extra parameter* is significant
at the level *P* < 0.05. So the hierarchic HOF models
can be compared by analysing the change in -2*LL* and adopting
the model which cannot be simplified without significant change
in -2*LL.*

A major condition in this scheme is that -2*LL* follows Chi-squared
distribution *only* *if *the model fits the data. If
the variance is larger than assumed in the likelihood function,
the data are overdispersed, and the statistical test must be adapted
to this overdispersion (McCullagh & Nelder 1989). Since the
estimation is based on quasilikelihood (-2*qLL)* we must
(1) transform -2*qLL* to a statistic known as a deviance
(McCullagh & Nelder 1989) so that we can analyse the degree
of overdispersion, and (2) adjust statistical testing to overdispersion.

The smallest possible -2*qLL* with the current data (*y*)
is obtained when *m *= *y*, i.e., fitted values are
identical with observed values. Deviance (*D*) is defined
as the difference between minimum 2*LL* and observed 2*LL*
(McCullagh & Nelder 1989):

Deviance follows Chi-squared distribution similarly as the -2*LL*.
The expectation of a Chi-squared variate equals its degrees of
freedom. If we have *n* observations, and we estimate *p*
parameters, the residual degrees of freedom is *n-p*. The
data are overdispersed if the deviance exceeds the degrees of
freedom. Overdispersion is a rule in community data sets (Kemp
1987; Anderson 1988; Oksanen *et al.* 1990, 1991).

If the data are overdispersed, we cannot use the simple Chi-squared
test to analyse the significance of the parameters, but we have
to use *F* test instead (Anderson 1988; McCullagh & Nelder
1989). If we have a model with *p* estimated parameters and
deviance *D(p) *, and a simpler model with *p*-1 estimated
parameters and deviance *D(p-1)* , then the testing can be
based on the statistic (Anderson 1988; McCullagh & Nelder
1989):

The program HOF implements maximum likelihood estimation of the
parameters of the HOF model (Huisman *et al.* 1993) as explained
above: It uses Poisson error and statistical testing based on
overdispersed data.

The program reads in data from normal CEP files or files accepted by the program CANOCO (ter Braak 1987).

The program asks the names of vegetation data and environmental
data files, the number of the one and only environmental variable
used, the minimum frequency of species analysed, and the value
of greatest theoretical value (*M*). The alternatives for
*M* are either the sum of values for the observations as
computed from the data (useful for counts), or a constant (e.g.
100 for percentage cover).

The gradient is scaled to the range 0..1, mainly to simplify setting the starting values and estimation of the parameters.

The output screen [*Fig: gif,
14K*] has following entries:

-2*qLL* gives the fit at the current iteration step. When
the convergence has been found, the program is started again,
and if a minimum was found (stops at iteration 0) or -2*qLL*
changes only a little, the iteration stops. With Pentium the calculations
are so fast that you can hardly follow this column, but with slower
computers (e.g. 486DX) this shows you that the computer is indeed
doing something.

Graphics screen plots first the observations against the scaled gradient, and then in addition the latest fitted significant curve.

The summary screen shows the parameter estimates, fit statistics, and significance of the removed term. This information will be printed to a disk file called HOFLOG.TXT.

The model is simplified as long as the removed parameters are
insignificant at *P* < 0.05. Estimation is stopped when
first significant term is found. The graphic screen will show
the curve from the largest significant model.

- The program reads in CEP files and "relaxed restricted condensed format" files from CANOCO (ter Braak 1987). However, there may be cases where the program cannot handle files accepted by these other programs. Most importantly, the program uses the index numbers to identify the sample plots. If the numbering is discontinuous, the missing plots are not removed but they are given the value zero on the gradient (on the other hand, the program does not require the plots be in order, unlike CEP and CANOCO). Data cannot be transformed or edited in the program.
- The program is dimensioned to 8000 non-zero observations, 1000 sample plots, and 600 species. The storage allocation is dynamic, and it may fail if the computer does not have a sufficient amount of available memory. The success of allocation is usually checked, and the program is stopped if the allocation fails.
- The program can use only Poisson error. In many biological
data sets, binomial error would be more adequate. If the
*fitted values*are small, Poisson error gives a good approximation. Specifically, the variance of Poisson variable is*mu,*and the variance of binomial observation is*mu*(1-*mu/M*), and for small*mu/M*these are nearly equal. I'm working to implement binomial error as well, but I won't promise it very soon. - The program assumes that
*M*is known.*M*is the largest possible value, e.g., the scale maximum or total number counted. If needed, I may change the program so that this variable is read from a file instead assuming it constant or computed as the sum of observations. It is not clear to me whether*M*originally (Huisman*et al.*1993) was intended to be among the estimated parameters or as a known variable like in this program. - In some cases the program is trapped so that the fitted "curve"
is a horizontal line at
*M.*This happens with some data sets. It does not occur in the data sets I've used, but if needed, I may try to work a correction to this (I have an idea how it could be done). - No convergence to global optimum can be guaranteed. This is nothing special with the program, but a general feature of non-linear minimization where we may have several local minima. There is no solution to this, but if you suspect wrong fitting, there may be a programming error. So please, report these cases.

- The program has been written in C language, and compiled using Microsoft Quick C 2.5. It uses graphical libraries of Quick C. The graphical routines are not portable even to Microsoft Visual C 1.5. A subset of the program without graphics has been ported to DJGPP (MS-DOS port to Gnu C). The program runs under DOS.
- Numerical Recipes Software (Press
*et al.*1992) are used for minimization of -2*qLL*(dfpmin, modified) and evaluation of the*F*statistics (betai). These are copyright routines, and**copying this program is against Numerical Recipes copyright**. - The routines used in reading CEP and CANOCO files were written
in FORTRAN, and translated into C using F2C software (Feldman
*et al.*1993).

- Anderson, D.A. (1988) Some models for overdispersed binomial
data.
*Australian Journal of Statistics*,**30,**125-148. - Austin, M.P. (1976) On non-linear species response models
in ordination.
*Vegetatio*,**33,**33-41. - Austin, M.P. (1980) Searching for a model for use in vegetation
analysis.
*Vegetatio*,**42,**11-21. - Austin, M.P. (1987) Models for the analysis of species' response
to environmental gradients.
*Vegetatio*,**69,**35-45. - Austin, M.P. & Gaywood, M.J. (1994) Current problems of
environmental gradients and species response curves in relation
to continuum theory.
*Journal of Vegetation Science*,**5,**473-482. - Austin, M.P., Cunningham, R.B. & Fleming, P.M. (1984)
New approach to direct gradient analysis using environmental scalars
and statistical curve-fitting.
*Vegetatio*,**55,**11-27. - Austin, M.P., Nicholls, A.O. & Margules, C.R. (1990) Measurement
of the realized qualitative niche: environmental niches of five
*Eucalyptus*species.*Ecological Monographs*,**60,**161-177. - Austin, M.P., Nicholls, A.O., Doherty, M.D. & Meyers,
J.A. (1994) Determining species response functions to an environmental
gradient by means of a
*ß*-function.*Journal of Vegetation Science*,**5,**215-228. - ter Braak, C.J.F. (1987)
*CANOCO - a FORTRAN program for*cano*nical*c*ommunity*o*rdination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis (version 2.1)*. TNO, Wageningen. - ter Braak, C.J.F. & Looman, C.W.N. (1986) Weighted averaging,
logistic regression and the Gaussian response model.
*Vegetatio*,**65,**3-11. - Feldman, S.I., Gay, D.M., Maimone, M.W. & Schryer, N.L.
(1993) A Fortran-to-C converter.
*AT&T Bell Laboratories, Computing Science Technical Report*,**149,**1-25. (originally issued 1990, update 1993) - Hastie, T.J. & Tibshirani, R.J. (1990)
*Generalized additive models*. Chapman & Hall, London. - Huisman, J., Olff, H. & Fresco, L.F.M. (1993) A hierarchical
set of models for species response analysis.
*Journal of Vegetation Science*,**4,**37-46. - Jongman, R.H., ter Braak, C.J.F. & van Tongeren, O.F.R.
(1987)
*Data analysis in community and landscape ecology*. Pudoc, Wageningen. - Kemp, A.W. (1987) Families of discrete distributions satisfying
Taylor's power law.
*Biometrics*,**43,**693-699. - Lindgren, B.W. (1976)
*Statistical theory*, 3rd Edn. Macmillan, New York. - McCullagh, P. & Nelder, J.A. (1989)
*Generalized linear models*, 2nd Edn. Chapman & Hall, London. - Økland, R.H. (1986) Rescaling of ecological gradients.
II. The effect of scale on symmetry of species response curves.
*Nordic Journal of Botany*,**6,**671-677. - Oksanen, J. (1997) Why the beta-function cannot be used to
estimate skewness of species responses.
*Journal of Vegetation Science*,**8,**147-152. - Oksanen, J., Läärä, E., Huttunen, P. &
Meriläinen, J. (1990) Maximum likelihood prediction of lake
acidity based on sedimented diatoms.
*Journal of Vegetation Science*,**1,**49-59. - Oksanen, J., Läärä, E. & Zobel, K. (1991)
Statistical analysis of bioindicator value of epiphytic lichens.
*Lichenologist*,**23,**167-180. - Press, W.H., Teukolsky, S.A., Vetterling, W.T. & Flannery,
B.P. (1992)
*Numerical recipes in C: The art of scientific computing*, 2nd Edn. Cambridge University Press, Cambridge. - Yee, T.W. & Mitchell, N.D. (1991) Generalized additive
models in plant ecology.
*Journal of Vegetation Science*,**2,**587-602.

Go to [ top ] [my homepage]

*Updated 5/5/97*