Mathematical Methods in Statistics / Méthodes mathématiques en statistique Org: Russell Steele, Alain Vandal and/et David Wolfson (McGill)
- JEAN-FRANCOIS ANGERS, Dép. de math/stat, Université de Montréal, C. P. 6128
Succ. Centre-ville, Montréal, QC H3C 3J7
Mixture of Zero Inflated Densities
-
In several real life examples one encounters count data where the
number of zeros is such that the usual discrete probability density
functions do not fit the data. Quite often the number of zeros is
large, and hence the data are zero-inflated. Furthermore, the
histogram is often multimodal, indicating that the data may come from
different sub-populations. In such a situation, a zero inflated model
along with a mixture of discrete probability density functions can be
considered and a Bayesian analysis can be carried out. Using the EM
algorithm, Bayesian estimates and credibility intervals for the
different parameters are obtained. The proposed technique is
illustrated using a real life data set.
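As a rough illustration of how the EM algorithm handles zero inflation, the sketch below fits a single zero-inflated Poisson distribution by maximum likelihood. It is a deliberate simplification and not the method of the talk: there is no mixture of several count components and no prior, and the function name `fit_zip_em` and its parameters are hypothetical.

```python
import math


def fit_zip_em(counts, n_iter=200):
    """EM for a zero-inflated Poisson model (illustrative sketch only).

    Model: P(0) = pi + (1 - pi) * exp(-lam),
           P(k) = (1 - pi) * Poisson(k; lam) for k >= 1.
    Assumes the data contain at least one non-zero count.
    """
    n = len(counts)
    n_zero = sum(1 for c in counts if c == 0)
    total = sum(counts)
    pi, lam = 0.5, total / n  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the inflation weight and the Poisson mean
        pi = n_zero * w / n
        lam = total / (n - n_zero * w)
    return pi, lam
```

Calling `fit_zip_em([0] * 50 + [1] * 10 + [2] * 15 + [3] * 15 + [4] * 10)` returns an inflation weight and a Poisson mean whose product `(1 - pi) * lam` matches the sample mean, which is a quick sanity check on the fit.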
- MASOUD ASGHARIAN, McGill University
On the Singularities of the Information Matrix
-
The information matrix plays a central role when establishing
asymptotic normality of parameter estimates in problems of statistical
inference. One recurring condition for asymptotic normality is that
the information matrix be positive definite. For many problems,
however, this condition seems virtually impossible to verify. An
important class of models where this is the case is the class of
mixture models, for which the form of the information matrix prevents
the verification of this crucial condition. Using the Subimmersion
Theorem we show that under identifiability and Le Cam's smoothness
conditions the set of singularities of the information matrix is a
nowhere dense set. Under further conditions we demonstrate that this
set is also of measure zero, provided that the score function, when
considered as a function on a complex domain conformable with the
parameter space, is bounded by a statistic whose second moment exists.
We also study the measure of this set when parameter orthogonality is
possible. In particular, it is shown that one can find a
reparameterization under which the set of singularities of the
information matrix is nowhere dense and of measure zero, provided
that parameter orthogonality is possible. Our results, therefore,
suggest that in problems for which the tangible conditions of
identifiability and smoothness may be assumed, positive definiteness
of the information matrix "rarely" fails to hold.
- CHRISTIAN GENEST, Université Laval, Québec, Canada G1K 7P4
Testing independence revisited / Un nouveau regard sur les
tests d'indépendance
-
Testing independence is a time-honored problem in statistics. This
issue will be revisited here in the light of the theory of copulas.
It will be argued that outside the normal paradigm, effective tests of
independence should be rank-based, and that nonparametric tests of
independence generally yield more powerful and robust procedures than those
based on the likelihood. The small- and large-sample properties of
locally most powerful procedures will be compared to those of standard
tests, notably through the notion of Pitman's asymptotic relative
efficiency. Tests based on Cramér-von Mises and Kolmogorov-Smirnov
functionals of Deheuvels' empirical copula process will also be
considered. This presentation will be based on joint work with
J.-F. Quessy, B. Rémillard, and F. Verret.
Le problème de tester l'indépendance est classique en statistique.
La question sera abordée ici sous l'angle de la théorie des
copules. On fera valoir qu'en dehors du cadre normal, tout bon test
d'indépendance devrait être fondé sur les rangs des observations
et qu'à cet égard, les procédures non paramétriques sont
généralement plus puissantes et robustes que celles qui s'appuient
sur la vraisemblance. Le comportement de tests de rangs localement
les plus puissants sera comparé à celui de procédures
classiques, tant dans de petits que de grands échantillons,
notamment au moyen de la notion d'efficacité relative asymptotique
de Pitman. On s'intéressera en outre à des tests construits à
partir de fonctionnelles de type Cramér-von Mises et
Kolmogorov-Smirnov du processus de copule expérimental de Deheuvels.
Cette présentation s'appuie sur des travaux réalisés en
collaboration avec J.-F. Quessy, B. Rémillard et F. Verret.
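To make the idea of a rank-based test built on the empirical copula concrete, here is a minimal sketch. It evaluates a Cramér-von Mises-type distance between the empirical copula and the independence copula at the sample points only, and calibrates it by permutation. This is an assumed simplified variant, not the statistic studied in the talk; the function names are hypothetical and ties are assumed absent.

```python
import random


def ranks(values):
    """Ranks 1..n of the observations (ties assumed absent)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def cvm_statistic(r, s):
    """Sum over sample points of (C_n(u, v) - u * v)^2."""
    n = len(r)
    total = 0.0
    for i in range(n):
        u, v = r[i] / n, s[i] / n
        # empirical copula evaluated at (u, v)
        c = sum(1 for j in range(n) if r[j] <= r[i] and s[j] <= s[i]) / n
        total += (c - u * v) ** 2
    return total


def rank_independence_test(x, y, n_perm=499, seed=0):
    """Permutation p-value for independence based on ranks only."""
    rng = random.Random(seed)
    r, s = ranks(x), ranks(y)
    observed = cvm_statistic(r, s)
    s_perm = s[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(s_perm)  # break any dependence between the two rankings
        if cvm_statistic(r, s_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because only ranks enter the computation, the statistic is unchanged by any strictly increasing transformation of either variable, which is the main appeal of rank-based procedures outside the normal model.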
- PIERRE LEGENDRE, Université de Montréal, C.P. 6128, Succ. Centre-ville,
Montréal, Québec H3C 3J7
What are the important spatial scales in an ecosystem?
-
Spatial heterogeneity of ecological structures comes either from the
physical forcing of environmental variables or from community
processes. In both cases, spatial structuring plays a functional role
in ecosystems. Ecological models should explicitly take into account
the spatial organization of ecosystems.
A canonical (regression-type) modeling method has been developed,
which allows the variance of a multivariate data table
(e.g., species abundances) to be decomposed into four components:
(a) a non-spatially-structured component explained by the
environmental variables in the model,
(b) a spatially-structured component of environmental
variation,
(c) a spatially-structured fraction which is not explained by
the environmental variables and possibly results from community
dynamics, and
(d) a residual fraction.
The first three components can be mapped separately, providing new
insights into community dynamics.
In previous work, we used a polynomial function of the geographic
coordinates of the sampling sites to represent broad-scale spatial
variation. We have since found a way of representing spatial structures at all
scales in these analyses. This is achieved by eigenvalue
decomposition of a truncated matrix of distances among sampling sites.
The behavior of this method has been investigated using numerical
simulations and real data sets. When sampling is carried out along a
transect or on a regular grid, this modeling method allows the estimation
of the variance associated with each spatial scale in the observation
window. A graph of the resulting F-statistics against scale is called
a scalogram. It indicates the significant spatial scales present in
the multivariate data under study, for example a community
composition data table.
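The first steps of the eigenvalue approach described above can be sketched for sites along a transect. The code builds the truncated distance matrix and applies Gower's double centring, which yields the matrix whose eigenvectors would be extracted. Replacing distances above the threshold by four times the threshold is an assumption here, based on how the method is commonly described; the function names are hypothetical and the eigenvalue decomposition itself is not shown.

```python
def truncated_distance_matrix(coords, threshold):
    """Distances between transect positions; values above `threshold`
    are replaced by 4 * threshold (assumed truncation convention)."""
    n = len(coords)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            d = abs(coords[i] - coords[j])
            row.append(d if d <= threshold else 4 * threshold)
        matrix.append(row)
    return matrix


def gower_centre(dist):
    """Double-centre -0.5 * d^2, as in principal coordinate analysis."""
    n = len(dist)
    a = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in a]  # equals column means by symmetry
    grand_mean = sum(row_mean) / n
    return [
        [a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]
```

The eigenvectors of the centred matrix associated with positive eigenvalues (obtainable, for instance, with `numpy.linalg.eigh`) would then serve as spatial descriptors in the canonical analysis.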
- BRENDA MacGIBBON, Département de mathématiques, Université du Québec
à Montréal
Exact inference for categorical data
-
Exact methods of inference for parameter significance and
goodness-of-fit with categorical data have recently been the subject
of renewed statistical interest because many applications give rise to
contingency tables whose counts are small enough in some cells to
cast doubt on the validity of multivariate normal
approximations. On the other hand, the tables in these applications
have entries large enough in other cells to make enumeration
difficult. Exact computational methods fall into two groups: complete
enumeration and Monte Carlo methods. These methods will be
illustrated on a data set of categories of congenital heart
malformations for sibling pairs and on a case-control data set. This is joint
work with Yuguo Chen and Ian Dinwoodie.
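A minimal sketch of the Monte Carlo route to exact inference follows, for a test of independence in a two-way table. Shuffling the column labels of the individual observations samples tables with the observed margins under the null hypothesis, so no normal approximation is needed. This is a generic illustration, not the samplers used in the talk; the function names are hypothetical and every row and column total is assumed positive.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic of a two-way table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def monte_carlo_exact_test(table, n_sim=999, seed=0):
    """Monte Carlo p-value conditional on the observed margins."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # expand the table into one (row, column) label pair per observation
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            rows.extend([i] * count)
            cols.extend([j] * count)
    observed = chi_square(table)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(cols)  # random table with the same margins
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            sim[i][j] += 1
        if chi_square(sim) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_sim + 1)
```

Complete enumeration would instead sum the exact null probabilities of all tables with the given margins, which becomes impractical once some cell counts are large.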
- NEAL MADRAS, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3
A Model for Tracking the History of the AIDS Epidemic
-
The epidemiology of AIDS poses many challenging statistical and
mathematical problems. In particular, tracking and forecasting the
AIDS epidemic is complicated by very long incubation times.
Individuals with HIV infections are often diagnosed before developing
AIDS, and the ensuing treatment makes it difficult to model incubation
times.
I shall describe a new model that accounts for early detection without
introducing bias. We use a Gibbs sampler Monte Carlo simulation to
estimate the probabilities of diagnoses and the total number of new
HIV infections each year among homosexual men in Ontario.
- PAUL MARRIOTT, University of Waterloo, 200 University Ave. W., Waterloo,
Ontario N2L 3G1
Mixture Models and Geometry
-
The class of statistical models known as mixtures has wide
applicability in applied problems due to its flexibility,
naturalness and interpretability. However, despite the apparent
simplicity of these models, the associated inference problem remains hard,
both from a theoretical and a practical standpoint. This talk gives
an overview of some methods which use geometric techniques to
understand the problem of inference under mixture models. The
recently introduced class of local mixtures is shown to have many
applications, managing to retain a great deal of flexibility and
interpretability while having excellent inference properties. Also
discussed will be some interesting issues which arise when you
transfer ideas from one mathematical area (differential geometry) to
another (statistical inference).
- BRUNO REMILLARD, HEC Montréal
Bootstrapping methods for empirical processes
-
In this talk I will show that some bootstrapping methods work for
empirical processes, while others do not. Examples
will include the parametric bootstrap for goodness-of-fit tests for copula
families and multiplier methods for empirical processes of
pseudo-observations.
- LOUIS-PAUL RIVEST, Laval
Using quaternions for the statistical modelling
of human motion
-
In biomechanics, the measurement of human motion relies on camera
systems that record the positions of markers attached to the
subject's limbs. The marker coordinates are converted into
3 × 3 rotation matrices that give the orientations of the limbs
under study.
The motion of a joint is then computed as the time sequence of
rotation matrices giving the orientation of the joint's two segments
relative to each other. This sequence of rotations is represented as
three time series of Euler angles, associated respectively with
flexion-extension, abduction-adduction and external-internal rotation
movements. This talk presents a statistical model for identifying
measurement errors in certain types of motion. It postulates that a
judicious change in the orientation of the axis systems of the two
segments of a joint allows the motion to be represented by a single
sequence of Euler angles; the motion would then be, for example, a
pure flexion-extension. The talk shows that writing the rotation
matrices as quaternions makes it relatively simple to fit this model.
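For readers unfamiliar with the link between quaternions and rotation matrices, the following minimal sketch converts a unit quaternion to a 3 × 3 rotation matrix and composes rotations by quaternion multiplication. It only illustrates the representation, not the statistical model of the talk; the function names are hypothetical.

```python
import math


def quaternion_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    ux, uy, uz = (a / norm for a in axis)
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), ux * s, uy * s, uz * s)


def quaternion_multiply(p, q):
    """Hamilton product p * q (apply q first, then p)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def rotation_matrix(q):
    """3 x 3 rotation matrix of a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
```

A motion that is a rotation about one fixed axis, such as a pure flexion-extension, corresponds to quaternions that all share the same axis; composing two such rotations simply adds their angles.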
- ALAIN VANDAL, McGill University, 805 rue Sherbrooke ouest, Montréal,
Québec H3A 2K6
Weak order partitioning of interval orders, with application
to survival analysis
-
We propose a partition of the set of linear extensions of any interval
order, in which each subset is itself the set of linear extensions of
a weak order. This partitioning technique lends itself immediately to
the definition of a Markov chain on the partitioning weak orders.
Because the number of linear extensions of a weak order is easily
computed, this chain can be used in Monte Carlo fashion to draw a
linear extension uniformly from the interval order. For the
statistical investigation of interval orders, this technique is more
attractive than that of Bubley and Dyer, as the correlation between statistics
on consecutive linear extensions in the chain is very small. Using an
alternate, computationally simple technique to draw linear extensions
in non-uniform fashion, we show how we can estimate the number of
linear extensions in an interval order using this Markov chain Monte
Carlo. We also show how the Markov chain can be used to produce rank
statistics for interval-censored failure times of subjects in a
control or treatment group, under the null hypothesis of equivalence
between control and treatment.
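The remark that linear extensions of a weak order are easy to count can be illustrated briefly. A weak order is a ranked sequence of classes of mutually incomparable elements, so its linear extensions number the product of the factorials of the class sizes. The brute-force counter for a small interval order is included only for comparison; both function names are hypothetical, and the partitioning scheme and Markov chain of the talk are not reproduced here.

```python
from itertools import permutations
from math import factorial


def count_weak_order_extensions(level_sizes):
    """Number of linear extensions of a weak order with the given class sizes."""
    total = 1
    for size in level_sizes:
        total *= factorial(size)
    return total


def count_interval_order_extensions(intervals):
    """Brute-force count; interval i precedes j when right_i < left_j.

    Only practical for a handful of intervals.
    """
    n = len(intervals)
    count = 0
    for perm in permutations(range(n)):
        position = {item: k for k, item in enumerate(perm)}
        if all(
            position[i] < position[j]
            for i in range(n)
            for j in range(n)
            if intervals[i][1] < intervals[j][0]
        ):
            count += 1
    return count
```

With interval-censored failure times, each subject's censoring interval plays the role of an interval in the order, and a linear extension is one ranking of the subjects compatible with the data.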