A likelihood ratio test for correlated paired multivariate samples
Abstract
Many laboratory experiments in the biological sciences involve two main
groups, say healthy and infected subjects. In one kind of such experiment, each
specimen from each group is divided into two portions; one portion is stimulated
while the other remains unstimulated. This results in two main groups
with paired measurements that are correlated. For all the groups, the expression of p genes
is measured. The stimulation in this case can be done by introducing a known infection-causing micro-organism such as group A streptococcus, which is usually associated
with acute rheumatic fever. An important question in such an experiment is to
statistically test for the differences in the differences in means for the healthy and the infected groups. That is, the difference in the means of the healthy group (stimulated and
unstimulated) is tested against the difference in the means of the infected (stimulated
and unstimulated) group. In this paper, a likelihood ratio test statistic is developed for
problems of this kind. The developed statistic and the Hotelling T² statistic are both
applied to data simulated from real biological situations, and their performances
are compared. The simulated data exhibit the correlation structure similar to that of
real biological data obtained from experiments involving the milliplex analyst biomarker
data sets. The results indicate that the proposed test statistic gives the same conclusions
for the hypotheses tested as the Hotelling T² test. However, the proposed test
is intuitively more appealing since it accounts for the correlations between the pairs in
the data. The simulation study confirms that the test statistics follow a chi-square distribution. This research contributes a theoretical analysis of paired correlated samples
motivated by a practical problem for which the statistical methods currently in use have
seldom taken into account the correlation structure of the data.
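
As a rough illustration of the comparison described above, the following Python sketch simulates correlated stimulated/unstimulated pairs for two groups and applies the classical two-sample Hotelling T² test to the within-subject differences. It is not the likelihood ratio statistic developed in the paper, whose form is not given in this abstract. The function names `simulate_group` and `hotelling_t2_two_sample`, the cross-portion correlation `rho`, the identity within-portion covariance, the group sizes, and the assumption of a common covariance matrix in both groups are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def simulate_group(rng, n, mu_stim, mu_unstim, rho=0.6):
    """Draw n correlated (stimulated, unstimulated) pairs of p-vectors.

    Assumed covariance of the stacked 2p-vector: identity within each
    portion and rho * identity between the two portions of a specimen.
    """
    p = mu_stim.size
    cov = np.block([[np.eye(p), rho * np.eye(p)],
                    [rho * np.eye(p), np.eye(p)]])
    x = rng.multivariate_normal(np.concatenate([mu_stim, mu_unstim]), cov, size=n)
    return x[:, :p], x[:, p:]


def hotelling_t2_two_sample(d1, d2):
    """Two-sample Hotelling T^2 test for equal mean vectors.

    d1, d2: (n_i x p) arrays, here the within-subject differences
    (stimulated minus unstimulated) of each group. Assumes both groups
    share one covariance matrix, which is estimated by pooling.
    """
    n1, p = d1.shape
    n2 = d2.shape[0]
    mean_diff = d1.mean(axis=0) - d2.mean(axis=0)
    s1 = np.atleast_2d(np.cov(d1, rowvar=False))
    s2 = np.atleast_2d(np.cov(d2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * mean_diff @ np.linalg.solve(s_pooled, mean_diff)
    # Under the null hypothesis, this scaled T^2 follows F(p, n1 + n2 - p - 1).
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    p_value = stats.f.sf(f_stat, p, n1 + n2 - p - 1)
    return t2, f_stat, p_value


rng = np.random.default_rng(0)
p = 3  # number of genes (illustrative)
healthy_s, healthy_u = simulate_group(rng, 30, np.ones(p), np.zeros(p))
infected_s, infected_u = simulate_group(rng, 30, np.ones(p), np.zeros(p))

# Null case: both groups have the same stimulated-minus-unstimulated mean.
t2, f_stat, p_value = hotelling_t2_two_sample(healthy_s - healthy_u,
                                              infected_s - infected_u)
print(f"T^2 = {t2:.3f}, F = {f_stat:.3f}, p-value = {p_value:.3f}")
```

Taking within-subject differences first is one simple way to absorb the pairing: under the assumed covariance, each difference has variance 2(1 - rho), so a stronger correlation between the two portions yields a more precise comparison of the groups.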