Solutions to systems of linear equations
Consider the equation Y = Xβ, where Y is an n × 1 vector of observations, X is an n × p matrix, and β is a p × 1 vector of parameters. For a given X and Y (observed data), does there exist a solution β to this equation?
If p = n (i.e. X is square) and X is nonsingular, then yes, and the unique solution is β = X⁻¹Y. Note that in this case the number of parameters equals the number of subjects, and we cannot make inference.
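As a quick numerical sketch of the square nonsingular case (the X and Y below are made up purely for illustration):

```python
import numpy as np

# A made-up square, nonsingular design matrix and response (p = n = 3).
X = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])
Y = np.array([1.0, 2.0, 3.0])

# Unique solution beta = X^{-1} Y (solve avoids forming the inverse explicitly).
beta = np.linalg.solve(X, Y)

# X beta reproduces Y exactly (up to floating-point error).
print(beta)
print(np.allclose(X @ beta, Y))  # True
```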
Suppose p ≤ n and Y ∈ C(X). Then yes, though the solution is not necessarily unique. In this case, β = X⁻Y is a solution, since XX⁻Y = Y for all Y ∈ C(X) by the definition of a generalized inverse (XX⁻X = X). Consider the following two cases:
If r(X) = p (X full rank), then the columns of X form a basis for C(X), and the coordinates of Y relative to that basis are unique (recall notes Section 2.2); therefore the solution β is unique.
Suppose r(X) < p. If β* is a solution to Y = Xβ, then β* + w for any w ∈ N(X) is also a solution. So the set of all solutions to the equation is {β* + w : w ∈ N(X)} = {X⁻Y + (I − X⁻X)z : z ∈ ℝᵖ}. Note that X⁺X (using the MP generalized inverse X⁺) is the orthogonal projection operator onto C(X'), and so I − X⁺X is the orthogonal projection operator onto N(X) = C(X')⊥.
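A small numerical sketch of this rank-deficient case (the matrix below is made up; np.linalg.pinv computes the MP generalized inverse):

```python
import numpy as np

# Made-up rank-deficient design: third column = first + second, so r(X) = 2 < p = 3.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
beta_true = np.array([1.0, 2.0, 0.0])
Y = X @ beta_true                       # guarantees Y is in C(X)

Xp = np.linalg.pinv(X)                  # MP generalized inverse
beta_star = Xp @ Y                      # one particular solution
print(np.allclose(X @ beta_star, Y))    # True: it solves Y = X beta

# Adding any vector in N(X) gives another solution.
w = np.array([1.0, 1.0, -1.0])          # X w = 0 by construction
print(np.allclose(X @ w, 0))            # True
print(np.allclose(X @ (beta_star + w), Y))  # True: another solution

# X^+ X is the orthogonal projection onto C(X'): symmetric and idempotent.
P = Xp @ X
print(np.allclose(P, P.T), np.allclose(P @ P, P))  # True True
```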
In general, Y ∉ C(X) and no solution exists. In this case, we look for the vector in C(X) that is "closest" to Y and solve the equation with this vector in place of Y. This vector is MY, where M = X(X'X)⁻X' is the orthogonal projection operator onto C(X). Now solve:

MY = Xβ
The general solution (for r(X) ≤ p) is given by β = X⁻MY + (I − X⁻X)z, z ∈ ℝᵖ, and again there are infinitely many solutions when r(X) < p. Let the (compact) SVD of X be given by X = UDV', so that M = UU'. We know the MP generalized inverse of X is X⁺ = VD⁻¹U'. Therefore,

X⁺MY = VD⁻¹U'UU'Y = VD⁻¹U'Y = X⁺Y

So the general solution is given by β = X⁺Y + (I − X⁺X)z, z ∈ ℝᵖ.
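A sketch (with a made-up X and a Y not in C(X)) verifying the SVD form of X⁺ and the identity X⁺MY = X⁺Y numerically:

```python
import numpy as np

# Made-up n x p matrix with r(X) < p and a Y that is NOT in C(X).
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
Y = np.array([1.0, 2.0, 0.5, 4.0])

# Compact SVD: keep only the r nonzero singular values.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
r = np.sum(d > 1e-10)
U, d, Vt = U[:, :r], d[:r], Vt[:r, :]

X_plus = Vt.T @ np.diag(1.0 / d) @ U.T          # X^+ = V D^{-1} U'
print(np.allclose(X_plus, np.linalg.pinv(X)))   # True

M = U @ U.T                                     # orthogonal projection onto C(X)
print(np.allclose(X_plus @ M @ Y, X_plus @ Y))  # True: X^+ M Y = X^+ Y
```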
Now assume r(X) = p. In this case, we have X⁺ = (X'X)⁻¹X' and X⁺X = I, and so the unique solution is β = X⁺Y = (X'X)⁻¹X'Y.
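For the full-rank case, a quick check (random made-up data) that the closed form agrees with a standard least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
X = rng.normal(size=(n, p))            # full column rank with probability 1
Y = rng.normal(size=n)

beta_closed = np.linalg.inv(X.T @ X) @ X.T @ Y      # (X'X)^{-1} X'Y
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least-squares solver
print(np.allclose(beta_closed, beta_lstsq))         # True
```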
Random vectors and matrices
Definition: Let Y = (Y₁, Y₂, ..., Yₙ)' be a random vector with E(Yᵢ) = μᵢ and Var(Yᵢ) = σᵢ², i = 1, ..., n. The expectation of Y is given by

E(Y) = (μ₁, μ₂, ..., μₙ)' = μ
Similarly, the expectation of a matrix is the matrix of expectations
of the elements of that matrix.
Definition: Suppose Y is an n × 1 vector of random variables. The covariance of Y is given by the matrix

Cov(Y) = Σ = E[(Y − μ)(Y − μ)'] = [σᵢⱼ]

where σᵢⱼ = Cov(Yᵢ, Yⱼ) = E[(Yᵢ − μᵢ)(Yⱼ − μⱼ)].
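A small illustration (made-up μ and Σ; the normal distribution is used only for convenience) of the matrix of pairwise covariances estimated from repeated draws of Y:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])    # a made-up covariance matrix

# Draw many realizations of Y and estimate E(Y) and Cov(Y) empirically.
Ys = rng.multivariate_normal(mu, Sigma, size=200_000)
print(np.round(Ys.mean(axis=0), 2))            # approximately mu
print(np.round(np.cov(Ys, rowvar=False), 2))   # approximately Sigma
```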
Theorem: Suppose Y is a random n × 1 vector with mean E(Y) = μ and covariance Cov(Y) = Σ. Further suppose the elements of the m × n matrix A and the m × 1 vector b are scalar constants. Then

E(AY + b) = Aμ + b

and

Cov(AY + b) = AΣA'
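A hedged numerical check of this theorem (μ, Σ, A, and b below are arbitrary made-up constants; the theorem itself does not require normality):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, -1.0, 0.0],
              [2.0,  0.0, 1.0]])       # arbitrary 2 x 3 constant matrix
b = np.array([5.0, -3.0])              # arbitrary 2 x 1 constant vector

Ys = rng.multivariate_normal(mu, Sigma, size=200_000)
Ws = Ys @ A.T + b                      # each row is A Y + b for one draw

print(np.round(Ws.mean(axis=0), 2), A @ mu + b)   # E(AY + b) vs. A mu + b
print(np.round(np.cov(Ws, rowvar=False), 2))      # empirical Cov(AY + b)
print(A @ Sigma @ A.T)                            # vs. A Sigma A'
```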
Definition: Let Y = (Y₁, ..., Yₙ)' and W = (W₁, ..., Wₘ)' be random vectors with E(Y) = μ and E(W) = γ. The covariance between Y and W is given by

Cov(Y, W) = E[(Y − μ)(W − γ)']

We call this a matrix of covariances (not necessarily square), which is distinct from a covariance matrix.
Theorem: Let Y and W be random vectors with E(Y) = μ and E(W) = γ. Further suppose A and B are matrices of constant scalars (conformable with Y and W, respectively). Then

Cov(AY, BW) = A Cov(Y, W) B'
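A numerical sketch of this result (the joint distribution, A, and B below are made up; Y and W are drawn jointly so their cross-covariance is known):

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up joint distribution: stack Y (2 x 1) and W (2 x 1) into one 4 x 1 normal vector.
mu_joint = np.zeros(4)
Sigma_joint = np.array([[1.0, 0.2, 0.4, 0.1],
                        [0.2, 1.0, 0.0, 0.3],
                        [0.4, 0.0, 1.0, 0.2],
                        [0.1, 0.3, 0.2, 1.0]])
Z = rng.multivariate_normal(mu_joint, Sigma_joint, size=300_000)
Y, W = Z[:, :2], Z[:, 2:]

A = np.array([[1.0, 2.0]])             # arbitrary constant matrices
B = np.array([[0.0, 1.0],
              [3.0, -1.0]])

# Empirical cross-covariance of AY and BW vs. the theorem A Cov(Y, W) B'.
AY, BW = Y @ A.T, W @ B.T
emp = (AY - AY.mean(axis=0)).T @ (BW - BW.mean(axis=0)) / (len(Z) - 1)
Cov_YW = Sigma_joint[:2, 2:]           # the known Cov(Y, W) block
print(np.round(emp, 2))
print(A @ Cov_YW @ B.T)
```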
Theorem: Covariance matrices are always positive
semi-definite.
Proof: Let Y be a random n × 1 vector and Σ = E[(Y − μ)(Y − μ)'] where μ = E(Y). We need to show that x'Σx ≥ 0 for any x ∈ ℝⁿ. Let Z = (Y − μ); then we have:

x'Σx = x'E[ZZ']x
     = E[x'ZZ'x]    (since x is a vector of scalars)
     = E[w'w]       (where w = Z'x)

Since w'w ≥ 0 and the expectation of a non-negative random variable is always non-negative, x'Σx ≥ 0. Note that if w = 0 with probability 1 for some x ≠ 0, then x₁z₁ + x₂z₂ + ⋯ + xₙzₙ = 0, where zᵢ is the ith column of Z' (i.e., zᵢ = Yᵢ − μᵢ). This implies a linear dependency among the elements of Y and singularity of the covariance matrix.
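As a closing sketch (made-up draws; the dependency is forced by construction), both the semi-definiteness and the singular, linearly dependent case can be seen numerically:

```python
import numpy as np

rng = np.random.default_rng(4)

# Covariance of any random vector is PSD: estimate one from made-up draws and
# check that all eigenvalues are (numerically) non-negative.
Ys = rng.normal(size=(100_000, 3))
Sigma_hat = np.cov(Ys, rowvar=False)
print(np.all(np.linalg.eigvalsh(Sigma_hat) >= -1e-10))   # True

# Now force a linear dependency: Y3 = Y1 + Y2 exactly.
Y_dep = np.column_stack([Ys[:, 0], Ys[:, 1], Ys[:, 0] + Ys[:, 1]])
Sigma_dep = np.cov(Y_dep, rowvar=False)
x = np.array([1.0, 1.0, -1.0])           # the dependency: x'(Y - mu) = 0
print(np.isclose(x @ Sigma_dep @ x, 0.0))                # True: x'Sigma x = 0
print(np.linalg.matrix_rank(Sigma_dep) < 3)              # True: singular
```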