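For a generic dominated family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ with densities $p_\theta$, a simple estimator (maximum likelihood estimator, MLE) for $\theta$ is
$$\hat\theta_{MLE}(X) = \arg\max_{\theta \in \Theta} p_\theta(X) = \arg\max_{\theta \in \Theta} \ell(\theta; X), \qquad \ell(\theta; x) = \log p_\theta(x).$$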
Argmax may not exist, be unique, or be computable.
It doesn't depend on parameterization or base measure; MLE for $g(\theta)$ is $g(\hat\theta_{MLE})$.
Example (Exponential Families)
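$p_\eta(x) = e^{\eta^\top T(x) - A(\eta)} h(x)$. So $\ell(\eta; x) = \eta^\top T(x) - A(\eta) + \log h(x)$. $\nabla \ell(\eta; x) = T(x) - \nabla A(\eta) = T(x) - E_\eta[T(X)]$, so $\hat\eta$ should solve $E_{\hat\eta}[T(X)] = T(x)$ if it exists.
Since $\nabla^2 \ell(\eta; x) = -\nabla^2 A(\eta) = -\mathrm{Var}_\eta(T(X))$ is negative definite unless $a^\top T(X)$ is almost surely constant for some $a \neq 0$ (in which case parameters are redundant), at most one solution exists. Let $\psi(\eta) = \nabla A(\eta) = E_\eta[T(X)]$, then $\hat\eta = \psi^{-1}(T(x))$.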
Example (Exponential Families, multiple variables)
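$X_1, \dots, X_n \overset{iid}{\sim} p_\eta(x) = e^{\eta^\top T(x) - A(\eta)} h(x)$, $\eta \in \Xi$. So $\hat\eta = \psi^{-1}(\bar T)$, with $\bar T = \frac{1}{n} \sum_{i=1}^n T(X_i)$.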
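As a concrete instance of solving the likelihood equation $\psi(\hat\eta) = \bar T$, here is a minimal sketch for the Poisson family in its natural parameterization, where $T(x) = x$, $A(\eta) = e^\eta$ and $\psi(\eta) = e^\eta$. The function name `poisson_natural_mle` and the data are illustrative assumptions, not part of the notes. Newton's method is used only to show the general recipe; for this family the closed form $\hat\eta = \log \bar T$ is available and serves as a check.

```python
import math

def poisson_natural_mle(xs, tol=1e-12, max_iter=100):
    """Solve psi(eta) = T_bar for the Poisson family by Newton's method.

    Here T(x) = x, A(eta) = exp(eta), psi(eta) = A'(eta) = exp(eta).
    (Hypothetical helper for illustration.)
    """
    t_bar = sum(xs) / len(xs)
    if t_bar <= 0:
        # T_bar = 0 is outside the range of psi, so the argmax does not exist.
        raise ValueError("MLE does not exist when all counts are zero")
    eta = 0.0
    for _ in range(max_iter):
        score = t_bar - math.exp(eta)   # derivative of the average log-likelihood
        hessian = -math.exp(eta)        # -A''(eta), strictly negative
        step = score / hessian
        eta -= step                     # Newton update
        if abs(step) < tol:
            break
    return eta

counts = [2, 0, 3, 1, 4, 2, 1, 3]       # made-up sample with mean 2
eta_hat = poisson_natural_mle(counts)
closed_form = math.log(sum(counts) / len(counts))
print(eta_hat, closed_form)
```

Because the log-likelihood is strictly concave in $\eta$, the Newton iteration has a single stationary point to find, which mirrors the "at most one solution" remark in the previous example.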
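Assume $\eta \in \Xi^\circ$. $\mu := E_\eta[T(X_i)] = \psi(\eta)$, $\mathrm{Var}_\eta(T(X_i)) = \nabla^2 A(\eta)$, $D\psi(\eta) = \nabla^2 A(\eta) \succ 0$ (the notation $\psi$ is defined in the last example), so $\psi^{-1}$ is continuous, and $D\psi^{-1}(\mu) = (\nabla^2 A(\eta))^{-1}$. By consistency of $\bar T$ (LLN), $\bar T \overset{p}{\to} \mu$; by continuous mapping, $\hat\eta = \psi^{-1}(\bar T) \overset{p}{\to} \psi^{-1}(\mu) = \eta$.
By the CLT, $\sqrt{n}(\bar T - \mu) \Rightarrow N(0, \nabla^2 A(\eta))$. (Recall $\mathrm{Var}_\eta(T(X_i)) = \nabla^2 A(\eta)$.)
By Delta method,
$$\sqrt{n}(\hat\eta - \eta) \Rightarrow N\left(0, (\nabla^2 A(\eta))^{-1} \nabla^2 A(\eta) (\nabla^2 A(\eta))^{-1}\right) = N\left(0, (\nabla^2 A(\eta))^{-1}\right).$$
Recall the Fisher information of one observation is $J_1(\eta) = \nabla^2 A(\eta)$. So $\hat\eta \approx N(\eta, (n J_1(\eta))^{-1})$: asymptotically unbiased, Gaussian, and achieves the CRLB.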
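The limiting variance can be checked by simulation. The sketch below assumes the exponential distribution with rate $\lambda$, written as an exponential family with $\eta = -\lambda$, $T(x) = x$, $A(\eta) = -\log(-\eta)$, so that $\psi^{-1}(t) = -1/t$ and $1/A''(\eta) = \eta^2 = \lambda^2$. The function name, sample size, replication count, and seed are arbitrary choices made for illustration.

```python
import random
import statistics

def simulate_scaled_errors(rate, n, reps, seed=0):
    """Draw `reps` samples of size n and return sqrt(n) * (eta_hat - eta).

    Exponential(rate) in natural form has eta = -rate and
    eta_hat = psi^{-1}(T_bar) = -1 / T_bar. (Hypothetical helper.)
    """
    rng = random.Random(seed)
    eta = -rate
    errors = []
    for _ in range(reps):
        t_bar = sum(rng.expovariate(rate) for _ in range(n)) / n
        eta_hat = -1.0 / t_bar
        errors.append(n ** 0.5 * (eta_hat - eta))
    return errors

rate = 2.0
errors = simulate_scaled_errors(rate, n=400, reps=2000)
empirical_var = statistics.variance(errors)
limit_var = rate ** 2   # 1 / A''(eta), the inverse Fisher information
print(empirical_var, limit_var)
```

The empirical variance should land near $\lambda^2 = 4$, up to Monte Carlo noise and a small finite-sample inflation; the mean of the scaled errors should be close to zero.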
The nice behavior of MLE we found in the exponential family case generalizes to a much broader class of models.
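Setting: $X_1, \dots, X_n \overset{iid}{\sim} p_\theta(x)$, $\theta \in \Theta \subseteq \mathbb{R}^d$. $p_\theta(x)$ is "smooth" in $\theta$.
Let $\ell_1(\theta; x) = \log p_\theta(x)$, $J_1(\theta) = \mathrm{Var}_\theta(\nabla \ell_1(\theta; X_i)) = -E_\theta[\nabla^2 \ell_1(\theta; X_i)]$. Then
$$\ell_n(\theta; X) = \sum_{i=1}^n \ell_1(\theta; X_i), \qquad J_n(\theta) = n J_1(\theta).$$
We say an estimator $\hat\theta_n$ is asymptotically efficient if
$$\sqrt{n}(\hat\theta_n - \theta) \Rightarrow N(0, J_1(\theta)^{-1}).$$
Delta method for differentiable estimand $g(\theta)$:
$$\sqrt{n}(g(\hat\theta_n) - g(\theta)) \Rightarrow N\left(0, \nabla g(\theta)^\top J_1(\theta)^{-1} \nabla g(\theta)\right),$$
so $g(\hat\theta_n)$ also achieves the CRLB if $\hat\theta_n$ does.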
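A short worked example may help fix the definitions. It assumes the Poisson model with mean $\lambda$ and the estimand $g(\lambda) = e^{-\lambda} = P_\lambda(X_i = 0)$, both chosen here purely for illustration:
$$\ell_1(\lambda; x) = x \log \lambda - \lambda - \log x!, \qquad \frac{\partial}{\partial\lambda}\ell_1(\lambda; x) = \frac{x}{\lambda} - 1, \qquad J_1(\lambda) = \frac{\mathrm{Var}_\lambda(X_i)}{\lambda^2} = \frac{1}{\lambda}.$$
The MLE is $\hat\lambda_n = \bar X$, and the CLT gives $\sqrt{n}(\bar X - \lambda) \Rightarrow N(0, \lambda) = N(0, J_1(\lambda)^{-1})$, so $\hat\lambda_n$ is asymptotically efficient. With $g'(\lambda) = -e^{-\lambda}$, the Delta method gives
$$\sqrt{n}\left(e^{-\bar X} - e^{-\lambda}\right) \Rightarrow N\left(0, g'(\lambda)^2 J_1(\lambda)^{-1}\right) = N\left(0, \lambda e^{-2\lambda}\right),$$
which matches the CRLB for estimating $g(\lambda)$.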
3 Asymptotic Distribution of MLE
Under mild conditions, $\hat\theta_{MLE}$ is asymptotically Gaussian, and efficient. We will be interested in $\ell_n(\theta; X)$ as a function of $\theta$. Notate "true" value as $\theta_0$ ($X_i \overset{iid}{\sim} p_{\theta_0}$).
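Then for $\hat\theta_n$ solving the likelihood equation $\nabla \ell_n(\hat\theta_n; X) = 0$, a first-order Taylor expansion of the score around $\theta_0$ gives
$$0 = \nabla \ell_n(\hat\theta_n; X) = \nabla \ell_n(\theta_0; X) + \nabla^2 \ell_n(\tilde\theta_n; X)(\hat\theta_n - \theta_0),$$
with $\tilde\theta_n$ between $\theta_0$ and $\hat\theta_n$ (for $d > 1$, apply the expansion to each coordinate of the score, with a possibly different intermediate point per row). Rearranging,
$$\sqrt{n}(\hat\theta_n - \theta_0) = \left(-\tfrac{1}{n} \nabla^2 \ell_n(\tilde\theta_n; X)\right)^{-1} \tfrac{1}{\sqrt{n}} \nabla \ell_n(\theta_0; X).$$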
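Recall KL divergence: $D_{KL}(\theta_0 \,\|\, \theta) = E_{\theta_0}\left[\log \frac{p_{\theta_0}(X_i)}{p_\theta(X_i)}\right]$. Let $W_i(\theta) = \ell_1(\theta; X_i) - \ell_1(\theta_0; X_i)$, $\bar W_n(\theta) = \frac{1}{n} \sum_{i=1}^n W_i(\theta)$. Note $\hat\theta_n \in \arg\max_\theta \bar W_n(\theta)$ too. By the LLN, for each fixed $\theta$,
$$\bar W_n(\theta) \overset{p}{\to} E_{\theta_0}[W_i(\theta)] = -D_{KL}(\theta_0 \,\|\, \theta) \le 0,$$
"=" iff $P_\theta = P_{\theta_0}$. But this pointwise convergence is not enough:
MLE depends on entire function $\bar W_n(\cdot)$.
Need uniform convergence in $\theta$.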
Convergence of function series
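For compact $K \subseteq \mathbb{R}^d$, let $C(K) = \{f : K \to \mathbb{R},\ f \text{ continuous}\}$. For $f \in C(K)$, let $\|f\|_\infty = \sup_{t \in K} |f(t)|$. Denote $f_n \to f$ in this norm if $\|f_n - f\|_\infty \to 0$.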
Theorem (Uniform LLN)
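Assume $K$ is compact. $W_1, W_2, \dots \in C(K)$ iid. $E\|W_i\|_\infty < \infty$, $\mu(t) = E[W_i(t)]$. Then $\mu \in C(K)$, and
$$P\left(\left\|\tfrac{1}{n} \sum_{i=1}^n W_i - \mu\right\|_\infty > \epsilon\right) \to 0 \quad \text{for all } \epsilon > 0$$
(i.e. $\|\bar W_n - \mu\|_\infty \overset{p}{\to} 0$)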
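The uniform LLN can be illustrated numerically. The sketch below assumes the $N(\theta, 1)$ location model with $\theta_0 = 0$ and $K = [-2, 2]$, so that $W_i(\theta) = \ell_1(\theta; X_i) - \ell_1(\theta_0; X_i)$ and $\mu(\theta) = -(\theta - \theta_0)^2 / 2$. The supremum over $K$ is approximated on a finite grid, which is an approximation introduced here; `sup_norm_gap` is a hypothetical helper name.

```python
import random

def sup_norm_gap(n, theta0=0.0, half_width=2.0, grid_size=81, seed=0):
    """Approximate ||W_bar_n - mu||_inf over K = [theta0 - hw, theta0 + hw].

    Model: X_i ~ N(theta0, 1), W_i(theta) = l_1(theta; X_i) - l_1(theta0; X_i),
    mu(theta) = -(theta - theta0)^2 / 2. The sup is taken over a grid.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(theta0, 1.0) for _ in range(n)]
    gap = 0.0
    for k in range(grid_size):
        theta = theta0 - half_width + 2.0 * half_width * k / (grid_size - 1)
        # Average log-likelihood ratio at this theta.
        w_bar = sum(0.5 * (x - theta0) ** 2 - 0.5 * (x - theta) ** 2 for x in xs) / n
        mu = -0.5 * (theta - theta0) ** 2
        gap = max(gap, abs(w_bar - mu))
    return gap

gaps = {n: sup_norm_gap(n) for n in (100, 1000, 10000)}
for n, gap in gaps.items():
    print(n, round(gap, 4))
```

In this model the gap works out to $\sup_{\theta \in K} |\theta - \theta_0| \cdot |\bar X - \theta_0|$, so it shrinks at the usual $1/\sqrt{n}$ rate, uniformly over $K$.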
Theorem (Consistency of MLE for Compact )
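$X_1, \dots, X_n \overset{iid}{\sim} P_{\theta_0}$. $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ has densities $p_\theta$, $\theta \in \Theta$. Assume:
$\ell_1(\theta; x) = \log p_\theta(x)$ is continuous in $\theta$, $\forall x$.
$\Theta$ is compact.
$E_{\theta_0}\left[\sup_{\theta \in \Theta} |\ell_1(\theta; X_i)|\right] < \infty$.
Model identifiable.
Then $\hat\theta_n \overset{p}{\to} \theta_0$ if $\hat\theta_n \in \arg\max_{\theta \in \Theta} \ell_n(\theta; X)$.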
Proof
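$W_i(\theta) = \ell_1(\theta; X_i) - \ell_1(\theta_0; X_i)$ iid, mean $\mu(\theta) = -D_{KL}(\theta_0 \,\|\, \theta)$, $\|\bar W_n - \mu\|_\infty \overset{p}{\to} 0$ by the uniform LLN (because $E_{\theta_0}\|W_i\|_\infty \le 2 E_{\theta_0}\left[\sup_\theta |\ell_1(\theta; X_i)|\right] < \infty$).
By definition, $\hat\theta_n$ maximizes $\bar W_n$, and $\bar W_n(\hat\theta_n) \ge \bar W_n(\theta_0) = 0$.
Fix $\epsilon > 0$, want to show $P_{\theta_0}(\|\hat\theta_n - \theta_0\| \ge \epsilon) \to 0$.
Let $K_\epsilon = \{\theta \in \Theta : \|\theta - \theta_0\| \ge \epsilon\}$, which is compact. Let $\delta = \inf_{\theta \in K_\epsilon} D_{KL}(\theta_0 \,\|\, \theta)$; $\delta > 0$ because $\mu$ is continuous on the compact set $K_\epsilon$ and the model is identifiable. Then
$$P_{\theta_0}(\hat\theta_n \in K_\epsilon) \le P_{\theta_0}\left(\sup_{\theta \in K_\epsilon} \bar W_n(\theta) \ge 0\right) \le P_{\theta_0}\left(\|\bar W_n - \mu\|_\infty \ge \delta\right) \to 0.$$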
We usually care about non-compact parameter spaces, so need some extra assumption to get us there.
Corollary
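Same assumptions (holding on every compact subset of $\Theta$) except now $\Theta = \mathbb{R}^d$ (non-compact), but there is some large enough $R$ so that $P_{\theta_0}(\|\hat\theta_n\| \le R) \to 1$. Then $\hat\theta_n \overset{p}{\to} \theta_0$.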
Proof
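Let $K = \{\theta : \|\theta\| \le R\}$ (with $R$ large enough that $\theta_0 \in K$), $\tilde\theta_n \in \arg\max_{\theta \in K} \ell_n(\theta; X)$, chosen equal to $\hat\theta_n$ whenever $\hat\theta_n \in K$; then $P_{\theta_0}(\hat\theta_n \ne \tilde\theta_n) \le P_{\theta_0}(\hat\theta_n \notin K) \to 0$ by assumption. Since $\tilde\theta_n \overset{p}{\to} \theta_0$ by the theorem on the compact set $K$, and $\hat\theta_n - \tilde\theta_n \overset{p}{\to} 0$, we get $\hat\theta_n \overset{p}{\to} \theta_0$.
So the only thing we actually need to worry about is if $\hat\theta_n$ is extremely far away from $\theta_0$ with non-negligible probability.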
Theorem (Asymptotic Distribution of MLE)
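$X_1, \dots, X_n \overset{iid}{\sim} P_{\theta_0}$, $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ has densities $p_\theta$.
Assume
$\mathcal{P}$ identifiable.
$\Theta \subseteq \mathbb{R}^d$ compact, with $\theta_0$ in the interior of $\Theta$ (needed for the first-order condition $\nabla \ell_n(\hat\theta_n; X) = 0$ used in the proof).
$E_{\theta_0}\left[\sup_{\theta \in \Theta} |\ell_1(\theta; X_i)|\right] < \infty$.
$\ell_1(\theta; x)$ has two continuous derivatives in $\theta$.
$E_{\theta_0}\left[\sup_{\theta \in \Theta} \|\nabla^2 \ell_1(\theta; X_i)\|\right] < \infty$.
$J_1(\theta_0)$ is positive definite.
Then
$$\sqrt{n}(\hat\theta_n - \theta_0) \Rightarrow N(0, J_1(\theta_0)^{-1}).$$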
Proof
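From before (a first-order Taylor expansion of the score around $\theta_0$, valid on the event $\nabla \ell_n(\hat\theta_n; X) = 0$, whose probability tends to one), we had
$$\sqrt{n}(\hat\theta_n - \theta_0) = \left(-\tfrac{1}{n} \nabla^2 \ell_n(\tilde\theta_n; X)\right)^{-1} \tfrac{1}{\sqrt{n}} \nabla \ell_n(\theta_0; X)$$
for $\tilde\theta_n$ between $\theta_0$ and $\hat\theta_n$. Previous result shows $\hat\theta_n \overset{p}{\to} \theta_0$, so $\tilde\theta_n \overset{p}{\to} \theta_0$ also.
Define $A(\theta) = E_{\theta_0}[\nabla^2 \ell_1(\theta; X_i)]$, finite and continuous by assumption. Then $A(\theta_0) = -J_1(\theta_0)$, $\left\|\tfrac{1}{n} \nabla^2 \ell_n - A\right\|_\infty \overset{p}{\to} 0$ (uniform LLN), $A(\tilde\theta_n) \overset{p}{\to} A(\theta_0)$ (*), and $\tfrac{1}{\sqrt{n}} \nabla \ell_n(\theta_0; X) \Rightarrow N(0, J_1(\theta_0))$ (CLT). So
$$\left\|\tfrac{1}{n} \nabla^2 \ell_n(\tilde\theta_n; X) + J_1(\theta_0)\right\| \le \left\|\tfrac{1}{n} \nabla^2 \ell_n - A\right\|_\infty + \left\|A(\tilde\theta_n) - A(\theta_0)\right\| \overset{p}{\to} 0.$$
(*) is by continuous mapping. Hence, again by continuous mapping (matrix inversion is continuous at the positive definite $J_1(\theta_0)$),
$$\left(-\tfrac{1}{n} \nabla^2 \ell_n(\tilde\theta_n; X)\right)^{-1} \overset{p}{\to} J_1(\theta_0)^{-1}.$$
By Slutsky,
$$\sqrt{n}(\hat\theta_n - \theta_0) \Rightarrow J_1(\theta_0)^{-1} N(0, J_1(\theta_0)) = N(0, J_1(\theta_0)^{-1}).$$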
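The conclusion can be checked by simulation in a model outside the exponential family. The sketch below assumes the Cauchy location model with unit scale, for which $J_1(\theta) = 1/2$, so the limit is $N(0, 2)$. The estimate is a root of the likelihood equation found by Fisher scoring started at the sample median; it is assumed here, not proved, that this root is the global maximizer, and the compactness requirement is taken to hold by restricting $\Theta$ to a large interval. All names and simulation sizes are illustrative.

```python
import math
import random
import statistics

def cauchy_mle(xs, iters=25):
    """Fisher scoring for the Cauchy location parameter (unit scale).

    Starts at the sample median and iterates
    theta <- theta + score(theta) / (n * J_1), with J_1 = 1/2.
    (Assumed to converge to the MLE; hypothetical helper.)
    """
    n = len(xs)
    theta = statistics.median(xs)
    for _ in range(iters):
        score = sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)
        theta += score / (0.5 * n)
    return theta

def scaled_errors(theta0, n, reps, seed=0):
    """Return sqrt(n) * (theta_hat - theta0) over `reps` simulated samples."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        # Inverse-CDF sampling from Cauchy(theta0, 1).
        xs = [theta0 + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
        out.append(math.sqrt(n) * (cauchy_mle(xs) - theta0))
    return out

errors = scaled_errors(theta0=1.0, n=200, reps=500)
empirical_var = statistics.variance(errors)
limit_var = 1.0 / 0.5   # J_1(theta0)^{-1}
print(empirical_var, limit_var)
```

The empirical variance of $\sqrt{n}(\hat\theta_n - \theta_0)$ should be close to $J_1(\theta_0)^{-1} = 2$, up to Monte Carlo noise. For comparison, the sample median has asymptotic variance $\pi^2/4 \approx 2.47$ in this model, so it is not efficient.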