1 Simple Linear Regression

1 Setting

Setting: $(x_1,y_1),\ldots,(x_n,y_n)$.
For a time-series setting, for example, we may take $x_i = i$ as the time index and $y_i$ as the population at time $i$.
For a linear model, we assume $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon_i$ is the error term, and assume $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0,\sigma^2)$.
The parameters here are $\beta_0, \beta_1, \sigma$ ($\sigma$ measures the noise scale). We want to do estimation and uncertainty quantification.
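As a minimal sketch of this model, we can simulate a dataset from it; the sample size and parameter values below ($n$, $\beta_0$, $\beta_1$, $\sigma$) are illustrative choices, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (made-up) true parameters
n = 50
beta0, beta1, sigma = 2.0, 0.5, 1.0

x = np.arange(1, n + 1)               # time-series index: x_i = i
eps = rng.normal(0.0, sigma, size=n)  # eps_i i.i.d. N(0, sigma^2)
y = beta0 + beta1 * x + eps           # y_i = beta0 + beta1 * x_i + eps_i
```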

2 Frequentist Inference

The frequentist approach is the MLE. Under the normal model, the likelihood of $(\beta_0,\beta_1,\sigma)$ is $(2\pi)^{-\frac n2}\sigma^{-n}\exp\big[-\frac{S(\beta_0,\beta_1)}{2\sigma^2}\big]$, where $S(\beta_0,\beta_1) = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2$ is the sum of squared residuals. Maximizing over $(\beta_0,\beta_1)$ is equivalent to minimizing $S(\beta_0,\beta_1)$, so the MLE $(\hat\beta_0,\hat\beta_1)$ is the least-squares estimate; plugging it back in gives $\hat\sigma^2 = S(\hat\beta_0,\hat\beta_1)/n$.
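The least-squares estimates have a standard closed form, which a short sketch can compute directly (the helper name `least_squares` is ours, not from the text):

```python
import numpy as np

def least_squares(x, y):
    """Closed-form least-squares / MLE estimates for simple linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    beta1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    beta0 = ybar - beta1 * xbar
    resid = y - beta0 - beta1 * x
    sigma2 = np.sum(resid ** 2) / len(x)  # MLE of sigma^2: S(b0_hat, b1_hat) / n
    return beta0, beta1, sigma2

# Sanity check on an exact line y = 1 + 2x (zero residuals)
b0, b1, s2 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
```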


3 Bayesian Inference

The Bayesian approach involves the posterior density
$$f_{\beta_0,\beta_1,\sigma \mid (x_i,y_i)}(\beta_0,\beta_1,\sigma) \propto \underbrace{f_{(x_i,y_i) \mid (\beta_0,\beta_1,\sigma)}\big((x_i,y_i)\big)}_{\text{likelihood}} \cdot \underbrace{f_{\beta_0,\beta_1,\sigma}(\beta_0,\beta_1,\sigma)}_{\text{prior}}.$$
The additional information we need to provide is the prior. We can assume $\beta_0, \beta_1, \log\sigma \overset{\text{i.i.d.}}{\sim} \mathrm{Uniform}[-C, C]$.
By the transformation formula, $f_\sigma(\sigma) = f_{\log\sigma}(\log\sigma)\cdot\frac{1}{\sigma}$. So the prior density is
$$f_{\beta_0,\beta_1,\sigma}(\beta_0,\beta_1,\sigma) = f_{\beta_0}(\beta_0)\, f_{\beta_1}(\beta_1)\, f_\sigma(\sigma) = \frac{I\{-C<\beta_0<C\}}{2C}\cdot\frac{I\{-C<\beta_1<C\}}{2C}\cdot\frac{I\{-C<\log\sigma<C\}}{2C\sigma} \propto \frac{1}{\sigma}\, I\{-C<\beta_0,\beta_1,\log\sigma<C\}.$$
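The transformation formula can be checked numerically: if $\log\sigma \sim \mathrm{Uniform}[-C,C]$, then $P(a<\sigma<b) = \frac{\log b - \log a}{2C}$, which is exactly what integrating the density $\frac{1}{2C\sigma}$ gives. A Monte Carlo sketch (the interval $(a,b)$ and $C$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
C = 2.0
# Draw log(sigma) uniformly, then exponentiate
sigma = np.exp(rng.uniform(-C, C, size=200_000))

# Compare empirical probability of an interval with the transformed density
a, b = 0.5, 0.6
p_emp = np.mean((sigma > a) & (sigma < b))
p_theory = (np.log(b) - np.log(a)) / (2 * C)  # integral of 1/(2C*sigma) over (a, b)
```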
So
$$f_{\beta_0,\beta_1,\sigma\mid\text{data}}(\beta_0,\beta_1,\sigma) \propto (2\pi)^{-\frac n2}\sigma^{-n}\exp\Big[-\frac{S(\beta_0,\beta_1)}{2\sigma^2}\Big]\cdot\frac{1}{\sigma}\, I\{-C<\beta_0,\beta_1,\log\sigma<C\} \propto \sigma^{-n-1}\exp\Big[-\frac{S(\beta_0,\beta_1)}{2\sigma^2}\Big]\, I\{-C<\beta_0,\beta_1,\log\sigma<C\}.$$
If we want to get the posterior for $\beta_0,\beta_1 \mid \text{data}$, we integrate over $\sigma$:
$$f_{\beta_0,\beta_1\mid\text{data}}(\beta_0,\beta_1) = \int f_{\beta_0,\beta_1,\sigma\mid\text{data}}(\beta_0,\beta_1,\sigma)\, d\sigma \propto 1\{-C<\beta_0,\beta_1<C\} \int_{e^{-C}}^{e^{C}} \sigma^{-n-1}\exp\Big(-\frac{S(\beta_0,\beta_1)}{2\sigma^2}\Big)\, d\sigma.$$
When $C$ is large, $(e^{-C}, e^{C})$ goes to $(0,\infty)$. Using the change of variable $s = \sigma/\sqrt{S(\beta_0,\beta_1)}$, the integral becomes
$$\int_0^\infty \sigma^{-n-1}\exp\Big(-\frac{S(\beta_0,\beta_1)}{2\sigma^2}\Big)\, d\sigma = S(\beta_0,\beta_1)^{-\frac n2}\int_0^\infty s^{-n-1}\exp\Big(-\frac{1}{2s^2}\Big)\, ds \propto S(\beta_0,\beta_1)^{-\frac n2},$$
and
$$f_{\beta_0,\beta_1\mid\text{data}}(\beta_0,\beta_1) \propto 1\{-C<\beta_0,\beta_1<C\}\, S(\beta_0,\beta_1)^{-\frac n2}. \tag{3.1}$$
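The scaling $\int_0^\infty \sigma^{-n-1} e^{-S/(2\sigma^2)}\, d\sigma \propto S^{-n/2}$ can be verified numerically. The sketch below approximates the integral by a trapezoid rule on a truncated grid (the choice $n=5$ and the truncation points are arbitrary) and checks the ratio at two values of $S$:

```python
import numpy as np

def marginal_integral(S, n=5):
    """Trapezoid approximation of I(S) = int_0^inf sigma^{-(n+1)} exp(-S/(2 sigma^2)) d sigma.

    The integrand vanishes rapidly at both ends, so truncating to
    [1e-3, 50] loses a negligible amount of mass for these S, n.
    """
    sig = np.linspace(1e-3, 50.0, 200_001)
    vals = sig ** (-(n + 1)) * np.exp(-S / (2 * sig ** 2))
    return np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(sig))

# I(S) is proportional to S^{-n/2}: with n = 5, I(4)/I(1) should be 4^{-5/2} = 1/32
ratio = marginal_integral(4.0) / marginal_integral(1.0)
```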
Since $S(\beta_0,\beta_1)$ in practical problems can be extremely large, raising it to the power $-\frac n2$ causes numerical issues; to handle this, we rewrite (3.1) as
$$f_{\beta_0,\beta_1\mid\text{data}}(\beta_0,\beta_1) \propto \Big(\frac{S(\hat\beta_0,\hat\beta_1)}{S(\beta_0,\beta_1)}\Big)^{\frac n2}\, 1\{-C<\beta_0,\beta_1<C\}. \tag{3.2}$$
The density will be concentrated around values of $(\beta_0,\beta_1)$ such that $S(\beta_0,\beta_1)$ is close to $S(\hat\beta_0,\hat\beta_1)$.
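A sketch of evaluating this ratio form of the posterior on a grid follows; the simulated data, grid ranges, and the use of the grid minimum of $S$ in place of $S(\hat\beta_0,\hat\beta_1)$ are all our illustrative choices:

```python
import numpy as np

# Simulated data (made-up parameters: beta0 = 2, beta1 = 0.5, sigma = 1)
rng = np.random.default_rng(0)
n = 30
x = np.arange(1, n + 1)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)

# Grid over (beta0, beta1); ranges chosen to cover the truth
b0_grid = np.linspace(0.0, 4.0, 80)
b1_grid = np.linspace(0.3, 0.7, 80)
B0, B1 = np.meshgrid(b0_grid, b1_grid, indexing="ij")

# S(beta0, beta1) at every grid point via broadcasting: shape (80, 80, 30) -> (80, 80)
resid = y - B0[..., None] - B1[..., None] * x
S = np.sum(resid ** 2, axis=-1)

# Ratio form in log space: S.min() over the grid stands in for S(b0_hat, b1_hat),
# so the ratio is O(1) and the power n/2 cannot overflow
log_post = -(n / 2) * np.log(S / S.min())
post = np.exp(log_post)
post /= post.sum()  # normalize over the grid
```

Note the design choice: the computation stays in log space until the final exponentiation, which is exactly what makes the ratio rewrite (3.2) numerically safe when $n$ is large.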