Model components Generalized linear model




1 model components

1.1 probability distribution
1.2 linear predictor
1.3 link function





model components

The GLM consists of three elements:

1. A probability distribution from the exponential family.
2. A linear predictor η = Xβ.
3. A link function g such that E(y) = μ = g⁻¹(η).

probability distribution

The overdispersed exponential family of distributions is a generalization of the exponential family and of the exponential dispersion model of distributions. It includes those probability distributions, parameterized by $\boldsymbol{\theta}$ and $\tau$, whose density functions $f$ (or probability mass functions, in the case of a discrete distribution) can be expressed in the form








$$f_y(\mathbf{y} \mid \boldsymbol{\theta}, \tau) = h(\mathbf{y}, \tau) \exp\left(\frac{\mathbf{b}(\boldsymbol{\theta})^{\mathsf{T}} \mathbf{t}(y) - a(\boldsymbol{\theta})}{d(\tau)}\right).$$







The dispersion parameter $\tau$ is typically known and is usually related to the variance of the distribution. The functions $h(\mathbf{y}, \tau)$, $\mathbf{b}(\boldsymbol{\theta})$, $\mathbf{t}(y)$, $a(\boldsymbol{\theta})$, and $d(\tau)$ are known. Many common distributions are in this family, including the normal, exponential, gamma, Poisson, Bernoulli, and (for a fixed number of trials) binomial, multinomial, and negative binomial.
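As an illustration of how a familiar distribution fits this form, the sketch below writes the Poisson probability mass function both in its textbook form and in the exponential-dispersion form. It assumes the standard choices h(y, τ) = 1/y!, b(θ) = θ, t(y) = y, a(θ) = exp(θ) and d(τ) = 1; the function names are made up for this example.

```python
import math


def poisson_pmf_direct(y: int, lam: float) -> float:
    """Textbook Poisson pmf: lam**y * exp(-lam) / y!."""
    return lam ** y * math.exp(-lam) / math.factorial(y)


def poisson_pmf_expfam(y: int, theta: float) -> float:
    """Same pmf written as h(y, tau) * exp((b(theta) * t(y) - a(theta)) / d(tau))."""
    h = 1.0 / math.factorial(y)  # h(y, tau) = 1 / y!
    b = theta                    # b(theta) = theta, i.e. canonical form
    t = y                        # t(y) = y
    a = math.exp(theta)          # a(theta) = exp(theta)
    d = 1.0                      # d(tau) = 1, no extra dispersion
    return h * math.exp((b * t - a) / d)


lam = 3.5
theta = math.log(lam)  # canonical parameter of the Poisson

# Largest disagreement between the two forms over y = 0..19.
max_gap = max(
    abs(poisson_pmf_direct(y, lam) - poisson_pmf_expfam(y, theta))
    for y in range(20)
)
print(max_gap)
```

The two forms should agree up to floating-point rounding, which is the sense in which the Poisson distribution belongs to the family.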


For scalar $y$ and $\theta$, this reduces to

$$f_y(y \mid \theta, \tau) = h(y, \tau) \exp\left(\frac{b(\theta)\, t(y) - a(\theta)}{d(\tau)}\right).$$








The parameter $\boldsymbol{\theta}$ is related to the mean of the distribution. If $\mathbf{b}(\boldsymbol{\theta})$ is the identity function, the distribution is said to be in canonical form (or natural form). Any distribution can be converted to canonical form by rewriting $\boldsymbol{\theta}$ as $\boldsymbol{\theta}'$ and then applying the transformation $\boldsymbol{\theta} = \mathbf{b}(\boldsymbol{\theta}')$. It is always possible to express $a(\boldsymbol{\theta})$ in terms of the new parametrization, even if $\mathbf{b}(\boldsymbol{\theta}')$ is not a one-to-one function; see the comments in the page on the exponential family. If, in addition, $\mathbf{t}(y)$ is the identity and $\tau$ is known, then $\boldsymbol{\theta}$ is called the canonical parameter (or natural parameter) and is related to the mean through








$$\boldsymbol{\mu} = \operatorname{E}(\mathbf{y}) = \nabla a(\boldsymbol{\theta}).$$



For scalar $y$ and $\theta$, this reduces to

$$\mu = \operatorname{E}(y) = a'(\theta).$$



Under this scenario, the variance of the distribution can be shown to be

$$\operatorname{Var}(\mathbf{y}) = \nabla^{2} a(\boldsymbol{\theta})\, d(\tau).$$



For scalar $y$ and $\theta$, this reduces to

$$\operatorname{Var}(y) = a''(\theta)\, d(\tau).$$
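The scalar mean and variance identities can be checked numerically. The sketch below is only a sanity check, not a proof: it differentiates a(θ) by finite differences for two canonical-form examples, the Poisson with a(θ) = exp(θ) and the Bernoulli with a(θ) = log(1 + exp(θ)), both with d(τ) = 1. The helper names and step sizes are assumptions chosen for this example.

```python
import math


def derivative(f, x, eps=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def second_derivative(f, x, eps=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps ** 2


def a_poisson(theta):
    return math.exp(theta)


def a_bernoulli(theta):
    return math.log1p(math.exp(theta))


# Poisson with rate lam: mean and variance should both be close to lam.
lam = 3.5
theta_p = math.log(lam)
mean_p = derivative(a_poisson, theta_p)
var_p = second_derivative(a_poisson, theta_p)  # d(tau) = 1

# Bernoulli with success probability p: mean p, variance p * (1 - p).
p = 0.3
theta_b = math.log(p / (1 - p))
mean_b = derivative(a_bernoulli, theta_b)
var_b = second_derivative(a_bernoulli, theta_b)  # d(tau) = 1

print(mean_p, var_p, mean_b, var_b)
```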



linear predictor

The linear predictor is the quantity that incorporates the information about the independent variables into the model. The symbol η (Greek "eta") denotes a linear predictor. It is related to the expected value of the data (thus, "predictor") through the link function.


η is expressed as a linear combination (thus, "linear") of unknown parameters β. The coefficients of the linear combination are represented as the matrix of independent variables X. η can thus be expressed as







$$\eta = \mathbf{X} \boldsymbol{\beta}.$$
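A minimal sketch of this product in plain Python follows. The data and coefficients are invented for illustration, with the first column of X set to 1 to act as an intercept.

```python
def linear_predictor(X, beta):
    """Return eta = X beta, with X given as a list of rows."""
    if any(len(row) != len(beta) for row in X):
        raise ValueError("each row of X must have one entry per coefficient")
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]


# Hypothetical design matrix: intercept column plus one covariate.
X = [
    [1.0, 0.5],
    [1.0, 2.0],
    [1.0, -1.0],
]
beta = [0.2, 1.5]  # hypothetical coefficients

eta = linear_predictor(X, beta)
print(eta)  # approximately [0.95, 3.2, -1.3]
```

Each entry of eta is one observation's linear predictor; the link function then connects it to that observation's mean.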



link function

The link function provides the relationship between the linear predictor and the mean of the distribution function. There are many commonly used link functions, and their choice is informed by several considerations. There is always a well-defined canonical link function, which is derived from the exponential of the response's density function. However, in some cases it makes sense to try to match the domain of the link function to the range of the distribution function's mean, or to use a non-canonical link function for algorithmic purposes, for example in Bayesian probit regression.


When using a distribution function with a canonical parameter $\theta$, the canonical link function is the function that expresses $\theta$ in terms of $\mu$, i.e. $\theta = b(\mu)$. For the most common distributions, the mean $\mu$ is one of the parameters in the standard form of the distribution's density function, and then $b(\mu)$ is the function, as defined above, that maps the density function into its canonical form. When using the canonical link function, $b(\mu) = \theta = \mathbf{X}\boldsymbol{\beta}$, which allows $\mathbf{X}^{\mathsf{T}}\mathbf{y}$ to be a sufficient statistic for $\boldsymbol{\beta}$.
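As a worked example of reading off a canonical link, consider the Bernoulli distribution with success probability $p$. The derivation below is a sketch that takes $h(y, \tau) = 1$, $t(y) = y$, $d(\tau) = 1$ and $b(\theta) = \theta$, and uses the scalar form of the density given earlier.

```latex
\begin{aligned}
f_y(y \mid p) &= p^{y} (1 - p)^{1 - y}, \qquad y \in \{0, 1\} \\
              &= \exp\!\left( y \log\frac{p}{1 - p} + \log(1 - p) \right) \\
              &= \exp\!\left( \theta y - a(\theta) \right),
\qquad \theta = \log\frac{p}{1 - p}, \quad a(\theta) = \log\!\left(1 + e^{\theta}\right), \\
\mu &= a'(\theta) = \frac{e^{\theta}}{1 + e^{\theta}} = p
\quad\Longrightarrow\quad
\theta = \log\frac{\mu}{1 - \mu}.
\end{aligned}
```

The last line expresses $\theta$ in terms of $\mu$, so under these assumptions the canonical link for the Bernoulli distribution is the logit.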


The following is a table of several exponential-family distributions in common use and the data they are typically used for, along with the canonical link functions and their inverses (sometimes referred to as the mean function, as done here).
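As a rough illustration of what link and mean-function pairs look like, the sketch below collects a few well-known pairs and checks that each mean function undoes its link. The dictionary layout and names are invented for this example. The gamma entry uses the reciprocal link 1/μ, which is an assumption about convention: the strictly canonical parameter for the gamma is −1/μ, differing only in sign.

```python
import math

# Each entry maps a label to (link g, mean function g^{-1}).
LINKS = {
    "normal: identity": (lambda mu: mu, lambda eta: eta),
    "poisson: log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "bernoulli: logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    # Reciprocal convention; canonical up to sign (see the note above).
    "gamma: reciprocal": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
}

mu = 0.4  # a mean that is valid for every family listed here

# Apply the link and then the mean function; the result should be mu again.
round_trip = {
    name: mean_fn(link(mu)) for name, (link, mean_fn) in LINKS.items()
}
for name, value in round_trip.items():
    print(name, value)
```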



In the cases of the exponential and gamma distributions, the domain of the canonical link function is not the same as the permitted range of the mean. In particular, the linear predictor may be negative, which would give an impossible negative mean. When maximizing the likelihood, precautions must be taken to avoid this. An alternative is to use a noncanonical link function.
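The problem can be seen with a few numbers. The sketch below assumes the reciprocal convention μ = 1/η for the gamma mean function and compares it with the noncanonical log link, μ = exp(η). The linear predictor values are invented for illustration.

```python
import math


def mean_reciprocal_link(eta):
    """Mean function for the reciprocal link: mu = 1 / eta."""
    return 1.0 / eta


def mean_log_link(eta):
    """Mean function for the log link: mu = exp(eta)."""
    return math.exp(eta)


etas = [2.0, 0.5, -0.25]  # hypothetical linear predictor values

recip_means = [mean_reciprocal_link(e) for e in etas]
log_means = [mean_log_link(e) for e in etas]

# Linear predictors whose implied mean is not a valid (positive) gamma mean.
invalid = [e for e, m in zip(etas, recip_means) if m <= 0]

print(recip_means)  # [0.5, 2.0, -4.0]
print(invalid)      # [-0.25]
```

With the log link every real η maps to a positive mean, which is one reason it is a common alternative here.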


Note that in the case of the Bernoulli, binomial, categorical and multinomial distributions, the support of the distributions is not the same type of data as the parameter being predicted. In all of these cases, the predicted parameter is one or more probabilities, i.e. real numbers in the range $[0, 1]$. The resulting model is known as logistic regression (or multinomial logistic regression in the case that K-way rather than binary values are being predicted).
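For the binary case, the mapping from a linear predictor to a probability can be sketched as follows. This assumes the logit link, so the mean function is the logistic function. The coefficients and rows are hypothetical, and the function names are chosen only for this example.

```python
import math


def logistic(eta):
    """Inverse of the logit link, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def predict_probability(x_row, beta):
    """Probability of a 1 outcome for one observation: logistic(x . beta)."""
    eta = sum(x * b for x, b in zip(x_row, beta))
    return logistic(eta)


beta = [-1.0, 2.0]                           # hypothetical intercept and slope
rows = [[1.0, 0.0], [1.0, 0.5], [1.0, 3.0]]  # linear predictors -1, 0 and 5

probs = [predict_probability(row, beta) for row in rows]
print(probs)
```

Whatever the linear predictor is, the result lies between 0 and 1, so it is always a valid probability even though each observed outcome is 0 or 1.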


For the Bernoulli and binomial distributions, the parameter is a single probability, indicating the likelihood of occurrence of a single event. The Bernoulli still satisfies the basic condition of the generalized linear model in that, even though a single outcome will always be either 0 or 1, the expected value will nonetheless be a real-valued probability, i.e. the probability of occurrence of a "yes" (or 1) outcome. Similarly, in a binomial distribution with n trials, the expected value is np, so the expected proportion of "yes" outcomes is the probability p to be predicted.


For categorical and multinomial distributions, the parameter to be predicted is a K-vector of probabilities, with the further restriction that all probabilities must add up to 1. Each probability indicates the likelihood of occurrence of one of the K possible values. For the multinomial distribution, and for the vector form of the categorical distribution, the expected values of the elements of the vector can be related to the predicted probabilities in the same way as for the binomial and Bernoulli distributions.
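One common way to turn K linear predictors into a probability vector that sums to 1 is the softmax function, as used in multinomial logistic regression. The sketch below assumes that choice; the input values are invented for illustration.

```python
import math


def softmax(etas):
    """Map K real-valued linear predictors to K probabilities summing to 1."""
    shift = max(etas)  # subtracting the max avoids overflow in exp
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [v / total for v in exps]


etas = [0.5, -1.0, 2.0]  # hypothetical linear predictors for K = 3 categories
probs = softmax(etas)

print(probs)
print(sum(probs))
```

Each entry is the predicted probability of one category, and the largest linear predictor receives the largest probability.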







