1 Odds and Odds ratio
For a binomial distribution $X \sim Bin(N,p)$, let $Y = \frac{X}{N}$. Then
$\mu = E(Y) = \frac{E(X)}{N} = p = -\frac{c'(p)}{b'(p)} = -\frac{-N/(1-p)}{N/(p(1-p))}$
$g(p) = \mathrm{logit}(p) = \log\left(\frac{p}{1-p}\right)$ ----------- logit link function
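As a quick sanity check on $E(Y) = p$, a minimal NumPy simulation; the values of `N`, `p`, and the number of replicates are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 50, 0.3                        # illustrative values, not from the text
X = rng.binomial(N, p, size=100_000)  # replicates of X ~ Bin(N, p)
Y = X / N                             # Y = X / N

print(Y.mean())             # ~= 0.3, matching mu = E(Y) = p
print(np.log(p / (1 - p)))  # logit(p), the link applied to this mean
```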
Link functions for binary data
- Logit link: $h(p) = \log\left(\frac{p}{1-p}\right)$
- Probit link: $h(p) = \Phi^{-1}(p)$, where $\Phi$ is the c.d.f. of $N(0,1)$
- Log-log link: $h(p) = -\log(-\log(p))$
- Complementary log-log link: $h(p) = -\log(-\log(1-p))$
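A minimal sketch of these four links in Python, using `scipy.stats.norm.ppf` for $\Phi^{-1}$; the function names here are my own, not a standard API:

```python
import numpy as np
from scipy.stats import norm

def logit(p):
    """Logit link: log(p / (1 - p))."""
    return np.log(p / (1 - p))

def probit(p):
    """Probit link: inverse c.d.f. of N(0, 1)."""
    return norm.ppf(p)

def loglog(p):
    """Log-log link: -log(-log(p))."""
    return -np.log(-np.log(p))

def cloglog(p):
    """Complementary log-log link: -log(-log(1 - p))."""
    return -np.log(-np.log(1 - p))

p = np.array([0.1, 0.5, 0.9])
for h in (logit, probit, loglog, cloglog):
    print(h.__name__, h(p))
```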
Odds Definition
$Odds = \frac{p}{1-p}$, where $p$ is the probability of the outcome of interest; equivalently, $p = \frac{Odds}{1+Odds}$.
In logistic regression,
log odds: $\log(Odds) = \log\left(\frac{p}{1-p}\right) = x^\mathsf{T}\beta$
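To make the odds–probability relationship concrete, a small sketch (the values of `beta` and `x` are made up for illustration):

```python
import numpy as np

p = 0.8
odds = p / (1 - p)          # Odds = p / (1 - p) = 4.0
p_back = odds / (1 + odds)  # recover p = Odds / (1 + Odds)

# In logistic regression the linear predictor x'beta is the log odds;
# inverting the logit gives the fitted probability.
beta = np.array([-1.0, 0.5])
x = np.array([1.0, 2.0])             # first entry is the intercept term
log_odds = x @ beta                  # log(p / (1 - p)) = x'beta = 0.0
p_hat = 1 / (1 + np.exp(-log_odds))  # inverse logit -> 0.5
print(odds, p_back, log_odds, p_hat)
```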
log odds ratio $\beta$: when comparing two levels of a factor,
$\beta = \log\left(\frac{p_1}{1-p_1}\right) - \log\left(\frac{p_2}{1-p_2}\right) = \log\left(\frac{Odds_1}{Odds_2}\right)$
odds ratio $\exp(\beta)$: since $Odds_1 = \exp(\beta)\,Odds_2$, we also call $\exp(\beta)$ the odds multiplier.
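As an illustration of $\exp(\beta)$ acting as an odds multiplier, with hypothetical probabilities for two groups:

```python
import numpy as np

p1, p2 = 0.6, 0.3             # hypothetical probabilities for two groups
odds1 = p1 / (1 - p1)
odds2 = p2 / (1 - p2)

beta = np.log(odds1 / odds2)  # log odds ratio
print(np.exp(beta))           # odds ratio: odds1 = exp(beta) * odds2
print(np.isclose(odds1, np.exp(beta) * odds2))  # True
```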
2 Is it a good fit?
For a GLM in general: the deviance $D$.
For logistic regression (binomial models):
- Deviance residuals
  deviance residual:
  $d_k = \mathrm{sign}(y_k - n_k\hat{p}_k) \times \left[2\left[y_k \log\left(\frac{y_k}{n_k \hat{p}_k}\right) + (n_k - y_k)\log\left(\frac{n_k - y_k}{n_k - n_k\hat{p}_k}\right)\right]\right]^{\frac{1}{2}}$
  standardised deviance residual:
  $r_{D_k} = \frac{d_k}{\sqrt{1 - h_k}}$
  where $h_k$ is the leverage from the hat matrix. Tip: these residuals are not informative if the response is binary or $n_k$ is small for most covariate patterns, so they are not useful when the outcome variable is binary and the predictor is continuous. (A sketch computing these residuals follows this list.)
- Pearson's chi-squared statistic
  $\chi^2 = \sum^{n}_{i=1} \frac{(y_i - n_i \hat{p}_i)^2}{n_i \hat{p}_i (1 - \hat{p}_i)}, \quad i = 1, \dots, n$
- Pearson residuals
  Pearson (chi-squared) residual:
  $X_k = \frac{y_k - n_k\hat{p}_k}{\sqrt{n_k \hat{p}_k (1 - \hat{p}_k)}}$
  standardised Pearson residual:
  $r_{P_k} = \frac{X_k}{\sqrt{1 - h_k}}$
  where $h_k$ is the leverage from the hat matrix.
- Likelihood ratio chi-squared statistic
  $C = 2\left[l(\hat{p}; y) - l(\tilde{p}; y)\right]$, where $\tilde{p} = \frac{\sum y_i}{\sum n_i}$ is the pooled estimate under the minimal model and $\hat{p}$ is the MLE under the fitted model.
- AIC
  $AIC = -2l(\hat{p}; y) + 2p$, where $p$ is the number of parameters; smaller is better.
- BIC
  $BIC = -2l(\hat{p}; y) + p \times \log(\text{number of observations})$; smaller is better.
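A minimal NumPy sketch of the deviance and Pearson residuals above for grouped binomial data; the counts `y`, group sizes `n`, fitted probabilities `p_hat`, and leverages `h` are made-up inputs, which in practice would come from a fitted model:

```python
import numpy as np

# Made-up grouped binomial data and fitted values for illustration.
y = np.array([3, 7, 12, 18])                 # successes per covariate pattern
n = np.array([10, 15, 20, 25])               # trials per covariate pattern
p_hat = np.array([0.25, 0.45, 0.60, 0.70])   # fitted probabilities
h = np.array([0.30, 0.20, 0.25, 0.25])       # leverages from the hat matrix

mu = n * p_hat
# Deviance residual d_k (assumes 0 < y_k < n_k so both logs are defined)
d = np.sign(y - mu) * np.sqrt(
    2 * (y * np.log(y / mu) + (n - y) * np.log((n - y) / (n - mu)))
)
# Pearson residual X_k
X = (y - mu) / np.sqrt(n * p_hat * (1 - p_hat))

# Standardised versions divide by sqrt(1 - h_k)
r_D = d / np.sqrt(1 - h)
r_P = X / np.sqrt(1 - h)
print(r_D, r_P)
```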
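And a sketch of the overall fit statistics, reusing the same hypothetical data; the parameter count `n_params` is arbitrary, and the "number of observations" in BIC is taken here as the number of covariate patterns:

```python
import numpy as np
from scipy.stats import binom

y = np.array([3, 7, 12, 18])                 # same made-up data as above
n = np.array([10, 15, 20, 25])
p_hat = np.array([0.25, 0.45, 0.60, 0.70])

def loglik(p, y, n):
    """Binomial log-likelihood of success probabilities p for counts y of n."""
    return binom.logpmf(y, n, p).sum()

# Pearson chi-squared statistic
chi2 = ((y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))).sum()

# Likelihood ratio statistic C: fitted model vs. the pooled estimate p~
p_tilde = y.sum() / n.sum()
C = 2 * (loglik(p_hat, y, n) - loglik(p_tilde, y, n))

# AIC and BIC; here p in the formulas means the number of parameters,
# represented as n_params to avoid clashing with the probabilities
n_params = 2
AIC = -2 * loglik(p_hat, y, n) + 2 * n_params
BIC = -2 * loglik(p_hat, y, n) + n_params * np.log(len(y))
print(chi2, C, AIC, BIC)
```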