Singular Value Decomposition Formula for Ridge Regression


Proof

[proofplan] We diagonalize the ridge normal matrix $X^\top X + \rho I_p$ using the right singular vector basis of $X$. In that basis, inversion is scalar division by $d_j^2+\rho$ on the row-space directions and by $\rho$ on the null-space directions. Since $X^\top y$ has no component in the null space of $X$, only the first $r$ singular directions contribute. Multiplying the resulting coefficient expansion by $X$ gives the fitted-value formula with shrinkage factors $d_j^2/(d_j^2+\rho)$. [/proofplan]


[step:Diagonalize the ridge normal matrix in the right singular vector basis] Since $X = UDV^\top$, orthogonality of $U$ gives \begin{align*} X^\top X &= (UDV^\top)^\top(UDV^\top) \\ &= VD^\top U^\top UDV^\top \\ &= VD^\top DV^\top. \end{align*} Define the diagonal matrix \begin{align*} A := D^\top D + \rho I_p \in \mathbb{R}^{p \times p}. \end{align*} Since $VV^\top = I_p$, \begin{align*} X^\top X + \rho I_p &= VD^\top DV^\top + \rho VV^\top \\ &= VAV^\top. \end{align*} The diagonal entries of $A$ are $d_j^2+\rho$ for $1 \le j \le r$ and $\rho$ for $r < j \le p$. Because $\rho > 0$, all diagonal entries of $A$ are positive, so $A$ is invertible. Since $V$ is orthogonal, $V^{-1} = V^\top$, and therefore \begin{align*} (X^\top X + \rho I_p)^{-1} = VA^{-1}V^\top. \end{align*} [/step]
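As a numerical sanity check of this step, the following is a minimal sketch assuming NumPy is available. The matrix sizes, the random seed, and the value of $\rho$ are illustrative choices and not part of the proof. Taking $n < p$ forces a nontrivial null space, so the diagonal entries equal to $\rho$ alone are exercised as well.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 3, 5, 0.5  # illustrative values; n < p gives X a nontrivial null space
X = rng.standard_normal((n, p))

# Full SVD: U is n x n, Vt is p x p, s holds the min(n, p) singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=True)
V = Vt.T

# Diagonal of A = D^T D + rho * I_p: d_j^2 + rho first, then rho alone.
a_diag = np.full(p, rho)
a_diag[: s.size] += s**2

# Compare the direct inverse with V A^{-1} V^T.
direct_inverse = np.linalg.inv(X.T @ X + rho * np.eye(p))
svd_inverse = V @ np.diag(1.0 / a_diag) @ V.T
inverse_matches = np.allclose(direct_inverse, svd_inverse)
print(inverse_matches)
```

The comparison uses `np.allclose` because the two sides agree only up to floating-point rounding.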


[step:Expand $X^\top y$ in the right singular vector basis] Let $\tilde{e}_j \in \mathbb{R}^n$ and $e_j \in \mathbb{R}^p$ denote the $j$-th standard basis vectors. Since $U$ is orthogonal, $U^\top y = \sum_{j=1}^n (u_j^\top y)\tilde{e}_j$. For each $1 \le j \le r$, the diagonal structure of $D$ gives $D^\top \tilde{e}_j = d_j e_j$. For $r < j \le n$, the $j$-th column of $D^\top$ is zero, so $D^\top \tilde{e}_j = 0$ and these directions contribute nothing. Hence \begin{align*} X^\top y &= VD^\top U^\top y \\ &= VD^\top\left(\sum_{j=1}^n (u_j^\top y)\tilde{e}_j\right) \\ &= V\left(\sum_{j=1}^r d_j(u_j^\top y)e_j\right) \\ &= \sum_{j=1}^r d_j(u_j^\top y)v_j. \end{align*} [/step]
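The expansion of $X^\top y$ can be checked numerically on a rank-deficient matrix. This is a sketch assuming NumPy. The rank-2 construction, the seed, and the tolerance used to count positive singular values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, r = 6, 4, 2  # illustrative sizes; r is the intended rank
# A product of n x r and r x p factors has rank at most r.
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))
y = rng.standard_normal(n)

U, s, Vt = np.linalg.svd(X, full_matrices=True)

# Count singular values that are positive up to a relative tolerance (an assumption).
rank = int(np.sum(s > 1e-10 * s[0]))

# Sum of d_j (u_j^T y) v_j over the positive singular directions only.
expansion = sum(s[j] * (U[:, j] @ y) * Vt[j] for j in range(rank))
xty_matches = np.allclose(X.T @ y, expansion)
print(rank, xty_matches)
```

Only the first `rank` terms are summed, which mirrors the claim that directions with zero singular value drop out.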


[step:Apply the inverse diagonal multiplier to obtain the coefficient formula] Using the previous two steps, \begin{align*} \hat{\beta}^{\mathrm{ridge}}(\rho) &= (X^\top X + \rho I_p)^{-1}X^\top y \\ &= VA^{-1}V^\top \left(\sum_{j=1}^r d_j(u_j^\top y)v_j\right). \end{align*} Since $V^\top v_j = e_j$ and $A^{-1}e_j = (d_j^2+\rho)^{-1}e_j$ for $1 \le j \le r$, we get \begin{align*} \hat{\beta}^{\mathrm{ridge}}(\rho) &= \sum_{j=1}^r d_j(u_j^\top y)V A^{-1}e_j \\ &= \sum_{j=1}^r \frac{d_j}{d_j^2+\rho}(u_j^\top y)Ve_j \\ &= \sum_{j=1}^r \frac{d_j}{d_j^2+\rho}(u_j^\top y)v_j. \end{align*} [/step]
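The coefficient formula can be compared against a direct solve of the ridge normal equations. This is a minimal sketch assuming NumPy; the data are random and the value of $\rho$ is arbitrary. It uses the thin SVD, which is enough here because only the first $\min(n, p)$ singular directions can carry a nonzero $d_j$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, rho = 8, 3, 1.5  # illustrative values
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Thin SVD: U is n x k, Vt is k x p, with k = min(n, p).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Direct solution of (X^T X + rho I) beta = X^T y.
beta_direct = np.linalg.solve(X.T @ X + rho * np.eye(p), X.T @ y)

# SVD form: sum over j of d_j / (d_j^2 + rho) * (u_j^T y) * v_j.
coeffs = s / (s**2 + rho) * (U.T @ y)
beta_svd = Vt.T @ coeffs

coefficients_match = np.allclose(beta_direct, beta_svd)
print(coefficients_match)
```

Any term with $d_j = 0$ has coefficient $d_j/(d_j^2+\rho) = 0$, so summing over all $\min(n, p)$ directions gives the same result as summing over $1 \le j \le r$.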


[step:Multiply by $X$ to obtain the fitted-value formula] For $1 \le j \le r$, the [singular value decomposition](/theorems/3071) gives \begin{align*} Xv_j = UDV^\top v_j = UDe_j = d_j u_j. \end{align*} Therefore \begin{align*} X\hat{\beta}^{\mathrm{ridge}}(\rho) &= X\left(\sum_{j=1}^r \frac{d_j}{d_j^2+\rho}(u_j^\top y)v_j\right) \\ &= \sum_{j=1}^r \frac{d_j}{d_j^2+\rho}(u_j^\top y)Xv_j \\ &= \sum_{j=1}^r \frac{d_j^2}{d_j^2+\rho}(u_j^\top y)u_j. \end{align*} This is the stated fitted-value expansion, and the proof is complete. [/step]
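The fitted-value expansion and its shrinkage factors can be illustrated in the same way. The sketch below assumes NumPy and uses arbitrary random data. It checks that $X\hat{\beta}^{\mathrm{ridge}}(\rho)$ agrees with the shrunken expansion in the left singular vectors, and that each factor $d_j^2/(d_j^2+\rho)$ lies strictly between 0 and 1 when $d_j > 0$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, rho = 8, 3, 1.5  # illustrative values
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Shrinkage factors d_j^2 / (d_j^2 + rho), one per singular direction.
shrinkage = s**2 / (s**2 + rho)

# Fitted values from the SVD expansion: sum of shrinkage_j * (u_j^T y) * u_j.
fitted_svd = U @ (shrinkage * (U.T @ y))

# Fitted values from the ridge coefficients computed directly.
beta = np.linalg.solve(X.T @ X + rho * np.eye(p), X.T @ y)
fitted_direct = X @ beta

fitted_match = np.allclose(fitted_direct, fitted_svd)
factors_in_unit_interval = bool(np.all((shrinkage > 0) & (shrinkage < 1)))
print(fitted_match, factors_in_unit_interval)
```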




