“Resampling Conserves Redundancy & Mediation (Approximately) Under the Jensen-Shannon Divergence” by David Lorell

Update: 2025-10-31

Description

Audio note: this article contains 86 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.

Around two months ago, John and I published Resampling Conserves Redundancy (Approximately). Fortunately, about two weeks ago, Jeremy Gillen and Alfred Harwood showed us that we were wrong.

This proof achieves, using the Jensen-Shannon divergence ("JS"), what the previous one failed to show using KL divergence ("$D_{KL}$"). In fact, while the previous attempt tried to show only that redundancy is conserved (in terms of $D_{KL}$) upon resampling latents, this proof shows that the redundancy and mediation conditions are conserved (in terms of JS).

Why Jensen-Shannon?

In just about all of our previous work, we have used $D_{KL}$ as our factorization error, that is, the error meant to capture the extent to which a given distribution fails to factor according to some graphical structure. In this post I use the Jensen-Shannon divergence instead.

$D_{KL}(U||V) := \mathbb{E}_{U}\ln\frac{U}{V}$

$JS(U||V) := \frac{1}{2}D_{KL}\left(U||\frac{U+V}{2}\right) + \frac{1}{2}D_{KL}\left(V||\frac{U+V}{2}\right)$
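
As a quick numerical check of the two definitions above, here is a minimal sketch for discrete distributions represented as lists of probabilities. It is not taken from the original post: the function names `kl` and `js` and the example distributions `u` and `v` are illustrative assumptions. It shows one practical difference between the two quantities: KL is infinite whenever the second distribution assigns zero probability to an outcome the first one allows, while JS stays finite because each distribution is compared against the mixture $\frac{U+V}{2}$.

```python
import math

def kl(u, v):
    """D_KL(U||V) = E_U[ln(U/V)] for discrete distributions (natural log)."""
    total = 0.0
    for p, q in zip(u, v):
        if p > 0:
            if q == 0:
                # U puts mass where V has none, so the divergence is infinite.
                return math.inf
            total += p * math.log(p / q)
    return total

def js(u, v):
    """JS(U||V): average KL of U and of V to their equal-weight mixture."""
    m = [(p + q) / 2 for p, q in zip(u, v)]
    return 0.5 * kl(u, m) + 0.5 * kl(v, m)

# Hypothetical example distributions with only partially overlapping support.
u = [0.5, 0.5, 0.0]
v = [0.0, 0.5, 0.5]

print(kl(u, v))  # inf
print(js(u, v))  # 0.5 * ln 2, roughly 0.3466
```

Swapping the arguments leaves `js` unchanged, whereas `kl` is not symmetric in general.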

The KL divergence is a pretty fundamental quantity in information theory, and is used all over the place. (JS is usually defined in terms of $D_{KL}$, as above.) We [...]

---

Outline:

(01:04) Why Jensen-Shannon?

(03:04) Definitions

(05:33) Theorem

(06:29) Proof

(06:32) (1) $\epsilon^{\Gamma}_1 = 0$

(06:37) Proof of (1)

(06:52) (2) $\epsilon^{\Gamma}_2 \leq (2\sqrt{\epsilon_1}+\sqrt{\epsilon_2})^2$

(06:57) Lemma 1: $JS(S||R) \leq \epsilon_1$

(07:10) Lemma 2: $\delta(Q,R) \leq \sqrt{\epsilon_1} + \sqrt{\epsilon_2}$

(07:20) Proof of (2)

(07:32) (3) $\epsilon^{\Gamma}_{med} \leq (2\sqrt{\epsilon_1} + \sqrt{\epsilon_{med}})^2$

(07:37) Proof of (3)

(07:48) Results

(08:33) Bonus

The original text contained 1 footnote which was omitted from this narration.

---


First published:

October 31st, 2025



Source:

https://www.lesswrong.com/posts/JXsZRDcRX2eoWnSxo/resampling-conserves-redundancy-and-mediation-approximately


---


Narrated by TYPE III AUDIO.
