The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Update: 2025-12-24
Description

🤗 Upvotes: 53 | cs.CV



Authors:

Weichen Fan, Haiwen Diao, Quan Wang, Dahua Lin, Ziwei Liu



Title:

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding



Arxiv:

http://arxiv.org/abs/2512.19693v1



Abstract:

Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and pixel encoders. Interestingly, our study uncovers a highly inspiring and rarely explored correspondence between an encoder's feature spectrum and its functional role: semantic encoders primarily capture low-frequency components that encode abstract meaning, whereas pixel encoders additionally retain high-frequency information that conveys fine-grained detail. This heuristic finding offers a unifying perspective that ties encoder behavior to its underlying spectral structure. We define it as the Prism Hypothesis, where each data modality can be viewed as a projection of the natural world onto a shared feature spectrum, just like the prism. Building on this insight, we propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details via an innovative frequency-band modulator, enabling their seamless coexistence. Extensive experiments on ImageNet and MS-COCO benchmarks validate that our UAE effectively unifies semantic abstraction and pixel-level fidelity into a single latent space with state-of-the-art performance.
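To make the low-frequency versus high-frequency distinction in the abstract concrete, here is a minimal sketch that splits a toy 1D signal into two frequency bands with a naive discrete Fourier transform. It is not the paper's UAE model or its frequency-band modulator. The helper names (`dft`, `idft`, `split_bands`), the cutoff value, and the synthetic signal are all assumptions made for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustrative helper)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(signal, cutoff):
    """Split a real signal into a low band (frequency index <= cutoff)
    and a high band (everything else). Hypothetical helper, not from the paper."""
    n = len(signal)
    spectrum = dft(signal)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(spectrum):
        # Bins k and n - k hold the same physical frequency (positive/negative),
        # so fold the index to keep the mask symmetric and the output real.
        freq = min(k, n - k)
        if freq <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0)
        else:
            low_spec.append(0)
            high_spec.append(coeff)
    low = [x.real for x in idft(low_spec)]
    high = [x.real for x in idft(high_spec)]
    return low, high


# Toy signal: a slow wave (coarse structure) plus a small fast wave (fine detail).
n = 32
coarse = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
detail = [0.2 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
signal = [c + d for c, d in zip(coarse, detail)]

low, high = split_bands(signal, cutoff=4)
```

In this toy setup the low band recovers the slow wave and the high band recovers the fast one, and the two bands sum back to the original signal. That is the sense in which a representation keeping only low frequencies retains coarse structure while discarding fine detail.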


Jingwen Liang, Gengyu Wang