1.58-bit FLUX

Updated: 2024-12-31

Description

🤗 Upvotes: 24 | cs.CV, cs.AI, cs.LG



Authors:

Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen



Title:

1.58-bit FLUX



arXiv:

http://arxiv.org/abs/2412.18653v1



Abstract:

We present 1.58-bit FLUX, the first successful approach to quantizing the state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance for generating 1024 x 1024 images. Notably, our quantization method operates without access to image data, relying solely on self-supervision from the FLUX.1-dev model. Additionally, we develop a custom kernel optimized for 1.58-bit operations, achieving a 7.7x reduction in model storage, a 5.1x reduction in inference memory, and improved inference latency. Extensive evaluations on the GenEval and T2I-CompBench benchmarks demonstrate the effectiveness of 1.58-bit FLUX in maintaining generation quality while significantly enhancing computational efficiency.
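
A note on the name: a ternary weight carries log2(3) ≈ 1.585 bits of information, hence "1.58-bit". The abstract does not spell out the quantization rule, so the sketch below illustrates generic ternary weight quantization with an absmean scale (the scheme popularized by BitNet b1.58); the function names, the per-tensor scale, and the rounding rule are assumptions for illustration, not the authors' method.

```python
import torch

def quantize_ternary(w: torch.Tensor, eps: float = 1e-8):
    """Quantize a weight tensor to codes in {-1, 0, +1} plus a scale.

    Minimal sketch of absmean ternary quantization (BitNet b1.58 style);
    an illustrative assumption, not the 1.58-bit FLUX recipe.
    """
    scale = w.abs().mean().clamp(min=eps)   # per-tensor absmean scale
    q = (w / scale).round().clamp(-1, 1)    # ternary codes in {-1, 0, +1}
    return q, scale

def dequantize_ternary(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Reconstruct an approximate full-precision weight from codes and scale."""
    return q * scale

if __name__ == "__main__":
    w = torch.randn(4, 4)
    q, s = quantize_ternary(w)
    w_hat = dequantize_ternary(q, s)
    print(q)                                # entries are -1.0, 0.0, or 1.0
    print((w - w_hat).abs().mean())         # mean quantization error
```

With three-level codes packable at roughly 2 bits each plus one scale per tensor, a scheme of this kind is broadly consistent with the 7.7x storage reduction the abstract reports relative to 16-bit weights; layers left unquantized would account for the gap from a full 8x.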
