BERT-Sort: How to use language models to semantically order categorical values

Update: 2022-11-24

Description

Today, Ankush Garg talks to Mehdi Bahrami about his recent project, BERT-Sort.

BERT-Sort shows how large language models can add useful context to tabular datasets and to AutoML systems, in this case by semantically ordering categorical values.

Mehdi is a Member of Research Staff at Fujitsu. As he describes it, he began using AutoML systems for his research but ran into some crucial limitations of existing solutions. The modifications he made highlight a promising future for the relationship between language models and AutoML, a direction we will continue to explore on the show.
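
To make the idea of semantically ordering categorical values concrete, here is a minimal sketch of why the order matters when encoding an ordinal feature. The category values and the hand-written order are hypothetical and chosen only for illustration; this is not the BERT-Sort method, which, as its paper title indicates, uses a zero-shot masked language model to produce the ordering.

```python
# Illustrative only: hypothetical category values, not taken from the BERT-Sort paper.
values = ["low", "medium", "high", "very high"]

# A common default: assign integer codes in alphabetical order.
alphabetical = {v: i for i, v in enumerate(sorted(values))}

# A semantic order, written by hand here (assumption). A language model
# would be used to infer an order like this instead of a human.
semantic_order = ["low", "medium", "high", "very high"]
semantic = {v: i for i, v in enumerate(semantic_order)}


def encode(column, mapping):
    """Map each categorical value in a column to its integer code."""
    return [mapping[v] for v in column]


column = ["medium", "low", "very high", "high"]
print(encode(column, alphabetical))  # [2, 1, 3, 0]: "high" sorts below "low"
print(encode(column, semantic))      # [1, 0, 3, 2]: codes follow the meaning
```

With the alphabetical mapping, "high" receives a smaller code than "low", so a model that treats the codes as ordered sees a misleading ranking. The semantic mapping keeps the codes aligned with the meaning of the values.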

References:
BERT-Sort: A Zero-shot MLM Semantic Encoder on Ordinal Features for AutoML: https://proceedings.mlr.press/v188/bahrami22a.html

PyTorrent: A Python Library Corpus for Large-scale Language Models: https://arxiv.org/abs/2110.01710

AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models: https://arxiv.org/abs/2110.08512
