EA - Open Technical Challenges around Probabilistic Programs and Javascript by Ozzie Gooen

Update: 2023-08-26

Description

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Open Technical Challenges around Probabilistic Programs and Javascript, published by Ozzie Gooen on August 26, 2023 on The Effective Altruism Forum.
While working on Squiggle, we've encountered many technical challenges in writing probabilistic functionality with Javascript. Some of these challenges are solved in Python and must be ported over, and some apply to all languages.
We think the following tasks could be good fits for others to tackle. These are fairly isolated and could be done in contained NPM packages or similar. The solutions would be useful for Squiggle and might be handy for others in the Javascript ecosystem as well. Advice and opinions are also appreciated.
This post was quickly written, as it's for a narrow audience and might get outdated. We're happy to provide more rigor and context if requested. Let us know if you are interested in taking any of them on and could use some guidance!
For those not themselves interested in contributing, this might be useful for giving people a better idea of the sorts of challenges we at QURI work on.
1. Density Estimation
Users often want to convert samples into continuous probability distribution functions (PDFs). This is difficult to do automatically.
The standard approach of basic Kernel Density Estimation can produce poor fits on multimodal or heavily skewed data.
a. Variable kernel density estimation
Simple KDE algorithms use a constant bandwidth, and there are multiple methods for estimating it. One common method is Silverman's rule of thumb. In practice, Silverman's rule with a single bandwidth performs poorly for multimodal or heavily skewed distributions.
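To make the constant-bandwidth baseline concrete, here is a minimal TypeScript sketch of Silverman's rule of thumb feeding a Gaussian KDE. The function names (`silvermanBandwidth`, `gaussianKde`) are hypothetical and are not taken from Squiggle, and the Gaussian kernel is an assumption chosen only for illustration.

```typescript
// Sample standard deviation (n - 1 denominator).
function stdDev(xs: number[]): number {
  const n = xs.length;
  const mean = xs.reduce((a, b) => a + b, 0) / n;
  const ss = xs.reduce((a, x) => a + (x - mean) * (x - mean), 0);
  return Math.sqrt(ss / (n - 1));
}

// Linearly interpolated quantile of an ascending-sorted array, p in [0, 1].
function quantileSorted(sorted: number[], p: number): number {
  const pos = p * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Silverman's rule of thumb: h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5).
function silvermanBandwidth(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const iqr = quantileSorted(sorted, 0.75) - quantileSorted(sorted, 0.25);
  const sd = stdDev(samples);
  // Assumption: fall back to sd alone when the IQR is zero (many repeated values).
  const spread = iqr > 0 ? Math.min(sd, iqr / 1.34) : sd;
  return 0.9 * spread * Math.pow(samples.length, -0.2);
}

// Fixed-bandwidth Gaussian KDE; returns the estimated density as a function.
function gaussianKde(samples: number[], h: number): (x: number) => number {
  const norm = 1 / (samples.length * h * Math.sqrt(2 * Math.PI));
  return (x) => {
    let sum = 0;
    for (const s of samples) {
      const z = (x - s) / h;
      sum += Math.exp(-0.5 * z * z);
    }
    return norm * sum;
  };
}

// Usage: one bandwidth for the whole data set, including the far-away cluster.
const exampleSamples = [0.1, 0.2, 0.25, 0.3, 0.4, 9.8, 10.1, 10.3];
const exampleDensity = gaussianKde(exampleSamples, silvermanBandwidth(exampleSamples));
```

Because the bandwidth is computed from the spread of the whole sample, two well-separated clusters like the ones in `exampleSamples` share one wide kernel, which is the oversmoothing problem described above.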
Squiggle performs log KDE for heavily skewed distributions, but this only helps so much, and this strategy comes with various inconsistencies.
There's a set of algorithms for variable kernel density estimation, also called adaptive bandwidth choice, which seems more promising. Another option is the Sheather-Jones method, which existing Python KDE libraries use. We don't know of good Javascript implementations of these algorithms.
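One well-known member of the adaptive family is the two-stage "pilot estimate" approach: run a fixed-bandwidth KDE first, then give each sample its own bandwidth that shrinks where the pilot density is high and grows where it is low. The sketch below assumes a Gaussian kernel and the common sensitivity exponent of 0.5 (Abramson's square-root law). The names `adaptiveKde` and `pilotH` are hypothetical, and this is not a claim about how Squiggle or any particular library does it.

```typescript
const gaussian = (z: number): number =>
  Math.exp(-0.5 * z * z) / Math.sqrt(2 * Math.PI);

// Fixed-bandwidth KDE evaluated at a single point; used as the pilot estimate.
function fixedKdeAt(samples: number[], h: number, x: number): number {
  let sum = 0;
  for (const s of samples) sum += gaussian((x - s) / h);
  return sum / (samples.length * h);
}

// Adaptive KDE: per-sample bandwidth h_i = pilotH * (pilot(x_i) / g)^(-alpha),
// where g is the geometric mean of the pilot densities at the samples.
function adaptiveKde(
  samples: number[],
  pilotH: number,
  alpha = 0.5
): (x: number) => number {
  const pilot = samples.map((s) => fixedKdeAt(samples, pilotH, s));
  const logMean = pilot.reduce((a, p) => a + Math.log(p), 0) / pilot.length;
  const g = Math.exp(logMean);
  const widths = pilot.map((p) => pilotH * Math.pow(p / g, -alpha));
  return (x) => {
    let sum = 0;
    for (let i = 0; i < samples.length; i++) {
      sum += gaussian((x - samples[i]) / widths[i]) / widths[i];
    }
    return sum / samples.length;
  };
}

// Usage: the isolated sample at 10 gets a wider kernel than the tight cluster.
const exampleAdaptive = adaptiveKde([0, 0.1, 0.2, 0.3, 10], 0.5);
```

This version costs O(n²) for the pilot pass alone, so a practical package would likely need binning or a tree structure on top of it.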
b. Performant KDE with non-triangle shapes
Squiggle currently uses a triangle kernel for speed. Fast algorithms (such as FFT-based ones) should be possible with better kernel shapes.
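As a rough illustration of where an FFT would fit, the sketch below implements a binned KDE: samples are linearly binned onto a regular grid, and the bin weights are then convolved with a discretized kernel. The convolution here is done directly; that loop is the step an FFT-based version would replace. All names (`binnedKde`, `Kernel`, `gridSize`, `support`) are hypothetical, and the Epanechnikov kernel is just one example of a non-triangle shape.

```typescript
type Kernel = (z: number) => number;

const triangle: Kernel = (z) => Math.max(0, 1 - Math.abs(z));
const epanechnikov: Kernel = (z) => (Math.abs(z) <= 1 ? 0.75 * (1 - z * z) : 0);

function zeros(n: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < n; i++) out.push(0);
  return out;
}

// `support` is the kernel's half-width in units of h (1 for the kernels above).
function binnedKde(
  samples: number[],
  h: number,
  kernel: Kernel,
  gridSize = 256,
  support = 1
): { xs: number[]; ys: number[] } {
  let lowest = Infinity;
  let highest = -Infinity;
  for (const s of samples) {
    lowest = Math.min(lowest, s);
    highest = Math.max(highest, s);
  }
  const min = lowest - support * h;
  const max = highest + support * h;
  const dx = (max - min) / (gridSize - 1);

  // Linear binning: split each sample's weight between its two nearest grid points.
  const counts = zeros(gridSize);
  for (const s of samples) {
    const pos = (s - min) / dx;
    const lo = Math.min(Math.floor(pos), gridSize - 2);
    const frac = pos - lo;
    counts[lo] += 1 - frac;
    counts[lo + 1] += frac;
  }

  // Kernel sampled on the same grid spacing.
  const reach = Math.ceil((support * h) / dx);
  const weights: number[] = [];
  for (let k = -reach; k <= reach; k++) weights.push(kernel((k * dx) / h) / h);

  // Direct convolution of bin weights with the kernel (the FFT-replaceable step).
  const ys = zeros(gridSize);
  for (let i = 0; i < gridSize; i++) {
    if (counts[i] === 0) continue;
    for (let k = -reach; k <= reach; k++) {
      const j = i + k;
      if (j >= 0 && j < gridSize) ys[j] += counts[i] * weights[k + reach];
    }
  }
  return {
    xs: ys.map((_, i) => min + i * dx),
    ys: ys.map((y) => y / samples.length),
  };
}
```

After binning, the cost no longer depends on the kernel's shape, only on the grid size and the kernel's reach, which is why swapping `triangle` for `epanechnikov` (or a truncated Gaussian with a larger `support`) is cheap in this layout.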
See this thread for some more discussion.
c. Cutoff Heuristics
One frequent edge case is that many distributions have hard limits, often at 0. There might be useful heuristics like, "If there are no samples below zero, then it's very likely there should be zero probability mass below zero, even if many samples are close and the used bandwidth would imply otherwise."
See this issue for more information.
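A minimal sketch of the zero-cutoff heuristic quoted above, paired with one standard boundary correction (reflection about the boundary). The detection rule, its `reach` threshold, and the function names are all assumptions made for illustration; the linked issue may well settle on something different.

```typescript
type Pdf = (x: number) => number;

// Heuristic: treat 0 as a hard lower bound when no sample is negative and the
// smallest sample is close enough to 0 that a kernel of bandwidth h would leak
// mass below it. `reach` (in bandwidths) is an arbitrary illustrative threshold.
function shouldCutAtZero(samples: number[], h: number, reach = 3): boolean {
  let min = Infinity;
  for (const s of samples) min = Math.min(min, s);
  return min >= 0 && min < reach * h;
}

// Reflection correction: fold the mass the unbounded estimate puts below 0
// back onto the positive side, so the total probability mass is preserved.
function reflectAtZero(pdf: Pdf): Pdf {
  return (x) => (x < 0 ? 0 : pdf(x) + pdf(-x));
}
```

Reflection keeps the estimate normalized without a separate renormalization step, but it forces the density's slope to zero at the boundary, which may not suit every distribution.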
d. Discrete vs. continuous estimation
Sometimes, users pass in samples from discrete distributions or mixtures of discrete and continuous distributions. In these cases, it's helpful to have heuristics to detect which data is meant to be discrete and which is meant to be continuous. Right now, Squiggle uses a simple repetition heuristic: if multiple samples are precisely the same, we assume they represent discrete information. It's unclear whether there are better ways of doing this heuristically.
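The repetition heuristic described above can be sketched as follows. This is an illustrative reconstruction from the description, not Squiggle's actual code; the name `splitByRepetition` and the `minRepeats` threshold are hypothetical.

```typescript
interface DiscretePoint {
  value: number;
  count: number;
}

interface SplitSamples {
  discrete: DiscretePoint[]; // values that repeated at least minRepeats times
  continuous: number[]; // everything else, to be passed on to KDE
}

// Sort the samples, then walk runs of exactly equal values.
function splitByRepetition(samples: number[], minRepeats = 2): SplitSamples {
  const sorted = [...samples].sort((a, b) => a - b);
  const discrete: DiscretePoint[] = [];
  const continuous: number[] = [];
  let i = 0;
  while (i < sorted.length) {
    let j = i;
    while (j < sorted.length && sorted[j] === sorted[i]) j++;
    const count = j - i;
    if (count >= minRepeats) {
      discrete.push({ value: sorted[i], count });
    } else {
      for (let k = 0; k < count; k++) continuous.push(sorted[i]);
    }
    i = j;
  }
  return { discrete, continuous };
}
```

Exact equality is fragile: a continuous sampler with low floating-point resolution can produce accidental repeats, and a discrete value that happens to be drawn only once is treated as continuous. Raising `minRepeats` trades one failure mode for the other.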
e. Multidimensional KDE
Eventually, it will be useful to do multidimensional KDE. It might be more effective to do this in WebAssembly, but this would, of course, introduce complications.
2. Quantiles to Distributions, Maybe with Metalog
A frequent use case is: "I have a few quantile/CDF points in mind and want to fit this to a distribution. How should I do this?"
One option is to use the Metalog distribution. There's no great existing Javascript implementation of Metalog yet. Sam Nolan made one attempt, but it's not as flexible as we'd like: it fails to convert many points into metalog distributions.
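For a sense of what such a fit involves, here is a minimal sketch of the simplest case: an unbounded 3-term metalog pinned to exactly three quantile points. The metalog quantile function is linear in its coefficients, Q(p) = a1 + a2·ln(p/(1−p)) + a3·(p − 0.5)·ln(p/(1−p)), so three points give a 3×3 linear system. The names (`fitMetalog3`, `QuantilePoint`) are hypothetical, and the sketch does not check feasibility: the solved coefficients may describe a non-monotone "quantile function", which is one way a fit can fail.

```typescript
interface QuantilePoint {
  p: number; // cumulative probability, strictly between 0 and 1
  x: number; // value at that probability
}

// Basis terms of the 3-term unbounded metalog quantile function.
function metalogBasis(p: number): [number, number, number] {
  const logit = Math.log(p / (1 - p));
  return [1, logit, (p - 0.5) * logit];
}

// Gaussian elimination with partial pivoting for a small dense system.
function solveLinear(a: number[][], b: number[]): number[] {
  const n = b.length;
  const m = a.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
    }
    if (Math.abs(m[pivot][col]) < 1e-12) {
      throw new Error("Singular system: are the probabilities distinct?");
    }
    const tmp = m[col];
    m[col] = m[pivot];
    m[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const f = m[r][col] / m[col][col];
      for (let c = col; c <= n; c++) m[r][c] -= f * m[col][c];
    }
  }
  const x = new Array<number>(n);
  for (let r = n - 1; r >= 0; r--) {
    let s = m[r][n];
    for (let c = r + 1; c < n; c++) s -= m[r][c] * x[c];
    x[r] = s / m[r][r];
  }
  return x;
}

// Returns the fitted quantile function Q(p). No feasibility check is done.
function fitMetalog3(
  points: [QuantilePoint, QuantilePoint, QuantilePoint]
): (p: number) => number {
  const coeffs = solveLinear(
    points.map((pt) => metalogBasis(pt.p)),
    points.map((pt) => pt.x)
  );
  return (p) => {
    const [b0, b1, b2] = metalogBasis(p);
    return coeffs[0] * b0 + coeffs[1] * b1 + coeffs[2] * b2;
  };
}

// Usage: a right-skewed belief with a median of 2 and an 80% interval of [1, 4].
const exampleQuantile = fitMetalog3([
  { p: 0.1, x: 1 },
  { p: 0.5, x: 2 },
  { p: 0.9, x: 4 },
]);
```

A more flexible package would need more terms, least-squares fitting for more than three points, bounded variants, and a feasibility check with a sensible fallback.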
Jonas Moss thinks we can do better than...
