Data Science #18 - The k-nearest neighbors algorithm (1951)

Update: 2024-11-25

Description

In the 18th episode, we go over the original k-nearest neighbors paper:

Fix, Evelyn; Hodges, Joseph L. (1951). Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties. USAF School of Aviation Medicine, Randolph Field, Texas.

The authors introduce a nonparametric method for classifying a new observation z as belonging to one of two distributions, F or G, without assuming specific parametric forms.

Using k-nearest neighbor density estimates, the paper implements a likelihood ratio test for classification and rigorously proves the method's consistency.
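To make the idea concrete, here is a minimal Python sketch of a likelihood ratio classifier built on k-nearest neighbor density estimates. It is an illustration under stated assumptions, not the construction from the paper: the function names `knn_density` and `classify`, the Euclidean distance, the default `k=3`, and the threshold of 1.0 are all choices made for this example.

```python
import math

def knn_density(z, sample, k):
    """k-NN density estimate at z: k / (n * volume of the ball reaching the k-th neighbor).

    Assumes Euclidean distance and that z does not coincide with its
    k-th nearest sample point (otherwise the radius is zero).
    """
    d = len(z)
    dists = sorted(math.dist(z, x) for x in sample)
    r = dists[k - 1]  # distance to the k-th nearest neighbor
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)  # volume of the unit ball in d dimensions
    return k / (len(sample) * unit_ball * r ** d)

def classify(z, sample_f, sample_g, k=3, threshold=1.0):
    """Assign z to F when the estimated density ratio f(z)/g(z) exceeds the threshold."""
    ratio = knn_density(z, sample_f, k) / knn_density(z, sample_g, k)
    return "F" if ratio > threshold else "G"

# Hypothetical toy samples: F clustered near (0, 0), G clustered near (5, 5).
sample_f = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.4), (0.1, -0.6), (0.7, 0.7)]
sample_g = [(5.0, 5.0), (5.4, 4.8), (4.6, 5.3), (5.2, 5.9), (4.3, 4.5)]

print(classify((0.2, 0.1), sample_f, sample_g))
print(classify((4.9, 5.1), sample_f, sample_g))
```

With these toy samples, the first point is assigned to F and the second to G, because each lies far closer to the k-th neighbor in its own cluster than in the other one.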

The work is a precursor to the modern k-Nearest Neighbors (KNN) algorithm and established nonparametric approaches as viable alternatives to parametric methods.

Its focus on consistency and data-driven learning influenced many modern machine learning techniques, including kernel density estimation and decision trees.

The paper's impact on data science is significant: it introduced concepts such as neighborhood-based learning and flexible discrimination.

These ideas underpin algorithms widely used today in healthcare, finance, and artificial intelligence, where robust and interpretable models are critical.

In Channel

Data Science #20 - the Rao-Cramer bound (1945)

2024-12-09 59:42

Data Science #19 - The Kullback–Leibler divergence paper (1951)

2024-12-02 52:41

Data Science #18 - The k-nearest neighbors algorithm (1951)

2024-11-25 44:01

Data Science #17 - The Monte Carlo Algorithm (1949)

2024-11-18 38:11

Data Science #16 - The First Stochastic Descent Algorithm (1952)

2024-11-07 42:20

Data Science #15 - The First Decision Tree Algorithm (1963)

2024-10-28 36:35

Data Science #14 - The original k-means algorithm paper review (1957)

2024-10-10 46:57

Data Science #13 - Kolmogorov complexity paper review (1965) - Part 2

2024-10-01 29:25

Data Science #12 - Kolmogorov complexity paper review (1965) - Part 1

2024-09-28 38:53

Data Science #11 - The original Perceptron paper by Frank Rosenblatt (1958)

2024-09-20 01:03:29

Data Science #10 - The original principal component analysis (PCA) paper by Harold Hotelling (1935)

2024-09-12 55:41

Data Science #9 - The Unreasonable Effectiveness of Mathematics in Natural Sciences, Eugene Wigner

2024-09-10 01:24:32

Data Science #8 - The Turing test by Turing Alan "Computing machinery and intelligence" Mind (1950)

2024-09-04 54:57

Data Science #7 - "The use of multiple measurements in taxonomic problems." (1936), Fisher RA

2024-08-12 47:30

Data Science #6 - "On the problem of the most efficient tests of statistical hypotheses." (1933) N&P

2024-08-07 56:32

Data Science #5 - "A Mathematical Theory of Communication" (1948), Shannon, C. E. Part - 3

2024-07-30 01:07:22

Data Science #4 - "A Mathematical Theory of Communication" (1948), Shannon, C. E. Part - 2

2024-07-21 41:11

Data Science #3 - "A Mathematical Theory of Communication" (1948), Shannon, C. E. Part - 1

2024-07-16 41:04

Data Science #2 - "Application of the Logistic Function to Bio-Assays" (1944), Berkson Joseph

2024-07-07 01:01:29

Data Science #1 - Fisher RA. "On the mathematical foundations of theoretical statistics" (1922)

2024-07-07 01:16:59

Mike E