Scale Cast – A podcast about big data, distributed systems, and scalability

An Introduction to ZooKeeper Video

In 2006 we were building distributed applications that needed a master, aka coordinator, aka controller, to manage the subprocesses of the applications. It was a scenario that we had encountered before and something that we saw repeated over and over again inside and outside of Yahoo!. For example, we have an application that consists […]

04-26

More Optimal Bloom Filters

The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with […]

04-18

An Overview of High Performance Computing and Challenges for the Future

In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. A new generation of software libraries and algorithms is needed for the effective and reliable use […]

04-08

Disk-Based Parallel Computation, Rubik’s Cube, and Checkpointing

This talk takes us on a journey through three varied, but interconnected topics. First, our research lab has engaged in a series of disk-based computations extending over five years. Disks have traditionally been used for filesystems, for virtual memory, and for databases. Disk-based computation opens up an important fourth use: an abstraction for multiple disks […]

03-29

Lecture 1: Cluster Computing and MapReduce

Lecture 1 in a five-part series introducing MapReduce and cluster computing. See http://code.google.com/edu/… for slides and other resources.

01-03
