In joint work with Soumik Pal, we study natural mixing processes where cards (or dominoes or mahjong tiles) are 'smushed' around on a table with two hands. How long should mixing continue? If things are not well mixed, what patterns remain? We study this in practice (!): experiments indicate that about 30 seconds of smushing suffice to mix 52 cards. We also study it in theory, introducing a variety of models which permit analysis. Part of the analysis passes to a reflecting jump-diffusion limit and uses this, together with a novel 'shadow coupling', to give reasonably precise bounds on the mixing time.
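As a toy illustration of measuring mixing empirically (a generic Python sketch of a deck mixed by random adjacent transpositions, not one of the smushing models from the talk), one can track how fast the position of a single marked card approaches the uniform distribution; this gives a lower bound on the total variation distance of the full deck from uniform:

import random
from collections import Counter

def mix_step(deck):
    # One toy local mixing move: swap two adjacent cards.
    i = random.randrange(len(deck) - 1)
    deck[i], deck[i + 1] = deck[i + 1], deck[i]

def tv_from_uniform(n_cards=52, n_steps=500, n_trials=2000):
    # Estimate the total variation distance between the distribution of the
    # originally-top card's position after n_steps moves and uniform.
    counts = Counter()
    for _ in range(n_trials):
        deck = list(range(n_cards))
        for _ in range(n_steps):
            mix_step(deck)
        counts[deck.index(0)] += 1
    return 0.5 * sum(abs(counts[pos] / n_trials - 1 / n_cards)
                     for pos in range(n_cards))

for steps in (50, 500, 2000):
    print(steps, round(tv_from_uniform(n_steps=steps), 3))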
For much of its history, AI research has aimed toward building intelligent machines independently of their interactions with people. As the world of computing has evolved, and systems pervade ever more facets of life, the challenges of building computer systems smart enough to work effectively with people, in groups as well as individually, have become increasingly important. Furthermore, recent advances in AI-capable systems raise societal and ethical questions about the effects of such systems on people and societies at large. In this talk, I will argue that the ability to work with people is essential for truly intelligent behavior, identify fundamental scientific questions this teamwork requirement raises, describe research by my group on computational models of collaboration and their use in supporting health-care coordination, and briefly discuss ethical challenges AI-capable systems pose, along with approaches to those challenges.
Given n uniform points on the surface of a two-dimensional sphere, how can we partition the sphere fairly among them? "Fairly" means that each region has the same area. It turns out that if the given points apply a two-dimensional gravity force to the rest of the sphere, then the basins of attraction for the resulting gradient flow yield such a partition - with exactly equal areas, no matter how the points are distributed. (See the cover of the AMS Notices at http://www.ams.org/publications/journals/notices/201705/rnoti-cvr1.pdf.) Our main result is that this partition minimizes, up to a bounded factor, the average distance between points in the same cell. I will also present an application to almost optimal matching of n uniform blue points to n uniform red points on the sphere, connecting to a classical result of Ajtai, Komlos and Tusnady (Combinatorica 1984). Joint work with Nina Holden and Alex Zhai.
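Schematically (a hedged sketch of the setup; the paper's exact normalization may differ), each point exerts a two-dimensional gravitational pull whose potential is logarithmic in distance, and the cells are the basins of attraction of the resulting gradient flow:

% Gravitational allocation on the sphere (schematic; normalizations may differ).
% Potential generated by the n points, against a uniform background so the
% total mass balances:
U(y) = \sum_{i=1}^{n} \log d(y, x_i) - n \int_{S^2} \log d(y, x) \, \frac{dA(x)}{4\pi}
% Cells are the basins of attraction of the downward gradient flow:
\dot{y}(t) = -\nabla U(y(t)), \qquad B(x_i) = \{ y \in S^2 : y(t) \to x_i \}
% and each basin has exactly equal area:
\mathrm{area}(B(x_i)) = \frac{4\pi}{n}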
We prove that any random distribution satisfying conformal invariance and a form of domain Markov property and having a finite moment condition must be the Gaussian free field. We also present some open problems regarding what happens beyond the Gaussian free field. Ongoing joint work with Nathanael Berestycki and Ellen Powell.
A secure multiparty computation protocol allows a set of mutually distrusting parties to compute a joint function of their private inputs without leaking anything apart from the output of the functionality. Ever since the initial results on this topic, an active line of research has been to minimize the number of rounds needed for securely computing any functionality, as well as to minimize the assumptions under which the protocol can be proven secure. In this work, we give a construction of round-optimal secure multiparty computation from the minimal assumption that two-round oblivious transfer exists. I will also discuss several extensions of the result. Based on joint work with Sanjam Garg.
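To fix ideas on the round structure (a toy Python sketch of a classical discrete-log-based two-round OT in the Naor-Pinkas style, with insecurely small parameters; the talk's construction is different, since it targets the minimal assumption generically): the receiver speaks first, the sender replies, and the receiver learns exactly one of the two messages.

import hashlib, random

# Toy two-round 1-out-of-2 oblivious transfer. WARNING: tiny parameters
# and ad-hoc hashing; for illustration of the round structure only.
p = 2**61 - 1             # a Mersenne prime; group Z_p^* (not secure at this size)
g = 3                     # assumed generator-like element
h = pow(g, 123456789, p)  # CRS element with (pretend-)unknown discrete log

def H(x):  # hash a group element to a 128-bit mask
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest()[:16], 'big')

def ot1(c):
    # Round 1 (receiver -> sender): encode the choice bit c in a key pair.
    x = random.randrange(1, p - 1)
    pk = [0, 0]
    pk[c] = pow(g, x, p)
    pk[1 - c] = (h * pow(pk[c], -1, p)) % p   # forces pk[0] * pk[1] = h
    return pk, x

def ot2(pk, m0, m1):
    # Round 2 (sender -> receiver): encrypt each message under its key;
    # the receiver can know the secret exponent of at most one key.
    assert (pk[0] * pk[1]) % p == h
    out = []
    for pk_i, m in ((pk[0], m0), (pk[1], m1)):
        r = random.randrange(1, p - 1)
        out.append((pow(g, r, p), H(pow(pk_i, r, p)) ^ m))
    return out

def ot_output(cts, c, x):
    a, masked = cts[c]
    return H(pow(a, x, p)) ^ masked

pk, x = ot1(c=1)
cts = ot2(pk, m0=42, m1=99)
print(ot_output(cts, 1, x))   # 99; m0 stays hidden, and c stays hidden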
The talk is divided into two sections. In the first section, I will talk about behavioral and physiological responses associated with interest, curiosity, and novelty in image viewing. We recorded facial expressions, eye gaze and electrodermal responses of 50 participants viewing 80 images. Participants self-reported their level of interest, perceived novelty, complexity and comprehensibility of the images. We found that not everybody perceives interest similarly, and that pleasantness is the strongest indicator of interest. In the second study, we aimed at automatically recognizing a user's intent for image search in the early stage of a search session. We designed seven different search scenarios under the intent conditions of finding items, re-finding items, and entertainment. We collected the same behavioral and physiological responses, in addition to implicit user interactions, from the same participants. Participants performed seven different search tasks on a custom-built image retrieval platform. Finally, we trained machine learning models to predict users' search intentions from the visual content of the visited images, the user interactions, and the spontaneous responses. After fusing the visual and user interaction features, our system achieved an F1 score of 0.722 for classifying three classes in a user-independent cross-validation. We found that eye gaze and implicit user interactions, including mouse movements and keystrokes, are the most informative features in search intent prediction.
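A minimal sketch of the kind of fuse-and-classify pipeline described above (hypothetical feature names and a generic classifier; the talk's actual models and features may differ), illustrating early fusion and user-independent evaluation:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Hypothetical per-session features: visual content, user interactions,
# and spontaneous responses (gaze, electrodermal), concatenated ("fused").
n_sessions, n_users = 350, 50
rng = np.random.default_rng(0)
visual = rng.normal(size=(n_sessions, 128))        # e.g. image-content features
interaction = rng.normal(size=(n_sessions, 16))    # mouse / keystroke statistics
physio = rng.normal(size=(n_sessions, 8))          # gaze / EDA summaries
X = np.hstack([visual, interaction, physio])       # simple early fusion
y = rng.integers(0, 3, size=n_sessions)            # find / re-find / entertain
users = rng.integers(0, n_users, size=n_sessions)  # which user did each session

# User-independent evaluation: hold out all sessions of one user at a time.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=users,
                         cv=LeaveOneGroupOut(), scoring='f1_macro')
print(scores.mean())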
Rapid advances in sequencing technology have led to the generation of genome-scale DNA sequencing data for more than 2 million individuals worldwide. These data represent incredibly powerful information about the distribution and impact of genetic variation, but major challenges remain in aggregating and harmonizing them. In this presentation, I will describe the development of the Exome Aggregation Consortium (ExAC) and Genome Aggregation Database (gnomAD), which combined represent exome and genome sequencing data for over 135,000 individuals. I will discuss approaches to analyzing genome data at massive scale and the applications of these data to understanding human variation and gene function.
Optimal matching problems are random variational problems widely investigated in the mathematics and physics literature. We discuss here the optimal matching problem of an empirical measure on a sample of iid random variables to the common law in Kantorovich-Wasserstein distances, which is a classical topic in probability and statistics. Two-dimensional matching of uniform samples gave rise to deep results investigated from various viewpoints (combinatorial, generic chaining). We study here the case of Gaussian samples, first in dimension one on the basis of explicit representations of Kantorovich metrics and a sharp analysis of more general log-concave distributions in terms of their isoperimetric profile (joint work with S. Bobkov), and then in dimension two (and higher) following the PDE and transportation approach recently put forward by L. Ambrosio, F. Stra and D. Trevisan.
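For example, the explicit one-dimensional representation alluded to above is the standard quantile formula: in dimension one, the Kantorovich-Wasserstein distance reduces to a comparison of quantile functions,

% One-dimensional Kantorovich-Wasserstein distance via quantile functions:
W_p(\mu, \nu)^p = \int_0^1 \big| F_\mu^{-1}(t) - F_\nu^{-1}(t) \big|^p \, dt
% where F_\mu^{-1}, F_\nu^{-1} are the generalized inverses of the distribution
% functions. For the empirical measure \mu_n of an iid sample, this turns the
% matching problem into an analysis of the empirical quantile process, which
% is what makes sharp one-dimensional results possible.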
Modern data often consists of feature vectors with a large number of features. High-dimensional geometry and Linear Algebra (Singular Value Decomposition) are two of the crucial areas which form the mathematical foundations of Data Science. This mini-course covers these areas, providing intuition and rigorous proofs. Connections between Geometry and Probability will be brought out. Text Book: Foundations of Data Science.
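As a small illustration of the SVD's role in this material (a generic numpy sketch, not taken from the textbook), the best rank-k approximation of a matrix in Frobenius norm is obtained by truncating its singular value decomposition:

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))    # a data matrix: 100 points, 50 features

# Full SVD: A = U diag(s) Vt, with singular values s in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-k approximation (Eckart-Young): keep the top k singular triplets.
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The squared Frobenius error equals the sum of the discarded squared
# singular values:
err = np.linalg.norm(A - A_k, 'fro')**2
print(np.isclose(err, np.sum(s[k:]**2)))   # True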
Zachary Horvitz is a rising junior at Brown University, concentrating in computer science and anthropology. At Microsoft Research, he presents background on issues around "broken windows policing" and his work over the summer at the ACLU of Massachusetts, where he explored data and built data-visualization tools to better understand arrest data and outcomes in Massachusetts.
Rebeca was designed in 2001 for the modeling and formal verification of asynchronous and reactive systems, and is supported by a model-checking tool, Afra. It has since grown into a family of actor-based languages that enable model-driven development and provide a faithful and usable model for building distributed, asynchronous, and event-based systems. In Timed Rebeca, network and computational delays, periodic events, and required deadlines can be expressed in the model. Model checking and simulation tools are built on the formal semantics of the language and its extensions. For deadlock-freedom and schedulability analysis, efficient state-space exploration techniques are proposed that exploit the isolation of method execution in the model. I will show how these models can be used in the safety assurance and performance evaluation of different systems, such as network-on-chip architectures, sensor network applications, traffic control systems, and quadcopters. I will also show a general pattern in track-based traffic control systems, and a framework in which self-adaptive actors are used to build self-adaptive traffic control systems.
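A minimal sketch of the timed-actor idea in Python (a hypothetical toy runtime, not Rebeca syntax or the Afra tool): messages carry delivery times and deadlines, mimicking Timed Rebeca's delays, periodic events, and deadlines in spirit, and the runtime processes them in timestamp order, flagging deadline misses.

import heapq

class Scheduler:
    # Toy timed-actor runtime: a priority queue of (time, seq, actor, msg, deadline).
    def __init__(self):
        self.now, self.queue, self.seq = 0, [], 0
    def send(self, actor, msg, after=0, deadline=None):
        self.seq += 1  # tiebreaker so actors themselves are never compared
        heapq.heappush(self.queue, (self.now + after, self.seq, actor, msg, deadline))
    def run(self, horizon):
        # Process messages in timestamp order up to a time horizon.
        while self.queue and self.queue[0][0] <= horizon:
            t, _, actor, msg, deadline = heapq.heappop(self.queue)
            self.now = max(self.now, t)
            if deadline is not None and self.now > deadline:
                print(f"deadline miss: {msg!r} at time {self.now}")
                continue
            actor.handle(msg, self)

class Sensor:
    def handle(self, msg, sched):
        if msg == "sample":
            sched.send(self, "sample", after=5)  # periodic event
            sched.send(controller, "reading", after=1, deadline=sched.now + 3)

class Controller:
    def handle(self, msg, sched):
        print(f"t={sched.now}: controller got {msg!r}")

controller = Controller()
sched = Scheduler()
sched.send(Sensor(), "sample")
sched.run(horizon=20)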
The world of fashion is rapidly adopting technology on runways around the world, bringing a spotlight to innovative new designs and experiments, also known as Haute Tech Couture. However, designers and their teams face a growing number of challenges, from technical (e.g. battery placement) to social (team dynamics between designers and engineers). In this talk, I describe 2.5 years of my journey and research delving into the dynamics of building fashion technology pieces, as well as experiences coordinating Haute Tech Couture shows around the world. I also discuss the broader importance and impact of fashion technology as a vehicle for STEM, based on experiences hosting multiple month-long fashion tech workshops in Canada and China.
This talk will explore how to use social theory to understand problems of social order and their relationship to an era of Big Data. As a starting point, I take the neglected late work of theorist Norbert Elias and the concept of figurations, which I draw upon in my recent book (The Mediated Construction of Reality, with Andreas Hepp, Polity 2016), as a way of thinking better about the social world's real complexity. I will sketch the historical background that shapes what we know about the role of communications in the growth of industrial capitalism in the 19th century, in order to argue that we are in a parallel phase of major transformation today. This raises two problems on which the talk will reflect: first, what are the distinctive features of the social knowledge being generated today through Big Data processes, compared with the 19th century's rise of statistics as the primary generator of social knowledge; and second, what are the implications of the data collection processes on which Big Data is founded for the enduring value of freedom?
We propose a new class of post-quantum digital signature schemes that: (a) derive their security entirely from the security of symmetric-key primitives, believed to be quantum-secure; (b) have extremely small keypairs; and (c) are highly parametrizable. In our signature constructions, the public key is the image y = f(x) of a secret key x under a one-way function f. A signature is a non-interactive zero-knowledge proof of knowledge of x that incorporates the message to be signed. For this proof, we leverage recent progress by Giacomelli et al. (USENIX'16) on constructing an efficient sigma protocol for statements over general circuits. We improve this sigma protocol to reduce proof sizes by a factor of two, at no additional computational cost. While this is of independent interest, as it yields more compact proofs for any circuit, it also decreases our signature sizes. We consider two possibilities for making the proof non-interactive: the Fiat-Shamir transform and Unruh's transform (EUROCRYPT'12,'15,'16). The former has smaller signatures, while the latter has a security analysis in the quantum-accessible random oracle model. By customizing Unruh's transform to our application, the overhead is reduced to 1.6x when compared to the Fiat-Shamir transform, which does not have a rigorous post-quantum security analysis. We implement and benchmark both approaches and explore the possible choices of f, taking advantage of the recent trend toward practical symmetric ciphers with a particularly low number of multiplications, and end up using LowMC. This is joint work with Melissa Chase, David Derler, Steven Goldfeder, Claudio Orlandi, Christian Rechberger, Daniel Slamanig and Greg Zaverucha.
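The sigma-protocol-plus-Fiat-Shamir pattern can be illustrated with the classical Schnorr protocol (a toy Python sketch with insecure parameters; Schnorr is discrete-log based, so it is neither post-quantum nor the scheme from the talk, which instead proves knowledge of x with y = f(x) for a symmetric-key f such as LowMC using a proof system for general circuits):

import hashlib, random

# Fiat-Shamir over the Schnorr sigma protocol: the interactive challenge is
# replaced by a hash of the commitment, the public key, and the message.
p = 2**127 - 1          # toy Mersenne prime modulus (not a secure choice)
q = p - 1               # exponent arithmetic modulo the group order
g = 3

def H(*parts):
    return int.from_bytes(hashlib.sha256(b"|".join(parts)).digest(), 'big') % q

x = random.randrange(1, q)      # secret key
y = pow(g, x, p)                # public key: image of x under f(x) = g^x

def sign(message):
    r = random.randrange(1, q)
    a = pow(g, r, p)                                   # commitment
    e = H(str(a).encode(), str(y).encode(), message)   # Fiat-Shamir challenge
    z = (r + e * x) % q                                # response
    return a, z

def verify(message, sig):
    a, z = sig
    e = H(str(a).encode(), str(y).encode(), message)
    return pow(g, z, p) == (a * pow(y, e, p)) % p      # g^z == a * y^e

sig = sign(b"hello")
print(verify(b"hello", sig))    # True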
We will be presenting a series of online tutorial sessions to introduce and provide examples of the work that is possible in the Catapult Academic environment.
I will highlight some methodologies and examples of finding causal mutations in both medical genomics and comparative genomics contexts.
The increasing ubiquity of devices capable of capturing videos has led to an explosion in the amount of recorded video content. Instead of "eyeballing" videos for potentially useful information, there is therefore a pressing need to develop automatic video analysis and understanding algorithms for various applications. However, understanding videos at large scale remains challenging due to large variations and complexities, time-consuming annotation, and the wide range of video concepts involved. In light of these challenges, my research on video understanding focuses on designing effective network architectures to learn robust video representations, learning video concepts from weak supervision, and building a stronger connection between language and vision. In this talk, I will first introduce a Deep Event Network (DevNet) that can simultaneously detect pre-defined events and localize spatial-temporal key evidence. Then I will show how web-crawled videos and images can be utilized to learn video concepts. Finally, I will present our recent efforts to connect visual understanding to language through attractive visual captioning and visual question segmentation.
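A minimal sketch of the general recipe for video representation learning mentioned above (a generic PyTorch model with per-frame features and temporal pooling; hypothetical, and much simpler than DevNet, which additionally localizes spatial-temporal evidence):

import torch
import torch.nn as nn

class TinyVideoNet(nn.Module):
    # Per-frame CNN features + temporal average pooling + event classifier.
    # A generic baseline pattern, not the DevNet architecture itself.
    def __init__(self, n_events=10):
        super().__init__()
        self.frame_cnn = nn.Sequential(          # shared 2D CNN per frame
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(32, n_events)

    def forward(self, video):                    # video: (batch, time, 3, H, W)
        b, t = video.shape[:2]
        feats = self.frame_cnn(video.flatten(0, 1))   # (b*t, 32)
        feats = feats.view(b, t, -1).mean(dim=1)      # temporal average pooling
        return self.classifier(feats)                 # event scores per video

scores = TinyVideoNet()(torch.randn(2, 8, 3, 64, 64))
print(scores.shape)    # torch.Size([2, 10])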