https://www.cs.jhu.edu/~mdinitz/theory
America/New_York
America/New_York
America/New_York
20211107T020000
-0400
-0500
EST
20220313T020000
-0500
-0400
EDT
ai1ec-257@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
20181116
20181117
Georgetown University
0
Capital Area Theory Day
free
ai1ec-301@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
20191109
20191110
0
FOCS Workshops
free
ai1ec-302@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
20191110
20191113
0
FOCS
free
ai1ec-259@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Nithin Varma
Affiliation: Boston University
Title: Separating erasures and errors in property testing using local list decoding
Abstract:
Corruption in data can be in the form of erasures (missing data) or errors (wrong data). Erasure-resilient property testing (Dixit, Raskhodnikova, Thakurta, Varma ’16) and tolerant property testing (Parnas, Ron, Rubinfeld ’06) are two formal models of sublinear algorithms that account for the presence of erasures and errors in input data, respectively.
We first show that there exists a property P that has an erasure-resilient tester whose query complexity is independent of the input size n, whereas every tolerant tester for P has query complexity that depends on n. We obtain this result by designing a local list decoder for the Hadamard code that works in the presence of erasures, thereby proving an analog of the famous Goldreich-Levin Theorem. We also show a strengthened separation by proving that there exists another property R such that R has a constant-query erasure-resilient tester, whereas every tolerant tester for R requires n^{Omega(1)} queries. The main tool used in proving the strengthened separation is an approximate variant of a locally list decodable code that works against erasures.
Joint work with Sofya Raskhodnikova and Noga Ron-Zewi.
20181024T120000
20181024T130000
0
[Theory Seminar] Nithin Varma
free
ai1ec-256@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Jalaj Upadhyay
Affiliation: JHU
Title: Differentially Private Spectral Sparsification of Graphs
Abstract:
In this talk, we will discuss differentially private spectral sparsification of graphs. We argue that traditional spectral sparsification where the output graph is a subgraph of the input graph is not possible with differential privacy. This motivates us to define a relaxed version of spectral sparsification of graphs.
We consider edge-level privacy, i.e., neighboring graphs differs in one edge with weight one. We give efficient $(\alpha,\beta)$-differentially private algorithms that, on input a dense graph G, construct a spectral sparsification of G. Our output graphs has $ O(n/\eps^2)$ weighted edges, which matches the best known non-private algorithms.
We can use our private sparse graph to solve various combinatorial and learning problems on graphs efficiently while preserving differential privacy. Some examples include all possible cut queries, Max-Cut, Sparse-Cut, Edge-Expansion, Laplacian eigenmaps, etc.
This talk is based on a joint work with Raman Arora and Vladimir Braverman.
20181107T120000
20181107T130000
0
[Theory Seminar] Jalaj Upadhyay
free
ai1ec-264@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Ke Wu
Affiliation: Johns Hopkins University
Title: Synchronization Strings: Efficient and Fast Deterministic Constructions over Small Alphabets
Abstract:
Synchronization strings are recently introduced by Haeupler and Shahrasbi (STOC 2017) in the study of codes for correcting insertion and deletion errors (insdel codes). They showed that for any parameter ε>0, synchronization strings of arbitrary length exist over an alphabet whose size depends only on ε. Specifically, they obtained an alphabet size of O(ε^{−4}), which left an open question on where the minimal size of such alphabets lies between Ω(ε^{1}) and O(ε^{−4}). In this work, we partially bridge this gap by providing an improved lower bound of Ω(ε^{−3/2}), and an improved upper bound of O(ε^{−2}). We also provide fast explicit constructions of synchronization strings over small alphabets.
Further, along the lines of previous work on similar combinatorial objects, we study the extremal question of the smallest possible alphabet size over which synchronization strings can exist for some constant ε
20181205T120000
20181205T130000
0
[Theory Seminar] Ke Wu
free
ai1ec-270@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Sami Davies
Affiliation: University of Washington
Title: A Tale of Santa Claus, Hypergraphs, and Matroids
Abstract:
A well-known problem in scheduling and approximation algorithms is the Santa Claus problem. Suppose that Santa Claus has a set of gifts, and he wants to distribute them among a set of children so that the least happy child is made as happy as possible. Here, the value that a child i has for a present j is of the form p_{ij} \in \{0,p_j\}. A polynomial time algorithm by Annamalai et al. gives a 12.33-approximation algorithm and is based on a modification of Haxell’s hypergraph matching argument.
In this paper, we introduce a matroid version of the Santa Claus problem. Our algorithm is also based on Haxell’s augmentation tree, but with the introduction of the matroid structure we solve a more general problem with cleaner methods. Our result can then be used as a blackbox to obtain a (4 \varepsilon)-approximation for Santa Claus. This factor also compares against a natural, compact LP for Santa Claus.
20190116T120000
20190116T130000
0
[Theory Seminar] Sami Davies
free
ai1ec-275@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Jalaj Upadhyay
Affiliation: Johns Hopkins Universit
Title: Towards Robust and Scalable Private Data Analysis
Abstract:
In the current age of big data, we are constantly creating new data which is analyzed by various platforms to improve service and user’s experience. Given the sensitive and confidential nature of these data, there are obvious security and privacy concerns while storing and analyzing such data. In this talk, I will discuss the fundamental challenges in providing robust security and privacy guarantee while storing and analyzing large data. I will also give a brief overview of my contributions and future plans towards addressing these challenges.
To give a glimpse of these challenges in providing a robust privacy guarantee known as differential privacy, I will use spectral sparsification of graphs as an example. Given the ubiquitous nature of graphs, differentially private analysis on graphs has gained a lot of interest. However, existing algorithms for these analyses are tailored made for the task at hand making them infeasible in practice. In this talk, I will present a novel differentially private algorithm that outputs a spectral sparsification of the input graph. At the core of this algorithm is a method to privately estimate the importance of an edge in the graph. Prior to this work, there was no known privacy preserving method that provides such an estimate or spectral sparsification of graphs.
Since many graph properties are defined by the spectrum of the graph, this work has many analytical as well as learning theoretic applications. To demonstrate some applications, I will show more efficient and accurate analysis of various combinatorial problems on graphs and the first technique to perform privacy preserving manifold learning on graphs.
20190206T120000
20190206T130000
0
[Theory Seminar] Jalaj Upadhyay
free
ai1ec-243@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Martin Farach-Colton
Affiliation: Rutgers University
Title: TBA
Abstract: TBA
20190213T120000
20190213T130000
0
[Theory Seminar] Martin Farach-Colton
free
ai1ec-272@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Xue Chen
Affiliation: Northwestern University
Title: Active Regression via Linear-Sample Sparsification
Abstract:
We present an approach that improves the sample complexity for a variety of curve fitting problems, including active learning for linear regression, polynomial regression, and continuous sparse Fourier transforms. In the active linear regression problem, one would like to estimate the least squares solution \beta^* minimizing ||X \beta – y||_2 given the entire unlabeled dataset X \in \R^{n \times d} but only observing a small number of labels y_i. We show that O(d/\eps) labels suffice to find an \eps-approximation \wt{\beta} to \beta^*:
E[||X \wt{\beta} – X\beta^*||_2^2] \leq \eps ||X \beta^* – y||_2^2.
This improves on the best previous result of O(d \log d d/\eps) from leverage score sampling. We also present results for the inductive setting, showing when \wt{\beta} will generalize to fresh samples; these apply to continuous settings such as polynomial regression. Finally, we show how the techniques yield improved results for the non-linear sparse Fourier transform setting.
Bio: Xue Chen is broadly interested in randomized algorithms and the use of randomness in computation. Specific areas include Fourier transform, learning theory and optimization, and pseudorandomness. He obtained his Ph.D. at the University of Texas at Austin, under the supervision of David Zuckerman. Currently, he is a postdoctoral fellow in Northwestern University.
20190227T120000
20190227T130000
0
[Theory Seminar] Xue Chen
free
ai1ec-282@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Rediet Abebe
Affiliation: Cornell University
Title: Using Search Queries to Understand Health Information Needs in Africa
Abstract:
Access to healthcare and health information is of major global
concern. The stark inequality in the availability of health data by
country, demographic groups, and socioeconomic status impedes the
identification of major public health concerns and implementation of
effective interventions. This data gap ranges from basic disease
statistics, such as disease prevalence rates, to more nuanced
information, such as public attitudes. A key challenge is
understanding health information needs of under-served and
marginalized communities. Without understanding people’s everyday
needs, concerns, and misconceptions, health organizations lack the
ability to effectively target education and programming efforts.
In this presentation, we focus on the lack of comprehensive,
high-quality data about information needs of individuals in developing
nations. We propose an approach that uses search data to uncover
health information needs of individuals in all 54 nations in Africa.
We analyze Bing searches related to HIV/AIDS, malaria, and
tuberculosis; these searches reveal diverse health information needs
that vary by demographic groups and geographic regions. We also shed
light on discrepancies in the quality of content returned by search
engines.
We conclude with a discussion on computationally-informed
interventions both on- and off-line in health and related domains and
the Mechanism Design for Social Good research initiative.
Bio:
Rediet Abebe is a computer scientist with a strong interest in the
promotion of equality and justice. Her research focuses on algorithms,
AI, and their applications to social good. As part of this research
agenda, she co-founded and co-organizes Mechanism Design for Social
Good (MD4SG), an interdisciplinary, multi-institutional research
initiative with over 300 individuals. She is also a co-founder and
co-organizer of Black in AI, an international network of over 1000
individuals focused on increasing the presence and inclusion of Black
and African researchers in AI. Her research is deeply influenced by
her upbringing in her hometown of Addis Ababa, Ethiopia, where she
lived until moving to the U.S. in 2009. Her work has been generously
supported by fellowships and scholarships through Facebook, Google,
the Cornell Graduate School, and the Harvard-Cambridge Fellowship.
20190304T120000
20190304T130000
0
Rediet Abebe
free
ai1ec-278@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Grigory Yaroslavtsev
Affiliation: Indiana University, Bloomington
Title: Advances in Hierarchical Clustering for Vector Data
Abstract:
Compared to the highly successful flat clustering (e.g. k-means), despite its important role and applications in data analysis, hierarchical clustering has been lacking in rigorous algorithmic studies until late due to absence of rigorous objectives. Since 2016, a sequence of works has emerged and gave novel algorithms for this problem in the general metric setting. This was enabled by a breakthrough by Dasgupta, who introduced a formal objective into the study of hierarchical clustering.
In this talk I will give an overview of our recent progress on models and scalable algorithms for hierarchical clustering applicable specifically to high-dimensional vector data. I will first discuss various linkage-based algorithms (single-linkage, average-linkage) and their formal properties with respect to various objectives. I will then introduce a new projection-based approximation algorithm for vector data. The talk will be self-contained and doesn’t assume prior knowledge of clustering methods.
Based on joint works with Vadapalli (ICML’18) and Charikar, Chatziafratis and Niazadeh (AISTATS’19)
20190306T120000
20190306T130000
0
[Theory Seminar] Grigory Yaroslavtsev
free
ai1ec-286@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Arka Rai Choudhury
Affiliation: Johns Hopkins University
Title: Finding a Nash Equilibrium is No Easier than Breaking Fiat-Shamir
Abstract:
The Fiat-Shamir heuristic transforms a public-coin interactive proof into a non-interactive argument, by replacing the verifier with a cryptographic hash function that is applied to the protocol’s transcript. Constructing hash functions for which this transformation is sound is a central and long-standing open question in cryptography.
We show that solving the END-OF-METERED-LINE problem is no easier than breaking the soundness of the Fiat-Shamir transformation when applied to the sumcheck protocol. In particular, if the transformed protocol is sound, then any hard problem in #P gives rise to a hard distribution in the class CLS, which is contained in PPAD. Our result opens up the possibility of sampling moderately-sized games for which it is hard to find a Nash equilibrium, by reducing the inversion of appropriately chosen one-way functions to #SAT.
Our main technical contribution is a stateful incrementally verifiable procedure that, given a SAT instance over n variables, counts the number of satisfying assignments. This is accomplished via an exponential sequence of small steps, each computable in time poly(n). Incremental verifiability means that each intermediate state includes a sumcheck-based proof of its correctness, and the proof can be updated and verified in time poly(n).
Joint work with Pavel Hubacek, Chethan Kamath, Krzysztof Pietrzak, Alon Rosen, and Guy Rothblum.
20190327T120000
20190327T130000
0
[Theory Seminar] Arka Rai Choudhury
free
ai1ec-289@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Amitabh Basu
Affiliation: JHU
Title: Admissibility of solution estimators for stochastic optimization
Abstract:
We look at stochastic optimization problems through the lens of statistical decision theory. In particular, we address admissibility, in the statistical decision theory sense, of the natural sample average estimator for a stochastic optimization problem (which is also known as the empirical risk minimization (ERM) rule in learning literature). It is well known that for general stochastic optimization problems, the sample average estimator may not be admissible. This is known as Stein’s paradox in the statistics literature. We show in this paper that for optimizing stochastic linear functions over compact sets, the sample average estimator is admissible.
20190410T120000
20190410T130000
0
[Theory Seminar] Amitabh Basu
free
ai1ec-287@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Daniel Reichman
Affiliation: Princeton University
Title: Contagious sets in bootstrap percolation
Abstract:
Consider the following activation process in undirected graphs: a vertex is active either if it belongs to a set of initially activated vertices or if at some point it has at least r active neighbors. This process (commonly referred to as bootstrap percolation) has been studied in several fields including combinatorics, computer science, statistical physics and viral marketing. A contagious set is a set whose activation results with the entire graph being active. Given a graph G, let m(G,r) be the minimal size of a contagious set.
I will survey upper and lower bounds for m(G,r) in general graphs and for graphs with special properties (random and pseudo-random graphs, graphs without short cycles) and discuss many unresolved questions.
Based on joint work with Amin Coja-Oghlan, Uriel Feige and Michael Krivelevich.
20190417T120000
20190417T130000
0
[Theory Seminar] Daniel Reichman
free
ai1ec-292@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Xuan Wu
Affiliation: Johns Hopkins
Title: Coreset for Ordered Weighted Clustering
Abstract:
Ordered k-Median is a generalization of classical clustering problems such as k-Median and k-Center, that offers a more flexible data analysis, like easily combining multiple objectives (e.g., to increase fairness or for Pareto optimization). Its objective function is defined via the Ordered Weighted Averaging (OWA) paradigm of Yager (1988), where data points are weighted according to a predefined weight vector, but in order of their contribution to the objective (distance from the centers).
Coreset is a powerful data-reduction technique which summarizes the data set into a small (weighted) point set while approximately preserving the objective value of the data set to all centers. When there are multiple objectives (weights), the above standard coreset might have limited usefulness, whereas in a \emph{simultaneous} coreset, which was introduced recently by Bachem and Lucic and Lattanzi (2018), the above approximation holds for all weights (in addition to all centers). Our main result is the first construction of simultaneous coreset for the Ordered k-Median problem of small size.
In this talk, I will introduce the basics of coreset construction for the clustering problem and the main ideas of our new results. Finally, we discuss some remaining open problems.
This talk is based on joint work with Vladimir Braverman, Shaofeng Jiang, and Robert Krauthgamer.
20190501T120000
20190501T130000
0
[Theory Seminar] Xuan Wu
free
ai1ec-303@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Christopher Musco
Affiliation: NYU
Title: Structured Covariance Estimation
Abstract:
Given access to samples from a distribution D over d-dimensional vectors, how many samples are necessary to learn the distribution’s covariance matrix, T? Moreover, how can we leverage a priori knowledge about T’s structure to reduce this sample complexity?
I will discuss this fundamental statistical problem in the setting where T is known to have Toeplitz structure. Toeplitz covariance matrices arise in countless signal processing applications, from wireless communications, to medical imaging, to time series analysis. In many of these applications, we are interested in learning algorithms that only view a subset of entries in each d-dimensional vector sample from D. We care about minimizing two notions of sample complexity 1) the total number of vector samples taken and 2) the number of entries accessed in each vector sample. The later goal typically equates to minimizing equipment or hardware requirements.
I will present several new non-asymptotic bounds on these sample complexity measures. We will start by taking a fresh look at classical and widely used algorithms, including methods based on selecting entries from each sample according to a “sparse ruler”. Then, I will introduce a novel sampling and estimation strategy that improves on existing methods in many settings. Our new approach for learning Toeplitz structured covariance utilizes tools from random matrix sketching, leverage score sampling for continuous signals, and sparse Fourier transform algorithms. It fits into a broader line of work which seeks to address fundamental problems in signal processing using tools from theoretical computer science and randomized numerical linear algebra.
Bio:
Christopher Musco is an Assistant Professor in the Computer Science and Engineering department at NYU’s Tandon School of Engineering. His research focuses on the algorithmic foundations of data science and machine learning. Christopher received his Ph.D. in Computer Science from the Massachusetts Institute of Technology and B.S. degrees in Applied Mathematics and Computer Science from Yale University.
20191011T123000
20191011T133000
0
[Theory Seminar] Christopher Musco
free
ai1ec-296@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Guy Kortsarz
Affiliation: Rutgers Universty – Camden
Title: A survey on the Directed Steiner tree problem
Abstract:
The directed Steiner problem is one of the most important problems in optimization, and in particular is more general than Group Steiner and other problems.
I will discuss the (by now classic) 1/\epsilon^3 n^epsilon approximation for the problem by Charikar et al (the algorithm was invented by Kortsarz and Peleg and is called recursive greedy. A technique who people in approximation should know). The running time is more than n^{1/epsilon}. One of the most important open questions in Approximation Algorithms is if there is a polynomial time polylog ratio for this problem. This is open from 1997.
I will discuss the Group Steiner problem ( a special case of the Directed Steiner problem) and the Directed Steiner Forest (a generalization of the Directed Steiner problem) and many more related problems.
20191016T120000
20191016T130000
0
[Theory Seminar] Guy Kortsarz
free
ai1ec-294@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Jiapeng Zhang
Affiliation: Harvard University
Title:An improved sunflower bound
Abstract:
A sunflower with $r$ petals is a collection of $r$ sets so that the intersection of each pair is equal to the intersection of all. Erdos and Rado in 1960 proved the sunflower lemma: for any fixed $r$, any family of sets of size $w$, with at least about $w^w$ sets, must contain a sunflower. The famous sunflower conjecture is that the bound on the number of sets can be improved to $c^w$ for some constant $c$. Despite much research, the best bounds until recently were all of the order of $w^{cw}$ for some constant c. In this work, we improve the bounds to about $(log w)^w$.
Joint work with Ryan Alweiss, Shachar Lovett and Kewen Wu.
20191106T120000
20191106T130000
0
[Theory Seminar] Jiapeng Zhang
free
ai1ec-310@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Robert Krauthgamer
Affiliation: Weizmann Institute of Science
Title: On Solving Linear Systems in Sublinear Time
Abstract:
I will discuss sublinear algorithms that solve linear systems locally. In
the classical version of this problem, the input is a matrix S and a vector
b in the range of S, and the goal is to output a vector x satisfying Sx=b.
We focus on computing (approximating) one coordinate of x, which potentially
allows for sublinear algorithms. Our results show that there is a
qualitative gap between symmetric diagonally dominant (SDD) and the more
general class of positive semidefinite (PSD) matrices. For SDD matrices, we
develop an algorithm that runs in polylogarithmic time, provided that S is
sparse and has a small condition number (e.g., Laplacian of an expander
graph). In contrast, for certain PSD matrices with analgous assumptions, the
running time must be at least polynomial.
Joint work with Alexandr Andoni and Yosef Pogrow.
20191108T120000
20191108T130000
0
[Theory Seminar] Robert Krauthgamer
free
ai1ec-314@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Yasamin Nazari
Affiliation: Johns Hopkins University
Title: Sparse Hopsets in Congested Clique
Abstract:
We give the first Congested Clique algorithm that computes a sparse hopset with polylogarithmic hopbound in polylogarithmic time. Given a graph G=(V,E), a (β,ϵ)-hopset H with “hopbound” β, is a set of edges added to G such that for any pair of nodes u and v in G there is a path with at most β hops in G∪H with length within (1 ϵ) of the shortest path between u and v in G.
Our hopsets are significantly sparser than the recent construction of Censor-Hillel et al. [6], that constructs a hopset of size Õ (n^{3/2}), but with a smaller polylogarithmic hopbound. On the other hand, the previously known constructions of sparse hopsets with polylogarithmic hopbound in the Congested Clique model, proposed by Elkin and Neiman [10],[11],[12], all require polynomial rounds.
One tool that we use is an efficient algorithm that constructs an ℓ-limited neighborhood cover, that may be of independent interest.
Finally, as a side result, we also give a hopset construction in a variant of the low-memory Massively Parallel Computation model, with improved running time over existing algorithms.
20191204T120000
20191204T130000
0
[Theory Seminar] Yasamin Nazari
free
ai1ec-315@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Richard Shea
Affiliation: Applied and Computational Math program, Johns Hopkins University
Title: Progress towards building a Dynamic Hawkes Graph
Abstract:
This talk will introduce the Dynamic Hawkes Graph. It builds from developments in multivariate Hawkes Processes, locally stable Hawkes distributions, and representations of the Hawkes process as an Integer Valued autoregressive (INAR) fit. The background to enable understanding of the Dynamic Hawkes Graph will be the bulk of the talk. Richard is presenting this as part of his Master’s thesis.
20191211T120000
20191211T130000
0
[Theory Seminar] Richard Shea
free
ai1ec-317@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Aditya Krishnan
Affiliation: Johns Hopkins University
Title: Schatten Norms in Matrix Streams: The Role of Sparsity
Abstract:
Spectral functions of large matrices contain important structural information about the underlying data, and are thus becoming increasingly important to efficiently compute. Many times, large matrices representing real-world data are sparse or doubly sparse (i.e., sparse in both rows and columns), and are accessed as a stream of updates, typically organized in row-order. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is independent of the matrix dimension, assuming the matrix is doubly-sparse and presented in row-order.
In addition, we prove that multiple passes are unavoidable in this setting and show extensions of our primary technique, including a trade-off between space requirements and number of passes. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index.
Joint work with Vladimir Braverman, Robert Krauthgamer and Roi Sinoff.
20200311T120000
20200311T130000
0
[Theory Seminar] Aditya Krishnan
free
ai1ec-305@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Arnold Filtser
Affiliation: Columbia University
Title: TBD
Abstract: TBD
20200325T120000
20200325T130000
0
[Theory Seminar] Arnold Filtser
free
ai1ec-322@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
20200902T120000
20200902T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Welcome and Introductions
free
ai1ec-321@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Jasper Lee
Affiliation: Brown University
Title: Real-Valued Sub-Gaussian Mean Estimation, Optimal to a (1 o(1)) Multiplicative Factor
Abstract:
We revisit one of the most fundamental problems in statistics: given access to independent samples from a 1D random variable (with finite but unknown mean and variance), what is the best way to estimate the mean, in terms of error convergence with respect to sample size? The conventional wisdom is to use the sample mean as our estimate. However it is known that the sample mean has optimal convergence if and only if the underlying distribution is (sub-)Gaussian. Moreover, it can even be exponentially slower than optimal for certain heavy-tailed distributions. On the other hand, the median-of-means estimator (invented and reinvented in various literature) does have sub-Gaussian convergence for all finite-variance distributions, albeit in the big-O sense with a sub-optimal multiplicative constant. The natural remaining question then, is whether it is possible to bridge the gap, to have an estimator that has optimal sub-Gaussian concentration with an optimal constant, for all finite-variance distributions.
In this talk, we answer the question affirmatively by giving an estimator that converges with the optimal constant inside the big-O, up to a (1 o(1)) multiplicative factor. Our estimator is furthermore computable in time linear in the sample size. The convergence analysis involves deriving tail bounds using tools from linear and convex programming, which may be of independent interest.
Joint work with Paul Valiant.
20200916T120000
20200916T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Jasper Lee
free
ai1ec-324@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Edinah Gnang
Affiliation: Johns Hopkins University
Title: On the Kotzig-Ringel-Rosa conjecture
Abstract:
In this talk we describe and motivate the K.R.R. conjecture graph partitioning and describe a functional approach enabling us to tackle the K.R.R. conjecture via a new composition lemma. If time permits I will also discuss algorithmic aspects.
20200923T120000
20200923T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Edinah Gnang
free
ai1ec-332@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Aditya Krishnan
Affiliation: Johns Hopkins University
Title: Schatten Norms in Matrix Streams: The Role of Sparsity.
Abstract: Spectral functions of large matrices contain important structural information about the underlying data, and are thus becoming increasingly important to efficiently compute. Many times, large matrices representing real-world data are \emph{sparse} or \emph{doubly sparse} (i.e., sparse in both rows and columns), and are accessed as a \emph{stream} of updates, typically organized in \emph{row-order}. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is \emph{independent of the matrix dimension}, assuming the matrix is doubly-sparse and presented in row-order.
In addition, we prove that multiple passes are unavoidable in this setting and show extensions of our primary technique, including a trade-off between space requirements and number of passes. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as logarithm of the determinant, trace of matrix inverse, and Estrada index.
Joint work with Vladimir Braverman, Robert Krauthgamer and Roi Sinoff.
20201007T120000
20201007T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Aditya Krishnan
free
ai1ec-331@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Brian Brubach
Affiliation: Wellesley College
Title: Online matching under three layers of uncertainty
Abstract:
Online matching problems have become ubiquitous with the rise of the internet and e-commerce. From the humble beginnings of a single problem 30 years ago, the theoretical study of online matching now encompasses dozens of variations inspired by diverse applications. We’ll take a tour through the landscape of online matching problems. As we go, we’ll venture deeper and deeper into the jungle of stochasticity and uncertainty. Finally, we’ll cover some very recent work introducing new algorithms and models. Along the way, I’ll highlight techniques and tricks I’ve found useful in studying these problems.
20201021T120000
20201021T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Brian Brubach
free
ai1ec-325@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Joshua Grochow
Affiliation: University of Colorado
Title: Isomorphism of tensors, algebras, and polynomials, oh my!
Abstract: We consider the problems of testing isomorphism of tensors, p-groups, cubic polynomials, quantum states, algebras, and more, which arise from a variety of areas, including machine learning, group theory, and cryptography. Despite Graph Isomorphism now being in quasi-polynomial time, and having long had efficient practical software, for the problems we consider no algorithm is known that is asymptotically better than brute force, and state-of-the-art software cannot get beyond small instances. We approach this situation in two ways:
– Complexity-theoretic: we show that all these problems are polynomial-time equivalent, giving rise to a class of problems we call Tensor Isomorphism-complete (TI-complete). Perhaps surprising here is that we show that isomorphism of d-tensors for any fixed d (at least 3) is equivalent to testing isomorphism of 3-tensors. These equivalences let us focus on just one problem rather than dozens, or a whole infinite hierarchy, separately.
– Algorithmic: Adapting the Weisfeiler-Leman method from Graph Isomorphism to tensors, trying to understand tensor isomorphism by taking advantage of isomorphisms of small sub-tensors. This leads to tensor analogues of the Graph Reconstruction conjecture and related questions.
Based on joint works with Vyacheslav V. Futorny and Vladimir V. Sergeichuk (Lin. Alg. Appl., 2019; arXiv:1810.09219), with Peter A. Brooksbank, Yinan Li, Youming Qiao, and James B. Wilson (arXiv:1905.02518), and with Youming Qiao (arXiv:1907.00309).
20201028T120000
20201028T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Joshua Grochow
free
ai1ec-338@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Jingfeng Wu
Affiliation: Johns Hopkins University
Title: Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate
Abstract:
Understanding the algorithmic regularization effect of stochastic gradient descent (SGD) is one of the key challenges in modern machine learning and deep learning theory. Most of the existing works, however, focus on very small or even infinitesimal learning rate regime, and fail to cover practical scenarios where the learning rate is moderate and annealing. In this paper, we make an initial attempt to characterize the particular regularization effect of SGD in the moderate learning rate regime by studying its behavior for optimizing an overparameterized linear regression problem. In this case, SGD and GD are known to converge to the unique minimum-norm solution; however, with the moderate and annealing learning rate, we show that they exhibit different directional bias: SGD converges along the large eigenvalue directions of the data matrix, while GD goes after the small eigenvalue directions. Furthermore, we show that such directional bias does matter when early stopping is adopted, where the SGD output is nearly optimal but the GD output is suboptimal. Finally, our theory explains several folk arts in practice used for SGD hyperparameter tuning, such as (1) linearly scaling the initial learning rate with batch size; and (2) overrunning SGD with high learning rate even when the loss stops decreasing.
20201104T120000
20201104T130000
0
[Theory Seminar] Jingfeng Wu
free
ai1ec-330@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Arnold Filtser
Affiliation: Columbia University
Title: Scattering and Sparse Partitions, and their Applications
Abstract:
A partition $\mathcal{P}$ of a weighted graph $G$ is $(\sigma,\tau,\Delta)$-sparse if every cluster has diameter at most $\Delta$, and every ball of radius $\Delta/\sigma$ intersects at most $\tau$ clusters. Similarly, $\mathcal{P}$ is $(\sigma,\tau,\Delta)$-scattering if instead for balls we require that every shortest path of length at most $\Delta/\sigma$ intersects at most $\tau$ clusters. Given a graph $G$ that admits a $(\sigma,\tau,\Delta)$-sparse partition for all $\Delta>0$, Jia et al. [STOC05] constructed a solution for the Universal Steiner Tree problem (and also Universal TSP) with stretch $O(\tau\sigma^2\log_\tau n)$. Given a graph $G$ that admits a $(\sigma,\tau,\Delta)$-scattering partition for all $\Delta>0$, we construct a solution for the Steiner Point Removal problem with stretch $O(\tau^3\sigma^3)$. We then construct sparse and scattering partitions for various different graph families, receiving new results for the Universal Steiner Tree and Steiner Point Removal problems.
20201111T120000
20201111T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Arnold Filtser
free
ai1ec-343@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Yu Zheng
Affiliation: Johns Hopkins University
Title: Space Efficient Deterministic Approximation of String Measures
Abstract:
We study approximation algorithms for the following three string measures that are widely used in practice: edit distance (ED), longest common subsequence (LCS), and longest increasing sequence (LIS). All three problems can be solved exactly by standard algorithms that run in polynomial time with roughly $\Theta(n)$ space, where $n$ is the input length, and our goal is to design deterministic approximation algorithms that run in polynomial time with significantly smaller space.
Towards this, we design several algorithms that achieve $1 \eps$ or $1-\eps$ approximation for all three problems, where $\eps>0$ can be any constant and even slightly sub constant. Our algorithms are flexible and can be adjusted to achieve the following two regimes of parameters: 1) space $n^{\delta}$ for any constant $\delta>0$ with running time essentially the same as or slightly more than the standard algorithms; and 2) space $\mathsf{polylog}(n)$ with (a larger) polynomial running time, which puts the approximation versions of the three problems in Steve’s class (SC). Our algorithms significantly improve previous results in terms of space complexity, where all known results need to use space at least $\Omega(\sqrt{n})$. Some of our algorithms can also be adapted to work in the asymmetric streaming model [SS13], and output the corresponding sequence. Furthermore, our results can be used to improve a recent result by Farhadi et. al. [FHRS20] about approximating ED in the asymmetric streaming model, reducing the running time from being exponential in [FHRS20] to a polynomial.
Our algorithms are based on the idea of using recursion as in Savitch’s theorem [Sav70], and a careful adaption of previous techniques to make the recursion work. Along the way we also give a new logspace reduction from longest common subsequence to longest increasing sequence, which may be of independent interest.
20201202T120000
20201202T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Yu Zheng
free
ai1ec-341@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Xuan Wu
Affiliation: Johns Hopkins University
Title: Coreset for Clustering in Graph Metrics.
Abstract:
Clustering is a fundamental task in machine learning. As the increasing demand for running machine learning algorithms in the huge data sets, classic clustering algorithms were found not to scale well. To this end, coreset is introduced as a powerful data reduction technique that turns a huge dataset into a tiny proxy. Coresets have been also successfully applied to streaming and distributed computing. Coresets for clustering in Euclidean spaces have been very well studied. However, very few results were known about the non-Euclidean space. Beyond Euclidean, graph metrics is a very important family of metric space. In this talk, I will cover my recent work on coreset for k-clustering in graph metrics, including bounded treewidth graph and excluded-minor graph.
20201209T120000
20201209T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Xuan Wu
free
ai1ec-350@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Enayat Ullah
Affiliation: Johns Hopkins University
Title: Machine unlearning via algorithmic stability
Abstract: We study the problem of machine unlearning, and identify a notion of algorithmic stability, Total Variation (TV) stability, which we argue, is suitable for the goal of exact efficient unlearning. For convex risk minimization problems, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency, we give upper and lower bounds on excess empirical and population risk of TV stable algorithms for convex risk minimization. Our techniques generalize to arbitrary non-convex functions, and our algorithms are differentially private as well.
20210217T120000
20210217T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Enayat Ullah
free
ai1ec-348@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Thomas Lavastida
Affiliation: Carnegie Mellon University
Title: Combinatorial Optimization Augmented with Machine Learning
Abstract:
Combinatorial optimization often focuses on optimizing for the worst-case. However, the best algorithm to use depends on the “relevant inputs”, which is application specific and often does not have a formal definition.
The talk gives a new theoretical model for designing algorithms that are tailored to inputs for the application at hand. In the model, learning is performed on past problem instances to make predictions on future instances. These predictions are incorporated into the design and analysis of the algorithm. The predictions can be used to achieve “instance-optimal” algorithm design when the predictions are accurate and the algorithm’s performance gracefully degrades when there is error in the prediction.
The talk will apply this framework to online algorithm design and give algorithms with theoretical performance that goes beyond worst-case analysis. The majority of the talk will focus on load balancing with restricted assignments.
20210224T120000
20210224T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Thomas Lavastida
free
ai1ec-349@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Hung Le
Affiliation: University of Massachusetts, Amherst
Title: Reliable Spanners: Locality-Sensitive Orderings Strike Back
Abstract:
A highly desirable property of networks is robustness to failures.
Consider a metric space $(X,d_X)$, a graph $H$ over $X$ is a $\vartheta$-reliable $t$-spanner if, for every set of failed vertices $B\subset X$, there is a superset $B^ \supseteq B$ such that the induced subgraph $H[X\setminus B]$ preserves all the distances between points in $X\setminus B^ $ up to a stretch factor $t$, while the expected size of $B^ $ is as most $(1 \vartheta)|B|$. Such a spanner could withstand a catastrophe: failure of even $90\%$ of the network.
Buchin, Har-Peled, and Olah showed how to construct a sparse reliable spanner for Euclidean space from Euclidean locality-sensitive orderings, an object introduced by Chan, Har-Peled, and Jones. In this talk, we extend their approach to non-Euclidean metric spaces by generalizing the ordering of Chan, Har-Peled, and Jones to doubling metrics and introducing new types of locality-sensitive orderings for other metric spaces. We also show how to construct reliable spanners from the newly introduced locality-sensitive orderings via reliable 2-hop spanners for paths. The highlight of our results is that the number of edges in our spanner has no dependency on the spread.
20210303T120000
20210303T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Hung Le
free
ai1ec-356@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Teodor Marinov
Affiliation: Johns Hopkins University
Title: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning
Abstract:
Reinforcement Learning (RL) is a general scenario where agents interact with the environment to achieve some goal. The environment and an agent’s interactions are typically modeled as a Markov decision process (MDP), which can represent a rich variety of tasks. But, for which MDPs can an agent or an RL algorithm succeed? This requires a theoretical analysis of the complexity of an MDP. In this talk I will present information-theoretic lower bounds for a large class of MDPs. The lower bounds are based on studying a certain semi-infinite LP. Further, I will show that existing algorithms enjoy tighter gap-dependent regret bounds (similar to the stochastic multi-armed bandit problem), however, they are unable to achieve the information-theoretic lower bounds even in deterministic transition MDPs, unless there is a unique optimal policy.
20210310T120000
20210310T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Teodor Marinov
free
ai1ec-355@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Dominik Kempa
Affiliation: Johns Hopkins University
Title: How to store massive sequence collections in a searchable form
Abstract:
Compressed indexing is concerned with the design and construction of data structures to store massive sequences in space close to the size of compressed data, while simultaneously providing searching functionality (such as pattern matching) on the original uncompressed data. These indexes have a huge impact on the analysis of large-scale DNA databases (containing hundreds of thousands of genomes) but until recently the size of many indexes lacked theoretical guarantees and their construction remains a computational bottleneck.
In this talk I will describe my work proving theoretical guarantees on index size as a function of compressed data size, resolving a key open question in this field. Related to that, I will also describe new algorithms for converting between two conceptually distinct compressed representations, Lempel-Ziv and the Burrows-Wheeler Transform. I will show how these new findings enable advanced computation directly on compressed data. I will conclude by describing avenues for future work, including the new algorithms for the construction of compressed indexes that have the potential to cut indexing time by further orders of magnitude, unlocking the ability to search truly enormous text or DNA datasets.
20210317T120000
20210317T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Dominik Kempa
free
ai1ec-353@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Audra McMillan
Affiliation: Apple
Title: Hiding among the clones: a simple and nearly optimal analysis of privacy amplification by shuffling
Abstract:
Differential privacy (DP) is a model of privacy-preserving machine learning that has garnered significant interest in recent years due to its rigorous privacy guarantees. An algorithm differentially private if the output is stable under small changes in the input database. While DP has been adopted in a variety of applications, most applications of DP in industry actually satisfy a stronger notion called local differential privacy. In local differential privacy data subjects perturb their data before it reaches the data analyst. While this requires less trust, it comes a substantial cost to accuracy. Recent work of Erlingsson, Feldman, Mironov, Raghunathan, Talwar, and Thakurta [EFMRTT19] demonstrated that random shuffling amplifies differential privacy guarantees of locally randomized data. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously [BEMMRLRKTS17] and has led to significant interest in the shuffle model of privacy [CSUZZ19, EFMRTT19]. In this talk, we will discuss a new result on privacy amplification by shuffling, which achieves the asymptotically optimal dependence in the local privacy parameter. Our result is based on a new proof strategy which is simpler than previous approaches, and extends to a lightly weaker notion known as approximate differential privacy with nearly the same guarantees. Based on joint work with Vitaly Feldman and Kunal Talwar (https://arxiv.org/abs/2012.12803)
20210324T120000
20210324T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Audra McMillan
free
ai1ec-352@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Maryam Negahbani
Affiliation: Dartmouth University
Title: “Revisiting Priority k-Center: Fairness and Outliers
Abstract:
Clustering is a fundamental unsupervised learning and facility location problem extensively studied in the literature. I will talk about a clustering problem called “priority k-center” introduced by Plesnik (in Disc. Appl. Math. 1987). Given a metric space on n points X, with distance function d, an integer k, and radius r_v for each point v in X, the goal is to choose k points S as “centers” to minimize the maximum distance of a point v to S divided by r_v. For uniform r_v’s this is precisely the “k-center” problem where the objective is to minimize the maximum distance of any point to S. In the priority version, points with smaller r_v are prioritized to be closer to S. Recently, a special case of this problem was studied in the context of “individually fair clustering” by Jung et al., FORC 2020. This notion of fairness forces S to open a center in every “densely populated area” by setting r_v to be v’s distance to its closest (n/k)-th neighbor.
In this talk, I show how to approximate priority k-center with outliers: When for a given integer z, you are allowed to throw away z points as outliers and minimize the objective over the rest of the points. We show there is 9-approximation, which is morally a 5, if you have constant many types of radii or if your radii are powers of 2. This is via an LP-aware reduction to min-cost max-flow and is general enough that could handle Matroid constraints on facilities (where instead of asking to pick at most k facilities, you are asked to pick facilities that are independent in a given matroid). Things become quite interesting for priority knapsack-center with outliers: where opening each center costs something and you have a limited budget to spend on your solution S. In this case, we do not know how to solve the corresponding flow problem, so we alter our reduction to reduce to a simpler problem we do know how to solve taking a hit of 5 in the approximation ratio. There are still many open problems in this work, in addition to solving the flow problem in the knapsack case, the best LP integrality gap we know for priority k-center with outliers is 3.
20210331T120000
20210331T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Maryam Negahbani
free
ai1ec-346@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Leonidas Tsepenekas
Affiliation: University of Maryland
Title: Approximating Two-Stage Stochastic Supplier Problems
Abstract:
The main focus of this talk will be radius-based (supplier) clustering in the two-stage stochastic setting with recourse, where the inherent stochasticity of the model comes in the form of a budget constraint. Our eventual goal is to provide results in the most general distributional setting, where there is only black-box access to the underlying distribution. To that end, we follow a two-step approach. First, we develop algorithms for a restricted version of the problem, in which all possible scenarios are explicitly provided; second, we employ a novel scenario-discarding variant of the standard Sample Average Approximation (SAA) method, in which we also crucially exploit structural properties of the algorithms developed for the first step of the framework. In this way, we manage to generalize the results of the latter to the black-box model. Finally, we note that the scenario-discarding modification to the SAA method is necessary in order to optimize over the radius.
Paper: https://arxiv.org/abs/2008.03325
20210407T120000
20210407T130000
https://wse.zoom.us/j/91450299380
0
[Theory Seminar] Leonidas Tsepenekas
free
ai1ec-358@www.cs.jhu.edu/~mdinitz/theory
20211018T052551Z
Speaker: Samson Zhou
Affiliation: Carnegie Mellon University
Title: Tight Bounds for Adversarially Robust Streams and Sliding Windows via Difference Estimators
Abstract:
We introduce difference estimators for data stream computation, which provide approximations to F(v)-F(u) for frequency vectors v,u and a given function F. We show how to use such estimators to carefully trade error for memory in an iterative manner. The function F is generally non-linear, and we give the first difference estimators for the frequency moments F_p for p between 0 and 2, as well as for integers p>2. Using these, we resolve a number of central open questions in adversarial robust streaming and sliding window models.
For both models, we obtain algorithms for norm estimation whose dependence on epsilon is 1/epsilon^2, which shows, up to logarithmic factors, that there is no overhead over the standard insertion-only data stream model for these problems.
20210414T120000
20210414T130000
0
[Theory Seminar] Samson Zhou
free