Algorithms For Big Data Compsci 229r Lecture 24 - Detailed Analysis & Overview
MapReduce: TeraSort, minimum spanning tree, triangle counting. Necessity of randomized/approximate guarantees, linear sketching, AMS sketch, p-stable sketch for p less than 2. External memory model: linked list, matrix multiplication, B-tree, buffered repository tree, sorting. ℓ1/ℓ1 recovery, RIP1, unbalanced expanders, Sequential Sparse Matching Pursuit. Hashing: load balancing, k-wise independence, chaining, linear probing. Logistics, course topics, basic tail bounds (Markov, Chebyshev, Chernoff, Bernstein), Morris'
Symmetrization, hashing: linear probing (5-wise independence), Bloom filters, cuckoo hashing, Bloomier filters. Linear least squares via subspace embeddings, leverage score sampling, non-commutative Khintchine, oblivious subspace ... P-stable sketch analysis, Nisan's PRG, ℓp estimation for p ... Khintchine, decoupling, Hanson-Wright, proof of distributional JL lemma. Power of random signs: ℓ2 norm estimation, subspace embeddings (regression), Johnson-Lindenstrauss, deterministic point ... Linear programming: standard form, vertices, bases, simplex.
Distinct elements, k-wise independence, geometric subsampling of streams. Amnesic dynamic programming (approximate distance to monotonicity). Preferred path decomposition, link-cut trees. Low-rank approximation, column-based matrix reconstruction, k-means, compressed sensing.