Logbook Optimization and Uncertainty - Summer 2021
Lecture 26 -- 08.07.2021:
Literature:
Zero-Sum Games, definition, Rock-Paper-Scissors, worst-case analysis of algorithms as a zero-sum game, sequential versions of the game, Minimax-Theorem, proof using no-regret learning, implications and consequences.- Notes: Section 7.4
Lecture 25 -- 06.07.2021:
Literature:
Experts problem as a special case of online convex optimization, Euclidean and Entropical regularizers, Lipschitz condition, σ-strong convexity, upper bound on the regret of FTRL, connections of FTRL to RWM and GIGA in the Experts problem.- Notes: Sections 7.3.2
Lecture 24 -- 01.07.2021:
Literature:
Online convex optimization, convexity properties, gradient descent, generalized infinitesimal gradient ascent (GIGA), regret of at most Δ2/(2η) + TηG2/2, where G upper bounds the gradient and Δ the diameter of the ground set. GIGA is a no-regret algorithm. Follow-the-Regularized-Leader (FTRL), bound on the total regret- Notes: Section 7.3, 7.3.1, 7.3.2
- M. Zinkevich. Online Convex Programming and Generalized Infinitesimal Gradient Ascent. ICML 2003.
Lecture 23 -- 29.06.2021:
Literature:
Adversarial Multi-Armed Bandits, Exp3-Algorithm, regret at most 3, Exp3 is a no-(external)-regret algorithm, proof of upper bound on total expected cost.- Notes: Section 7.2
- P. Auer, N. Cesa-Bianchi, Y. Freund, R. Schapire. The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing 32(1):48-77, 2003.
Lecture 22 -- 24.06.2021:
Literature:
Expert classification, Weighted Majority algorithm, proof of upper bound on the number of mistakes in terms of mistakes of the best classifier. Experts problem, Randomized Weighted Majority algorithm (RWM), definition regret, regret of RWM is at most 2, RWM is a no-(external)-regret algorithm, proof of upper bound on total expected cost.- Notes: Section 7.1
- N. Littlestone, M. Warmuth. The Weighted Majority Algorithm. Information and Computation 108(2):212-261, 1994.
Lecture 21 -- 22.06.2021:
Literature:
Stochastic multi-armed bandits, definition, regret, simple explore-then-exploit algorithm, regret sublinear in T; UCB1 algorithm, regret bound of O(n log T) when best expectation of any arm has a constant leading margin.- Notes: Section 6.3
- P. Auer, N. Cesa-Bianchi, P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2):235–256, 2002
Lecture 20 -- 17.06.2021:
Literature:
Markovian single-armed bandits, definition, charges for play, definition Gittins index, Markovian multi-armed bandits, Gittins index policy, proof of optimality- Notes: Section 6.2
Lecture 19 -- 15.06.2021:
Literature:
Infinite Markov decision processes, definition with discounted rewards, optimal Markovian policy, value iteration, policy iteration, convergence proofs. (Linear program corrected in the notes)- Notes: Section 6.1
Lecture 18 -- 15.06.2021:
Literature:
Median-Prophets for delegation, proof of 2-approximation.- Notes: Section 5.2
- J. Kleinberg, R. Kleinberg. Delegated Search Approximates Efficient Search. EC 2018.
Lecture 17 -- 10.06.2021:
Literature:
Persuasion with independent boxes, rounding algorithm, proof of 4-approximation and persuasiveness. Delegation problem, example, connection to prophet inequalities, median-prophet algorithm- Notes: Sections 5.1.2, 5.2
- J. Kleinberg, R. Kleinberg. Delegated Search Approximates Efficient Search. EC 2018.
Lecture 16 -- 08.06.2021:
Literature:
Persuasion with IID boxes, construction of LP, rounding algorithm, proof of (1-1/e)-1-approximation and persuasiveness. Persuasion with independent boxes, SSQ-box, construction of LP, rounding algorithm.- Notes: Sections 5.1.1, 5.1.2
- N. Hahn, M. Hoefer, R. Smorodinsky. Prophet Inequalities for Bayesian Persuasion. IJCAI 2020.
Lecture 15 -- 01.06.2021:
Literature:
Bayesian persuasion, model and example, there is optimal scheme φ* that is direct and persuasive, computing φ* via LP for explicitly represented distribution D. IID case: symmetry of φ* and consequences- Notes: Sections 5.1, 5.1.1
- S. Dughmi, H. Xu. Algorithmic Bayesian Persuasion. STOC 2016.
Lecture 14 -- 27.05.2021:
Literature:
Probing with opening cost, Pandora's Box problem, fair cap as an estimator of the value of the box after subtracting the opening cost, optimal policy greedliy opens boxes in non-increasing order of fair cap- Notes: Section 4.3
- M. Weitzman. Optimal search for the best alternative. Econometrica 47(3):641-654, 1979
- Bo Waggoners Blog Post
Lecture 13 -- 25.05.2021:
Literature:
Testing, k-TestMax, testing a box = refining the support region in which realization is located, testing vs. probing, testing policy for the IID case that obtains at least 1-1/e1/4-o(1) of the expected reward of the optimal probing policy- Notes: Section 4.2
- M. Hoefer, K. Schewior, D. Schmand. Stochastic Probing with Increasing Precision. IJCAI 2021.
Lecture 12 -- 20.05.2021:
Literature:
Probing, k-ProbeMax, non-adaptive and adaptive policies, adaptivity gap, adaptivity gap for k-ProbeMax is at most 8, bounding the optimum using a linear program, deriving a non-adaptive policy from the LP optimum.- Notes: Section 4.1
- A. Asadpour, H. Nazerzadeh. Maximizing Stochastic Monotone Submodular Functions. Management Science 62(8):2374-2391, 2016.
Lecture 11 -- 18.05.2021:
Literature:
Secretary independent set with random-order arrival, bounded inductive independence number, and in the graph sampling model- Notes: Sections 3.4.2, 3.4.3
- O. Göbel, M. Hoefer, T. Kesselheim, T. Schleiden, B. Vöcking. Online Independent Set Beyond the Worst-Case: Secretaries, Prophets, and Periods. ICALP 2014.
- Y. Ye, A. Borodin. Elimination Graphs. ACM Transactions on Algorithms 8(2):774-785, 2012.
Lecture 10 -- 11.05.2021:
Literature:
Online independent set, interval scheduling, lower bound Ω(n) on the competitive ratio, prophet interval scheduling, 4-competitive algorithm- Notes: Sections 3.4, 3.4.1
- O. Göbel, M. Hoefer, T. Kesselheim, T. Schleiden, B. Vöcking. Online Independent Set Beyond the Worst-Case: Secretaries, Prophets, and Periods. ICALP 2014.
Lecture 9 -- 06.05.2021:
Literature:
Yao's principle, given any input distribution, the best deterministic algorithm for that distribution performs better than any randomized algorithm on worst-case input, lower bound of Ω(n) on the competitive ratio of any randomized algorithm for OnlineMax.- Notes: Section 3.3
- A. C.-C. Yao. Probabilistic Computations: Toward a Unified Measure of Complexity (Extended Abstract). FOCS 1977.
Lecture 8 -- 06.05.2021:
Literature:
Examples for finite-time MDPs and optimal policies. PrizeCollection: Sort envelopes in non-increasing order of viqi/(1-qi); SkiRental: Buy only on the first open day when there are at least ⌊((B-1)/q)+2⌋ days remaining; Prophet: accept any person as soon as it has more value than the expected value obtained by the optimal policy in the subsequent rounds.- Notes: Section 3.2.2
Lecture 7 -- 06.05.2021:
Literature:
PrizeCollection problem, definition Markov decision process (MDP), optimal policies π* for finite time horizon, existence of π* that is Markovian, computation via dynamic programming- Notes: Sections 3.2, 3.2.1
Lecture 6 -- 29.04.2021:
Literature:
OnlineMax with independent distributions, prophet inequalities in the general case, 2-competitive, tightness example; prophet inequality in the identical case, 1.58-competitive, discussion of improved approaches- Notes: Section 3.1
- J. Correa, P. Foncea, R. Hoeksma, T. Oosterwijk, T. Vredeveld. Recent developments in prophet inequalities. SIGecom Exchanges 17(1):61–70, 2018
- E. Samuel-Cahn. Comparison of threshold stop rules and maximum for independent nonnegative random variables. Annals of Probability 12(4):1213–1216, 1984.
Lecture 5 -- 27.04.2021:
Literature:
Item Allocation Problem, at least b ≥ 1 copies per item, bundles of size at most d, optimal fractional solution via linear programming, extension of the secretary algorithm is O(d1/b)-competitive.- Notes: Section 2.4
- T. Kesselheim, K. Radke, A. Tönnis, B. Vöcking. An Optimal Online Algorithm for Weighted Bipartite Matching and Extensions to Combinatorial Auctions. ESA 2013.
Lecture 4 -- 22.04.2021:
Literature:
Max-weight bipartite matching, motivation via sponsored search, secretary matching problem, extension of the standard algorithm is e+o(1)-competitive. Main proof steps: expected value of random selected edge in round t is at least v(M*)/n, probability to accept any such edge is s/(t-1).- Notes: Section 2.3
- T. Kesselheim, K. Radke, A. Tönnis, B. Vöcking. An Optimal Online Algorithm for Weighted Bipartite Matching and Extensions to Combinatorial Auctions. ESA 2013.
Lecture 3 -- 20.04.2021:
Literature:
Secretary problem, e+o(1)-competitive algorithm maximizes probability to accept the best person.- Notes: Section 2.2
- M. Beckmann. Dynamic Programming and the Secretary Problem. Computers Math. Applic. 19(11):25-28, 1990.
Lecture 2 -- 15.04.2021:
Literature:
Stochastic concepts and tools, probability distributions, events, independence, conditional probabilities, law of total probability. Random variables, expectation, linearity of expectation. Concentration: Markov inequality, Chernoff bounds.Lecture 1 -- 13.04.2021:
Literature:
(1) Organization and overview.
(2) Fundamentals in online algorithms: Competitive ratio, deterministic and randomized algorithms, randomization in the input.
(3) OnlineMax problem, lower bound Ω(n) on the competitive ratio. Secretary problem = OnlineMax with random-order arrival, e+o(1)-competitive algorithm accepts the best person with probability at least 1/e-1/n.- Notes: Chapter 1, Sections 2, 2.1, 2.2
- Given its huge popularity, there are many surveys on the secretary problem. Here is a rather classic one:
P.R. Freeman. The Secretary Problem and Its Extensions: A Review. Int. Stat. Rev., Vol. 51(2):189-206, 1983.