This is a collection of interesting papers that I have read or want to read. Note that the list is not up to date.
- General Deep Learning
- Conformal Prediction
- Differential Geometry in Deep Learning
- Dimensionality Reduction
- Thompson Sampling
- Deep Reinforcement Learning
- Reinforcement Learning
- Bandit algorithms
- Optimization
- Statistics
- Probability modeling and inference
- Books, courses and lecture notes
- Blogs and tutorials
- Schools
- 2023, Your diffusion model secretly knows the dimension of the data manifold
- 2022, Regularising Inverse Problems with Generative Machine Learning Models
- 2021, Score-Based Generative Modeling through Stochastic Differential Equations
- 2021, Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck
- 2021, Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
- 2021, Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
- 2021, Why flatness correlates with generalization for Deep NN
- 2021, The Modern Mathematics of Deep Learning
- 2021, The Principles of Deep Learning Theory
- 2020, Neural tangent kernel
- 2018, Lipschitz regularity of deep neural networks: analysis and efficient estimation
- 2015, Weight Uncertainty in Neural Networks
- 1998, LeCun, Efficient BackProp
- 2022, Conformal Prediction: a Unified Review of Theory and New Challenges
- 2022, Conformal Off-Policy Prediction in Contextual Bandits
- 2020, Conformal Prediction Under Covariate Shift
- 2019, Conformalized Quantile Regression
- 2005, Algorithmic Learning in a Random World
- 2020, Neural Ordinary Differential Equations on Manifolds
- 2019, Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
- 2019, Diffeomorphic Learning
- 2019, Deep ReLU network approximation of functions on a manifold
- 2016, Deep nets for local manifold learning
- 2020, Stochastic Neighbor Embedding with Gaussian and Student-t Distributions: Tutorial and Survey
- 2015, Parametric nonlinear dimensionality reduction using kernel t-SNE
- 2009, Learning a Parametric Embedding by Preserving Local Structure
- 2020, A Tutorial on Thompson Sampling
- 2020, Neural Thompson Sampling
- 2018, Deep Contextual Multi-armed Bandits
-
- 2023, Empirical Design in Reinforcement Learning
- 2023, An Analysis of Quantile Temporal-Difference Learning
- 2022, Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
- 2021, Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
- 2021, Adaptive Sampling for Best Policy Identification in MDPs
- 2020, Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
- 2019, Revisiting the Softmax Bellman Operator: New Benefits and New Perspective
- 2019, Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
- 2019, Provably Efficient Reinforcement Learning with Linear Function Approximation
- 2018, Deep Reinforcement Learning that Matters
- 2018, Is Q-learning Provably Efficient?
- Adaptive sampling for policy identification
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
- 2016, Learning the Variance of the Reward-To-Go
- 2012, Policy Gradients with Variance Related Risk Criteria
- 2009, An Analysis of Reinforcement Learning with Function Approximation
- 2008, An analysis of model-based Interval Estimation for Markov Decision Processes
- 2006, PAC Model-Free Reinforcement Learning
- 2004, Bias and Variance in Value Function Estimation
- 2001, Convergence of Optimistic and Incremental Q-Learning
- 2001, TD Algorithm for the Variance of Return and Mean-Variance Reinforcement Learning
- 2000, Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
- 1993, Convergence of Stochastic Iterative Dynamic Programming Algorithms
- 1992, Reinforcement learning applied to linear quadratic regulation
- 1982, The Variance of Discounted Markov Decision Processes
-
- 2022, Safety-constrained reinforcement learning with a distributional safety critic
- 2022, Constrained Variational Policy Optimization for Safe Reinforcement Learning
- 2022, TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning
- 2022, Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk
- 2022, SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics
- 2019, Benchmarking Safe Exploration in Deep Reinforcement Learning
- 2017, Constrained Policy Optimization
- 2017, Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
- 2015, Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
- 2015, A Comprehensive Survey on Safe Reinforcement Learning
- 2015, Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
-
- 2022, A Review of Off-Policy Evaluation in Reinforcement Learning
- 2022, Conformal Off-Policy Prediction in Contextual Bandits
- 2020, CoinDICE: Off-Policy Confidence Interval Estimation
- 2018, Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
- 2015, High Confidence Policy Improvement
- 2015, High Confidence Off-Policy Evaluation
- 2000, Eligibility Traces for Off-Policy Policy Evaluation
-
- 2023, Quantile Bandits for Best Arms Identification
- 2020, Neural Contextual Bandits with Deep Representation and Shallow Exploration
- 2020, Neural Contextual Bandits with UCB-based Exploration
- 2016, Optimal Best Arm Identification with Fixed Confidence
- 2016, On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
- 2016, Explore First, Exploit Next: The True Shape of Regret in Bandit Problems
- 2011, Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems
- 2002, Finite-time Analysis of the Multiarmed Bandit Problem
- 2002, Using Confidence Bounds for Exploitation-Exploration Trade-offs
- 2002, The Nonstochastic Multiarmed Bandit Problem
- 2021, A mean-field analysis of two-player zero-sum games
- 2021, The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets
- 2020, On the Convergence of Single-Call Stochastic Extra-Gradient Methods
- 2020, Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances
- 2020, On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
- 2020, Robust Reinforcement Learning via Adversarial training with Langevin Dynamics
- 2018, Finding Mixed Nash Equilibria of Generative Adversarial Networks
- 2022, A short note on an inequality between KL and TV
- 2020, A Tutorial on Quantile Estimation via Monte Carlo
- 2012, Concentration Inequalities for Order Statistics
- 1996, Importance Sampling for Monte Carlo Estimation of Quantiles
- 1987, Better Bootstrap Confidence Intervals
- 1982, Some Methods for Testing the Homogeneity of Rainfall Records
- 2023, Semi-Implicit Variational Inference via Score Matching
- 2021, Normalizing Flows for Probabilistic Modeling and Inference
- 2020, Improved Techniques for Training Score-Based Generative Models
- 2019, Variational approximations using Fisher divergence
- 2018, Semi-Implicit Variational Inference
- 2018, Variational Inference: A Review for Statisticians
- 2017, Variational Hamiltonian Monte Carlo via Score Matching
- 2013, Stochastic Variational Inference
- 2013, Auto-Encoding Variational Bayes
- 2005, Estimation of Non-Normalized Statistical Models by Score Matching
- 2021, Regularization in RL, Google
- CS 6789: Foundations of Reinforcement Learning
- RL Book Theory
- Reinforcement Learning: An Introduction
- Bandit algorithms
- 2021, Lecture Notes for Statistics 311/Electrical Engineering 377
- 2015, Rademacher complexities and VC Dimension
- 2013, An introduction to stochastic approximation
- 2006, System identification and the limits of learning from data
- Deep Learning, Goodfellow et al., 2016
- The Elements of Statistical Learning, Hastie, Tibshirani, and Friedman, 2009
- Machine Learning: a Probabilistic Perspective, Murphy, 2012
- Probability Theory: The Logic of Science, E. T. Jaynes, 2003
- CS285 at UC Berkeley, Deep Reinforcement Learning
- CS234 at Stanford University, Reinforcement Learning
- 15.097 at MIT, Prediction: Machine Learning and Statistics
- 2008, Graphical Models, Exponential Families, and Variational Inference, Wainwright and Jordan
- Deep Reinforcement Learning Doesn't Work Yet
- Distill's publication on Feature Visualization
- Lil'Log, Blog on machine learning
- https://www.math.unipd.it/~vargiolu/home/link.html
- School of mathematics
- Machine learning schools
- Prairie summer school
Papers to add
- Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks https://proceedings.neurips.cc/paper/2016/file/abd815286ba1007abfbb8415b83ae2cf-Paper.pdf