My research interest lies in developing robust generative models and applying them to diverse applications, including conditional sampling, disentangled representation learning, and reasoning. Currently, my goal is to build a generative model capable of producing high-fidelity, diverse, and high-dimensional samples in a single forward pass.
My recent works focus on image generation and manipulation using ODE/SDE-based generative models, specifically diffusion probabilistic models.
- My huggingface/diffusers pull requests [1*] [2*] [3]
- My math derivation of DDPM [PDF]
- My literature review on diffusion probabilistic models [Google Sheets]
"There's too many problems. So how can you work on all problems simultaneously? You solve the meta-problem, which to me is just intelligence and how do you automate it." - Andrej Karpathy
- This is my fourth time resolving a long-running bug 🏃♂️🏃, along with medium-difficulty problems such as "How do I work effectively with a large codebase?" and "How do I prepare an agenda and good questions for my presentation?" I am still figuring out how to reproduce this reliably. My main strategies: (1) Load my mental RAM (my working memory) fully with the problem: stay obsessed with it in the shower and while falling asleep, until it is fully ingrained in my memory and I can wake up ready to work on it right away. This can take a whole day or more. (2) Ask logical, systematic, non-random questions, frequently.
- I don't enjoy encountering old bugs. New bugs provide me with a natural source of dopamine.
- `assert` and `raise` are our friends. Many minor engineering problems must be solved quickly and systematically; doing so contributes to good research.
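As a concrete illustration of this habit, here is a minimal sketch of a DDPM-style linear beta schedule with defensive checks. The function name and the specific schedule are hypothetical, not taken from my pull requests: the point is only where `raise` and `assert` belong.

```python
def linear_beta_schedule(
    num_steps: int, beta_start: float = 1e-4, beta_end: float = 2e-2
) -> list[float]:
    """Hypothetical linear noise schedule, used here to show defensive checks."""
    # `raise` turns bad *inputs* into loud, early failures with a clear message.
    if num_steps <= 0:
        raise ValueError(f"num_steps must be positive, got {num_steps}")
    if not (0.0 < beta_start < beta_end < 1.0):
        raise ValueError("expected 0 < beta_start < beta_end < 1")

    step = (beta_end - beta_start) / (num_steps - 1) if num_steps > 1 else 0.0
    betas = [beta_start + i * step for i in range(num_steps)]

    # `assert` documents *internal* invariants: if these fail, the code
    # above is wrong, not the caller's input.
    assert len(betas) == num_steps
    assert all(0.0 < b < 1.0 for b in betas)
    return betas
```

The split matters: exceptions guard against callers, assertions guard against myself, and both make the eventual bug surface near its cause instead of ten functions later.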
- Backpropagation and transformers are general ❤️. I like general things. (An inductive prior sets a starting point until your model is informed by your data, while an inductive bias inevitably limits the space of possible model solutions.)
- Why? Or, why is it important? So what?