2023 Planning

The Goal

My priority is solving alignment, and for now that means seeking out (1) a position with some financial padding so I can continue doing research and (2) mentorship so I can keep getting better at it. So either a position at a research organization (Anthropic, OpenAI, DeepMind, etc.) or a PhD position (with one of a handful of advisors doing actually relevant research).¹

The Subgoals

The main way I'm going to get a research position is (no surprises) to do research. I'll be at SERI MATS for the next two months for explicitly this purpose.

In particular, I'll be doing research on path dependence and theory of deep learning under the guidance of Evan Hubinger. I'm aiming to publish (in conferences) two or three papers out of this work because you have to play the signaling game just a bit if you hope to succeed.

Afterwards, there's an option for an extension (~6 months). If I decide to stick it out with the industry route, I'm aiming to obtain a position by the end of SERI MATS. If I decide on the academia route (or if two months turns out to be a crazy, unrealistic timeline), I'll go with the extension.

My lesson from last year was to set fewer goals and offer myself more freedom within each goal (e.g., to avoid a reading list of exactly these and these authors).

Writing

I want to publish impactful research (or at least have content that I could publish if I decided it was worth it to go through the process of submission).

Since it's hard to measure impact (at least on a one-year timescale), let's stick to setting targets for the observables (and live with the true goal in mind)...

📢 Publish(able) 3 papers. This seems like a pretty conservative target considering I have 2 already in the works.
📚 Launch Textbook on AI Safety. This has taken a seat on the back-burner for the last month, but it seems pretty important and valuable. I'm going to throw out the sections on "Foundations" and "Machine Learning", and work on the thing it's actually about.
📝 Publish notes at least once a month. Something I want to get into the rhythm of is publishing high-quality notes. Let's be real, writing little blog articles is fun, but it's not the best thing I can be doing, provided I can get my fill of writing in other ways, such as publishing notes. Which is what I'll be doing.

Reading List

I'm going to throw out my specific "# of books" goals from previous years, though I will set some specific goals in terms of reading textbooks.

First, though, a definition of "reading textbooks."

Reading textbooks means skipping the content that doesn't matter, skimming the content that seems possibly somewhat relevant, and investing in the content that seems important (with multiple readings and problem sets). It doesn't mean actually reading end-to-end.

Currently in progress

Artificial Intelligence by Russell and Norvig
Reinforcement Learning by Sutton and Barto
The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman
Algebraic Geometry and Statistical Learning Theory by Watanabe
Pattern Recognition and Machine Learning by Bishop
An Introduction to Kolmogorov Complexity and Its Applications by Li and Vitányi
Scaling and Renormalization in Statistical Physics by Cardy

New

Mathematical Theory of Bayesian Statistics by Watanabe
Probability Theory: The Logic of Science by Jaynes
Deep Learning by Goodfellow, Bengio, and Courville
Algebra, Topology, Differential Calculus, and Optimization Theory for CS and ML by Gallier and Quaintance
Introduction to the Theory of Computation by Sipser

Stretch/Undecided

Category Theory by Awodey
Topology by Munkres
Radically Elementary Probability Theory by Nelson

Stretch Goals

🇨🇳 Learn Mandarin. I like learning languages, and hobbies seem healthy even when the world is ending. It also seems valuable to make myself a future asset if world governments ever get their shit together to figure out AI policy.
👓 Myopia. Actually reduce my diopters by 0.5 in both eyes.
🏃 Moving. I'd like to balance out my exercise regimen a bit more. Right now, I'm going to hot pilates/yoga several times a week, which seems to get me what I need in terms of mobility/flexibility/core/cardio. I'd like to add some actual strength training.
💰 Money. I'd love to generate a bit of passive income. Obvious routes are selling content (some notes or lecture series) or some kind of (AI-driven) service.

¹ I'm avoiding the independent research route because I think the value of a strong group of peers and mentors is too high to be missed.