Reviews

2023-Q2

Highlights from Q2

  • Launched the developmental interpretability ("devinterp") research agenda with Alexander Gietelink Oldenziel, Stan van Wingerden, and Daniel Murfet.
    • This came out of the 2023 SLT & Alignment Summit, which I co-organized with the same people.
    • I prepared six lectures, contributing to over 20 hours of recorded materials.
  • Worked as a research assistant at the University of Cambridge.
    • Submitted "Unifying Grokking & Double Descent" with Xander Davies, Lauro Langosco, and David Krueger to NeurIPS.
    • Started working on a project on capability unlearning with Jake Mendel, Bilal Chughtai, and Lauro Langosco.
  • Worked as a writer with CAIS on ██ ████████ ████████ ██ ██ ██████ .
  • I'm making progress on the posts I set out to complete in my previous quarterly review: The Shallow Reality of 'Deep Learning Theory', What are inductive biases, really?
    • I dropped/modified some of these: Neural (network) divergence, toy models of loss landscapes, and path dependence.

Plans for the rest of 2023

I have only one priority for the coming six months: to test the basic claims behind the devinterp research agenda.

That means saying no to pretty much everything else and closing off my responsibilities with the Krueger Lab and CAIS ASAP. It's time to grind.

Research. My primary role will be leading the empirical component of the investigations into devinterp. I will manage (and contribute to):

  • Building tooling/libraries for measuring RLCTs, singular fluctuations, and prosaic "progress measures".
  • Building out a "zoo" of models and settings in which to test these tools, ranging from models trained on synthetic data to vision models, simple language models, and full-fledged LLMs.

I will also move to Melbourne for a few months (September–December) to work alongside Daniel Murfet on this agenda.

Organizational. My secondary role is laying the groundwork for devinterp to scale rapidly if the empirical claims survive scrutiny. This means:

  • Organizing a follow-up devinterp summit in November.
  • Managing and coordinating contributors to the empirical branch of devinterp.
  • And ████████ █ ████████ ███.

Now, all that said, I'm not exactly planning to neglect the rest of my life. It's time for a more in-depth reflection.

Areas

Health

Nutrition

  • Diet. My diet's been pretty good, but I've been eating many more meals not prepared by myself. That's meant a lot more seed oils/added sugars/etc. than I'd like.
    • Breakfast is usually oatmeal + peanut butter + protein + banana, etc. in smoothie or porridge form.
    • I could use an equally easy default lunch.
    • As long as I'm in the same location with Robin, she's happy to cook dinner.
  • Meat. I've been eating much less meat. Perhaps too little. As much as I'd like vegetarianism to be equivalent in terms of health, it's not.
  • Alcohol. I drink occasionally, usually not more than 1 or 2 drinks per week, but it's time to stop.
  • Protein. I've been supplementing with protein shakes regularly since I'm trying to build muscle. I've gained ~3-5kg over the last half year and would like to gain another ~5kg over the rest of the year.
  • Intermittent fasting. I've fallen out of the habit of 16/8 IF and would like to restart some kind of fasting. 16/8 isn't ideal since I like being able to drink a cappuccino in the morning and because of the bulking. Alternatively, I can try a 5-day fast once a quarter, which might even be better, but I love food too much, so I need someone else to force me to do this.
  • Supplements. I've been taking Athletic Greens, creatine, fish oils, and occasionally vitamin C + zinc (when I have a cold).
  • Caffeine. I drink two to three cups of coffee per day (and stop responsibly at noon).
  • Nicotine. Most days, I take 2mg of nicotine in the afternoon, sometime between 15:00 and 17:00.
  • Melatonin. I was taking 3mg of melatonin per night but then fell out of the habit. Not very consistent.

Obstacles

  • Time to cook (esp. because of travel).
  • Willpower for fasting.
  • Meat is unethical.

Todos

  • Talk to Robin: ask for quick lunch recipes, or plan a meal prep day, or find someone in Melbourne who can do meal prep for me. She also recommends more broccoli sprouts.
  • Get someone to hold me accountable to do longer fasts.
  • Find access to more stimulants: Modafinil for regular use and Vyvanse or Adderall for occasional use. Maybe also LSD for microdosing.
  • Find a somewhat ethical source of meat like venison or kangaroo.

Fitness

I think of three main components to physical fitness:

  • Endurance: aerobic/cardio, sauna.
  • Strength: anaerobic, weight-lifting, calisthenics.
  • Dexterity (mobility, agility, flexibility, plasticity, elasticity, stability & balance): yoga, pilates, handstand practice.

Target. My ideal schedule would look like:

  • ~30min light cardio/yoga/core to start the day (maybe jump-rope and some sun salutations).
  • ~45min of weight-lifting/calisthenics and HIIT on alternating days, followed by 30min of skills/stretching/sauna.
  • ~15min of restorative yoga before bed.

Weight-lifting/calisthenics. I started weight-lifting again a few months ago but then pulled something in my back, then went traveling for a month, and fell out of the habit again. I'm going to go from StrongLifts to Starting Strength, since the time commitment is lower, and I'll add weight more slowly this time.

Cardio. Historically, I've found cardio the hardest to commit to (and probably the one I need most). I recently discovered a love for hot yoga and pilates, which combine endurance and dexterity reasonably well but aren't easy to do when you're traveling all the time (even with ClassPass). The best option is probably more HIIT-focused approaches, which I enjoy much more than steady-state work on, e.g., an erg.

Obstacles:

  • Injury risk with weight-lifting.
  • Access to gyms (due to travel).
  • Inconsistency (due to travel & having three separate times to do exercise).
  • Cardio sucks.

Todos

  • Meet with a PT to check weightlifting form. Focus on calisthenics until then.
  • Use ClassPass & Alo Moves (which I already have).
  • Find a gym in Melbourne.
  • Assemble a set of default routines for each of these moments + alternatives that don't require gym access.
    • Either select from Alo Moves or ask a PT.
  • Finally figure out some kind of habit tracking software (maybe just a spreadsheet?). Obsidian isn't good because you shouldn't shit (=track habits) where you eat (=come up with ideas) and Linear doesn't have great tooling for repetitive tasks.

Sleep / Rest / Stress

  • I've been sleeping in rooms that have too much light and sleeping less as a result.
  • I'd also like to wake up a bit earlier (~6:00).

Todos

  • Buy a good eye mask.
  • Put my phone away from my bed and get some light immediately after waking.

Mental

  • Sitting quietly in a sauna is the closest thing to meditation for me, and it seems to be enough for now.

Other

  • Main problem is I've been getting too many colds (~2-3 this year already). It's too much of a productivity decrease, and I'm not sure what's causing this. Maybe lack of rest? Maybe travel?
  • I'd like to put more effort into skincare and will ask Robin to hold me accountable for that.
  • I am 3 months behind on going to an oral hygienist and need to schedule an appointment to get my wisdom teeth pulled.
  • Need to be more consistent in wearing my differentials when doing near work.

Todos

  • Talk to Robin about colds & skincare. See what she recommends.
  • Unregister with current dentist and reregister in Amsterdam. Schedule mouth cleaning.
    • Start flossing again.
  • Add differentials to habit tracker.

Family & friends

  • Family-wise, I'm too far away from everyone, but that can't really be avoided. I still get to see everyone every few months intensively for a week or two at a time.
  • Friend-wise, the last half year has been great. I've gotten very close to Alexander, maybe Stan soon as well, and formed a bunch more intermediate friendships. Having a community is overpowered.

Todos

  • Schedule a weekly reminder to call parents & Elmer.

Love

  • The last few months have been tough because Robin and I have been long-distance so frequently. I think this will go better the next half year, since Robin is planning to come with me to Australia, but it won't be fully resolved (since I'll go back to the UK for November). I'd very much like to settle with her in one place with more than a three-month horizon. Maybe that's in the cards in 2024.

Todos

  • Restart biweekly relationship check-ins. Plan in a time.
  • Schedule weekly reminder to plan a date.

Money

  • The SERI MATS extension grant has been the biggest help. Working as an RA pays dirt. Working as a writer with CAIS has been much better, but I haven't had very many hours, so it's not a major addition.
  • I'll be applying to the Century Fellowship, and I think my odds of getting it are reasonable, which would make a major difference to my financial security. Otherwise, I'll manage on an RA budget in Australia.

Todos

  • Actually budget out this year.
  • Figure out how/where to invest money when I receive grants so it doesn't just sit in my account.

Career & impact

  • Everything here is going well. (See above.)
  • Need to grow my Twitter clout. A few months ago, I started strong, but I haven't been posting recently. I'm at O(500) followers, and would like to get to O(5,000) by the end of the year.

Todos

  • Schedule weekly reminder to post.

Personal growth & learning

  • I think the past few months have been too focused on execution and not enough on growth. In particular, I haven't been learning in a structured way as much as I'd like to. I've fallen out of my Anki routines, I haven't learned any new languages recently, and I'm not reading enough outside of technical articles (and even there I think I could be doing more).

Todos

  • Decide between FluentForever & Anki + iTalki (or some in-between) for learning Japanese (because Watanabe speaks Japanese and we want to honeymoon here). I'd also like to learn Mandarin, but one thing at a time; Japanese now seems more pressing.
  • Map out a learning plan for this summer.
  • Remember to call Alexander whenever I need tutoring.

Leisure / play

  • Definitely could be doing more here, though I've had plenty of time for social events and don't particularly feel like I'm coming up short.
  • There's room to refactor this with fitness into some kind of team sport or martial arts, but for now it's probably too much of a time commitment or too intermittent (therefore hard to form habits around) or has too much of a risk of brain damage.
  • Also room for having fun in learning a new language (see personal growth).
  • Most of all, there's room for reading more fiction.

Todos

  • Polish off my want-to-read list on Goodreads.
  • Charge my Kindle & download the top books off the list.
  • Schedule a regular massage?
  • Schedule a trip with Robin in Australia.

Technology

  • Too much Twitter.
  • Not enough care and maintenance for my Obsidian.

Todos

  • Find a day to go through my Obsidian and clean.

Environment

  • The main thing is I'd like to be able to settle in one place in 2024 for more than 6 months. This is a problem for Q4.
  • Let's pay for a cleaner.

Todos

  • Find a cleaner in Melbourne.
  • Get a CO2 monitor.

2023-Q1

The last half year has been one of the most turbulent periods of my life. It's also been one of the best.

I quit the start-up that was sucking out my soul and rotting my intellect (okay, maybe that's a tad melodramatic). I started working on a problem I care about and reviving my brain. I found the community, mentors, and projects I'd been looking for. I started doing original work and advocating for a neglected area of research (singular learning theory). It's been pretty great.

Which makes it a great time for reflection and looking forward. What's in store for the rest of the year?

The last six months

Six months ago, I got an FTX Future Fund grant to do some upskilling. One of the conditions for receiving that grant was to write a reflection after the grant period (six months) expired. So, yes, that's part of my motivation for writing this post. Even if FTX did implode in the interim, and even if there is likely no one to read this, it's better to be safe than sorry.

A quick summary:

  • Reading: Mathematics for Machine Learning, Bishop, Cracking the Coding Interview, Sutton & Barto, Russell & Norvig, Watanabe, and lots of miscellaneous articles, sequences, etc.
  • Courses: Fast.ai (which I quit early because it was too basic), OpenAI's Spinning Up (abandoned in favor of other RL material), and ARENA (modeled after MLAB).
  • SERI MATS: An unexpected development was that I ended up participating in SERI MATS. For two months, I was in Berkeley with a cohort of others in a position similar to mine (i.e., transitioning to technical AI safety research).
  • Output: singular learning theory sequence & classical learning theory sequence.

It's been quite a lot more productive than I anticipated both in terms of input absorbed and output written. I also ended up with a position as a research assistant with David Krueger's lab.

The next six months

But we're not done yet. The next six months are shaping up to be the busiest of my life. As I like 'em.

Summit

I'm organizing a summit on SLT and alignment. My guess is that, looking back a few years from now, I will have accelerated this field by up to two years (compared to worlds in which I don't exist). The aim will be to foster research applying SLT within AI safety towards developing better interpretability tools, with specific attention given to detecting phase transitions.

Publications

So many projects. Unlike some, I think writing publications is actually a pretty decent goal to work toward. You need some kind of legible output that can serve as a finish line.

In the order of most finished to least:

  • (SLT) The Shallow Reality of 'Deep Learning Theory': when I'm done writing the sequence on LessWrong, I'm going to work with Zach Furman and Mark Chiu Chong to turn this into something publishable.
  • Pattern-learning model: this is the project I'm currently working on with Lauro Langosco in the Krueger lab. The aim is to devise a simplified toy model of neural network training dynamics akin to Michaud et al.'s quantization model of neural scaling.
  • Neural (network) divergence: a project I'm working on with Samuel Knoche on reviewing and implementing the various ways people have come up with to compare different neural networks.
  • What are inductive biases, really?: a project I'm working on with Alexandra Bates to review all the existing literature on inductive biases and provide some much needed formalization.
  • (SLT) Singularities and dynamics: the aim is to develop toy models of the loss landscape in which to investigate the role of singularities on training dynamics.
  • Path dependence in NNs: this is the project I started working on in SERI MATS. The idea is to study how small perturbations (to the weights or hyperparameters) grow over the course of training. There's a lot here, which is why it's taking quite some time to finish up.
  • (SLT) Phase detectors: a project I recently started during an Apart Hackathon, which explores how to detect "phase transitions" during training.

There's a lot here, which is why some of these projects (the last three) are currently parked.

(And to make it worse I've just accepted a part-time technical writing position.)

Career

What's next? After the summit? After wrapping up a few of these projects? After the research assistant position comes to a close (in the fall)?

Do I…

I'm leaning more and more to the last one (/two).

A job with Anthropic would be great, but I think I could accomplish more by pursuing a slightly different agenda with a bit more slack to invest in learning.

Meanwhile, I think a typical PhD is too much lock-in, especially in the US, where they might require me (with a physics background) to do an additional master's degree. As a Century Fellow, I'd be free to create my own custom PhD-like program. I'd spend some time in Australia with Daniel Murfet, in Boston with the Tegmark group, in New York with the Bowman lab, in London with Conjecture, in the Bay Area with everyone.

I think it's very likely that I'll end up starting a research organization focused on bringing SLT to alignment. That's going to take a slightly atypical path.

2022 Review

A lot has changed this year.

From Entrepreneurship to Research

I started the year set on committing to the path of entrepreneurship. I ended it as an AI safety researcher. What can I say? Priorities change.

One day you may be sold on the earning-to-give route (or — if you're feeling cynical — the social status accompanying entrepreneurship and philanthropy). The next day, you're sold on maybe preventing powerful AI from causing the demise of humanity (or the social status accompanying AI safety research within the EA bubble).

Social status dynamics aside, it's a better fit. Working on Health Curious (the company I founded) made me feel like my brain was shrinking. I just wasn't built to spend my days writing React apps.

Meanwhile, I've always been fascinated by AI (and have always contorted my physics degrees into excuses to study NNs). Research keeps my curiosity levels far better satiated. I also didn't have anything near this healthy a support network while working on my company.

I'm happier, more focused, and working far more productively. If anything, my life has gotten much easier and better since it's gotten less balanced. Having the one overarching priority of "solve alignment" makes taking any kind of decision much easier.

I still think founding some kind of a research organization might very well be in my future. I like working with people and working on big-picture strategy. There's a big premium on that kind of thing in technical AI safety (considering we're a bunch of nerds).

Goals for 2022

Overall, I'd say my goals for 2022 had about a 50% success rate. It's that low mainly because they were bad goals that didn't suit the person I ended up becoming.

Or maybe that's just my coping strategy.

  1. 🛑 No more scrolling (YouTube, Reddit, Porn, etc.):
    • Complete failure. I even ended up joining a new platform (Twitter).
  2. 🚪 Screen time:
    • I'd call this a success. My screen time for my phone is about an hour. For my computer, it's atrocious, often upwards of 8-10 hours. But hey, it's my job, so I accept it as the price of admission.
  3. Self-monitoring:
    • Altogether a success. My main innovation was getting on Linear and building an integration with Toggl, so that my tasks are automatically time-tracked. This meant I didn't have to do much thinking to log my time, which is the best way to make sure it actually gets logged.
    • There's definitely room for improvement: I'm not actually doing anything with the information. I think the most natural way to address this would be to build a little dashboard for all my sources of data.
    • The other main room for improvement is that there's always more I could track: I didn't manage to track additional media consumption beyond books.
  4. 📚 Books (1 book per week):
    • I didn't get anywhere close to my goal of 50 books this year, at least as logged on Goodreads (it says 22). In terms of total volume, however, I think I far exceeded last year. The discrepancy consists in, e.g., Worm being listed as a single book (at 6,680 pages, about 1.5 times the Harry Potter series), and most of my reading consisting of textbooks and papers.
    • The bigger failure is that I didn't meet most of the particular categorical targets I set (in terms of, e.g., reading X books of a specific language, X books by Y author).
    • Just goes to show that optimizing the wrong metric is stupid.
  5. 🗃 PKM:
    • Bit of a failure. My personal knowledge management is a mess in need of a thorough cleaning.
  6. ✍️ Writing:
    • I missed a few of the targets, but I'm overall happy with what I've published.
    • The main new thing I'm trying to do is publish my own notes on a given subject.
  7. 🗣 Languages:
    • I learned Portuguese to a pretty high level, but I've given up on German (for the time being), and now accept that I will lose my bet with my roommate on reading Faust in the original German by my 25th birthday. I've also seriously fallen behind on the Mandarin.
    • The bigger problem is that I've had trouble keeping my Anki habit active. For a period of about 6 years, I did Anki pretty much every day, and I need to get back to that commitment, not just for languages, but for everything in my brain.
    • Oh, and I didn't meet any of the specific "create X flashcards" targets. They were too ambitious.
  8. 🏃 Moving
    • I stopped diligently trying to close my Apple Watch rings. Bad Jesse. And I've been bad about walking. But overall, my fitness has been pretty good. It really peaked in Brasília when I was doing pilates every day and the pilates instructor was this demon sent from the seventh circle of hell to torture us with her core wrath.
    • As for the other subgoals, I can manage a 15s or so handstand, but the 30s is not quite there yet. And I've stopped doing my Kegel exercises — the non-ejaculatory orgasm will have to wait.
  9. 🍽 Fasting
    • I love food too much to make myself not eat for a full day every month. So I'm going to stick to the 16/8 that has served me well for years.
  10. 🌏 Diet:
    • I've become progressively more and more vegetarian, and I think it's about time to make the full plunge.
  11. 👓 Myopia:
    • Nope. Didn't make progress here. But that's also because I stopped putting much effort into this.
  12. 👥 Relationships:
    • This is where I've had the most success. I've found a network of people doing the same things I'm doing, and I'm now doing the programs that will get me where I need to be. I have a research mentor and plenty of other guidance to help me along the way.
  13. 💰 Money:
    • Between the SERI MATS stipends and the FTX regrant (which I may ultimately have to repay), I'm doing well. My partner and I are financially and locationally independent.

My takeaways for next year are to set fewer goals and to allow myself more freedom within the goals (e.g., don't try to prescribe a list of the exact authors I'm going to read). I ended up constraining myself more than was useful, and setting goals for things that weren't actually priorities. Lessons learned.

2023 Planning

The Goal

My priority is solving alignment, and for now that means seeking out (1) a position with some financial padding so I can continue doing research and (2) mentorship so I can keep getting better at it. So either: a position at a research organization like Anthropic, OpenAI, DeepMind, etc. or a PhD position (with one of a handful of advisors doing actually relevant research).1

The Subgoals

The main way I'm going to get a research position is — no surprises — to do research. I'll be at SERI MATS for the next two months with explicitly this purpose.

In particular, I'll be doing research on path dependence and theory of deep learning under the guidance of Evan Hubinger. I'm aiming to publish (in conferences) two or three papers out of this work because you have to play the signaling game just a bit if you hope to succeed.

Afterwards, there's an option for an extension (~6 months). If I decide to stick it out with the industry route, I'm aiming to obtain a position by the end of SERI MATS. If I decide on the academia route (or if two months turns out to be a crazy, unrealistic timeline), I'll go with the extension.

My lesson from last year was to set fewer goals and offer myself more freedom within each goal (e.g., to avoid a reading list of exactly these and these authors).

Writing

I want to publish impactful research (or at least have content that I could publish if I decided it was worth it to go through the process of submission).

Since it's hard to measure impact (at least on a one-year timescale), let's stick to setting targets for the observables (and live with the true goal in mind)...

  • 📢 Publish(able) 3 papers. This seems a pretty conservative target considering I have 2 already in the works.
  • 📚 Launch Textbook on AI Safety. This has taken a seat on the back-burner for the last month, but it seems pretty important and valuable. I'm going to throw out the sections on "Foundations" and "Machine Learning", and work on the thing it's actually about.
  • 📝 Publish notes at least once a month. Something I want to get in the rhythm of is publishing high-quality notes. Let's be real, writing little blog articles is fun, but it's not the best thing I can be doing provided I can get my writing fill in other ways, such as publishing notes. Which is what I'll be doing.

Reading List

I'm going to throw out my specific "# of books" goals from previous years, though I will set some specific goals in terms of reading textbooks.

First, though, a definition of "reading textbooks."

Reading textbooks means skipping the content that doesn't matter, skimming the content that seems possibly somewhat relevant, and investing in the content that seems important (with multiple readings and problem sets). It doesn't mean actually reading end-to-end.

Currently in progress

  • Artificial Intelligence by Russell and Norvig
  • Reinforcement Learning by Sutton and Barto
  • The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman
  • Algebraic Geometry and Statistical Learning Theory by Watanabe
  • Pattern Recognition and Machine Learning by Bishop
  • An Introduction to Kolmogorov Complexity and Its Applications by Li and Vitányi
  • Scaling and Renormalization in Statistical Physics by Cardy

New

Stretch/Undecided

  • Category Theory by Awodey
  • Topology by Munkres
  • Radically Elementary Probability Theory by Nelson

Stretch Goals

  • 🇨🇳 Learn Mandarin. I like learning languages, and hobbies seem healthy even when the world is ending. It also seems valuable to make myself a future asset if world governments ever get their shit together to figure out AI policy.
  • 👓 Myopia. Actually reduce my diopters by 0.5 in both eyes.
  • 🏃 Moving. I'd like to balance out my exercise regimen a bit more. Right now, I'm going to hot pilates/yoga several times a week, which seems to get me what I need in terms of mobility/flexibility/core/cardio. I'd like to get some actual strength training into my regimen.
  • 💰 Money. I'd love to generate a bit of passive income. Obvious routes are selling content (some notes or lecture series) or some kind of (AI-driven) service.

Footnotes

  1. I'm avoiding the independent research route because I think the value of a strong group of peers and mentors is too high to be missed.

2022-M11

It's been three months since I decided to pivot to AI safety. Who knew three months could be such a long time?

FTX

I got an FTX Future Fund regrant for six months to help make the switch. Then FTX imploded, and it turns out my grant may be clawed back during the bankruptcy proceedings. Unfortunate. On the bright side1, FTX seems to have been such a mess that it could take years for the process to get to me. So if you have short enough timelines…2

Courses

  • The ARENA virtual program is going along smoothly.
  • This month, I joined SERI MATS to get my hands dirty in research. I'm officially under Evan Hubinger's Deceptive AI stream (though I'm also participating in John Wentworth's workshops). Yes, I'm reaching the limits of what I can juggle. We're also working our way through the Alignment 201 curriculum.3
  • When ARENA finishes up, I'm going to dedicate more of my attention to metauni (especially their track on singular learning theory (SLT)).
  • I've also finished most of the miscellaneous Alignment Forum sequences I wanted to go through (as well as AXRP and The Inside View).

The SERI MATS research sprint is about to start, and I'll be in Berkeley from January to at least February to work on this in person. Safe to say, I have way too many ideas for research projects, but I'm planning to focus on Toy Models of Superposition.

Distillation

  • I'm working on an introduction to SLT that should be out soon.
  • The video I'm working on with Hoog on AI risk is a little delayed because OpenPhil funding was paused (and I'm a little overextended but don't want to admit it), but it is coming along.
  • Next to all that, I've started working on an online, interactive AI safety textbook (very much a work in progress, more coming soon)4.

Textbooks

  • Mathematics for Machine Learning: Done. Great book. Highly recommend.
  • Pattern Recognition and Machine Learning: Completed up through chapter 9. I have 5 chapters to go. Instead of trying to bang these out in the next month to meet my original deadline, I'm going to push back my deadline by two weeks, so I have more time during the research sprint.
  • Cracking the Coding Interview: 6 more chapters to go. Like PRML, I'm going to push back my original deadline 2 weeks to mid-January.4
  • Reinforcement Learning: I've already gone through 6/17 chapters ahead of schedule. I'm aiming to be done by April.
  • Artificial Intelligence: A Modern Approach: Here too, I'm 3 chapters in, ahead of when I originally planned to get this started. This is a big book, so my (self-enforced) deadline is May 1st.

Outreach

  • I'm writing this on a plane to EAGxBerkeley. Judging from EAGxRotterdam, EAGxBerkeley is going to be great (as long as they leave out the food poisoning part).
  • I've started putting out some feelers for a longer-term project of establishing an AI-safety-oriented company in mainland Europe. There's a lot of early-career interest among very smart students. Give them a few years, and mainland Europe will be ripe for a new organization like Anthropic, Redwood, or Conjecture.

Conclusion

The next month is going to be hectic. I'll be in the Bay Area for a week and a half, then my parents' place in NY state for a week, then Michigan at my girlfriend's (fiancée's!) dad's, then NY for another week. Oh yeah, did I mention? I got engaged!5 On the day I publish this, it's our 5-year anniversary. Robin, I love you, and I can't wait to spend the rest of our lives together. (However long that is, doomer.)

Footnotes

  1. For me. I think it's safe to say the inconvenience for me is less than the inconvenience for people who lost their life savings.

  2. My timelines aren't actually that short. But I'm not worried about eventually being able to pay this back (even very soon with the SERI MATS stipend).

  3. I've come to conclude that I can safely skip Intro to ML Safety for now. Much of the content overlaps with these other programs/textbooks.

  4. I'm probably a bit newer to the field than would be ideal for this task, so I'm hoping to migrate to a more editorial role, delegating the bits that I can. I think my main strength here is more a kind of Olahian interactive distillation. That's an ability which seems to be pretty rare among active researchers.

  5. I proposed with an Oura ring, which definitely says something about the kind of people we are. Now that I think about it, I should have probably asked for a sponsorship and gotten the whole wedding funded by a late-stage capitalism PR department, but hey, hindsight is 20/20.