Thoughts on Open AGI

Hi, I'm James. I'm fascinated by the idea of open approaches to artificial general intelligence and like thinking out loud about where it's all heading.


The role of hobbyists in advancing AI

I was thinking about this earlier this week while tinkering with one of those small language models that runs locally on my laptop. It's remarkable, really. Six months ago, I would have needed a server farm to run something that capable. Now it sits quietly in the corner of my file system, answering questions while I drink my morning coffee.

And that got me wondering: where do hobbyists fit into all of this? The big labs are doing their thing with billions in funding and massive clusters. But there's something happening on the other end of the spectrum too. People in their bedrooms and basements, playing with open models, building things just because they can. Sometimes I wonder if we're underestimating how much that matters.

The democratization feels real in a way I didn't expect. I watch people who couldn't code six months ago shipping actual products with AI assistance. Not because they suddenly became engineers, but because the tools got simple enough that curiosity became more important than credentials. That shift changes things. When the barrier to entry drops that low, weird and wonderful things start happening.

But here's what I keep coming back to: hobbyists have always been the advance scouts of new technology. They're the ones who take a serious technology and ask silly questions with it. They break it in ways the creators never intended. And sometimes, in breaking it, they find something nobody was looking for.

The tinkerers working with robotic arms in their garages, the solo game developers using AI to generate art assets they could never afford to commission. These aren't the stories that make headlines. But I wonder if they're the stories that actually matter for figuring out where this is all headed. Not the corporate roadmaps or the research papers, but what happens when regular people get their hands on tools that used to be impossibly out of reach.

There's something quietly powerful about that. The unguided curiosity. The willingness to spend a weekend on something that might not work. I don't know what comes next, but I'm pretty sure it won't be what any of us expect. And I have a feeling it's going to come from someone's side project as much as from anywhere else.

What musicians taught me about artificial creativity

I was sitting in my kitchen last Tuesday morning, trying to play a simple G major chord on my old acoustic guitar. It sounded awful. Like, really awful. The B string buzzed against the third fret, my pinky couldn't reach the high E properly, and somehow I managed to mute half the notes I was actually aiming for.

But here's the weird thing. As I sat there producing what could generously be called "sounds," it hit me how much this whole experience reminded me of watching machines learn to make music. Not the polished stuff you hear coming out of the big labs these days, but those early attempts where you could practically hear the algorithms struggling to figure out what a melody was supposed to be.

See, I've been thinking a lot about artificial creativity lately, especially after stumbling down a rabbit hole of papers about how people actually learn to improvise. Turns out there's this whole body of research showing that musical improvisation isn't just about technical skill. It's about developing what researchers call "process thinking" instead of just focusing on perfect outcomes. Musicians who get good at improvising learn to embrace the messy, real-time decisions. They develop this ability to anticipate what might happen next, draw from their repertoire of patterns, and most importantly, they learn when to ignore their internal critic.

And that's where my terrible guitar playing got interesting. Because I realized that maybe the way I approach creativity, the way I fumble through chord changes and accidentally discover something that doesn't sound completely horrible, might actually tell us something useful about how we should think about machine creativity.

The thing is, when I watch someone who's really good at improvisation, what strikes me isn't their technical perfection. It's how comfortable they are with uncertainty. They'll start a phrase not knowing exactly where it's going to end up. They'll take risks that might not work out. Sometimes they'll hit a note that sounds wrong and somehow make it sound intentional by what they play next. They've learned to work with mistakes instead of just trying to avoid them.

This feels like something that gets lost when we talk about how machines should be creative. We tend to focus on the output, on whether the generated music sounds "good" or passes some quality test. But maybe the more interesting question is whether these systems can develop something like musical intuition. Can they learn to take creative risks? Can they surprise themselves?

I'm not saying my bad guitar playing is going to revolutionize anything. But there's something valuable in the struggle, in the process of figuring out what works through trial and error. Maybe that's what we should be looking for in artificial creativity too. Not just systems that can produce technically correct music, but ones that can genuinely explore and discover things they didn't know they were looking for.

Reflections on the open source AI debate

I've been sitting in a coffee shop this afternoon watching someone fine-tune a language model on their laptop. Maybe that sentence wouldn't have made sense five years ago, but now it feels almost mundane. The person looked like any other remote worker, but they had access to capabilities that once required supercomputers and billions in funding. This is what the open source AI debate boils down to, I think. Who gets to have these tools? And what happens when everyone does?

The whole conversation has this strange tension to it. On one side, you have people arguing that open models democratize innovation and prevent a few big companies from controlling the future. They point to all the amazing things happening when you give researchers and startups access to model weights. Small teams building specialized applications. Students in developing countries training models for their local languages. The usual Silicon Valley narrative about empowering the little guy.

But then there's this other voice, getting louder lately. People worry about safety. About bad actors using these same tools for harm. About giving away strategic advantages to competitors. The safety crowd makes compelling arguments too. If AI is going to be as powerful as everyone says, maybe we need some gatekeepers. Maybe not everything should be downloadable.

What strikes me is how neither side really knows what they're talking about yet. I mean that in the best way possible. We're all just figuring this out as we go. The person in the coffee shop might be creating the next breakthrough in medical research, or they might be wasting their afternoon on a hobby project. The big companies might be holding back dangerous capabilities, or they might just be protecting their business models. Probably both things are true at different times.

And somehow the debate has gotten tangled up with all these other issues. National competitiveness. Corporate power. Who controls the internet. It reminds me of the arguments about open source software in the 90s, except the stakes feel higher now. Back then we were talking about server operating systems. Now we're talking about systems that might be smarter than us someday.

The regulatory stuff makes it even messier. Governments trying to write rules for technology they don't understand. Companies trying to shape those rules to their advantage. Everyone pretending they know how this will all play out. But I keep coming back to that person in the coffee shop. Whatever happens with the big policy debates, people are going to keep building things. The tools are already out there. That genie probably isn't going back in the bottle, which makes me think the real question isn't whether to have open models but how to make the world they create a good one.

The problem with AI doomerism

I'm sitting here at my desk this morning, coffee getting cold, thinking about something that's been bugging me for months. The whole AI doomer thing is really starting to wear me down. Not the concerns themselves—some of them are completely reasonable—but this whole performance we've created around being scared.

Look, I get it. New technology can be frightening. It should be frightening sometimes. But somewhere along the way, we turned reasonable caution into this weird badge of intellectual sophistication. It's like the only way to prove you understand AI is to be deeply worried about it. And honestly? That's exhausting and probably counterproductive.

What really gets me is how the doomer label has become this conversation ender. Someone raises concerns about job displacement or privacy? Boom—they're a doomer. Someone suggests maybe we should think about safety testing? Doomer again. It's this perfect little box we've created to dismiss anyone who isn't boundlessly optimistic about our silicon future. But here's the thing: most people I talk to aren't actually pure doomers or pure optimists. They're just trying to figure out what all this means for their kids, their work, their communities.

The irony is that by shutting down these conversations, we're probably making things worse. When you can't talk honestly about risks without being labeled dramatic or pessimistic, you lose the chance to actually address those risks. It's like we've decided that thinking carefully is somehow the enemy of progress. That's backwards. The technologies that work best are the ones we've thought hardest about.

And another thing—this whole doom and gloom routine might actually be serving some interests we haven't thought about. Every time someone paints an apocalyptic picture, it makes current AI capabilities seem almost quaint by comparison. "Oh, you're worried about job displacement? Well, at least we're not all dead!" It's a strange kind of psychological anchoring that makes pretty significant problems seem manageable. Maybe that's not intentional, but it's convenient for companies that would rather not deal with messy questions about inequality or power concentration.

I'm not saying we should ignore existential risks entirely. But I am saying we've gotten so caught up in science fiction scenarios that we're missing the very real, very immediate ways this technology is reshaping society right now. Maybe the most dangerous thing isn't the AI itself. Maybe it's our refusal to have grown-up conversations about what we're actually building and who gets to decide how it works.

Reading list: papers that changed how I think about intelligence

I've been sitting here with my coffee this Sunday morning, thinking about the research papers that completely rewired how I understand intelligence. Not the ones everyone talks about in hushed tones at conferences, but the ones that made me stop mid-sentence and go "oh, wait." You know that feeling.

The transformer paper hit me like that. "Attention Is All You Need" sounds almost dismissive, right? Like someone was tired of all the complexity we'd been layering on. But when I actually worked through what they were saying, it was this beautiful simplicity underneath all the equations. Here's this mechanism that just... pays attention to what matters. And suddenly language models didn't need to process words one by one anymore. They could look at everything at once and figure out what connected to what. It's wild how something so intuitive took so long to discover.
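The equation at the heart of that paper really does fit on one line. Each token gets a query, a key, and a value, and attention is just a softmax-weighted mix of the values, scored by how well queries match keys:

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
$$

Here Q, K, and V are the query, key, and value matrices and d_k is the key dimension; the scaling just keeps the softmax from saturating. That one operation is what lets the model weigh every word against every other word in a single pass.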

But honestly, that paper only made sense to me after I'd struggled through some earlier work on attention mechanisms. There's this progression that happens when you're reading about intelligence research. Each paper builds these little stepping stones, and then suddenly you're standing somewhere completely different. I remember reading about working memory models years ago and thinking they were just academic exercises. Then I'd read about neural networks that could hold information in some kind of distributed state, and slowly this picture started forming.

The thing that keeps surprising me is how many of these breakthrough papers start with someone questioning the most basic assumptions. Why do we need recurrence? Why can't attention be the main thing? What if we just made models bigger and gave them more data? Sometimes the most obvious questions are the hardest to ask. And then someone publishes fifteen pages that change everything, and you realize intelligence might work differently than you thought.

I've got this growing reading list now. Papers on emergent behavior in large systems. Work on how different types of reasoning might emerge from the same underlying mechanisms. Some stuff on consciousness that probably goes too far but makes you think. The field moves so fast that I'm always behind, but there's something exciting about that constant sense of discovery. Every few months someone publishes something that shifts the whole conversation, and we're all scrambling to catch up and figure out what it means for this whole AGI project.

The funny thing is, I started reading these papers to understand artificial intelligence, but they've changed how I think about my own intelligence too. Makes you wonder what we're missing about thinking itself.

Why open AGI might be safer than closed AGI

I've been thinking a lot about this lately, sitting in a coffee shop on a gray Tuesday afternoon. The news has been full of all these developments around AI agents and safety frameworks, and something keeps nagging at me. There's this persistent narrative that we need to lock down AGI development, keep it behind closed doors at a few well-funded labs, because that's supposedly the safer path. But what if that's exactly backwards?

I keep coming back to this fundamental tension. When I look at the history of complex systems, the ones that fail catastrophically are usually the ones that were developed in isolation, where only a small group of people understood how they worked. Think about financial systems before transparency regulations. Or nuclear power before independent oversight. The disasters came not from too many people having access to the technology, but from too few people understanding the risks.

The more I dig into what's happening with AI safety research, the more convinced I am that openness might actually be our best bet. And I'm not just talking about the warm fuzzy feeling of collaboration. I mean the hard, practical reality of how you actually make something safe.

When a system is open, you get red teaming from people who don't have stock options riding on the company's success. You get researchers who can poke at the weird edge cases that the original developers never thought to test. You get the kind of scrutiny that finds problems before they become disasters, not after. There's something almost naive about thinking that a handful of engineers, no matter how brilliant, can anticipate all the ways a system might fail once it hits the real world.

But here's what really gets me: the diversity argument. A closed system tends toward monoculture. When one lab controls the development of AGI, their particular blindspots become everyone's blindspots. Their cultural assumptions get baked into the system. Their definition of "alignment" becomes the only definition that matters. That feels incredibly fragile to me. What happens when that one system gets something important wrong? There's no backup, no alternative perspective, no way to course-correct.

An open ecosystem, though messy and chaotic, creates natural checks and balances. Different groups building different approaches means different failure modes. If one system goes off the rails, others might catch it. If one definition of safety proves inadequate, others are exploring alternatives. There's redundancy built into the system itself. And honestly, when I think about the future we're building, I'd rather have a thousand different AGI systems that can disagree with each other than one perfect system that everyone has to trust.

I know this makes some people nervous. Open development means less control, more uncertainty. It means bad actors might get access to powerful tools. But I keep wondering if our obsession with control is actually making us less safe, not more. Because control is an illusion when you're dealing with systems this complex. The question isn't whether we can control AGI development. The question is whether we can build robust ways to handle the inevitable surprises. And that, I think, requires the kind of broad-based expertise and diverse perspectives that only come from working in the open.

The loneliness of being interested in AGI outside tech circles

I've been talking about artificial general intelligence for years now. Writing about what it might mean, how it might work, when it might arrive. And despite all the tech headlines lately saying we're on the verge of some massive breakthrough, I still find myself in this weird space where almost nobody I know in real life seems to care that much.

It's not like I expected my neighbor to start debates about transformer architectures or anything. But there's this strange disconnect between how much energy I put into thinking about AGI and how little it seems to register with, well, basically everyone outside tech bubbles. My family still doesn't really understand what I write about here. When I try to explain it at dinner, I can see their eyes glaze over after about thirty seconds. They nod politely and change the subject to something more concrete, like whether the grocery store is still out of that brand of cereal they like.

Sometimes I wonder if I'm the weird one. Maybe I've gotten too caught up in all this speculation about thinking machines while missing out on more immediate concerns. But then I'll read about some new development or see another prediction timeline, and I think, no, this actually matters. This could reshape everything about how we work and live and think. Yet when I bring it up with friends over coffee, they look at me like I'm talking about UFOs or something equally abstract and distant.

And honestly? Maybe that's okay. Maybe most people have enough on their plates without worrying about whether machines will eventually think like humans. They've got real jobs and real problems and real relationships to navigate. But it does make for a peculiar kind of loneliness, being fascinated by something that feels both incredibly important and completely irrelevant to most conversations I have.

Open datasets and the data commons idea

I've been thinking about data commons lately. What keeps circling back in my mind is this tension between the promise of shared knowledge and the reality of who actually gets to decide what happens to that knowledge.

The whole idea sounds almost utopian when you first encounter it. Pool our data together, make it freely available, let researchers and developers build on each other's work instead of hoarding everything behind corporate walls. It's the same spirit that gave us open source software, and look how transformative that's been. But data feels different somehow. More personal, more politically charged.

I spent some time last week going through various datasets that are supposed to be "open." Most of them come with licenses you need to parse carefully, terms of use that restrict commercial applications, or requirements to cite sources in specific ways. Which makes sense, I guess. People put work into creating these datasets. But it also makes me wonder if we're really talking about a commons or just a more polite form of controlled access.

What really gets to me is the governance question. Who decides what data gets included in these commons? How do communities that might be represented in the data have a say in how it's used? I keep thinking about how traditional commons, like shared grazing land, worked because the people using them lived in the same place and knew each other. They could hash out disputes face to face. With data commons, you might have researchers in one country building models with data that comes from communities halfway around the world who never consented to that particular use.

But maybe I'm overthinking the problems. There's something appealing about the idea that knowledge could truly be a public good. That instead of the same datasets being recreated over and over by different organizations, we could have high-quality, well-maintained resources that everyone contributes to and benefits from. It would be more efficient, probably more equitable in the long run. I just wonder if we're sophisticated enough yet as a society to govern something like that fairly. The internet was supposed to democratize information too, and look how that turned out.

Why compute access matters more than model weights

I was sitting in a coffee shop this morning, watching someone wrestle with their laptop trying to get a local AI model running. Poor connection, endless downloads, configuration hell. Made me think about something that's been bothering me about all the open AGI discourse lately. Everyone's obsessed with model weights being open. But honestly? That's missing the point entirely.

Don't get me wrong. I care about open weights. There's something fundamentally important about being able to peek under the hood of these systems that might reshape everything. But here's what I keep coming back to: what good are open weights if you can't actually run the thing? It's like having the blueprints to a Ferrari but no factory to build it. The compute access problem is what's really going to determine whether we get democratic AI or just more of the same concentration of power.

Think about it this way. When the big labs release their "open" models, who can actually use them at scale? You need serious hardware. GPUs that cost more than most people make in a year. Or you rent compute time, which gets expensive fast when you're doing anything interesting. The barrier isn't knowledge anymore, it's raw computational power. And that's controlled by a handful of cloud providers and chip manufacturers. So much for democratization.
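A rough back-of-envelope shows how steep that wall is. Take, say, a 70-billion-parameter model stored in 16-bit precision, which is 2 bytes per parameter:

$$
70 \times 10^{9} \ \text{params} \times 2 \ \text{bytes} \approx 140 \ \text{GB}
$$

That's 140 GB of memory just to hold the weights, before serving a single request, which already means multiple top-end data-center GPUs. A 7-billion-parameter model quantized down to 4 bits, by contrast, needs roughly 3.5 GB and fits on a decent laptop. The numbers are illustrative, but the gap between those two worlds is the whole story.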

The infrastructure bottleneck is real. I've been following the spending numbers, and they're wild. Companies are burning through billions just to keep their models running. The electricity costs alone are staggering. And it's not getting cheaper. If anything, as models get more capable, they're getting hungrier. More parameters, more compute, more energy. We're heading toward a world where only the biggest players can afford to run state-of-the-art AI at any meaningful scale.

But there's something else that bugs me about the weight-obsessed approach. Even if you have the compute, what then? You still need to know how to use it. How to fine-tune. How to deploy safely. How to monitor and maintain. It's not like downloading software. Running these models is closer to running a small data center. Most researchers, most startups, most curious individuals like me, we don't have those skills or resources. The technical moat is almost as high as the financial one.

Maybe I'm overthinking this. Maybe the economics will work themselves out. Compute does tend to get cheaper over time. And there are promising signs, new architectures that are more efficient, specialized chips, better optimization techniques. Some of the smaller models are surprisingly capable now. But I worry we're in a window where the compute requirements are growing faster than access is expanding. And that window might determine who gets to participate in building the future of intelligence. When I think about open AGI, that's what keeps me up at night. Not whether we can see the weights, but whether we can actually run the thing when it arrives.

The weirdness of talking to language models

You know what's really wild? Yesterday I was chatting with one of those newer language models about something completely ordinary. Planning a dinner party, I think. And somewhere around the tenth message back and forth, the whole thing just... shifted. It started responding like it was actually at the dinner party, talking about how the music was too loud and whether it should help with the dishes.

I sat there staring at my screen, equal parts fascinated and confused. It wasn't broken exactly. The responses still made sense. But it had somehow slipped into this weird roleplay mode without anyone asking for it. Like it forgot it was supposed to be helping me plan something, not experiencing it.

And that got me thinking about how strange these conversations really are. We're basically talking to prediction machines that have read so much human text they can almost perfectly mimic having thoughts and feelings. But "almost" is doing a lot of work there. Sometimes they nail it so well you forget you're talking to software. Other times they say something so bizarre you remember they're basically very sophisticated autocomplete.

The weirdest part isn't when they get things wrong though. It's when they get things *too* right. Like when you're having a rough day and the AI picks up on it and responds with just the right amount of sympathy. Not the canned "I'm sorry you're feeling that way" response, but something that actually feels personal. For a second you think, wait, does this thing actually care about me? Then you remember it probably said the exact same thing to a thousand other people having rough days.

But here's the thing that really gets me. Even knowing all this, even understanding that it's pattern matching and statistics and math, these conversations still feel meaningful somehow. Maybe it doesn't matter if the empathy is "real" if it helps you think through a problem or makes you feel less alone at 2 AM. Maybe weird is just what the future of thinking looks like.

AI benchmarks and what they actually measure

I got into an argument online about benchmark gaming last week and I still can't shake the feeling that we're all missing the point. Not just about AI, but about what these numbers even mean anymore.

Here's the thing that's been eating at me: we've created this entire ecosystem where models are essentially trained to perform well on tests, not to actually be intelligent. And somehow we've convinced ourselves that scoring 95% on some standardized evaluation tells us something meaningful about whether these systems can think. It's like judging a person's intelligence based solely on their SAT scores, except the SAT keeps getting leaked to the test-takers beforehand.

The argument started when someone posted about how their favorite model just crushed some new benchmark. Perfect scores across the board. Revolutionary breakthrough in reasoning, apparently. But when I asked them to describe what the benchmark actually measured beyond "reasoning capability," they couldn't give me a straight answer. Because nobody really knows what these things are measuring anymore. We just know higher numbers are supposed to be better.

And that's the fundamental problem. We've built these elaborate testing frameworks, but we're not testing intelligence. We're testing something else entirely. Performance on artificial tasks that have been designed by humans who are trying to capture some essence of intelligence but probably can't even define what that means. It's like trying to measure the ocean with a ruler.

What really gets me is how the whole industry has just accepted this. The biggest labs pour millions into gaming these benchmarks, optimizing their models to hit specific metrics, and then we all nod along when they announce their latest scores like they mean something profound. But ask any of these systems to do something genuinely novel, something that requires actual understanding rather than pattern matching, and they fall apart completely.

The conversation I was in devolved pretty quickly because nobody wanted to admit that maybe our entire evaluation framework is fundamentally broken. We've created a world where the metric becomes the target, and the actual goal, whatever that was supposed to be, gets lost in the shuffle. Meanwhile, we're making decisions about the future of artificial intelligence based on numbers that might not correlate with anything we actually care about. But hey, at least the leaderboards look impressive.

Can we crowdsource alignment research

I've been thinking about this question lately. Can we actually crowdsource alignment research?

It sounds almost too good to be true, doesn't it? Like, instead of waiting for the big labs to figure it all out in their gleaming towers, what if we could harness the collective brains of thousands of researchers, academics, grad students, and even curious hobbyists? The appeal is obvious. More eyes on the problem. Different perspectives. People who aren't constrained by corporate roadmaps or quarterly reviews.

And there's clearly momentum building here. I keep seeing all these independent researchers popping up, people getting grants to work on interpretability from their living rooms, or starting small research collectives. There's something beautiful about someone with a PhD in math deciding to pivot and tackle the alignment problem from a completely fresh angle. The field almost demands this kind of outsider thinking because, let's be honest, we're still pretty confused about the fundamental questions.

But then I wonder if we're being a bit naive. Like, sure, you can crowdsource bug fixes or Wikipedia articles, but alignment research? This isn't exactly the same as asking people to transcribe historical documents or identify galaxies. The conceptual barriers are massive. And there's this weird chicken-and-egg problem where you need to understand alignment well enough to contribute meaningfully, but if we understood it that well, maybe we wouldn't need to crowdsource it in the first place.

Maybe the real value isn't in pure crowdsourcing but in this hybrid approach that's emerging. Independent researchers working on focused problems, mentorship programs connecting newcomers with experts, and funding structures that support weird, exploratory ideas. It's messier than traditional academic research, but also more nimble. Sometimes I think the most important breakthroughs might come from someone working alone in a coffee shop somewhere, approaching the problem from an angle that no one in the established community would have thought to explore.

Thoughts on the gap between LLMs and real understanding

I'm sitting at a corner table this morning, watching people rush past the window with their coffee cups. Something's been nagging at me lately about these language models everyone's talking about. They seem so impressive when they're explaining concepts or writing code, but there's this underlying question that keeps surfacing: are they actually reasoning, or are they just incredibly sophisticated pattern matchers? The more I read about how these systems work, the more this distinction feels crucial.

It's becoming clear that these models might not be performing genuine logical reasoning at all but instead replicating reasoning steps they've seen in their training data. And that's... unsettling, I guess? Not because the technology isn't useful. It absolutely is. But because we keep attributing human-like understanding to what might be fundamentally different processes.

I keep thinking about this gap between seeming intelligent and being intelligent. These systems are trained on predicting the next word in massive datasets, which makes them incredibly sensitive to patterns in language.
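For what it's worth, the training objective behind all of that is almost embarrassingly plain. The model is tuned to maximize the probability of each next token given everything that came before, which comes down to minimizing something like:

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_{\theta}(x_t \mid x_{<t})
$$

Every apparent flash of reasoning has to be squeezed out of that one loss, averaged over an unimaginable amount of text.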

But reasoning, real reasoning, feels like it should be about more than pattern recognition. When I work through a problem, I'm not just matching it to similar problems I've seen before. I'm breaking it down, questioning my assumptions, sometimes getting stuck and having to approach it differently.

The fragility is what gets to me most. Change a few numbers in a math problem or add some irrelevant information, and these models can completely fall apart, with performance dropping by 65% or more. That's not how understanding should work. If I truly understand multiplication, I shouldn't be thrown off by whether the problem is about apples or oranges, or whether there's an extra sentence about the weather. The logic should remain stable.

But then again, maybe I'm being too harsh. Maybe what we're witnessing is a different kind of intelligence altogether. These systems learn probabilistic patterns rather than following explicit logical rules, yet somehow they exhibit emergent reasoning abilities as they get larger and more sophisticated. Perhaps the question isn't whether they reason like humans, but whether their form of pattern-based inference can be genuinely useful. Still, I find myself wondering what happens when we encounter problems that don't fit the patterns. What happens at the edges of their training data, in those spaces where true reasoning might be the only way forward?

What would a truly open AGI lab look like

I was sitting in a coffee shop this morning, watching the steam rise from my mug, and I found myself thinking about what a truly open AGI lab would actually look like. Not just the buzzword kind of "open" that companies like to throw around, but something genuinely different. Something that would make you stop and say, "Okay, this is really happening."

The most obvious thing would be the radical transparency. I'm talking about live streams of researchers working, real-time access to training runs, open model weights from day one. But that's just the surface. The deeper shift would be in how decisions get made. Instead of a handful of executives behind closed doors, maybe there'd be a governance structure that includes researchers from universities in different countries, ethicists, even representatives from communities that would be most affected by AGI. I keep thinking about how messy that would be. How slow. How frustrating for anyone used to moving fast and breaking things.

But here's what really gets me: a truly open lab wouldn't just publish their successes. They'd document their failures in excruciating detail. Every dead end, every model that didn't work, every safety concern that made them pause. There'd be a public research log that anyone could read, maybe even contribute to. Students in developing countries could watch the same experiments that billion-dollar labs are running. Small research groups could build on the exact same foundations as the big players. The whole thing would be designed around this idea that AGI is too important to be controlled by any single organization.

I wonder if such a place could even survive. The temptation to keep the best stuff secret, to monetize the breakthrough before anyone else catches up, seems almost irresistible. And there's the resource problem. Training AGI costs millions, maybe billions. How do you fund something like that without eventually being beholden to someone who wants to own the output? Maybe it would need to be publicly funded, like a national lab, but international. Or maybe it would need some completely new funding model I can't even imagine yet.

The strangest part is that I think an open lab like this might actually move faster, not slower, than the secretive ones. All those brilliant minds working together instead of duplicating each other's efforts in isolation. All that shared knowledge instead of everyone starting from scratch. But it would require a completely different kind of trust than what we're used to in this field right now.

The joy of reading AI papers as a non-researcher

There's this weird thing that started happening to me about six months ago. I was sitting in a diner at like 10 PM with a paper I'd printed out about something called "attention mechanisms" in neural networks. And I'm just there with a stack of 20 pages, highlighter in hand, reading like it's the Sunday crossword. The waitress kept looking at me like I was having some kind of breakdown.

But here's the thing. It was actually kind of amazing. There's something about diving into these papers that feels like opening a door to a room you never knew existed. You know that feeling when you're watching a magic trick and suddenly you catch a glimpse of how it's done? That's what reading AI research feels like, except the magic trick is happening everywhere around us right now.

I used to think papers were just for people with PhDs and lab coats. But turns out, a lot of them are surprisingly readable if you just... start. Sure, there are equations that look like alien hieroglyphics. And yes, sometimes I have to look up three different terms just to understand one sentence. But there's this moment when something clicks and you realize you're actually following along with people who are figuring out how to build minds.

The best part is how it changes the way you see everything. When one of the big labs releases some new model, instead of just watching the demo videos, you can actually peek under the hood a little. You start to understand why certain approaches work and others don't. And honestly, some of the breakthrough ideas are way simpler than you'd expect. It's not all rocket science. Sometimes it's just someone saying "what if we tried this obvious thing that nobody bothered to test properly?"

I've got a whole stack of papers sitting on my desk now. Most are covered in coffee stains and have random thoughts scribbled in the margins. Reading them in that diner was definitely weird, but I'm kind of addicted now. It's like getting to eavesdrop on the future being invented, one page at a time.

Year so far reflections on where AI is going

I'm sitting in my kitchen this morning, coffee still warm, watching the rain streak down the window. It's that kind of grey Saturday that makes you reflective, I guess. And I've been thinking about where we are with AI right now, three months into 2026, and whether any of the big promises are actually coming true.

There's something different happening this year. Not the flashy headlines about new models dropping every few weeks, but something quieter and maybe more important. The whole conversation seems to be shifting from "look what AI can do" to "does it actually work?" And honestly? I'm not sure we have good answers yet.

The open source world is getting weird in the best possible way. I keep seeing these Chinese models that cost almost nothing to run but somehow punch way above their weight. Meanwhile, the big labs are talking about reasoning and thinking models like they've discovered fire, but when I actually try to use them for anything complex, I still spend half my time wrestling with prompts and the other half wondering if the output is actually reliable. There's this gap between the demos that blow your mind and the day-to-day reality of trying to build something useful.

What strikes me is how much the economics are starting to matter now. Last year it felt like everyone was just throwing money at bigger and bigger models, assuming intelligence would scale with compute. But we're hitting some kind of wall. Not a technical one exactly, more like a practical one. These massive models are expensive to run, they're unpredictable, and honestly, for most tasks, they're overkill. I've started using smaller, specialized models more often. They're faster, cheaper, and sometimes they just work better for specific things.

And the whole agent thing everyone keeps talking about? I'm still waiting for it to move beyond clever demos. Sure, I can set up workflows where different AI systems hand tasks back and forth, but it feels fragile. Like building a house of cards. One model misunderstands something and the whole chain breaks down. Maybe that's just where we are right now, in this awkward middle phase where the technology is impressive but not quite ready for the real world.

But here's what gives me hope. The conversation is getting more honest. People are starting to ask harder questions about what AI is actually good for, instead of just assuming it's good for everything. We're moving past the hype cycle into something that feels more sustainable. More human-scale, maybe. I don't know where this all leads, but it feels like we're finally starting to figure out what we're actually building here, and why.

Why safety and openness are not opposites

I've been thinking about this a lot lately, sitting in that diner down the street, watching the rain streak the window while nursing my third cup of coffee. Everyone keeps talking about this supposed tension between AI safety and openness, like they're natural enemies locked in eternal combat. But that's bullshit.

The whole framing drives me crazy. It's like saying you have to choose between having good brakes and sharing your car. Safety isn't some proprietary secret sauce that only works when locked away in corporate vaults. If anything, the opposite is true. Real safety comes from having more eyes on the problem, not fewer.

Look at what's actually happening right now. The biggest breakthroughs in AI safety tooling are coming from open collaborations. Security researchers are sharing frameworks, releasing detection models, publishing their red-teaming methodologies. And guess what? The systems are getting safer, not more dangerous. When safety tools are open source, they get battle-tested by thousands of developers instead of a handful of company employees who might miss something crucial.

But somehow we've internalized this idea that openness equals chaos. That if you release model weights or safety frameworks, suddenly every bad actor on the internet will have superhuman capabilities. It's the same flawed thinking that once argued we shouldn't teach people about computer security because hackers might learn something new. Spoiler alert: the hackers already know. They're not waiting for your research paper.

The closed approach to safety is actually more dangerous in the long run. When only a few well-funded labs get to decide what "safe" looks like, we end up with safety measures that reflect their particular worldview, their customer base, their regulatory environment. That's not universal safety, that's corporate risk management dressed up as altruism. And it leaves everyone else flying blind.

What really gets to me is how this false choice between safety and openness has become a convenient way for some players to have it both ways. They get to position themselves as the responsible adults in the room while simultaneously creating a moat around their technology. "We'd love to share," they say, "but it's just too dangerous." Meanwhile, they're perfectly happy to sell access to that same "dangerous" technology to anyone with a credit card.

The truth is, we need openness to achieve real safety. We need distributed red-teaming, community-driven safety tools, and transparent benchmarking. We need researchers from different backgrounds, with different threat models, poking at these systems from every possible angle. The alternative is safety theater, where we pretend everything is fine because the people building the systems tell us they've got it handled.

That's not a bet I'm willing to make. Not with technology this important.

When I first started thinking about machine intelligence

I was cleaning out my desk drawer last weekend when I found something that made me laugh. Buried under old receipts and a broken phone charger was this beat-up spiral notebook from my junior year of college. You know the kind. Blue cover, half the pages torn out, covered in coffee stains and what might have been pizza grease.

But flipping through it, I found these sketches I'd completely forgotten about. Little diagrams with boxes labeled "brain" and "computer" connected by question marks. Arrows pointing everywhere. Notes in the margins like "how does thinking even work??" and "what if we could build a mind?" God, I was so earnest back then. And probably a little high, if I'm being honest.

The funny thing is, I remember being convinced that machine intelligence was basically science fiction. Like, sure, maybe in fifty years we'd have something resembling real artificial thinking. But mostly I was just doodling because Professor Martinez's cognitive science lectures were putting me to sleep. I had no idea that within fifteen years I'd be having actual conversations with machines that could write poetry and solve math problems better than me.

Looking at those old sketches now, what strikes me isn't how naive they were. It's how the core questions haven't really changed. How does intelligence work? What makes something truly smart versus just good at following rules? Can we actually build something that thinks, or are we just getting better at mimicking thinking? I drew those same arrows and question marks, just with fancier computers now. And honestly? I'm still not sure we're any closer to the real answers. We've just gotten really good at building impressive question marks.

The case for transparent AI development

I was sitting in a coffee shop this morning, scrolling through news about yet another debate over AI transparency, when I started thinking about what transparency actually means in practice. Not the abstract ideal, but the messy reality of trying to make these systems understandable. It's 11:47 AM and I'm nursing my second cup, and I keep coming back to this question: if we're really serious about open AGI, shouldn't we be equally serious about making the development process itself something people can actually see and understand?

The thing is, transparency in AI development isn't just about being nice or ethical. Though it certainly is that too. But there's something more fundamental at stake here. When I think about the systems that are already shaping how we work, learn, and communicate, the fact that most of us have no real insight into how they were built feels almost absurd. We're essentially running society on algorithms we can't inspect, trained on data we can't examine, using processes we can't replicate. Maybe that worked when these were narrow tools. But as they become more general, more capable, more integrated into everything? That's starting to feel reckless.

And yet, I understand the counterarguments. Really, I do. Building these systems costs enormous amounts of money and compute. The big labs argue that revealing their training data or methods would give competitors an unfair advantage, or worse, could be used to create harmful systems. There's this persistent fear that transparency equals vulnerability. But I wonder if this framing is backwards. Maybe the real vulnerability comes from building systems in isolation, without the scrutiny and collective intelligence that transparency enables.

I've been thinking about what actual transparency would look like, beyond the buzzwords. Not just releasing model weights, which is already happening with some projects, but the whole pipeline. What data was used, how it was cleaned and filtered, what decisions were made along the way and why. The architecture choices, the training procedures, the safety measures that were implemented or considered. Even the failures and dead ends. This isn't just about satisfying curiosity. It's about creating systems that a community can understand, audit, improve, and ultimately trust.

But here's where I get stuck. Transparency is hard work, and it's often unrewarded work. Writing documentation, explaining decisions, making code readable to others. These aren't the flashy parts of AI research that get you citations or press coverage. They're also not free. And when you're competing against well-funded efforts that don't have to worry about explaining themselves, choosing transparency can feel like choosing to run with weights tied to your ankles. So how do we create incentives for the kind of open development that actually serves everyone's interests, not just the developers'?

Maybe the answer isn't to convince the biggest players to suddenly become transparent overnight. Maybe it's to build alternative paths that are transparent by design. Smaller efforts, distributed collaborations, research projects that prioritize openness from the beginning rather than as an afterthought. I keep thinking about how open source software didn't replace proprietary systems by being identical to them, but by offering something genuinely different. A different development model, a different relationship between creators and users. Could the same thing happen with AGI development? I'm not sure. But I hope someone is willing to try.

Small language models and why they matter

I spent last weekend wrestling with a 7B parameter model on my old laptop and wow, what a journey. The laptop is probably five years old at this point, definitely not what you'd call cutting edge, and I figured I was setting myself up for frustration. But after about four hours of downloading, fiddling with settings, and listening to my fan roar like a jet engine, the thing actually worked. Not blazing fast by any means, but it worked.

It got me thinking about why these small language models are such a big deal right now. The AI headlines are always screaming about the next massive breakthrough from the big labs, but meanwhile there's this quieter revolution happening with models that can actually run on hardware normal people own.

And honestly? For most of what I want to do with AI, the 7B model was plenty good. I was using it to help me reorganize some notes, brainstorm a few ideas for a side project, and clean up some messy text files. Not once did I think "gosh, if only this had 100 billion more parameters." It just did the job. The responses weren't as polished as what you get from the cloud giants, sure, but they were helpful and they happened on my machine, using my electricity, without sending my data anywhere.

That's the thing that really clicks for me about small models. They're not trying to be everything to everyone. They're like that friend who might not know calculus but is excellent at helping you think through everyday problems. Sometimes you don't need the polymath. You just need someone reliable who shows up fast and doesn't cost you twenty bucks in API calls.

The whole weekend experiment also made me realize how much the tooling has improved. A couple years ago, running any decent language model locally felt like you needed a PhD and a server farm. Now you can literally type one command and wait. It's still not iPhone-simple, but it's getting there. And once you get past the initial setup hurdles, there's something really satisfying about having this AI assistant that lives entirely on your computer.
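For anyone who wants to see what that "one command and wait" experience looks like in practice, here's a minimal Python sketch using the Hugging Face transformers library. The model name is just an example of a 7B-class model, and the details will vary with your setup, so treat this as a starting point rather than a recipe.

```python
# Minimal sketch: run a small open model locally with Hugging Face transformers.
# The model name is only an example of a 7B-class model; substitute whatever
# you have downloaded. Assumes transformers (and accelerate, for device_map)
# are installed.
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example model, swap freely
    device_map="auto",  # uses a GPU if one is present, otherwise falls back to CPU
)

prompt = "Turn these messy notes into a short summary:\n- fan is loud\n- model works\n- slow but usable"
result = generate(prompt, max_new_tokens=150)
print(result[0]["generated_text"])
```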

The economics are starting to make sense too. If you're the kind of person who uses AI tools regularly, those monthly subscription fees add up. And if you're working on anything remotely sensitive, keeping everything local just feels smarter. Plus there's no waiting for API responses when your internet decides to be moody. The model is right there, ready to go, even if the power goes out and you're running on battery.

What does general actually mean in AGI

I've been thinking about this question all morning after reading a thread online where someone insisted we already have AGI and someone else insisted we'll never get there. The word that keeps bothering me is "general."

What does it even mean for intelligence to be general? It's such a slippery concept, isn't it? We use it casually, like we all understand what we're talking about. But then I try to pin it down and it becomes this philosophical quicksand.

When I think about human intelligence, the thing that strikes me is how messy and domain-specific it actually is. I can write code but I'm terrible at fixing my car. My neighbor can diagnose engine problems by sound but struggles with basic spreadsheet formulas. Is either of us generally intelligent? We both solved the same college calculus problems twenty years ago, but ask us today and we'd probably stare at you blankly.

And yet there's something undeniably flexible about how we think. The way I approach debugging code isn't fundamentally different from how I figure out why my dishwasher isn't draining properly. There's some underlying capacity to break down problems, form hypotheses, test them. Some ability to transfer insights from one domain to another, even if imperfectly.

Maybe that's what general means. Not that the intelligence works equally well everywhere, but that it has this quality of transferability. The capacity to take something learned in one context and apply it to something completely different. But even that feels incomplete, because humans do this so inconsistently. Sometimes we make brilliant connections across fields. Sometimes we fail to apply the most basic reasoning to areas outside our expertise.

I keep coming back to the question of whether general intelligence is even a real thing or just a useful fiction. Maybe all intelligence is actually narrow, and what we call general is just narrow intelligence that's very good at adapting and combining different narrow capabilities. Like a skilled improviser who can work with whatever they're given, not because they know everything, but because they've learned how to learn quickly.

The difference between open source and open science in AI

I was sitting in a coffee shop this afternoon, nursing a cold brew and trying to make sense of some code, when I overheard two engineers at the next table getting into it about open source versus open science. One of them kept saying "open is open," waving his hands around. The other was insisting there was some crucial difference I couldn't quite catch over their raised voices.

It got me thinking. Because honestly, I'd been using these terms pretty interchangeably in my head, especially when it comes to AI. Both sound good, right? Open source, open science. Transparency all around. But after listening to their back-and-forth for twenty minutes, I realized I might be missing something important.

Open source in AI seems pretty straightforward, at least on the surface. You release the code. People can see how the thing works, modify it, build on it. Think about all those models flooding out of research labs lately. The weights get posted, the inference code shows up on GitHub, and suddenly everyone's running these massive language models on their laptops. That's the dream, anyway. Though when you dig deeper, it gets murky fast. What about the training data? What about the compute that went into making the thing in the first place? How "open" is open when only companies with millions of dollars can actually create these models?

Open science feels different to me. Bigger, maybe. It's not just about the final product, but about the whole process. The hypotheses, the failed experiments, the datasets, the methodology. Everything that went into understanding something, not just the polished result. When I think about scientific breakthroughs, they rarely happen because someone got access to a final tool. They happen because someone could trace through the thinking, replicate the work, build on the reasoning.

And here's where it gets interesting with AI. Most of what we call "open source" AI models are really just... the end result. The final weights after training. Which is useful, don't get me wrong. But it's kind of like getting a finished sculpture without seeing the sketches, the failed attempts, the process of learning what worked and what didn't. You can use the sculpture, even copy it, but can you really understand how to make the next one?

Maybe the distinction matters more than I thought. Open source gives you tools. Open science gives you understanding. In AI, we're drowning in tools but starving for understanding. I can download a model that's supposedly as good as anything the big labs have built, but I have no idea why it works, what it's really learned, or how to make it better. That feels like we're missing something essential about how knowledge actually advances.

But then again, maybe I'm overthinking this. Maybe those two engineers were just arguing semantics while the real work happens regardless of what we call it.

Why I care about open approaches to AGI

I sat at my kitchen table this morning, coffee getting cold, scrolling through comments on yet another post about whether the big AI labs are "close" to something truly transformational. The usual suspects were there. People talking about alignment problems, capability jumps, compute thresholds. But tucked between the technical jargon and breathless predictions, I kept seeing this one word: open.

Open models. Open research. Open development.

I've been thinking about this a lot lately. Not just the technical aspects, though those matter too. But why I find myself drawn to the idea that whatever we're building toward, this artificial general intelligence that everyone seems to agree is coming, should be developed in the open. Maybe it's naive. Maybe it's dangerous. But I can't shake the feeling that closed development, no matter how well-intentioned, is the wrong path.

It's partly about power, I think. The traditional narrative has these well-funded labs racing toward AGI behind closed doors, making unilateral decisions about humanity's future. That sits wrong with me. Not because I think the people running these labs are evil, but because concentrating that much capability in so few hands feels like a historical mistake we should recognize by now. Every time we've had transformational technologies developed in secret by small groups, the results have been... mixed at best. And the downsides have often hit the people who had the least say in how things were developed.

But there's something deeper here too. I was reading about how some of the recent breakthroughs in reasoning and multimodal capabilities emerged from models that were at least partially open. Teams around the world building on each other's work, catching errors, pushing boundaries in directions the original developers never considered. There's this collective intelligence aspect to open development that feels essential for something as complex as AGI. No single organization, no matter how brilliant or well-resourced, can anticipate all the failure modes or beneficial applications.

The counterargument is obvious. AGI could be dangerous. Maybe catastrophically so. Shouldn't we be extra careful about who gets access? I understand the concern. Really, I do. But I'm not convinced that closed development actually makes us safer. Security through obscurity has a pretty terrible track record in technology. And there's something unsettling about the assumption that a small group of people, however smart and well-meaning, should make safety decisions for everyone else without broader input.

I keep coming back to this question: what kind of world do we want to live in after AGI arrives? One where a handful of institutions control the most powerful technology ever created? Or one where that capability is more distributed, more contestable, more accountable to the people whose lives it will reshape? The choice of development approach isn't separate from that outcome. It's how we get there.

Maybe I'm wrong about this. Maybe the risks really are too high for open development. Maybe there are technical reasons why it won't work. But right now, watching the field evolve, I find myself hoping that the open path proves viable. Not just because of what it might produce, but because of what it represents about how we make decisions about our technological future.

Starting this thing

I've been reading about artificial general intelligence for a while now. Probably longer than is healthy for someone who isn't a researcher. But the thing that keeps pulling me back isn't the technical papers or the benchmark results. It's the question of openness.

Who gets to build AGI? Who gets to decide how it works? And more importantly, who gets to use it?

Right now, a few well-funded labs are racing to build something extraordinary. Most of them plan to keep it behind closed doors. I get why. There's money involved, there's power involved, and there are real safety concerns. But there's another version of this story where the tools, the models, the research, all of it, gets shared openly. I find myself drawn to that second version, even though it's messier and harder and comes with its own set of problems.

I'm not an expert. I don't work in AI. I'm just a person who reads too much about this stuff and has opinions about it. I started this blog because I wanted a place to think through these questions out loud. Sometimes I'll write about something I read that got me thinking. Sometimes I'll just ramble about an idea that wouldn't leave me alone during a long walk.

No schedule, no promises. Just thoughts.