AI Code Binge: The Next 50k Lines

I’ve continued to work on the project I discussed in the last post, and it’s now well into the “real project” range. At the moment it’s about 150k lines of code, after 574 commits, 298 PRs, 300 issues. Many of those PRs and issues have come recently as I’ve adapted my workflow, which is what I’ll discuss in this post.

I took a few days off to work on other things, and when I came back to the project I took stock and didn’t really feel like the simple chat->code->test loop was cutting it as the system got more complicated and capable. At 150+ pages the design and architecture docs were too unwieldy for both me and Claude to reason about with scattershot ideas and feedback. In a real team this would be where you want to spread the work out so people aren’t stepping on each other’s digital toes, but this is a different shape of that problem. I could fire off random bugs and ideas and have it build things, but you either do that in series, which is slow, or in parallel, which causes lots of merge conflicts and duplicated efforts.

Themed Versions

Side Note: Experienced devs and managers will start to notice a theme here: we already know how to manage this type of stuff, with the same things we’ve used to manage complex projects for decades. It’s just not totally obvious up front, or even in the middle, how we can apply that, and where things can or should be different.

The first thing I did was to tag what I had as 0.1.0. Then I brain dumped everything I had queued up from big ideas to small ideas, and with Claude started to cluster these into themes. Then we mapped those themes to versions in a roadmap, with practical criteria that would define the goal of each version.

Then we drilled down into the next version, addressing specific details, refining the scope, etc. Ideas kept coming from all over, but if one didn’t fit in this bucket, it simply got parked on the roadmap. With a tight scope, Claude breaks it down into tasks, which it can do very well. These tasks are pretty well specified, with background, acceptance/test criteria, related tasks, and medium-level implementation details (e.g. table names, but not file names).

Review Time

Things are getting tricky enough now that I don’t want to do the “commit then review” approach; I want to try reviews prior to merging. I started handing the issues off to agents and having them send PRs. I also have them review each other’s work, and while they’re mostly writing good if not excellent code, the reviews catch enough issues to validate that they’re worthwhile.

So now I’ve got some agents writing, some reviewing, some addressing feedback, and others trying to merge good PRs. I didn’t expect this to work, but I had to see where it was going to fail, which it quickly did. There were endless merge conflicts, agents deleting stacked branches, conflicting details and formatting. Gemini struggled with large conflicts: it would try to fix them and eventually succeed, but it took forever. Codex would realize what it had just stepped into and often take a more surgical approach, directly re-applying the changes to a clean main. Claude did OK, but much slower than Codex. Claude tends to repeat obvious mistakes, so I had to add some guardrails to its MEMORY.md, which I haven’t yet had to do with the others.

Oops: Formatting and Linting

I quickly realized I had never set up any strict linting and formatting. I implemented this, which invalidated a dozen or so pending PRs that were in merge hell, so I just closed those out. I did a full formatting/linting pass with pretty much every option turned on, then re-did those PRs. This would be utterly demoralizing for human coders, but the bots don’t care and it all took a couple of hours to get back on track.

More Merging > Less Merging?

The formatting helped with the conflicts but the agents were still butting heads frequently. I slogged through wrapping up that version, which took a couple of nights, and then took a new approach for the next one. This time I had the issues more strictly organized using GitHub subissues. I also had started to realize that Codex was consistently doing everything way faster than the others, and at similar or possibly even higher quality for the smaller scoped tasks. Also, Codex on the $20 plan seems to get more coding work done than the Claude $125 plan, which I frequently exhaust even using mostly Sonnet for coding tasks.

I told Codex to send me a PR for all open issues, of which there were about 48. It cranked for a while and finished the job. Then I had Claude review all of the PRs, leaving feedback as comments. Then Codex addressed all of the comments. For a human coder, this would be kind of a bonkers approach, but it worked well. Finally, I had Codex merge the PRs in batches, in whatever order it deemed appropriate. After each merge it would run the quicker tests, and after each batch it ran the full tests: unit, integration, E2E, which had already passed on push so they weren’t going to be far off. Halfway through I pulled and built the app myself and tested the new stuff, then it finished. Overall this approach handled more work in a lot less time. I don’t know how far this scales, but 48 issues is a pretty good sized chunk of work in terms of planning and effort on my part, so I’m not sure I need to go too far beyond that.

Phases

I used that approach for two versions and it worked well, but I made a tweak on the most recent one. This version lent itself well to phases, so instead of a giant batch of 40 issues it became about 6 batches of 5-8 issues/PRs. The throughput is a bit lower, but this handled drift between phases better: feedback on the second issue that affects the 30th doesn’t cause headaches, because the second is already merged by the time the 30th starts. I think the optimal batch size can vary here depending on the focus of the version, but it feels like the best approach I’ve tried so far.

Syncing Up

I forgot to do this with a couple of versions, but figured the design had probably drifted a bit from the implementation, since some of the decisions lived only in the roadmap or the issues. I asked Claude chat to check a few key docs to confirm this, which it did. Then I asked Claude Code (Opus) to review everything in detail; it fired up a bunch of subagents, and … immediately ate my entire 5 hour Max quota in about 20 minutes of sifting through code. It eventually finished in the next window, and did a great job, but boy is it a monster for tokens for that type of work. I tried it again on the next version and it didn’t consume the whole quota, so I think it’s something to do either in chunks, or frequently.

Random Bugs, No Backlog

If I come across an actual bug while I’m using it or testing it, I’ll describe it to my planning Claude session and it will file the bug on my behalf, unless it’s a symptom of a larger change, in which case it goes on the roadmap. I do not have a backlog of issues: if anything is an issue, it gets picked up and fixed. This is an anti-pattern for human developers, but it’s a lot easier to just tell the agents to fix all issues, rather than deal with tags and milestones and versions. When we plan the next version, these changes get incorporated there.

Overall this feels like a pretty sustainable and productive method. It’s not as exciting or tiring as the initial burst, but it is very elastic. I can spend 10 minutes and kick off a chunk of work, or I can spend 3 hours and keep things moving while designing the upcoming work. This project is far from done so we’ll see how we do with a progressively larger and more complex environment.

AI Code Binge, February 2026 Style

I recently took a week off work to recharge and ended up going on a bit of a binge planning and building out a new system. It gave me a chance to explore some non-Gemini/non-Google stuff with more energy than I’m normally able to in my spare time, and I figured I’d share some thoughts:

Models

No real surprises here, but Opus 4.6 is really fantastic at planning and reviewing. Sonnet 4.6 doesn’t seem to produce any worse code than Opus, but it does make more mistakes when it comes to decisions. Codex 5.3 is by far the fastest, and also the most focused and direct. Gemini is faster than Claude and tends to take a more meandering/thorough route than Codex. Opus feels like the best partner of the batch in terms of design, but they all have useful aspects. Where I’ve settled is to iterate in Opus and periodically run it by ChatGPT and Gemini for feedback, which has been fruitful. I’ve done most of the early coding work with Claude because Claude Code is just a little ahead of the other CLI tools, but the models are all good enough for pretty much anything.

Workflow

This was a greenfield project and is now about 100k lines, so it went through a few phases pretty quickly over ~30 hours. I spent a lot of time in a chat session just planning it out before building anything, so I started the build with a 60+ page design doc and a similarly sized architecture doc that I’d iterated on over probably ~10 hours. Claude came up with a pretty good phased approach, so I had it go through this for a few steps. The first couple of phases I kept a tight leash, but after a while I went to YOLO, once enough patterns were established.

As I iterated, I would use the Claude chat, which was now managing the docs in GitHub in a branch. This made it much easier to review via PRs that resulted from the decisions to make sure there weren’t any side effects or lossiness. The chat will create GitHub issues based on changes. Then I go to claude code/codex/gemini and tell it to fix a specific issue or just fix them all. Claude takes 10-20 minutes to handle most things, up to 40 for bigger batches or bigger changes. Sometimes it does them in parallel, sometimes not, I don’t think it’s really dialed in yet on where to split things up, but it errs on the side of serial so it almost never conflicts with itself.

Code

I don’t review the code closely but I do read it and it all looks really good. There aren’t many examples of the issues we’ve come to expect from these things. No significant cases of overcommenting, creating multiple versions of the same thing, or naively structured files/classes. I think this is a combination of:

  1. The models getting better
  2. Starting from scratch, no legacy decisions to consider or tech debt or “this is how we used to do it”.
  3. Having a thorough (though not formal in any sense) design and architecture spec with derived artifacts like roadmaps. Major changes are tracked in ADRs, so it has only tried to undo one of those once.

Context window and compacting are challenges at this point for design, less so for coding, as the fairly rigorous design approach yields tighter iteration loops, scopes, and smaller blast radii for changes.

Biology

I’m not tooting my own horn here, as this is much more “this is what these things can do if you let them”, but what I’ve built in a week, both in terms of capabilities and polish and raw metrics (200+ pages of design/docs/tutorials, 100k lines of code, 1k+ tests, dozens of E2E tests), is way beyond 10x. I’m a fairly prolific coder when possible and a good big-picture thinker, but keeping up with this has been exhilarating and exhausting in a novel way. It’s less like a creative Flow state where time slips away and more like a good video game. “Just one more feature” feels a lot like “just one more quest”. I don’t think I could keep this up indefinitely, or it would at least take a while to adapt. A typical session looks like this:

  1. Run through the app, trying previous/new things, typing notes into the design chat.
  2. Iterate a bit there, it updates docs, creates issues.
  3. Have the agent work on the issues.
  4. Repeat, doing step 1 while the previous iteration of step 3 is happening.

The step change is that this is a ~30 minute cycle, not a 2-3 week sprint, and these can be pretty significant or deep changes. It’s literally building things faster than you can design and try them (not even including the self improvement loop). And it’s doing them well, this isn’t a simple project and it’s not making garbage code. It’s novel because it’s more productive than Flow but also less comfortable. I’ve only been spending like 3-4 hours a day on it and my brain and dopamine circuits still haven’t really figured out how to react to it yet, so you end up in a contradictory state of doing smart things with your lizard brain. That said, it’s been really fun and I recommend trying it if you can!

AI, Art & Mortgages

I want to start by acknowledging that this is a topic that directly affects people’s livelihoods. Real people are losing real work to generative AI right now, and that matters. I’m not going to pretend this is purely an abstract or anonymous philosophical debate. Also I have enjoyed every Sanderson book I’ve read and have no beef with him, he’s simply a target of his own making here by communicating clearly.

That said, I’ve been struggling with this topic because I can’t find a clean position. The conversation around AI and art tends toward extremes: either it’s theft and should be banned, or it’s a tool like any other and everyone should embrace it. I’m not comfortable on either end. There are too many layers and angles, and I think flattening them into a simple take does a disservice to everyone involved.

The clearest version of the anti-AI argument I’ve encountered comes from Brandon Sanderson. His thesis, roughly: the struggle is the art. The book you write isn’t really the product, it’s a “receipt” proving you did the work. You become an artist by writing bad books until you write good ones. The process of creation changes you, and that transformation is the actual art. LLMs can’t grow, can’t struggle, can’t be changed by what they make. So they can’t make art.

It’s a thoughtful position. But I think it’s also circular. He’s defined art as the process of struggle, but the audience doesn’t experience your struggle. They experience the output. Nobody listening to an album knows or cares whether it took a week or three years to record it. They care if it moves them. When I read Mistborn (which I enjoyed!), I’m not feeling Sanderson’s growth journey from White Sand Prime through six unpublished novels that I never read. I’m feeling the story he eventually learned to tell.

“Put in the work” is real advice and I believe in it deeply. But the work is how you get good, not why the result matters to anyone else. Those are different things. Conflating them feels like asking the audience to subsidize your growth journey.

Subsidy

And maybe that’s what some of the anger is actually about. AI threatens the subsidy.

The middle tier of creative work (background music, stock photography, commercial illustration, session gigs) was never really about profound artistic growth. It was a way to pay the mortgage while developing your craft on nights and weekends. You do the pedestrian work that keeps the lights on, and that buys you time to make the art you actually care about. AI competes in that middle tier directly, and it’s winning.

That’s a real economic disruption, and I don’t want to minimize it. But framing it as “AI can’t make art because it doesn’t struggle” is a philosophical dodge of an economic problem.

That model isn’t ancient. It’s maybe 50-80 years old. The session musician, the stock photographer, the commercial illustrator working on their novel at night, these are 20th century inventions. Before that, you had patrons, or you were wealthy, or you just didn’t make art professionally. The “starving artist” is a well-known trope, but the “starving artist who does commercial work to fund their real art” is a much more recent arrangement. But there were also far fewer artists, with a lot more gatekeeping, so I’m not arguing that everything was great before then either.

“I did it myself”

There’s also the provenance argument, that AI is trained on copyrighted work without consent or compensation. And that’s a real concern. But virtually all musicians learned to play and write by listening to and studying other musicians. There’s no system to track that provenance or pay royalties unless it’s a nearly-direct copy. The line between “learned from” and “trained on” is blurrier than it feels.

That said, I don’t want to dismiss the emotional weight here. Feeding your art and creativity into a machine with no credit—while some corporation profits from it—is a tough hit to the ego, not just the bank account. That’s a legitimately hard thing to get past, and I hope we find a better solution for it. The current arrangement feels extractive in ways that don’t sit right, even if I can’t articulate exactly where the line should be.

Sanderson said “I did it myself” referencing his first novel that he hand-wrote on paper. This feels cringeworthy to me, because in no way is he doing it himself. That first novel had thousands of contributors, from his parents and teachers to stories he read, conversations he had about it, movies he watched and so on.

This connects to something my thoughts keep coming back to: we’re always in the middle. Most people like to think of their place in a creative effort as the beginning or the end; the origin of something new, or the final word on something complete. But nobody starts from zero. The most original ideas are still cued by experiences. The most original inventions are still spurred by problems. Your inputs came from somewhere.

And it goes the other direction too. If we write the book, people still need to read it. If we compose the song, someone still needs to hear it. Our outputs are someone else’s inputs, often without permission, credit, or compensation. The chain keeps going.

Sanderson’s framing puts the artist at the center as the origin point of authentic creation, forged through struggle. But if we’re all in the middle, if every artist is just transforming their inputs into outputs that become someone else’s inputs, then the question of whether the transformer “struggled” feels less central. The chain of influence extends in both directions, through every artist who ever lived, and will continue through whatever comes next.

Starving Engineers

And then there’s the scope problem. Generated music is bad but generated code is fine? Generated paintings are theft but generated infographics are helpful? The reactions seem to track with how much cultural romance we attach to the craft. Software engineering has no “starving engineer” mythology. Nobody thinks I “suffered for my art” when I debugged a race condition. So when AI writes code, it’s a tool. When it writes songs, it’s an existential threat.

Photography is worth remembering here. In the 1800s, critics argued photography wasn’t art because it merely captured what already existed. Some said copyright should go to the subject, or even to God, not the photographer. It was too easy, just thoughtlessly press a button.

But over time, people figured out that taking a photo wasn’t a mundane task. Good photographers could be in the same place with the same equipment and consistently create images that moved people. The tool became a medium. Mastery emerged.

I think AI will follow a similar path. Right now most people are still tinkering, having mixed results. But we’re starting to see glimpses of people getting genuinely good at it, comfortable enough that they can do things most people can’t, or never thought of. They’ll convey ideas and emotions in new ways. They’ll be drawing on the collective contributions of thousands of generations of prior artists, just like every artist always has.

I don’t have a clean conclusion here, and I’m not sure anyone should right now. The displacement is real. The ethical questions around training data are real. The cultural anxiety about what counts as “real” art is real. I can’t join the strong positions on either side, because I think we’re very early in a journey that will outlive all of us.

What I am is cautiously optimistic. The history of art is full of new tools that were rejected as cheating until people learned to master them. The history of technology is full of painful transitions that looked like apocalypses at the time and turned out to be recalibrations. I suspect this is one of those. I hope so, anyway. We won’t know for a while yet.

Building at the speed of … builds

I’ve been thinking about build speed lately, usually while waiting for builds, and I think the thing that’s underappreciated isn’t the raw numbers, it’s that different speeds are qualitatively different experiences. Faster is always better, but it’s far from a linear relationship.

Working on a package that builds in 100ms is basically invisible. You don’t even notice it’s happening. The feedback loop is so tight that it feels like the code is just doing what you told it to do. You’re in conversation with the machine and you are the bottleneck, which is the goal.

At 10 seconds, it’s disruptive, but if the tooling is set up well you can stay in flow. You wait. You’re still there when it finishes. You might even find a bit of rhythm or cadence here and get a little thrill from the anticipation like hitting a long fly ball and seeing if it makes it out.

At a minute, it’s more like someone tapping you on the shoulder to ask a question. Your attention wobbles. You notice you could use a coffee, or you tab over to email to check something “real quick.” Five minutes later you come back and the build failed two minutes ago. Now you’re reloading context.

At 10 minutes, it changes your whole relationship with the work. You start actively avoiding triggering builds. You’re trying to see how far you can get while holding your breath. If it fails at 9:30 you’re genuinely frustrated, and maybe you’ll just go find something else to do for a while.

The reason I think this matters is that people tend to look at build optimization as a spreadsheet exercise: spend 8 hours to save 30 seconds, amortize across however many builds, calculate break-even. Even if the math works out, it feels tedious, and while the other coders might thank you for a 5% reduction, the suits won’t.

I think that exercise misses the point entirely. The less quantifiable stuff pays back almost immediately. You’re more focused. You’re doing better work. You’re just happier. A developer who’s been trained by their feedback loop to flinch isn’t going to produce the same work as one who can iterate freely.

But AI

There’s an argument to be made that AI changes this calculus: that it doesn’t matter anymore because the AI is doing the building in the background and will let you know when it’s done. But I think it actually makes build speed more important, not less.

Since the flow state and focus don’t matter as much with async coding, now the math is actually meaningful and the small wins will compound even further. If you’re coding at 1x speed and building every 10 minutes, and the build takes 2 minutes, you’re spending about 20% of your time waiting on builds. Annoying, but manageable.

Now imagine an AI coding at 10x. It wants to build every minute to verify its work. But the build still takes 2 minutes. Suddenly 66% of the time is build. The AI isn’t going to get frustrated and check its email, but it’s also not doing useful work during that time. And if you’ve got multiple agents running in parallel, that bottleneck adds up and leaves even more open loops to manage.
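The arithmetic above can be sketched as a toy model (my own simplification, assuming work stops completely while a build runs):

```java
public class BuildOverhead {
    // Fraction of wall-clock time spent waiting on builds, given
    // minutes of coding between builds and minutes per build.
    public static double waitFraction(double workMinutes, double buildMinutes) {
        return buildMinutes / (workMinutes + buildMinutes);
    }

    public static void main(String[] args) {
        // Human: 10 minutes of coding per 2-minute build,
        // roughly the "about 20%" in the text.
        System.out.println(waitFraction(10, 2));
        // 10x AI: 1 minute of coding per 2-minute build -> two thirds waiting.
        System.out.println(waitFraction(1, 2));
    }
}
```

Under this model, halving the build time for the 10x agent drops the waiting share from two thirds to one half, a much bigger win than the same change buys the human.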

When you speed up one part of a pipeline, the bottleneck shifts somewhere else. AI sped up the coding. Now the build is often the bottleneck. If anything, that’s an argument for investing more in build speed than we did before, the returns are even higher when you’re trying to iterate faster.

The Fantasy of Always and Never

One of the patterns I picked up during my freelancing years was to gently probe any time a client made an absolute statement. “We always do X” or “Y never happens.” These were usually non-technical clients describing their business processes, and I needed to understand them well enough to build software around them.

Honestly, I can’t remember a single one of those statements holding up to any scrutiny. Most fell apart with a single question. “What if Z happens?” “Oh, in that case, we do this other thing.” Almost none of them survived two or three follow-ups. It wasn’t that people were lying or even wrong, they just had a mental model of how things worked that was cleaner than reality.

This matters a lot when you’re building abstractions. When you create a shared component or a unified data model, you’re betting that these things really are the same. That they’ll change together. That the commonality you see today will hold tomorrow. Sandi Metz said “Duplication is far cheaper than the wrong abstraction.” I’d take it further, you need to be really certain an abstraction is right, or it’s probably wrong.

Abstractions and DRY are, in a sense, intentional single points of failure. That’s not to say they’re bad (I build them all the time), but it’s worth keeping in mind. You are hitching a lot to the same post. If you try to abstract physical addresses to cover all addresses everywhere, you’re left with basically zero rules because they’re all broken somewhere. Same for names, medical histories, pretty much any taxonomy that involves humans.

So now when I’m looking at a potential abstraction, I try to pressure-test it. “Users always have an email address” is fragile, there’s probably an exception lurking somewhere. “Users usually have an email address, and we handle it gracefully when they don’t” is something you can build on. If your abstraction can’t survive that kind of flex, it might not be an abstraction at all, just a few things that happen to look similar, until they don’t.
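The “usually, and we handle the exception” framing can be made concrete in code. A minimal sketch in Java (the record and method names are my own invention): the point is that absence is modeled explicitly, so callers are forced to decide what happens when the email isn’t there.

```java
import java.util.Optional;

// "Users usually have an email address" encoded in the type:
// the missing-email case is a first-class state, not a null surprise.
record User(String id, Optional<String> email) {
    // Every caller must supply a policy for the no-email case.
    String contactOrFallback(String fallback) {
        return email.orElse(fallback);
    }
}
```

An abstraction built on `Optional<String> email` survives the exception; one built on a bare non-null `String` is the fragile “always” that falls apart on the first follow-up question.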

Personal Computing Returns

I’ve been doing a lot of AI-assisted coding both at work and at home lately, and am noticing what I think is a positive trend. I’m working on projects that I’ve wanted to do for a while, years in some cases, but never did. The reason I never did them was because they just didn’t seem like they were worth the effort. But now, as I become a better vibe coder, that effort has dropped rapidly, while the value remains the same. Even further, the value might actually be more, because I can take it even beyond MVP levels and get it to be really useful.

Case in point: I do a lot of DIY renovation work and woodworking (though not enough of the latter). I use a lot of screws and other hardware, and it can be very disruptive to run out. I try to stay organized and restock pre-emptively, but it’s easy to run out. What if there was an app that was purpose-built for tracking this, that made checking and updating inventory as simple as possible, and made it easy to restock? Even better, what if it was written exactly how I track my screws, and had all of the defaults set to the best values for me? Better still, what if it felt like the person who wrote it really understood my workflow and removed every unnecessary click or delay?

Screenshot of a vibe-coded screw inventory app.

Anyone familiar with app development knows that once you get into the domain-specific details and UX polish necessary to take something from good to great, the time really skyrockets. Screws have different attributes than nails, or hinges, or braces, or lumber. People do things in different ways, and if you miss one use case, they won’t use it. If you cover everything, it’s hard to use and doesn’t feel magical for anyone. You could knock out a very basic version in a few nights, maybe 10 hours, but this wouldn’t do much more than a spreadsheet, which is probably what you’ll go back to as soon as you find some bug or realize you need to refactor something. To make this thing delightful you’re likely in the 50-100 hour range, which is maybe in the embarrassing range when you tell your friends you just spent a month of free time writing an app to keep track of how many screws you have in your basement.

With the current crop of tools like Claude Code and Gemini CLI, that MVP takes 20 minutes, and you can do it while watching the Red Sox. Another hour and it’s in production, and starting to accrue some nice-to-have features, even if the Rays played spoiler and beat the Sox. It works great on desktop and mobile, it safely fits on the free tiers of services like Firebase and Vercel so it’s basically maintenance-free. One more hour while you’re poking around YouTube and you’ve got a fairly polished tool you’re going to use for a while.

I think most people probably have a deep well of things they’d like to have, that never made any financial sense, and probably aren’t interesting to anyone else. We’ve probably even self-censored a lot of these things so we’re not even coming up with as many ideas as we could. But when the time/cost drops by 90% or more, and you can take something from good to great, and have it tailored exactly to you, it’s a whole new experience.

The term “personal computing” went out of style decades ago, and now it feels like we’re all doing the same things in the same way with the same apps, but maybe it’s time to start thinking for ourselves again?

Java 15 through 25 Catchup

Continuing my previous post, let’s pick up the Java story where we left off…

Java 15 (Sept 2020)

Sealed Classes

A nice organizational tool. Not very handy for my personal projects but definitely useful across a large org.
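For reference, a quick sketch of what sealing buys you (my own toy hierarchy; the exhaustive switch also relies on pattern matching for switch, finalized later in Java 21):

```java
public class Shapes {
    // The compiler knows Circle and Square are the only implementations.
    sealed interface Shape permits Circle, Square {}
    record Circle(double radius) implements Shape {}
    record Square(double side) implements Shape {}

    // No default branch needed: the sealed hierarchy makes the
    // switch exhaustive, and adding a new Shape breaks the build here.
    static double area(Shape s) {
        return switch (s) {
            case Circle c -> Math.PI * c.radius() * c.radius();
            case Square sq -> sq.side() * sq.side();
        };
    }
}
```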

Java 16 (March 2021)

Stream.toList()

Not much else in this release besides this minor but more readable improvement.
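The improvement in one line (the resulting list is unmodifiable, unlike the old collector’s default):

```java
import java.util.List;
import java.util.stream.Stream;

public class ToListDemo {
    public static void main(String[] args) {
        // Before Java 16: .collect(Collectors.toList())
        // Since Java 16: the terser, unmodifiable toList()
        List<Integer> squares = Stream.of(1, 2, 3)
                .map(n -> n * n)
                .toList();
        System.out.println(squares); // [1, 4, 9]
    }
}
```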

Java 17 (Sept 2021, LTS)

Nothing really new here, just polish and finalizing things. It looks like this might be the dominant stable version in many enterprises and the baseline for open source right now.

Java 18 (March 2022)

Simple Web Server

This might be very useful for my prototypes. It wasn’t hard to do before but if this works with whatever is a good basic HTTP/API framework these days then it’s great that it’s built-in.
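Besides the `jwebserver` command, there’s a programmatic entry point. A minimal sketch (here binding to port 0 and stopping immediately so it’s safe to run as-is; in real use you’d pick a fixed port and leave it running):

```java
import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.SimpleFileServer;
import java.net.InetSocketAddress;
import java.nio.file.Path;

public class ServeDir {
    public static void main(String[] args) {
        // Serve the current directory; the path must be absolute.
        HttpServer server = SimpleFileServer.createFileServer(
                new InetSocketAddress(0), // port 0 = any free port
                Path.of(".").toAbsolutePath().normalize(),
                SimpleFileServer.OutputLevel.INFO);
        server.start();
        System.out.println("Serving on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```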

@snippet in Javadoc

I do actually comment my personal code fairly well (if I’m submitting it, even to a private repo) so this seems nice.

Java 19 (Sept 2022)

Record Patterns

I’ve only read some examples of this and definitely want to try it, but it seems like it could be pretty clean. It’s kind of like Go interfaces, which I’m a fan of (though I wish they had a different name, since they’re flipped backwards in some regards from older languages’ interfaces).
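A small example of the destructuring (my own toy records; this uses the finalized Java 21 form of the feature, which previewed in 19):

```java
public class Points {
    record Point(int x, int y) {}
    record Line(Point start, Point end) {}

    // Record patterns pull nested components out in a single pattern,
    // instead of a cast followed by accessor calls.
    static String describe(Object obj) {
        return switch (obj) {
            case Line(Point(var x1, var y1), Point(var x2, var y2)) ->
                    "line (" + x1 + "," + y1 + ")->(" + x2 + "," + y2 + ")";
            case Point(var x, var y) -> "point (" + x + "," + y + ")";
            default -> "something else";
        };
    }
}
```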

Virtual Threads

Very interesting. I do tend to do a lot of concurrency in my projects so I’m definitely going to be spending some time with this one.
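The canonical usage is one virtual thread per task, which previewed here and was finalized in Java 21. A minimal sketch:

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualDemo {
    public static void main(String[] args) {
        AtomicInteger done = new AtomicInteger();
        // A virtual thread per task: cheap enough to spawn thousands
        // of blocking tasks without sizing a thread pool.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                exec.submit(() -> {
                    Thread.sleep(Duration.ofMillis(10)); // blocking is cheap here
                    return done.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        System.out.println(done.get()); // 10000
    }
}
```

The same code with platform threads would either need a bounded pool (serializing the sleeps) or 10,000 OS threads.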

Java 20 (March 2023)

Scoped Values

ThreadLocals aren’t difficult per se but they are weird and easy to misuse. Streamlining the easiest usages of them seems like a win.

Java 21 (Sept 2023, LTS)

String Templates

This syntax seems like it might be a little too streamlined for readability, especially on review, but that seems solvable with tools so I’ll wait and see if I like these.

Structured Concurrency

I’ve rolled my own versions of this very useful concept many times so it would be great if this could standardize that.

Sequenced Collections

On the one hand this seems like a nice taxonomic update, but it also seems like it could be easily confused with Sorted Collections, but maybe that’s just me.
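To be clear about what it adds: ordered collections like List, Deque, and LinkedHashSet now share uniform first/last access and a reversed view, rather than each having its own idiom. A quick sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class SeqDemo {
    public static void main(String[] args) {
        List<String> steps = new ArrayList<>(List.of("plan", "build", "test"));
        // Uniform accessors from SequencedCollection:
        System.out.println(steps.getFirst()); // plan
        System.out.println(steps.getLast());  // test
        // reversed() is a live view, not a copy.
        System.out.println(steps.reversed()); // [test, build, plan]
    }
}
```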

Java 22 (March 2024)

Statements before super(…) / this(…) in constructors

Ooh, this seems like it’s a bigger change than it appears on the surface. I have vague recollections of some significant class-layout workarounds for this limitation, but I’m also getting a Chesterton’s Fence vibe here in why this limitation existed in the first place.
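The headline use case is validating arguments before delegating. A sketch with my own toy classes (note this previews in 22-24 and is only finalized in Java 25, so it needs a recent JDK):

```java
public class Temperature {
    final double celsius;
    Temperature(double celsius) { this.celsius = celsius; }
}

class ValidatedTemperature extends Temperature {
    ValidatedTemperature(double celsius) {
        // Previously no statement could precede super(), so this
        // check had to hide in a static helper passed as an argument.
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero");
        }
        super(celsius);
    }
}
```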

Stream Gatherers

A few projects in my queue are data/statistics-based so this might come in handy, if third-party libraries don’t already handle this well enough.
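As a taste of the built-in gatherers (this one is `windowFixed`; note Gatherers only finalized in Java 24, so it needs a current JDK):

```java
import java.util.List;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

public class GatherDemo {
    public static void main(String[] args) {
        // windowFixed is a stateful intermediate operation that was
        // awkward to express with plain map/filter/reduce before.
        List<List<Integer>> windows = Stream.of(1, 2, 3, 4, 5)
                .gather(Gatherers.windowFixed(2))
                .toList();
        System.out.println(windows); // [[1, 2], [3, 4], [5]]
    }
}
```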

Java 23 (Sept 2024)

Implicit classes & instance main

I never really had any issues with this boilerplate because the IDE always wrote it and it never really changed after that, but it’s cool that it got streamlined.

Java 24 (March 2025)

Key Derivation Function API

Figuring out how to get signed APIs working almost always feels like it’s harder than it should be, so I’m all in favor of standardizing it. I’m not sure what the long-term impact is here because I’m sure the next great crypto approach will have some structural reason you can’t use this…

Java 25 (September 2025?, LTS)

The next LTS, nothing really major on the menu but a number of finalizations which will be nice.

Final Thoughts

I started this mini research project thinking there would be more things like lambdas, which I worried might take the language away from what I always liked about it, but that definitely doesn’t seem to be the case. There are a ton of streamlined features that work well within the same mental model and “spirit” of the language. I’m really looking forward to digging in and using almost all of them.

Java 8 through 14 Catchup

After a long hiatus, I’ve been increasingly motivated to do a few tech side projects. Working at Google everything there is Google-specific, or at least Google-flavored, and even if I wanted to use the same stuff I mostly couldn’t. My primary language there is Go, which is fine, but I’m going back to Java, at least for the backend/offline stuff, for now. I’m in the process of picking a stack, and starting with the language.

The last real Java project I did was Java 7, which was already aging at that point, but we were going for stability and Java 8 was only a few months old when we started. Java 24 just came out, so I’ve missed 17 versions! I could just jump in, but I haven’t really followed the language aside from some tinkering over the years, and I probably wasn't taking advantage of anything new. I think it would be fun to roll forward and read up on each version, seeing the highlights of what each version added or changed.

I’m mostly interested in the language aspects and the core libraries. I’m not that concerned with things like GC versions and improvements. Those are very important but for my hackery it’s unlikely I’ll need to get that far into the internals. Also, I’m going to discuss preview features mostly where they first appeared, not the iterations and finalizations in subsequent releases.

Java 8 (March 2014)

Lambdas

Now I remember why we didn’t upgrade: lambdas seemed a bit daunting, and I don’t know if the tooling (Eclipse + Lombok at the time) had really caught up yet. I’m still not a huge fan of lambdas in any language; I think they’re nice shortcuts, but I’d prefer cleaner-delineated blocks. In JS, where I use them the most, I almost always define a function separately and then reference it, unless it’s just a line or two.
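For the record, the core before/after is replacing an anonymous class with an inline function; a small sketch (the names here are mine):

```java
import java.util.ArrayList;
import java.util.List;

class LambdaDemo {
    static List<String> sortByLength(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        // One lambda replaces a whole anonymous Comparator class;
        // pre-Java-8 this was five lines of new Comparator<String>() { ... }.
        sorted.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return sorted;
    }
}
```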

Type Annotations

I love annotations when used judiciously, @NonNull seems cool but I could see it getting out of control so I’ll have to wait and see how it’s used in the real world.

Optional

I’ve used this a lot in C++ and it definitely fits my be-explicit style.
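A minimal sketch of the be-explicit style it enables (the lookup scenario is invented for illustration):

```java
import java.util.Map;
import java.util.Optional;

class OptionalDemo {
    static String lookup(Map<String, String> config, String key) {
        // Absence lives in the type instead of hiding behind null,
        // and transformations chain without null checks.
        return Optional.ofNullable(config.get(key))
                .map(String::trim)
                .orElse("default");
    }
}
```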

Streams

A companion to lambdas, these will look strange at first, but I use this pattern a lot in JS so it will probably feel right eventually.
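The resemblance to the JS style is pretty direct; a filter/map/collect pipeline looks like this (example names are mine):

```java
import java.util.List;
import java.util.stream.Collectors;

class StreamDemo {
    static List<String> shout(List<String> words) {
        // Much like Array.prototype.filter/map in JS, but lazily
        // evaluated until the terminal collect().
        return words.stream()
                .filter(w -> w.length() > 2)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }
}
```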

Java 9 (Sept 2017)

JDK Modularization

This looks like a big deal, but probably more on the enterprise level than personal projects.

Collection factory methods (List.of, Set.of, Map.of)

Very nice ergonomics, looking forward to that especially for hacky prototype stuff.
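The ergonomic win is one-line immutable literals where you previously needed a constructor plus a series of add/put calls (the constants here are illustrative):

```java
import java.util.List;
import java.util.Map;

class FactoryDemo {
    // Both are unmodifiable; mutating them throws
    // UnsupportedOperationException.
    static final List<String> COLORS = List.of("red", "green", "blue");
    static final Map<String, Integer> PORTS = Map.of("http", 80, "https", 443);
}
```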

Java 10 (March 2018)

Local-variable type inference (var)

Lombok had this but I didn’t use it too much. It’s really dependent on your tooling, as it adds complexity to refactoring which is one of the things I loved most about Java (and hopefully holds up!). I’m curious how this looks in real code.
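For the unfamiliar, the key point is that this is inference, not dynamic typing; a small sketch (names are mine):

```java
import java.util.HashMap;
import java.util.List;

class VarDemo {
    static int countOf(String target, List<String> words) {
        // counts is still statically a HashMap<String, Integer>;
        // the compiler just spares you writing the type twice.
        var counts = new HashMap<String, Integer>();
        for (var word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts.getOrDefault(target, 0);
    }
}
```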

Unmodifiable collection copies

More ergonomic polish that I’ll likely be using for hacking things up where hardcoding is common.
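In practice this is List.copyOf and friends, which are handy for defensive snapshots (the method name is my own):

```java
import java.util.List;

class CopyDemo {
    static List<String> snapshot(List<String> source) {
        // Unlike Collections.unmodifiableList (a live view over the
        // original), copyOf is a detached, unmodifiable copy.
        return List.copyOf(source);
    }
}
```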

Java 11 (Sept 2018, LTS)

Looks like this was the first true LTS version? Nothing groundbreaking on a language level, just some ecosystem cleanup.

Java 12 (March 2019)

Improved switch

Kind of cool. Looks like a strictish lambda variant.
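The arrow form (previewed here, finalized in Java 14) drops fall-through and lets the switch yield a value; a small sketch with made-up cases:

```java
class SwitchDemo {
    static String size(int n) {
        // No break statements, no fall-through, and the whole
        // switch is an expression with a result.
        return switch (n) {
            case 0 -> "zero";
            case 1, 2, 3 -> "small";
            default -> "big";
        };
    }
}
```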

Java 13 (Sept 2019)

Text Blocks

Pretty good, basically parity with most other languages at this point.
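For reference (previewed here, finalized in Java 15), the main subtlety is the indentation stripping; a tiny sketch:

```java
class TextBlockDemo {
    static String payload() {
        // Common leading whitespace is stripped, keyed off the
        // position of the closing delimiter, so the literal can be
        // indented to match the surrounding code.
        return """
                {"name": "demo"}""";
    }
}
```

No more escaped quotes and concatenated lines for embedded JSON, SQL, or HTML, which was always the worst offender.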

Java 14 (March 2020)

Records

Interesting. I always preferred a pretty clean “bean” structure and used Lombok for the boilerplate so this probably wouldn’t make my code look that much different but it’s always nice to be able to build on core concepts instead of just conventions.
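A sketch of the shape (previewed here, finalized in Java 16); the Point example is my own:

```java
// A record gives you the canonical constructor, accessors,
// equals/hashCode, and toString that Lombok annotations used
// to generate — but as a language-level concept.
record Point(int x, int y) {
    Point scaled(int k) {
        return new Point(x * k, y * k);
    }
}
```

Since records are shallowly immutable, "modification" is done by returning a new instance, as scaled() does here.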

Pattern Matching for instanceof

Makes sense. I found that needing to introspect/cast (since generics) was almost always a failure of the APIs involved but sometimes they were still unavoidable. I used them when trying to make things more magical and automatic and bury complexity behind a clean interface.
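Where the cast is unavoidable, this at least collapses the check and the cast into one step; a minimal sketch (names are mine):

```java
class PatternDemo {
    static int lengthOf(Object o) {
        // The binding variable s exists only where the match
        // succeeded, so there's no separate cast to get wrong.
        if (o instanceof String s) {
            return s.length();
        }
        return -1;
    }
}
```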

Helpful NullPointerExceptions

OMG how did this take 14 versions to happen.

Thoughts So Far

So that was about 6 years and some pretty decent improvements overall. I don’t want Java to go the way of C++ and just get more and more complex with a never-ending stream of new ways to do things, even if they are probably better. I like my Java to be boring and predictable, very easy to read, very easy to refactor, and very easy to hand off. Not all of these are great for that, notably lambdas and type inference, but we’ll see how it goes.

AI

It’s been a long time since my last post about technology, and in the meantime AI happened (again). It’s dominated the conversation, even beyond the tech crowd, since ChatGPT was released. Budgets and strategies at many companies have been massively disrupted, echoing earlier booms and bubbles while adding some unique attributes of its own. I’ve got a number of things I’d like to share here and in the future, but I’ll start by saying I’m generally optimistic on this, especially long-term, for a few reasons:

It’s Exciting

I started my career in the “dot com boom”, when the Internet left universities and infiltrated almost every nook and cranny of the economy and people’s lives over the next 10-15 years. There were new developments daily, for years on end, even after the stock market blew up. Disruption and innovation was constant. It felt like a once-in-a-lifetime kind of thing, even in the moment. Nobody expected it to last forever but nobody knew when it would end either.

The AI boom isn’t quite at the same level, but it’s closer than I thought I’d see again. There are new tools and techniques coming out very frequently. There are vast sums of money being invested in many areas. There are new skills to learn, new toys to tinker with, new styles of craft being developed.

One big difference is that during the Internet boom most of the money was going into people. Anyone remotely employable could get a job writing web pages that paid far more than anything else they could do. This time, far more of the money seems to be going into power and compute, partly because that’s genuinely necessary, but also because there’s a pernicious fixation on automation and displacement rather than the focus on leverage that the Internet fostered. This makes it a higher-stakes kind of excitement, but it’s still excitement.

It’s Important

Since the Internet we’ve seen a number of hype cycles of all sizes offering varying combinations of transformation, enchantment and value. Mobile, social, crypto/blockchain, big data, voice assistants, IoT, VR, 3D printing, smart homes and so on. Most of these are durable, but a generation from now only a few will be distilled into cultural turning points; the rest will only be remembered as significant within specific industries. I think mobile & social have been truly transformative culturally; they’ve deeply affected politics, friendships, families, and communities. With that one exception I think all of the other trends could be eclipsed by AI’s impact on our lives, economies, and future.

The key vector for that statement is agency. And not in a buzzwordy “agentic” way, but in the sense that vibe coding a TODO app is a tiny taste of the direction things are going. More people are going to be able to do more things, and in a rare violation of deeply held beliefs, they’re going to be able to do them better, cheaper and faster. Distributed agency is a massive threat/shift for a global economy that’s been shifting towards services for decades, and it’s going to be a bumpy ride but I think in the end it’s going to be a huge boon. Once more people start to really push boundaries and leave the routine work behind for the robots, we could see an improvement in innovation and new ideas.

This is where the “pernicious fixation on automation and displacement” I mentioned above seems short-sighted and wrong to me. If you have a tool that makes your employees more productive, and you choose to do the same thing with fewer employees, you’re doing it wrong. If you’re running your business properly, your employees should now be making you even more money, so why would you lay them off? I’m not saying this is a trivial change and of course there are lots of details and nuances, but I see far fewer people thinking about growing the metaphorical pie than those who think it’s a fixed size.

It’s Inevitable

This is not the “use AI or become obsolete” pitch that many are making, but more about the fact that it’s here to stay. While the latest and greatest models require billion- and trillion-dollar companies, the trailing edge of self-hosted/open-source models and systems is keeping up at an impressive pace. Even if OpenAI and Google and Anthropic disappeared tomorrow, we’d still have a large fraction of the capabilities available to us.

“This is the worst it will ever be” is a mantra you might hear from AI fans and I think it’s mostly true. LLMs and RAG and “agents” and whatever FotM we’re excited about now may not be relevant in 10 or 20 years but the thing that will exist will almost certainly be better. We don’t have to use it, and we don’t even have to like it, but we can’t ignore it. If you work with any kind of information and are anywhere but the very end of your career, you should have a plan that addresses, if not includes, AI.

It’s Interesting

Finally, it’s personally relevant to me because it’s just something I think about a lot. Suck.com’s “shiny vs. useful” chart is one I come back to again and again, and AI is both. Not Sun/fire level but still very positive on both of those axes. The fact that it has a huge overlap on my own work in software and tech, and the points I made above, mean this is just too good of a topic to not geek out on. I hope to share more thoughts here in the future on both the positive and negative aspects.

Stack2020: Backend Basics

Image by Bethany Drouin from Pixabay

So the side project I mentioned a few posts ago is still just an idea; before I start building it, I will need to pick a stack. I’ve been out of the non-big-tech loop for a while (one of the main drivers for this project) so this may take a while, but it should be a fun experience.

The last time I picked a fresh stack was late 2013. We were building Mondogoal and needed to get to market very fast on a limited budget. We started in late November 2013 and needed to be in beta by March, and ready to launch (and get a gaming license!) in time for the 2014 World Cup that started in June. With two developers.

Somehow, we made it, and I think that one of the main factors was in how we picked our stack. In short, we kept things as boring as possible. The backend was all stuff I could do in my sleep: Java 7 with MySQL, using Maven and Spring, running on Tomcat in a mix of AWS and Continent 8. The only significant piece that I didn’t know well was Obsidian Scheduler, which is a fairly lightweight job scheduler, and we added Firebase later on. The frontend was Backbone with a Grunt build, both of which had been out for a while. This stability and going down well-trod paths let us focus on executing the business and product instead of tinkering, and I doubt we would have been able to hit our deadline if we’d opted to explore some of the cool new stuff that was coming out. While the business didn’t succeed long-term, I have no regrets about those choices and can assign no blame to what we built it on.

Luckily, this new project has no deadline! It’s mainly a vehicle of exploration for me. I would definitely like to launch it, but if that takes months (unlikely) or years that’s OK. Let’s recap the “requirements” of this project:

  1. Learn how to build a modern front end.
  2. Give me a reason to explore the Google Cloud offerings (since those products are effectively my customers for my day job).
  3. Get back up to speed on things like analytics.
  4. Scratch an itch I’ve had for a long time (a nerdy economy-based game).
  5. Doesn’t risk any conflict with my day job.
  6. Give me something to blog about!

#2 is going to make some of these decisions easy. As a loyal dogfooder and company man (despite the lack of employee discount…), I will default to Google Cloud where possible. A quick check puts most of the basics on par with Amazon as far as cost goes. If GCP doesn’t offer it, then it’s up for grabs. Let’s start picking!

But wait…what is and isn’t part of The Stack?

I take a broad view of the term, to me The Stack is basically everything in the shop. You could (and I might in a future post) break this down into groups (tech stack, marketing stack, etc.), but to me if it’s a part of making the business run and it’s something you didn’t build yourself that you expect employees to get or be skilled at, then it’s part of The Stack. This includes everything from Java or Python to Salesforce and Gmail.

Domain Hosting

Winner: Google Domains

Cost: $14/year

Status: Up and Running

Initial Thoughts

I’ve got domains hosted at lots of places. GoDaddy, Namecheap, Hover, OpenSRS, I even still have one at Network Solutions. Compared to those, registering on Google was a pretty painless process. The pricing is more transparent than most other places (no first-year rates that go way up later). They also didn’t push too hard on upselling junk I don’t need. Setting up G Suite was also pretty easy (as you’d hope), I had it all running in like 15 minutes without touching any DNS entries.

To dogfood even deeper, I’m using a .app TLD, which Google owns, and was a few bucks cheaper than the other registrars.

Email Hosting

Winner: G Suite

Cost: $5/user/month

Status: Up and Running

Initial Thoughts

As most of us are, I’m pretty familiar with all of these tools, and I’m pretty sure most surveys would have Gmail at the top of people’s preferred email services. As a bonus, setting this up was super easy since the domain is also with Google.

Compute Servers

Winner: Kubernetes/Compute Engine

Cost: TBD

Status: Exploration

Reasoning

There were two things that came up in virtually every conversation/interview I had during my last job search, React and Kubernetes (AKA K8s). Virtually everyone was using these or trying to move to them. I’ve never touched K8s, so combined with #2 above, I feel like I should play with it.

I have used App Engine, and I assume non-K8s Compute Engine is pretty similar to AWS’s EC2 which I’ve used quite a bit, so I will fall back to those when it makes sense to do so.

Backend Language

Winner: Java

Cost: Free

Status: Defrosting

Reasoning

I am a moderately capable but generally reluctant programming language polyglot. My first employee profile badge at Facebook was “Committed Code in 5 Languages”. But I’ve never bought into “pick the best language for the job” and tend to favor the approach of “pick the best language for all the jobs”. This offer does not extend to JavaScript backends.

Java was my primary language from probably 2000 through 2016. Since then I’ve mostly been writing C++. I’ve grown to like that language; it is light-years better than it was back when I started writing Java. But I’ve never worked in it outside the padded walls of FB/Google’s infrastructure and, to be honest, I'm not terribly interested in doing so.

While we upgraded our runtimes to Java 8 at Mondogoal after a bit, we never got around to really using any of the features, so I’m effectively only up-to-date through Java 7, and would like to explore the recent additions. There are also some new parts of the Java ecosystem that are worth exploring, like Quarkus and GraalVM.

Also, I just kind of miss working in it.

Runners Up

There are two languages I am interested in tinkering with, once I’ve warmed back up on Java: Kotlin and Rust. They both have had a pretty good reception and have some attractive features. Kotlin as a JVM language should be easy enough to experiment with. If I can find a task that would benefit from Rust I will probably give it a shot.

IDE

Winner: IntelliJ IDEA Ultimate

Cost: $149/$119/$89 for years 1/2/3+

Status: Trial

Reasoning

I initially wrote Java in emacs, then JCreator, then switched to Eclipse circa 2002 and used it through 2016. I’ve tried IntelliJ a few times over the years but never really got the hang of it or saw a lot of value in it.

However, Google does quite a bit of Java work, and their primary and only fully supported IDE for it is IntelliJ. I’ve also been using CLion (basically IntelliJ for C++) and it’s been OK.

The “Ultimate” edition of IntelliJ includes support for other languages and even React so that’s a strong argument in favor of trying it out. I’m not opposed to ultimately landing on using different tools to work in different languages (e.g. I often used Eclipse for Java and Sublime for JS), but if you can do it all in one, that’s nice.

My Eclipse muscle memory is very strong, so I expect this to be a somewhat painful transition, but I will give it as fair a shot as I can manage.

Java Build

Winner: Gradle

Cost: Free

Status: Exploring

Reasoning

There are only two real choices here: Maven and Gradle. And given that Gradle uses Maven-style repositories, they aren’t even that different in many respects.

Maven

I’ve used Maven for many years, ever since Ant, and like most people I had some struggles with it initially. I eventually learned to coexist with it, or at least avoid its sharp edges, and would just copy/paste my pom from one project to the next and had minimal issues.

Gradle

Gradle has three main “advantages” over Maven that people seem to crow about.

One is that it’s written in Groovy, and you can therefore script your build and do advanced stuff more easily than writing a Maven plugin. I would put this in the Probably a Bad Idea category, like stored procedures. I bet there are some cases where this is useful, but probably far more where it’s a kludgy workaround to some other problem people don’t want to solve.

The second is that it’s written in Groovy, which is not XML. I always thought XML was nice when used properly, and that config files were one of those proper uses. However, something about it causes a primal aversion in many people and they convince themselves that things that are not XML are inherently better than things that are.

The third is that you can do different build versions more easily, and this one I get, especially in the context of things like Android apps. Given that I might be targeting different JVMs (regular and GraalVM) this might be useful, but probably won’t be.

So I’m not really impressed with Gradle either, but given that there are literally only two choices, I might as well know both. It’s pretty trivial for a small project to switch back or even run both, so this is a pretty low-risk experiment.

Source Control

Winner: Git + Monorepo

Cost: Free (plus hosting)

Status: Up and Running

Reasoning

I think there are only 3 real options these days for version control that don’t fall into the “never heard of it” category for most people.

Git

The dominant force, and the only one that many developers know these days.

Mercurial (hg)

I have grown to prefer Mercurial in a code-review + monorepo environment since starting to use it at Facebook. Implicit branches and easy commit management map very well to the “commits should do one thing” best practices as opposed to the pull request pattern where mainline commits should favor comprehensiveness. For a solo project this isn’t relevant and it’s basically the same thing as Git.

Subversion (svn)

For solo/small teams, SVN is totally fine, it’s basically how you’d be using a DVCS anyways, but if you don’t have a server running already then it’s probably not worth setting one up.

Mono vs. Multi Repo

For large organizations, monorepo is the clear way to go for reasons I can discuss elsewhere. For solo/small teams, it doesn’t really matter, and it *might* be better to split up your repos if you have a *very* clear separation (e.g. front/back end), which is how we did it at Mondogoal, but I would say to start with a monorepo and only split if there is a compelling reason to do so (e.g. regulations, licensing).

I’m going to call this a toss-up between Git and Mercurial and give Git the edge due to the fact that it’s massively more popular and more likely to integrate well with other things like IDEs and deployment tools.

Source Control Host

Winner: Google Cloud Source Repositories

Cost: Free to 5 users & 50GB storage/egress, then $1/user/month and $0.10/GB/month

Status: Exploring

Reasoning

Given that I’ve chosen Git, GitHub is the obvious first choice here, but since Google has a product we’ll invoke requirement #2. This also might integrate better with the other services I’ll be using, though I have barely researched that beyond the marketing.

One of the nice things with Git is that it’s trivial to multi-host, so if I ever find a compelling reason to also use GitHub, I can use both or just switch.

Next Up

There’s a lot left to do here, databases, frontend, and more. Stay tuned!