Stack2020: Backend Basics

Image by Bethany Drouin from Pixabay

So the side project I mentioned a few posts ago is still just an idea, before I start building it I will need to pick a stack. I’ve been out of the non-big-tech loop for a while (one of the main drivers for this project) so this may take a while, but should be a fun experience.

The last time I picked a fresh stack was late 2013. We were building Mondogoal and needed to get to market very fast on a limited budget. We started in late November 2013 and needed to be in beta by March, and ready to launch (and get a gaming license!) in time for the 2014 World Cup that started in June. With two developers.

Somehow, we made it, and I think that one of the main factors was in how we picked our stack. In short, we kept things as boring as possible. The backend was all stuff I could do in my sleep: Java 7 with MySQL, using Maven and Spring, running on Tomcat in a mix of AWS and Continent 8. The only significant piece that I didn’t know well was Obsidian Scheduler, which is a fairly lightweight job scheduler, and we added Firebase later on. The frontend was backbone with a grunt build, which had been out for a while. This stability and going down well-trod paths let us focus on executing the business and product instead of tinkering and I doubt we would have been able to hit our deadline if we’d opted to explore some of the cool new stuff that was coming out. While the business didn’t succeed long-term, I have no regrets about those choices and can assign no blame to what we built it on.

Luckily, this new project has no deadline! It’s mainly a vehicle of exploration for me. I would definitely like to launch it, but if that takes months (unlikely) or years that’s OK. Let’s recap the “requirements” of this project:

  1. Learn how to build a modern front end.
  2. Give me a reason to explore the Google Cloud offerings (since those products are effectively my customers for my day job).
  3. Get back up to speed on things like analytics.
  4. Scratch an itch I’ve had for a long time (a nerdy economy-based game).
  5. Doesn’t risk any conflict with my day job.
  6. Give me something to blog about!

#2 is going to make some of these decisions easy. As a loyal dogfooder and company man (despite the lack of employee discount…), I will default to Google Cloud where possible. A quick check puts most of the basics on par with Amazon as far as cost goes. If GCP doesn’t offer it, then it’s up for grabs. Let’s start picking!

But wait…what is and isn’t part of The Stack?

I take a broad view of the term, to me The Stack is basically everything in the shop. You could (and I might in a future post) break this down into groups (tech stack, marketing stack, etc.), but to me if it’s a part of making the business run and it’s something you didn’t build yourself that you expect employees to get or be skilled at, then it’s part of The Stack. This includes everything from Java or Python to Salesforce and Gmail.

Domain Hosting

Winner: Google Domains

Cost: $14/year

Status: Up and Running

Initial Thoughts

I’ve got domains hosted at lots of places. GoDaddy, Namecheap, Hover, OpenSRS, I even still have one at Network Solutions. Compared to those, registering on Google was a pretty painless process. The pricing is more transparent than most other places (no first-year rates that go way up later). They also didn’t upsell junk I don’t need too hard. Setting up G Suite was also pretty easy (as you’d hope), I had it all running in like 15 minutes without touching any DNS entries.

To dogfood even deeper, I’m using a .app TLD, which Google owns, and was a few bucks cheaper than the other registrars.

Email Hosting

Winner: G Suite

Cost: $5/user/month

Status: Up and Running

Initial Thoughts

As most of us are, I’m pretty familiar with all of these tools, and I’m pretty sure most surveys would have Gmail at the top of people’s preferred email services. As a bonus, setting this up was super easy since the domain is also with Google.

Compute Servers

Winner: Kubernetes/Compute Engine

Cost: TBD

Status: Exploration

Reasoning

There were two things that came up in virtually every conversation/interview I had during my last job search, React and Kubernetes (AKA K8s). Virtually everyone was using these or trying to move to them. I’ve never touched K8s, so combined with #2 above, I feel like I should play with it.

I have used App Engine, and I assume non-K8s Compute Engine is pretty similar to AWS’s EC2 which I’ve used quite a bit, so I will fall back to those when it makes sense to do so.

Backend Language

Winner: Java

Cost: Free

Status: Defrosting

Reasoning

I am a moderately capable but generally reluctant programming language polyglot. My first employee profile badge at Facebook was “Committed Code in 5 Languages”. But I’ve never bought into “pick the best langauge for the job” and tend to favor the approach of “pick the best language for all the jobs”. This offer does not extend to JavaScript backends.

Java was my primary language from probably 2000 through 2016. Since then I’ve mostly been writing C++. I’ve grown to like the language, it is lightyears better than it was when I started writing Java, but I’ve never worked in it outside of the padded walls of FB/Google’s infrastructure, and to be honest, am not terribly interested in doing so.

While we upgraded our runtimes to Java 8 at Mondogoal after a bit, we never got around to really using any of the features, so I’m effectively only up-to-date through Java 7, and would like to explore the recent additions. There are also some new parts of the Java ecosystem that are worth exploring, like Quarkus and GraalVM.

Also, I just kind of miss working in it.

Runners Up

There are two languages I am interested in tinkering with, once I’ve warmed back up on Java: Kotlin and Rust. They both have had a pretty good reception and have some attractive features. Kotlin as a JVM language should be easy enough to experiment with. If I can find a task that would benefit from Rust I will probably give it a shot.

IDE

Winner: IntelliJ IDEA Ultimate

Cost: $149/$119/$89 for years 1/2/3+

Status: Trial

Reasoning

I initially wrote Java in emacs, then JCreator, then switched to Eclipse c2002 and used it through 2016. I’ve tried IntelliJ a few times over the years but never really got the hang of it or saw a lot of value in it.

However, Google does quite a bit of Java work, and their primary and only fully supported IDE for it is IntelliJ. I’ve also been using CLion (basically IntelliJ for C++) and it’s been OK.

The “Ultimate” edition of IntelliJ includes support for other languages and even React so that’s a strong argument in favor of trying it out. I’m not opposed to ultimately landing on using different tools to work in different languages (e.g. I often used Eclipse for Java and Sublime for JS), but if you can do it all in one, that’s nice.

My Eclipse muscle memory is very strong, so I expect this to be somewhat painful transition, but I will give it as fair a shot as I can manage.

Java Build

Winner: Gradle

Cost: Free

Status: Exploring

Reasoning

There are only two real choices here: Maven and Gradle. And given that Gradle uses Maven-style repositories, they aren’t even that different in many respects.

Maven

I’ve used Maven for many years, since Ant, and like most people had some struggles with it initially. I eventually learned to coexist with it, or at least avoid its sharp edges, and would just copy/paste my pom from one project to next and had minimal issues.

Gradle

Gradle has three main “advantages” over Maven that people seem to crow about.

One is that it’s written in Groovy, and you can therefore script your build and do advanced stuff more easily than writing a Maven plugin. I would put this in the Probably a Bad Idea category, like stored procedures. I bet there are some cases where this is useful, but probably far more where it’s a kludgy workaround to some other problem people don’t want to solve.

The second is that it’s written in Groovy, which is not XML. I always thought XML was nice when used properly, and that config files were one of those proper uses. However, something about it causes a primal aversion in many people and they convince themselves that things that are not XML are inherently better than things that are.

The third is that you can do different build versions more easily, and this one I get, especially in the context of things like Android apps. Given that I might be targetting different JVMs (regular and GraalVM) this might be useful, but probably won’t be.

So I’m not really impressed with Gradle either, but given that there are literally only two choices, I might as well know both. It’s pretty trivial for a small project to switch back or even run both, so this is a pretty low-risk experiment.

Source Control

Winner: Git + Monorepo

Cost: Free (plus hosting)

Status: Up and Running

Reasoning

I think there are only 3 real options these days for version control that don’t fall into the “never heard of it” category for most people.

Git

The dominant force, and the only one that many developers know these days.

Mercurial (hg)

I have grown to prefer Mercurial in a code-review + monorepo environment since starting to use it at Facebook. Implicit branches and easy commit management map very well to the “commits should do one thing” best practices as opposed to the pull request pattern where mainline commits should favor comprehensiveness. For a solo project this isn’t relevant and it’s basically the same thing as Git.

Subversion (svn)

For solo/small teams, SVN is totally fine, it’s basically how you’d be using a DVCS anyways, but if you don’t have a server running already then it’s probably not worth setting one up.

Mono vs. Multi Repo

For large organizations, monorepo is the clear way to go for reasons I can discuss elsewhere. For solo/small teams, it doesn’t really matter, and it *might* be better to split up your repos if you have a *very* clear separation (e.g. front/back end), which is how we did it at Mondogoal, but I would say to start with a monorepo and only split if there is a compelling reason to do so (e.g. regulations, licensing).

I’m going to call this a toss-up between Git and Mercurial and give Git the edge due to the fact that it’s massively more popular and more likely to integrate well with other things like IDEs and deployment tools.

Source Control Host

Winner: Google Cloud Source Repositories

Cost: Free to 5 users & 50GB storage/egress, then $1/user/month and $0.10/GB/month

Status: Exploring

Reasoning

Given that I’ve chosen Git, GitHub is the obvious first choice here, but since Google has a product we’ll invoke requirement #2. This also might integrate better with the other services I’ll be using, though I have barely researched that beyond the marketing.

One of the nice things with Git is that it’s trivial to multi-host, so if I ever find a compelling reason to also use GitHub, I can use both or just switch.

Next Up

There’s a lot left to do here, databases, frontend, and more. Stay tuned!

Logging Like it’s 2011

Earlier this year I revisited how I was logging things in Java, and decided I would try a hybrid approach.  I’m reporting back to say that it’s been successful. There are basically two ways I do it now:

Use java.util.Logging for libraries

It turns out that there’s nothing actually wrong with JUL aside from its limitations in terms of configuration.  It has different names than the Ceki Gülcü loggers (warning() instead of warn(), finest() instead of trace(), etc.) but otherwise works the same.  The configuration side of things I can’t even speak do as I haven’t had to configure it, I never use the actual JUL output.

Use Logback for applications

As suspected, Logback is basically just a new version of Log4J.  It’s got some niceties like MDCInsertingServletFilter that means I don’t have to write that filter myself anymore, and it’s supposedly faster and more stable, so there’s no reason I can see not to use it.  I also like that it has a format for condensing class names without completely losing the package names, so it goes from “com.efsavage.application.servlet.LoginServlet” to “c.e.a.s.LoginServlet” instead of just “LoginServlet”.

Why use two?

I like the fact that my libraries have zero logging dependencies and configuration, so I can always pop them into another app without adding baggage or conflicts or much additional configuration.  I can upgrade the logger in one application without having to deal with version conflicts of my other apps and having to do classpath exclusions and that type of nastiness.

Tip

If you do it this way, and you see your JUL logging showing up twice, you can edit the default logging config in your JDK installation, or if you prefer to leave that untouched as I do, try this (via Java Bits):

java.util.logging.Logger rootLogger = LogManager.getLogManager().getLogger("");
Handler[] handlers = rootLogger.getHandlers();
for (int i = 0; i < handlers.length; i++) {
rootLogger.removeHandler(handlers[i]);
}
SLF4JBridgeHandler.install();

This basically yankts out the default logger and let’s SLF4J do it’s thing alone.  In a webapp you’ll probably throw this in your context listener or startup servlet where you previously did your log4j config.

Workstation Setup 2011

A new workstation means it’s time to install lots of stuff, and we’re still a long way from DropSteam.  Here’s my log from fresh Windows 7 install in a new VM image to a functional development environment:

First, I hit Ninite and install:

  • All Browsers (I use Chrome as default)
  • Adobe Air
  • Adobe flash
  • VLC Player
  • IrfanView
  • Inkscape
  • Paint.NET
  • Foxit Reader
  • PDFCreator
  • CutePDF (yes, you need both PDF printers, as it’s fairly common for one of them to have a problem with a particular job)
  • CCleaner (tweak settings before running so you don’t nuke more than you want to, like browser history)
  • 7-Zip
  • Notepad++
  • WinSCP
  • JDK

Then I grab the ZIP of all of the Putty programs.  I put installer-less programs like this in C:\bin

Cloudberry Freeware for Amazon S3 buckets.

Download JavaDoc and install in JDK folder.

Download Eclipse (3.4, not impressed with 4.X so far) and then:

  • Set text font to 8pt Lucida Console
  • Most companies and many open source projects are still using SVN so I install the Subclipse plugin for Eclipse.
  • I’m not a huge fan of m2eclipse but I find that doing eclipse:eclipse from the command line costs you too much integration, so I use it.
  • Turn on all compiler warnings except:
    • Non-Externalized Strings – Enable as-needed
    • serialVersionUID – Not useful for most projects
    • Method can potentially be static – False positives on unit tests
  • Turn on line numbers
  • Install CheckStyle.
  • Install FindBugs.

Maven 3 seems a little rough around the edges so I still use Maven 2.X

Install Cygwin and add git, svn, curl, and ssh packages.

Install MySQL Community Edition.  During the installer I:

  • Change the charset to utf8
  • Fix the windows service name to something like MYSQL5
  • Add to windows path
  • Add a password

JRebel.  You’re using this, right?  If not, slap yourself and then go get it.  Pay for the license out of your own pocket if you need to.

Lombok.  I have finally used this on a real project and can say it’s ready for prime-time.  It does not work with IntelliJ IDEA but I haven’t really seen any reasons to use IntelliJ that outweigh the benefits of Lombok.

Photoshop Elements because while IrfanView is great for viewing and Paint.NET is great for simple edits, you will at some point need a more powerful editor.  Also most designers work in Photoshop so this let’s you open those files directly.

Photoshop Elements+ is basically a $12 unlock of some of Elements’ crippled features.  For me it’s worth it for tracking alone.

LastPass is useful even if you don’t store anything sensitive in it, it’s great for testing webapps with multiple users.

I use Git for my own work so we’ll need that. Don’t forget to set your name!

I also make some Windows tweaks:

  • Set desktop background to black.
  • Check “Show hidden files, folder and drives”.
  • Uncheck “Hide extensions for known file types”.
  • Set %JAVA_HOME to JDK directory.
  • Add Maven’s bin directory to %PATH
  • Add C:\bin to %PATH

I will obviously add more over time, but this is the stuff I know I will need.  What’s great is that almost all of it is free, and it can all be downloaded (except the original Windows install), so no disks required like the old days

You might think this is an incomplete list, where is my email client, my MS/Open office, my music player?  I don’t use those unless I have to.  Keep in mind that this is a VM so some of this software is installed on the Host OS, while the rest of it I prefer to use web-based solutions (Meebo, Google docs, webmail) so there’s no issues of having to keep updating settings.

Java 7

Java 7 (AKA Java 1.7.0) is out! What does this mean, is this a Big Deal like Java 3 or 5 or a snoozer like Java 6 or something in the middle like Java 4? It all depends on what you do with it, but for most Java developers, we’re looking at a Java 4-ish release. There are some new APIs, and some really nice syntax cleanup, but ultimately it’s not a big deal.

Here’s the list of what’s changed:

The stuff I hope I don’t have to care about (AKA The Boring Stuff)

These are the under-the-hood changes I will probably only encounter when debugging someone else’s code.

Strict class-file checking

I’m not entirely sure what this is, and it doesn’t seem like too many other people are either.  I think it means that Java 7 files have to declare themselves as different than previous versions and do some pre-verification as part of compilation.  I don’t know if this is actually something new or just declared baggage.

As specified in JSR 202, which was part of Java SE 6, and in the recently-approved maintenance revision of JSR 924, class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier.

Upgrade class-loader architecture

Fixes some deadlock issues when writing non-hierarchal custom classloaders.  If you’re writing custom classloaders, this is probably great news.  Unfortunately if you’re writing custom classloaders you’re pretty much screwed with or without this improvement.

More information here.

Method to close a URLClassLoader

Pretty self-explanatory.  if you’re using a URLClassLoader, you may at some point want to close it.  Now you can.

More information here.

SCTP (Stream Control Transmission Protocol)

From what I can gather SCTP is somewhere between UDP’s lightweight “spray and pray” approach and TCP’s heavy-duty “armored convoy”.  Seems like a good choice for modern media narrowcasting and always-on sensor networks, but don’t quote me on that.

More information here.

SDP (Sockets Direct Protocol)

Implementation-specific support for reliable, high-performance network streams over Infiniband connections on Solaris and Linux.

SDP and/or Infiniband seem to networking where you can get more throughput by bypassing the nosy TCP stack, but without completely ignoring it.  Kind of like flashing your badge to the security guard and letting all your friends in, I think?

More information here.

Use the Windows Vista IPv6 stack

I’m assuming that this also applies to Windows 7, but I’ll worry about this when my great-grandkids finally make the switch to IPv6 in 2132.

TLS 1.2

Better than TLS 1.1! (TLS 1.2 is SSL 3.3).

More information here.

Elliptic-curve cryptography (ECC)

The official description sums it up:

A portable implementation of the standard Elliptic Curve Cryptographic (ECC) algorithms, so that all Java applications can use ECC out-of-the-box.

XRender pipeline for Java 2D

Now, when you aren’t gaming, you can  leverage some of your GPU power to do image rendering and manipulation in Java instead of just mining BitCoins.

Create new platform APIs for 6u10 graphics features

Let’s you use opacity and non-rectangluar shapes in your Java GUI apps.

More information here.

Nimbus look-and-feel for Swing

Your Swing apps can now look like outdated OS X apps.

More information here.

Swing JLayer component

Useful way to write less spaghetti when trying to make Java GUI apps behave like other modern frameworks.

More information here.

Gervill sound synthesizer

Looks like a decent audio synthesis toolkit.  Not sure if this really needs to be in the JDK but now you can add reverb to your log file monitor’s beeping.

More information here.

The important stuff I don’t really use (AKA The Stuff I’m Glad Someone Thought Of)

These are the changes that probably only help if you’re on the cutting edge of certain facets of using Java.  Eventually these will probably benefit us all but I doubt many people were waiting for these.

Unicode 6.0

Also self-explanatory, new version of Unicode.  Has the new Indian Rupee Sign (?) and over 2,000 other new glyphs, so it’s certainly important if you use those, but otherwise not so much.  I don’t know enough about switching versions of Unicode to say if this is something to be worried about, but if you have lots of it in your app, it’s probably safe to say you should do some major tests before switching.

More information here.

Locale enhancement

Locale was previously based on RFC 1766, now it is based on BCP 47 and UTR 35.

This isn’t just new data, this is a new data structure, as the number and classification of languages on computers has evolved beyond the capabilities of RFC 1766.  I’ve always been glad that Java is so robust in it’s internationalization support even though I’ve never really taken advantage of most of it.

Separate user locale and user-interface locale

Now this is interesting, but I can’t find any good use cases yet.  My guess is that this is part of Java’s ongoing push towards multi-platform/multi-language interfaces, notably Android.  You can set a Locale (and the rules a locale dictates) not only on geo/cultural basis but on a device/environment basis.  Locale.Category only has two entries but I wouldn’t be surprised if more of them creep in in future versions.

JDBC 4.1

java.sql.Connection, java.sql.Statement, and java.sql.ResultSet are AutoCloseable (see try-with-resources below).

Also, you can now get RowSets from a RowSetFactory that you get from a RowSetProvider.  If you’re still writing JDBC this probably makes your life easier, but only in the “at least I”m not writing custom classloaders” way.

Update the XML stack

Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2

Your XML library is always a version that’s too old or too new, so you might as well go with the new one.

The important stuff I don’t use yet (AKA The Cool Stuff)

InvokeDynamic – Support for dynamically-typed languages (JSR 292)

The short version is that this makes things better for people implementing dynamically-typed languages on the JVM, like Clojure.  While the JVM is actually less statically typed than you might think if you’ve only used it via normal Java, methods in particular were difficult for these languages.

The long version is here.

Enhanced MBeans

I’ve never knowingly used MBeans (Managed Beans), but they do seem useful for large systems.  Regardless, they’ve been enhanced:

Enhancements to the existing com.sun.management MBeans to report the recent CPU load of the whole system, the CPU load of the JVM process, and to send JMX notifications when GC events occur (this feature previously included an enhanced JMX Agent, but that was dropped due to lack of time).

The Important Stuff I Use (AKA The Good Stuff)

These are the reasons I’ve even bothered to look at Java 7.  I’ll probably write a follow-up on each of these when I have some experience with them.

JSR 203: More new I/O APIs for the Java platform (NIO.2)

I’m including this here because I’m actually doing some data/file work where this will come in handy.  Basically this exposes a lot of I/O and Filesystem functionality that you previously had to dip into native libraries for, like actually telling the OS to copy a file rather than reading and writing the data yourself.  If you’re not doing any file-based work right now, keep a mental note to look into this the next time you type “File”.

More information here.

NIO.2 filesystem provider for zip/jar archives

Included here for similar reasons as the previous item, but I think it’s pretty cool that you can treat a zip file as a filesystem and operate within it.  Unfortunately it looks like the basic implementation is read-only, but I’m guessing the pieces are there for a read/write version.

More information here.

Concurrency and collections updates (jsr166y)

java.util.concurrent, as far as I’m concerned, is something everyone who considers him/herself a senior Java dev should be familiar with.  It’s rare that your code will not be able to take advantage of it in some way.  One of the missing pieces was a good fork/join component, where you can split jobs and wait for them all to complete in parallel before moving on.  I actually built a version of this approach (sadly not open sourced), so I’m eager to tinker with the new standard one.  I expect that list most of the concurrent library you’ll need to sprinkle in some of your own helper code to make things elegant, but the hard parts are solved and built and ready for assembly.

More information here.

Project Coin – Small language enhancements (JSR 334)

Ah, the sugary goodness that you’ve been waiting for.

  • Strings in switchNot terribly useful since we could (and probably still should) use enums, and you still have to use if blocks for anything fancy like regex matches.
  • Binary integral literals and underscores in numeric literalsI never really thought of this as a problem, much less one worth solving, but sure, why not.
  • Multi-catch and more precise rethrowNow this, as an ardent fan of exceptions, I like.  It not only cuts down on copy/paste exception handling, it helps avoid subclassing exceptions, which is almost always the wrong approach.  More info on this later.
  • Improved type inference for generic instance creation (diamond)Great idea that really should have always been there. Example:
    Map<String,Integer> myMap = new HashMap<>();
  • try-with-resources statementProbably the most important syntax change.  Now you can skip all of those finally { resource.close(); } blocks that were important but verbose even for my taste.  Doesn’t really help with the situation of the close method throwing an exception of its own though, so I’m curious to see how I end up handling that situation.
  • Simplified varargs method invocationJust avoids some methods that really never should have been there in the first place.

More information here (PDF).

Summary

So basically, a bunch of “nice to have” stuff.  Code will be a little cleaner, resources will be closed a little more often.  GUI apps will be a little fancier.  Some more people might try concurrency.  Programs will run a little faster.  The next thing you know we’ll be looking at Java 8

 

 

 

Life After JSON, Part 1

XML. Simply saying that term can elicit a telling reaction from people. Some will roll their eyes. Some will wait for you to say something meaningful. Some will put their headphones back on and ignore the guy that sounds like a enterprise consultant. JSON is where it’s at. It’s new. It’s cool. Why even bother with stupid old XML. JSON is just plain better. Right?

Wrong. But that’s not to say that it’s worse, it’s just not flat-out better, and I’m a little concerned that it’s become the de facto data format because it’s really not very good at that role. I think its success is a historical accident, and the lessons to be learned here are not why JSON is so popular, but why XML failed.

Without going into too much detail, XML came out of the need to express data as more complex relationships than formats like CSV allowed. We had OOP steamrolling everything else in the business world, HTML was out in the wild showing that this “tag” idea had promise, and computers and I/O were finally able to handle this type of stuff at a cost level that didn’t require accounting for every byte of space.

In the beginning, it worked great. It broke through much of headaches of other formats, it was easy to write, even without fancy tools, and nobody was doing anything fancy enough in it that you had to spend a ton of time learning how to use it. After that though, things got a little, actually a lot … messy. User njl at Hacker News sums it up very nicely:

I remember teaching classes in ’99 where I needed to explain XML. I could do it in five minutes — it’s like HTML, but you get to make up tags that are case-sensitive. Every start tag needs to have an end tag, and attributes need to be quoted. Put this magic tag at the start, so parsers know it’s XML. Here’s the special case for a tag that closes itself.
Then what happened? Validation, the gigantic stack of useless WS-* protocols, XSLT, SOAP, horrific parser interfaces, and a whole slew of enterprise-y technologies that cause more problems than they solve. Like COBRA, Ada, and every other “enterprise” technology that was specified first and implemented later, all of the XML add-ons are nice in theory but completely suck to use.

So we have a case here of a technology being so flexible and powerful that everyone started using it and “improving” it. We started getting asked questions by recruiters like “can you program in XML” and “they used to do C++ but now they’re trying to use a lot more XML”. We had insanely complex documents that were never in step with the schemas they supposedly used. We had big piles of namespaces nobody understood in the interest of standards and reusable code.

Where the pain was really felt was in the increasing complexity of the libraries that were supposed to make XML easy to use, even transparent. There was essentially an inexhaustible list of features that a library “needed” to support, and thus none of them were ever actually complete, or stable. 15 years later there are still problems with XML parsing and library conflicts and weird characters that are legal here but not there.

So, long before anyone heard of JSON, XML had a bad rep. While the core idea of it was sound, it ultimately failed to achieve it’s primary goal of being a universal data interchange format that could scale from the simplest document to the most complex. People were eager to find a solution that delivered the value XML promised without the headaches it delivered. In fact, they still are, as nothing has really come close yet, including JSON, but perhaps we can build on the lessons of both XML and JSON to build the real next-big-thing.

XML ended up being used for everything from service message requests and responses to configuration files to local data stores. It was used for files that were a few hundred bytes to many gigabytes. It was used for data meant to represent a single model to a set of models to a stream of simple models that had no relation to each other. The fact that you could use it for all of these things was the sexy lure that drew everyone to it. Not only could you use it for all of these things but it was actually designed to be used for all of these things. Wow!

Ultimately, I think this was the reason for it’s downfall, and JSON’s unsuitability for doing everything has not surprisingly been the reason for it’s ascendance on its home turf (JavaScript calling remote services). Does it really make sense for my application configuration file to use the same format as my data caches and as my web services? Even now it’s hard for me to say that it’s a bad idea because the value of having one format to parse, one set of rules to keep in mind, one set of expectations I don’t need to cover in my documentation is nice. But with XML and JSON both proving this to be false in different ways, we have to take it as such.

The problem that’s on the horizon is that as JSON becomes the go-to format for new developers to use, the people that have been told all along that XML is bad and horrible and “harmful”, they’re using it for everything. We’re heading towards a situtation where we have replaced a language that is failing to do what it was designed to do with a language that is failing to do things it was never designed to do, which I think would actually be worse. Even if it’s not worse, it’s certainly still bad.

If we abide by the collective answer that XML is not the answer, or at least it’s the answer we don’t like, and my prediction that JSON isn’t either, what is? And why is nobody working on it? Or are they?

To be continued…

To switch or not to switch, part 3

This entry is part 3 of 3 in the series To switch or not to switch

Continued from Part 2, we’re down to 41 languages that you could potentially build a modern, database-driven application with. I’d like to knock another 10-12 more off the list before I get into trying the language themselves, but I’m running out of black-and-white rules to do it with.

Concurrency is a big part of scaling, but that’s a tough characteristic to pin down. Any JVM-based language is going to have access to a great threading model, but people have been able to scale languages that lack threads at all, like Python (Edit: Python has threads) languages that have poor or difficult threading, to decent volumes as well, so clearly threads are not a requirement.

I spend much of my time writing web software, so robust support of that, like Java’s Servlet specification and third party additions like Spring MVC, is nice to have as well. Ruby and Python have made many of their gains based on the strengths of their web frameworks. However, an otherwise good language could add a web framework relatively easily, so again it’s not a requirement.

The way a language handles text is important when working with users, databases, web services, etc., but this can be addressed with libraries. Object orientation can be nice too, as can other models. Interpreted vs. compiled, machine code vs. byte code, exceptions, static typing, dynamic typing, the list goes on with important details that aren’t actually important enough that I can’t really live without them. I certainly prefer static typing, checked exceptions, painless threading, garbage collection, run-anywhere bytecode, running in a virtual machine, but I know smart people who can make valid arguments against every one of those, and maybe I just haven’t given the alternatives enough of a chance.

What is really important to me about a language is its community, leadership, and, for lack of a better term, motive. For this reason I’m going to knock the Microsoft-led CLI languages off the list. I know that CLI is a standard, and that the open source implementation is not beholden to it, but I’m simply not going to program in the Microsoft ecosystem. I think they just don’t care about open source and independent development. This eliminates:

  • C#
  • Boo
  • Cobra
  • F#

It’s kind of a shame because F# looks interesting and is something I’d like to tinker with some day. I also think C# is actually a pretty good language, and, while it initially seemed like Java Jr., it evolved to have some powerful features, moreso than Java in some ways. Unfortunately, even with Mono in the picture, it’s still a Microsoft product to me.

It should definitely be noted that I’m not exactly happy about Oracle owning Java now either. Due to Java’s inertia, I haven’t seen any Oracle decisions affect me yet, and I’ve probably got a number of years before I have to deal with that if and when they decide to do something bad. They could kill it entirely and people would still be using it for a long time.

Along similar lines, Objective-C’s fate seems to be tied closely with the whims of Apple. I don’t see any interest in using it for non-Apple OS projects, so it is bumped off the list as well.

  • Objective-C

Also some of these languages have such a small community or slow pace of development that they don’t seem worthy of investing in:

  • ALGOL 68
  • Clean
  • Dylan
  • GameMonkey Script
  • Mirah
  • Unicon

Three of these languages are active, but the community (and therefore the language) are solving very different problems, Processing with its graphics and visualizations, Scratch with its introductory/training aspects, and Vala is tied closely to GNOME applications.

  • Processing
  • Scratch
  • Vala

This leaves 27 languages moving on to part 4, most of which I’m definitely looking forward to learning more about:

  • Ada
  • Clojure
  • Common Lisp
  • D
  • Eiffel
  • Erlang
  • Falcon
  • Fantom
  • Factor
  • Go
  • Groovy
  • Haskell
  • Io
  • Java
  • JavaScript
  • Lua
  • MUMPS
  • Pike
  • Pure
  • Objective Caml
  • Python
  • Ruby
  • Scala
  • Scheme
  • Squeak
  • Tea
  • Tcl

To switch or not to switch, Part 1

This entry is part 1 of 3 in the series To switch or not to switch

I was listening to a talk the other day and the speaker derisively mentioned “those people who are happy writing Java for the rest of their lives”, and I thought “Am I one of those?” and then I thought “is that a bad thing?”. As part of my “question everything” journey, I decided that it was time, after 10+ years, to have Java report for inspection and force it to defend its title.

I should make it clear, that I am not a language geek, or collector. I generally disagree with “use the right language for the right problem”, I prefer “use the right language for most of your problems”. So far, Java has been that for me. Some things I do in Java are more easily done in other languages, but not so much so that it overtakes the headaches of heterogeneous codebases. If something is really difficult, or impossible in your main language, bring something else in, but keep it simple. I also think it’s fine to have more than one main language, a client of mine is currently transitioning off C#, keeping Java, and adding Python. What they don’t have is random parts of their infrastructure done in erlang or perl or tcl because that’s what someone wanted to use that day.

I could make this task easier and just look at the “marketable” skills out there, which is a small subset. While I think it’s unlikely that there is some forgetten language just waiting for its moment, it’s certainly possible I could find a neat one that’s fun to play with. Languages like Ruby and Python spent years before people could find jobs doing them. So I’m going to look at literally every single language I can find, and put them through a series of tests. If you find a language I haven’t mentioned, let me know and it will be given the same chance as the rest.

Round 1:

The point of this round is to identify languages that have any potential for being useful to me.

Qualifying Criteria

Rule 1. It must be “active”.
This is admitedly a subjective term, but we’ll see how it goes. Simula is clearly not active, while Processing clearly is, with a release only weeks ago.
Rule 2. It must compile and run on modern consumer hardware and operating systems.
This means, at minimum, it works on at least one modern flavor of Linux, because I will want this to run on a server somewhere, and I don’t want a Windows or OS X server, or worse, something obscure. For bonus points, it will also work on Windows 7 and/or OS X.

So, that’s it for now. There are no requirements for web frameworks or lambdas or preference for static versus dynamic typing, I think those elements will play out in later rounds.

  • Ada
  • Agena
  • ALGOL 68
  • ATS
  • BASIC
  • BETA
  • Boo
  • C
  • C#
  • C++
  • Clean
  • Clojure
  • COBOL
  • Cobra
  • Common Lisp
  • D
  • Diesel
  • Dylan
  • E
  • Eiffel
  • Erlang
  • F#
  • Factor
  • Falcon
  • Fantom
  • FORTH
  • Fortran
  • GameMonkey Script
  • Go
  • Groovy
  • Haskell
  • Icon
  • Io
  • Ioke
  • Java
  • JavaScript
  • Logo
  • Lua
  • Maple
  • MiniD
  • Mirah
  • Miranda
  • Modula-3
  • MUMPS
  • Nu
  • Objective Caml
  • Objective-C
  • Pascal
  • Perl
  • PHP
  • Pike
  • Processing
  • Pure
  • Python
  • Reia
  • Ruby
  • Sather
  • Scala
  • Scheme
  • Scratch
  • Self
  • SPARK
  • SQL
  • Squeak
  • Squirrel
  • Tcl
  • Tea
  • Timber
  • Unicon
  • Vala
  • Visual Basic .NET

This list is actually a LOT longer than I expected, and yes, there actually is a modern version of ALGOL 68. Stay tuned for part 2.

Logging Like it’s 2002

I’ve been going through my old code, looking for stuff that might be worth sharing. At the same time, I’ve been maven-izing my builds, and decided I should revisit each dependency, as some of this code is so old the dependencies are very out of date and/or included in the JDK now. Which brings me to log4j.

I’ve literally used log4j on everything I can ever remember doing in Java, but not anymore. Don’t get me wrong, I have no problem with it, and may continue to use it in my applications (if I don’t like logback), but I won’t be including it in any libraries anymore. After 8 years, I’ve finally adopted JUL. Here’s the options and why I chose JUL:

JUL (java.util.logging)

Pros

I’ll start with the victor, because the reason is the simplest. No dependency or version issues, one less thing to download, guaranteed to be there. There is also plenty of code out there to use one of the other frameworks to do your actual logging, so the config isn’t really a burden on the developers using the library.

Cons

Not sure yet. It’s not widely used, but I think that’s because there are far more Java frameworks/applications out there than libraries. There is also a performance issue with SLF4J when you have JUL logging set to a low level, but you shouldn’t need to run (presumably stable) libraries in debug or trace when performance is an issue (e.g. in production), only when you’re trying to debug something. The JUL’s actually logging isn’t really relevant here, as I think most applications will just be running it’s output through their own logging framework.

log4j

Pros

Works great, very stable, hasn’t changed in long enough that you only have version issues when you’re dealing with REALLY old code.

Cons

Less people are using it, more projects are going to SLF4J/Logback. These new frameworks do add some nice features, and log4j is basically abandoned, so I think it’s time to stop doing anything new with it.

commons logging

Pros

None?

Cons

I’ve always been against commons logging, because 99% of the time, it was just used to wrap log4j. The logic was that you could plug custom logging into it, but you can do that with log4j already, so it’s basically an abstraction of an extensible framework, with zero added value. Actually you have less value because you lose things like MDC. At this point it’s like a virus that just won’t go away, and always seems to end up in the classpath somehow. As far as I’m concerned, I consider this a completely superseded library.

logback

Pros

From what I can gather, logback really is (as claimed on their website) the continuation of log4j. Not having used it, I can only assume this is a good thing, and it just adds new features like parameterization. I’m going to try logback in my next application, and since logback includes slf4j, I will access my library logging that way.

Cons

It’s not really in wide usage yet, which means that a library requiring it is going to add an extra dependency.

slf4j

Pros

If logback is the modern version of log4j, slf4j is the supposedly useful version of commons logging, and supposedly improved version of JUL. It’s not a logger per se, it’s just an API/facade. It has the ability to combine multiple logging APIs and legacy frameworks into one stream, which is why it seems to be getting traction on complex applications.

Cons

I’ve had some serious versioning issues with slf4j, due to some methods being removed or changed, so you end up with older code throwing errors when you use a newer version, thus requiring that you only use the old version and introduce the chance for strange errors in code expecting the newer version. For this reason, I don’t feel very comfortable specifying any version of slf4j, and I will leave it to the user to add it if they wish.

Conclusion

So JUL seems like the best choice for a stable, single-purpose library to use, as it’s the least imposing on whatever uses it. It should be noted that I haven’t actually used JUL yet in a real app, so perhaps I will find out that there’s an actual reason for its lack of popularity. If that’s the case, I will likely use slf4j, and try and find out which methods cause issues so I can avoid them, and not be the person someone else curses for requiring it.

The 3 Ingredients Necessary to Make a Really Good Developer

I’ve been doing development long enough that I can now look back and have some perspective on the art/craft/profession. I’ve been asked many times “how do you become a developer?” and I now have a decent partial answer. They are not being “super smart” or “good with computers” or things like that. I think those are artifacts of these other attributes. I’d also guess that these apply to engineering in general, but I’ll limit myself to my own turf.

Curiosity

Good developers have a compulsion to understand how things work. they open files with text editors to see if its just xml or a zip file with a different extension. They run benchmarks against things that don’t seem to matter. They add query parameters like debug=true to websites. They try to break stuff. And not just software, they probably know how an internal combustion engine or an air conditioner works. They probably can tell you a bit about how the minimum wage affects inflation. My grandmother used to give me old radios and gadgets strictly so I could disassemble them.

This attribute is probably the one that separates the wheat from the chaff the most. There are lots of people who can code, or manage a system, but the ones that excel will need to understand how things work, and know that every juicy answer yields even more delicious questions.

Focus

The ability/requirement to focus is the subject of many other blog posts, but I view focus in a slightly different way. Focus is not eliminating distractions or even maintaining “flow”, focus is the ability to keep a problem in your head until you’ve solved it. Distractions can hamper this, so can multitasking or other external factors, but good developers can work on something, go to lunch, or go home for the evening, and pick up right where they left off.

Hard Work/Genuine Interest

I think there is a certain amount of innate aptitude, but I don’t think development is an exception to the 10,000 hour rule Malcolm Gladwell popularized. Luckily, its a trade where we can log those thousands of hours at an early age and make it look like we’re goofing off. I started with LOGO in the second or third grade, moving onto translating my piano music into BASIC, learning that a coda is just like a GOTO. On to writing databases to track fantasy baseball status with MS Works, and so on. Sure, we spent many college nights taking over IRC channels with bots, and crashing MUDs with scripts and floods, and that may have looked like simple nerd mayhem, but those experiences have tremendous value in the “real world”.

You can know all the buzzwords and put on a good show, but you can’t really fake the level of interest that side projects demonstrate. There are plenty of good-enough developers out there who punch in and out, and there are even jobs where you do cool enough stuff that you don’t feel compelled to break out of the rut (we try to be this way), but if you still need to hack on your own, you’ve got potential.

Contrary Opinion: Code coverage by checked exceptions

This is the first in what will hopefully be a series of statements that I may not entirely believe, but to put out there to see if they are viable…

Conventional Wisdom: Unit tests are great, you should have lots of them. More code coverage is better.

Conventional Sentiment: Checked exceptions are a hassle, more trouble than thier worse, promote sloppy coding, etc.

Contrary Opinion: Checked exceptions are a better way to address code coverage than unit tests.

On a scale of 1-10 of using and advocating for checked exceptions, I’m probably an 11. I completely disagree with pretty much all of the conventional complaints against them. They do not promote spaghetti code, they actually clean up your normal logic and neatly compartmentalize your error handling. They do not promote sloppy behaviors like exception swallowing, that’s entirely the programmer’s fault. Adding or removing exceptions breaks client code. Yes, it does, why is this a problem? You added additional error conditions, meaning you changed your contract with the clients, and they should be revisited. If you didn’t have this obvious way to signal a change, chances are that the client would handle things incorrectly. Even some of the major voices in computer science have fallen out of love with them, but usually the reasoning is based on people using them wrong (or being too hard to use right).

On a scale of 1-10 of using and advocating for unit tests, I’m probably a 2. They absolutely have their uses. A straight-up algorithm, especially things like math and parsing, should be unit tested for various input values to assure they’re returning the proper value. However in most modern business/consumer software these represent a very small portion of your code. There are far more lines of code dealing with things like authentication, user inputs, file loading, database and network operations, etc. These are complex activities. Even simple CRUD applications can end up invoking hundreds of functions across dozens of libraries for every operation.

The crux of my argument is that if you use exceptions properly, you don’t need to test if an operation completed properly. If the operation completes, it did so properly, all other conditions would fail to complete. Since you want to be using exceptions properly anyways for cleaner code, unit testing this code is redundant but a waste of time and energy to write and maintain. Not only that but exceptions give you code coverage at compile time AND error handling at run time, which unit tests cannot do.

If you disagree, please tell me why!