How to Pick a Software Stack

When starting a new company or project, one of the most daunting phases is selecting a stack. There are lots of things that factor into the decision, and anyone who has done this before will attest that you will make some bad choices, and some of those will be nearly impossible to undo, especially if you’re successful. Some questions to consider:

What do you already know/love/hate?

When considering your options for some piece of the stack, it is totally reasonable to award points to things you know and love, and deduct points from things you know and hate. It is, however, not a good idea for this to be the principal factor. You should be able to defend your favorite choice against the others objectively and quantitatively, not solely on the basis of “it feels better” or “I know it already”.

Are you just trying to execute a business/goal or are you trying to explore and learn something new?

One of my two virtues of engineering is curiosity. Throughout your career, you should always be looking for opportunities to explore new ideas, new tools, new methods. If this is a moonlighting or passion project, actively try to weight new and unfamiliar things higher. If this is a major venture with lots of other people’s time and money, there is just as much to be gained from leveraging your experience and going for flawless execution of things you’ve already worked out the kinks on.

If you have teammates, what do they know/love/hate?

This is a case where you should set the hierarchy aside and try to get everyone’s opinion. If you’re a senior engineer or architect you can bring experience to the table, but don’t discount the novel approaches and backgrounds of your teammates, regardless of their experience. We so often get stuck in our ways, even after only a few years, and avoid novelty and diversity in the name of reducing risk.

Are you going to need to hire/outsource the work?

If your plan involves getting dozens or hundreds of people on board in the near future, you’re going to want to lean to the more popular/vanilla options. Trying to hire 300 people in the next 12 months to work on your Erlang/Neo4J stack running on an Arduino cluster is going to be much harder than getting people to do Java/MySQL on AWS.

Are you building a prototype, an MVP, or a “real” version?

Deciding, and more importantly understanding, what you’re building is a topic worth exploring on its own, but roughly speaking you’re building one of three things:

The Prototype

You just need to get something “working” to explain your idea or your solution. Unless you’re doing something truly revolutionary (p99.999 confidence spoiler alert: you aren’t), try to prototype your idea/business/product OR your technology, not both at the same time. Also really commit to throwing this away and starting over with a proper plan if there is promise. I’ve seen/had too many prototypes turn into projects that have a pile of technical debt before they even really start.

The MVP

I feel the MVP is overused, and too often is an excuse for a poor quality product or a starved team/budget. You obviously want to iterate, and if you want to call your first release an MVP for buzzword’s sake, go ahead, but don’t write throwaway code unless you’re committed to throwing it away (see prototype above). I’ve lost count of the number of companies and projects I’ve talked to that are rewriting their Rails or Django app because those seemed like easier paths to an MVP and “now we just need to scale”. That’s like saying “I’m going to get a job, but then I’m going to go and get a completely different job that actually pays my bills.”

If you are truly building an MVP, then your process for stack selection is no different than building the real version, it’s just prioritized more aggressively.

The “Real” Version

The first step here is to figure out, as best as you and your team can, what the successful (often AKA profitable) version of your app/system looks like. Do you need 100 users to break even? Do you need a million? If you need a million, don’t spend a single minute building something that only works for hundreds. If you need 100, don’t spend any time building it for 1 billion (unless you get that for no additional work, E.G. something like Amazon S3).

What features are truly required to hit that goal? Make sure there is a place for them in your architecture even if you don’t plan to build them right away. Do you need support and sales teams to be successful? Make sure you’re putting hooks in for the tools they will need (E.G. CRM, ticketing, admin consoles).

What can you get “for free”?

Everything has a cost, in terms of time, money, people, etc. but those costs are not always the same across companies and projects. Your team’s experience can discount learning curves, your company may already have investments or licenses for things. Never use something just because it’s free, but make sure that is a part of your decision.

If you’re at a larger organization with a number of teams, consider whether there is a team supporting certain aspects of what you might use. If there is a logging system that is in place and ready to go and has its own support, it should be a hard sell to build or buy or set up a new one just because of a few features. Often that team will even build what you need!

What are your competitors using?

Competitive analysis can be difficult; most software companies don’t reveal much about what they’re using in any given situation, but that information can play a large part in your decision. If you find a conference talk where they discuss what they tried and had good or bad results with, you’ve possibly just saved yourself a lot of pain. You will often just get a lot of good ideas (“I never considered using X to do Y”), or pointers to things you’ve never heard of.

You might also find some competitive advantage. If they are using something you know is difficult to change or won’t scale well, make sure yours can beat them in those aspects. By the time they realize you’re about to pass them it might be too late for them to adjust.

What will fail first?

As you’re building your stack, know where the weak points are, and don’t be shy about circulating that information. If you can’t get to your target latency because some component will use 90% of that budget, try to find a replacement early rather than assuming you can plug something else in later. You should be confident that all of your core components can hit your targets, and if you aren’t sure, validate that now. You’ll find that the interconnected nature of systems will often lead you to replacing more than one piece at a time, which only makes it less likely that you’ll be able to do it, and more likely to fail.

What else?

Picking a stack is harder, and takes more time, than most people realize. I’m sure there are other things to consider, and I’d love to hear your thoughts on this in the comments.

How do you do, fellow kids?

I’ve spent the last few years sequestered away in “big tech” which has been an interesting and mostly positive experience, but one of the more unexpected side effects, especially as an extreme generalist, has been the specialization on the technical side of things.

From when I first started coding for money in the mid-90s, I was always a true “full stack” engineer, even before that was a term. Give me a bare server and a sketch and I could handle the rest, even some basic UI design. As backends became more complex and interesting I gravitated that way, but I was still pretty up-to-date on front-end technology for a nearly 20-year stretch.

The front end world got increasingly difficult to keep up with on a casual basis as single page apps became a thing and powerful frameworks took over. At Mondogoal I was entirely back-end from late 2013 onward, although I was also the product owner and oversaw design and branding. At Facebook I worked on maps and geocoding, writing mostly backend C++ code. At Google I’m deeper yet, working on the networking team.

During my last job search I initially still had “full stack” on my resume, but realized that I didn’t think I met that bar any more, mainly because I didn’t really know any of the things most people were using these days, like React or Angular. My skills in that area (e.g. jQuery) are no longer in demand at the places I’d want to work, supplanted by experience with scalability and distributed systems.

And don’t get me wrong, I have no regrets about this, at all. Back-end experience and knowledge decays much slower than the front-end, so now that I’m often the oldest person in the room, that’s a feature, not a bug.
Back-end work also tends to pay more and offer more paths to leadership roles. But I think I’m still a full-stack developer at heart, and the deeper I go the more wistful I get for being able to make something you can see and touch.

Enter, the Side Project

I haven’t had a software side project in years. With all that’s been going on, especially becoming a parent, my energies have been allocated elsewhere. But there’s been an increasingly noticeable hole there that I need to fill.

I brainstormed a few ideas and kicked them around until one of them, a browser-based game, settled out that checked off most of the boxes:

  • Learn how to build a modern front end.
  • Give me a reason to explore the Google Cloud offerings (since those products are effectively my customers for my day job).
  • Get back up to speed on things like analytics.
  • Scratch an itch I’ve had for a long time (a nerdy economy-based game).
  • Avoid any risk of conflict with my day job.
  • Give me something to blog about!

Now what?

I’m going to talk about this in more detail in a future post, but after spending a bit of time poking around the current front-end world, I feel like I’m basically starting over. React, Angular, Vue. Yarn, webpack, JSS. The list goes on and on. Even my JavaScript is out of date, with all of the additions over recent years, and TypeScript being a serious option now (it was still kind of a toy last time I tried to use it back in 2012/2013).

This is a bit intimidating, but more so exciting. There’s so much to learn I haven’t even really started to think about the actual project or how feasible it will be to launch. Stay tuned for updates!

Driverless Cars

I completely agree with the headline of this blog post, but not with the overall sentiment.  Driverless cars are going to change the world, and for the better.  I’m not sure how much they will do so in my lifetime, though; it’s hard to believe that anyone born before 1985 or so is going to completely trust them.

The car insurance industry will cease to exist. These cars aren’t going to crash. Even if there are hold-outs that drive themselves, insurance would be so expensive they couldn’t afford it, as no one else would need it.

These cars will crash.  For as long as humans are allowed to drive, they will be causing accidents, hitting other driverless cars and each other.  There are also a number of causes of accidents that are still going to happen, such as those involving wildlife or severe weather or mechanical failure.  The robocars will handle these situations far better than humans, but they will still happen, and people will still be injured and killed as a result.

If the cars don’t crash, then the auto collision repair / auto body industry goes away. The car industry also shrinks as people don’t have to replace cars as often.

The car industry will likely shrink over time, just as any other technology-driven industry does.  Carmakers will be forced to evolve toward new products.  This will happen slowly enough that, if they’re properly managed, they should be able to shrink through attrition.

Long-haul truck driving will cease to exist. Think how much money trucking companies will save if they don’t have to pay drivers or collision and liability insurance. That’s about 3 million jobs in the States. Shipping of goods will be much cheaper.  On that note, no more bus drivers, taxi drivers, limo drivers.

This is definitely true.  I bear no ill will towards professional drivers, but I think we can find jobs that are more rewarding for people than driving goods or passengers from point A to point B, often driving back to A with an empty truck.  Trucks also account for the vast majority of road wear; a single tractor trailer can do as much damage to a road as nearly 10,000 normal cars.  The main reason we load so much weight onto a truck is so you only need one driver.  It will be more efficient to send smaller loads by robotruck, as they can be better targeted (think one truck per state rather than one truck per region), which will result in smaller trucks.

Meter maids. Gone. Why spend $20 on parking when you can just send the car back home? There goes $40 million in parking revenue to the City of Vancouver by the way.

Considering much of that revenue is probably supporting the collection of that revenue (meter maids, infrastructure, towing, courts, etc.), I don’t think this is a net loss.  Also, fewer parking spots means more pleasant streets with fewer traffic problems.

Many in cities will get rid of their cars altogether and simply use RoboTaxis. They will be much cheaper than current taxis due to no need for insurance (taxi insurance costs upwards of $20,000/year), no drivers, and no need for taxi medallions (which can cost half a million in Vancouver). You hit a button on your iPhone, and your car is there in a few minutes.  Think how devastating that would be to the car industry. People use their cars less than 10% of the time. Imagine if everyone in your city used a RoboTaxi instead, at say 60% utilization. That’s 84% fewer cars required.

Absolutely!

No more deaths or injuries from drinking and driving. MADD disappears. The judicial system, prisons, and hospital industry shrink due to the lack of car accidents.

Let us hope that MADD doesn’t exhibit the Shirky Principle (“Institutions will try to preserve the problem to which they are the solution”) and instead simply fades away into an irrelevance we can all agree is a success.

Car sharing companies like Zip, Modo, Car2Go are all gone. Or, one of them morphs into a robo-taxi company.

I think they will definitely be robotaxis, but there will also be a need for specialty cars like pickups.  We may even see an increased diversity of car designs available for rental where you can have a special grocery-mobile sent over, or a van with 12 seats, or one with extra entertainment options for your long trips, and so on.

Safety features in cars disappear (as they are no longer needed), and cars will become relatively cheaper.

Very unlikely; as people buy fewer cars and use them less frequently, prices will go up accordingly.  We’ll also probably require, through legislation, even more safety features, simply because of an inherent distrust of the technology.

I’m really hoping that robocars are a reality within the next 30-40 years, when I will be at the point where I shouldn’t be driving any more, and I’m happy to see that we actually seem on track to do that.

Amazon Showrooms

Amazon caught a lot of heat this past holiday season over some improvements to its shopping app.  It made it easier than ever to find out that you probably don’t need to buy that blender at Sears, when you can get it for 30% less on Amazon and not even have to carry it home.  There were cries that small businesses can’t compete with this and would all be dead soon.

There is nothing at Best Buy or Barnes & Noble that you can’t get on Amazon (or many other online stores).  It’s rare that it will not be cheaper online, even during a sale (which typically just brings the price down to a normal online price).  Is it sustainable to have a store where I can go and hold something, and then order it from somewhere else?  No.  Should we feel bad for the big box stores?  No.  Should we feel bad for the shopkeeper who sells a particular niche at a high markup without adding value?  No.

Stores that only sell commodity products are a recent innovation to take advantage of a temporary imbalance.  They will eventually go the way of dodos, video rental stores, and record labels.  We’re still going to have a few, because there are enough “need it now” purchases to sustain the Targets and Wal-Marts, and we’ll probably still have a few high-end ones where excellent service matters like Nordstrom, but most of the stores out there are turning into showrooms.

What if Amazon bought BJs?

(BJs is a consumer warehouse/bulk goods store, like Costco and others).  My Prime membership takes the place of my BJs membership.  Instead of walking around with a giant shopping cart and driving home with mass quantities of things, I simply browse the aisles for products I like.   When I see one, I scan it with my phone, and it’s on my doorstep the next day.  There are a few people on staff that might help, and there’s a hotline to specialists that understand the products and can answer my questions.  No need for a massive loading-dock infrastructure, or inventory control, or 50′ tall ceilings to heat, or many of the other overhead expenses that yield the current retail markups.

What about the little guy?

I’d like to see our shops go back to actually making things and/or adding value.  Custom products, not “regional dealers”.  There will definitely be fewer of them, but this will make more space and lower rents for the people who just want a spot where they can sell their craft, or for people to provide useful services instead of distribution.

Logging Like it’s 2011

Earlier this year I revisited how I was logging things in Java, and decided I would try a hybrid approach.  I’m reporting back to say that it’s been successful. There are basically two ways I do it now:

Use java.util.logging for libraries

It turns out that there’s nothing actually wrong with JUL aside from its limitations in terms of configuration.  It has different names than the Ceki Gülcü loggers (warning() instead of warn(), finest() instead of trace(), etc.) but otherwise works the same.  The configuration side of things I can’t even speak to, as I haven’t had to configure it; I never use the actual JUL output.
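For illustration, here’s roughly what the library side looks like; the class and messages are made up, but the point is that the only logging dependency is the JDK itself:

import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical library class: depends only on the JDK, no logging jars.
public class GeocodeClient {

    private static final Logger logger = Logger.getLogger(GeocodeClient.class.getName());

    public String lookup(String address) {
        logger.fine("looking up " + address); // roughly Logback's debug()
        try {
            return "42.36,-71.06"; // pretend we called a remote service here
        } catch (RuntimeException e) {
            logger.log(Level.WARNING, "lookup failed for " + address, e); // warn()
            throw e;
        }
    }
}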

Use Logback for applications

As suspected, Logback is basically just a new version of Log4J.  It’s got some niceties like MDCInsertingServletFilter, which means I don’t have to write that filter myself anymore, and it’s supposedly faster and more stable, so there’s no reason I can see not to use it.  I also like that it has a format for condensing class names without completely losing the package names, so “com.efsavage.application.servlet.LoginServlet” becomes “c.e.a.s.LoginServlet” instead of just “LoginServlet”.
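On the application side, the code talks to the SLF4J API and Logback picks it up natively; the class below is a made-up sketch, and the condensed names come from using a length-limited %logger conversion word (e.g. %logger{36}) in logback.xml:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical application class; Logback handles these calls via SLF4J.
public class LoginService {

    private static final Logger log = LoggerFactory.getLogger(LoginService.class);

    public boolean login(String username) {
        log.debug("login attempt for {}", username);
        boolean ok = username != null && !username.isEmpty(); // placeholder check
        if (!ok) {
            log.warn("failed login for {}", username);
        }
        return ok;
    }
}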

Why use two?

I like the fact that my libraries have zero logging dependencies and configuration, so I can always pop them into another app without adding baggage or conflicts or much additional configuration.  I can upgrade the logger in one application without having to deal with version conflicts of my other apps and having to do classpath exclusions and that type of nastiness.

Tip

If you do it this way, and you see your JUL logging showing up twice, you can edit the default logging config in your JDK installation, or if you prefer to leave that untouched as I do, try this (via Java Bits):

// Remove the JDK's default handlers from the root logger so JUL records
// aren't printed twice, then let the SLF4J bridge take over.
java.util.logging.Logger rootLogger = LogManager.getLogManager().getLogger("");
Handler[] handlers = rootLogger.getHandlers();
for (int i = 0; i < handlers.length; i++) {
    rootLogger.removeHandler(handlers[i]);
}
SLF4JBridgeHandler.install();

This basically yanks out the default handlers and lets SLF4J do its thing alone.  In a webapp you’ll probably throw this in your context listener or startup servlet, where you previously did your Log4J config.

Workstation Setup 2011

A new workstation means it’s time to install lots of stuff, and we’re still a long way from DropSteam.  Here’s my log from a fresh Windows 7 install in a new VM image to a functional development environment:

First, I hit Ninite and install:

  • All Browsers (I use Chrome as default)
  • Adobe Air
  • Adobe Flash
  • VLC Player
  • IrfanView
  • Inkscape
  • Paint.NET
  • Foxit Reader
  • PDFCreator
  • CutePDF (yes, you need both PDF printers, as it’s fairly common for one of them to have a problem with a particular job)
  • CCleaner (tweak settings before running so you don’t nuke more than you want to, like browser history)
  • 7-Zip
  • Notepad++
  • WinSCP
  • JDK

Then I grab the ZIP of all of the PuTTY programs.  I put installer-less programs like this in C:\bin.

Cloudberry Freeware for Amazon S3 buckets.

Download JavaDoc and install in JDK folder.

Download Eclipse (3.4, not impressed with 4.X so far) and then:

  • Set text font to 8pt Lucida Console
  • Most companies and many open source projects are still using SVN so I install the Subclipse plugin for Eclipse.
  • I’m not a huge fan of m2eclipse but I find that doing eclipse:eclipse from the command line costs you too much integration, so I use it.
  • Turn on all compiler warnings except:
    • Non-Externalized Strings – Enable as-needed
    • serialVersionUID – Not useful for most projects
    • Method can potentially be static – False positives on unit tests
  • Turn on line numbers
  • Install CheckStyle.
  • Install FindBugs.

Maven 3 seems a little rough around the edges, so I still use Maven 2.X.

Install Cygwin and add git, svn, curl, and ssh packages.

Install MySQL Community Edition.  During the installer I:

  • Change the charset to utf8
  • Fix the windows service name to something like MYSQL5
  • Add to windows path
  • Add a password

JRebel.  You’re using this, right?  If not, slap yourself and then go get it.  Pay for the license out of your own pocket if you need to.

Lombok.  I have finally used this on a real project and can say it’s ready for prime-time.  It does not work with IntelliJ IDEA but I haven’t really seen any reasons to use IntelliJ that outweigh the benefits of Lombok.

Photoshop Elements because while IrfanView is great for viewing and Paint.NET is great for simple edits, you will at some point need a more powerful editor.  Also, most designers work in Photoshop, so this lets you open those files directly.

Photoshop Elements+ is basically a $12 unlock of some of Elements’ crippled features.  For me it’s worth it for tracking alone.

LastPass is useful even if you don’t store anything sensitive in it; it’s great for testing webapps with multiple users.

I use Git for my own work so we’ll need that. Don’t forget to set your name!

I also make some Windows tweaks:

  • Set desktop background to black.
  • Check “Show hidden files, folders, and drives”.
  • Uncheck “Hide extensions for known file types”.
  • Set %JAVA_HOME% to the JDK directory.
  • Add Maven’s bin directory to %PATH%.
  • Add C:\bin to %PATH%.

I will obviously add more over time, but this is the stuff I know I will need.  What’s great is that almost all of it is free, and it can all be downloaded (except the original Windows install), so no disks are required like the old days.

You might think this is an incomplete list: where is my email client, my MS/Open Office, my music player?  I don’t use those unless I have to.  Keep in mind that this is a VM, so some of this software is installed on the host OS, while for the rest I prefer to use web-based solutions (Meebo, Google Docs, webmail), so there are no issues with having to keep updating settings.

The Most Important Performance Metric

Up until about 5-10 years ago, the performance gains in computers were huge from one generation to the next. Tasks that took all night soon took an hour, then 15 minutes, then 3 minutes, and so on. You felt these and they changed the way you worked. Eventually, hardware got beyond the point where it changed what you did and just became an “oh that’s nice” effect.

I recently decided, after months of hemming and hawing, to replace my main workstation in my home office. The previous one was a c2008 midrange Dell that has served me well. I had been using some better hardware at my job, a decent desktop with an SSD and then a decent laptop, also SSD. When I went freelance, I splurged on the best laptop I could find, with the exception of it not having an SSD, but rather twin 7200rpm drives in RAID 0. It has an amazing 17″ screen, 8GB RAM, i7 CPU, etc. It ran me $3200, and I figured I’d replace the drives at some point rather than spend the extra $1k from the manufacturer. I wasn’t in any rush because it felt as fast as my previous SSD boxes so I wasn’t anticipating it being worth the expense, much less the time spent reinstalling.

As for my new desktop, I went similarly all-out. It’s an i7-2600k, 16GB Corsair Vengeance 1600MHz RAM (the most you can reasonably buy at the moment), 2 x 120 GB Intel 510 SSD in RAID on an MSI Z68A-G65 mainboard.  The video card is an nVidia GTX 460 SE, which is not really in the same league as the rest of the hardware but this is primarily a work computer.  600W power supply, Cooler Master HAF X case, and Windows 7 Professional and it came in at about $1800 altogether, which actually isn’t bad. It was very easy to put together, probably 90 minutes, and booted fine the first time I powered it on.

I’ve been using it for the past day, and the difference is astonishing. It’s not just fast, it’s different, and I think I’ve just figured out why.

The most important performance metric for modern workstations isn’t how fast they complete arbitrary tasks; it’s the percentage of your tasks that complete instantly.

This computer, compared to every other computer I’ve used, Macs, PCs, Linux, whatever, does far more things without any delay whatsoever. So much so that my rhythm is off and I’m actually a little uncomfortable at the moment (I know this will pass).  The familiar delays I was used to, even when they were 50-100ms, whether opening a file or pulling up an autocompletion, are gone.

I guess my point is that if you spend all day on a computer and think that there’s no point in upgrading any more except to replace things that break, I think you will be pleasantly surprised if you decide to splurge a bit.

Java 7

Java 7 (AKA Java 1.7.0) is out! What does this mean? Is this a Big Deal like Java 3 or 5, a snoozer like Java 6, or something in the middle like Java 4? It all depends on what you do with it, but for most Java developers, we’re looking at a Java 4-ish release. There are some new APIs, and some really nice syntax cleanup, but ultimately it’s not a big deal.

Here’s the list of what’s changed:

The stuff I hope I don’t have to care about (AKA The Boring Stuff)

These are the under-the-hood changes I will probably only encounter when debugging someone else’s code.

Strict class-file checking

I’m not entirely sure what this is, and it doesn’t seem like many other people are either.  I think it means that Java 7 class files have to declare themselves as different than previous versions and do some pre-verification as part of compilation.  I don’t know if this is actually something new or just declared baggage.

As specified in JSR 202, which was part of Java SE 6, and in the recently-approved maintenance revision of JSR 924, class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier.

Upgrade class-loader architecture

Fixes some deadlock issues when writing non-hierarchical custom classloaders.  If you’re writing custom classloaders, this is probably great news.  Unfortunately, if you’re writing custom classloaders you’re pretty much screwed with or without this improvement.

More information here.

Method to close a URLClassLoader

Pretty self-explanatory.  If you’re using a URLClassLoader, you may at some point want to close it.  Now you can.

More information here.
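A minimal sketch of what that looks like; the jar path and plugin class name are made up:

import java.net.URL;
import java.net.URLClassLoader;

public class PluginLoaderExample {
    public static void main(String[] args) throws Exception {
        URL[] jars = { new URL("file:/opt/app/plugins/reporting.jar") }; // hypothetical jar
        URLClassLoader loader = new URLClassLoader(jars);
        try {
            Class<?> plugin = loader.loadClass("com.example.ReportPlugin"); // hypothetical class
            System.out.println("Loaded " + plugin.getName());
        } finally {
            loader.close(); // new in Java 7: releases the open jar file handles
        }
    }
}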

SCTP (Stream Control Transmission Protocol)

From what I can gather SCTP is somewhere between UDP’s lightweight “spray and pray” approach and TCP’s heavy-duty “armored convoy”.  Seems like a good choice for modern media narrowcasting and always-on sensor networks, but don’t quote me on that.

More information here.

SDP (Sockets Direct Protocol)

Implementation-specific support for reliable, high-performance network streams over Infiniband connections on Solaris and Linux.

SDP and/or InfiniBand seem to be a flavor of networking where you can get more throughput by bypassing the nosy TCP stack, but without completely ignoring it.  Kind of like flashing your badge to the security guard and letting all your friends in, I think?

More information here.

Use the Windows Vista IPv6 stack

I’m assuming that this also applies to Windows 7, but I’ll worry about this when my great-grandkids finally make the switch to IPv6 in 2132.

TLS 1.2

Better than TLS 1.1! (TLS 1.2 is SSL 3.3).

More information here.

Elliptic-curve cryptography (ECC)

The official description sums it up:

A portable implementation of the standard Elliptic Curve Cryptographic (ECC) algorithms, so that all Java applications can use ECC out-of-the-box.

XRender pipeline for Java 2D

Now, when you aren’t gaming, you can leverage some of your GPU power to do image rendering and manipulation in Java instead of just mining Bitcoins.

Create new platform APIs for 6u10 graphics features

Lets you use opacity and non-rectangular shapes in your Java GUI apps.

More information here.

Nimbus look-and-feel for Swing

Your Swing apps can now look like outdated OS X apps.

More information here.

Swing JLayer component

Useful way to write less spaghetti when trying to make Java GUI apps behave like other modern frameworks.

More information here.

Gervill sound synthesizer

Looks like a decent audio synthesis toolkit.  Not sure if this really needs to be in the JDK but now you can add reverb to your log file monitor’s beeping.

More information here.

The important stuff I don’t really use (AKA The Stuff I’m Glad Someone Thought Of)

These are the changes that probably only help if you’re on the cutting edge of certain facets of using Java.  Eventually these will probably benefit us all but I doubt many people were waiting for these.

Unicode 6.0

Also self-explanatory: a new version of Unicode.  It has the new Indian Rupee Sign (₹) and over 2,000 other new glyphs, so it’s certainly important if you use those, but otherwise not so much.  I don’t know enough about switching versions of Unicode to say if this is something to be worried about, but if you have a lot of Unicode text in your app, it’s probably safe to say you should do some major tests before switching.

More information here.

Locale enhancement

Locale was previously based on RFC 1766, now it is based on BCP 47 and UTR 35.

This isn’t just new data, this is a new data structure, as the number and classification of languages on computers has evolved beyond the capabilities of RFC 1766.  I’ve always been glad that Java is so robust in its internationalization support, even though I’ve never really taken advantage of most of it.

Separate user locale and user-interface locale

Now this is interesting, but I can’t find any good use cases yet.  My guess is that this is part of Java’s ongoing push towards multi-platform/multi-language interfaces, notably Android.  You can set a Locale (and the rules a locale dictates) not only on geo/cultural basis but on a device/environment basis.  Locale.Category only has two entries but I wouldn’t be surprised if more of them creep in in future versions.
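Here’s a small sketch of the two categories in action, as I understand them; assume you want English labels but German number and date formatting:

import java.util.Locale;

public class LocaleCategoryExample {
    public static void main(String[] args) {
        // Menus and messages in English...
        Locale.setDefault(Locale.Category.DISPLAY, Locale.ENGLISH);
        // ...but dates and numbers formatted the German way.
        Locale.setDefault(Locale.Category.FORMAT, Locale.GERMANY);

        System.out.println(Locale.getDefault(Locale.Category.DISPLAY)); // en
        System.out.println(Locale.getDefault(Locale.Category.FORMAT));  // de_DE
        System.out.println(String.format("%,.2f", 1234567.891)); // grouped as 1.234.567,89
    }
}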

JDBC 4.1

java.sql.Connection, java.sql.Statement, and java.sql.ResultSet are AutoCloseable (see try-with-resources below).

Also, you can now get RowSets from a RowSetFactory that you get from a RowSetProvider.  If you’re still writing JDBC this probably makes your life easier, but only in the “at least I’m not writing custom classloaders” way.
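A rough sketch of the new factory path; the connection details are obviously placeholders:

import javax.sql.rowset.CachedRowSet;
import javax.sql.rowset.RowSetFactory;
import javax.sql.rowset.RowSetProvider;

public class RowSetExample {
    public static void main(String[] args) throws Exception {
        RowSetFactory factory = RowSetProvider.newFactory(); // the new standard entry point
        CachedRowSet rowSet = factory.createCachedRowSet();

        rowSet.setUrl("jdbc:mysql://localhost/test"); // hypothetical connection details
        rowSet.setUsername("app");
        rowSet.setPassword("secret");
        rowSet.setCommand("SELECT id, name FROM users");
        rowSet.execute();

        while (rowSet.next()) {
            System.out.println(rowSet.getInt("id") + " " + rowSet.getString("name"));
        }
    }
}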

Update the XML stack

Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2

Your XML library is always a version that’s too old or too new, so you might as well go with the new one.

The important stuff I don’t use yet (AKA The Cool Stuff)

InvokeDynamic – Support for dynamically-typed languages (JSR 292)

The short version is that this makes things better for people implementing dynamically-typed languages on the JVM, like Clojure.  While the JVM is actually less statically typed than you might think if you’ve only used it via normal Java, methods in particular were difficult for these languages.

The long version is here.
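You probably won’t emit invokedynamic yourself, but the method-handle machinery it rides on is reachable from plain Java. A tiny sketch, just to show the shape of it:

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MethodHandleExample {
    public static void main(String[] args) throws Throwable {
        // Look up String.toUpperCase() as a method handle, roughly what a
        // dynamic-language runtime does under the hood.
        MethodHandle toUpper = MethodHandles.lookup().findVirtual(
                String.class, "toUpperCase", MethodType.methodType(String.class));

        // invokeExact requires the exact signature, receiver included.
        String result = (String) toUpper.invokeExact("hello, java 7");
        System.out.println(result); // HELLO, JAVA 7
    }
}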

Enhanced MBeans

I’ve never knowingly used MBeans (Managed Beans), but they do seem useful for large systems.  Regardless, they’ve been enhanced:

Enhancements to the existing com.sun.management MBeans to report the recent CPU load of the whole system, the CPU load of the JVM process, and to send JMX notifications when GC events occur (this feature previously included an enhanced JMX Agent, but that was dropped due to lack of time).

The Important Stuff I Use (AKA The Good Stuff)

These are the reasons I’ve even bothered to look at Java 7.  I’ll probably write a follow-up on each of these when I have some experience with them.

JSR 203: More new I/O APIs for the Java platform (NIO.2)

I’m including this here because I’m actually doing some data/file work where this will come in handy.  Basically this exposes a lot of I/O and Filesystem functionality that you previously had to dip into native libraries for, like actually telling the OS to copy a file rather than reading and writing the data yourself.  If you’re not doing any file-based work right now, keep a mental note to look into this the next time you type “File”.

More information here.
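As a taste, here’s the kind of thing that used to require a hand-rolled read/write loop or a native call; the paths are hypothetical:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class CopyExample {
    public static void main(String[] args) throws Exception {
        Path source = Paths.get("data", "input.csv");   // hypothetical paths
        Path target = Paths.get("backup", "input.csv");

        Files.createDirectories(target.getParent());
        // The OS does the copy; no manual read/write loop.
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);

        // Convenience reads that used to take a dozen lines of boilerplate.
        List<String> lines = Files.readAllLines(source, StandardCharsets.UTF_8);
        System.out.println(lines.size() + " lines copied");
    }
}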

NIO.2 filesystem provider for zip/jar archives

Included here for similar reasons as the previous item, but I think it’s pretty cool that you can treat a zip file as a filesystem and operate within it.  Unfortunately it looks like the basic implementation is read-only, but I’m guessing the pieces are there for a read/write version.

More information here.
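A quick read-only sketch using the JDK’s zip filesystem provider, assuming an existing archive at a made-up path:

import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZipFsExample {
    public static void main(String[] args) throws Exception {
        URI uri = URI.create("jar:file:/tmp/reports.zip"); // hypothetical archive
        Map<String, String> env = new HashMap<>();
        env.put("create", "false"); // open an existing archive

        try (FileSystem zipFs = FileSystems.newFileSystem(uri, env)) {
            Path entry = zipFs.getPath("/summary.txt"); // an entry inside the zip
            for (String line : Files.readAllLines(entry, StandardCharsets.UTF_8)) {
                System.out.println(line);
            }
        }
    }
}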

Concurrency and collections updates (jsr166y)

java.util.concurrent, as far as I’m concerned, is something everyone who considers him/herself a senior Java dev should be familiar with.  It’s rare that your code will not be able to take advantage of it in some way.  One of the missing pieces was a good fork/join component, where you can split jobs and wait for them all to complete in parallel before moving on.  I actually built a version of this approach (sadly not open sourced), so I’m eager to tinker with the new standard one.  I expect that, like most of the concurrency library, you’ll need to sprinkle in some of your own helper code to make things elegant, but the hard parts are solved and built and ready for assembly.

More information here.
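Here’s a small sketch of the fork/join style, summing an array by splitting the range in half; the threshold and data are arbitrary:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;
    private final long[] data;
    private final int from;
    private final int to;

    public SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                      // run the left half asynchronously
        long rightSum = right.compute();  // compute the right half in this thread
        return left.join() + rightSum;    // wait for the left half and combine
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        ForkJoinPool pool = new ForkJoinPool();
        System.out.println("total = " + pool.invoke(new SumTask(data, 0, data.length)));
    }
}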

Project Coin – Small language enhancements (JSR 334)

Ah, the sugary goodness that you’ve been waiting for.

  • Strings in switch – Not terribly useful since we could (and probably still should) use enums, and you still have to fall back to if blocks for anything fancy like regex matches.
  • Binary integral literals and underscores in numeric literals – I never really thought of this as a problem, much less one worth solving, but sure, why not.
  • Multi-catch and more precise rethrow – Now this, as an ardent fan of exceptions, I like.  It not only cuts down on copy/paste exception handling, it helps avoid subclassing exceptions, which is almost always the wrong approach.  More info on this later.
  • Improved type inference for generic instance creation (diamond) – A great idea that really should have always been there. Example:
    Map<String,Integer> myMap = new HashMap<>();
  • try-with-resources statement – Probably the most important syntax change.  Now you can skip all of those finally { resource.close(); } blocks that were important but verbose even for my taste.  It doesn’t really help when the close method throws an exception of its own, though, so I’m curious to see how I end up handling that situation.  There’s a short sketch pulling several of these features together after this list.
  • Simplified varargs method invocation – Just avoids some warnings that really never should have been there in the first place.
More information here (PDF).
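And here’s the sketch mentioned above, pulling a few of these together (diamond, strings in switch, try-with-resources, multi-catch); the file path and commands are made up:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class CoinExample {
    public static void main(String[] args) {
        // Diamond: the compiler infers <String, Integer> on the right-hand side.
        Map<String, Integer> counts = new HashMap<>();

        // Strings in switch.
        String command = args.length > 0 ? args[0] : "help";
        switch (command) {
            case "count":
                break;
            case "help":
            default:
                System.out.println("usage: count");
        }

        // try-with-resources: the reader is closed automatically, even on error.
        Path input = Paths.get("data", "words.txt"); // hypothetical file
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Integer current = counts.get(line);
                counts.put(line, current == null ? 1 : current + 1);
            }
            System.out.println(counts.size() + " distinct words");
        } catch (IOException | RuntimeException e) {
            // Multi-catch: one handler for two unrelated exception types.
            System.err.println("failed to read " + input + ": " + e.getMessage());
        }
    }
}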

Summary

So basically, a bunch of “nice to have” stuff.  Code will be a little cleaner, resources will be closed a little more often.  GUI apps will be a little fancier.  Some more people might try concurrency.  Programs will run a little faster.  The next thing you know, we’ll be looking at Java 8.


Readability + Kindle + Something Else

I really like my Kindle. Beyond all of the more tangible/advertised benefits it has, the most important thing it’s done for me is that I’ve been reading more since I started using it.

I also really like Readability, I think it’s an optimistic and hopeful view of the future of content on the internet, rather than the arms race of ad blockers and the AdWords-fueled plague of content scrapers.

The fact that these two things I like can join forces is also great. I can send an article to my Kindle via Readability. If I see some long, thoughtful piece, I click two buttons and it will be there for me when I settle in for the evening. Unfortunately I don’t/can’t use this as much as I’d like for two reasons.

Lost Commentary

I find most of my new/fresh content via link-sharing sites. Starting long ago in the golden age of Slashdot, I’ve gotten into the habit of checking the comments on an article before I read it. I don’t usually read the comments, I just skim them, and get a sense of how worthwhile it is to read the article. If I see a healthy discussion, or lots of praise, it’s clearly something worth spending a few minutes on. Even if I see some well-written refutations, it can be valuable in a “know your enemy” sense. If I see something like “Here’s the original article” or “How many times will this be reposted?” then perhaps I’ll just move on.

After I’ve read the article I might go back and read those comments, or perhaps even leave one. With the Kindle/Readability combo, I can’t do that. Blog posts will come through with their own comments, but for whatever reasons, there always seems to be better discussion offsite.

Linkability

The “premium” content sources like major magazines or newspapers rarely link off of their stories. I think this is conditioning from the print era, but it actually plays well to this offline system. If an author talks about another website he’ll probably include the relevant details in the article, or quote it, or include a screenshot.

Blogs, however, are chock-full of links, often without context, sometimes just for humor, but sometimes as an essential part of the article. Very few blog posts are actually viable in a vacuum. I have a special folder in Google Reader called “droid” which are blogs that generally don’t do this, and are good for reading when I have idle time (via my phone, hence the name) and don’t want to deal with links.

Something Else

I’d like to have some way to read an article or post offline, that can pull in these other sources. Perhaps a “send to kindle” that actually generates an ebook with the article as the first chapters and comments from my favorites collated into other chapters. Or perhaps a Kindle app that can do this and stay updated. What I don’t want is a mobile OS-style app that pops open browser windows, as that’s an entirely different use case. A “send back to computer” would be useful for stories that require switching back to browse mode.

TLDR: Sometimes I just want to read, not browse.

Security Club

Sony has been getting repeatedly hacked. We’ve seen it before with the TJX incident, and many others, most of which never get reported, much less disclosed, or even discovered. In some of these cases, only email addresses are taken, or maybe passwords. In others, names and addresses are exposed, as well as medical conditions, social security numbers, and other very sensitive information. It seems to me that this is happening more often, and I think it’s for a few reasons.

Bigger Targets

The first reason is that targets are getting bigger and the rewards go up with size. Nobody is going to waste their time getting into a system with a few thousand random users when you can get tens of millions for the same effort. As more people use more sites, it’s only natural that there are going to be more million-user sites out there. This reason isn’t a big deal, it’s just the way things are.

Better Rewards

The second reason is that more companies are collecting more data about their users. This data is valuable, possibly the most valuable asset some of these companies have. Facebook and Google make much of their money from knowing about you, what you do online, the types of things you’re interested in, different ways to contact you.

Large companies like Sony can afford to take whatever information you give them and cross-reference it against various databases to get even more information about you. This lets them focus marketing efforts, tailor campaigns to you, shape product development and so on. This also lets them make the site easier to use with pre-filled information, to increase sales and conversions.

We don’t even really question when a site asks us for our name any more. What’s the harm, right? Sure, I’ll give them my ZIP code too, and maybe even my phone number, they probably won’t call me anyways, right? Now ask yourself, why do you need to give your name, mailing address and phone number to a company to play a game where you are a pseudonymous elf?

The real answer is that they don’t. They might need it for billing purposes, but billing databases are kept much more secure for reasons I’ll explain later. They ask for this information because it’s free, and because you’ll give it to them, and because it’s valuable to them. It’s probably not protected very well, and when it gets stolen everyone shrugs, changes the password on the database, emails the FBI to make it look like they care, and gets back to more important business like social media strategies.

No Penalties

The companies involved are embarrassed and probably suffer some losses as a result, but these are mostly minor injuries. The news stories spin it to make the intruders the sole criminals, and lose interest. The only people who really pay for these incidents are the people whose data has been stolen. There are no requirements on what companies have to do to protect this information, no requirements on what they need to do if it is compromised, no penalties for being ignorant or reckless. Someone might figure out that it’s going to cost them some sales, and they put some money in the PR budget to mitigate that.

This is the reason why billing information is better secured. The credit card companies take steps to make sure you’re being at least a little responsible with this information. And in the event it leaks, the company who failed to protect it pays a real cost in terms of higher fees or even losing the ability to accept cards at all. These numbers make sense to CEOs and MBAs, so spending money to avoid them also makes sense.

How to Stop It

There are obviously a large number of technological measures that can be put in place to improve security, but there’s one that is far simpler and much more foolproof. But first, let’s look at banks. Banks as we know them have been around for a few hundred years. I’d bet that you could prove that in every single year, banks got more secure. Massive vaults, bullet-proof windows, armed guards, motion detectors, security cameras, silent alarms, behavioral analysis, biometric monitors, the list goes on and on, and all of these things actually work very well. But banks still get robbed. All the time. When was the last time you heard of a bank robber getting caught on their first attempt? They are always linked to dozens of other robberies when they do get caught. Why?

Because they’re full of money.

They can make it harder to rob them. They can make it easier to catch the people who did it. But the odds don’t always matter to someone who sees a pile of money sitting there for them to take if they can bypass these tricks.

People break into networks for many reasons, but the user data is often the pile of gold that they seek. So the most effective way to stop someone from breaking in and stealing it is to not have it in the first place. This advice works in 2011, it will work in 2020. It works on Windows, OS X and Linux. It works online and offline, mobile or laptop, and so on.

“The first rule of security club is you try not to join security club: minimize the amount of user data you store.” – Ben Adida

So if you’re in a situation where you need to figure out if your data is secure enough, or how to secure it, start with this question: Do you need it in the first place? Usability people say they want it. Marketing people say they need it. If you’re an engineer, it’s a losing battle to argue those points, because they’re right. Stop convincing people why you shouldn’t do it, and put a cost on it so they have to convince each other that it’s worth it.

Anyone who went to business school knows the cost/value balance of inventory. It’s pretty straightforward to discuss whether a warehouse should be kept open or expanded or closed. Nobody wants to store anything for too long, or make too much product or have too many materials. But ask them about how much user data you should be storing and the response will be something like “all of it, why wouldn’t we?”.

So make sure that the conversion bump from using full names, asking age and gender, and doing geo-targeting covers the cost of the security measures required to protect that data. Make it clear that those costs are actually avoidable; they are not sunk. They are also not a one-time investment. They should show up every quarter on the bottom line of anyone who uses that data. And if nobody wants to pay for it, well, you’ve just solved a major part of your security problem, haven’t you?

Update 10/18/2011: “FTC Privacy Czar To Entrepreneurs: ‘If You Don’t Want To See Us, Don’t Collect Data You Don’t Need’”