Voting Obsecurity

December 26, 2006

The Honorable George W. Bush
President of the United States
The White House
1600 Pennsylvania Avenue NW
Washington, DC 20500

Dear Mr. President,

I hope the holidays find you well, and wish you happy and productive new year.

I am not an expert on the topic of voting, nor am I a professional security expert, but I do try and follow the news with regards to the intersection of the two. I watched the HBO documentary “Hacking Democracy” this weekend while wrapping presents, and was flat-out disgusted. It was not especially well-made, made no attempt at being objective, nor was most of its information news to me, but I was moved when I watched a mock election be rigged using actual voting systems and saw a real, respected election official be speechless and dumbfounded that his job was completely undermined.

There is no simple solution to secure voting, nor is it remotely probable that any election will ever be 100.0% honest, but there are some monumentally obvious flaws in the way we currently count votes. The largest is the issue of the lack of openness of the software, systems, and processes that are involved.

I am a software developer and most non-programmers I’ve talked to have a difficult time understanding the idea that public access to the “secret codes” of software, AKA the source code, is more secure than private or closed source code. The general opinion is that if you can see the inner workings of something, it’s easier to break it, which is valid and true. The next step in this thought process, however, is more critical and is one that most people don’t take. That is that if you cannot see the inner workings of something that is broken, it’s more difficult to fix it.

The idea of “security by obscurity”, sometimes referred to as “obsecurity”, is valid and necessary when it comes to information such as private financial data, personal information like medical histories, and intelligence gathered by our military and other government agencies. However, regarding mechanisms and processes, such as software, obscurity lessens security. This is doubly true when those mechanisms are designed to collect and analyze public data, such as votes.

Here’s a dirty secret of the programming craft: 99.9999% of software is broken. By broken I mean there is some bug somewhere in it. In new or rarely used software these bugs can be serious and misleading. In most mature software, it’s nothing serious; something doesn’t display right, some obscure error condition is handled poorly, etc. This applies to video games, email programs, ATM software, Windows, Linux, etc. as well as voting software.

So how do you weed out these bugs? You test, over and over and over again. When you find a bug, you test again, even things that you didn’t fix (AKA regression testing). Eventually, you’ve fixed all or most of the bugs that were found, satisfied your unit tests and requirements, and you ship it. Then your customer does something you never planned on, maybe because they are being silly or stupid, or you aren’t the programming god you thought you were, or your QA staff is overworked, or it just wasn’t possible for you to test in the lab. This is why you get Windows updates every week, and why there are dozens of bug fixes for every Linux kernel, patches for every video game, it’s simply unavoidable. The more something is used, the more bugs are found, and the better it becomes.

Voting software isn’t used very much. Most machines are used one day every year or two. Excel has been used by millions of people every day for nearly 20 years, and there are still bugs in it. If I’m making an inventory of my baseball cards and have a problem with Excel I can report it to Microsoft and hopefully a ticket will be opened and hopefully they will fix it. The difference between a spreadsheet package and a voting system is that my grandfathers didn’t risk their lives overseas to make sure “=SUM(D3:D13)” was accurate, they did it so that I would grow up in a better world than they did, and just as importantly, have the power to make it better for my grandkids.

It is absolutely imperative that we apply the highest possible standards of scrutiny, security, and integrity to the systems that facilitate our most sacred public right. Voting software, hardware, and system should not only be open, they should be the zenith of openness. The public should be able to download complete specifications for every piece of hardware on every type of voting machine out there, from the device I vote on to the system that tabulates it to the printers that make the reports. We should have access to every line of code used in the entire process. I should be able to test it myself and find flaws or solicit advice from those I trust. The public should have as much access to the hardware as is feasible. Regular citizens, universities and vendors should be encouraged with bounties to find and report flaws they find. Defect reports on voting systems should be legal documents, also open to review by all. There should be digitally signed video publicly viewable via live broadcast or within hours of all access to every machine with the sole exception of the person casting a ballot. There is room for only ONE secret in this entire process, and that is who an individual voted for.

I would be exceptionally pleased if you would propose or support legislation to help protect our votes by virtue of an open and honest process. It would not only validate the sacrifices millions of Americans have already made, but it would set an example that other democracies and future generations will aspire to.

Sincerely,

Eric F. Savage


Also sent to my Senators Kennedy and Kerry, Representative Frank, Governor-Elect Patrick, State Senators Brown and Creem and Secretary of the Commonwealth Galvin. If you feel similarly I encourage you do write to your officials and feel free to borrow or copy from my letter.

For more, better information on this topic I recommend checking out Ben Adida’s Blog and Black Box Voting. A web search for ‘secure voting’ or similar topics will also turn up piles of other opinions and (often scary) facts.

Naming Conventions: Java Packages

When developing, conventions can mean the difference between producing something clear and concise and producing something confusing and arcane. They exist at all levels, from the industry and the language down to specific modules of applications. I’m going to attempt to codify some conventions for aspects I use heavily, and I’ll start with one of the easier ones, Java package names. Sun has some basic ones, but I think some more specific guidelines are warranted.

  1. Names should be all lowercase. Uppercase and mixed case denote other concepts in Java and there’s no need to muddy the waters further.
  2. Names should be alphanumeric, preferably just alphabetical.
  3. Do not use version numbers or dates (see below for a possible exception).
  4. Use tld.domain-you-own.project/library.* for distributed or published code. This follows with Sun’s convention, and is the only way to know a name is globally unique, or at least that you are the one that’s allowed to use it.
  5. Do not use tld.domain-you-own.* for internal code that should not be distributed. I typically just use the project or library’s name as the first segment. This convention can be useful in signaling other developers that code is for internal use only, and if something is converted from internal to external usage, this will help identify which version an application was built against.
  6. Package names that are nouns should be singular (mycompany.myproject.account). This maintains consistency with packages that are named for actions (mycompany.myproject.search) and adjectives (mycompany.myproject.common).
  7. Store DAOs in a “data” subpackage. This helps the tree-views in IDEs and also allows for easier control of logging. If you’re being formal and encapsulate DAO operations in a manager class, the manager class should not be in the data package because it’s grammar is business-based, not persistence-based.
  8. Classes that are heavily dependent on third-party packages should be in a subpackage named for the primary dependency. A Hibernate implementation of your DAO should live in mycompany.myproject.data.hibernate. This helps greatly with logging configuration. This is one place where version numbers are permissible, such as mycompany.myproject.data.hibernate3. Use this exception very sparingly, as it can be confusing with regards to forward compatibility.
  9. Classes that are extensions of third-party code should be named for the dependency and be outside of the project’s or product’s context. For example, if you are creating a new type of controller for Spring MVC that does not depend on project-specific code, put it in mycompany.spring.controller. If it is integrated with the application, see the previous point.
  10. Don’t expose organizational details in the package structure (mycompany.mydepartment.myproject). Naming packages by department, office, or region will surely be confusing soon due to management’s penchant to reorganize and reassign.

I consider the above a work in-progress, and welcome comments.

The Toolbox: Introduction & Log4J

In my own semantics, software development is a blend between a conventional profession and a craft. As a craftsman, I have a preferred set of tools with which I build and create. Of course, as a contract developer, I don’t always get to use what I want, but these days I more often find myself in a position to do so. I figure it might be useful to others, especially those just getting into the game, to see what an experienced developer actually uses, and why. So let’s start with an easy one, Log4J.



Log4J.There are essentially 3 major choices when it comes to logging in Java. Log4J, Commons Logging, and Java Logging. My current preference is Log4J. Why?

Commons Logging is an abstraction layer, and defaults to using Log4J for its actual logging. The main idea behind CL is that you aren’t tied to a specific logger. I think CL is flawed in two ways:

1. Log4J is itself extensible. Writing a custom appender (the piece that actually writes the log message) is a fairly trivial task, and if your project makes such a fundamental shift as to change its logging API, writing a bridge appender is a negligible amount of effort. So if you’re going to be using Log4J anyways, as most projects do, why add unnecessary complexity?
2. Commons Logging is always more difficult to configure. I honestly can’t figure out why, but almost every project I’ve worked on that uses CL has had issues, especially when you have a mix of libraries which do or do not use it.

Java Logging, added several years ago in Java 1.4, was supposed to eliminate the need for third party logging APIs. It was, and remains, a good idea, but it does not make a compelling case to switch from the vastly more popular Log4J. Last time I checked, the standard logging API was missing a rolling file appender, which is essential for any production environment. It is also slightly less flexible to configure, but the main reason it has largely been irrelevant is that it doesn’t offer anything that Log4J does not.

What would make me switch?

I’m a very heavy logger. I will often write code with as many log statements as regular lines of code, and of course most of these statements are debug/trace level. The problem this introduces is performance. Let’s look at this example:

Person person = PersonFactory.getPerson(name);
log.debug("Found " + person.getName());
precinct.register(person);

The problem is that even if logging is at a level above debug, getName() will be invoked. In practice these statements rarely do anything expensive, but it can be the proverbial death by a thousand cuts. The Log4J way to avoid this is to do the following:

Person person = PersonFactory.getPerson(name);
if(log.isDebugEnabled()) {
log.debug("Found " + person.getName());
}
precinct.register(person);

Messy huh? What I would like to see is the ability for the java compiler to skip these statements entirely, without the cruft currently required. If this ended up being some special case that required the use of the standard logging API, I would switch. The slick way to do this would be to have the ability to annotate methods for lazy reads of parameters being passed to them, but that, as Alton Brown would say, is another show.