Why don’t websites have credits?

Engineers of any discipline are largely an anonymous bunch. You don’t know who designed the fuel pump in your car, and I’d even wager it would be extremely difficult for you to find out if you wanted to. You don’t know who wrote the code for the OS X Dock or Windows Start bar or who wrote the Like button on Facebook. These people made decisions that affect you deeply every day, and you have no idea who they are.

The most interesting part of this is that those people are OK with it. If you ask them (myself included) they will tell you that it doesn’t matter, that what really matters is the quality of the work and the enjoyment you had doing it. Unfortunately, I think we’re wrong.

Should they?

I can’t seem to come up with a good framework for figuring out who wants credit, never mind who deserves it. If you so much as make a photocopy during the production of a movie, you’re probably in the credits with some highfalutin title like “First deputy assistant duplication specialist”. Music credits are tied to royalties and managed very closely. Most authors wouldn’t think about publishing something anonymously, nor would artists or sculptors. Artists always sign their work.

This is not even strictly a software issue. Video games list credits, often in the box and at the end of the game, and they even have an IMDB-like site. Nor is it an “arts & entertainment” issue: any credible scientific paper will cite other works and acknowledge contributions. Patents have names on them, even when assigned to a company.

A few software packages have listed credits. If I remember correctly, Microsoft did it on old versions of Word and Excel, and Adobe had it on old versions of Photoshop and Illustrator. I’m curious why those were removed, or at least hidden. “The Social Network” had something about Saverin being removed from and re-added to the “masthead” of Facebook (although I don’t know what or where that is).

So it would seem that we might be in the minority here, perhaps due to convention rather than any specific reason. And if there’s one thing that bugs an engineer, it’s deviating from standards with no good reason.

So let’s do it.

Why do it?

  • Pride in your work – Sure, there is some pride in doing a good job anonymously, but wouldn’t you be just a little more motivated or happy with your name on it?
  • Being a stakeholder – We’ve all done projects we didn’t believe in, and consoled ourselves with the fact that “it’s not my project”. Well, now it is.
  • Reputation – We’ve got our resumes, but credits will verify them.
  • Honesty/Transparency – There is no good reason to withhold this information, so it should be out there.
  • All that money they spent on school – Show your parents your name on a website and watch them smile.

So who gets listed?

I think the short answer here is, everyone. Movies do it, why not websites? It could be just a big list of names, or something more detailed with contributions, dates, whatever makes sense. Let’s just start throwing some names up there, and let the de facto standards evolve on their own.

If you know of any major sites that do this well, mention them in the comments. Similarly, if you can think of a good reason why this shouldn’t happen, I’d love to hear about it.

Register My Login to Join Your Account

One of the details that can be tough to keep track of with a large or fast-moving website is language consistency. Of course, to be consistent, you need to decide what to use. I did an audit of the most popular English-language sites (as determined by Alexa and Compete), to see how three key phrases were being used. These were:

Login/Log In/Sign in – The action of authorizing your account.
My/Your – My Movies, Your Account, etc.
Join/Sign Up/Register/Create – Creating a new account.

Here is the raw data; see below for some analysis.

Site                    Log in          Possessive  Sign up
adultfriendfinder.com   login           my          join
aim.com                 sign in         my          join/get
amazon.com              sign in         your        start
aol.com                 sign in         my          sign up
bankofamerica.com       sign in         your*       enroll
blogger.com             sign in         my          create
craigslist.com          login           N/A         sign up
deviantart.com          login           N/A         become/join
ebay.com                sign in         my          register
facebook.com            login           my          sign up
flickr.com              sign in         your        create
fotolog.com             log in/login    my          join
friendster.com          log in          my          sign up
go.com (espn)           sign in         my          register
google.com              sign in         my          create
hi5.com                 log in          my          join
imageshack.us           login           my          signup
imdb.com                login           my          register
live.com                sign in         my          sign up
mininova.com            login           my          register
msn.com                 sign in         my          sign up
myspace.com             login           my          sign up
neopets.com             login           my          sign up
photobucket.com         log in          my          join
pogo.com                sign in         my          register
rapidshare.com          login           my          join
store.apple.com         login*          N/A         create/set up
veoh.com                log in          my          register
walmart.com             sign in         my          create
wikipedia.org           log in                      create
wordpress.com           login           my          sign up
yahoo.com               sign in         my          sign up
youporn.com             login           my          register*
youtube.com             log in          my          sign up

* Inconsistent

“My” is the clear winner over “Your”, with 27 mys, 3 yours, and 2 that avoid using possessive pronouns.

“Login” takes the edge over “Sign In”, 20-14. “Sign In”, however, seems to be more popular with the biggest of the big sites, like Yahoo, Microsoft’s sites, and Google. I’d say this is a tossup, and I have a feeling that in a few years “sign in” will come to dominate. Of those using login, 13 use “login”, and 7 use “log in”, with the space.

There’s a plurality of choices for sign up, with “sign up” being used on 12 sites. 7 used join, 7 used register, 6 used create (an account), 1 used start, and 1 used enroll. This is not an independent choice, however, as “sign up” is often seen where “log in” is used, and sites that use “sign in” use something like “register”. AOL, Microsoft, and Yahoo use “sign in/sign up”. I suspect that some people think using such similar phrases would be confusing, and I agree, despite the appeal of the general consistency.

My preference is to use “my”, “log in”, and “sign up”. “Join” seems ambiguous, “register” seems bureaucratic and expensive, while “create an account” just feels a little dorky.

Dishonorable Mention: The Apple Store, supposed paragon of usability and attention to detail, is the worst offender on this list in terms of mixing and matching the terms, often on the same page. They also fail miserably on one major point: there’s no logout button!

Usernames

Usernames for most websites are based on UNIX conventions/standards. They are lowercase, usually begin with letters, and have no whitespace. Many sites offer a “display name” which is more flexible.

While discussing requirements for a new project, my first inclination was to do something similar, simply because “that’s how it’s done”, but someone suggested this method might be antiquated. After giving it a few days of thought, I tend to agree. “Old” user domains like AOL, Windows, and Slashdot have logins that have allowed spaces for years, yet most of even the latest, shiniest Web 2.0 sites go back to the 1970s for their guidelines.

We’ve even taken it a little further: not only can users use spaces, underscores, and dashes, but these characters are ignored for purposes of uniqueness, because I’m guessing people’s brains will tend to strip these characters out when memorizing a name. So “Eric Savage” and “ericsavage” and “Eric_Savage” and even something like “Eri__c-SAVA g-_E” would all be treated as the same username.

When appearing in a URL or other machine-readable context, these characters are all changed to underscores and consecutive duplicates are eliminated, so a username such as “Eric Savage” would be “eric_savage”. Also, leading and trailing non-alphanumerics are stripped; otherwise we’d likely find users all naming themselves __alphadog so they appear first alphabetically. We could expand the list of which extra characters are allowed, but we’ll start off easy.
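
Here’s a minimal Python sketch of the two rules above. It is not our actual implementation: the function names uniqueness_key and url_form are made up for illustration, lowercasing is inferred from the examples, and the input is assumed to already be limited to letters, digits, spaces, underscores, and dashes.

```python
import re

# Runs of the "ignored" characters from the post: space, underscore, dash.
_SEPARATORS = re.compile(r"[ _-]+")


def uniqueness_key(username: str) -> str:
    """Key for collision checks: separators dropped, case ignored (assumed)."""
    return _SEPARATORS.sub("", username).lower()


def url_form(username: str) -> str:
    """Form for URLs: each separator run becomes one underscore, ends trimmed."""
    collapsed = _SEPARATORS.sub("_", username)
    return collapsed.strip("_").lower()


print(uniqueness_key("Eric Savage"))       # ericsavage
print(uniqueness_key("Eri__c-SAVA g-_E"))  # ericsavage
print(url_form("Eric Savage"))             # eric_savage
print(url_form("__alphadog"))              # alphadog
```

Keeping the two functions separate means the display name can stay exactly as the user typed it, while lookups and URLs each use their own derived form.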

Questions:

  • Can anyone think of good reasons why you should stick to UNIX-style usernames?
  • Should users on a community site be able to change usernames? [I’m currently in the “no” camp]
  • If changeable, should the change history be public?
  • Most people like short usernames, some people prefer long ones. What do you think should be the limit? [I’m currently thinking 20]
  • Is a short limit too ethnocentric?

Database Naming Conventions

Time for another naming convention. This time it’s something people care more about than Java packages: we’re talking about databases. Here are the rules, and the reasons behind them.

Use lowercase for everything

We’ve got 4 choices here:

  • Mixed case

    • Some servers are case sensitive, some are not. MySQL for example, is case-insensitive for column names, case-insensitive on Windows for table names, but case-sensitive on Linux for table names.
    • Error prone
  • No convention

    • Same reasons as mixed case
  • Upper case

    • SQL is easier to scan when only the reserved words are uppercase, and uppercase identifiers take that away. This is valuable when scanning log files looking for things like WHERE clauses and JOINs.
    • MySQL will always dump table names on Windows in lowercase.
  • Lowercase

    • Works everywhere. Some servers, like Oracle, will appear to convert everything to uppercase, but it’s just case-insensitive and you can use lowercase.

Only use letters and underscores and numbers (sparingly)

  • Most servers support other characters, but there are no other characters which all the major servers support.
  • Numbers should be used as little as possible. Frequent use is typically a symptom of poor normalization.

    • address1, address2 is OK
  • Whitespace isn’t allowed on most servers, and when it is you have to quote or bracket everything, which gets messy.

Table and column names should be short, but not abbreviated.

  • You’ve seen the same thing abbreviated every way possible, like cust_add, cus_addr, cs_ad, cust_addrs, why not just customer_address? We’re not writing code on 80 character terminals any more, and most people aren’t even writing SQL, so let’s keep it clear, OK?
  • 30 characters is considered the safe limit for portability, but give some serious thought before you go past 20.
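
The rules so far (lowercase, restricted characters, a length ceiling) are mechanical enough to check automatically. Here is a rough Python sketch of such a check. The name check_identifier and the message wording are hypothetical, and the 30-character ceiling is simply the “safe limit” quoted above, not something verified against every server.

```python
import re

MAX_PORTABLE_LENGTH = 30  # the "safe limit" quoted in the post
_ALLOWED = re.compile(r"[a-z0-9_]+")


def check_identifier(name: str) -> list[str]:
    """Return a list of convention violations for a table or column name."""
    problems = []
    if name != name.lower():
        problems.append("use lowercase")
    # Compare case-insensitively here so a case problem is reported only once.
    if not _ALLOWED.fullmatch(name.lower()):
        problems.append("use only letters, numbers, and underscores")
    if len(name) > MAX_PORTABLE_LENGTH:
        problems.append(f"longer than {MAX_PORTABLE_LENGTH} characters")
    return problems


print(check_identifier("customer_address"))  # []
print(check_identifier("Customer Address"))
```

It can’t tell you whether a name is abbreviated or over-numbered, of course; that part still needs a human.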

Table names should be singular.

Yes, singular! Oh yes, I went there. I used to use plural names, because it’s more semantically accurate. After all, if each record is a person, then a group of them would be people, right? Right, but who cares. SELECT * FROM person isn’t any less clear than SELECT * FROM people, especially if you’ve got a solid convention. You don’t use plurals when you’re declaring class names for a vector of generics, do you? Also:

  • English plurals are crazy, and avoiding them is good.

    • user -> users
    • reply -> replies
    • address -> addresses
    • data -> data (unless it’s geographic, then it’s datum -> data)
  • Singular names mean that your primary key can always be tablename_id, which reduces errors and time.

Double Underscores for Associative Tables.

You’ve got your person table, and your address table, and there’s a many-to-many between them. This table should be called address__person. Why? Well, what if you have a legacy_customer table that also ties to address? Now you’ve got address__legacy_customer. A new developer can easily pick up this convention and will be able to break down the names accordingly. Remember, no matter what the Perl/Lisp/Ruby/etc guys say, clarity of code is judged by how someone reads it, not how they write it.

Component Names of Associative Tables in Alphabetical Order.

This rule is somewhat arbitrary, but still beneficial. There’s no good way to determine which goes first: you could go by table size, “importance”, age, or who knows what else, and those assessments may change over time. Or, you might find that your manager assigned the same task to two people, and now you’ve got an address__person and a person__address table co-existing peacefully, when you only need one. Everyone putting them in the same order makes reading and writing queries easier.
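
To show how mechanical these last few rules are, here is a small Python sketch. The helper names primary_key, association_table, and split_association are invented for illustration, and it assumes table names already follow the lowercase rule (so a plain sort is alphabetical).

```python
def primary_key(table: str) -> str:
    """Singular table names let the key always be tablename_id."""
    return f"{table}_id"


def association_table(first: str, second: str) -> str:
    """Many-to-many table name: components alphabetical, joined by '__'."""
    return "__".join(sorted([first, second]))


def split_association(name: str) -> list[str]:
    """Recover the component tables; works because '__' never appears inside one."""
    return name.split("__")


print(primary_key("person"))                            # person_id
print(association_table("person", "address"))           # address__person
print(association_table("legacy_customer", "address"))  # address__legacy_customer
print(split_association("address__legacy_customer"))    # ['address', 'legacy_customer']
```

Because the order is sorted, association_table("person", "address") and association_table("address", "person") give the same name, which is exactly what stops the two-people-one-task duplicate from sneaking in.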

That’s all I’ve got for now, but I encourage you to offer your own, or even refute some of the ones above (with some reasoning, of course).

Naming Conventions: Java Packages

When developing, conventions can mean the difference between producing something clear and concise and producing something confusing and arcane. They exist at all levels, from the industry and the language down to specific modules of applications. I’m going to attempt to codify some conventions for aspects I use heavily, and I’ll start with one of the easier ones, Java package names. Sun has some basic ones, but I think some more specific guidelines are warranted.

  1. Names should be all lowercase. Uppercase and mixed case denote other concepts in Java and there’s no need to muddy the waters further.
  2. Names should be alphanumeric, preferably just alphabetical.
  3. Do not use version numbers or dates (see below for a possible exception).
  4. Use tld.domain-you-own.project/library.* for distributed or published code. This follows Sun’s convention, and is the only way to know a name is globally unique, or at least that you are the one that’s allowed to use it.
  5. Do not use tld.domain-you-own.* for internal code that should not be distributed. I typically just use the project or library’s name as the first segment. This convention can be useful in signaling other developers that code is for internal use only, and if something is converted from internal to external usage, this will help identify which version an application was built against.
  6. Package names that are nouns should be singular (mycompany.myproject.account). This maintains consistency with packages that are named for actions (mycompany.myproject.search) and adjectives (mycompany.myproject.common).
  7. Store DAOs in a “data” subpackage. This helps the tree-views in IDEs and also allows for easier control of logging. If you’re being formal and encapsulate DAO operations in a manager class, the manager class should not be in the data package because its grammar is business-based, not persistence-based.
  8. Classes that are heavily dependent on third-party packages should be in a subpackage named for the primary dependency. A Hibernate implementation of your DAO should live in mycompany.myproject.data.hibernate. This helps greatly with logging configuration. This is one place where version numbers are permissible, such as mycompany.myproject.data.hibernate3. Use this exception very sparingly, as it can be confusing with regards to forward compatibility.
  9. Classes that are extensions of third-party code should be named for the dependency and be outside of the project’s or product’s context. For example, if you are creating a new type of controller for Spring MVC that does not depend on project-specific code, put it in mycompany.spring.controller. If it is integrated with the application, see the previous point.
  10. Don’t expose organizational details in the package structure (mycompany.mydepartment.myproject). Naming packages by department, office, or region will surely be confusing soon due to management’s penchant for reorganizing and reassigning.
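
The first three rules are easy to lint for. Below is a rough sketch of such a check, written in Python as a standalone script rather than in Java. The function name check_package is hypothetical, and treating trailing digits as “probably a version number” is my own heuristic, so a name like hibernate3 gets flagged for a human to approve rather than rejected outright.

```python
import re

_SEGMENT = re.compile(r"[a-z][a-z0-9]*")
_TRAILING_DIGITS = re.compile(r"[0-9]+$")


def check_package(name: str) -> list[str]:
    """Return naming-convention warnings for a dotted Java package name."""
    problems = []
    for segment in name.split("."):
        if segment != segment.lower():
            problems.append(f"{segment!r}: use all lowercase")
        elif not _SEGMENT.fullmatch(segment):
            problems.append(f"{segment!r}: use alphanumeric characters only")
        elif _TRAILING_DIGITS.search(segment):
            # Heuristic: rule 8 permits this only for third-party dependencies.
            problems.append(f"{segment!r}: possible version number")
    return problems


print(check_package("mycompany.myproject.account"))  # []
print(check_package("mycompany.myProject.data.hibernate3"))
```

The remaining rules (singular nouns, where DAOs live, no org-chart names) are judgment calls that a script like this can’t make.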

I consider the above a work in progress, and welcome comments.