More thoughts on Stackoverflow.com
Since my previous post on the subject, Stackoverflow.com has moved from private beta to public beta. I’ve had more time to use the site and have some more thoughts. The criticisms here are meant to be constructive. Hopefully the feedback from users will help the Stackoverflow team to make a good site even better.
Performance
First the good news. The site has transitioned from private to public very well. Jeff and his team seem to have got it right in terms of architecture and infrastructure because, even with the increased load, it remains blindingly fast.
Front Page
In terms of usability, I think there’s more that could be done to help me find the content that I’m interested in. The default front page is, to be honest, not very useful. New questions are coming in so fast and on so many topics that displaying the most recent questions is just noise.
I would prefer to have a personalised home page that shows me relevant questions based on my previous answering/voting history. I realise that this is major new functionality and I’m not criticising the Stackoverflow team for not having this in the initial version; it makes sense to get the site up and running first. However, it would be great if this could be implemented at some point. I’m not alone on this one; it’s the second most popular requested feature at the moment.
Presently I’m finding stuff that I want to look at by going to the tags page and clicking on interesting topics. But I’m sure I’m missing out on questions that would be of interest if only I could find them.
Tag Cloud
The tag cloud on the right of the front page isn’t very helpful either. It’s ordered with the most recent first. If I just wanted to view questions tagged “html”, I’m going to struggle to find the tag in the cloud. An alphabetical ordering would be more usable. Unfortunately, this has already been suggested and rejected.
Voting and Reputation
I outlined my concerns on the voting mechanism previously. In the interests of being constructive, rather than just a whiny blogger, I’ve opened new issues on the Stackoverflow Uservoice page. If you agree with me, please vote on these issues:
Addressing each of these will help in resolving The Fastest Gun in the West Problem (currently the number one voted-on issue). The problem is that early answers get the votes and later, better answers are largely ignored. Removing the penalty for down-voting will encourage more down votes where they are deserved (so an early answer that is later shown to be wrong is less likely to retain a high score). Also, if a down vote was as powerful as an up vote, people might be more careful in crafting good answers as opposed to quick answers.
Source Control and Backups - More than just a good idea
Are there really software development teams out there that don’t use any form of proper source control at all, even the bad kind?  I’d like to think that it wasn’t the case but I’m not so naive.
There’s a reason that “Do you use source control?” is the first question on the Joel Test. It’s because it’s the most important. If you answer “no” to this question you shouldn’t be allowed to answer subsequent questions. Even if the rest of your process is perfect, you score zero. You failed at software development. I could say that if your team doesn’t use source control it is a disaster waiting to happen, but more likely the disaster already happened and you haven’t noticed yet.
Of course, you and I aren’t nearly dumb enough to try developing anything more complex than “Hello World” without version control in place. I’m sure I’m preaching to the converted. The kind of people who read obscure software development blogs probably already know a few things about effective software development.
But how good are your back-ups?
You do have a back-up, don’t you?
If you don’t have a back-up you are one accidental key-stroke or one hardware failure away from scoring zero on the Joel Test (under my rules)… and failing at software development.  Hardware will fail, people will screw-up, disgruntled former employees will set fire to the building. None of these is a problem but a failure to anticipate and prepare is.
How often do you back-up?
There is only one right answer to this: every day. Weekly back-ups are too costly. Can you really afford to have your whole team redo an entire week’s work? The first time you lose a week’s work you will switch to daily back-ups, so why not just do it now?
A melted back-up is no back-up at all
Off-Site Storage. You could physically take tapes to another location or you could upload files to a remote server. Just don’t leave them here.
Does it actually work?
Honestly, have you ever tried restoring your source control back-up onto a different machine? The most comprehensive back-up plan imaginable is useless if you can’t restore the back-ups. If you haven’t seen it working (recently) then it doesn’t work. There’s a good time and a bad time to find out that your back-ups don’t work. 15 minutes after your source control server spontaneously combusted is the bad time.
Are you still here? You should be checking those back-up tapes…
UPDATE: The good people of Stackoverflow are discussing what could possibly be a good excuse for not using source control.
Stackoverflow.com - First Impressions
Over the last few days I’ve been playing with the beta of Stackoverflow. In case you are unaware, Stackoverflow is a joint venture between Jeff Atwood of Coding Horror and Joel Spolsky of Joel on Software fame. It’s basically a question and answers site for software developers. A mixture of Experts Exchange, Proggit and Wikipedia. The site is scheduled to come out of beta on Monday when it will open its doors to everyone.
From initial impressions I think it’s fair to say that the site will be a success, initially at least. Being A-list bloggers (and now podcasters too), Jeff and Joel have been able to generate a lot of exposure for their project.
Like Jeff’s blog, the minimalist site design is clean and bold, and so far the whole system is very responsive (we’ll see if that’s still the case when the traffic spikes on Monday). The beta audience are already posting thousands of questions, almost all of which generate extremely prompt answers (of varying quality).
However, I think the site suffers a little from the ambitions of trying to be too many different things (is it a programming forum, or is it a Wiki?). There are a lot of different ideas in the implementation that interact via a quite complicated set of rules that have evolved over the course of the beta.
Reputation & Badges
Stackoverflow has two mechanisms for measuring a user’s standing within the community. Firstly, each user has a reputation score. This starts at 1 and increases as you make positive contributions (posting questions and answers that get voted up). As you reach various milestones, you get more privileges within the community, such as being able to vote on answers or tag other people’s questions.
Your reputation can be diminished if you get voted down or reported for abuse, but it can’t go below 1 and on the whole it’s heavily biased in favour of upward movement.
The second incentive for users to contribute is the ability to collect “badges”. This works exactly like the Cub Scouts. Some badges are easy to achieve (just fill in your profile or post your first question), and others are much harder to obtain (get 100 up votes for one of your answers).
Voting
Voting is one area of the site that I think could do with an overhaul. It’s unbalanced and not transparent enough. If your answer gets voted up, you gain 10 points of reputation. But if your answer gets voted down, you only lose 2 points. So if you post something that sounds plausible to the uninformed masses but is actually wrong, you could get 5 up votes and 6 down votes for a net score of -1 yet still gain a 38-point reputation boost. An up vote should have equal weight to a down vote, just like on DZone or Slashdot. It also might be better to show both the number of up votes and the number of down votes (as on DZone) rather than just the net total. This would make it easier to identify controversial content (something with 10 down votes and 12 up votes is not quite the same as something with no down votes and 2 up votes).
Another problem with the voting is that down votes penalise the voter as well as the user whose answer is being voted on. So if you post something wrong like “Java passes objects by reference”, I can either ignore it or lose 1 point of reputation for giving you the down vote that you deserve (even then it will take five of us to fully cancel out the one up vote that you got from someone who didn’t know better).
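To make that arithmetic concrete, here is a rough sketch in Java of the scoring rules as I have described them (10 points gained per up vote, 2 lost per down vote, 1 lost by each down-voter). The class and method names are my own invention, not anything from Stackoverflow’s code, and the sketch ignores the floor of 1 and the daily limits.

```java
/**
 * Hypothetical helper for the vote arithmetic described in this post.
 * The weights are the ones quoted in the text; nothing here is taken
 * from Stackoverflow's actual implementation.
 */
final class VoteMaths {
    static final int UP_VOTE_GAIN = 10;    // author gains this per up vote
    static final int DOWN_VOTE_LOSS = 2;   // author loses this per down vote
    static final int DOWN_VOTER_COST = 1;  // each down-voter pays this

    /** The net score displayed next to the answer. */
    static int netScore(int upVotes, int downVotes) {
        return upVotes - downVotes;
    }

    /** The change in the author's reputation (ignoring floors and caps). */
    static int authorReputationChange(int upVotes, int downVotes) {
        return upVotes * UP_VOTE_GAIN - downVotes * DOWN_VOTE_LOSS;
    }

    /** How many down votes it takes to cancel the given number of up votes. */
    static int downVotesToCancel(int upVotes) {
        // Integer ceiling of (upVotes * gain) / loss.
        return (upVotes * UP_VOTE_GAIN + DOWN_VOTE_LOSS - 1) / DOWN_VOTE_LOSS;
    }

    /** Total reputation spent by the community to cast those down votes. */
    static int costToDownVoters(int downVotes) {
        return downVotes * DOWN_VOTER_COST;
    }

    public static void main(String[] args) {
        System.out.println("net score: " + netScore(5, 6));
        System.out.println("author's reputation change: " + authorReputationChange(5, 6));
        System.out.println("down votes to cancel one up vote: " + downVotesToCancel(1));
    }
}
```

For the example above (5 up votes, 6 down votes) this gives a net score of -1 and a reputation change of +38, and it confirms that five down votes are needed to cancel a single up vote.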
When I queried the justification for penalising down-voters, I was told that it was to combat attempts to game the system. Apparently, earlier in the beta, users were posting answers to questions and then voting down everybody else’s answers so that their answer would appear at the top. The idea was that by making users pay to vote down this behaviour would be discouraged. A better solution to this problem would have been to remove the conflict of interest by not allowing users to answer and vote on the same question (which is how Slashdot’s mod points work), rather than punishing all down votes across the whole site.
The net effect of this voting system is that everybody’s reputation increases pretty quickly. Beyond the minimum score required to get full privileges the numbers can become meaningless. To avoid having to rename the site Integeroverflow there are a couple of artificial limits that restrict the number of votes you can cast and the number of reputation points you can earn each day.
Other Thoughts
Aside from my reservations about the voting, my impressions of Stackoverflow are mostly positive. The fact that it has already attracted hundreds of enthusiastic participants suggests that it has genuinely found a niche. However, I do feel that it is probably more elaborate than it needs to be (I don’t really get the need for the Wiki functionality).
Further Reading
Denton Gentry and Sara Chipps have also written about their impressions of Stackoverflow. Or, on a less positive note, you could try Crapoverflow.
Software Project Names - We have a winner
After an exhaustive search, my quest to find the best software project name has struck gold. Following in the footsteps of Ruby on Rails, Groovy on Grails and a whole host of similarly monikered web frameworks, I give you… PHP on Crutches. It’s so much more than a name though. It also has a quality logo and comes with the following helpful advice:
PHP is hazardous to your health. Use something else if you can.
Naming Software Projects
“What’s in a name?” asked William Shakespeare, but then he wasn’t a software developer. If he had been his plays might have had mutually-recursive titles or Henry V might have been called YAPAK.
“How do you name your software projects?” was a question posed on Reddit recently. Coming up with good names is not easy. I’m reminded of the quote that letting software developers name products is like letting the marketing people code them. But in the absence of focus groups, or even a marketing professional, we try our best anyway.

© 2004-2007 Jeffrey Rowland/overcompensating.com
Common Strategies
Many developers will default to the tried-and-tested naming algorithm popular with the Linux crowd:
Another naming strategy, for the creatively-challenged, is the Ronseal approach. This involves choosing an entirely accurate but wholly unimaginative name (often abbreviated to an acronym). I’ve been guilty of using this technique in the past. Occasionally, you can derive a mildly amusing acronym.
If you have the Web 2.0 cool you can take a word and miss a letter out, brilliantly disguising the fact that the domain name that you really wanted wasn’t available.
My Own Efforts
I didn’t have to think too long over the name for ReportNG. Unexciting as it is, it was a fairly obvious choice due to its relation to TestNG. It also has a built-in barometer for success. When Google stops asking “Did you mean reporting?” I’ve made it.
On the other hand, I spent ages trying to come up with a name for the Watchmaker Framework, and a couple of years later I’m not entirely satisfied with it, though I haven’t come up with anything better in the meantime. All of the obvious evolution-related words were already being used for other evolutionary computation projects. Eventually I settled on “Watchmaker”, an allusion to the Watchmaker Analogy and, by extension, to Richard Dawkins’ The Blind Watchmaker, but it’s probably a confusing project name to those who aren’t familiar with the analogy.
Thinking of a good name can be difficult. You have to wait for inspiration to strike and you have to try to avoid Firefox-esque cock-ups. I find it very difficult to start on a new project idea until I have thought of a name for it. And yet sometimes you have a great idea for a project name when you’re not looking for one and then you need to find a project to apply it to.
(Dis)Honourable Mentions
Which software projects have the best and worst names? I think Hibernate is a good name (although I often refer to that software in my own more colourful terms). Lisp and Smalltalk aren’t exactly positive words. Names like Python, Java and Ruby are much cooler, regardless of the relative merits of the languages.
What other projects have particularly good or bad names?
How to treat your users like idiots
If there is one thing that is possibly more annoying than software that asks a yes/no question without offering “yes” and “no” as the options then it must be websites that insist that I enter my e-mail address twice in order to register an account.
The implication is clear:
“We don’t trust you to type anything important without making a mistake… you crooked-fingered, myopic half-wit.”
If you are responsible for writing web applications, please don’t do this. It makes me angry. It’s utterly pointless because I just end up copying and pasting from one field to the next. It makes sense for password fields because you can’t see what you are typing so you might not notice a typo, but I can recognise whether my own e-mail address is correct or not when it’s clearly displayed on the screen.
Strangely, no website has ever asked me to enter a shipping address twice, even though it would be more costly to get that wrong.
I can kind of understand the thinking behind it:
“We need to be certain that you give us a valid e-mail address and we’re not convinced you’d give us the same answer if we asked you twice… you easily-distracted, feckless simpleton.”
But the solution is not just annoying, it’s ineffective too. It doesn’t guarantee that the e-mail address is valid. If I can make a mistake once I could conceivably make the same mistake twice. Or, if I don’t want to reveal my real e-mail address, I can enter the same fake address twice. The way to make sure the address is valid and under the user’s control is to send a confirmation message containing an activation URL.
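For what it’s worth, the confirmation approach doesn’t need much code. The following is only a rough in-memory sketch in Java, assuming a hypothetical EmailConfirmation class and an activation page that takes a “token” query parameter. A real site would persist the pending tokens, expire them, and actually send the e-mail.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical in-memory sketch of e-mail verification via an activation URL.
 * The class name, the "token" parameter and the storage are all assumptions.
 */
final class EmailConfirmation {
    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> pendingByToken = new HashMap<>();
    private final String activationBaseUrl;

    EmailConfirmation(String activationBaseUrl) {
        this.activationBaseUrl = activationBaseUrl;
    }

    /**
     * Records the address as unverified and returns the activation URL
     * that should be e-mailed to that address (sending is left out here).
     */
    String register(String emailAddress) {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        // URL-safe, unguessable token that identifies this registration.
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        pendingByToken.put(token, emailAddress);
        return activationBaseUrl + "?token=" + token;
    }

    /**
     * Called when the activation URL is visited. Returns the address that is
     * now verified, or null if the token is unknown or has already been used.
     */
    String activate(String token) {
        return pendingByToken.remove(token);
    }

    public static void main(String[] args) {
        EmailConfirmation confirmation = new EmailConfirmation("https://example.com/activate");
        String url = confirmation.register("user@example.com");
        System.out.println("Send this link to the user: " + url);
    }
}
```

Only somebody who can read the mailbox can follow the link, which is exactly the guarantee that typing the address twice fails to provide.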
So please don’t treat me like I’m incompetent. If my e-mail address is vital then verify it properly.
Revisiting the Comments Debate: The Self-Documenting Code Contest
The great commenting debate generated a lot of disagreement, both here and elsewhere, about the value of code comments with respect to writing self-explanatory code. If you are of the opinion that good code does not need comments, here is your chance to prove it. Laurie Cheers has created The Self-Documenting Code Contest. The idea is to solve a given problem using the most understandable code that you can write. No comments are allowed. The problem is to generate all two-word anagrams of the word “documenting”.
Although I’ve clearly stated my opinion in favour of comments, I decided to give it a shot. I’ve already submitted my Haskell solution and, to be honest, you’d do well to improve on its readability. I believe that it satisfies all of the requirements using a brilliantly simple algorithm.
Here is my solution (look away now if you want to try for yourself first):
generateAllTwoWordAnagramsOfTheWordDocumenting :: [String]
generateAllTwoWordAnagramsOfTheWordDocumenting =
    [ "ceding mount"
    , "document gin"
    , "condign mute"
    , "induct gnome"
    , "coming tuned"
    , "gnomic tuned"
    , "cumin tonged"
    , "cum denoting"
    ]
I believe there is a lesson here about doing the simplest thing that could possibly work and how that can lead to simpler, more understandable code.
(I was also able to indulge in a little re-use while writing my solution).
UPDATE: I got disqualified from the contest for “being a smartass”.
DZone RSS Tricks
A comment from “Dmitryx” on DZone about how DZone could provide better options for filtering its “Popular Links” got me thinking about how Yahoo! Pipes (my new favourite toy) could be used to filter DZone RSS feeds in interesting ways. Helpfully, DZone provides a lot of information in its RSS feeds including the number of votes for and against, the click count, the number of comments, the username of the submitter, the URL of a thumbnail image, a list of tags and more. So if you want to go down the Pipes route, there are a lot of possibilities.
However, something else that is not immediately obvious is that DZone provides a lot of functionality of its own for filtering the feeds that it serves. Most DZone users will be aware that they can subscribe to a feed of “Popular Links” (those that make it to the front page) or subscribe to the “New Links” feed (those that have recently been submitted and have not yet attracted enough votes to qualify for the front page). The respective URLs for these feeds are:
http://www.dzone.com/links/feed/frontpage/rss.xml
http://www.dzone.com/links/feed/queue/rss.xml
But these two feeds are not the only ones available. There are also feeds for each of the tags. If, for example, you want to subscribe only to Python articles you would use one of the following feeds (again they are divided into “Popular” and “New” articles):
http://www.dzone.com/links/feed/frontpage/python/rss.xml
http://www.dzone.com/links/feed/queue/python/rss.xml
This is great if the articles you want neatly fit in to one of the 48 categories that DZone provides, but what if you want to restrict the feed to articles about Haskell, which doesn’t have its own tag (it is lumped together with Lisp, Erlang, Scala and the rest under the “Other Languages” tag)? Fortunately DZone provides a solution for this as well. You can create a feed for any search phrase. A Haskell feed (for both new and popular links) has the following URL:
http://www.dzone.com/links/feed/search/haskell/rss.xml
Kevin Pang has also discovered that you can provide query parameters to DZone’s feed URLs to manipulate the results (although DZone main man Rick Ross warns that these are not guaranteed to be supported in the future).
It’s not just topics that you can filter on. You can also subscribe to a feed of links submitted by a particular user. First you need to find out that user’s numeric ID (it’s part of the URL for their profile page), and then use that to construct the feed URL:
http://www.dzone.com/links/feed/user/<user_id>/rss.xml
Likewise for that user’s shared links:
http://www.dzone.com/links/feed/shared/<user_id>/rss.xml
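If you find yourself building a lot of these, the URL patterns above are regular enough to wrap in a small helper. This is just a sketch in Java with a made-up DZoneFeeds class. It only assembles the patterns listed in this post, it assumes tags and search phrases are single words (I haven’t checked how DZone expects multi-word phrases to be encoded), and it doesn’t touch the unsupported query parameters.

```java
/**
 * Hypothetical helper that assembles the DZone feed URL patterns quoted
 * in this post. It does not fetch anything or validate its arguments.
 */
final class DZoneFeeds {
    private static final String BASE = "http://www.dzone.com/links/feed/";
    private static final String SUFFIX = "/rss.xml";

    /** Links that have made it to the front page. */
    static String popular() {
        return BASE + "frontpage" + SUFFIX;
    }

    /** Recently submitted links that are still in the queue. */
    static String newLinks() {
        return BASE + "queue" + SUFFIX;
    }

    /** Front page links for one tag, e.g. "python". */
    static String popularForTag(String tag) {
        return BASE + "frontpage/" + tag + SUFFIX;
    }

    /** Queued links for one tag, e.g. "python". */
    static String newLinksForTag(String tag) {
        return BASE + "queue/" + tag + SUFFIX;
    }

    /** New and popular links matching a (single-word) search phrase. */
    static String search(String phrase) {
        return BASE + "search/" + phrase + SUFFIX;
    }

    /** Links submitted by the user with the given numeric ID. */
    static String submittedBy(long userId) {
        return BASE + "user/" + userId + SUFFIX;
    }

    /** Links shared by the user with the given numeric ID. */
    static String sharedBy(long userId) {
        return BASE + "shared/" + userId + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(popularForTag("python"));
        System.out.println(search("haskell"));
    }
}
```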
If these options alone aren’t enough, by using Yahoo! Pipes to combine, sort and filter multiple DZone feeds you should be able to tailor your subscription to match your interests precisely.
No, your code is not so great that it doesn’t need comments
Code-commenting is so basic and so universal that every programmer, regardless of the language that they practise, thinks that they know all there is to know and that their way is the only sensible approach (I am no different in this respect). I guess that’s why there are so many blog postings offering advice on commenting (you can add this one to the list).
Even the elite of programmer bloggers are having their say. Steve Yegge covered it and, more recently, so did Jeff Atwood. Jeff’s basic advice, that you wouldn’t need so many comments if you wrote the code to be more self-explanatory, is sound but the idea that we should be aiming for some kind of perfect code that has no need for any comments is dangerous.
It’s not a sensible goal for beginners and inexperienced developers. Tell them that they should write good code without any comments and they will deliver on the second part but struggle with the first. Even among experienced developers, assuming for a moment that it is possible to write perfect code that doesn’t require comments, there will be far fewer who are capable of this than there are who think that they are.
The other arguments against commenting are even weaker in my opinion. Yes, poor comments are …well… poor. So don’t write poor comments, write good ones. And yes, if comments become out-of-sync with the code then they are not helpful. So don’t let the comments become out-of-sync; they are part of your code and should be maintained/refactored along with the code itself.
I don’t believe that I’ve ever read a piece of code and thought “wow, this has far too many comments”. Unfortunately, I’ve had the opposite reaction all too often. I don’t for one moment believe that it is possible to write quality code without any comments. Take Jeff’s own example:
Here’s some code with no comments whatsoever:
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
    r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );
Any idea what that bit of code does? It’s perfectly readable, but what the heck does it do?
Let’s add a comment.
// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
    r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );
That must be what I was getting at, right? Some sort of pleasant, middle-of-the-road compromise between the two polar extremes of no comments whatsoever and carefully formatted epic poems every second line of code?
Not exactly. Rather than add a comment, I’d refactor to this:
private double SquareRootApproximation(n) {
    r = n / 2;
    while ( abs( r - (n/r) ) > t ) {
        r = 0.5 * ( r + (n/r) );
    }
    return r;
}
System.out.println( "r = " + SquareRootApproximation(r) );
I haven’t added a single comment, and yet this mysterious bit of code is now perfectly understandable.
Sorry Jeff, but that’s not “perfectly understandable”. I agree with extracting the square root code into a separate method with an appropriate name, but your second version (the one with the comment) was more informative since it mentioned which algorithm you were using (in your version, the maintainer is going to have to figure that out for themselves). Also, we’re still left with at least two poorly-named variables. We can forgive the use of n for the parameter since that’s kind of a convention but what the hell are r and t?
In my opinion, this is better:
/**
 * Approximate the square root of n, to within the specified tolerance,
 * using the Newton-Raphson method.
 */
private double approximateSquareRoot(double n, double tolerance)
{
    double root = n / 2;
    while (Math.abs(root - (n / root)) > tolerance)
    {
        root = 0.5 * (root + (n / root));
    }
    return root;
}
Alternatively, if you don’t like the verbose comment at the top, you could either rename the method to something like newtonRaphsonSquareRoot (if you are happy for the method name to be tied to the implementation) or put an inline comment in the body explaining that this is the Newton-Raphson method. Any of the three variations will communicate useful extra information to the maintenance programmer, who can then Google “Newton-Raphson” if they want to find out more about it. Remember that code is written only once but read many times. It should be tailored for the reader rather than the writer.
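As a rough illustration of the alternatives, here is a self-contained sketch that combines the renamed method with an inline comment. The wrapping SquareRoots class and the main method are only there so that it can be compiled and run, and it assumes a positive n (the loop is not guaranteed to terminate for a negative one).

```java
/** Hypothetical wrapper class, present only to make the sketch runnable. */
final class SquareRoots {

    static double newtonRaphsonSquareRoot(double n, double tolerance) {
        double root = n / 2;
        // Newton-Raphson: keep replacing the estimate with the average of
        // itself and n / estimate until the two agree to within the tolerance.
        while (Math.abs(root - (n / root)) > tolerance) {
            root = 0.5 * (root + (n / root));
        }
        return root;
    }

    public static void main(String[] args) {
        System.out.println("r = " + newtonRaphsonSquareRoot(2.0, 1e-9));
    }
}
```

Whichever variation you prefer, the name of the algorithm survives somewhere that the maintainer will see it.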
This is all very well, but we’re still lacking some information. Why the hell is Jeff calculating square roots in this way? Why is he not using the library function? Is it because he doesn’t like the answers it gives him? Is it for performance? Who knows?
Well-written code will often answer the “what?” and “how?” questions with few or no comments, but you often also need to answer the “why?” question too. Avi Pilosof covers this in his response to Jeff’s post. Avi argues that rather than comment the code, you should comment the business justification for writing the code that way. This may mean inserting reference to particular requirements or issue reports.
So yes, favour code that is self-explanatory, but I don’t believe that you can always achieve the necessary clarity without a few well-placed comments to aid understanding. Code that is obvious to the author today is rarely obvious to the maintainer next year (or even to the author next month).
And if you still really believe that your code does not need any comments, then I hope I never have to maintain your code.
Fun with Yahoo! Pipes and Last.fm
So I might be about 18 months late, but I finally got around to playing with Yahoo! Pipes today. I was aware of the basic concept but I was not aware of how impressive the implementation is. It’s an incredibly powerful tool with a slick UI that allows you to perform some pretty complex data processing without doing any real programming.
For my first experimental pipe, I just had it aggregate and sort the feed from this blog, my Google Reader shared links feed and my Flickr photostream feed. Easy-peasy.
Things got a bit more interesting when I tried to add my Last.fm “loved tracks” (favourites) to this super DanFeed. This is because Last.fm doesn’t actually have a feed for “loved tracks”. It has a feed for all recently played tracks, but I can’t really see the point of this because, with one entry every 3 or 4 minutes, it’s too much information for even the most dedicated stalker to digest.
Last.fm does however have a REST API to provide access to its data. Yahoo! Pipes is not restricted to processing RSS and Atom feeds. It can also extract data from JSON, CSV, arbitrary XML and even plain old HTML pages, so it didn’t take very long to get at the data I wanted.
After a little bit of trial-and-error I was able to include album art thumbnails in the feed too (for feed-readers that will display them). The only thing that wasn’t intuitive was how Pipes deals with dates for RSS entries. There was a lot of head-scratching before I finally succeeded in getting the dates from the Last.fm XML into the Pipes RSS.
The result of all of this is that I have published my first (semi-)useful Pipe, one that allows you to share your favourite tracks with your friends. In effect, they can subscribe to your recommendations. The pipe is here. Just type in a Last.fm username and press the button. You can get a link to your personalised RSS feed from the output page. If you want to embed the feed (including the thumbnail images) on your website/blog/Facebook/whatever, just click on “Get as a Badge” after generating your custom feed.



