Archive for March, 2006

The committer model in practice

Tuesday, March 28th, 2006

Interesting article about the use of the “Committer Model” in commercial software development, borrowed from the open source community.

The committer model solves a few problems of open source distributed development, but the primary one is that of trust: in a developer’s intent and in their competence. There are other issues that push different buttons depending on the project, but in the end, these are just indicators of intent or competence.

Personally, I think this is a bit of an extreme approach in the average commercial development project – certainly for teams under 10 developers I would call it excessive – as the same issues don’t apply in a commercial project.

Firstly, you must implicitly trust the intent of the developers you are paying to write the software – and if you don’t, why are they working for you to begin with? Secondly, if you’ve hired them, then you have already made a decision based on their competence.

(more…)

If you’re a programmer, you should know this…

Monday, March 27th, 2006

My initial post was going to be a snark-filled venting of all that ails me with my “day job” at the moment. Specifically, the woes of maintenance programming when dealing with a codebase from 2000 and ported from Windows to Linux.

If JP Sartre were a coder, he would actually have said:

“hell is other people’s (undocumented) code.”

So for your amusement, here is the list of simple mistakes I’ve seen of late. Consider these some simple C++ programming sins you may use to torment those that will follow you for eternity. (more…)

Coming up for air

Monday, March 27th, 2006

I’ve been too busy lately with real work to keep up with the Ruby spider project, but I’m hoping to get back to it real soon.

This is a vent/rant/gratitude post, based on what I’ve spent my last 2 weeks dealing with.

I’ll start with the gratitude, but it’s all downhill from there. I am immensely thankful for the developers responsible for the following tools which, IMO, no Linux developer should be without:

  • Valgrind – helped me track down a bunch of obvious memory errors in the code
  • dmalloc – helped me track down those which valgrind couldn’t
  • gdb – the more I use gdb, the more effective a tool it becomes to me. Why aren’t you using it? At least learn to get a backtrace and inspect variables if you’re programming anything at all with GCC.
  • distcc – chasing memory issues into base classes led to full recompilation more often than I would have liked. distcc let me put an extra 2 idle machines to work compiling the code.

What started as a rather curious crash in our dev branch a couple of weeks ago turned into a major audit of how we allocate, use and free memory in our codebase. When I did get to the bottom of the crash, it turned out not to be a memory issue, but a library dependency inherited from libstdc++.

So here’s a little nugget for you right off the bat:

If you use STL containers (or any code which uses STL’s default allocator) in a dynamic library which has been compiled with -MT, you must link any executable that dlopen’s said library with libpthread.

In hindsight this sounds fairly obvious, right? (more…)

How to calculate standard deviation

Monday, March 13th, 2006

When I started this blog last month, I thought “Standard Deviation” was a snappy title. Of course, I also knew about standard deviation as a statistical tool, but I didn’t expect the overlap to make Google search drive 50+ visitors a month here looking for implementations of the standard deviation formula.

So as a “public service”, here is some code to figure standard deviation in Ruby and Java.

(disclaimer: no warranties as to correctness, particularly to the nth decimal place, don’t use this to run your home made nuclear reactor or air traffic control system, blah blah, etc, etc :-) )
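For the impatient, here is a minimal Ruby sketch of the textbook formula: take the mean, average the squared differences from it, then take the square root. The method names are only illustrative, and it assumes a non-empty array of numbers; pass sample: true to divide by n - 1 instead of n.

```ruby
# Illustrative helper names; this is a sketch of the textbook formula,
# not a tuned or numerically hardened implementation.
def mean(values)
  values.sum.to_f / values.size
end

def variance(values, sample: false)
  raise ArgumentError, "need at least one value" if values.empty?
  raise ArgumentError, "sample variance needs at least two values" if sample && values.size < 2

  m = mean(values)
  # Sum of squared differences from the mean.
  sum_sq = values.sum { |v| (v - m)**2 }
  # Population variance divides by n, sample variance by n - 1.
  sum_sq / (sample ? values.size - 1 : values.size)
end

def standard_deviation(values, sample: false)
  Math.sqrt(variance(values, sample: sample))
end
```

For example, standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]) gives 2.0: the mean is 5, the squared differences sum to 32, and 32 / 8 = 4.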

(more…)

Origami, everything I hoped for…

Thursday, March 9th, 2006

from Apple.

It’s scary, I guessed the form factor to the inch even:

While an OS X based tablet would be cool, I wouldn’t be surprised to see Apple deliver something in a smaller form factor, say 7″-10″ and geared towards video and the web on the go – bridging the gap between a handheld and a true tablet form factor.

A lot of people don’t seem to get it, but to my way of thinking, a tablet PC the same size as a conventional “notebook” PC has minimal advantages. In fact, it has all the disadvantages of lugging a laptop that size around (owww, my shoulder). I want a tablet I can take with me to crash on the couch, or the coffee shop, or the back garden, so I can catch up on reading email, RSS, websites, which I can do without needing a 50wpm input device like a keyboard. This to me is truly notebook sized, not the 15″ screen “notebook” I have on my desk now.

If it’s larger than a book, then I’ll just take a book instead. There’s your design spec for portability right there.

Now I wonder if it can be hacked to run OS X? Even if it could, it still lacks the sexy Apple industrial design. I’m still holding out hope that Apple will bring something similar to market soon… after all, the iPod wasn’t the first MP3 player, and that proved the naysayers wrong when they said that there was no market and it would flop. ;-)

Ruby web spider Part 1: The scheduler

Wednesday, March 8th, 2006

This is the second part of a series of posts covering the development of my web spider in Ruby. You can read about the initial idea here, and the architecture in Part 0: Concept.

You may also recognise some of the code in Scheduler#run from a short post I made to check that the syntax highlighting was working :-)

First I want to recap the goal of the scheduler before getting into the code itself. Simply put, the scheduler exists to manage the list of URIs (web pages, RSS feeds) that need to be spidered, and to manage the spiders themselves. In particular, we want to be able to limit the number of spiders working at any one time, out of politeness if nothing else.
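To make that goal concrete, here is a rough sketch of the idea rather than the spider's actual code: a queue of URIs drained by a fixed number of worker threads. The names add and max_spiders, and the block standing in for a real spider, are assumptions made for illustration.

```ruby
# Hypothetical sketch: hands queued URIs to at most max_spiders threads.
# The block passed to #run stands in for a real spider.
class Scheduler
  def initialize(max_spiders: 3)
    @max_spiders = max_spiders
    @queue = Queue.new
  end

  # Queue a URI (web page, RSS feed) to be spidered.
  def add(uri)
    @queue << uri
  end

  # Runs the block once per queued URI and returns [uri, result] pairs.
  # Simplification: the queue is closed first, so spiders cannot add
  # newly discovered URIs while a run is in progress.
  def run(&spider)
    @queue.close # pop returns nil once a closed queue is drained
    results = Queue.new
    workers = Array.new(@max_spiders) do
      Thread.new do
        while (uri = @queue.pop)
          results << [uri, spider.call(uri)]
        end
      end
    end
    workers.each(&:join)
    Array.new(results.size) { results.pop }
  end
end
```

With scheduler = Scheduler.new(max_spiders: 2) and a few calls to scheduler.add, scheduler.run { |uri| ... } never has more than two spiders working at once. The order of the returned pairs depends on thread timing.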

I’m not going to make this a tutorial in Ruby syntax by explaining things line by line. If you haven’t used Ruby before and find something you don’t understand, the PragProg book, Programming Ruby, is the place to go look.

So let’s take a peek at some code!

(more…)

Ruby web spider Part 0: concept

Friday, March 3rd, 2006

(I should probably mention that I have never written a spider or worked on a search engine before, so this is a learning process… I don’t pretend to be an expert on this – I picked this partly because it is far enough from my “day” job that I’m not going to inadvertently end up in a conflict of interest. The closest I’ve come in the past was working on a natural language interface to search engine queries, way back in 2001 while I was in my final year at UTas.)

So how did I start?

(more…)