Who writes GNU/Linux?

You may have thought GNU/Linux was written by idealistic Unix gurus holed up with a bunch of Jolt Colas in their mom’s basement, but a recent report from the Linux Foundation states the opposite. Between Linux kernel version 2.6.11 (March 2005) and version 2.6.24 (January 2008), the number of developers grew from 483 to 1,057. Over the same period, the number of sponsoring companies grew from 71 to 186.

The major contributors aren’t Mom’s Basement Inc. either. Companies like Novell, IBM, Intel, SGI, Oracle, Google and HP rank among the 20 largest contributors (counted in number of sponsored changes, and here sponsoring means paying employees to program those changes).

This is just the Linux kernel (some 8.5 to 9 million lines of code). However, the Linux kernel in itself is of little use to anyone. You have to add the GNU part of GNU/Linux, consisting of commands like fdisk, aspell, bison, ghostview, and wget, and then you’ll be looking at a much larger number of lines of code. If we go even further and add programs from other projects (like the Mozilla project’s Firefox web browser, or the OpenOffice suite), more lines of code are added (for exact numbers see ohloh.net), and we’re still talking about programs supported by large companies (IBM, Sun, etc.).

To sum it all up: no, GNU/Linux is not being written by enthusiasts in the basement anymore. It’s being written by large corporations for competitive reasons. Hardware manufacturers want to make sure Linux will work on their hardware. Software companies can be Linux distribution owners (Red Hat, Novell, MontaVista). Others use embedded versions of Linux in their consumer hardware (Sony, Nokia, Samsung), or have other reasons (for instance, Volkswagen uses Linux for in-car networking between different components).

FTP with Wget

I’ve just had the total pain of trying to get files (a lot of files, in a lot of directories) via a musty old FTP client (in Linux/Ubuntu).

The problem is that the FTP client (ftp) doesn’t offer much help: no recursive downloads, no mapping of the directories on the client side to those on the server side, and so on.

I searched and I found this thread:

http://ubuntuforums.org/archive/index.php/t-378221.html

…with this excellent snippet (posted by Mr. C.):

wget -r --ftp-user YourUSERNAME --ftp-password YourPASS ftp://FTPSITE//dir/'*.html'

If you want to download something other than *.html, you can change the file name pattern as you would expect.

If you want to add more directories, simply add them, but keep track of the number of slashes (“/”). There should be only one after each new directory name (at least that’s how I made it work; it may work wonderfully regardless of the number of slashes, but then again, why challenge fate?).
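Since counting slashes by hand is exactly the kind of thing I get wrong, here is a minimal Python sketch that assembles the same command. The helper name `build_wget_command` and the example directory `subdir` are my own inventions for illustration. It only builds the argument list; it keeps the double slash after the host exactly as in the snippet above and puts a single slash after every directory.

```python
import shlex


def build_wget_command(host, directories, pattern, user, password):
    """Assemble the recursive wget call from the snippet above.

    Hypothetical helper: strips stray slashes from each directory name so
    that exactly one slash separates the path components.
    """
    parts = [d.strip("/") for d in directories if d.strip("/")]
    # Keep the double slash after the host, as in the original snippet.
    url = "ftp://" + host + "//"
    if parts:
        url += "/".join(parts) + "/"
    url += pattern
    return ["wget", "-r", "--ftp-user", user, "--ftp-password", password, url]


cmd = build_wget_command(
    "FTPSITE", ["dir", "subdir/"], "*.html", "YourUSERNAME", "YourPASS"
)
# shlex.join quotes the URL so the shell does not expand the wildcard.
print(shlex.join(cmd))
```

To actually run it, the list can be handed to `subprocess.run(cmd)`; passing a list rather than a string means no shell is involved, so the `*` reaches wget untouched.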

Happy FTPing!

Open UP

Anybody who ever came into contact with RUP (Rational’s, now IBM’s, version of the Unified Process) may have stumbled upon the web application created to support the process. In there you can find workflows, actor and artifact definitions, templates, and so on.

I came across it some ten or so years ago. Since then I’ve had the (mis)fortune to work at companies with their own “UP” or what-have-you versions of development processes. So imagine my surprise and delight when I came across an open version of UP (sponsored by the Eclipse project), complete with the web application, the actors, the templates and all.

Institutionalized hearsay – no, we still don’t know what’s really going on!

I just read in my local newspaper that the US’s war on terror has failed to weaken al Qaeda.

Interesting, I thought, and read on, only to find out that someone had surveyed some 20,000 people in 23 countries around the world, asking them if the war on terror had weakened al Qaeda.

I managed to track down the survey to a company called Globescan. The actual survey can be found here. This survey seems rather scientific. They have a methodology page with a table that accounts for how the data was collected in each of the countries involved.

Interestingly enough, in more than half of the countries included, the interviews were conducted face-to-face, suggesting the sample was only semi-random, if even that. Most likely the interviewers asked people passing by in a marketplace, a supermarket, or some other place. Since a random cross-section of the country’s population is unlikely to pass by in the hours or days the interviews were conducted, the sample for that country was not random. (This in turn skews the result: imagine if only moms between 20 and 30 were asked this question, as opposed to everyone in the country. And we’re not even discussing the difference between face-to-face and telephone interviews when it comes to interviewer influence errors.)
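To make the skew concrete, here is a small Python simulation. Every number in it is made up by me for illustration (a subgroup that is 10% of the population and answers “yes” 80% of the time, against 30% for everyone else); none of it comes from the Globescan survey. It compares a truly random sample with a sample drawn only from the subgroup.

```python
import random


def simulate(seed=0, population_size=100_000, sample_size=1_000):
    """Compare a random sample with a sample drawn from one subgroup only.

    All proportions are hypothetical and chosen purely for illustration.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(population_size):
        in_subgroup = rng.random() < 0.10      # assumed 10% subgroup
        p_yes = 0.8 if in_subgroup else 0.3    # assumed answer rates
        population.append((in_subgroup, rng.random() < p_yes))

    def share_yes(people):
        return sum(answer for _, answer in people) / len(people)

    true_share = share_yes(population)
    # Everyone in the country has the same chance of being asked.
    random_estimate = share_yes(rng.sample(population, sample_size))
    # Only people from the subgroup happen to pass the interviewer.
    subgroup = [person for person in population if person[0]]
    convenience_estimate = share_yes(rng.sample(subgroup, sample_size))
    return true_share, random_estimate, convenience_estimate


true_share, random_estimate, convenience_estimate = simulate()
print(f"whole population:   {true_share:.3f}")
print(f"random sample:      {random_estimate:.3f}")
print(f"convenience sample: {convenience_estimate:.3f}")
```

With these assumed rates the random sample should land close to the population figure, while the subgroup-only sample lands far away from it, no matter how many people are interviewed.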

It is also possible to find the actual questions asked on the methodology page. One that stands out is the question “Overall would you say your feelings about al Qaeda are positive, negative or mixed?”

Imagine that question being posed to someone on a street in Egypt, Indonesia, or Pakistan (face-to-face interviews), and then ask yourself whether the interviewee would answer honestly, given the far from negligible chance of being overheard…

The question quoted by my local newspaper (and many like it, from what I’ve been able to gather) was probably “Do you think what US leaders refer to as the ‘war on terror’ has made al Qaeda stronger, weaker, or has had no effect either way?”

This question was not posed to the prime ministers of the countries included in the survey, nor was it posed to members of the intelligence community or the military (at least not solely, as far as the methodology page of the survey reveals). Instead, the you’s and me’s of these countries were asked about the progress of the war on terror.

The question in itself is a potent and interesting one. The method used to find the answer quite frankly sucks! The first question in the survey, whether the interviewee felt positive or not towards al Qaeda, would be interesting had it been posed in a situation where anonymity was assured and spectator influence limited. Other questions trying to determine the interviewee’s personal experience of and relation to the war on terror would have been interesting, but asking the average Joe to determine whether the war was effective or not is, if I’m really nice, plain dumb.

However, as a survey this one is scientific and within parameters (even though I would have wanted more information on methodology). Unfortunately this kind of research (and others like it, there are other gems dealing with the war in Iraq in basically the same way, to mention one), dealing with these kinds of questions, does not exist in a vacuum.

Journalists will read these surveys and make headlines from them. You and I will read these articles and draw conclusions. We will bring them with us when we vote the next time, when we plan our vacations, and go about our lives (okay not always but now and then we will make decisions remembering “that article” about the war in Iraq or the war on terror), and most of the time we won’t have the background or know the methodology, and in light of that I believe these kinds of questions should not be asked in this way.

Globescan calls their activity reputation research; I think of it as institutionalized hearsay. With the kind help of these guys we can have what Bob next door has been saying for years served to us as a Survey, based on more than X-thousand interviews in so-and-so many countries. What a glorious example of statistics once more abused as a tool for spreading FUD.

To all of you who read about this survey and drew your own conclusions from it, I can only say this: No, we still don’t know what’s really going on!

Using Statistical Analysis to Create Intrusion Detection

Professor Avishai Wool presents a system that protects GNU/Linux machines from intrusion and malicious program code by using statistical analysis and policy files that define a program’s normal behavior. If the program deviates from that behavior, the system stops it.

Since the analysis is hooked into the standard GNU/Linux build tools and uses the source code to derive the policy, the system is said to guarantee zero false positives. A system of this type is said to protect against threats long before traditional antivirus solutions have categorized them, and with a far smaller penalty to system performance.
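To get a feel for the idea, here is a toy Python sketch of a policy derived from source code and then enforced against observed behavior. It is not Professor Wool’s implementation: the short list of system call names, the regular-expression “analysis” and the flat set used as a policy are all simplifying assumptions of mine, and a real system would have to track far more than this.

```python
import re

# Assumption: a tiny, illustrative subset of system call names.
KNOWN_SYSCALLS = {
    "open", "read", "write", "close", "socket", "connect", "execve", "fork",
}


def derive_policy(source_code):
    """Collect every known system call that the source code calls."""
    called = set(re.findall(r"\b([A-Za-z_]\w*)\s*\(", source_code))
    return called & KNOWN_SYSCALLS


def first_violation(policy, observed_calls):
    """Return the first observed call outside the policy, or None."""
    for call in observed_calls:
        if call not in policy:
            return call
    return None


# Hypothetical C program used only as input to the toy analysis.
SOURCE = """
int main(void) {
    int fd = open("data.txt", 0);
    read(fd, buf, sizeof buf);
    close(fd);
    return 0;
}
"""

policy = derive_policy(SOURCE)
print(sorted(policy))
print(first_violation(policy, ["open", "read", "close"]))
print(first_violation(policy, ["open", "read", "execve"]))
```

The point of deriving the policy from the source is that anything the program can legitimately do ends up in the policy, so normal runs never trip the check, while injected code that calls something the source never mentions (here `execve`) gets caught.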

Here’s a list of links for further reading: