... This blog has no purpose ...
Sat, 18 Mar 2006
Open Source Software
However the most interesting bit was the mention of Ronald Coase, the economist who, in 1937, wrote the famous paper "The Nature of the Firm". (Advance warning: I am not an Economist, so I apologise if I get this wrong and it offends you!) He asked the question, "why are firms the size they are?". Before I read his paper I had never asked myself this simple question (shame on me!)
When a company has a particular job to do, it can either use the "internal market" (i.e. get a fellow employee to do it) or outsource it to another firm (or contractor). Since the external market is bigger than the internal one, there ought to be more competition and a better price should be available. Taken to the extreme (for maximum efficiency), every company should have a single employee and outsource every job (other than the ones done by the single employee). Of course, we observe something quite different. What seems to happen is that a balance is struck between handling jobs internally and outsourcing, depending on the overheads.
Outsourcing a job can be quite time-consuming and expensive because one must: locate a firm to do the work, negotiate a legal contract, wait for the job to be done and, if necessary, try to encourage the other party to stick to their end of the bargain. By contrast, an employee has already got a pretty watertight contract with their employer (complete with penalties in the event of job non-completion) and hence the overheads are lower.
When working on an Open Source project, the terms are explicit and relatively simple to understand. Most interesting projects are under a small number of well-known licenses (GNU GPL, BSD, etc) and there is often a convention on what happens to the copyright (e.g. the FSF may demand it if you contribute to emacs or you may be encouraged to keep it yourself as a way of preventing any future hostile takeover and license change). So a contributor doesn't need to engage a lawyer before starting work.
Additionally, communication (via the web, email, etc) is cheap, fast and plentiful. The overheads associated with outsourcing a job in the Open Source world are therefore very small, so Open Source companies may be smaller and potentially the whole market more efficient. That link had never occurred to me before.
Working through the photo backlog (with the help of flickr)
Some photos of James and Fiona's wedding last month (or thereabouts):
On our way back from there, we decided to visit the town of Bath. I'd not seen the Roman baths before... very impressive! Unlike "Verulamium" the (amusingly named) Roman town which was cannibalised to construct the cathedral at St. Albans, the baths at Bath managed to get hidden under some other miscellaneous rubble and are now remarkably intact (modulo the roof etc which fell down). The ancient tunnels which take the water from the hot spring into the various chambers still do their job pretty well and (IIRC) people used to take baths here as late as the 1930s. The spring water tasted slightly peculiar though and although people thought it had regenerative properties (and some physicians prescribed up to 5 litres a day!) I think I'll stick with old-fashioned tap water.
Mon, 10 Oct 2005
New Piano
2005: The year we made contact
Well, fear not, people of Earth! They have successfully made contact... by dropping a squash made to look like the planet Jupiter in Kieran's garden (what amazing technology these aliens have... I wonder if their spaceship is shaped like a giant turnip?). Now the question is, what were they trying to say? Is there a hidden message?? We might never know!
Jupiter picture is from NASA and is in the public domain.
Another picture of the squash with a bit of context:
Just read: Designing Extensible IP Router Software
Notes: This paper describes XORP, a project to create Open-Source router software. They make some good observations:
"low-level protocols that support the Internet have largely ossified, and stresses are beginning to show."
and
"The router software market is closed: each vendor's routers will run only that vendor's software. This makes it almost impossible for researchers to experiment in real networks"
Both statements are true, although in the case of the second I would argue that researchers can experiment in small networks but they can't experiment in the kind of large networks where the router equivalents of "big iron" dominate. I suspect then that this might be a non-sequitur:
"We therefore saw the need for a new suite of router software: an integrated open-source software router platform running on commodity hardware..."
As far as I can see, projects like OpenBGPD already aim to fill this niche and they don't solve this particular problem because you don't run this software in place of a "big-iron" backbone BGP router. When talking about Cisco and Juniper's routers, the paper says:
"Unfortunately, these vendors do not make their APIs accessible to third-party developers, so we have no idea if their internal structure is well suited to extensibility"
I just find this kind of thing quite depressing. As a researcher, why aim to compete in this kind of area? Why suffer the indignity of having to write stuff like:
"While we don't know how Cisco implements BGP, we can infer from clues from Cisco's command line interface and manuals that it probably works something like this."
Small groups of researchers cannot and should not (in my opinion) try to compete this closely with industry. Researchers should play to their strengths; focus on more blue-sky stuff that's a little bit far-out with no direct industrial relevance. It seems like research suicide to take on large, well-funded companies at their own game.
Just read: Trickles: A Stateless Network Stack for Improved Scalability, Resilience and Flexibility
Notes: The paper describes a new API (to replace sockets) and a new protocol (to replace TCP) which allows all connection state to be handled by one endpoint. Since TCP distributes state, scalability is limited - consider that a server requires explicit per-client buffers which can lead to memory exhaustion during a DoS attack. In Trickles, the new protocol involves the stateful side (typically the client) sending a continuation (in the form of a serialised TCB) to the stateless side (typically the server). Additionally a new connectionless API is provided for the stateless side.
Naturally a whole pile of issues arise when a protocol is designed this way, which makes for interesting reading. For example, in TCP each packet reception changes the endpoint state and this (e.g. window-size) is reflected in the next packet transmission. Since each "trickle" is a stateless client-server ping-pong, the state does not get updated for a whole round-trip. The Trickles system must also worry about malicious clients changing server state and tries to prevent this using MACs (there's an obvious analogy here between Trickles and web-applications which also use a stateless protocol).
At the meta-level, it was interesting to see a paper propose an alternative to both the socket API and TCP, even if they compromise slightly by making a socket-compatible shim-layer and using a TCP-like protocol with TCP wire formats :-)
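To make the MAC idea a bit more concrete, here is a minimal sketch of a server that keeps no per-connection state: it hands the state to the client as a sealed continuation and checks the seal when the continuation comes back. This is not the Trickles wire format; the JSON encoding, the HMAC-SHA256 choice, the key and the "next_seq" field are all assumptions made up for illustration.

```python
import hashlib
import hmac
import json

# Assumption: a secret known only to the server (hypothetical value).
SERVER_KEY = b"hypothetical-server-secret"
TAG_LEN = 32  # size of an HMAC-SHA256 digest in bytes


def seal(state: dict) -> bytes:
    """Serialise connection state and prepend a MAC so the client cannot alter it."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    return tag + body


def unseal(continuation: bytes) -> dict:
    """Check the MAC and recover the state; reject anything that was tampered with."""
    tag, body = continuation[:TAG_LEN], continuation[TAG_LEN:]
    expected = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("continuation failed MAC check")
    return json.loads(body)


def handle_request(continuation: bytes, acked_bytes: int) -> bytes:
    """One stateless server step: rebuild state from the request, update it, hand it back."""
    state = unseal(continuation)
    state["next_seq"] += acked_bytes
    return seal(state)
```

The client just stores whatever the server returns and sends it back with its next request, e.g. `c1 = handle_request(seal({"next_seq": 0}), 1460)`; the server itself remembers nothing between the two calls.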
Just read: Untangling the Web from DNS
Notes: Suggests that, now DNS is commercial, "profit has replaced pragmatism as the dominant force shaping DNS ... Commercial pressures arising from its role in the Web have transformed DNS into a branding mechanism, a task for which it is ill-suited". The paper points out that URLs involving host names -- rather than services or objects -- make some tasks unnecessarily difficult, like content replication. The paper suggests that web-references should be redesigned to be (i) persistent; and (ii) contention-free (have no relevance to trademark law for example). They suggest a system of opaque keys ("Semantic Free References") mapped by a DHT to concrete location records. In the proposed system, object keys must be found by search engines -- no more DNS name guessing -- and the implementation must handle network failures nicely (e.g. with DNS using the external network connection may still allow you to access internal names -- "fate-sharing")
Random notes (not intended to express any particular opinion):
Just read: Acute: High-Level Programming Language Design for Distributed Computation
Notes: Acute is designed as an experimental platform to explore the design-space of type-safe distributed programming languages. As a bonus it was written in Fresh O'Caml to gain more experience of using Fresh O'Caml in larger projects. In contrast to systems like Obliq, Facile, JoCaml (and reminiscent of Nomadic Pict's two-level approach to communication), Acute does not specify a particular communication style in the runtime. Instead the system simply provides marshal and unmarshal to and from byte strings. Lots of interesting ideas in here.
Just read: Certification of programs for secure information flow
Notes: "Information flow control" is a procedure for regulating the dissemination of data amongst program objects. Program objects are assigned to security classes (e.g. top-secret, secret, public) and a policy is represented by a relation between these classes. A "flow" is only allowed to exist between x and y (i.e. the value of y is influenced by x) if the pair (class(x), class(y)) is in the policy relation. This paper proposes a compile-time mechanism for checking programs against information flow policies. Once certified, security policy breaches can only occur through out-of-band mechanisms e.g. faulty or sabotaged hardware or unmodelled communication through covert channels.
Quite an old paper but well worth reading. In particular the discussion of what constitutes a flow between two variables is interesting. Obviously the assignment x <- y causes a flow from y to x. But more subtle flows are also possible e.g.:
x <- 0;
while (y > 0) do
x <- x + 1;
y <- y - 1;
done
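In that loop nothing is ever assigned from y to x directly, yet when it finishes x holds the original value of y (for positive y), so there is an implicit flow via the loop guard. A rough sketch of how a compile-time check can catch this is below. It is written in the spirit of the paper's mechanism rather than transcribed from it: the tuple-based mini-AST, the two-level class ordering and the function names are my own assumptions.

```python
PUBLIC, SECRET = 0, 1  # assumption: a simple two-level ordering of security classes


def may_flow(src: int, dst: int) -> bool:
    """Policy relation: information may stay level or flow upwards, never down."""
    return src <= dst


def certify(stmts, classes, context=PUBLIC) -> bool:
    """Return True if no statement causes a flow that the policy forbids.

    `context` is the class of the loop guards we are nested inside; it is how
    implicit flows are accounted for.
    """
    for stmt in stmts:
        if stmt[0] == "assign":
            _, target, sources = stmt
            level = max([context] + [classes[v] for v in sources])
            if not may_flow(level, classes[target]):
                return False
        elif stmt[0] == "while":
            _, guard_vars, body = stmt
            guard = max([context] + [classes[v] for v in guard_vars])
            if not certify(body, classes, guard):
                return False
    return True


# The loop above: each assignment lists only the variables it reads.
loop = [
    ("assign", "x", []),
    ("while", ["y"], [("assign", "x", ["x"]), ("assign", "y", ["y"])]),
]
```

With y secret and x public, `certify(loop, {"x": PUBLIC, "y": SECRET})` rejects the program because the assignment to x sits under a guard that depends on y; swapping the classes makes it acceptable.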
Just read: Obliq: A Language with Distributed Scope
Notes: In lexically scoped languages, the binding location of every identifier is determined by simple analysis of the program text surrounding the identifier. Therefore, one can be sure of the meaning of program identifiers, and can much more easily reason about the behavior of programs. In Obliq, analysing the program text allows you to discover where a binding is located. When procedures are transmitted, free variables are sent as references back to their defining location. Lexical scoping has a security implication; incoming agents come with their own variable references and have no way of accessing state at the execution site.
Man-eating spider
More rendering stuff
Fraser Research trip to the beach
We had a nice meal here, in a restaurant next to the marina:
Camlimages, freetype and kerning
Once the basics were working I noticed that the camlimages/freetype binding seems to lack an API to extract kerning information (definition of kerning from wikipedia). So I added that and it seems to work! Check out these two pictures:
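For anyone wondering what the kerning information is actually used for, here is a rough sketch of kerning-aware layout. It is not the camlimages or freetype API: the advance widths and kerning pairs are made-up numbers standing in for values a real font would supply, and `pen_positions` is a hypothetical helper.

```python
# Hypothetical font metrics in pixels; real values come from the font file.
ADVANCE = {"A": 12, "V": 12, "T": 11, "o": 9}
KERNING = {("A", "V"): -2, ("V", "A"): -2, ("T", "o"): -1}


def pen_positions(text: str, use_kerning: bool = True) -> list[int]:
    """Return the x position at which each glyph of `text` would be drawn."""
    positions = []
    x = 0
    previous = None
    for ch in text:
        # Nudge the pen by the pair adjustment before drawing this glyph.
        if use_kerning and previous is not None:
            x += KERNING.get((previous, ch), 0)
        positions.append(x)
        x += ADVANCE[ch]
        previous = ch
    return positions
```

With these numbers `pen_positions("AVA", use_kerning=False)` gives `[0, 12, 24]` while the kerned version gives `[0, 10, 20]`, i.e. the "A" and "V" get tucked in towards each other.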
The topmost picture is without kerning enabled and the bottommost one has kerning enabled. Check out the relative distance between the "A" and the "V". Now to work on ligatures...
Wed, 27 Jul 2005
Jon and Emma's wedding
Obligatory photos:
We also got to meet Jill, Richard's girlfriend:
Here's a nice picture of Eleanor with me looking.. less silly than in most of the other photos:
Some more obligatory photos: (Any wedding with lots of cheese is fine by me)
I quite liked the "Guest Signature Mount" which we were
Interestingly the hotel included its own chapel which looked like a building straight out of "Quake".. converted appropriately into a games room.
That's enough for this entry. Next time I'll upload some short videos and yet more photos of the wedding aftermath...
Mon, 20 Jun 2005
Graphing
The red/pink line is my heart-rate during my previous run (about 5 days ago... far too long a gap) and the blue line is the measured rate during this morning's run. The first major drop after about 30mins is when I stopped running and started resting. I notice that my HR dropped to a plateau at about 120bpm last time but not this time. I don't really know why although it could be linked to the weather -- today was beautifully fresh while 5 days ago it was pretty hot and humid. Now to make the graph plot data in real-time from simulations. Did I mention that it uses OpenGL under ocaml via lablgl? All pretty fantastic tools.
[/princeton-2005] permanent link
Sun, 19 Jun 2005
Visit to NYC
We went first to the Empire State Building where we saw -- and participated in -- the world's longest queue (TM). Rumour has it that the Petronas towers in Kuala Lumpur have a longer one but I frankly don't believe it. From the 86th floor (just below the "dirigible mooring mast" -- alas there were no airships going up today so we had to take the boring old lift) we got a pretty cool view of Manhattan:
It's a bit unfortunate they have to put up such a big fence to prevent people jumping/falling off.. it really obscures the view.
Also from the top you can see the World's largest department store (Macy's) and the World's most green island statue (the Statue of Liberty):
Next on our tour of Manhattan we headed south to check out the financial district and the site of the World Trade Center. I was quite impressed that they'd repaired the site's transport links -- the PATH station was functioning normally. It was also interesting to see quite a lot of damage to nearby buildings still remained after all this time.
While waiting for Donald Trump to give his approval for the plans to redevelop the site into some kind of memorial, there is an official set of memorial plaques as well as copious unofficial memorials from members of the public:
Next we headed further south to Battery Park where we took a "sea taxi" out to have a close-up look at Liberty Island:
[/princeton-2005] permanent link
Tue, 14 Jun 2005
Just read: The Effect of DNS Delays on Worm Propagation in an IPv6 Internet
Notes:
I found this paper linked from the Worm Blog. Memorable quote from the abstract:
Although they focus on low-level address-scanning worms, they do point out that email worms operate completely independently of the Internet address scheme. They describe models of hypothetical worms which would use pipelined random DNS name lookups and conclude that they could run almost as fast as raw IPv4 address scanning worms (so the worm would constantly guess names like www.somedomain.com). They suggest employing traffic monitoring software near DNS servers to spot dodgy activity.
I was wondering if we could usefully restrict access to the name system to slow down these attacks? Perhaps everyone has to authorise lookups via a smartcard/PDA and/or is rate-limited as well? It's a tricky one since obviously there is a tension between making good communication easy while making bad communication difficult...
It does raise the question of why so many computers have names as well as addresses. Since I never want to log into my laptop remotely, it doesn't need a name. However a lot of current applications seem to prefer all IP addresses to have associated names (to help prevent address spoofing?) and suffer large name lookup delays when they don't (like ssh - I had to add an entry to DNS for a laptop yesterday just because of that).
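Going back to the rate-limiting thought: one way a resolver could throttle lookups per client is a token bucket. This is only a sketch of that general technique, not anything proposed in the paper; the class name, the rate and the burst size are assumptions, and the clock is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow short bursts of lookups but cap the sustained rate."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # tokens (lookups) added per second
        self.burst = burst    # maximum number of tokens that can be saved up
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the previous request, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a lookup arriving at time `now` may proceed."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, without exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A normal user doing a handful of lookups never notices the limit, whereas a worm guessing names as fast as it can is held to `rate` lookups per second once its initial burst is spent.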
First proper run with the Forerunner
I hacked together some scripts to analyse the data a bit. Here are some plots:
New GPS receiver
To transfer the data from the device to my Mac I'm using the latest version of gpsbabel which can communicate with a large number of different GPS receivers and knows how to read and write numerous file formats (apparently the mapping world is plagued by lots of similar-in-purpose but yet trivially different file format standards). Unfortunately the particular model of receiver I just bought isn't completely supported yet by gpsbabel. Thankfully gpsbabel seems relatively easy to hack and I've made a few simple changes to help it understand the new format of "track point" data, specifically the data type D/303, which is not covered by the Garmin specification (it seems that recorded runs are called "tracks").
I'm not sure what the Garmin does with the heart rate data -- hopefully I'll get a better idea when I go for a run tomorrow. If I'm lucky it'll just be part of the track data... although I can imagine reasons why it might be stored somewhere else (e.g. consider that while GPS coverage may be patchy in a city one would expect a continuous stream of HR events... so perhaps they are indeed stored separately).
To test the receiver I went for a walk from the house to Small World coffee. Here's a plot of the route via google maps:
Interestingly I seem to be walking parallel to Park Place -- I assume this is just systematic error in the GPS system. At the corner of Nassau and Witherspoon I actually turned right and walked under a whole lot of metal scaffolding. I presume the last point is some kind of reflection/random error. Speaking of google maps, it seems that everyone is busy finding uses for it these days. Here are some useful references:
Memorial Day Parade
Some pictures of the marching people:
Naturally I took some video snippets... unfortunately they're in a mixture of different formats -- sorry if none of them work for you.
[/princeton-2005] permanent link
Visit to Pittsburgh
Before the main review started we had a practice session. Here are some pictures of the gradstudent poster session at which I presented the switch simulation (the demo is on the right, next to Sandy):
The demo itself seemed to go down well (probably because of all the flashy 3D graphics):
I didn't see too much of Pittsburgh while I was there but here are a few shots taken on the way from the university to the hotel. The leftmost shows the CS facility and the other two attempt to show how hilly the place was (the entrance to the building is on floor 4 and there is a ground-exit on every floor apparently). Notice the gothic-looking building on the rightmost photo.
The leftmost picture shows a church near the campus and the rightmost one is of the Pittsburgh Software Engineering Institute:
Here's the impressive gothic building again -- certainly puts Cambridge's UL to shame:
Silly Javascript microbenchmark
As you can see, Firefox's javascript VM seems to run faster than Safari's. The final kink in the Firefox curve is probably when the program is suspended while the user is consulted with a dialog ("A script on this page is causing mozilla to run slowly.... Do you want to abort..."). Encouragingly, the javascript interpreters weren't that much slower than ocaml bytecode -- I didn't have to log any axes! So all we need is something to compile a nice language to javascript and perhaps we can do some proper client-side programming...
Sun, 22 May 2005
Experimental tools for bibtex -> rss
Just read: Rethinking the Service Model: Scaling Ethernet to a Million Nodes
Notes: Ethernet networks are popular because they are easy to set up and maintain while the hardware is relatively simple (and therefore cheap) to make. Some ISPs are now offering ethernet VPNs rather than IP VPNs. Signs are that people want to make bigger ethernet networks. They note that only recently have large networks with flat address-spaces become feasible due to the increase of transistor density and hence memory capacity.
The paper considers what would happen if someone tried to add a million end-systems to ethernet as it currently stands. It outlines problems with the convergence of "Rapid Spanning Tree Protocol" (RSTP); RSTP is intended to converge within 3x worst case network delay but in certain configurations it will actually take many seconds, during which the network reverts to inefficient broadcast flooding (they built a simulator to investigate various cases). They argue that the availability of a broadcast primitive has encouraged other protocols to rely on it, causing problems as the network gets bigger. The availability of broadcast means that RSTP has to be very conservative, never allowing forwarding loops to form. By banning broadcast, this can be relaxed and faster protocols used, essential for telecom-style restoration.
Two different approaches are considered: a "thin control plane" and a "distributed control plane". The thin control plane has two aspects: a decision plane and a dissemination plane. The decision plane calculates all the forwarding tables while the dissemination plane talks to the decision plane, sending status information and receiving switch configurations. The distributed control plane has each local bridge offer a registration service in which a rebooting host places its MAC, IP and possibly other service info. The bridges synchronise this information globally. Broadcast protocols like ARP are modified to ask the local bridge where to send the traffic instead of broadcasting it.
They finish by describing simulation results which suggest they could use these techniques to allow million-node ethernets.
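To picture the distributed control plane, here is a toy sketch of bridges that keep a synchronised directory so that an ARP-style question becomes a local lookup instead of a broadcast. It only illustrates the idea as summarised above: the `Bridge` class, its method names and the naive push-to-every-peer synchronisation are assumptions of mine, not the paper's protocol.

```python
class Bridge:
    """A bridge offering a registration service to its directly attached hosts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.directory = {}  # IP address -> (MAC address, name of the home bridge)
        self.peers = []      # other bridges to keep in sync

    def connect(self, other: "Bridge") -> None:
        """Make two bridges exchange future registrations."""
        self.peers.append(other)
        other.peers.append(self)

    def register(self, ip: str, mac: str) -> None:
        """Called by a host when it boots; the record is pushed to every peer."""
        record = (mac, self.name)
        self.directory[ip] = record
        for peer in self.peers:
            peer.directory[ip] = record

    def resolve(self, ip: str):
        """ARP replacement: answer from the local directory, no broadcast needed."""
        return self.directory.get(ip)
```

A host attached to one bridge registers once, and a host on any other bridge can then resolve its address by asking only its own local bridge.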
Silly time-lapse video
Uh-oh apparently Immy can't read yet
A sad day for the LCE
Obsession with food
and moving on to a fantastic curry which Eleanor and I cooked a few weeks ago (notice the spicy mango chutney which Eleanor's friend provided):
The home office shed
What a fantastic idea! There are lots of interesting sites on the web including:
Sarah and Dan's wedding
I did shoot quite a lot of videos (in fact I managed to fill up 1GB of SD) which took absolutely ages to transcode into more web-friendly versions. The files below might require you to upgrade to the latest version of QuickTime (version 7) (free, non-pro version should be fine) because they use Apple's new H.264 video compressor.
Back in Princeton for the summer
packets/sec
bytes/sec
So even though she's on the smallest NTL broadband package which has a monthly bandwidth cap of 3GB, she should be able to run a 30kB/sec video chat for about 30 hours continuously before reaching the limit.
[/princeton-2005] permanent link
Tue, 19 Apr 2005
The folly of monorails
Luckily for me I left plenty of time (5hrs or so) to get on my flight so I was fine. After a total outage of about 1.5hrs the system was working well enough to ship people to the parking lot, from where they could walk to all the terminals.
[/princeton-2005] permanent link
Sun, 17 Apr 2005
Gratuitous hardware pictures
[/princeton-2005] permanent link
Sat, 16 Apr 2005
More camera experiments (and dinner)
The new camera even takes videos
The new camera works!
New camera!
Catching up with the blog