Dave's Blog


This blog has no purpose


David Scott
dave@recoil.org


Creative Commons License
Except where otherwise noted, this blog is licensed under a Creative Commons License.

     
Tue, 20 Sep 2005

Just read: Designing Extensible IP Router Software
by Mark Handley, Eddie Kohler, Atanu Ghosh, Orion Hodson and Pavlin Radoslavov. Proceedings of the 2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI '05), May 2005.

Notes: This paper describes XORP, a project to create open-source router software.

They make some good observations:

low-level protocols that support the Internet have largely ossified, and stresses are beginning to show.

and

The router software market is closed: each vendor's routers will run only that vendor's software. This makes it almost impossible for researchers to experiment in real networks

Both statements are true, although in the case of the second I would argue that researchers can experiment in small networks; what they can't do is experiment in the kind of large networks where the router equivalents of "big iron" dominate. I suspect, then, that this might be a non sequitur:

We therefore saw the need for a new suite of router software: an integrated open-source software router platform running on commodity hardware...

As far as I can see, projects like OpenBGPD already aim to fill this niche, and they don't solve this particular problem because you don't run this software in place of a "big-iron" backbone BGP router.

When talking about Cisco and Juniper's routers, the paper says:

Unfortunately, these vendors do not make their APIs accessible to third-party developers, so we have no idea if their internal structure is well suited to extensibility

I just find this kind of thing quite depressing. As a researcher, why aim to compete in this kind of area? Why suffer the indignity of having to write stuff like:

While we don't know how Cisco implements BGP, we can infer from clues from Cisco's command line interface and manuals that it probably works something like this.

Small groups of researchers cannot and should not (in my opinion) try to compete this closely with industry. Researchers should play to their strengths; focus on more blue-sky stuff that's a little bit far-out with no direct industrial relevance. It seems like research suicide to take on large, well-funded companies at their own game.


Just read: Trickles: A Stateless Network Stack for Improved Scalability, Resilience and Flexibility
by Alan Shieh, Andrew C. Myers and Emin Gun Sirer. Proceedings of the 2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI '05), May 2005.

Notes: The paper describes a new API (to replace sockets) and a new protocol (to replace TCP) which together allow all connection state to be handled by one endpoint. Because TCP keeps state at both endpoints, scalability is limited: a server requires explicit per-client buffers, which can lead to memory exhaustion during a DoS attack. In Trickles, the stateful side (typically the client) sends a continuation (in the form of a serialised TCB) to the stateless side (typically the server). Additionally, a new connectionless API is provided for the stateless side.

Naturally a whole pile of issues arise when a protocol is designed this way, which makes for interesting reading. For example, in TCP each packet reception changes the endpoint state, and this change (e.g. to the window size) is reflected in the next packet transmission. Since each "trickle" is a stateless client-server ping-pong, the state does not get updated for a whole round-trip. The Trickles system must also worry about malicious clients changing server state, and tries to prevent this using MACs (there's an obvious analogy here between Trickles and web applications, which also use a stateless protocol).
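
To make the MAC idea concrete, here is a minimal sketch of a server that keeps no per-connection state and instead trusts only continuations it has sealed itself. This is not the Trickles wire format: the key, the JSON encoding, the field names and the functions `seal`, `unseal` and `handle_request` are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

# Assumption: a secret known only to the server (hypothetical value).
SERVER_KEY = b"hypothetical-server-secret"


def seal(state: dict) -> bytes:
    """Serialise connection state and prepend a MAC so the client cannot alter it."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    return tag + body


def unseal(continuation: bytes) -> dict:
    """Verify the MAC and recover the state; reject anything tampered with."""
    tag, body = continuation[:32], continuation[32:]
    expected = hmac.new(SERVER_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("continuation failed MAC check")
    return json.loads(body)


def handle_request(continuation: bytes, acked_bytes: int) -> bytes:
    """One stateless step: everything the server needs arrives with the request."""
    state = unseal(continuation)
    state["seq"] += acked_bytes
    return seal(state)
```

The client stores the returned bytes and hands them back with its next request; any edit to the body invalidates the tag, so the server can detect it without remembering anything about the connection.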

At the meta-level, it was interesting to see a paper propose an alternative to both the socket API and TCP, even if they compromise slightly by making a socket-compatible shim-layer and using a TCP-like protocol with TCP wire formats :-)


Just read: Untangling the Web from DNS
by Michael Walfish, Hari Balakrishnan and Scott Shenker. Proceedings of the 1st Symposium on Networked Systems Design and Implementation (NSDI), March 2004.

Notes: Suggests that, now that DNS is commercial, "profit has replaced pragmatism as the dominant force shaping DNS ... Commercial pressures arising from its role in the Web have transformed DNS into a branding mechanism, a task for which it is ill-suited". The paper points out that URLs built around host names -- rather than services or objects -- make some tasks, like content replication, unnecessarily difficult. It suggests that web references should be redesigned to be (i) persistent; and (ii) contention-free (having no relevance to trademark law, for example). They propose a system of opaque keys ("Semantic Free References") mapped by a DHT to concrete location records. In the proposed system, object keys must be found by search engines -- no more DNS name guessing -- and the implementation must handle network failures nicely (e.g. with DNS, losing the external network connection may still allow you to access internal names -- "fate-sharing").

Random notes (not intended to express any particular opinion):

  • I wonder how often people actually type in URLs manually... I usually find it easier to type keywords into Google ("I'm feeling lucky")... I don't really care what the URL bar says. When I want assurance I know to check for SSL anyway.
  • Initiatives such as DOI are trying to create persistent object references for electronic publishing anyway.
  • The o-records (targets of the keys in the DHT) look rather like mini HTML-redirection pages with caching hints and an onward pointer to either another reference or the final target.
  • This is the kind of thing someone like Google could just do, particularly if they release their own browser.
  • They do remove the web URL misfeature where the host and path are exposed separately -- in this scheme, the full name is looked up in the name system before any HTTP request is generated. Good good.
  • Is this just another instance of "every problem in Computer Science can be solved with another layer of indirection"? Or just another combination of application and "cool thing" where the web is the application and DHTs are the "cool things" today?
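
The extra layer of indirection is easy to sketch. Below, an opaque 160-bit key is mapped to a location record by a toy in-memory "DHT" in which each key lives on its successor node on a hash ring. The `ORecord` fields, the node names and the `ToyDHT` class are assumptions for illustration, not the paper's actual record format or lookup protocol.

```python
import bisect
import hashlib
import os
from dataclasses import dataclass


@dataclass
class ORecord:
    # Hypothetical fields: a target location plus a caching hint.
    target: str
    ttl_seconds: int


def ring_position(data: bytes) -> int:
    """Place a node name on the 160-bit ring."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class ToyDHT:
    """In-memory stand-in for a DHT: each key lives on its successor node."""

    def __init__(self, node_names):
        self.ring = sorted((ring_position(n.encode()), n) for n in node_names)
        self.store = {name: {} for name in node_names}

    def _owner(self, key: bytes) -> str:
        positions = [p for p, _ in self.ring]
        i = bisect.bisect_left(positions, int.from_bytes(key, "big"))
        return self.ring[i % len(self.ring)][1]  # wrap around the ring

    def put(self, key: bytes, record: ORecord) -> None:
        self.store[self._owner(key)][key] = record

    def get(self, key: bytes) -> ORecord:
        return self.store[self._owner(key)][key]


def new_reference() -> bytes:
    """An opaque 160-bit key with no host name or brand in it."""
    return os.urandom(20)
```

When the object moves, only the record is rewritten with another `put`; the reference that other pages link to stays the same, which is the persistence property the paper is after.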


Just read: Acute: High-Level Programming Language Design for Distributed Computation
by Peter Sewell, James J. Leifer, Keith Wansbrough, Francesco Zappa Nardelli, Mair Allen-Williams, Pierre Habouzit and Viktor Vafeiadis. Proceedings of ICFP 2005: International Conference on Functional Programming (Tallinn), September 2005.

Notes: Acute is designed as an experimental platform to explore the design-space of type-safe distributed programming languages. As a bonus it was written in Fresh O'Caml to gain more experience of using Fresh O'Caml in larger projects. In contrast to systems like Obliq, Facile, JoCaml (and reminiscent of Nomadic Pict's two-level approach to communication), Acute does not specify a particular communication style in the runtime. Instead the system simply provides marshal and unmarshal to and from byte strings. Lots of interesting ideas in here.


Just read: Certification of programs for secure information flow
by Dorothy E. Denning and Peter J. Denning. Published in Commun. ACM, 20(7):504-513, 1977.

Notes: "Information flow control" is a procedure for regulating the dissemination of data amongst program objects. Program objects are assigned to security classes (e.g. top-secret, secret, public) and a policy is represented by a relation between these classes. A "flow" from x to y (i.e. the value of y is influenced by x) is only allowed if (class(x), class(y)) ∈ policy-relation.

This paper proposes a compile-time mechanism for checking programs against information flow policies. Once a program is certified, security policy breaches can only occur through out-of-band mechanisms, e.g. faulty or sabotaged hardware or unmodelled communication through covert channels. Quite an old paper but well worth reading. In particular the discussion of what constitutes a flow between two variables is interesting. Obviously the assignment

  x <- y
causes a flow from y to x. But more subtle flows are also possible, e.g.:
  x <- 0;
  while (y > 0) do
    x <- x + 1;
    y <- y - 1;
  done
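
In the loop above x ends up holding the initial value of y, even though y is never assigned to x: the information leaks through the loop condition. A rough sketch of a static check in this spirit follows. The statement encoding, the three-level ordering in `LEVELS` and the `certify` function are my own assumptions for illustration, not the paper's certification mechanism.

```python
# Assumed policy: classes are totally ordered and information may only
# move to an equal or higher class.
LEVELS = {"public": 0, "secret": 1, "top-secret": 2}


def may_flow(src: str, dst: str) -> bool:
    return LEVELS[src] <= LEVELS[dst]


def certify(stmts, classes, context=()):
    """Return a list of violations as (source variable, target variable) pairs.

    Statements are tuples:
      ("assign", target, [variables read])
      ("while", [variables in condition], [body statements])
    `context` holds variables from enclosing conditions (implicit flows).
    """
    violations = []
    for stmt in stmts:
        if stmt[0] == "assign":
            _, target, reads = stmt
            # Explicit flows come from the right-hand side, implicit
            # flows from every condition guarding this assignment.
            for src in list(reads) + list(context):
                if not may_flow(classes[src], classes[target]):
                    violations.append((src, target))
        elif stmt[0] == "while":
            _, cond_vars, body = stmt
            violations += certify(body, classes, tuple(context) + tuple(cond_vars))
    return violations


# The loop from the post: x <- 0; while (y > 0) do x <- x + 1; y <- y - 1; done
program = [
    ("assign", "x", []),
    ("while", ["y"], [
        ("assign", "x", ["x"]),
        ("assign", "y", ["y"]),
    ]),
]
```

With y classified as secret and x as public, `certify(program, {"x": "public", "y": "secret"})` reports the pair `("y", "x")`; swapping the two classes makes the program pass.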


Just read: Obliq: A Language with Distributed Scope
by Luca Cardelli. June 1994.

Notes:

In lexically scoped languages, the binding location of every identifier is determined by simple analysis of the program text surrounding the identifier. Therefore, one can be sure of the meaning of program identifiers, and can much more easily reason about the behavior of programs.

In Obliq, analysing the program text allows you to discover where a binding is located. When procedures are transmitted, free variables are sent as references back to their defining location. Lexical scoping has a security implication; incoming agents come with their own variable references and have no way of accessing state at the execution site.


Tue, 14 Jun 2005

Just read: The Effect of DNS Delays on Worm Propagation in an IPv6 Internet
by Abhinav Kamra, Hanhua Feng, Vishal Misra and Angelos D. Keromytis. Proceedings of Infocom 2005, March 2005.

Notes: I found this paper linked from the Worm Blog. Memorable quote from the abstract: "It is a commonly held belief that IPv6 provides greater security against random-scanning worms by virtue of a very sparse address space. We show that an intelligent worm can exploit the directory and naming services necessary for the functioning of any network..."

Although they focus on low-level address-scanning worms, they do point out that email worms operate completely independently of the Internet address scheme. They describe models of hypothetical worms which would use pipelined random DNS name lookups (so the worm would constantly guess names like www.somedomain.com) and conclude that they could run almost as fast as raw IPv4 address-scanning worms.

They suggest employing traffic monitoring software near DNS servers to spot dodgy activity. I was wondering if we could usefully restrict access to the name system to slow down these attacks? Perhaps everyone has to authorise lookups via a smartcard/PDA and/or is rate-limited as well? It's a tricky one since obviously there is a tension between making good communication easy while making bad communication difficult...
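
For what it's worth, the rate-limiting half of that idea is simple to sketch as a per-client token bucket sitting in front of a resolver. This is my own illustration rather than anything from the paper; the `LookupLimiter` class, its parameters and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are all assumptions.

```python
class LookupLimiter:
    """Per-client token bucket: `burst` lookups at once, then `rate_per_second`."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.burst = burst
        self.tokens = {}     # client -> tokens remaining
        self.last_seen = {}  # client -> time of the previous lookup

    def allow(self, client: str, now: float) -> bool:
        """Return True if this client may perform a name lookup at time `now`."""
        elapsed = now - self.last_seen.get(client, now)
        self.last_seen[client] = now
        # Refill according to the time elapsed, capped at the burst size.
        tokens = min(self.burst, self.tokens.get(client, self.burst) + elapsed * self.rate)
        if tokens >= 1:
            self.tokens[client] = tokens - 1
            return True
        self.tokens[client] = tokens
        return False
```

A worm guessing names in a tight loop would exhaust its bucket almost immediately, while a person browsing normally would rarely notice the limit. Choosing the rate and burst is exactly the tension described above.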

It does raise the question of why so many computers have names as well as addresses. Since I never want to log into my laptop remotely, it doesn't need a name. However a lot of current applications seem to prefer all IP addresses to have associated names (to help prevent address spoofing?) and suffer large name lookup delays when they don't (like ssh - I had to add an entry to DNS for a laptop yesterday just because of that).


Sun, 22 May 2005

Just read: Rethinking the Service Model: Scaling Ethernet to a Million Nodes
by Andy Myers, Eugene Ng and Hui Zhang. Proceedings of ACM SIGCOMM HotNets, 2004.

Notes: Ethernet networks are popular because they are easy to set up and maintain while the hardware is relatively simple (and therefore cheap) to make. Some ISPs are now offering Ethernet VPNs rather than IP VPNs. Signs are that people want to make bigger Ethernet networks. They note that only recently have large networks with flat address spaces become feasible due to the increase in transistor density and hence memory capacity.

The paper considers what would happen if someone tried to add a million end-systems to Ethernet as it currently stands. It outlines problems with the convergence of the "Rapid Spanning Tree Protocol" (RSTP): RSTP is intended to converge within 3x the worst-case network delay, but in certain configurations it will actually take many seconds, during which the network reverts to inefficient broadcast flooding (they built a simulator to investigate various cases). They argue that the availability of a broadcast primitive has encouraged other protocols to rely on it, causing problems as the network gets bigger. The availability of broadcast also means that RSTP has to be very conservative, never allowing forwarding loops to form. By banning broadcast, this can be relaxed and faster protocols used, which is essential for telecom-style restoration.

Two different approaches are considered: a "thin control plane" and a "distributed control plane". The thin control plane has two aspects: a decision plane and a dissemination plane. The decision plane calculates all the forwarding tables while the dissemination plane talks to the decision plane, sending status information and receiving switch configurations. The distributed control plane has each local bridge offer a registration service in which a rebooting host places its MAC, IP and possibly other service info. The bridges synchronise this information globally. Broadcast protocols like ARP are modified to ask the local bridge where to send the traffic instead of broadcasting it.
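
The distributed control plane is the easier of the two to picture in code. Here is a minimal sketch, assuming a toy `Bridge` class whose directory is copied to every peer as soon as a host registers; the class, the `connect` helper and the eager replication are my assumptions, since the paper's actual synchronisation protocol is not described here.

```python
class Bridge:
    """Toy bridge holding a host directory replicated to its peers."""

    def __init__(self, name: str):
        self.name = name
        self.peers = []
        self.directory = {}  # IP address -> (MAC address, name of registering bridge)

    def register(self, ip: str, mac: str) -> None:
        """Called by a host when it boots; the entry is pushed to every peer."""
        entry = (mac, self.name)
        self.directory[ip] = entry
        for peer in self.peers:
            peer.directory[ip] = entry

    def resolve(self, ip: str):
        """Unicast replacement for an ARP broadcast: ask the local bridge only."""
        return self.directory.get(ip)


def connect(bridges) -> None:
    """Make every bridge a peer of every other bridge."""
    for bridge in bridges:
        bridge.peers = [p for p in bridges if p is not bridge]
```

A host attached to one bridge can then resolve an address registered at another with a single query to its own bridge, and no frame is ever flooded across the whole network.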

They finish by describing simulation results which suggest that these techniques could allow million-node Ethernets.
