Guru++

November 16th, 2005

(here’s another burning-hot news nugget that will be posted on the official conference site within a few days)

This year our Guru-is-in Coordinator did such an amazing job at finding great speakers that we found ourselves with an overflow of great speakers on topics we knew attendees would want to hear about.

We know that people really like these sessions because they offer a chance to interact with some of the heavy-hitters in an area. Imagine getting advice on (to pick some at random):

  • backups from Curtis Preston
  • Samba from Jerry Carter
  • OSX from Jordan Hubbard
  • OpenSSL from Ben Laurie
  • Solaris 10 features (dtrace, zones, ZFS) from key members of the Sun engineering team

This level of access to the best in the field was too cool for us to limit, so we decided to do everything we needed to make sure we could accommodate the bonanza. As a result, you’ll find that we’ve split some of the sessions and added extra session rooms to the schedule wherever we could. Going to be a heck of a track this year.

Register for the conference and the tech program here.

Yup, We Used a Wiki

November 16th, 2005

I’m mindful of the promise I made when I started this blog to reveal how things really work behind the scenes at the LISA conference. Here’s another little tidbit in that category.

Each program chair gets to choose her or his method for a) keeping track of all of the kerjillions of details associated with planning the program and b) coordinating all the communication that has to take place between the various volunteers during the planning process.

I knew email alone wasn’t going to cut it. Sure, we exchanged tons of it (and continue to keep port 25 hot) but email doesn’t provide an easy repository of information for the entire set of people involved to reference/edit. A collection of email also doesn’t do a good job of representing the current state at any one time (e.g. a concise summary of what was decided on a particular subject). We really needed a canonical data store and rendezvous point for all of the activity taking place.

I chose to put up a Wiki for the organizers to use because it seemed to be a perfect application for that model. Originally I planned to use MediaWiki because that’s what Wikipedia uses but a test instance ran way too slow for me when I tried it. (This speed problem, I’m certain, was a reflection of the particular way I had it set up and has nothing to do with the software itself). At that point I found TWiki and decided to use it instead because it looked fairly mature and my Perl background would be helpful if I needed to hack/debug it.

In general we used a stock version of the production release with only a few small tweaks beyond the normal configuration work:

So how well has this worked for us? Here are my impressions as program chair; other conference organizers who were in the process may have different opinions:

Bad: several of the committee members despise wikis (for reasons that I never quite investigated), but they managed to use it through gritted teeth just to humor me. I thank them for this, though I wasn’t happy using a method that other people didn’t like. (On the flip side, several committee members really dig wikis, so perhaps it all evened out?).

Less good: As an experiment, I tried using the wiki as an annotation tool. I spent a bunch of time converting the results of the 2004 LISA attendees survey into wiki format, commented the heck out of it, and then asked other people to add their own annotations. The wiki was not the ideal tool for this, probably because of its (intentional) lack of tools to deal with structured content. Ideally you’d like to be able to freeze the source document and just let people freely edit comments on it. I’m sure I could have hacked TWiki into doing a better job of this à la QuickTopic Document review, but I didn’t spend the time. [As an aside: yes, we really do pay attention to the results we get back from the survey we ask attendees to fill out. I spent a lot of time poring over it to help me understand ways to improve this year’s conference. Please fill out the 2005 version so my successor also has that kind of data.]

Less good: The wiki got used at times as a discussion/conversation forum about various topics. People would leave comments on the content (sometimes interleaved into the content itself) or in response to other people’s comments. While this worked ok, my experience was this tended to get unwieldy or just graphically unpleasant. Perhaps this was just bad typographical conventions on our part, but my experience with wikis in general suggests they don’t function nearly as well as threaded discussion forums do for this purpose. We never got to the point where one had to perform archaeological forays to find things (as one does sometimes on TWiki.org) but that was a likely endpoint. I took a very strong hand at times and shuffled off past discussion to separate wiki pages whenever it seemed appropriate.

Good: Besides the issue with discussions mentioned in the last item, the wiki functioned well for general brainstorming (e.g. where people contribute to a list of items).

Very good: The wiki functioned well in two cases: scheduling and status reporting. I find the process (especially with people this busy) of scheduling meetings and conference calls via email to be pretty painful. It worked very well to ask everyone to go to a wiki page and edit a pre-made page where they could indicate their availability. Yes, this could have been done with either polling software or calendar software, but this was a quick and easy (for me) solution that didn’t require a bunch of round-trip email transactions. Similarly, we used the EditTable plugin mentioned above to allow people to easily report back the status of papers as they moved through the shepherding process.

Very good: (all of the stuff regarding ad-hoc content creation/editing wikis are good for)

I had thoughts about bringing in other cool web-based planning software (e.g. Basecamp) into the picture, but I ran out of time to deal with the infrastructure (vs. just planning the conference). Still, I’m pretty happy with what we did use and I plan to apply the knowledge I gained from working with the software to my day job.

Refereed Paper Track

November 15th, 2005

Early on in the life of this blog, I talked a bunch about how the papers in the Refereed papers track were actually chosen for the LISA conference. But dear reader, I’ve done you a disservice by not actually talking about which papers were selected.

One of the things which makes LISA special and different from other conferences is the research and investigation into system administration that gets presented. I don’t know of any place else that publishes peer reviewed work like this designed to advance the field.

Here’s a few examples I’m pulling randomly from the technical session listing:

  • Fast User-Mode Rootkit Scanner for the Enterprise, Yi-Min Wang and Doug Beck, Microsoft Research (seen anything in the news about rootkits lately?)
  • A Case Study in Configuration Management Tool Deployment, Narayan Desai, Rick Bradshaw, Scott Matott, Sandra Bittner, Susan Coghlan, Rémy Evard, Cory Lueninghoener, Ti Leggett, John-Paul Navarro, Gene Rackow, Craig Stacey, and Tisha Stacey, Argonne National Laboratory (what is deploying this stuff really like?)
  • Reducing Downtime Due to System Maintenance and Upgrades, Shaya Potter and Jason Nieh, Columbia University (reducing downtime something you’ve been asked to do at your job?)
  • A1: Spreadsheet-based Scripting for Developing Web Tools, Eben M. Haber, Eser Kandogan, Allen Cypher, Paul P. Maglio, and Rob Barrett, IBM Almaden Research Center (if you’ve suspected spreadsheets could be useful for something beyond number crunching, here’s how to do sysadmin with them)
  • Manage People, Not Userids, Jon Finke, Rensselaer Polytechnic Institute (identity management issues buzzing louder at your workplace these days?)

Register for the conference and tech sessions here.

An Evening with MAKE Magazine

November 15th, 2005

There’s a really close intersection between the sort of people who attend LISA and those who read MAKE magazine. We all love to build stuff and tinker.

That’s the thinking behind a special evening program on Monday night we’ve arranged for this year’s LISA. We’ve invited two frequent contributors to the magazine (one of whom is on their tech advisory board) to come and talk to us about their work and the philosophy behind the new resurgence of do-it-yourself-ing behind MAKE. They’ll also be bringing a bunch of stuff for attendees to see and play with after their talks.

(added bonus: the first 100 people attending the talk will receive a free copy of the magazine, courtesy of the nice people at MAKE magazine. If you’ve never seen the magazine, here’s a special online sampler for people reading this blog.)

Here’s the info on the talks for this special evening:

Talk I: Tweaking, Bending, and Making: Stories of a Hardware Hacker
Joe Grand, Grand Idea Studio, Inc.

Never before has the do-it-yourself ethos been so popular. Bolstered by loose-knit communities of curious tinkerers and O’Reilly’s new quarterly MAKE magazine, tweaking, hacking, and bending have all but reached the mainstream. Behind the projects lie individuals with the drive to make something better, to modify a product to do something it was never intended to do, or to just create something out of the ordinary. This approach to problem solving should be familiar to the USENIX community. 


In this fun and light-hearted session, Joe Grand, electrical engineer and obsessed inventor, will tell his story and that of MAKE magazine. Armed with some interesting, wacky, and/or curious hardware hacks, Joe will provide a show-and-tell that will hopefully motivate you to embrace the Maker mindset in your own lifestyle. 


Joe Grand is the President of Grand Idea Studio, Inc. (www.grandideastudio.com), a San Diego-based product research, development, and licensing firm, where he specializes in the invention and design of consumer electronics, video game accessories, and toys. Joe is the author of several books, including Hardware Hacking: Have Fun While Voiding Your Warranty and Game Console Hacking. He is on the Technical Advisory Board and is a Contributing Writer for MAKE magazine. 


Joe is also a globally recognized figure in computer security. He has testified before the United States Senate Governmental Affairs Committee and is a former member of the legendary hacker collective L0pht Heavy Industries. Joe holds a Bachelor of Science degree in Computer Engineering from Boston University.

Talk II: Hacking Silicon: Secrets From Behind the Epoxy Curtain
Bunnie Huang, bunnie studios, LLC

I’ll talk about basic methods and theory behind silicon hacking:

  • motivation
  • examples of silicon-based security
  • overview of methods for decapsulating silicon chips
  • methods for imaging chips
  • theory behind deciphering silicon chips (briefest introduction)
  • practical example of hacking a PIC microcontroller to recover data from security fused regions

Bunnie Huang (www.bunniestudios.com) has a strong background in silicon design and reverse engineering. bunnie completed his PhD at MIT on computer architecture, with an emphasis on the big-picture silicon implementation issues of large scale parallel machines. During the course of his studies, bunnie reverse engineered cryptographic keys out of the Xbox hardware and published his findings in CHES (Cryptographic Hardware and Embedded Systems) and in a book titled Hacking the Xbox. bunnie’s professional experience in silicon design (which includes 802.11b/Bluetooth radios, 10 Gigabit transceivers, CMOS photonics, and various prototype chips for silicon devices research) combined with his reverse engineering expertise gives him a unique perspective on silicon hacking.

Register for the conference here.

New guru added: Ben Laurie

November 15th, 2005

Hoo boy, we’re getting really close to the conference (early bird reg discount ends in just three days!). Even at this late date we’re still working hard to add cool stuff to make this LISA conference extra cool. Here’s one piece of news that’s so hot-off-the-wire that it hasn’t hit the official conference website yet.

Ben Laurie has agreed to be a speaker in our guru-is-in track. You probably know his name because he’s been a key contributor to several of the projects you deal with daily (Apache/Apache-SSL and OpenSSL ring any bells?). He’s also done some tremendously cool security-related work like the Bluetooth attacks (see his blog and homepage for other examples).

Here’s the official bio:

Ben Laurie is the Director of Security at The Bunker Secure Hosting. He is the author of Apache-SSL as well as serving as an Apache core team and board member, and an OpenSSL core team member.

Register for the conference and the tech program here.

Training Spotlight 3: Network Security Monitoring with Open Source Tools

October 26th, 2005

Speaking of amazing teacher opportunities, Richard Bejtlich, formerly of Foundstone and well known for his network security blog, is offering a training class on using the cornucopia of open source security tools for network security monitoring. There are a tremendous number of tools now available in this space which makes keeping up with them tricky. Having someone like Bejtlich tell you which ones to pay attention to and how to best use them is a superb jumpstart.

More info on the training class here. Register for the conference and this class here.

Training Spotlight 2: Ethereal and the Art of Debugging Networks

October 26th, 2005

I know that I break out Ethereal at the first sign of trouble on my network. I’ve used it to deal with security issues, client-server problems, and all sorts of other hairy situations. Though I’ve gotten pretty facile with it over the years through sheer trial-and-error, I’ve always wondered just how much more effective I’d be if I had someone who really knew what she or he was doing showing me the ropes.

Now you have the chance. Jerry Carter, one of the hardest working people in the training biz, is teaching a class on just this subject (Ethereal and the Art of Debugging Networks). He’s been a core member of the Samba team (who know more than anyone should have to know about protocols on a wire) and an LDAP/Kerberos guru so he has tremendous Ethereal chops.

More information on the training class here. Register for the conference and this class here.

Training Spotlight 1: Understanding Configuration Management

October 26th, 2005

LISA (and the other USENIX conferences) are well known for the quality of their training/tutorial sessions. Highly practical and timely, they are a good place to pick up the info you need to be on top of the latest tech to do your job. The classes are taught by some of the top people in our field (who often are around during the rest of the conference for side questions and conversations).

Dan Klein, the training program coordinator, does his best to make sure that each conference brings with it new and exciting classes. This year’s LISA is no exception so I thought I would highlight some of the new classes that caught my eye in this blog entry and the next few entries.

This year I noticed that Mark Burgess (the only full professor of network and system administration I know of) is teaching a new intro class in configuration management (Understanding Configuration Management). This is like offering a beginning animation class with Will Eisner (RIP) or Hayao Miyazaki. Burgess is the author of cfengine and has been an active researcher at the forefront of configuration management for many years. It’s a heck of an opportunity if you are interested in configuration management at all.

More info on the training class here. Register for the conference and this class here.

Guru Spotlight: Jordan K. Hubbard

September 19th, 2005

If you’ve been in the business for a while you probably recognize the name Jordan K. Hubbard. (If you’ve been in the business as long as I have, you may even remember it from the comp.sources.unix days).

Hubbard was one of the co-founders of FreeBSD and one of the reasons why that project has developed into the well-respected operating system it is today [just fyi, the words you are reading are being served off FreeBSD boxes].

I still remember when it was announced in 2001 that Hubbard joined Apple to work on Darwin. It was at that point that I knew we were in for some interesting and substantial stuff out of Apple. Turns out I was right [he says while typing on an Apple Powerbook].

For the Guru-is-In track, we do our best to bring someone who really knows their stuff on a topic. For Mac OSX I’m sure you’ll agree we found the very epitome of the term “guru” when you hear that we’re privileged to have Hubbard as the Mac OSX guru-is-in speaker at LISA 2005.

Here’s Jordan Hubbard’s bio from our program:

Jordan Hubbard is the Director of UNIX Technology, CoreOS, at Apple Computer. He has been a software developer since the late 70’s and is a longtime contributor to the open source community, from the earliest days of USENET’s comp.sources.unix group, through MIT’s X11 contributed software collection, to the FreeBSD Project, which he co-founded in 1993. These days, he focuses on the day-to-day development of Mac OS X and, more generally, on Apple’s open source strategy and its relationship with traditional UNIX developers and administrators. His current pet count, for those who follow such things, is 10 cats and 4 dogs.

The LISA Conference Network

September 8th, 2005

If you want to know how the tech behind-the-scenes at LISA works, the right guy to talk to is Tony Del Porto, the USENIX system administrator and conference network administrator. He’s the laid-back, ultra-capable guy you see moving at warp speed during the conference keeping everything running.

I was curious about what it took to provide a network for a conference full of sys and netadmins so I asked Tony to describe the setup he uses for LISA. Here’s what Tony, the sysadmin’s sysadmin, wrote back:

David asked me to talk a little about the USENIX conference LAN. I’ve tried to limit the following to the bits that are somewhat unique to a conference network, and LISA at the Town and Country specifically though most of it applies to every USENIX conference LAN.

The most crucial bit of the LISA ‘05 conference LAN is the internet connection, without which there really isn’t much point in having a network. Attendees used to corporate LANs or cable modems don’t think twice about downloading ISOs or pulling large chunks of code from CVS while at a conference, so having plenty of bandwidth is an obvious primary concern. ISP contracts being what they are, USENIX can’t order up a T3 for a week, or even a month, so we’re largely reliant on what the venue has to offer. The T&C has a shiny new T3 which is wonderful compared to the T1 we’ve used in previous years.

Second to the connection is the site network infrastructure and how much leave I have to use and alter it. The T&C is a property (meeting planners call hotels “properties”) USENIX has visited many, many times and, unlike some properties I’ll not mention, is very accommodating in granting access to its infrastructure. The T&C’s ethernet isn’t great, but isn’t non-existent either. There are always challenges in making a network that is designed to work a certain way work the way I need it to. Most of the resolutions to those challenges involve me on my hands and knees taping down several hundred feet of CAT5. Don’t walk barefoot at a conference. Trust me. The T&C requires three such runs of cable to work around the way the room the router sits in is wired. Why not move the router to some central location, you ask? Access. The main wiring closet of the hotel is in a locked cage that only a few people have unrestricted access to, and I don’t number among them.

A bit on the hardware and software I use. The “router” for the conference LAN in past years has been an 800MHz PIII Dell laptop with three interfaces running OpenBSD. A little over a year ago I discovered the hard way that PCMCIA cards are pretty limited in the amount of traffic they can handle. Thus the current “router” is a 700MHz PIII desktop with a gigabit interface for the conference LAN and a four port Soekris card for the internet connection and registration LAN. A note on the Soekris card: it suffers buffer underruns under load. I have a cron job that ifconfigs the active interfaces up and down every five minutes. The next conference router will not have a four port Soekris card.
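[Editor's aside: the cron workaround described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual job used on the conference router; the interface names are hypothetical, since the post does not say which ones were bounced, and the script only prints the commands unless told otherwise.]

```python
import subprocess

# Hypothetical interface names; the post does not name the real ones.
ACTIVE_INTERFACES = ["sis0", "sis1"]


def bounce_commands(interfaces):
    """Return the ifconfig down/up command pair for each interface, in order."""
    commands = []
    for name in interfaces:
        commands.append(["ifconfig", name, "down"])
        commands.append(["ifconfig", name, "up"])
    return commands


def bounce(interfaces, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for command in bounce_commands(interfaces):
        if dry_run:
            print(" ".join(command))
        else:
            # Needs root on the router; raises if ifconfig reports an error.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    bounce(ACTIVE_INTERFACES)
```

Scheduling it every five minutes would then be a one-line crontab entry such as `*/5 * * * * /usr/local/sbin/bounce_ifaces.py` (the path is made up for this example).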

The “router” runs the usual collection of network software: BIND 9, ISC dhcpd (the OpenBSD version), Squid, and an ftp proxy. NAT, packet filtering and redirection are done by OpenBSD’s packet filter, PF. A laptop provides a second DNS server for the network and doubles as a router and firewall for the hands-on security training classroom. The Squid proxy has been voluntary at past conferences but became transparent for our Security conference. 400 people on a 1.1Mbit DSL line without caching is not pretty.
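[Editor's aside: to put a number on "not pretty", here is a back-of-the-envelope calculation of the even per-person share of a link. The 1.1 Mbit/s and 400-person figures come from the text above. The T1 and T3 numbers are the nominal line rates, and reusing the 400-person head count for them is purely an assumption for comparison, since LISA attendance is not given here.]

```python
def per_user_kbit(link_mbit, users):
    """Even share of a link in kbit/s per user, ignoring caching and overhead."""
    return link_mbit * 1000 / users


# Figures from the post: 1.1 Mbit/s DSL shared by 400 people.
dsl_share = per_user_kbit(1.1, 400)

# Nominal line rates: T1 = 1.544 Mbit/s, T3 = 44.736 Mbit/s.
# The 400-user head count is reused here only as an assumption.
t1_share = per_user_kbit(1.544, 400)
t3_share = per_user_kbit(44.736, 400)

for label, share in [("DSL", dsl_share), ("T1", t1_share), ("T3", t3_share)]:
    print(f"{label}: {share:.2f} kbit/s per person")
```

The DSL case works out to about 2.75 kbit/s per person if everyone is active at once, which is why a transparent cache starts to look attractive.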

Wireless hardware is a collection of four old Aironet 4800 series access points, five Cisco 1200 series access points, and an Airport Base Station for small isolated meetings. The T&C presents more of a challenge than most venues because it is so spread out, thus requiring more hardware than any other property. The LISA conference format recently changed such that training and technical sessions happen on the same days which requires additional hardware. In short, I need more access points for LISA at the T&C than any other USENIX conference, and I don’t have them. I’ve tried using borrowed SOHO access points, but they fall apart with more than about 10 active connections. The Ciscos can handle 40 to 50 active connections on a single power outlet and ethernet connection. If you have spare Cisco gear lying idle you’d like to lend to the network please let me know.

What I do is based on the work of many others, my own experience, and the suggestions of attendees at each USENIX conference. LISA is the most challenging USENIX conference as its attendees use the most bandwidth, use “security evaluation” tools the most, and have the highest percentage of laptop usage. LISA is also the USENIX conference I learn the most at, and have the most fun at. This year a network team is forming to provide additional services on the Conference LAN. If you are interested in helping or have an idea for a service to provide, please send e-mail to wireless at usenix dot org.

LISA ‘05 Web Site and Registration Live

September 7th, 2005

Casey (of the super-cool USENIX production staff) informs me that the official web site for the conference, complete with registration is now live at: http://www.usenix.org/events/lisa05/.

Be sure to check back periodically (and watch this blog) because we’ll be adding more stuff to the site as the conference continues to gel. But in the meantime if you want to jump on the early registration discount, now’s a great time to do it!

Blog Spun Up

September 6th, 2005

And…we’re back. With that icky vacation thing out of the way I can get back to keeping you in the thick of LISA 2005. Have a number of things queued to tell you about, will post them as soon as I can. In the meantime, comments have been turned back on, so have at it.

P.S. Note to self:

Dear Self,

Maybe that vacation thing without the computer thing wasn’t so bad. Probably should try to replicate the experiment in another few years to be sure this wasn’t a fluke.

Blog on Power Save mode

August 23rd, 2005

For the first time in at least 5 years (perhaps more), I’m going to be taking a one week vacation during which I’ll be completely off the net. I’m going to try to avoid anything computer-like. If I start to have too many withdrawal symptoms I suspect I’ll find myself constructing a turing machine simulator out of seashells and kelp.

Ok, sorry, back to the point. I’m going to turn off comments on this blog for the duration of my absence just because I won’t be around to keep an eye on things. Please save up your pith for my return when we’ll resume our usual program.

P.S. For those of you who are curious, the new word I used in this post was vacation. Be sure to go to that page and click on the link to have the word read to you if you haven’t heard it spoken in a long time.

Training and Guru Spotlight: Virtualization

August 23rd, 2005

One of the key things the organizers of LISA have tried to do is keep our ear to the ground for emerging topics our attendees need to know about (or will need to know about). It is pretty clear that we are going back to the future because the topic of Virtualization seems to be coming up more and more these days.

The old guard from the Big Iron days is sure to be amused by a new generation of sysadmins discovering the value of running services in a consolidated virtual machine framework. But this trend is not just a rehash of Ye Olde MVS or other architectures from the glory days of mainframes. The new crop of virtual machine environments and current day service requirements are bringing new challenges to our field.

This year LISA will offer a full three hours of training plus a Guru-is-In session with John Arrasjid and John Gannon from VMware to help people deal with the new challenges around Virtualization. Here are their bios to give you an idea of their qualifications in this realm:

John Y. Arrasjid has 20 years experience in the Computer Science field. His experience includes work with companies such as AT&T, Amdahl, 3Dfx Interactive, Kubota Graphics, Roxio and his own company, WebNexus Communications, where he developed consulting practices and built a cross platform IT team. John is currently a senior member of the VMware Professional Services Organization as a Consulting Architect. John has developed a number of service offerings focused on Performance Management, Security, and Disaster Recovery and Backup. John earned his Computer Science degree at SUNY Buffalo.

John Gannon has over ten years of experience architecting and implementing UNIX, Linux, and Windows infrastructures. John has worked in network engineering, operations, and professional services roles with various organizations including Sun Microsystems, University of Pennsylvania, Scient Corporation, and FOX Sports. John is currently responsible for delivering server consolidation, disaster recovery, and virtual infrastructure solutions to VMware’s Fortune 500 clients. John received a BS degree in Computer Science Engineering from the University of Pennsylvania.

Invited Talk Spotlight: Dan Kaminsky

August 10th, 2005

Dan is the author of one of my favorite sysadmin-related hacks of all time. I can still remember the glee I felt when I heard he had found a way to tunnel SSH over DNS (yes, you heard right) and had provided the code to do it. Later that year at BlackHat, he showed not just SSH, but audio and video streaming through DNS.

I came to learn this was just one of a series of tremendously creative ideas in security that put him permanently on my “people to watch” list. Other examples included innovative work in port scanning and network visualization. He’s also known for work on the practical application of some of the new attacks on MD5. I understand this year he made some waves with the announcement of a security scan that showed 230,000 DNS servers are still potentially vulnerable to DNS cache poisoning.

I’m delighted that Dan has accepted an invitation to speak at LISA 2005. We might even get him to demonstrate some of the cool DNS hacking I mentioned plus some of the new stuff he has up his sleeve. On top of this, Dan has agreed to address the questions that system and network administrators must deal with when faced with these and other mind-blowing security hacks if (or more likely when) they appear on your network.

Here’s the official blurb for the talk:

There is the set of functionality we expect from our network. There’s the set of functionality your network is capable of. These two sets are not identical. This talk will explore security risks you may not even be aware your network is exposed to and will demonstrate new techniques for managing those risks. Mechanisms will be discussed for:

  • Establishing video-capable tunnels over DNS (and detecting such tunnels)
  • Evading intrusion detection systems by exploiting IP’s lack of statelessness
  • Reliably auditing internet-scale networks
  • Visualizing complex network activity

See Dan’s web site for a flavor of the sort of stuff you’ll be hearing at the LISA 2005 conference.

Picking Papers, Part 4

August 4th, 2005

(be sure to read parts one, two and three so you know where we are in the story)

I wish I could say that we held the program committee meeting in some marble-adorned Hall of Justice edifice but it was, in truth, a standard hotel conference room in San Francisco. There I and most of the program committee members got down to the serious business of deciding which papers were to make the cut.

The first step of this process was relatively easy. The initial accept/cut round started off by looking at a graph I had brought to the meeting. A couple of days before we met I wrote a small Perl script to scrape the final paper scores from the webreview system and plot them via GD::Graph, producing a graph that looked very similar to this:
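[Editor's aside: the original script was Perl with GD::Graph and is not reproduced here. As a rough stand-in, this Python sketch shows the same two steps, parsing scores and drawing them sorted from highest to lowest. Everything specific in it is an assumption: the "paper_id score" input format, the 5-point scale, the made-up sample data, and the text bars used in place of a real graph.]

```python
def parse_scores(text):
    """Parse lines of the (hypothetical) form 'paper_id score' into a dict."""
    scores = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        paper_id, score = line.split()
        scores[paper_id] = float(score)
    return scores


def text_plot(scores, width=40, max_score=5.0):
    """Render one text bar per paper, highest score first."""
    rows = []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for paper_id, score in ranked:
        bar = "#" * round(score / max_score * width)
        rows.append(f"{paper_id:>6} {score:4.2f} {bar}")
    return "\n".join(rows)


# Made-up sample data, not real LISA scores.
SAMPLE = """
# paper_id score
p01 4.5
p02 2.0
p03 3.25
"""

if __name__ == "__main__":
    print(text_plot(parse_scores(SAMPLE)))
```

Sorting before plotting is what makes the natural breaks in the score distribution easy to spot by eye.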

[Graph: LISA paper scores]

[note: I don’t have a problem showing you the scores like this because it would be impossible for an author or anyone else to guess a paper’s score given the rest of the process we’re about to see.]

From this graph we were able to determine a high cutoff score and a low cutoff score. Papers at the high cutoff or better were moved into the initial accept pile; papers at the low cutoff or below were moved into the initial reject pile. The program committee members were then asked to make sure there weren’t papers rejected by score that deserved consideration or papers provisionally accepted that shouldn’t be there for any reason.
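[Editor's aside: the first-pass sort described above amounts to a three-way split. Here is a minimal Python sketch of it; the cutoff values and scores are invented for illustration, and the real cutoffs were chosen by eye from the graph and then reviewed by the committee, not applied mechanically.]

```python
def triage(scores, high_cutoff, low_cutoff):
    """Split papers into initial accept, discuss, and initial reject piles.

    At or above high_cutoff: initial accept.
    At or below low_cutoff: initial reject.
    Everything in between goes to discussion.
    """
    accept, discuss, reject = [], [], []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for paper_id, score in ranked:
        if score >= high_cutoff:
            accept.append(paper_id)
        elif score <= low_cutoff:
            reject.append(paper_id)
        else:
            discuss.append(paper_id)
    return accept, discuss, reject


# Invented scores and cutoffs, purely for illustration.
example_scores = {"p01": 4.5, "p02": 2.0, "p03": 3.25, "p04": 4.0, "p05": 1.5}
accept, discuss, reject = triage(example_scores, high_cutoff=4.0, low_cutoff=2.0)
print("accept:", accept)
print("discuss:", discuss)
print("reject:", reject)
```

Each pile comes out ordered from highest to lowest score, which matches the order in which the middle pile was later worked through.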

With the first chunk of accepts and rejects done, we then moved on to considering the contenders in the middle. This typically takes the most time because the decisions tend to get harder and harder as you proceed (with a commensurate increase in the amount of discussion time per paper). This year’s meeting was no exception; most of the day was spent working on this set of submissions.

If you’ve never been on a conference program committee before, you may not realize just how carefully and seriously each submission is considered in this process. Example weighty questions that get asked and answered include “How similar is this paper to ones we’ve already accepted? Does this paper have lasting value to the field? Is this paper too specific? Not specific enough? Will it spark a useful discussion in the community? Can it be the start of an interesting line of investigation by future papers?”

We worked our way down the list from the highest scoring papers to the lowest until all but about a fifth of the remaining slots had been filled. At that point the process changed and the program committee members were asked to go through the remaining pool, and if they chose, through any of the previous decisions made, to mark any papers they deemed worthy of more discussion irrespective of score. We then worked our way through this final pool and filled the remaining spots in the program.

It’s a hard job that comes with considerable responsibility, but I think this year’s program committee (with the help of our external reviewers) did a superb job with the selection of papers. I’m very thankful I could be part of this process.

And so that, my friends, is how papers were picked at LISA 2005.

    Keynote details finalized

    August 4th, 2005

    I’m delighted to say that the details for the LISA 2005 keynote have been finalized. We’re very lucky to have Dr. Qi Lu, Vice President of Engineering of Yahoo! Inc., open the conference.

    Scaling Search Beyond the Public Web
    What’s next in “search?” Scaling, fault tolerance, and storage management become a lot more exciting when we go from the colossal scale of Yahoo! to the challenges of searching not just the public web, but your desktop, email, bookmarks and other repositories of information like your on-line communities. This talk introduces Yahoo!’s personal and social search initiative, and focuses on technology infrastructure that can store, index and search user and community content on a massive scale. Specific topics also include storage management, fault tolerance, metrics and real-time monitoring, and much more.

    Dr. Qi Lu is a VP of Engineering of Yahoo! Inc. responsible for the technology development of Yahoo!’s Search and Marketplace business unit, which includes the company’s search, e-commerce, and local listings businesses and products. Prior to joining Yahoo! in 1998, Dr. Lu was a Research Staff Member at IBM Almaden Research Center. Before that, Dr. Lu worked at Carnegie Mellon University as a Research Associate, and at Fudan University in China as a faculty member. He holds 20 US patents, and received his BS and MS in Computer Science from Fudan University and PhD in Computer Science from Carnegie Mellon University.

    Picking Papers, Part 3

    July 25th, 2005

    Continuing where we left off after parts one and two, the reviewers are just about to get to work. They log in to the web review system and see the list of papers assigned to them. They also see a much longer list of papers they could review should they have time and energy after the assigned papers are completed.

    The actual review entered for each submission consists of two parts: scores and comments. Let’s look at each separately:

    Reviewers are asked to assign several scores to each submission:

    • Overall marks
    • Confidence Level: how confident is the reviewer in her or his ability to rate that paper.
    • Technical Quality: the quality of the technology being documented in the paper.
    • Editorial Quality: how good is the paper as a paper?
    • Suitability: does this paper fit the LISA conference?

    These scores are only used by the program committee as part of the selection process. Authors never actually see their scores.

    What authors do see, however, are the comments each reviewer submits for a paper. Reviewers can submit three kinds of comments:

    • comments to the author(s)
    • comments to the program committee
    • comments to the program chair (me)

    This lets each reviewer be as candid as possible. They can leave good process comments like “recommend only accepting this paper if it gets heavy editing help” or “important topic but the paper is a bit weak. Only accept if there is not another paper on this topic.”

    As program chair, my primary job while all of this reviewing is going on is to make sure things are going smoothly: namely, that all papers are receiving the proper amount of attention (in the form of completed reviews). The webreview system can show the reviews received for each paper but it doesn’t have a good heads-up display. As you probably guessed, I wound up writing yet another Perl script to scrape the info from the webreview system and generate a simple HTML status table. The table used green, yellow and red cells to show which papers had and had not received adequate coverage. This script was run every 15 minutes from cron to generate a status page I could check at will. I let the reviewers see this page as well so they would know where to focus their attention next after completing their assigned submissions.
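
    To give a feel for the color-coded status table, here is a minimal sketch in Python rather than the Perl I actually used. It is not the real script: the scraping step is left out, and the two thresholds (TARGET_REVIEWS and MINIMUM_REVIEWS), the function names and the sample data are all made up for illustration.

```python
from html import escape

# Hypothetical thresholds: the post does not say how many reviews count as "adequate".
TARGET_REVIEWS = 4   # green at or above this
MINIMUM_REVIEWS = 2  # yellow at or above this, red below


def coverage_color(completed):
    """Map a completed-review count to a traffic-light color."""
    if completed >= TARGET_REVIEWS:
        return "green"
    if completed >= MINIMUM_REVIEWS:
        return "yellow"
    return "red"


def status_table(review_counts):
    """Render {paper title: completed reviews} as a simple HTML table."""
    rows = []
    for title, completed in sorted(review_counts.items()):
        color = coverage_color(completed)
        rows.append(
            f"<tr><td>{escape(title)}</td>"
            f'<td style="background-color: {color}">{completed}</td></tr>'
        )
    header = "<tr><th>Paper</th><th>Reviews completed</th></tr>"
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>"


if __name__ == "__main__":
    # Made-up sample data standing in for what the scraper would collect.
    print(status_table({"Paper A": 5, "Paper B": 3, "Paper C": 0}))
```

    Redirect the output to a file under the web server’s document root and schedule it with a crontab minute field of */15 to get the every-15-minutes refresh described above.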

    After several weeks of hard work by the reviewers, it was time to close down the review process and begin to get ready for the meeting where the final decisions would be made. That meeting will be the subject of our next entry on this topic…

    Keynote details coming soon!

    July 17th, 2005

    Thought I would take a few moments away from a Harry Potter upgrade and the picking papers tome to mention that we have a great keynote speaker confirmed. As soon as the details of the talk get nailed down, I’ll post them here. You’ll get the inside skinny well before the official announcement.

    Here’s an (obtuse) clue (.wav file).

    Picking Papers, Part 2

    July 14th, 2005

    Last we left off in part 1, the last minute torrent of submissions had ceased and it was review’n time.

    LISA’s paper track is peer-reviewed. We’re very lucky to have a community of highly-experienced peers willing to help out with an important process like this. After the call went out for volunteers, we had ~35 people offer to review papers in addition to the 12 program committee members.

    The first step in the setup process was the assignment of categories in the webreview system. I read through every single submission and tagged each one with one or more categories. For example, some of the categories used were “theory / process,” “security,” “network management / monitoring / configuration.” These categories are used in the paper assignment process.

    Accounts in the webreview system were then created for each reviewer. This is normally a manual procedure but sysadmins like myself aren’t particularly keen on drudgery. I did what any self-respecting sysadmin would do and wrote a (Perl) script to automate the process. It used HTML::TableExtract, one of my favorite modules, to scrape the list of reviewers/email addresses from our internal planning wiki (more on this wiki in another post), created a reasonably secure, pronounceable password with Crypt::GeneratePassword, filled in the account web form with WWW::Mechanize, and then mailed the password and reviewing instructions to each person with Mail::Mailer.
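
    For the curious, the “pronounceable password” idea can be sketched in a few lines of Python. This is not what Crypt::GeneratePassword does internally and not the code I ran; it is a simpler stand-in that strings together random consonant-vowel syllables and two digits. The function name and letter sets are my own invention for this example.

```python
import secrets

# Letter sets chosen for this illustration only.
CONSONANTS = "bcdfghjklmnprstvz"
VOWELS = "aeiou"


def pronounceable_password(syllables=4):
    """Build a password from random consonant-vowel syllables plus two digits."""
    parts = [
        secrets.choice(CONSONANTS) + secrets.choice(VOWELS)
        for _ in range(syllables)
    ]
    digits = f"{secrets.randbelow(100):02d}"
    return "".join(parts) + digits


if __name__ == "__main__":
    print(pronounceable_password())
```

    The secrets module draws from the operating system’s random source, which is what you want for anything password-like. Four syllables is an arbitrary default; more syllables make the password harder to guess at the cost of being harder to remember.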

    Each reviewer was then asked to log on to the system and set their review area preferences. Review areas are just another name for the categories I had assigned to each paper earlier. Reviewers are asked to identify the areas within which they feel competent/experienced to review and those they can’t review. This helps to make sure that papers are reviewed by people who believe they have expertise in that paper’s subject.

    The first step towards assigning papers actually consists of deciding which papers can’t be assigned. All known conflicts of interest are entered into the system at this point. For example, reviewers are not allowed to even view the reviews of papers they have written or taken part in helping to construct. Throughout the whole process, we’re very careful to keep this boundary intact.

    Finally it is time to assign papers. This consists of matching up people’s review area preferences to paper categories while still respecting conflicts of interest. Given the number of reviewers, review areas/categories and submissions, this can be a heck of a manual process because the webreview system has no support for this matching at all.

    Yup, Perl to the rescue again. I took a couple of days to write a matching script. The end result:

    % wc -l match.pl
    720 match.pl

    Though a respectable number of those 720 lines are comments and self-generating documentation (so another program chair can use it), it’s still a pretty honkin’ script. I found out while writing it that this turns out to be a non-trivial problem to solve. One has to perform a match while taking care to also observe all of the review area preferences. For example, if someone says they will review things in categories A and C but not B, and a paper is tagged with all three categories, the matching process has to make sure not to assign the paper to the person because of the negative preference. Finding an optimal search order when performing the match was another thing that gave me pause.
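
    The negative-preference rule from the A, B and C example is easy to state in code. Here is a minimal Python sketch (not an excerpt from match.pl); the function name, its arguments and the use of sets are assumptions made for illustration.

```python
def can_review(paper_categories, wants, refuses, has_conflict=False):
    """Decide whether a reviewer is eligible for a paper.

    paper_categories: set of categories the paper is tagged with
    wants: set of categories the reviewer is willing to review
    refuses: set of categories the reviewer says they can't review
    has_conflict: True if a conflict of interest has been recorded
    """
    if has_conflict:
        return False
    # A single refused category vetoes the paper, even if others match.
    if paper_categories & refuses:
        return False
    # Otherwise at least one category has to be one the reviewer wants.
    return bool(paper_categories & wants)


if __name__ == "__main__":
    wants, refuses = {"A", "C"}, {"B"}
    print(can_review({"A", "B", "C"}, wants, refuses))  # vetoed by B
    print(can_review({"A", "C"}, wants, refuses))       # eligible
```

    The order of the checks matters: conflicts and refusals are tested before any positive match, so a paper tagged A, B and C is turned down even though two of its three categories are welcome.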

    This script was also written with the desire to do what-if calculations. I wanted to be able to ask “if each reviewer gets assigned a max of 8 submissions but no fewer than 4, how many reviewers would each submission get, given everyone’s preferences?” And truth be told, there are a number of bells and whistles I couldn’t resist including. It can autodownload the data out of the webreview system, do the match, display the results in pretty tables and then actually commit the results back to the webreview system (saving me literally hundreds of clicks). Here’s the usage just to give you a final taste of the script:

    match [options]
    Options:
    --fetch: fetch a new copy of the data files <off>
    --potential: display all potential assignments <off>
    --commit: actually post the assignments in webreview <off>

    --user=username: user name for chair (required for fetch or commit)
    --pass=password: password for chair (required for fetch or commit)

    --random: avoid the smarter assignment strategies <no>
    --reallycare: if reviewer says "don't care" treat that as a no <no>

    --maxperpaper=#: max # of reviewers assigned to each paper <5>
    --minperpaper=#: min # of reviewers assigned to each paper <5>
    --maxperperson=#: max # of papers assigned to each reviewer <5>
    --minperperson=#: min # of papers assigned to each reviewer <4>

    --help|manual: display summary or doc

    (Hey, I promised you behind-the-scenes, here they are!)

    Phew. I ran this script until I found a set of values that provided good coverage for all papers and even distribution of work to reviewers. Once I found them, I ran the script and told it to commit the results back to the webreview system. Held my breath, everything worked great, and we were off to the races…
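
    To make the what-if idea concrete, here is a small Python sketch of one way to try a set of limits and see which papers end up short of reviewers. It is a plain greedy heuristic written for this post, not the strategy match.pl uses, and it makes no claim to be optimal. The names assign and under_covered, and the sample reviewers, are hypothetical.

```python
from collections import defaultdict


def assign(eligible, max_per_paper=5, max_per_person=5):
    """Greedily assign reviewers and return {paper: [reviewers]}.

    eligible maps each paper to the reviewers allowed to review it
    (preferences and conflicts of interest already applied).
    """
    load = defaultdict(int)  # papers assigned to each reviewer so far
    assignments = {}
    # Hardest-to-cover papers go first so scarce reviewers aren't used up elsewhere.
    for paper in sorted(eligible, key=lambda p: (len(eligible[p]), p)):
        # Prefer the least-loaded reviewers to spread the work evenly.
        candidates = sorted(eligible[paper], key=lambda r: (load[r], r))
        chosen = [r for r in candidates if load[r] < max_per_person]
        chosen = chosen[:max_per_paper]
        for reviewer in chosen:
            load[reviewer] += 1
        assignments[paper] = chosen
    return assignments


def under_covered(assignments, min_per_paper):
    """List the papers that received fewer reviewers than the minimum."""
    return sorted(p for p, r in assignments.items() if len(r) < min_per_paper)


if __name__ == "__main__":
    eligible = {
        "p1": ["ann", "bob"],
        "p2": ["ann"],
        "p3": ["ann", "bob", "cat"],
    }
    for limit in (1, 2):
        result = assign(eligible, max_per_paper=2, max_per_person=limit)
        print(limit, result, under_covered(result, min_per_paper=2))
```

    Rerunning with different limits and looking at the under-covered list is the same loop I went through by hand: adjust the numbers, check coverage, repeat until both the papers and the reviewers look reasonable.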

    We’ll pick this saga up next time with how reviews actually work.