Archive for the 'Papers' Category

Refereed Paper Track

Tuesday, November 15th, 2005

Early on in the life of this blog, I talked a bunch about how the papers in the Refereed papers track were actually chosen for the LISA conference. But dear reader, I’ve done you a disservice by not actually talking about which papers were selected.

One of the things that makes LISA special and different from other conferences is the research and investigation into system administration that gets presented. I don’t know of anywhere else that publishes peer-reviewed work like this designed to advance the field.

Here are a few examples I’m pulling randomly from the technical session listing:

  • Fast User-Mode Rootkit Scanner for the Enterprise, Yi-Min Wang and Doug Beck, Microsoft Research (seen anything in the news about rootkits lately?)
  • A Case Study in Configuration Management Tool Deployment, Narayan Desai, Rick Bradshaw, Scott Matott, Sandra Bittner, Susan Coghlan, Rémy Evard, Cory Lueninghoener, Ti Leggett, John-Paul Navarro, Gene Rackow, Craig Stacey, and Tisha Stacey, Argonne National Laboratory (what is deploying this stuff really like?)
  • Reducing Downtime Due to System Maintenance and Upgrades, Shaya Potter and Jason Nieh, Columbia University (reducing downtime something you’ve been asked to do at your job?)
  • A1: Spreadsheet-based Scripting for Developing Web Tools, Eben M. Haber, Eser Kandogan, Allen Cypher, Paul P. Maglio, and Rob Barrett, IBM Almaden Research Center (if you’ve suspected spreadsheets could be useful for something beyond number crunching, here’s how to do sysadmin with them)
  • Manage People, Not Userids, Jon Finke, Rensselaer Polytechnic Institute (identity management issues buzzing louder at your workplace these days?)

Register for the conference and tech sessions here.

Picking Papers, Part 4

Thursday, August 4th, 2005

(be sure to read parts one, two and three so you know where we are in the story)

I wish I could say that we held the program committee meeting in some marble-adorned Hall of Justice edifice but it was, in truth, a standard hotel conference room in San Francisco. There I and most of the program committee members got down to the serious business of deciding which papers were to make the cut.

The first step of this process was relatively easy. The initial accept/cut round started off by looking at a graph I had brought to the meeting. A couple of days before we met I wrote a small Perl script to scrape the final paper scores from the webreview system and plot them with GD::Graph, producing a graph that looked very similar to this:

[graph: LISA paper scores]

[note: I don’t have a problem showing you the scores like this because it would be impossible for an author or anyone else to guess a paper’s score given the rest of the process we’re about to see.]

From this graph we were able to determine a high cutoff score and a low cutoff score. Papers at the high cutoff or better were moved into the initial accept pile; papers at the low cutoff or below were moved into the initial reject pile. The program committee members were then asked to make sure there weren’t papers rejected by score that deserved consideration and papers provisionally accepted that shouldn’t be there for any reason.
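To make the cutoff step concrete, here is a minimal Python sketch of that first sorting pass. It is not the script used at the meeting; the function name, the paper IDs, the scores and the two cutoff values are all made up for illustration.

```python
def triage(scores, high_cutoff, low_cutoff):
    """Split papers into initial accept, reject, and discuss piles by score."""
    piles = {"accept": [], "reject": [], "discuss": []}
    # Walk the papers from highest score to lowest.
    for paper, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
        if score >= high_cutoff:
            piles["accept"].append(paper)    # at the high cutoff or better
        elif score <= low_cutoff:
            piles["reject"].append(paper)    # at the low cutoff or below
        else:
            piles["discuss"].append(paper)   # the contenders in the middle
    return piles


# Hypothetical scores and cutoffs; the real values came from the webreview system.
example_scores = {"paper-a": 4.6, "paper-b": 3.1, "paper-c": 1.8, "paper-d": 4.0}
example_piles = triage(example_scores, high_cutoff=4.0, low_cutoff=2.0)
```

Both outer piles are only provisional: the committee still gets to pull a paper out of either one before the discussion of the middle group begins.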

With the first chunk of accepts and rejects done, we then moved on to considering the contenders in the middle. This typically takes the most time because the decisions tend to get harder and harder as you proceed (with a commensurate increase in the amount of discussion time per paper). This year’s meeting was no exception; most of the day was spent working on this set of submissions.

If you’ve never been on a conference program committee before, you may not realize just how carefully and seriously each submission is considered in this process. Example weighty questions that get asked and answered include “How similar is this paper to ones we’ve already accepted? Does this paper have lasting value to the field? Is this paper too specific? Not specific enough? Will it spark a useful discussion in the community? Can it be the start of an interesting line of investigation by future papers?”

We worked our way down the list from the highest scoring papers to the lowest until all but about a fifth of the remaining slots had been filled. At that point the process changed and the program committee members were asked to go through the remaining pool, and if they chose, through any of the previous decisions made, to mark any papers they deemed worthy of more discussion irrespective of score. We then worked our way through this final pool and filled the remaining spots in the program.

It’s a hard job that comes with considerable responsibility, but I think this year’s program committee (with the help of our external reviewers) did a superb job with the selection of papers. I’m very thankful I could be part of this process.

And so that, my friends, is how papers were picked at LISA 2005.

Picking Papers, Part 3

Monday, July 25th, 2005

Continuing where we left off after parts one and two, the reviewers are just about to get to work. They log in to the web review system and see the list of papers assigned to them. They also see a much longer list of papers they could review should they have time and energy after the assigned papers are completed.

The actual review entered for each submission consists of two parts: scores and comments. Let’s look at each separately:

Reviewers are asked to assign several scores to each submission:

  • Overall marks
  • Confidence Level: How confident is the reviewer in her or his ability to rate that paper?
  • Technical Quality: The quality of the technology being documented in the paper.
  • Editorial Quality: How good is the paper as a paper?
  • Suitability: Does this paper fit the LISA conference?

These scores are only used by the program committee as part of the selection process. Authors never actually see their scores.

What authors do see, however, are the comments each reviewer submits for a paper. Reviewers can submit three kinds of comments:

  • comments to the author(s)
  • comments to the program committee
  • comments to the program chair (me)

This lets each reviewer be as candid as possible. They can leave good process comments like “recommend only accepting this paper if it gets heavy editing help” or “important topic but the paper is a bit weak. Only accept if there is not another paper on this topic.”

As program chair, my primary job while all of this reviewing is going on is to make sure things are going smoothly–namely, that all papers are receiving the proper amount of attention (in the form of completed reviews). The webreview system can show the reviews received for each paper but it doesn’t have a good heads-up display. As you probably guessed, I wound up writing yet another Perl script to scrape the info from the webreview system and generate a simple HTML status table. The table used green, yellow and red cells to show which papers had and had not received adequate coverage. This script was run every 15 minutes from cron to generate a status page I could check at will. I let the reviewers see this page as well so they would know where to focus their attention next after completing their assigned submissions.
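For the curious, a status table like that can be sketched in a few lines of Python. This is only an illustration, not the Perl script described above. The scraping step is left out, and the thresholds are assumptions: the post doesn't say what counted as green, yellow or red, so this sketch treats a paper that has reached a target number of reviews as green, one with some reviews as yellow, and one with none as red.

```python
from html import escape


def coverage_color(completed, target):
    """Map a completed-review count to a traffic-light color (assumed thresholds)."""
    if completed >= target:
        return "green"
    if completed > 0:
        return "yellow"
    return "red"


def status_table(review_counts, target=3):
    """Render an HTML table with one row per paper and a colored coverage cell."""
    rows = []
    for paper, completed in sorted(review_counts.items()):
        color = coverage_color(completed, target)
        rows.append(
            f"<tr><td>{escape(paper)}</td>"
            f'<td style="background-color: {color}">{completed}/{target}</td></tr>'
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Hypothetical review counts keyed by paper title.
example_html = status_table({"Paper One": 3, "Paper Two": 1, "Paper Three": 0})
```

A crontab entry along the lines of `*/15 * * * * python3 status.py > status.html` (script and file names hypothetical) would regenerate the page every 15 minutes.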

After several weeks of hard work by the reviewers, it was time to close down the review process and begin to get ready for the meeting where the final decisions would be made. That meeting will be the subject of our next entry on this topic…

Picking Papers, Part 2

Thursday, July 14th, 2005

When we last left off in part 1, the last-minute torrent of submissions had ceased and it was review’n time.

LISA’s paper track is peer-reviewed. We’re very lucky to have a community of highly experienced peers willing to help out with an important process like this. After the call went out for volunteers, we had ~35 people offer to review papers in addition to the 12 program committee members.

The first step in the setup process was the assignment of categories in the webreview system. I read through every single submission and tagged each one with one or more categories. For example, some of the categories used were “theory / process,” “security,” “network management / monitoring / configuration.” These categories are used in the paper assignment process.

Accounts in the webreview system were then created for each reviewer. This is normally a manual procedure but sysadmins like myself aren’t particularly keen on drudgery. I did what any self-respecting sysadmin would do and wrote a (Perl) script to automate the process. It used HTML::TableExtract, one of my favorite modules, to scrape the list of reviewers/email addresses from our internal planning wiki (more on this wiki in another post), created a reasonably secure, pronounceable password with Crypt::GeneratePassword, filled in the account web form with WWW::Mechanize, and then mailed the password and reviewing instructions to each person with Mail::Mailer.
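If you want a feel for the password step, here is a small Python stand-in. It is not the algorithm Crypt::GeneratePassword uses; it simply strings together random consonant-vowel syllables and appends two digits, which is one easy way to get something a person can read aloud. The function name, the alphabets and the default length are all assumptions made for this sketch.

```python
import secrets

# Letters chosen for this sketch; a few easily confused ones are left out.
CONSONANTS = "bcdfghjkmnpqrstvwxz"
VOWELS = "aeiou"


def pronounceable_password(syllables=5):
    """Build a password from random consonant-vowel syllables plus two digits."""
    parts = [
        secrets.choice(CONSONANTS) + secrets.choice(VOWELS)
        for _ in range(syllables)
    ]
    digits = f"{secrets.randbelow(100):02d}"
    return "".join(parts) + digits


# Example: one password per (hypothetical) reviewer address.
example_passwords = {
    email: pronounceable_password()
    for email in ["reviewer1@example.org", "reviewer2@example.org"]
}
```

The `secrets` module draws from the operating system's secure random source, which is what you want for anything that ends up being a login credential.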

Each reviewer was then asked to log on to the system and set their review area preferences. Review areas are just another name for the categories I had assigned to each paper earlier. Reviewers are asked to identify the areas within which they feel competent/experienced to review and those they can’t review. This helps to make sure that papers are reviewed by people who believe they have expertise in that paper’s subject.

The first step towards assigning papers actually consists of deciding which papers can’t be assigned. All known conflicts of interest are entered into the system at this point. For example, reviewers are not allowed to even view the reviews of papers they have written or taken part in helping to construct. Throughout the whole process, we’re very careful to keep this boundary intact.

Finally it is time to assign papers. This consists of matching up people’s review area preferences to paper categories while still respecting conflicts of interest. Given the number of reviewers, review areas/categories and submissions, this can be a heck of a manual process because the webreview system has no support for this matching at all.

Yup, Perl to the rescue again. I took a couple of days to write a matching script. The end result:

% wc -l match.pl
720 match.pl

Though a respectable number of those 720 lines are comments and self-generating documentation (so another program chair can use it), it’s still a pretty honkin’ script. I found out while writing it that this turns out to be a non-trivial problem to solve. One has to perform a match while taking care to also observe all of the review area preferences. For example, if someone says they will review things in categories A and C but not B, and a paper is tagged with all three categories, the matching process has to make sure not to assign the paper to the person because of the negative preference. Finding an optimal search order when performing the match was another thing that gave me pause.
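The negative-preference rule is easy to state in code. Below is a minimal Python sketch of just that eligibility check, not a piece of match.pl. The preference labels ("yes", "no", "dontcare"), the function name, and the choice to treat an unlisted category as "dontcare" are assumptions made for illustration.

```python
def is_eligible(reviewer_prefs, paper_categories, really_care=False):
    """Return True if the reviewer wants at least one category and vetoes none.

    reviewer_prefs maps a category name to "yes", "no", or "dontcare".
    With really_care=True, a "dontcare" answer is treated as a "no".
    """
    wanted = False
    for category in paper_categories:
        pref = reviewer_prefs.get(category, "dontcare")
        if pref == "no" or (really_care and pref == "dontcare"):
            return False  # a single negative preference rules the paper out
        if pref == "yes":
            wanted = True
    return wanted


# The example from the text: yes to A and C, no to B.
prefs = {"A": "yes", "B": "no", "C": "yes"}
blocked = is_eligible(prefs, {"A", "B", "C"})   # False: B is vetoed
allowed = is_eligible(prefs, {"A", "C"})        # True
```

A real matcher would also have to check conflicts of interest before considering a reviewer at all.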

This script was also written with the desire to do what-if calculations. I wanted to be able to ask “if each reviewer gets assigned a max of 8 submissions but no less than 4, how many reviewers would each submission get, given everyone’s preferences?” And truth be told, there are a number of bells and whistles I couldn’t resist including. It can autodownload the data out of the webreview system, do the match, display the results in pretty tables and then actually commit the results back to the webreview system (saving me literally hundreds of clicks). Here’s the usage just to give you a final taste of the script:

match [options]
Options:
--fetch: fetch a new copy of the data files <off>
--potential: display all potential assignments <off>
--commit: actually post the assignments in webreview <off>

--user=username: user name for chair (required for fetch or commit)
--pass=password: password for chair (required for fetch or commit)

--random: avoid the smarter assignment strategies <no>
--reallycare: if reviewer says “don’t care” treat that as a no <no>

--maxperpaper=#: max # of reviewers assigned to each paper <5>
--minperpaper=#: min # of reviewers assigned to each paper <5>
--maxperperson=#: max # of papers assigned to each reviewer <5>
--minperperson=#: min # of papers assigned to each reviewer <4>

--help|manual: display summary or doc

(Hey, I promised you behind-the-scenes, here they are!)
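To give a flavor of what a what-if run looks like, here is a simple greedy sketch in Python. It is not the strategy match.pl uses (the post doesn't describe that in detail); it just handles the papers with the fewest eligible reviewers first and hands each paper to the least-loaded reviewers available. All names and data are hypothetical, the eligibility sets are assumed to already respect preferences and conflicts of interest, and a per-person minimum is not enforced.

```python
def what_if(eligible, max_per_paper, max_per_person):
    """Greedily assign reviewers to papers and return (assignment, load).

    eligible maps each paper to the set of reviewers allowed to review it.
    """
    load = {}
    assignment = {}
    # Handle the most constrained papers first so they are not starved.
    for paper in sorted(eligible, key=lambda p: (len(eligible[p]), p)):
        available = [r for r in eligible[paper] if load.get(r, 0) < max_per_person]
        # Prefer the least-loaded reviewers to spread the work evenly.
        available.sort(key=lambda r: (load.get(r, 0), r))
        chosen = available[:max_per_paper]
        for reviewer in chosen:
            load[reviewer] = load.get(reviewer, 0) + 1
        assignment[paper] = chosen
    return assignment, load


def under_covered(assignment, min_per_paper):
    """List the papers that ended up with fewer reviewers than the minimum."""
    return sorted(p for p, revs in assignment.items() if len(revs) < min_per_paper)


# Hypothetical data: three papers, three reviewers.
example_eligible = {
    "p1": {"ann", "bob"},
    "p2": {"ann"},
    "p3": {"ann", "bob", "cy"},
}
example_assignment, example_load = what_if(
    example_eligible, max_per_paper=2, max_per_person=2
)
example_gaps = under_covered(example_assignment, min_per_paper=2)
```

Re-running with different limits and looking at which papers come up short is the whole what-if loop: tweak the numbers until the gaps disappear and nobody is overloaded.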

Phew. I ran this script until I found a set of values that provided good coverage for all papers and even distribution of work to reviewers. Once I found them, I ran the script and told it to commit the results back to the webreview system. Held my breath, everything worked great, and we were off to the races…

We’ll pick this saga up next time with how reviews actually work.

Picking Papers, Part 1

Monday, July 11th, 2005

I think one of the most mysterious parts of a conference like this is how the papers in the refereed paper tracks get picked. Let me see if I can de-mystify that process for you. It is a little bit of a tale, so sit back and let me tell you the story. To tell this right, I have to go back to the beginning so let’s hop in the way-back machine and start there…

Once upon a time (November and December 2004, actually), the program committee and I worked hard to pull together the CFP for this year’s conference. The CFP is either the Call for Papers or Call for Participation depending on who you ask. It provides the official rules about paper submissions. At the same time we also created the latest version of the author guidelines.

Both of these documents took a fair amount of effort because of a change in this year’s preferred format for submissions. After a bunch of discussion with the LISA community starting at last year’s conference and with the current program committee, it was decided that draft papers would become the new preferred format over the previous standard of extended abstracts. There’s a whole slew of arguments both for and against this idea that I can blog about at a later time if you are interested. Suffice it to say, we thought it was reasonable to try this shift for a year and see what happens.

Shortly before the CFP was published, the official submission system/web review system for the conference was brought online by the fine folks at USENIX. If you are interested in how that system works, hang on for a bit because you’ll be hearing plenty about it soon.

So, we get the CFP ready, we publish it to the web, let the LISA community know about it, throw open the door to the submission system and….

nothing.
lots of nothing.
the big donut.

If you thought there might be hordes of people just waiting to submit papers (like I did), you thought wrong. Our first submission came at the very end of March 2005. It turns out that most papers (and this is considered normal for LISA) were submitted within 24 hours of the May 10th submission deadline.

All in all, we received 52 submissions. This is down from the ~70 of the previous year (which itself was down from the year before that). Why the decrease? I wish I knew. Was it the change in preferred format? Increased speed of life? Dunno, theories welcome.

Ok, so at this point we have 52 submissions to review. How do papers get picked? For that, you’ll have to see the next entry.