Systems Class Projects as Science Labs, not Problem Sets

August 26, 2022

I'm doing a new experiment with my BS/MS networking course this semester, and this note is me explaining the rationale behind it. We'll see how it goes.

Computer Networks at CMU is an upper-division systems class. Students entering 15-441/641 (my class) have already taken one introductory systems class, where they learn what concurrency is, how to wrangle gdb, why caches are good, etc. After this intro class, students have to take at least one upper-division systems class (many take more), all of which involve really big projects. And I mean really big: 15-440/640 has students implement their own blockchain protocol; my class has students implement a web server from scratch.

These are cool systems projects, but I have always felt that there is a really big gap between what we ask students to do in my class and what I do in my research lab, or what I hear real engineers talk about doing when they build systems in industry. The gap is, I think, that in class we really treat students as code monkeys rather than engineers. We give them a very specific spec, and then test whether their system does what we told them to do. And that's more or less where the majority of their grade comes from.

Contrast this with what I am teaching in class: Compare and contrast distance vector versus link state routing protocols. Are there situations where HTTP pipelining won't make your web page load any faster? In this network setting, would you expect to get more reliable transmissions using this forward error correction scheme or a retransmission-based scheme?

At no point in our projects do we ask our students to do any reasoning about the systems they are building with regard to performance, reliability, deployability, or any of the other concepts we talk about in class! This is very disappointing.

At the fancy pants private school I work at, we have access to teaching consultants: full-time faculty who spend part of their time teaching, and part of their time teaching other faculty how to teach. I was lucky to be paired up with a consultant who is a chemist, and by talking with her, I realized that what I want from my students is very similar to what she wants her students to learn in chemistry labs. In a chemistry lab, students are not merely graded on whether or not they "got the correct answer" -- as they might be on a problem set for an introductory calculus class. Instead, students are required both to (a) do the sequence of steps the professor told them to, and (b) write a lab report in which they hypothesize about what they will see in their experiments, explain what they saw, and justify their observations relative to the hypothesis.

Systems needs lab reports! This semester, I'm re-architecting all of my class projects to contain a lab report component.

Here's an example using the web server project we have offered for years. Historically, we have graded students on all sorts of nitty-gritty RFC edge cases (Did they return the right error code?). This year, we are only asking students to support two error codes and to build a slightly more bare-bones web server. This will give them more time to work on a few different versions of their server: (a) a version that supports HTTP pipelining, (b) a version that supports concurrent connections, (c) a version that supports both, and (d) a version that supports neither of these optimizations.

After students have built their server, they will test it with different workloads -- pages that have no images, pages that have lots of images, pages that use JavaScript to sequentially load different dependencies, etc. They will also deploy their server on AWS at varying distances from CMU. Before running any of these tests, we will ask students to hypothesize: which web page would you expect to benefit the most from this optimization? Which web page might benefit the least? Students can then run experiments, measure page load times, and evaluate the contribution of each design choice to system performance.
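To make the "measure and evaluate" step a little more concrete, here is a minimal sketch of what the analysis side of such a lab report might look like. This is not our actual course harness: the function names (`time_trials`, `speedups`), the variant labels, and the workload labels are all made up for illustration, and `load_page` stands in for whatever a student uses to fetch a page from their server.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical labels for the four server versions described above.
VARIANTS = ["neither", "pipelining", "concurrent", "both"]


def time_trials(load_page: Callable[[], None], trials: int = 5) -> List[float]:
    """Call load_page repeatedly; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        load_page()  # assumed to block until the whole page has loaded
        durations.append(time.perf_counter() - start)
    return durations


def speedups(
    samples: Dict[Tuple[str, str], List[float]],
    baseline: str = "neither",
) -> Dict[Tuple[str, str], float]:
    """For each (variant, workload), return median baseline time / median variant time.

    A value above 1.0 means the variant loaded that workload faster than the
    baseline did. Every workload must also have samples for the baseline.
    """
    medians = {key: statistics.median(vals) for key, vals in samples.items()}
    result = {}
    for (variant, workload), median_time in medians.items():
        if variant == baseline:
            continue
        result[(variant, workload)] = medians[(baseline, workload)] / median_time
    return result
```

A student would write down a hypothesis first (say, "pipelining helps the image-heavy page most"), then fill `samples` with real measurements keyed by `(variant, workload)` and check whether the speedups agree. I use the median rather than the mean here because a single slow trial over a wide-area link can otherwise dominate the result; that is a judgment call, not a requirement.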

Naturally, there are a bunch of challenges to this: making it clear to students what they are expected to do, figuring out how to grade the lab reports, and making sure our students still get rigorous coding experience alongside their analysis. Hopefully we don't stumble too much, because I'm excited about this shift in philosophy behind our projects. I think that this change will help our students get better at making their own design choices, rather than simply carrying out the design choices made by others like code monkeys. Wish us luck!