Saturday, June 7, 2008

The Goals of a Computer Science Education

I'm far enough along in my UC Berkeley Comp Sci refresher podcasts -- 61A (Structure and Interpretation of Computer Programs), 61B (Data Structures), and most of 162 (Operating Systems) -- that it feels like time to make some blog posts about computer science in general.

After all, one of the perennial topics for argum^H^H^H^H^H discussion is whether it's better to hire a Comp Sci major, a naturally talented coder, or someone with a lot of vocational training and experience. What anyway are the goals of a Comp Sci degree?

IBM has a multidimensional measure of software quality and/or customer satisfaction called CUPRIMD [that link is a PDF file]. Standing for Capability, Usability, Performance, Reliability, Installability, Maintainability, and Documentation. Never one to leave well enough alone, IBM later extended this acronym to CUPRIMDSO by adding measurements for Support and Overall. I contend that the goal of computer science is really all about the CPR in that acronym: capability, performance, and reliability.

For example, CS 61A focuses on capability (how do you write a program to accomplish X?) and reliability (what practices are needed to ensure the answer is right and cover all the possible inputs?). CS 61B focuses on capability (how do you sort, find the highest/lowest value, or transform a data structure?) and performance (is it faster to sort this way or that, what are the best and worst cases to do such-and-such a transformation on this type of data structure?). And CS 162 focuses on reliability (how do we keep a process from writing over another's memory, how do we avoid deadlock?) and performance (how do we keep address translation, resource tracking, et al from grinding the system to a halt?).

I know that all the while, students are exhorted about documentation ("write good comments", "learn Javadoc notation"), and later on, depending on the institution, there might be optional branches off into specialties for technical writing and usability. But those are a bit squishy for the core Comp Sci goals. The best technical writers put their deep knowledge aside to write things that are mostly correct and comprehensive, in words that an eight-grader can understand. The questions "is Ruby on Rails usable?" and "is this ATM touchscreen usable?" lead off in very different directions.

The goal of teaching someone who can make the most reliable, fastest, and fullest-featured program or system runs into some conundrums in the real world.

Today, if you want to enable your company to do something it couldn't do before, you might do a "View Source" on a web page to see how to do a certain effect in CSS, or adopt a new language or code library that solves your toughest problem in one line of code. So capability is not really such a distinguishing factor between people with higher education, natural skill, or on-the-job training.

That leaves us with performance and reliability. That's where the real paradoxes come in.

You might call in someone with a CS degree, who turns an O(N**3) solution into O(N  log N) and so saves untold hours of waiting. Or they might reduce memory requirements so that the underpowered machine devoted to an application suddenly runs like a champ. Or they might network together several servers to turn formerly idle time into productive work.

On the other hand, if you have something that works most of the time but occasionally gives a wrong answer or hangs, the Comp Sci solution may add quite a lot of complexity and overhead to make things 100% reliable. UCB's operating systems course really puts that into perspective, taking me down memory lane from the Commodore 64 days of gleefully smashing the stack, to Motorola 68000 and the Amiga 1000 where there was no paging or swapping and we dreamed of the 68020 with MMU, to the world of today with robust multitasking and memory management all accelerated by hardware. Not a day goes by that I don't say to myself, things were a lot faster in the '80s. :-)

I see echoes of these CS goals in the "tough tech interview"-style word problems that people post from time to time. (C'mon, admit it, you've looked at them.) Hey, this one is really recursive unwinding, that one's a merge sort, I'll bet the clock algorithm would work here, and so on. For the most part, all drawn from those early fundamental CS courses. As an employer, you want your recent grads to really have internalized those lessons, rather than deciding that this theory stuff isn't important in the real world. And you want your industry hires to not have completely forgotten the basics after years of dealing with COTS ("commercial off-the-shelf") software and office politics.

No comments: