Saturday, May 5, 2007

PL/SQL vs. LISP, Scheme, and friends

Getting my Comp. Sci. degree from about as far away from Silicon Valley as you can get in North America, I was curious what the prestigious US schools teach that's so great. So just for kicks, I started going through podcasts for the CS courses at UC Berkeley. The teaching language they start out with is Scheme, a LISP-like language. Walking through the coursework in my head, I couldn't help but draw comparisons with PL/SQL.

There is a section in the PL/SQL User's Guide and Reference on recursion. It was there before I owned the book, and it's still there after I gave it up. Listening to UCB focus on functional programming from the get-go clarified for me exactly what questions that blurb about PL/SQL recursion is trying to answer.

Yes, PL/SQL can do recursion like Fibonacci sequences, where you're passing numbers around. But it isn't really set up for recursion in the LISP or Scheme sense, where you're recursing through a list by chopping elements off at the front or back, and possibly returning a value that is some new set of elements. When you have a PL/SQL collection, the only practical alternative is to loop through all the elements and do arithmetic on the subscripts. (After all, the P in PL/SQL is for "procedural".)

You could fake this kind of recursion to some degree, but PL/SQL's type strictness makes it awkward. Plus with the types of problems you're solving in PL/SQL, you like to visualize exactly how much memory is being allocated and keep things simple so your programs scale well and play nicely with other work happening on the same database server. Functional programming asks you to take a leap of faith that you won't run out of memory, and a logic bug that causes an infinite recursive loop is something to be expected during debugging. Database programming occasionally asks for the same type of faith -- say when querying several big tables joined together -- but the DB programmer expects to be able to look at the distribution of data and the optimizer plan for the query.

Another interesting aspect in Scheme is that functions are first-class items. You can pass a function (rather than a function result) around as a parameter, assign it to a variable, and use it as a member of a composite data structure. It's not quite like the Wild West days of function pointers in K&R C, but it's close.

PL/SQL stored procedures are intimately entangled with the database in ways that don't mesh with that way of thinking. Creating a PL/SQL subprogram is a relatively expensive operation: CREATE OR REPLACE PROCEDURE/FUNCTION is a DDL statement that might block if someone is calling the previous version of that subprogram; DDL commits, so your transaction is over; the source of the subprogram gets squirreled away in a form that can be queried; DDL is syntactically more complicated to issue from a PL/SQL subprogram than DML statements like INSERT or UPDATE. If you have a bunch of subprograms defined and just want to pass symbolic references, you might wind up passing around the procedure or function names as strings and calling them via:

execute immediate variable_name || parameter_list;

But with all the bad guys trying to find ways to hack into a database, that approach raises the possibility of SQL injection attacks, by stuffing something more elaborate into the string used with EXECUTE IMMEDIATE.

Another approach is to define SQL types that inherit different variations of the same subprogram. Instead of passing around a variable X and having your code call X(), you pass around an object X and your code calls X.MEMBER_FUNCTION(). This can do the job if you know in advance all the variations of MEMBER_FUNCTION and can declare them in your code. But the LISP/SCHEME notion of functions returning functions likes to be able to make such things up on the fly, which would involve EXECUTE IMMEDIATE statements running CREATE OR REPLACE statements with big blocks of code. Again, I would stay away from this approach for security and performance reasons.

No comments: