Thursday, May 31, 2007

Less is More, More or Less

Do you fret every time you declare a PL/SQL variable, that you've carelessly allocated a few unnecessary bytes, and that this wastage will be replicated thousands of times as your code is called from a bazillion sessions, and that that will be the straw that breaks the camel's back and makes your server thrash to a halt?

Well, calm down. It's not that bad.

If you don't know exactly how much space is needed for a PL/SQL variable, the most space-efficient thing you can do is counterintuitive: declare a gigantic variable. For example:
email_address varchar2(32); -- Could give error from being too small
email_address2 varchar2(512); -- Probably no error, but wastes space
email_address3 varchar2(32000); -- No error, no wasted space

Once your variable is declared as 2000 characters or more, the variable is no longer statically allocated to accomodate its maximum size; instead, it's dynamically allocated every time it's assigned, based on the exact size that's really needed.
email_address2 := ''; -- Rest of 512 characters are wasted empty space
email_address3 := ''; -- Although declared with length 32000, only 24 characters are allocated

VARCHAR2s declared with huge lengths make great stocking stuffers, such as collection elements. You can make complex data structures that potentially can represent big documents (each line up to 32K), yet don't take up any more space than is really needed.

Now let's look at a complementary technique, where again we keep the allocated size under control.

When you want a variable to hold the contents of a table column, there is a simple way to match the right length:
var1 my_table.my_column%type;

If my_table.my_column is declared VARCHAR2(600), then that's what var1 will be too. No worries about selecting a value from the table and getting a length error. If you run ALTER TABLE to change the length of the column, next time your code is run, it will pick up that change automatically and adjust to the new length.

Maybe your selects are not quite so simple. Say you want to select col1 || col2 and store the results in a variable. How...?
cursor my_cursor is select col1 || col2 concatenated from my_table;
var2 my_cursor%rowtype;

We let Oracle figure out the maximum length of the concatenated columns, then piggyback on that result to make a variable of the same type. (Actually, it's a record variable that has a field of the same type; to refer to the field, we'd specify var2.concatenated.) Again, we're safe from ALTER TABLE.


Eddie Awad linked to this post from his blog, and I see that post attracted some comments that I should address here.

Looks like, in 10g, the limit has been raised from 2000 characters to 4000 characters. I like it when PL/SQL limits align with SQL limits, and the number 4000 is significant because that's the biggest VARCHAR2 you can stuff in a table. So you don't have to worry about declaring a variable of type TABLE_NAME.VARCHAR2_COLUMN%TYPE and having this kind of dynamic allocation happen or not depending on the size of the column.

Anyway, I'll just reiterate my advice from the PL/SQL Guide: don't try to finesse the VARCHAR2 length depending on knowledge of the exact limit; if you prefer dynamic allocation for a variable, use a huge length. I always use 32000. A PL/SQL guru I know uses 32500. (I haven't seen anyone use 3276(7|8), proof that nobody likes to tempt the gods of boundary conditions.)

Performance guru Jonathan Lewis discusses some detailed performance / scalability implications of using big or different-sized VARCHAR2 variables. The kind of code that would use huge VARCHAR2s, in my experience anyway, is not inserting into or querying those values from SQL tables; rather it's building data structures like the buffers used in HTP.P() or DBMS_OUTPUT.PUT_LINE(). So I'll keep that performance tip in mind for DML-intensive code, but not worry so much for code that mainly does string manipulation and displays the results as a web page via mod_plsql.


Joel Garry said...

Shouldn't that be go above 1999 characters?

Trivial, yes, but a boundary condition we are talking about here.

deek102 said...

Hi John,

This is interesting, I didn't know PL/SQL behaved this way! I'm curious what the cost of dynamically allocating that memory is. I wonder which 'costs' more: replicating unncessary bytes or calculating how much to allocate each time. Any thoughts?


Markus said...

I just stumbled across your blog. As a note, the maximum possible length of an email address is 320 characters. 255 is the maximum FQDN length, plus the @, plus 64 maximum characters for user names.

512 and 32,000 is too much overhead.