Saturday, November 22, 2008

The No-Output Debugging Technique

I run into this situation now and then by accident. If I could just figure out how to bottle it, all the world's software problems would be solved!

Here's the scenario:

- Notice a slight bug in some program output (typically a web page).
- Make the obvious code change, reload the page, bug is still there.
- Make more aggressive code change, making code more modular. Still no dice.
- Add several assertions, improve default assignments. Make control flow easier to follow. Comment the heck out of everything. Still the same bad output.
- Put in some impossible-to-ignore condition: remove the procedure, or always raise an exception, or print some big banner with debug info. Still the same page as before!
- Notice I'm loading the production version of the page instead of the test version. Try the right version with the updated code. Bug was fixed by very first edit.

This is a great way to trick yourself into writing better code.

Tuesday, November 18, 2008

Mystery of the FIRST_ROWS hint

I've always been intrigued by the FIRST_ROWS hint, so I paid special attention when we reached it in the 11g SQL Tuning class. But I'm still puzzled.

The course notes said that although you shouldn't be using hints generally, when you do, FIRST_ROWS is the most useful of the hints (in the form FIRST_ROWS(n) where you specify how many rows to optimize for). Also that it's not effective when the query block contains any "sorting or grouping" operations.

Now, I always assumed that this hint would be used like so:

select /*+ FIRST_ROWS(10) */ * from
select count(*) howmany, x from t group by x order by count(*) desc
where rownum < 11;

which would get you the top 10 items from a particular table. Notice that the ORDER BY is in an inner block, so presumably not covered by the "no sorts in the block with the hint" restriction.

However, when you put a WHERE ROWNUM < (N+1) clause in the query, the plan will show a line with COUNT STOPKEY that presumably means the optimizer knows only N rows are needed. So shouldn't the hint be a no-op in that case? I always figured there must be some other case where it's needed. I've been trying to find some authoritative information.

If the WHERE clause uses a variable, e.g. ROWNUM < N, maybe the hint is just a way of suggesting what the likely value of N is, for purposes of COUNT STOPKEY. But that's just speculation on my part.

I got one suggestion that the FIRST_ROWS hint makes the optimizer spend less time looking for the ideal query plan, since it's not going to traverse the whole table anyway, speeding up the hard parse phase. Could that be the meaning of "optimized for throughput"? One of the SQL internals guys didn't think so.

I got another suggestion that it could really change the execution plan, but you would use it not in a top N situation, rather where you wanted arbitrary values. E.g. I wonder if certain values in my table are uppercase or lowercase, so I'll just look at a few, don't care which ones (and in real life might be some devilishly complicated join):

select /*+ FIRST_ROWS */ my_col from my_table where rownum < 6;

In this scenario, I could imagine a change in the execution plan producing different results, depending on how exactly the query ran through the index. But I don't have confirmation that could ever happen.

If you are getting the top N values from an inner query block, maybe the hint helps figure out how to efficiently process the inner results, which after all aren't a real table with indexes and such. That's just my speculation again.

The SQL Reference states: The optimizer ignores this hint ... in SELECT statement blocks that include any blocking operations, such as sorts or groupings. However, I find explicit advice and examples featuring ORDER BY here in the Oracle Text docs. The PSOUG page on hints is a little more specific in its language -- it mentions a bunch of things that render FIRST_ROWS inoperative, including GROUP BY and DISTINCT but not ORDER BY; and it also uses ORDER BY in its examples.

So, after much thought... I'm back where I started. FIRST_ROWS is useful for doing top N queries... or for filtering top N queries done inside an inner block... or only for getting arbitrary values, not top N at all. It does this by speeding up parsing... or choosing a different explain plan (which might or might not change the order since no sorts are allowed)... or just confirming to the COUNT STOPKEY operation how many results are likely to come back from WHERE ROWNUM < N.

Any performance mavens who can say from hands-on experience which, if any, of these likely stories are true?


I will update the original post, as new answers (or new questions) come in, and with responses to suggestions in the comments.

One other top-N-almost-but-not-quite scenario I sometimes encounter is this: display the top N results, but if there is a tie for Nth place, keep going until you've exhausted all the tied values. It's common for traffic analysis on new or lightly visited sites, where many pages may be tied with 1 visitor, and the value 1 makes the top N rankings. The intuitive, strictly correct thing to do is to include AND howmany >= (subquery to find the value of Nth item; then either bump up the value of N in WHERE ROWNUM < (N+1), or leave out ROWNUM entirely and just stop fetching after getting the top N values plus all successive values tied for Nth. I've never delved deep into what is the best performing way to do this, since like I say it usually happens where relatively small amounts of data are being analyzed. But I could surmise that FIRST_ROWS might help in cases where you omit WHERE ROWNUM < (N+1), yet you stop fetching before exhausting the whole result set.

One of the reasons I stopped including FIRST_ROWS hints in my new queries was that mod_plsql always processes the whole procedure before returning any HTML output. So if there is a query that takes 5 minutes, it doesn't matter if FIRST_ROWS starts sending back output after a few seconds; the user won't see anything until the full 5 minutes are elapsed. (Unlike the client-side GUI situation mentioned by one commenter, where presumably the output could start appearing before all the result set rows were available.)

The Jonathan Lewis blog post referenced by a commenter provides some useful information, but still leaves me wanting more information. I feel like I need a FIRST_ROWS FAQ, with short direct answers -- even if some of them are "it depends". Here are my FIRST_ROWS FAQs and the best answers I have:

Q: Any version dependencies?
A: The optimizer is smarter in 9i, meaning less need for the hint. It's smarter again in 10g (especially as regards WHERE ROWNUM < (N+1) so even less need for the hint.

Q: Any effect on parse time, does FIRST_ROWS result in faster hard parses?
A: Have heard both "think so" and "don't think so".

Q: Does ORDER BY totally negate the performance benefits?
A: It depends. Sometimes the query results come back from an index already in the ORDER BY order, in which case ORDER BY doesn't hurt the potential FIRST_ROWS optimization. If the FIRST_ROWS hint goes on the outer block and ORDER BY is on a subquery... don't know if FIRST_ROWS might still help, or if the sorting in the subquery has already lost the chance to make the outer query any faster.

Q: Are those longstanding Oracle Text examples with ORDER BY wrong or obsolete?
A: Don't know for sure, see above. Might depend on results already being sorted and ORDER BY being a no-op. Might be obsolete based on optimizer improvements in 9i and 10g. Might be immaterial when using Oracle Text for a web-based application through mod_plsql, due to the lack of incremental output.

Q: Any difference when a query includes WHERE ROWNUM < (N+1) using a variable N instead of a constant?
A: Don't know if bind variable peeking plays any role here. I know that I am tempted to use e.g. FIRST_ROWS_10 in cases where the number of items is a variable, but the default or most common case is top 10. However, given all the other factors above (particularly the interaction with ORDER BY), this might just be superstition on my part.

The comments in the Jonathan Lewis blog post discuss different approaches to displaying "top N results" combined with "page X of Y" like in a search engine. I personally prefer the technique "figure out the top N * pages results, then skip all pages prior to the current one". What I do in the Tahiti search engine is just grab the ROWIDs of those top N * pages results, find the desired range of values, then query based on ROWID; so I'm only retrieving the actual data for the 10 or so on the page that's really displayed.

Saturday, November 8, 2008

Things That Make Us Go

I've just finished taking the (EM-based) Performance Tuning course and the SQL Tuning course. It's always a strange feeling taking a course where I already have hands-on experience, or I was around in the early days when standards were being hammered out.

Sometimes that odd feeling (I know this subject backwards and forwards, but I don't know THAT) is a disincentive to take a class. That's probably why I never took a C++ class, after doing the intro to the language with John Vlissides, hearing the early design discussions for the STL, and watching my department mate (and future manager) transcribe the original C++ reference manual to publish with IBM's first C++ compiler.

Other times, behind-the-scenes experience gives a dreamlike quality to the class. Small details become major epiphanies -- THAT'S what that person meant by that remark I didn't believe when they asked me to include it in the documentation. Discussions of timelines kick off a parallel track in my head -- yes, this feature first appeared in that release, but it almost made it in the previous release, or maybe it was there but never exposed, or it was almost called XYZ but changed at the last minute to ABC, and that's why the syntax is the way it is.

My mindset during the performance courses was a mixture of those feelings. I could handwave my way through 95% of the SQL stuff, but didn't know some of the reasons why technique X is just as fast as technique Y; or still believed urban myth Z, which stopped being true some time ago. (Like when SELECT DISTINCT stopped always returning results in sorted order.) I didn't know as much of the EM stuff, but it helped put into perspective why someone might need to know 50 pages worth of internal details just so they could decide whether or not to select one specific checkbox in some dialog.

Just to make things interesting, I took the Enterprise Manager course before the SQL course. The recommended order is the reverse. Sometimes, it's better to wade through the tough stuff first, being thrown into the deep end, and then feel like the easy stuff is a cinch. Like when I first read all those famous science fiction series, for one reason or another I usually read them out of order: Lord of the Rings (3, 1, 2), Dune (2, 3, 1), Riverworld (4, 1, then after that it's too complicated to explain).

My expectation with the SQL Tuning course, was that I would find out I had made dumb newbie decisions about bitmap indexes and function-based indexes all those years ago, and learn that index-organized tables were ideal for some new stuff I'm doing now. Actually, I learned that my original index choices were spot on, and my latest data structures might not be appropriate for IOTs after all.

I will revisit some of my ancient dealings with performance in light of the course material, and write some blog posts about what I find. Back in the late 8i days, I was in charge of the "Indexes" chapter in "Application Developer's Guide - Fundamentals". I was never satisfied with the presentation of bitmap, function-based, reverse key, etc. indexes, but I didn't know enough at the time to be dangerous.

Sunday, November 2, 2008

Don't Fence Me In: All About Constraints

Constraints are simultaneously one of my most favorite and least favorite Oracle Database features. They're great for keeping bad data out of the database. They're a terrible imposition on object-oriented, agile, or <insert your favorite buzzword here> coding style. They save a ton of repetitive coding, writing the same logic in different languages. Hey, we already wrote all that redundant code in 10 languages and we'd hate for all those weeks of debugging to be wasted. Etc. etc.

If you're coming from a Java background, you might think of constraints like a straitjacket that forces your exception handling code to be organized a certain way. But that's the wrong way to think of it. Robust database code must TRY/CATCH for unexpected conditions anyway, in case the power goes out or the hard drive fills up just as you ask to do a completely innocuous INSERT or UPDATE. What if some maverick team member fires up SQL*Plus to make bulk updates to one table, bypassing all the Java application code and its error checking, without understanding the conventions that everyone agreed on for the data values, or the corresponding changes that must be made in other tables? The exception from a constraint is just another error that you expect never to happen, but you don't rely on it not happening.

If you're coming from a SQL, PL/SQL, or maybe more of a scripting background, or if your team programs in different languages or consists of fallible human beings, you can think of constraints like your very own bridge-keeper from "Monty Python and the Holy Grail". Whoever would insert or update table T must answer me these questions three, else an exception raised shall be.


  • If Sir Lancelot and Sir Robin both give the same answers, a UNIQUE constraint will catapult Sir Robin into the Gorge of Eternal Peril.
  • If Sir Galahad can't name a favorite color, a NOT NULL constraint will send him to the same fate.
  • If King Arthur gives the wrong airspeed velocity of an unladen swallow, a CHECK constraint will spell his end.

As a bonus...

  • If you don't count correctly when intoning the ritual, the Holy Hand Grenade of Antioch won't work because of its PRIMARY KEY constraint. (It's probably best to count to 3 via a sequence, and not write a psalm about that special value. In the database world, the ritual is different every time.)
  • If you are a French knight and receive a gift of a Trojan Rabbit, a FOREIGN KEY constraint will keep it out of your fort.

When I first joined Oracle, almost the first thing I had to do was document constraints in 8i, in the Application Developer's Guide - Fundamentals. That was a frustrating experience for me as an Oracle newbie. There was information about "benefits of constraints" -- but who exactly were we trying to convince, or was that just marketing fluff? You could set up constraints with CREATE TABLE or ALTER TABLE -- why one or the other? Who exactly would be issuing all these DDL statements to work with constraints, a developer or a DBA? Here was technology leaps and bounds ahead of the competition, but as a result, there was maybe less depth of understanding about how that feature would be used in the real world.

Years later, now I (think I) know some of the answers, but it's too late to rearrange and rewrite the documentation to my whims. (I make suggestions with the reader comment forms in the 11g library, same as anyone else.)

The "benefits" information is aimed at the crowd who like to put all their error-checking logic in the middle tier. If anything, I'd like to have more simple examples to extol the benefits, particularly for scenarios involving Java client code, to get a feel for best practices for exception handling.

The CREATE TABLE syntax is used by the experienced enterprise types who plan everything out in advance. The ALTER TABLE syntax is used by those who came around to the idea of constraints after their coding was already underway, or who want to experiment before deciding. (I call this the "try it and see" style of development.) I'd like to be able to bridge from one style to the other: here is the ALTER TABLE syntax, and if you decide this is a worthy change, here's the equivalent CREATE TABLE syntax to make in your installation script.

There's at least one counterintuitive case, where careful advance planning requires you to use ALTER TABLE instead of CREATE TABLE. If you want to periodically turn off constraint checking for a primary key, you must create the table without the primary key clause, create an index on the primary key column, and do an ALTER TABLE to add the primary key. The primary key will then use your index, which must not be a UNIQUE index so that later you can "defer" the primary key checking.

The extra keywords in the CREATE TABLE / ALTER TABLE syntax for constraints indicate how dogmatic you are about keeping bad data out of your tables. DEFERRABLE means you may some time in the future allow bad data in the table for a brief period, but you're confident you can fix any problems before committing. INITIALLY DEFERRED means you expect to do this cleanup the very first time you insert data, or that the table already contains bad data and you'll clean it up immediately after creating the constraint. NOVALIDATE means you don't want to test the data that's already in the table -- either you're really confident that it's good, or you don't care if there are problems -- but you do want to check future data that's inserted or updated.


When/why would you use all of those keywords? I guess it boils down to a combination of convenience and performance. Consider a development scenario, where you insert personal data that has minor problems with the data, perhaps bad capitalization in some of the names. You could run a fixer function as the data goes into the table, either by calling the function in the INSERT statement, coding it into a SQL*Plus datafile, or by setting up a trigger. But (a) each of those is less convenient than just loading the data and then running 1-2 UPDATE statements afterward, and (b) maybe you want to start crunching numbers that don't rely on the textual columns ASAP, so you don't want the cleanup processing to slow down the initial data load.

My favorite analogy here is to XML processing. We've all been given XML documents that were invalid for some ticky-tacky reason... an unquoted attribute like width=100, or an uppercase tag name like <P> in an XHTML document that requires lowercase tags. It's cumbersome to do anything with those documents as XML, the "constraint checking" is so tightly integrated into the parsers. Oracle SQL lets you choose how "constraining" such easily detected/fixed data inconsistencies should be -- fatal error, try again, or don't check until I've cleaned up the data.

The ability to rename constraints is another offshoot of the "try it and see" development methodology. If you create a constraint without giving it a name, it might be years before bit rot sets in and somebody actually tries to insert bad data. At that point, the error message will cite some unintelligible name like SYS_C000123. After some detective work querying the USER_/DBA_/ALL_CONSTRAINTS views, you'll track down that this name refers to some primary key, check constraint, or what have you, and you'll use the ALTER TABLE ... RENAME CONSTRAINT syntax to give the constraint a descriptive name, so future error messages will be easier to understand. Again, I haven't found a clear path in the documentation that leads from troubleshooting examples of ALTER TABLE syntax, back to the equivalent CREATE TABLE examples or syntax so you can plan ahead next time.

As to who actually needs the information about constraints, that's still fuzzy for me. The world is full of application developers who don't take enough advantage of stored procedures, triggers, and constraints. It's also full of DBAs who aren't allowed to poke around with business logic in application code. So anywhere the constraint best practices are documented, it never feels to me like they are reaching quite the right receptive and empowered audience. There's no "Oracle Database Jack-of-All-Trades User's Guide" for the person who fills both developer and DBA roles.

Further reading: