Don’t depend on expression short circuiting in T-SQL (not even with CASE)

There are a fair number of blog posts and forum discussions regarding expression short circuiting in T-SQL. Some of the most authoritative posts, like this one, come to the following conclusions: (a) You cannot depend on expression evaluation order for things like “WHERE <expr1> OR <expr2>“, since the optimizer might choose a plan that evaluates the second predicate before the first one. But, (b) order of evaluation of the expressions in a CASE statement is fixed, so you can depend on deterministic short circuit evaluation of a CASE statement. For example, this wouldn’t protect you from a divide-by-zero error:

    WHERE (@value = 0) OR ((col1 / @value) = 2)

But the idea is that this variation is functionally-equivalent, and should protect you from the error:

    WHERE 
        CASE 
            WHEN (@value = 0) THEN 2 
            ELSE (col1 / @value)
        END = 2

Before now that’s the advice I would have offered, too. But I just ran into a situation where a CASE statement does not provide predictable short circuiting. Here’s a simplified repro:

    ALTER FUNCTION dbo.test_case_short_circuit (@input INT)
    RETURNS TABLE 
    AS 
    RETURN (
        SELECT calculated_value = 
            CASE 
                WHEN @input <= 0 THEN 0
                ELSE LOG10 (@input)
            END
    );
    GO

    SELECT * FROM dbo.test_case_short_circuit (0);
    GO

This fails with this error:


Msg 3623, Level 16, State 1, Line 2
An invalid floating point operation occurred.

The LOG10 function raises this error when its input is 0 or a negative value. In some cases it appears that the plan may still evaluate the expression in the second CASE branch even when it won’t be using the value. This is a case where CASE doesn’t provide deterministic short circuiting.

I want to make sure no one takes away the conclusion that SQL Server doesn’t support expression short circuiting. It definitely does. It’s just that you don’t have explicit control over the order of expression evaluation — even with CASE, apparently. And if you’re going to depend on short circuiting, you need a deterministic order of expression evaluation.

What can you do about it? One option would be to always scrub things so that an error isn’t possible even when the CASE branch’s output won’t be used. For example, using “ELSE LOG10 (CASE WHEN @input <= 0 THEN 1 ELSE @input END)” in the repro script doesn’t change the behavior of the function, but avoids the error. Unfortunately, that’s not so pretty.

UPDATE (4 Mar 2011): To be clear, I’ve used CASE before for its short-circuiting properties, and I don’t intend to go back and revisit all of that old code in light of this example. This seems like an edge case to me. But it’s worth being aware that such edge cases exist if you’re thinking about relying on CASE for short circuiting. The most defensive programming approach would be to write the expression in such a way that it doesn’t require particular short circuiting behavior.

(Cross-posted to my MSDN blog.)

UPDATE (21 Mar 2011): The QO team investigated this and decided that it’s a bug. The problem should be fixed in a future SQL release, but it’s still a small risk on existing releases so keep an eye out for it if you’re using CASE for short circuiting on those releases along with a function like LOG.

UPDATE (10 Jun 2011): The owners of this code have marked this bug as fixed. From their comments, it sounds like you are supposed to be able to rely on deterministic order of expression evaluation for CASE statements. But any SQL release in your hands right now will still be vulnerable to this problem — keep an eye out for the issue as you use CASE for short circuiting.

16 Responses to Don’t depend on expression short circuiting in T-SQL (not even with CASE)

  1. spaghettidba says:

    Nice post, Bart. Thanks for commenting on my article: http://www.sqlservercentral.com/articles/T-SQL/71950/

  2. Pingback: A short-circuiting edge case « spaghettidba

  3. Pingback: Frank Kalis

  4. David Howard says:

    That’s really interesting, nice post. Just out of curiosity, I tried the same thing without using a function call and the error was not generated. Also no error when converting to a scalar UDF instead of a table valued function.

  5. Alex Kuznetsov says:

    Hey Bart,

    You are not saying that this is a bug. Do you think that this is a feature that is to stay, get documented in MSDN and such?

    • bartduncan says:

      Alex, I’m not positive it’s a bug. BOL makes the following promises: (a) The WHEN expressions will be evaluated in the order they are defined in the CASE statement. And (b), the THEN part of the first branch to evaluate to true will be returned. The guarantee of order of evaluation doesn’t necessary imply a guarantee that no evaluation of the WHEN expressions beyond the true one will occur.

      Regardless, I think CASE’s implementation actually does guarantee both order of execution and predictable short circuiting at execution time. That doesn’t prevent this error, though, because in this case the error is occuring at compile time, not execution time. As another person commented on a forum discussing this edge case, this is the result of constant folding. You may note that it’s impossible to retrieve a query plan for the SELECT statement. That’s because the error occurs when the optimizer tries to evaluate “LOG10(@input)” at compile time to replace it with a constant.

      It’s not entirely clear to me whether compile-time simplifications like this one are expected to be blocked in order to preserve CASE’s not-so-clearly-documented short circuiting behavior.

      Having said all of that, I suspect it is a bug. I filed an internal workitem (686223) requesting that someone on the QO team investigate this and determine whether they agree it’s a bug.

  6. Alex Kuznetsov says:

    Bart,

    I think that if this edge case did not exist, if CASE was clearly documented to short circuit in the same way as do Java, C++, and C#, and if the compiler were just a little bit smarter, we would have a slightly more useful product.

    IMO the scenario that you have described is a complication, and I feel it is a completely unnecessary one – overall this thing does not help us get the job done easier and/or faster. What am I missing?

    • bartduncan says:

      > What am I missing?

      Nothing, AFAIK <G>. But suppose that there are other compile-time optimizations that would need to be blocked in order to guarantee that this class of problem could never occur. Or suppose that there are possible side effects — say, suboptimal query plans — that could result from blocking this compile-time optimization. In that case it might not be so clear cut… a person who suffered a perf regression due to the “fix” of this issue would not agree that the product had been made more useful, especially if they hadn’t previously been affected by problems with CASE not short circuiting. I don’t have specific knowledge of any problems like that, so if I was forced to choose based on my current understanding of the situation, my vote would be to block constant folding in this case. But my current understanding may not be complete.

      Another consideration is that this has been the behavior of the product since SQL 2005 at least, and possibly since the introduction of UDFs in SQL2K. I’m not aware of the issue being reported before now. That implies that the number of users affected negatively by this instance of constant folding is fairly low. The number of users that might be affected negatively by a regression caused by the fix is an unknown quantity and is more difficult to predict. (Even if the new planned behavior has no anticipated downsides, nearly every fix introduces a nontrivial possibility of some unexpected regression, especially when we’re talking about changes to the optimizer.) The urgency of the need for a fix and the risk vs. reward ratio are both things that need to be taken into account when deciding whether and when to do a tactical fix for an issue like this. For example, suppose that we decide to fix some problem. Do we fix it in the next QFE release, or do we wait until the next service pack (or the next major release) so that the change has a chance to get tested in a lengthy public beta before it’s released to production? You can make a customer-oriented argument in favor of either option.

      So while it looks like a bug to me, and I tend to agree with your assessment that things would be better with a fix, I realize that there may be some good reasons not to fix the thing that I’m just ignorant of. The owners of the affected code are in a better position to decide (a) to fix it, (b) that it’s a bug but the regression risks are high and outweigh the small benefit of a fix, or (c) that the current behavior is actually ‘by design’, generally meaning that it’s unsafe to fix or that fixing the problem would break some other promised behavior. I’m with you in hoping that the right decision is for option (a).

  7. Doug Milam says:

    “You cannot depend on expression evaluation order for things like “WHERE OR ” In this case, pun not intended, what can be relied on, if not logic? In other words, does ((NOT A) OR B) no longer hold in replacing a CASE expression in a WHERE clause?

    • bartduncan says:

      No, you cannot rely on “NOT A” being evaluated before “B” in “WHERE ((NOT A) OR B)”. Suppose this is really “WHERE col1!=X OR col2=Y”, and suppose that there is an index on col2, but no index on col1. The optimizer would guess that a query plan that did a seek for “col2=Y” might be much more efficient than a query that scanned the whole table to evaluate “col1!=X” first. Or it might be the other way around. Order of operations is not guaranteed in this case.

      A CASE expression is usually suggested as the way to get a guaranteed order of operations in SQL. Usually it does do that. However, the point of this post was to point out that there are edge cases where even this may not work the way you expect.

  8. spaghettidba says:

    Marked as fixed: http://bit.ly/eROtCY
    That was really fast. Kudos.

  9. techvslife says:

    Note: this is not a bug and won’t get fixed. with aggregate expressions, case does not guarantee order of operations, and will not short-circuit:
    https://connect.microsoft.com/SQLServer/feedback/details/691535/aggregates-dont-follow-the-semantics-of-case

    • spaghettidba says:

      Seems to me that you’re referring to a similar but different issue. The one described on this post has already been fixed.

      • bartduncan says:

        Yep, I agree with Gianluca – they seem to be different issues. But it is another example of an edge case where CASE short-circuiting doesn’t behave quite like you expect. It looks like a fix for that one isn’t in the works.

        • techvslife says:

          I’m referring to the minor variation shown in the connect item I posted (using aggregates); I can confirm that now occurs, in the latest version of sql server (sql server 2012). See comments. It won’t be fixed (by design and there is no guarantee of order with CASE), and is not I think an edge case–aggregates are very common and this occurs with even the simplest use of them. Sql server docs will be changed to clarify that case has no ordering guarantee or implication.

Leave a comment