Friday, March 30, 2012

IF NOT EXISTS (... - EXISTS TABLE : nested iteration. Table Scan.Forward scan.

Hi,

This is on Sybase but I'm guessing that the same situation would happen on SQL Server. (Please confirm if you know).

I'm looking at these new databases and I'm seeing code similar to this all over the place:

if not exists (select 1 from dbo.t1 where f1 = @.p1)
begin
select @.errno = @.errno | 1
end

There's a unique clustered in dex on t1.f1.

The execution plan shows this for this statement:

FROM TABLE
dbo.t1
EXISTS TABLE : nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.

It's not using my index!!!!!

It seems to be the case with EXISTS statements. Can anybody confirm?

I also hinted to use the index but it still didn't use it.

If the existence check really doesn't use the index, what's a good code alternative to this check?

I did this and it's working great but I wonder if there's a better alternative. I don't really like doing the SET ROWCOUNT 1 and then SET ROWCOUNT 0 thing. SELECT TOP 1 won't work on Sybase, :-(.

SET ROWCOUNT 1
SELECT @.cnt = (SELECT 1 FROM dbo.t1 (index ix01)
WHERE f1 = @.p1
)
SET ROWCOUNT 0

Appreciate your help.

Do you have the code in a SP? I don't know if Sybase does parameter sniffing or not. Since the search argument is parameterized, the optimizer has to guess the value of @.p1 and use it for generating a plan. That will explain why it picks a table scan. And index hints will not help in this case. You could rewrite the code like below:

if (select min(1) from dbo.t1 where f1 = @.p1) is null

begin

select @.errno = @.errno|1

end

If Sybase has something like OPTIMIZE FOR clause then you can provide hint about the value of @.p1.

|||

Hi, thank you for the reply.

Yes, the code is in an sp. Not sure why the parameter needs to be evaluated for the plan to be generated correctly. I would think that it only needs the table and columns that are being queried. ?

I implemented your suggestion but it still didn't use the index. Once I added a hint, it did use the index.

I'd like to find a solution that doesn't have any aggregates in it, ie min(1). I'm looking for top performance; this query is executed a lot. I'll keep digging.

Thank you,

Etienne

|||

The optimizer analyzes the predicates and search arguments to decide which indexes to use. Indexes have statistics and if a particular search argument value is not known at the time of compilation then a guess will be made. For example, consider following:

Index on table t, column i

Statistics in the index will contain information on how the values of column "i" are distributed in the table

Given query like: "select i from t where i > 1"

Query optimizer can use the statistics to infer how many values of i are greater than 1. Based on this information, a plan to either use index seek or index scan or table scan will be used. The choice depends on the cardinality of the returned result set based on the search arguments.

Now, for the same query if you do: "select i from t where i > @.i"

If query optimizer cannot sniff the value of the parameter/variable @.i then it has to make a guess on what the value of @.i could be at run-time. This guess is often conservative because based on the value of @.i the query can either return zero rows, one row, many rows or entire table.

Both approaches has their pros and cons. Even with parameter sniffing the plan choice depends on the supplied value and that plan will not be great for other potential search arguments. So as you can see it is not just the tables, columns and indexes that matter.

Anyway, it looks like you have to ask this question in a Sybase forum. I don't know about the details of the query optimizer in Sybase SQL Server. You can try various rewrites but I think the crux of the problem is that the value of the search argument is guessed by the optimizer and then the query plan is generated.

No comments:

Post a Comment