Tuesday, August 26, 2008

The true cost of sub-select in queries

Yesterday I was working on a stored procedure trying to help to optimize it. It has some subselects in a big select statement and that was one of the main issues. One of the subselects was responsible for almost 50% of the relative cost of the stored procedure.

That got me to thinking, what is the real cost of subselects in queries? I "know" they are bad, but just how bad are they? This is probably old news to you all, but I found it very interesting to actually see the cost in how many reads were produced by each type of query. So I did a little test. Here are the parameters.

I created three simple tables (Table_A, Table_B, Table_C) with no indexes. Here is the structure (all three are the same).

ID INT (Identity)
OrderID INT
Fname VARCHAR(50)
Lname VARCHAR(50)



Then I filled each table with random numbers and letters.

Then I wrote the following 3 queries to help measure how many reads each type of query would produce.

#1 (subselect with no temp tables)
select * from table_a
where orderID not in (select distinct orderid from table_b)
and orderID not in (select distinct orderid from table_c)

#2 (subselect using temp tables)
Select distinct OrderID into #B from Table_B
Select distinct OrderID into #c from Table_C

select * from table_a
where orderID not in (select orderid from #b)
and orderID not in (select orderid from #c)

drop table #b
drop table #c

#3 (Joins using temp tables)
Select distinct OrderID into #B from Table_B
Select distinct OrderID into #c from Table_C

select * from table_a a
left join #b b on a.orderid = b.orderid
left join #c c on a.orderid = c.orderid
where b.orderid is null and c.orderid is null

drop table #b
drop table #c

Then using SQL Profiler, here are the results
(please forgive the horrible, ugly table)...



















# of records in tableJoinTotal
ABCTypeReads
5,0005,0005,000SubSelect no temp table25,063
5,00055,00055,000SubSelect no temp table50,548
5,000100,000100,000SubSelect no temp table51,022
50,000100,000100,000SubSelect no temp table251,002
5,0005,0005,000SubSelect with temp table40,505
5,00055,00055,000SubSelect with temp table40,799
5,000100,000100,000SubSelect with temp table41,091
50,000100,000100,000SubSelect with temp table401,273
5,0005,0005,000Join using temp table476
5,00055,00055,000Join using temp table799
5,000100,000100,000Join using temp table1,091
50,000100,000100,000Join using temp table1,237



To tell you the truth, I was surprised at just how much more IO was produced using SubSelects than not. Just thought I would share.



No comments: