Monday, October 1, 2007

Make Your Case Sensitive Searches 1000 Times Faster

I had an case sensitive update query that ran in about 9 minutes. Since it was a case sensitive query it did an index scan not an index seek.
Once I modified my WHERE clause the update took a little less than 3 seconds

Let's get started and see what I did

First create this table

CREATE TABLE #CaseSensitiveSearchTemp (Val CHAR(1))

INSERT #CaseSensitiveSearchTemp VALUES('A')
INSERT #CaseSensitiveSearchTemp VALUES('B')
INSERT #CaseSensitiveSearchTemp VALUES('C')
INSERT #CaseSensitiveSearchTemp VALUES('D')
INSERT #CaseSensitiveSearchTemp VALUES('E')
INSERT #CaseSensitiveSearchTemp VALUES('F')
INSERT #CaseSensitiveSearchTemp VALUES('G')
INSERT #CaseSensitiveSearchTemp VALUES('H')

Now we will insert some lowercase characters

INSERT #CaseSensitiveSearchTemp
SELECT LOWER(Val) FROM #CaseSensitiveSearchTemp

Now we will create our real table which will have 65536 rows

CREATE TABLE CaseSensitiveSearch (Val VARCHAR(50))

We will do a couple of cross joins to generate the data for our queries

INSERT CaseSensitiveSearch
SELECT t1.val + t2.val + t3.val + t4.val
FROM #CaseSensitiveSearchTemp t1
CROSS JOIN #CaseSensitiveSearchTemp t2
CROSS JOIN #CaseSensitiveSearchTemp t3
CROSS JOIN #CaseSensitiveSearchTemp t4


This should give you 65536 rows

SELECT * FROM CaseSensitiveSearch


Create an index on the table

CREATE INDEX IX_SearchVal ON CaseSensitiveSearch(Val)


This is how you do a case sensitive search

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS



Now hit CRLK + K (SQL Server 2000) or CRLK + M(SQL Server 2005)

run these 2 queries in one batch by highlighting them both and hitting F5

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE 'ABCD'

Look at the execution plan, I get 98.71% for the first query and 1.29% for the second query. Just by adding the AND condition SQL server is able to do an index seek and run the query many times faster




Now try it with a lowercase a

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'aBCD' COLLATE SQL_Latin1_General_CP1_CS_AS

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'aBCD' COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE 'aBCD'


You see it all works without a problem, the correct result is returned

No comments: