Tuesday, September 4, 2007

Ten SQL Server Functions That You Hardly Use But Should

Below are 10 SQL Server functions that are hardly used but should be used a lot more
I will go in more detail later on but here is a list of the ten functions that I am talking about

BINARY_CHECKSUM
SIGN
COLUMNPROPERTY
DATALENGTH
ASCII, UNICODE
NULLIF
PARSENAME
STUFF
REVERSE
GETUTCDATE


BINARY_CHECKSUM
BINARY_CHECKSUM is handy if you want to check for data differences between 2 rows of data

In order to see what rows are in table 1 and not in table 2 and vice versa you can do 2 left joins, 2 right joins or 1 left and 1 right join. To get the rows that are different you can use BINARY_CHECKSUM. You have to run this example o SQL Server 2000 to see it work, you can ofcourse use any tables just modify the queries
Let’s get started…

--let's copy over 20 rows to a table named authors2
SELECT TOP 20 * INTO tempdb..authors2
FROM pubs..authors

--update 5 records by appending X to the au_fname
SET ROWCOUNT 5


UPDATE tempdb..authors2
SET au_fname =au_fname +'X'


--Set rowcount back to 0
SET ROWCOUNT 0

--let's insert a row that doesn't exist in pubs
INSERT INTO tempdb..authors2
SELECT '666-66-6666', au_lname, au_fname, phone, address, city, state, zip, contract
FROM tempdb..authors2
WHERE au_id ='172-32-1176'

--*** The BIG SELECT QUERY --***

--Not in Pubs
SELECT 'Does Not Exist On Production',t2.au_id
FROM pubs..authors t1
RIGHT JOIN tempdb..authors2 t2 ON t1.au_id =t2.au_id
WHERE t1.au_id IS NULL
UNION ALL
--Not in Temp
SELECT 'Does Not Exist In Staging',t1.au_id
FROM pubs..authors t1
LEFT JOIN tempdb..authors2 t2 ON t1.au_id =t2.au_id
WHERE t2.au_id IS NULL
UNION ALL
--Data Mismatch
SELECT 'Data Mismatch', t1.au_id
FROM( SELECT BINARY_CHECKSUM(*) AS CheckSum1 ,au_id FROM pubs..authors) t1
JOIN(SELECT BINARY_CHECKSUM(*) AS CheckSum2,au_id FROM tempdb..authors2) t2 ON t1.au_id =t2.au_id
WHERE CheckSum1 <> CheckSum2

--Clean up
DROP TABLE tempdb..authors2
GO




SIGN
Sometimes you are asked by the front-end/middle-tier developers to return a rowcount as well with the result set. However the developers want you to return 1 if there are rows and 0 if there are none. How do you do such a thing?
Well I am going to show you two ways. the first way is by using CASE and @@ROWCOUNT, the second way is by using the SIGN function

For CASE we will do this

RETURN CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END

So that's pretty simple, if @@ROWCOUNT is greater than 0 return 1 for everything else return 0

Using the SIGN function is even easier, all you have to do is this

RETURN SIGN(@@ROWCOUNT)

That's all, SIGN Returns the positive (+1), zero (0), or negative (-1) sign of the given expression. In this case -1 is not possible but the other two values are
So let's see this in action


USE pubs
GO

--Case Proc
CREATE PROCEDURE TestReturnValues
@au_id VARCHAR(49) ='172-32-1176'
AS
SELECT
*
FROM authors
WHERE au_id =@au_id

RETURN CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END
GO

--Sign Proc
CREATE PROCEDURE TestReturnValues2
@au_id VARCHAR(49) ='172-32-1176'
AS
SELECT
*
FROM authors
WHERE au_id =@au_id

RETURN SIGN(@@ROWCOUNT)
GO


--Case Proc, 1 will be returned; default value is used
DECLARE @Rowcount int
EXEC @Rowcount = TestReturnValues
SELECT @Rowcount
GO

--Case Proc, 0 will be returned; dummy value is used
DECLARE @Rowcount int
EXEC @Rowcount = TestReturnValues 'ABC'
SELECT @Rowcount
GO

--Sign Proc, 1 will be returned; default value is used
DECLARE @Rowcount int
EXEC @Rowcount = TestReturnValues2
SELECT @Rowcount
GO

--Sign Proc, 0 will be returned; dummy value is used
DECLARE @Rowcount int
EXEC @Rowcount = TestReturnValues2 'ABC'
SELECT @Rowcount
GO


--Help the environment by recycling ;-)
DROP PROCEDURE TestReturnValues2,TestReturnValues
GO


COLUMNPROPERTY
COLUMNPROPERTY is handy if you need to find scale, precision, if it is an identity column and more. I have listed all of them below

CREATE TABLE blah (ID DECIMAL(5,2) not null DEFAULT 99)
INSERT blah DEFAULT VALUES

SELECT * FROM blah
SELECT COLUMNPROPERTY( OBJECT_ID('blah'),'ID','AllowsNull') AS AllowsNull,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsComputed') AS IsComputed,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsCursorType') AS IsCursorType,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsDeterministic') AS IsDeterministic,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsFulltextIndexed') AS IsFulltextIndexed,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsIdentity') AS IsFulltextIndexed,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsIdNotForRepl') AS IsIdNotForRepl,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsIndexable') AS IsIndexable,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsOutParam') AS IsOutParam,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsPrecise') AS IsPrecise,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','IsRowGuidCol') AS IsRowGuidCol,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','Precision') AS 'Precision',
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','Scale') AS Scale,
COLUMNPROPERTY( OBJECT_ID('blah'),'ID','UsesAnsiTrim') AS UsesAnsiTrim
FROM Blah


So what does all that stuff mean?

AllowsNull
Allows null values. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsComputed
The column is a computed column. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsCursorType
The procedure parameter is of type CURSOR. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsDeterministic
The column is deterministic. This property applies only to computed columns and view columns. 1 = TRUE
0 = FALSE
NULL = Invalid input. Not a computed column or view column.

IsFulltextIndexed
The column has been registered for full-text indexing. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsIdentity
The column uses the IDENTITY property. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsIdNotForRepl
The column checks for the IDENTITY_INSERT setting. If IDENTITY NOT FOR REPLICATION is specified, the IDENTITY_INSERT setting is not checked. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsIndexable
The column can be indexed. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsOutParam
The procedure parameter is an output parameter. 1 = TRUE
0 = FALSE
NULL = Invalid input

IsPrecise
The column is precise. This property applies only to deterministic columns. 1 = TRUE
0 = FALSE
NULL = Invalid input. Not a deterministic column

IsRowGuidCol
The column has the uniqueidentifier data type and is defined with the ROWGUIDCOL property. 1 = TRUE
0 = FALSE
NULL = Invalid input

Precision
Precision for the data type of the column or parameter. The precision of the specified column data type
NULL = Invalid input

Scale
Scale for the data type of the column or parameter. The scale
NULL = Invalid input

UsesAnsiTrim
ANSI padding setting was ON when the table was initially created. 1= TRUE
0= FALSE
NULL = Invalid input




DATALENGTH
Okay so you know the LEN function but do you know the DATALENGTH function? There are two major difference between LEN and DATALENGTH.
The first one deals with trailing spaces, execute the following code and you will see that LEN returns 3 while DATALENGTH returns 4

DECLARE @V VARCHAR(50)
SELECT @V ='ABC '
SELECT LEN(@V),DATALENGTH(@V),@V

The second difference deals with unicode character data, as you know unicode uses 2 bytes to store 1 character
Run the following example and you will see that LEN returns 3 while DATALENGTH returns 6
DECLARE @V NVARCHAR(50)
SELECT @V ='ABC'
SELECT LEN(@V),DATALENGTH(@V),@V

If you do DATALENGTH(CONVERT(VARCHAR,@V)) you will get the same as LEN because LEN does a RTRIM and converts to VARCHAR before returning



ASCII, CHAR,UNICODE
ASCII will give you the ascii code for a character so for A you will get 65
CHAR does the reverse of ascii CHAR(65) returns A
UNICODE will give you the unicode value for a character
NCHAR will give you the character for a unicode or ascii value
let's see how this works

SELECT ASCII('A'),CHAR(65),CHAR(ASCII('A')),
UNICODE(N'Λ'),NCHAR(923),NCHAR(UNICODE(N'Λ'))




NULLIF
NULLIF Returns a null value if the two specified expressions are equivalent.

Syntax
NULLIF ( expression , expression )

DECLARE @v VARCHAR(20)
SELECT @v = ' '

SELECT NULLIF(@v,' ')

You can combine NULLIF with COALESCE if you want to test for NULLS and Blanks for example

DECLARE @v VARCHAR(20)
SELECT @v = ' '

SELECT COALESCE(NULLIF(@v,' '),'N/A')


Here is another NULLIF example:
CREATE TABLE Blah (SomeCol VARCHAR(33))

INSERT Blah VALUES(NULL)
INSERT Blah VALUES('')
INSERT Blah VALUES(' ')
INSERT Blah VALUES('A')
INSERT Blah VALUES('B B')

--Using COALESCE and NULLIF
SELECT COALESCE(NULLIF(RTRIM(SomeCol),' '),'N/A')
FROM Blah


--Using CASE
SELECT CASE WHEN RTRIM(SomeCol) = '' THEN 'N/A'
WHEN RTRIM(SomeCol) IS NULL THEN 'N/A'
ELSE SomeCol END SomeCol
FROM Blah


Output for both queries
-----------------------
N/A
N/A
N/A
A
B B


PARSENAME
PARSENAME retrieves parts of string delimited by dots. It is used to split DataBaseServer, DataBaseName, ObjectOwner and ObjectName but you can use it to split IP addresses, names etc

DECLARE @ParseString VARCHAR(100)
SELECT @ParseString = 'DataBaseServer.DataBaseName.ObjectOwner.ObjectName'

SELECT PARSENAME(@ParseString,4),
PARSENAME(@ParseString,3),
PARSENAME(@ParseString,2),
PARSENAME(@ParseString,1)


CREATE TABLE #Test (
SomeField VARCHAR(49))

INSERT INTO #Test
VALUES ('aaa-bbbbb')

INSERT INTO #Test
VALUES ('ppppp-bbbbb')

INSERT INTO #Test
VALUES ('zzzz-xxxxx')

--using PARSENAME
SELECT PARSENAME(REPLACE(SomeField,'-','.'),2)
FROM #Test



Another example:

CREATE TABLE BadData (FullName varchar(20) NOT NULL);
INSERT INTO BadData (FullName)
SELECT 'Clinton, Bill' UNION ALL
SELECT 'Johnson, Lyndon, B.' UNION ALL
SELECT 'Bush, George, H.W.';

Split the names into 3 columns

Your output should be this:
LastName FirstName MiddleInitial
Clinton Bill
Johnson Lyndon B.
Bush George H.W.

SELECT FullName,PARSENAME(FullName2,NameLen+1) AS LastName,
PARSENAME(FullName2,NameLen) AS FirstName,
COALESCE(REPLACE(PARSENAME(FullName2,NameLen-1),'~','.'),'') AS MiddleInitial
FROM(
SELECT LEN(FullName) -LEN(REPLACE(FullName,',','')) AS NameLen,
REPLACE(REPLACE(FullName,'.','~'),', ','.') AS FullName2,FullName
FROM BadData) x



STUFF
STUFF is another function that is hardly used, it is useful if you want to replace or add characters inside data
Take a look at the code below. the first STUFF will replace X with 98765, the second STUFF will place 98765 before the X and the third stuff will replace X- with 98765
DECLARE @v VARCHAR(11)
SELECT @v ='-X-'


SELECT STUFF(@v, 2, 1, '98765'),
STUFF(@v, 2, 0, '98765'),
STUFF(@v, 2, 2, '98765')


The STUFF function is very handy if you need to insert dashes in a social security. You can accomplish that by using the function STUFF twice instead of using substring,left and right

DECLARE @v VARCHAR(11)
SELECT @v ='123456789'

SELECT @v,STUFF(STUFF(@v,4,0,'-'),7,0,'-')



REVERSE
REVERSE just reverses the value, for example the code below returns CBA

SELECT REVERSE('ABC')

Reverse is handy if you need to split values, take a look at this example

CREATE TABLE #TestCityStateZip (csz CHAR(49))
INSERT INTO #TestCityStateZip VALUES ('city ,st 12223')
INSERT INTO #TestCityStateZip VALUES ('New York City,NY 10028')
INSERT INTO #TestCityStateZip VALUES ('Princeton , NJ 08536')
INSERT INTO #TestCityStateZip VALUES ('Princeton,NJ 08536 ')
INSERT INTO #TestCityStateZip VALUES ('Long Island City, NY 10013')
INSERT INTO #TestCityStateZip VALUES ('Long Island City, NY 10013 ')
INSERT INTO #TestCityStateZip VALUES ('Long Island City , NY 10013')
INSERT INTO #TestCityStateZip VALUES ('Long Island City ,NY 10013 ')


SELECT LEFT(csz,CHARINDEX(',',csz)-1)AS City,
LEFT(LTRIM(SUBSTRING(csz,(CHARINDEX(',',csz)+1),4)),2) AS State,
RIGHT(RTRIM(csz),CHARINDEX(' ',REVERSE(RTRIM(csz)))-1) AS Zip
FROM #TestCityStateZip





GETUTCDATE
SELECT GETUTCDATE()

Returns the datetime value representing the current UTC time (Universal Time Coordinate or Greenwich Mean Time). The current UTC time is derived from the current local time and the time zone setting in the operating system of the computer on which SQL Server is running.



And that is all, those are the ten functions that you should be using but currently you are not using all of them. Look them up in Books On Line so that you can see some more examples

5 comments:

Anonymous said...

Great stuff!

FYI -- if you haven't seen this technique, I feel that it is by far the easiest way to compare two tables in both directions:

http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx

guercheLE said...
This comment has been removed by the author.
guercheLE said...

Dear,

With regard to the BINARY_CHECKSUM sample, what do you think of the SQL statement below?

--
Luciano Evaristo Guerche
Taboao da Serra, SP, Brazil

---
SELECT CASE WHEN t1.au_id IS NULL THEN 'Does Not Exist On Production'
            WHEN t2.au_id IS NULL THEN 'Does Not Exist In Staging'
            WHEN t1.CheckSum1 <> t2.CheckSum2 THEN 'Data Mismatch'
            ELSE 'Data Match'
       END AS Status,
       ISNULL(t1.au_id, t2.au_id) AS au_id
FROM (
      SELECT BINARY_CHECKSUM(*) AS CheckSum1,
             au_id
      FROM pubs..authors
     ) AS t1
FULL OUTER JOIN
     (
      SELECT BINARY_CHECKSUM(*) AS CheckSum2,
             au_id
      FROM tempdb..authors2
     ) AS t2
     ON t1.au_id = t2.au_id
ORDER BY CASE WHEN t1.au_id IS NULL THEN 3
              WHEN t2.au_id IS NULL THEN 2
              WHEN t1.CheckSum1 <> t2.CheckSum2 THEN 1
              ELSE 0
         END

Unknown said...

I've read this already here.

kcornwell said...

You should show output of the function in a table. It makes things absolutely clear and easier to read.