Archive for the ‘Query Optimisation’ Category

Filtered Index

Tuesday, April 20th, 2010

SQL Server 2008 introduced a lot of new features one of which is filtered indexes. Filtered indexes can be used as an alternative to indexed views and partitioning to improve data retrieval performance on large tables.

A filtered index is a non-clustered index with a where clause. The index will only contain the rows which satisfy the where clause. This means the index will require less storage space and have better performance.

To create the filtered index the syntax is relatively simple:

CREATE NONCLUSTERED INDEX {INDEX_NAME}
ON {TABLE}({FIELD})
WHERE {FIELD} — Insert Where criteria here.
GO

The query optimiser should pick the best index to use however you can use a hint to force the query optimiser to use the filtered index:

SELECT * FROM {TABLE} WITH(INDEX({INDEX_NAME}))
WHERE {FIELD}

While they give a performance boost there are two main limitations with using a filtered index. Firstly they are only created on a single table. Second, complex criteria like the LIKE clause are not allowed.

NVarchar vs Varchar

Thursday, April 15th, 2010

I often get asked the difference between NVarchar and Varchar. Both datatypes are used to store character information. While they are largely identical there is one significant difference. NVarchar is used to store unicode data which means it can store multilingual data in your database tables. This means that Nvarchar can take up twice as much space as varchar, therefore unless you plan to store extended character codes Varchar will offer the best performance.

Inserting multiple records to a table in SQL Server

Thursday, April 15th, 2010

There are several ways you can insert multiple records into a table on SQL Server. Some are much better than others.

The traditional way to insert is to use multiple insert statements:

INSERT INTO Table (Column1, Column2)
VALUES (Value1, Value2)
INSERT INTO Table (Column1, Column2)
VALUES (Value1, Value2)
INSERT INTO Table (Column1, Column2)
VALUES (Value1, Value2)
GO

Note: The GO command signifies that all the INSERT statements should be executed as a batch.

This method is perhaps the easiest understood but by far the least efficient especially when inserting large amounts of data.

To improve this we can combine multiple INSERT statements using SELECT and UNION ALL:

INSERT INTO Table (Column1, Column2)
SELECT Value1, Value2
UNION ALL
SELECT Value1, Value2
UNION ALL
SELECT Value1, Value2
GO

On larger data sets this will increase performance and help move the code to a more ‘Set based’ approach.

The third method involves using a feature introduced in SQL Server 2008, row construction. The syntax is much more logical than previous versions:

INSERT INTO Table (Column1, Column2)
VALUES
(Value1, Value2),
(Value1, Value2),
(Value1, Value2)

The final method worth considering the bulk insert. Unlike the previous methods it takes data directly from a text file and will insert all records at once.

Beginning Query Optimisation

Wednesday, April 14th, 2010

The order in which SQL directives get executed in a good starting point for query optimisation:

FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause

The above order is a good start to query optimisation but also shows up an issue I had in the past. Namely referencing a calculated SELECT statement in a WHERE clause.

FROM Clause

Since FROM is executed first it makes much more sense to put joins here rather than in the WHERE clause.

SELECT *
FROM Tbl1
INNER JOIN Tbl2
ON Tbl1.id on Tbl2.id

is much better than and outperforms

SELECT *
FROM Tbl1, Tbl2
WHERE Tbl1.id = Tbl2.id

On small data sets it does not make much difference but on larger ones it can greatly improve performance.

WHERE Clause

Since the WHERE clause comes second it becomes obvious why SELECT columns cannot be referenced in the WHERE clause. This is because the SELECT clause has not even been executed at the time the WHERE clause is being run.

ORDER BY Clause

Since the order by clause is executed last you can reference calculated SELECT columns here.