Archive

Posts Tagged ‘performance’

Native Dynamic SQL in SQL Server (Late Binding)

May 2nd, 2009 Administrator 3 comments

I was trying to figure out how to use Native Dynamic SQL in SQL Server (using late binding) for performance reasons. With late binding, you are able to reuse cached execution plans and therefore can get better performance. This also keeps you from having from having to create the same query multiple times in a stored procedure just to allow for different filter criteria. You can build the SQL on the fly, bind the parameters and execute the SQL on the fly.

Here’s what I ended up with.


USE [database]

GO

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE PROCEDURE [dbo].[proc]

@Parm1 INT,

@Parm2 INT,

@Parm3 DATETIME,

AS

SET NOCOUNT ON

DECLARE

@SELECT NVARCHAR(1000),

@FROM NVARCHAR(1000),

@WHERE NVARCHAR(1000),

@ORDERBY NVARCHAR(1000),

@SQL NVARCHAR(4000),MAX SIZE for dynamic SQL is 4000 characters

@DYNSQLPARMLIST NVARCHAR(250),

@vParm3 CHAR(10)

 

-1 implies ALL. Set -1s to NULL for COALESCE functions in SQL

IF @Parm1 = -1 SET @Parm1 = NULL

IF @Parm2 = -1 SET @Parm2 = NULL

 

SET @vParm3 = CONVERT(char, @Parm3, 101) /*Just a placeholder for easier dyn sql conversion*/

 

/*For some reason, dynamic sql can’t handle s or linefeeds in this section, so they’re all on one line with spaces separating them*/

/*Note, the DATETIME parameter is not in CHAR format. Again, this is just because I’m building a string for the SQL*/

SET @DYNSQLPARMLIST = N’@dynParm1 INT, @dynParm2 INT, @dynParm3 CHAR(10)

 

/*Build the SQL. Note the parameters match the parameters in the @DYNSQLPARMLIST variable. @dynParm1, not @Parm1*/

SET @SELECT =SELECT *

SET @FROM =FROM table1 with (nolock)

SET @WHERE =WHERE column1 = COALESCE(@dynParkID, column1)if null, set column = to itself (no filter)

AND column2 = COALESCE(@dynInformationTypeID, column2)

AND column3 > cast(@dynParm3 as datetime)

SET @ORDER =ORDER BY 1?

 

SET @SQL =

@SELECT + ‘ ‘ +

@FROM + ‘ ‘ +

@WHERE + ‘ ‘ +

@ORDERBY

 

EXECUTE sp_executesql

@SQL,

@DYNSQLPARMLIST,

@dynParm1 = @Parm1,

@dynParm2 = @Parm2,

@dynParm3 = @vParm3

 

SET NOCOUNT OFF

For more information, check out this page: http://ping.fm/mOmXI

VN:F [1.9.6_1107]

Rating: 4.0/5 (1 vote cast)
VN:F [1.9.6_1107]
Rating: 0 (from 0 votes)

Using Statistics to Improve Query Performance in SQL Server

May 1st, 2009 NothingMan No comments

Like all other query optimizers, SQL uses statistics to create query plans that improve query performance. The optimizer generally generates the necessary statistics for a high quality query plan, however, in a few cases, you need to create additional statistics or modify the query design for “optimal” results.

Query Optimization Statistics are objects that contain statistical information about the distribution of values in one or more columns of a table or indexed view. The query optimizer uses these statistics to estimate the cardinality, or number of rows, in the query result. These cardinality estimates enable the query optimizer to create a high-quality query plan. For example, the query optimizer could use cardinality estimates to choose the index seek operator instead of the more resource-intensive index scan operator, and in doing so improve query performance.

You can create statistics objects with filters to improve query performance for queries that select from well-defined subsets of data. They use a filter predicate to select the subset of data that is included in the statistics. Well-designed filtered statistics can improve the query execution plan compared with full-table statistics.

Syntax

CREATE STATISTICS statistics_name
ON { table_or_indexed_view_name } ( column [ ,...n ] )
  [ WHERE <filter_predicate> ]
  [ WITH
  [ [ FULLSCAN
   | SAMPLE number { PERCENT | ROWS }
   | STATS_STREAM = stats_stream ] [ , ] ]
  [ NORECOMPUTE ]
] ;

<filter_predicate> ::=
  <conjunct> [AND ]

<conjunct> ::=
  <disjunct> | <comparison>

<disjunct> ::=
  column_name IN (constant ,…)

<comparison> ::=
  column_name <comparison_op> constant

<comparison_op> ::=
  IS | IS NOT | = | <> | != | > | >= | !> | < | <= | !<

SAMPLES

USE AdventureWorks;
GO
CREATE STATISTICS ContactMail1
  ON Person.Contact (ContactID, EmailAddress)
  WITH SAMPLE 5 PERCENT;

OR

IF EXISTS (SELECT name FROM sys.stats
  WHERE name = N'ContactPromotion1'
  AND object_id = OBJECT_ID(N'Person.Contact'))
DROP STATISTICS Person.Contact.ContactPromotion1
GO
CREATE STATISTICS ContactPromotion1
  ON Person.Contact (ContactID, EmailAddress, EmailPromotion)
WHERE EmailPromotion = 2
WITH SAMPLE 50 PERCENT;
GO

For more information, see http://msdn.microsoft.com/en-us/library/ms188038.aspx

For information about when to CREATE STATISTICS, see http://msdn.microsoft.com/en-us/library/ms188038.aspx

VN:F [1.9.6_1107]

Rating: 4.0/5 (1 vote cast)
VN:F [1.9.6_1107]
Rating: 0 (from 0 votes)

Creating Index with the INCLUDE clause to Increase Performance

April 30th, 2009 NothingMan No comments

The INCLUDE statement specifies the non-key columns to be added to the leaf level of the nonclustered index. The nonclustered index can be unique or non-unique.
Column names cannot be repeated in the INCLUDE list and cannot be used simultaneously as both key and non-key columns. Nonclustered indexes always contain the clustered index columns if a clustered index is defined on the table.

All data types are allowed except text, ntext, and image. The index must be created or rebuilt offline (ONLINE = OFF) if any one of the specified non-key columns are varchar(max), nvarchar(max), or varbinary(max) data types.

Computed columns that are deterministic and either precise or imprecise can be included columns. Computed columns derived from image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml data types can be included in non-key columns as long as the computed column data types is allowable as an included column.

SYNTAX:
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
  ON ( column [ ASC | DESC ] [ ,...n ] )
  [ INCLUDE ( column_name [ ,...n ] ) ]

SAMPLE:
Lets say you have an ETL Job for your fact table that has lookups to validate all of the dimensional keys and get the surrogate keys. You may have something like

SELECT productID, EDWproductID
FROM EDWProduct
WHERE productID = ?

OR

SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM Person.Address
WHERE PostalCode BETWEEN N’98000′ and N’99999′;

It’s obvious that you’ll need indexes on the key fields ProductID and PostalCode, but what might not have been obvious is that it makes sense to include the non-key values into the index to improve performance. Granted, you don’t want to make a habit of creating indexes to meet every single query’s needs, but if these queries are being used in ETL jobs multiple times every day or are being used in reports every day etc, it would make sense to create these indexes to avoid touching the table at all. If these columns are included in the index, you read the index only and the query engine has no need to read the table.


CREATE NONCLUSTERED INDEX IX_Address_PostalCode
  ON Person.Address (PostalCode)
  INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);

CREATE NONCLUSTERED INDEX IX_EDWProduct
  ON EDWProduct (ProductID)
  INCLUDE (EDWProductID);
GO

For more information, see http://msdn.microsoft.com/en-us/library/ms188783.aspx

VN:F [1.9.6_1107]

Rating: 3.0/5 (1 vote cast)
VN:F [1.9.6_1107]
Rating: 0 (from 0 votes)
Categories: BI, Code, Tips Tags: , , , ,

Index Seek Vs. Index Scan

April 29th, 2009 NothingMan No comments

I’m trying to understand what the difference is between an Index Seek and an Index Scan. It would seem that a Seek is better than a Scan, but I wanted to verify.

It would appear that an Index Scan has to quickly read every single row in the index to determine which rows qualify. In Oracle, I believe this is comparable to a Full-Index Scan. In most cases, I think this is still better than a full table scan, because index files are not nearly as “fat” as a table would be, typically. So it’s easier and faster to read the entire index that it would be to read the entire table. Once the qualifying rows are determined in the index scan, the SQL engine must then go find those rows in the actual table to get the remaining data for the SELECT portion of the query.

An Index Seek only reads rows in the index that qualify based on the filters in the query. It’s able to determine the qualifying rows in the index immediately, and only read those rows, and then if needed, filter those results further. On the surface, it would seem that an Index Seek is much more optimal than an Index Scan, simply because it’s got less of a footprint; it doesn’t have to read or touch as much of the data.
There are ways to trick the optimizer into using an Index Seek with hints and certain ways of formatting your where clause such as using something like “WHERE PKID” > 0, knowing that the ID is always greater than 0.

I’ve also come to find is that you can use an INCLUDE statement in your INDEX to avoid the trip back to the Table. So if you had an Index seek that found a recordset of 500 rows, it would then normally have to go back to the table and find those 500 rows to get the additional columns from the Table for the SELECT statement. For instance, if you were looking up firstname and lastname on the persontable by personID, the engine would find the personID in the index, then find the row(s) in the table so it would return the firstname and lastname. To eliminate that round trip, you could create an index in PersonID with an INCLUDE on firstname and lastname.

VN:F [1.9.6_1107]

Rating: 3.0/5 (1 vote cast)
VN:F [1.9.6_1107]
Rating: 0 (from 0 votes)
Categories: BI Tags: , , ,