Feel Tech Evolution: Troubleshooting and optimizing queries on SQL Azure.

I have done a lot of work improving I/O performance for on-premise SQL Server installations. Usually it involves tweaking the storage system, balancing databases across RAID arrays, or expanding the count of files that the tempdb is using; these are all common techniques of SQL Server DBA. However, how do you improve your I/O performance when you are not in charge of the storage subsystem, like in the case of SQL Azure? You focus on how your queries use the I/O and improve the queries. This blog post will talk about how to detect queries which use a high amount of I/O and how to increase the performance of your I/O on SQL Azure.

see video http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/DBI314

Detecting Excessive I/O Usage

With SQL Azure, just like SQL Server, I/O is a bottleneck in getting great query performance. Before you can make any changes, you first thing have to be able to do is detect which queries are having trouble.
This Transact-SQL returns the top 25 slowest queries:

SELECT TOP 25
    q.[text],
    (total_logical_reads/execution_count) AS avg_logical_reads,
    (total_logical_writes/execution_count) AS avg_logical_writes,
    (total_physical_reads/execution_count) AS avg_phys_reads,
     Execution_count
FROM sys.dm_exec_query_stats
    cross apply sys.dm_exec_sql_text(plan_handle) AS q
ORDER BY
(total_logical_reads + total_logical_writes) DESC

Identifying poorly performing queries

SQL Server generates an optimized query plan for all the queries that are executed. This allows the SQL Server optimizer to reuse the query plan when the same or similar query is executed to retrieve the data in the fastest time possible. As the data and the statistics on that data change, the query plans become out of date and can become inefficient. It is important to identify these queries and tune them for optimal performance of the application and consistent user experience. The DMVs just discussed directly help in identifying the problematic queries.

The following are some basic queries to troubleshoot poor performance of queries

--Excessive recompiles:

select top 25
    sql_text.text,
    sql_handle,
    plan_generation_num,
    execution_count,
    dbid,
    objectid
from
    sys.dm_exec_query_stats a
    cross apply sys.dm_exec_sql_text(sql_handle) as sql_text
where
    plan_generation_num >1
order by plan_generation_num desc

Of course these queries can't replace the powerfull features of SQL Profiler, but at least it will help you and it is better than nothing.

--1) Last executed queries with used query plan.

SELECT TOP 50 query_plan,q2.[text],
(total_logical_reads/execution_count) AS avg_logical_reads,
(total_logical_writes/execution_count) AS avg_logical_writes,
(total_physical_reads/execution_count) AS avg_phys_reads,
execution_count,
(total_elapsed_time/execution_count) AS avg_Duration,
last_execution_time
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle)
CROSS APPLY sys.dm_exec_sql_text(Sql_handle) AS q2
ORDER BY last_execution_time DESC

--Inefficient query plans:

select
    highest_cpu_queries.plan_handle,
    highest_cpu_queries.total_worker_time,
    q.dbid,
    q.objectid,
    q.number,
    q.encrypted,
    q.[text]
from
    (select top 50
        qs.plan_handle,
        qs.total_worker_time
    from
        sys.dm_exec_query_stats qs
    order by qs.total_worker_time desc) as highest_cpu_queries
    cross apply sys.dm_exec_sql_text(plan_handle) as q
order by highest_cpu_queries.total_worker_time desc

Reading Too Much Data

One way to overuse I/O is to read data that you are never going to use. A great example of this is:

SELECT * FROM [Table]

This query reads all the columns and all the rows from the [Table]. Here is how you can improve that:

Use a WHERE clause to reduce the number of rows to just the ones that you need for your scenario
Explicitly name the columns you need from the tables, which hopefully will be less than all of them

Create Covered Indexes

Once you have reduced the number of columns you are returning for each query, you can focus on creating non-clustered covered indexes for the queries that have the most read I/O. Covered indexes are indexes that contain all the columns in the query as part of the index, this includes the columns in the WHERE clause. Note that there might be several covered indexes involved in a single query, since the query might join many tables, each potentially with a covered index. You can determine what columns should go into the index by examining SQL Azure’s execution plan for the index. More information about be found in the MSDN article: SQL Server Optimization.

Just a note, non-clustered indexes (what you make when you do a covered index) reduce the performance of your writes. This is because on insertion or updates, the indexes need to be updated. So you need to balance your covered index creation with the ratio of reads and writes to your database. Databases with a disproportion amount of reads to writes gain more performance from covered indexes.

Thursday, 12 July 2012

Troubleshooting and optimizing queries on SQL Azure.

Detecting Excessive I/O Usage

Reading Too Much Data

Create Covered Indexes

No comments :