Feel Tech Evolution: Top 10 steps to optimize data access in SQL Server: Part IV (Diagnose database performance problems)

Introduction

Imagine you are a doctor, or a physician. What do you do when one of your patients arrive feeling out of sorts and fallen ill?
You will try to understand the cause of his/her illness, right? Yes. This is the most important thing to do first. Because, in order to cure your patient, you need to find out what causes your patient to fall ill. In most cases, you will study the symptoms, and based upon your knowledge and experience, you will suggest a treatment, which works in most of the cases.
But, you may not be lucky in all cases. Some patients have complex problems with multiple types of illnesses. Studying the symptoms alone is not sufficient in these cases. You suggest diagnosing the problems, and prescribe one or more tests to be done. Pathologists then collect samples from the patient and start finding out the causes of the illness. Once you get the test report, you are in a better position in understanding the problem that caused the patient's illness, and you are most likely to prescribe the correct treatment plan for cure.
Sounds like a familiar situation. Isn't this the same thing we have to do while trying to debug or troubleshoot a problem in our software systems?
Yes, it is. So, while we were actually trying to optimize our data access operations, it's time for us to learn how to diagnose different performance related problems in SQL Server databases. Take a look at the following articles to learn the step by step processes that we've already carried out so far:

Top 10 steps to optimize data access in SQL Server: Part I (Use indexing)
Top 10 steps to optimize data access in SQL Server: Part II (Re-factor TSQLs and apply best practices)
Top 10 steps to optimize data access in SQL Server: Part III (Apply advanced indexing and denormalization)

As you might have seen already, we have gone through seven optimization steps so far. Let us proceed to step 8 now:

Step 8: Diagnose performance problems, and use SQL Profiler and the Performance Monitoring Tool effectively

The SQL Profiler tool is perhaps the most well-known performance troubleshooting tool in the SQL Server arena. In most cases, when a performance problem is reported, this is the first tool that you are going to launch to investigate the problem.
As you perhaps already know, the SQL Profiler is a graphical tool for tracing and monitoring a SQL Server instance, mostly used for profiling and measuring the performance of the TSQLs that are executed on the database server. You can capture each event on the server instance and save event data to a file or table to analyze later. For example, if the production database performs slowly, you can use SQL Profiler to see which Stored Procedures are taking too much time to execute.

Basic use of the SQL Profiler tool

There is a 90% chance that you already know how to use it. But I assume a lot of newbies out there who are reading this article might feel good if there is a section on the basic usage of the SQL Profiler (if you know this tool already, just feel free to skip this section). So, here is a brief section:
Start working with the SQL Profiler in the following way:

Launch SQL Profiler (Tools -> SQL Server Profiler in Management Studio) and connect it to the desired SQL Server instance. Select a new trace to be created (File -> New Trace) and select a trace template (a trace template is a template where some pre-selected events and columns are selected to be traced).

Trace template

Optionally, select particular events (which should be captured in the trace output) and select/deselect columns (to specify the information you want to see in the trace output).

Select events to be captured for tracing

Optionally, organize the columns (click the "Organize Columns" button) to specify the order of their appearance in the trace. Also, specify the column filter values to filter the event data which you are interested in. For example, click on "Column Filters" and specify the database name value (in the "Like" text box) to trace events only for the specified database. Please note that filtering is important because SQL Profiler would otherwise capture unnecessary events and trace too many information that you might find difficult to deal with.

Filter column values

Run the profiler (by clicking the green Play button) and wait for the events to be captured on the trace.

Running the profiler

When enough information is traced, stop the profiler (b pressing the red Stop icon) and save the trace either into a trace file or into a SQL Server table (you have to specify a table name and the SQL Server Profiler would create the table with necessary fields and store all the tracing records inside it).

Storing profiler trace data into a table

If the trace is saved on a table, issue a query to retrieve the expensive TSQLs using a query like the following:

Collapse | Copy Code

SELECT TextData,Duration,..., FROM Table_Name ORDER BY
Duration DESC

Querying for the most expensive TSQL/Stored Procedure

Voila! You just identified the most expensive TSQLs in your application in quick time.

Effective use of SQL Profiler to troubleshot performance related problems

In most cases, SQL profiler is used to trace the most expensive TSQLs/Stored Procedures in the target database to find the culprit that is responsible for performance problems (described above). But, the tool is not limited to just providing TSQL duration information. You can use many of the powerful features of this tool to diagnose and troubleshoot different kinds of problems that could occur due to many possible reasons.
When you are running SQL Profiler, there are two possibilities. Either you have a reported performance related issue that you need to diagnose, or you need to diagnose any possible performance issue in advance so that you can make sure your system would perform blazing fast in production after deployment.
Following are some tips that you can follow while using the SQL Profiler tool:

Use existing templates, but create your own templates when in need.

Creating a new template

Specifying events and columns for the new template

Capture TableScan or DeadLock events.

Detecting table scan

Detecting deadlocks

Create Replay trace.

Replay trace

.trace

Creating Replay trace

http://msdn.microsoft.com/en-us/library/ms189604.aspx

Create Tuning trace.

Tuning

http://msdn.microsoft.com/en-us/library/ms166575.aspx

Create Tuning profiler trace

Capture ShowPlan to include SQL execution plans in the profiler.

Specifying execution plans to be included in the trace

Execution plan in the profiler trace

Use the Performance Monitoring Tool (Perfmon) to diagnose performance problems

When you encounter performance related problems in your database, SQL Profiler would enable you to diagnose and find out the reasons behind the performance issues in most cases. But, sometimes the profiler alone cannot help you in identifying the exact cause of the problems.
For example, when analyzing the query execution time using the profiler in the production server, you've seen that the corresponding TSQL is executing slowly (say, 10 seconds), though the same query takes a much lower time in the test server (say, 200 ms). You analyzed the query execution plans and data volume, and found those to be roughly the same. So there must have been some other issue that was creating a bottleneck situation in the production server. How would you diagnose this problem then?
The Performance Monitoring Tool (known as Perfmon) comes to your aid in these kinds of situations. Performance Monitor is a tool (that is built-in within the Windows OS) that gathers statistical data related to hardware and software metrics from time to time.
When you issue a TSQL to execute in the database server, there are many stakeholders participating in the actions to execute the query and return the result. These include the TSQL Execution engine, Server buffer cache, SQL Optimizer, Output queue, CPU, Disk I/O, and lots of other things. So, if one of these does not perform its corresponding task well and fast, the ultimate query execution time taken by the database server would be high. Using the Performance Monitoring Tool, you can take a microscopic look at the performance of these individual components and identify the root cause of the performance problem.
With the Performance Monitoring Tool (system monitor), you can create a counter log including different built-in counters (that measures the performance of each individual component while executing the queries) and analyze the counter log with a graphical view to understand what's going on in detail. Moreover, you can combine the performance counter log with the SQL Profiler trace for a certain period of time to better understand the complete situation while executing a query.

Basic use of Performance Monitor

Windows has lots of built-in objects with their corresponding performance counters. These are installed when you install Windows. While SQL Server gets installed, performance counters for SQL Server also get installed. Hence, these counters are available when you define a performance counter log.
Follow these steps to create a performance counter log:

Launch the Performance Monitor Tool from Tools->Performance Monitor in the SQL Profiler tool.

Figure : Launch Performance Monitor Tool

Create a new performance counter log by clicking on Counter Logs->New Log Settings.

Create a performance counter log

Specify the name for the performance counter log

Click on the "Add Counters" button to select the preferred counters in the newly created counter log.

Add counters for the performance counter log

Add the preferred counters by selecting the desired objects and their corresponding counters from the list. Click on "Close" when done.

Specify the objects and the corresponding counters

The selected counters will be displayed in the form.

Specify the counters

Click on the Log Files tab and click on the "Configure" tab to specify the log file location and modify the log file name if required. Click "OK" when done.

Specify the performance counter log location

Click on the "Schedule" tab to specify a schedule for reading the counter information and write in the log file. Optionally, you can also select "Manually" for the "Start log" and "Stop log" options, in which case, the counter data will be logged after you start the performance counter log.

Scheduling the performance counter log operation

Click on the "General" tab and specify the interval for gathering counter data.

Setting the counter sample interval

Press "OK" and start the performance counter log by selecting the counter log and clicking Start. When done, stop the counter log.

Starting the performance counter logging

For viewing log data, close and open the Performance Monitor Tool again. Click on the View Log icon (the icon in the red box) to view the counter log. Click on the "Source" tab and select the "Log files" radio button, and add the log file to view by clicking on the "Add" button.

Viewing the performance counter log

By default, only three default counters are selected to be shown in the counter log output. Specify other counters (that were included while creating the counter log) by clicking on the "Data" tab and selecting the desired counters by clicking on the "Add" button.

Specifying the counters to view the data in the log

Click the "OK" button to view the performance counter log output in a graphical view.

Viewing the performance counter log

Correlate the performance counter log and SQL Profiler trace for better investigation

The SQL Profiler can give you information about long running queries, but it cannot provide you with the context information to explain the reason for the long query execution time.
On the other hand, the Performance Monitor Tool gives you statistics regarding an individual component's performance (context information), but it does not give you information regarding the query execution time.
So, by combining the performance counter log with the SQL Profiler trace, you can get the complete picture while diagnosing performance problems in SQL Server.
Correlating these two things serve another important purpose also. If the same query takes longer time in the production server to execute than in the test server, that indicates the test server may not have the same amount of load, and the same environment and query execution context as the production server. So, to diagnose the performance problem, you need a way to simulate the production server's query execution context in the test server somehow. You can do this by correlating the SQL Profiler trace at the test server with the performance counter log that is taken at the production server (obviously, the SQL Profiler trace and performance counter log that are taken within the same time period alone can be correlated).
Correlating these two tool outputs can help you in identifying the exact root cause of the performance problem. For example, you might find that each time the query takes 10 seconds to execute in the production server, the CPU utilization reaches up to 100%. So, instead of trying to tune the SQL, you should investigate the reason why the CPU utilization rises up to 100%, to optimize the query performance.
Follow these steps to correlate the SQL Profiler trace with the performance counter log:

Create a performance counter log by incorporating the following common performance counters. Specify "Manual" option for starting and stopping the counter log.

Network Interface\Output Queue length
Processor\%Processor Time
SQL Server: Buffer Manager\Buffer Cache Hit Ratio
SQL Server: Buffer Manager\Page Life Expectancy
SQL Server: SQL Statistics\Batch Requests/sec
SQL Server: SQL Statistics\SQL Compilations
SQL Server: SQL Statistics\SQL Re-compilations/sec

Using SQL Profiler, create a trace using the TSQL Duration template (for simplicity). Add "Start Time" and "End Time" columns to the trace, and start the profiler trace and the performance counter log created in the previous step at the same time.
When enough tracing has been done, stop both the SQL Profiler trace and the performance counter log at the same time. Save the SQL Profiler trace as a .trc file in the file system.
Close the SQL Profiler trace window and open the trace file again with the profiler (.trc file) that was saved in the previous step (yes, you have to close the profiler trace and open the trace file again; otherwise, you won't get the "Import Performance Data" option enabled; this looks like a bug in the Management Studio). Click on "File->Import Performance Data" to correlate the performance counter log with the SQL Profiler trace. (If the Import Performance Data option is disabled, something is wrong, and review your steps from the beginning.) A file browser window will appear, and select the performance counter log file in the file system that is to be correlated.
A window will appear to select the counters to correlate. Select all counters, and press "OK". You will be presented with a screen like below that is the correlated output of the SQL Profiler trace and the performance counter log.

Correlated output of SQL Profiler and Performance Monitor Tool

Click on a particular TSQL in the profiler trace output (in the upper part of the window). You'll see that a red vertical bar will be set in the performance counter log output to indicate the particular counter statistics when that particular query was being executed. Similarly, click on the performance counter log output any where you see a certain performance counter's value is high (or, above the normal value). You'll see that the corresponding TSQL that was being executed on the database server will be highlighted in the SQL Profiler trace output.

I bet you'll surely find correlating these two tool outputs extremely interesting and handy.

Tuesday, 22 November 2011

Top 10 steps to optimize data access in SQL Server: Part IV (Diagnose database performance problems)