Database tuning is continuous and iterative process. It involves
identifying the bottlenecks, find their cause, apply resolution and
check if resolution works as expected. QL Azure is same database engine
as on premises SQL Server, most of the things apply equally on SQL
Azure as well.
In order to make use of tools like SQL profiler it is always good idea to create and perform initial optimization using on premises instance and later move to SQL Azure. It might save a lot of time and cost.
It is always good idea to create performance baseline before starting up performance optimization task to measure the effectiveness of optimization.
For on premises database we focus on various areas as listed below.
1) Database design
2) Query design
3) Configuration setting
4) Hardware and software administration.
5) Database backup
In case of windows azure apart from first two items, other have become less relevant for performance tuning. These first two points are the area where best results are achieved in performance tuning . Which is also shown in figure below.
1) Database design:- Following is checklist for database design for better performance
a) Database design start with applying normalization and create highly normalized databases. These databases are good for write performance. Once we are done with normalization and ready for performance optimization phase, de normalization is applied appropriately to improve the performance.
b) Separate OLAP (Online Analytical processing ) and OLTP( Online transaction processing workloads.
c) Apply entity integrity constraints – this enable optimizer to come up with better query plans.
d) Apply domain and referential integrity constraints – similar to above it helps query optimizer.
e) Choose appropriate data type.
f) Use indexed views in de-normalization.
g) Create indexes and follow index best practices : -
I. Use narrow column for indexes.
II. Use columns with high selectivity for indexes.
III. For composite index use leading column with high selectivity.
IV. Design indexes based on the where clause in queries.
V. Use index covering with include clause.
VI. Create clustered index very carefully since every non clustered indexes are based on the clustered index.
2) Query Design :- Following is a list of various items which should be kept in mind while designing queries.
a) Avoid arithmetic operator and functions in where clause.
b) Ensure no implicit type conversions.
c) Adopt best practices for reusing the execution plans.
d) Adopt best practices for database transactions.
e) Eliminate or reduce overhead of database cursors.
f) Evaluate alternate to temp tables.
Above guide lines should help to stay streamlined.
Inspecting execution plan and statistics
Figure 1: Performance gain vs Methodology employed to improve performance
Let’s explore the execution plan first. By default, a query’s execution plan is not displayed. To view a query’s execution plan, we need to explicitly include it before executing the query. Right-Click the query window and select Include Actual Execution Plan, as Figure 2 shows.
Figure 2: Including a query’s execution plan
Figure 3: Selecting the Taskbar icon to include a query’s execution plan
Figure 4: Execution plan of a query: select firstname,lastname from saleslt.customer where phone like '1%';
Figure 8: After index creation
STATISTICS
In order to make use of tools like SQL profiler it is always good idea to create and perform initial optimization using on premises instance and later move to SQL Azure. It might save a lot of time and cost.
It is always good idea to create performance baseline before starting up performance optimization task to measure the effectiveness of optimization.
For on premises database we focus on various areas as listed below.
1) Database design
2) Query design
3) Configuration setting
4) Hardware and software administration.
5) Database backup
In case of windows azure apart from first two items, other have become less relevant for performance tuning. These first two points are the area where best results are achieved in performance tuning . Which is also shown in figure below.
1) Database design:- Following is checklist for database design for better performance
a) Database design start with applying normalization and create highly normalized databases. These databases are good for write performance. Once we are done with normalization and ready for performance optimization phase, de normalization is applied appropriately to improve the performance.
b) Separate OLAP (Online Analytical processing ) and OLTP( Online transaction processing workloads.
c) Apply entity integrity constraints – this enable optimizer to come up with better query plans.
d) Apply domain and referential integrity constraints – similar to above it helps query optimizer.
e) Choose appropriate data type.
f) Use indexed views in de-normalization.
g) Create indexes and follow index best practices : -
I. Use narrow column for indexes.
II. Use columns with high selectivity for indexes.
III. For composite index use leading column with high selectivity.
IV. Design indexes based on the where clause in queries.
V. Use index covering with include clause.
VI. Create clustered index very carefully since every non clustered indexes are based on the clustered index.
2) Query Design :- Following is a list of various items which should be kept in mind while designing queries.
a) Avoid arithmetic operator and functions in where clause.
b) Ensure no implicit type conversions.
c) Adopt best practices for reusing the execution plans.
d) Adopt best practices for database transactions.
e) Eliminate or reduce overhead of database cursors.
f) Evaluate alternate to temp tables.
Above guide lines should help to stay streamlined.
Inspecting execution plan and statistics
This article is Part 1 of the
“Tuning SQL Azure database” series. The scope of this article is to
explain how to improve the performance of SQL Azure database by
inspecting the execution plan and statistics. The next article will
explain how to improve SQL Azure database performance by using
information from dynamic management views.
A unique thing about SQL Azure’s
multi-tenant architecture is the way it tackles the queries that tend to
throttle the SQL Azure’s (shared) server. Therefore, a longer running
query is more likely to get prematurely terminated. The rule of thumb is
to keep queries as short as possible. To do so, you need to tune the
query so that it runs for shorter duration of time. Inspecting the
execution plan and statistics of a particular query is one way to do so.
Also with SQL Azure database, you need to factor in performance delay
due to network latency and bandwidth for an application that is not near
to the database. If it’s an application hosted on the Azure platform
that connects to SQL Azure database, the best practice is to keep SQL
Azure database and the Azure application in the same data center. Apart
from performance gain, doing so will help you save on data transfer
costs. With SQL Azure database, a data transfer that takes place in the
same data center is not charged.
Traditionally, tuning a database involved some or all of the following steps:
- 1. Application tuning
2. Designing databases properly
3. Analyzing and optimizing queries
4. Changing hardware configuration or SQL Server configuration to minimize - bottlenecks
5. Altering system software parameters for efficient resource allocation
Because SQL Azure does not provide
control over underlying VM’s, system software or hardware; to tune a SQL
Azure database; we must focus on tuning the application, database
design, and query optimization. In fact, considerable performance gains
are obtained by employing correct development techniques and using
efficient data access code as shown in Figure 1: performance gain vs.
methodology employed to improve performance:
Figure 1: Performance gain vs Methodology employed to improve performance
Since
we know that tuning a database provides considerable performance gains,
in this article, we will discuss how to improve SQL Azure database
performance by inspecting the information available in the execution
plan and statistics.
Execution Plan
One of the best ways to analyze the
performance of a particular query is to inspect the query’s execution
plan. SQL Azure does not currently support SQL Profiler and Index Tuning
Wizard; however, we can view query’s execution plan and tune it based
on the information available through the execution plan. In addition to
execution plans, it is possible to inspect CPU time and I/O information
of a query execution using STATISTICS TIME and STATISTICS IO, which we
will discuss later in the article.
If you are already familiar with
using the execution plan and statistics for query optimization with SQL
Server, then you will see that it’s not different with SQL Azure.
I use SQL Server Management Studio 2008 R2 to connect to SQL Azure server and inspect the execution plan and statistics.
Let’s explore the execution plan first. By default, a query’s execution plan is not displayed. To view a query’s execution plan, we need to explicitly include it before executing the query. Right-Click the query window and select Include Actual Execution Plan, as Figure 2 shows.
Figure 2: Including a query’s execution plan
Alternatively,
you can select the option to include the actual execution plan by
clicking the icon in the SQL Editor Taskbar, as Figure 3 shows:
Figure 3: Selecting the Taskbar icon to include a query’s execution plan
For the purpose of this article, we
will view the execution plan in graphical format, but you can also view
it in text or XML format. In addition, you can view estimated execution
plans for a query. Viewing estimated execution plans can be beneficial
with SQL Azure because the data is not returned, so we save on the data
transfer cost. Note that the query is not executed for the estimated
plan, whereas the query is executed for the actual execution and the
plan displayed is the output from the engine showing the plan used.
Analyzing estimated execution plans instead of actual plans can be handy
in a lot of scenarios because you save the time to execute the query.
However, there are cases in which inspecting the actual execution plan
is preferred—for example, when cardinality estimates are not accurate or
when the code uses either dynamic T-SQL code or temporary tables. The
other approach is to tune the queries for the on-premises version of the
database and later migrate the database to SQL Azure. This approach can
also save on the data transfer cost. In addition, after the database is
migrated, the actual performance can be inspected using dynamic
management views, which we will explore in Part 2 of this article.
For the demonstrations in this
article, I used the AdventureWorksLT sample database, which is the
official sample database for SQL Azure. You can download this database
from the Microsoft SQL Azure download page, at http://msftdbprodsamples.codeplex.com/releases/view/37304.
Let’s run a query:
Figure 4 shows the execution plan for this query.
Figure 4: Execution plan of a query: select firstname,lastname from saleslt.customer where phone like '1%';
Note a couple of things in the execution plan:
- 1. Clustered Index Scan means that server scanned all records to answer the
- query
2. Suggestion to create an index, including specifications
A table scan is required when we
are displaying every record, but when we have filtered our query using
the where clause and only a fraction of rows are returned, we can create
an index to improve performance.
One thing that should be reviewed
in creating an index for SQL Azure database is the space occupied by the
index. You might not mind a single index occupying few kilobytes of
space. But what if your database has a lot of indexes? This will add to
the database size. And with SQL Azure, we are charged based on the size
of the database. So it becomes vital to keep a tab on the space that
indexes occupy. Also, creating an index will impact the performance of
DML queries (which I discuss later in this part of the article)
Following is the T-SQL code to review the size of indexes in a table for SQL Azure database:
Note that SQL Azure does not support tables without a clustered index. Every Table should have exactly one clustered index.
Now, Let’s create the index based on the index suggestion and inspect the performance gain, if any.
Note the “include” clause with the
index. By using an index with included columns, we can cover more
queries. Later in the article, I demonstrate this fact.
Now run the query again:
Figure 5 shows the execution plan:
Figure 5: Execution plan after rerunning the query
Note
that the execution plan now shows “Index Seek” which means that
particular rows are found using a seek operation on the index. This
improves performance because not all rows are supposed to be scanned to
find the relevant information. Now, let’s run three separate queries to
see if the index impacts other queries as well.
Figure 6 shows the execution plan for each of the three queries:
Figure 6: Execution plan for three separate queries.
Recall
that Index with included columns can cover more than one query. We can
see In Figure 6 that a non-clustered index can improve performance for
multiple queries. For more information about indexes with included
columns, see the Microsoft article “Index with Included Columns”.
You can drop the non-clustered index named nonclustered_index_cust_phone created earlier by using the following code:
Now,
Before creating indexes, we also
have to factor in the performance cost that will be incurred from
creating, updating, and deleting queries on the table. Figure 7 shows
the results before index creation; Figure 8 shows the results after
index creation.
Figure 7: Before index creation
Figure 7: Before index creation
As
you can see in Figure 7, I inserted a batch of 100 rows in the “before
index creation” scenario. The elapsed time is 53 seconds.
Figure 8: After index creation
In
the “after index creation” scenario, I create a non-clustered index
with included columns. I then insert the batch of 100 rows again, with
other parameters constant. The elapsed time is 55 seconds.
Thus, you need to keep the following considerations in mind with index creation:
- 1. Performance cost for create, update, and delete statements after index
- creation
2. Size of the index vs. performance gain; because we are charged for the size - of the database, it is important to analyze the performance gain vs. the
- additional cost due to the increase in size of the database with index
- creation
Indexing is an art as well as
science, and it is important to balance the performance gain vs. cost
for a particular scenario. Here, in this article, my purpose was to
point out the fact that Execution plan is available in SQL Azure and the
Information available through the execution plan is useful in tuning
SQL Azure database.
STATISTICS
Statistics
are a handy tool for tuning queries. Information such as CPU time and
I/O information for query execution is accessible through STATISTICS IO
and STATISTICS TIME. Based on this information available from it, we can
see whether or not query was tuned.
By default, the STATISTICS IO and STATISTICS TIME are OFF. The following T-SQL code turns them ON:
You can see the statistics in the
‘Message’ tab of the output window. Now, let’s try to tune a query and
see how the information available from STATISTICS IO and STATISTICS TIME
can be used during the performance tuning process.
If you inspect the message tab, you will see the following output:
The most interesting information available from STATISTICS TIME, is the following:
“CPU time” shows the amount of CPU
time it took to execute a query; and “elapsed time” shows the total time
it took to run the query. Because of varying loads on the server, the
elapsed time will show varying times, but the CPU time is invariably
consistent. Because CPU time is consistent, this parameter can help in
determining whether the changes made in the query, during performance
tuning, are acually helping (or hurting). Because it a simple query, we
encounter Zero’s.
When I ran “select * from saleslt.product,” I encountered the following output:
As discussed earlier, the ‘elapsed time’ will show inconsistent results.
The information available from
STATISTICS, that interest me is the ‘logical reads’. Logical reads are
the number of pages server had to read from the data cache in order to
produce the results of the specified query. In addtion, logical reads
remain constant from one execution of a query to an other execution of
the same query. If this number goes down when you are tunign a query, it
is an indication that the query is using less resources and that its
performace will improve.
Now, let’s create an index:
Then, lets run the query again and inspect the Messages tab:
Note that after index creation, the
logical reads decreased from 103 to 2. As discussed earlier, this is an
indication that the server is using less resources and therefore,
perofrmance will improve. Based on the information avaialable from
STATISTICS IO and STATISTICS TIME, we verified that our query is getting
tuned.
No comments :
Post a Comment