Regular Database Maintenance

The estimated reading time for this post is 11 minutes

Obtaining the best performance from the database engine requires a schedule of routine maintenance operations.

Database corruption is rare but one of the most important tasks in the maintenance schedule is to ensure that no corruption has occurred in the databases. SQL Server indexes will perform better if you periodically review their performance and remove any fragmentation that occurs within them.

Database Integrity

The database engine depends upon the hardware platform that it runs on, and that can cause corruption. In particular, issues in the memory and I/O subsystems can lead to corruption within databases.

There are two levels of database integrity:

  • Physical integrity. Ensures that SQL Server can read and write data extents to the physical media.
  • Logical integrity. Ensures that the data within the pages is logically correct. The pages can then refer to each other as required so that SQL Server can fetch related pages.

Without regular checking of the database files, any lack of integrity of the database might lead to bad information derived from it. Backup does not check the integrity, it only checks the page checksums and that is only when you use the WITH CHECKSUM option on the BACKUP command. Although the CHECKSUM database option is important, the checksum is only checked when SQL Server reads the data. The exception to this is when SQL Server is backing up the database and using the WITH CHECKSUM option. Archive data is, by its nature, not read frequently and this can lead to corrupt data within the database that, if it’s not checked as it’s backed up, may not be found for months.

DBCC CHECKDB

SQL Server provides the DBCC utility, which supports a range of management facilities. The DBCC CHECKDB command makes a thorough check of the structure of a database, to detect almost all forms of potential corruption. The functionality that DBCC CHECKDB contains is available as a range of options which are described as follows:

  • DBCC CHECKALLOC checks the consistency of the database files disk space allocation units.
  • DBCC CHECKTABLE checks the pages associated with a specified table and the pointers between pages that are associated with the table. DBCC CHECKDB executes DBCC CHECKTABLE for every user table in the database.
  • DBCC CHECKCATALOG checks the database catalog schema by performing logical consistency checks on the metadata tables in the database. These tables hold information that describes both system and user tables in addition to other database objects. DBCC CHECKCATALOG does not check user tables.

[INFO] Note: DBCC CHECKDB also performs checks on other types of objects, such as the links for FILESTREAM objects and consistency checks on the Service Broker objects. [/INFO]

DBCC CHECKDB Aspects

Repair Options

Usually, the best method for database recovery is to restore a backup of the database by synchronizing the execution of DBCC CHECKDB with backup retention policy. This ensures that you can always restore a database from an uncorrupted database.

Online Concurrent Operations

DBCC CHECKDB works using internal database snapshots to ensure that the utility works with a consistent view of the database to allow the database to be in use during this operation.

Disk Space

The use of an internal snapshot causes DBCC CHECKDB to need additional disk space. The amount of disk space required on the volumes depends on how much data is changed during the execution of DBCC CHECKDB.

DBCC CHECKDB also uses space in the tempdb database while executing. To provide an estimated amount of space required in tempdb, DBCC CHECKDB provides an ESTIMATEONLY option.

Backups and DBCC CHECKDB

It is good practice to run DBCC CHECKDB on a database before performing a backup. This check helps to ensure that the backup will contain a consistent version of the database.

MAXDOP override

You can reduce the impact of the DBCC utility on other services running on the server by setting the MAXDOP option to more than 0 and less than the maximum number of processors in your system.

DBCC CHECKFILEGROUP

The DBCC CHECKFILEGROUP command runs checks against the user objects in the specified filegroup. This has the potential of saving you considerable checking of non-user metadata objects.

DBCC CHECKDB Options

DBCC CHECKDB provides a number of options that alter its behavior while it is executing.

Option                                                  description

PHYSICAL_ONLY                               Only checks the integrity of the physical structure to reduce overhead

NOINDEX                                             Does not perform logical checks on nonclustered indexes

EXTENDED_LOGICAL_CHECKS    Performs additional logical checks of indexed views, spatial, and XML indexes.

TABLOCK                                             Uses locks instead of database snapshot

ALL_ERRORMSGS                             Returns all error messages instead of the default action that returns the first 200

NO_INFOMSGS                                 Returns only error messages with no informational messages

ESTIMATEONLY                                 Estimates the amount of tempdb space that is required to run

DBCC CHECKDB Repair Options

You should back up a database before performing any repair option, if you can, and also find and resolve the reason for the corruption. Otherwise, it could happen again soon after.

DBCC CHECKDB offers two repair options. For both options, the database needs to be in single user mode. The options are:

  • REPAIR_REBUILD rebuilds indexes and removes corrupt data pages. This option only works with certain mild forms of corruption and does not involve data loss.
  • REPAIR_ALLOW_DATA_LOSS will almost certainly lose some data. It deallocates any corrupt pages it finds and changes others that reference the corrupt pages to stop them trying to reach them. After the operation completes, the database will be consistent, but only from a physical database integrity point of view. Significant loss of data could have occurred. Repair operations do not consider any of the constraints that may exist on or between tables. If the specified table is involved in one or more constraints, you should execute DBCC CHECKCONSTRAINTS after running the repair operation.

Index Maintenance

Indexes can help to improve searching, sorting, and join performance, but they can also impede the performance of data modification. Over time, through the manipulation of the data, indexed data becomes fragmented. This fragmentation reduces the performance of storage access operations at the physical level. Another important aspect of SQL Server that requires ongoing maintenance for optimal performance is the management of indexes.

Index Fragmentation

Index fragmentation occurs over time, when data modification causes index pages to split. There are two type of index fragmentation:

  • Internal fragmentation occurs when pages are not completely full of data. This often occurs when a page is split during an insert—or update operation in a case where the data could not fit back into the space initially allocated to the row.
  • External fragmentation occurs when pages get out of sequence during Data Manipulation Language (DML) operations that split existing pages when modifying data. When the index requires and allocates a new page within an extent that is different to the one that contains the original page, extra links are required to point between the pages and extents involved.

You can use sys.dm_db_index_physical_stats dynamic management view to analyze the level of fragmentation in your indexes and to decide whether to rebuild it.

The following code shows how to detect objects with fragmentation more than 30 percent in database “adventureWorks2012”:

SELECT    DB_NAME(database_id)  AS [Database]  

                , OBJECT_NAME(object_id) AS [Object] 

                , avg_fragmentation_in_percent AS Fragmentation

 

FROM        sys.dm_db_index_physical_stats(DB_ID(‘AdventureWorks2012’),DEFAULT,DEFAULT,DEFAULT,DEFAULT)

WHERE      avg_fragmentation_in_percent > 30 

ORDER BY avg_fragmentation_in_percent DESC

FILLFACTOR and PAD_INDEX

You can reduce the performance impact of page splits by leaving an empty space on each page when creating an index, including a clustered index. You can use the FILLFACTOR option when you create the index to define how full the index pages should be. The default value of the FILLFACTOR is 0, which actually means fill the pages 100 percent. Any other value indicates the percentage of each page filled, on the index creation.

By default, the FILLFACTOR option only applies to leaf-level pages of an index. You can use it in conjunction with the PAD_INDEX = ON option to cause the same free space to be allocated in the nonleaf levels of the index as well.

Although the use of the fill factor to pre-allocate space in the index pages can help in some situations, it is not a best practice; it may even impede performance where all the changes are on the last page anyway. Consider the auto increment primary key index; It is unlikely that this fill factor will improve performance. The primary key ensures that no duplicate rows occur; therefore, it is unlikely that you will insert values for such rows in the middle of existing ones, because the primary key is sequential. Also, if you do not use variable sized columns, you will be able to update rows in place too, so there is no requirement for empty space for updates.

Make sure you consider the costs in addition to the benefits of setting the empty space in index pages. Both memory pages and disk space are wasted if the space adds no value. There will also be reduced network efficiency and an inability of the processing unit to get as much useful data when it needs to process it.

Index Defragmentation

SQL Server provides two options for removing fragmentation from clustered and non-clustered indexes.

  • REBUILD drops and recreates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the specified or existing fill factor setting, and reorders the index entries in contiguous pages. When you use the ALL option, SQL Server drops all indexes on the table and rebuilds them in a single operation. If any part of that process fails, SQL Server rolls back the entire operation. A single rebuild operation can use a large amount of space in the transaction log. To avoid this, you can change the recovery model of the database to use the BULK_LOGGED or SIMPLE recovery models before performing the rebuild operation, so that it is a minimally logged operation.
  • REORGANIZE uses minimal system resources. It defragments the leaf level of clustered and nonclustered indexes by physically reordering the leaf-level pages to match the logical, left to right order of the leaf nodes. Reorganizing an index also compacts the index pages. The compaction uses the existing fill factor value for the index. You can interrupt a reorganize without losing the work performed so far. This means that, on a large index, you could configure partial reorganization to occur each day.

For heavily fragmented indexes (those with fragmentation greater than 30 percent) rebuilding is usually the most appropriate option to use. SQL Server maintenance plans include options to rebuild or reorganize indexes. If you do not use maintenance plans, it is important to create a job that regularly performs defragmentation of the indexes in your databases.

Many organizations have no available maintenance-time windows during which to perform database maintenance operations such as index rebuilds. The Enterprise edition of SQL Server can perform index operations online while users are accessing the database. During the online rebuild operation, the database engine applies schema locks to prevent metadata changes. This means that users cannot change the structure of the table using commands such as ALTER TABLE while the online index rebuild operation is occurring.

Because of the extra work that needs to be performed, online index rebuild operations are typically slower than offline ones.

Updating Statistics

One of the main tasks that SQL Server performs when it is optimizing queries is to determine which indexes to use and which ones not to use. To make these decisions, the query optimizer uses statistics about the distribution of the data in the index and data pages. SQL Server automatically creates statistics for indexed and non-indexed columns when you enable the AUTO_CREATE_STATISTICS database option.

SQL Server automatically updates statistics when AUTO_UPDATE_STATISTICS is set. This is the default setting and it is best that you do not disable this option unless you really have to, because it is necessary for a package solution you are implementing on top of SQL Server.

For large tables, the AUTO_UPDATE_STATISTICS_ASYNC option instructs SQL Server to update statistics asynchronously instead of delaying query execution, where it would otherwise update an outdated statistic before query compilation.

You can also update statistics on demand. Executing the Transact-SQL code UPDATE STATISTICS <tablename> causes SQL Server to refresh all statistics on the specified table. You can also run the sp_updatestats system stored procedure to update all statistics in a database. This stored procedure checks which table statistics are out of date and refreshes them.

SQL Server Maintenance Plans

You can use the SQL Server Maintenance Plan Wizard to create SQL Server Agent jobs that perform routine database maintenance tasks. The wizard creates SQL Server Integration Services (SSIS) packages for SQL Server Agent tasks to execute on the specified schedules. You can schedule many maintenance tasks to run automatically, including:

  • Checking database integrity.
  • Shrinking databases.
  • Rebuilding and reorganizing indexes in a database.
  • Updating database object statistics.
  • Cleanup of task history.
  • Activation of related SQL Agent jobs and alerts.

regular database maintenance - Maintenance Plan Wizard
image-3837
Monitoring Database Maintenance Plans

You can monitor them by using the standard Job Activity Monitor in SSMS. The results of the maintenance tasks are stored in the maintenance plan tables in the msdb database. You can query these tables, dbo.sysmaintplan_log and dbo.sysmaintplan_log_detail to view the entries from the Log File Viewer. In addition, tasks can generate text-based reports, write them to the file system, and send them automatically to operators defined in the SQL Server Agent service system.

regular database maintenance - Log File Viewer
image-3838

 

Hamid J. Fard

I am SQL Server Data Platform Expert with more than 9 years’ of professional experience, I am currently Microsoft Certified Master: SQL Server 2008, Microsoft Certified Solutions Master: Charter-Data Platform, Microsoft Data Platform MVP and CIW Database Design Specialist. I also do Pig, Hive and Sqoop development regarding Hadoop Big Data platform. After a few years of being a production database administrator I jumped into the role of Data Platform Expert. Being a consultant allows me to work directly with customers to help solve questions regarding database issues for SQL Server.

More Posts

Follow Me:
FacebookLinkedIn


Leave a Comment

avatar
1000
  Subscribe  
Notify of