Saturday, June 13, 2009

Understanding Fragmentation (Index) SQL Server – Simplistic View

“Googling” (www.google.com) or “Living” :)  (www.live.com) about fragmentation in SQL Server would result in innumerable link and most of them almost state same things in similar manner. Going thru couple of them left me wondering if Fragmentation is a simple and well understood (as represented by number of links) why the heck did we need so many articles, blogs etc etc.

In this blog, see Logical Fragmentation in a different and yet simplistic way and how to calculate Fragment Counts, Page Counts and Fragmentation %.

Put in simple way Fragmentation is out of order pages. Imagine taking notes in School for different subjects. If all of subject notes are taken in same book with out any order being followed then we would end up fragmenting the book. To elucidate this further,

Math notes in Pages 1, 3, 9, 5, 11, 21. English notes in Pages 2, 22, 25, 29.

Now to read complete Math notes I will have go to 1, 3, 9, 5, 11, 21 (Order of reading is important).

Every database has Data File and each data file is logically split into Pages (similar to notebook pages). For example Page 1, 2, 3 so on…… But if a table occupies multiple pages and not consecutively or contiguously then table is logically fragmented.

Fragmentation performance impact effects are only felt when all pages need to be read not when only 1 pages needs to be read. Understanding with our notes example even if Math notes is spread across multiple books (and pages) if it is only 1 page that to be read and if I know what page to read, even if pages are in random order it does not affect at all as I do not need to flip through pages but if entire notes needs to be read then fragmentation affects performances as pages are not in order.

Coming back to SQL Server, “Fragmentation does not effect Index Seek  but only effects Index Scan operations.”

 

In SQL Server 2005, (also in SQL Server 2000) there are couple of commands that we would be using.

DBCC IND, (I learnt usage of DBCC IND from www.SQLSkills.com)

DBCC PAGE (later but not in this blog)

Sys.dm_db_Index_Physical_Stats DMV (SQL Server 2005). More reference about this DMV from (http://msdn.microsoft.com/en-us/library/ms188917.aspx)

With this above three, I will try to explain how to calculate Fragment Counts, Page Counts and Logical Fragmentation %.

Before that formula that has been practically arrived for Logical Fragmentation % is

Logical Fragmentation % = ((Distinct Fragment Counts  - 1) / Page Counts)*100

From DBCC IND command we could identify Fragments, Page Count and then calculate Fragmentation % and compare with columns from the output of Sys.dm_db_index_Phsyical_Stats.

use Tempdb
go

/*

    Create Table in Temporary data.
    The Datatype chose is Char(900), due to
        Large data size fragemenation can be achieved easily and
        Index size would be 900 bytes
*/
Create table TestLogicalFragmentation(Col1 Char(900))
go

-- Insert Rows into table ( 8 Rows * 900 Bytes each = 7.2 KB < 8 KB (Page Size)
insert into TestLogicalFragmentation values('A')
go 8 --Inserts 8 Rows

--Created Index
Create index ind on TestLogicalFragmentation(Col1)
go

--Output of DBCC IND for Index ind and Index ID = 2 as it is a Non Clustered Index
declare @DBID int
declare @ObjectID int
select @DBID = db_id(), @ObjectID = Object_ID('TestLogicalFragmentation')
DBCC IND(@DBID,@ObjectID,2)

Output of DBCC IND

image

Index ID = 2 , Non Clustered Index (IND)

Page Type= 2 stands for Index Page and Page Type= 10 stands for IAM Page

Index Level: Number of Index Levels from Root to leaf level and in this there is only 1 Page.

Now let us insert couple of more rows in the table and dump IND again.

insert into TestLogicalFragmentation values('A')
go 8 --Inserts 8 Rows

declare @DBID int
declare @ObjectID int
select @DBID = db_id(), @ObjectID = Object_ID('TestLogicalFragmentation')
DBCC IND(@DBID,@ObjectID,2)
go

image

Now there are multiple pages, Pages 174, 110 and 115 are of Page Type 0, Page 90 of Page Type 1 and Page 41 of Page Type 10. We would consider only Page type 2 and Index Level 0 as they are index Pages (Type 2) and Leaf Level pages (Index Level 0) as fragmentation is prevalent more that Leaf Level.

Page 174: NextPagePID = 115 and PrevPagePID = 0 indicates first page in double linked list.

Page 115: NextPagePID = 110 and PrevPavePID=174 indicates second page.

Page 110 is Last Page and could be arranged as below:

image

But if the Pages were contiguous with no fragmentation it could have been in any ways as below.

image

Going back to current way pages are stored (first diagram) pages are stored in 3 different fragments, if they were contiguous like above it would have in 1 Fragment irrespective they span across Extents or in same extents. (i.e 8 page of Fragment x and 1 Page of Fragment x + 1 are not considered different fragments but are of same fragment).

Now coming back to % Fragmentation = ((Fragment Count – 1) / Total pages) * 100

In this scenario Logical fragmentation = ((3 different Fragments – 1)  / 3 Total Pages ) * 100 = 66.67% fragmentation.

Let us verify with DMV.

declare @DBID int
declare @ObjectID int
select @DBID = db_id(), @ObjectID = Object_ID('TestLogicalFragmentation')
Select * from sys.dm_db_Index_Physical_Stats(@DBID,@ObjectID,2,null,null)
go

image

See these columns these concur with our findings above.

In the next blog I will try to give different angle for fragmentation for more number of pages.

No comments: