Monday, June 16, 2014

CHANGE TRACKING (CT)

SQL Server - Change Tracking (CT)




Introduction

Have you ever tried to track the changes (DML changes like Insert, Delete and Update)

If you have not, let’s play with the data by using one of the great features of SQL Server 2008.

Background

Prior to SQL Server 2008, developers had to create some custom tracking solutions using DML Trigger and additional tables to track the data which we have modified.
  • DML Triggers: These are the part of our transaction which contains the DML by which it is triggered. As we all know that, triggers are very expensive and we are using them in our transaction, the execution time will increase so that the performance of our project will be affected.
  • Additional tables: By running the above DML triggers, we are able to track the data. But there is nothing to store these changes. To store this changed data, we need to create these additional tables. These tables will have similar columns as the tables which we need to track.

Drawbacks

  • Takes much time to develop/create DML triggers and additional tables
  • Performance hit
  • Very complex process

SQL Server 2008

To overcome the above drawbacks, SQL Server 2008 introduced powerful and efficient tracking mechanisms called‘Change Tracking (CT)’ and Change data Capture (CDC)
In this article Change Data Capture (CDC) is out of scope. We may discuss about this in the coming articles.
Note: Since I am a lazy fellow I am not going to say Change tracking every time in this article. But I will just say CT.
CT is a new feature in SQL Server 2008 which allows us to track information about changes that we have made to tables in which CT is enabled.
When this is turned on for a table, it will keep track of each DML operation and the keys of the row that was affected. This means that at any time, we can query to find out which rows in our table(s) were inserted/deleted/updated.

Advantages:

  • DML Triggers and additional tables are no more required
  • Efficient & fast
  • Easy to track the data
  • Minimal Disk Space Costs
  • Packaged Functions available to query the data
  • Auto Clean Up of side table data

Working with CT:

If we want to work with CT, we must enable CT on our database and tables on which we are interested to track the changes. We cannot enable CT for table without enabling CT for database. So should first enable CT for database and then for table.
Before enabling, let us create a database ‘Change Tracking’ and a table ‘[dbo].[EmployeeTracking]’ and insert 10 rows into it.
---------------------------------------------
-- Create Database ChangeTracking
---------------------------------------------
USE master
GO
CREATE DATABASE ChangeTracking
GO
---------------------------------------------
---------------------------------------------
-- Create table [dbo].[EmployeeTracking]
---------------------------------------------
USE ChangeTracking
GO
CREATE TABLE [dbo].[EmployeeTracking]
(
  EmployeeID INT PRIMARY KEY,
  EmployeeName Varchar(30),
  Designation Varchar(15)
)
---------------------------------------------
---------------------------------------------------------------
-- Insert 10 records into [dbo].[EmployeeTracking] table
---------------------------------------------------------------
BEGIN TRANSACTION [CT]
 BEGIN TRY
INSERT [dbo].[EmployeeTracking] VALUES(1,'Biswajit','SASE')
INSERT [dbo].[EmployeeTracking] VALUES(2,'Chandu','SASE')
INSERT [dbo].[EmployeeTracking] VALUES (3,'Rukmini','CSE')
INSERT [dbo].[EmployeeTracking] VALUES (4,'Bhawna','SASE')
INSERT [dbo].[EmployeeTracking] VALUES (5,'Mohan','CSE')
INSERT [dbo].[EmployeeTracking] VALUES (6,'Athira','ASE')
INSERT [dbo].[EmployeeTracking] VALUES (7,'Sravani','ASE')
INSERT [dbo].[EmployeeTracking] VALUES(8,'Kiranmai','ASE')
INSERT [dbo].[EmployeeTracking] VALUES (9,'Nagendra','LCSE')
INSERT [dbo].[EmployeeTracking] VALUES (10,'Laxman','LCSE')
COMMIT TRANSACTION [CT]
 END TRY
BEGIN CATCH
ROLLBACK TRANSACTION [CT]
END CATCH
GO
---------------------------------------------------------------
----------------------------------------------------
SELECT * FROM [dbo].[EmployeeTracking] -- 10 rows
----------------------------------------------------

Enabling CT for a database:

Before enabling the CT, let us check the properties of our database. By default CT is disabled.

Let us enable it by executing the following script.
---------------------------------------------
USE master
GO
ALTER DATABASE ChangeTracking
SET CHANGE_TRACKING = ON
---------------------------------------------


We can run this in ALTER statement in Change Tracking database also. But it is a best practice to run this in master database.
While enabling, we can use the below two optional parameters.
CHANGE_RETENTION: This allows us to set the period for which tracking data should be stored. We can also set this value as per our requirement. Suppose, if you make it 100 days, tracking data will be available for 100 days and older data will be removed automatically. By default it is 2 days.
AUTO_CLEANUP: It cleans the older data automatically if it is ON. By default it is ON.
Ex:
---------------------------------------------------------------------------
USE master
GO
ALTER DATABASE ChangeTracking 
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 100 DAYS,AUTO_CLEANUP = ON)
GO
---------------------------------------------------------------------------
We can enable CT by using SQL Server Management Studio (SSMS) also.
Go to Object Explorer -> Databases -> Right click on your database -> Properties -> Change Tracking page -> set the values -> Ok.

Note: Enabling from SSMS is not a good practice.

Enabling CT for a table

Before enabling the CT for our table, let us check the properties of our table. By default CT is disabled.Let us enable it by executing the following script.

---------------------------------------------
USE ChangeTracking
GO
ALTER TABLE [dbo].[EmployeeTracking]
ENABLE CHANGE_TRACKING
---------------------------------------------


Since our table exists in Change Tracking database, we should run this under Change Tracking database not in master.
While enabling, we can use the below optional parameter.
TRACK_COLUMNS_UPDATED: This parameter is used to indicate the columns which are changed by UPDATE operation and also indicates that row has changed. By default, it is OFF.

Example

----------------------------------------------------------
USE ChangeTracking
GO
ALTER TABLE [dbo].[EmployeeTracking] 
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF)
GO
----------------------------------------------------------
We can enable CT by using SQL Server Management Studio (SSMS) also.
Go to Object Explorer -> Databases ->Open your database ->Tables -> Right click on your table ->Properties -> Change Tracking page -> set the values -> OK.
Now everything is ready. Let’s track the data.

Operation 1

Delete one row from [dbo]. [EmployeeTracking] table.
--------------------------------------------------
DELETE FROM [dbo].[EmployeeTracking]
WHERE EmployeeName ='Bhawna'
--------------------------------------------------

Now the total numbers of rows are reduced to 9.
--------------------------------------------------
SELECT * FROM [dbo].[EmployeeTracking] -- 9 rows
--------------------------------------------------

Operation 2

Insert one row into [dbo].[EmployeeTracking] table.
--------------------------------------------------
INSERT [dbo].[EmployeeTracking]
VALUES(11,'Arumugam','EM')
--------------------------------------------------

Now total number of rows is increased to 10.
--------------------------------------------------
SELECT * FROM [dbo].[EmployeeTracking] -- 10 rows
-------------------------------------------------- 

It is time to track the data. Before tracking I would like to teach you guys about some built in functions of CT.

List of functions used in CT:

  • CHANGETABLE()
  • CHANGE_TRACKING_CURRENT_VERSION()
  • CHANGE_TRACKING_MIN_VALID_VERSION()
  • CHANGE_TRACKING_IS_COLUMN_IN_MASK()
  • WITH CHANGE_TRACKING_CONTEXT()

1. CHANGETABLE():

It is the mostly used CT function and which is used to obtain CT information of a table since the Tracking started.
It accepts two parameters.
  • First parameter will be CHANGES+Tablename which need to be tracked.
  • Second parameter will be PreviousSyncVersion which represents the number of changes to be retrieved since the CT started. Confused??? Please go through the below example.
For example, you have done 5 modifications to the table, but you want to track only recent (last) change. Then it should be 4. It is working like 5-4=1. It retrieves only one record out of 5 records. I think now you got how it is working. Like this make it ‘0’, if you want to retrieve all the changes made since the CT started.
Column NameDescription
SYS_CHANGE_VERSIONIt represents the last version number when a particular row was modified
SYS_CHANGE_CREATION_VERSIONIt represents the version number when a record is inserted.
SYS_CHANGE_OPERATIONThe possible values of this column are I or D or U.
I - Insert
D - Delete
U - Update
SYS_CHANGE_COLUMNSIt represents all the columns impacted since last baseline version. This column will have values only when UPDATE operations are happened.
<primary key columns>It represents the primary key columns of our tracked table.
2. CHANGE_TRACKING_CURRENT_VERSION():
It gives us the most recent value (version of last modification) of SYS_CHANGE_VERSION column.
--------------------------------------------
SELECT CHANGE_TRACKING_CURRENT_VERSION()
--------------------------------------------

3. CHANGE_TRACKING_MIN_VALID_VERSION():
It gives the minimum change version number of a table after CT information is available or retained. It accepts one mandatory parameter which is the objectID of our table.
-------------------------------------------------------------------------------------------
SELECT CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('EmployeeTracking')) AS MinValidVersion
-------------------------------------------------------------------------------------------

Note: You may get doubt that it is always 0. The answer is No. It cannot be zero always. Since you are only tracking the one table information, it is zero. But if you deal with more than one table, it will be more than one. Because the tracking data of all tables is stored in CHANGETABLE only.
4. CHANGE_TRACKING_IS_COLUMN_IN_MASK():
The value returned by SYS_CHANGE_COLUMNS column in CHANGETABLE function contains data in binary format. This function is will return 1 or 0 depending on whether the column was updated or not. It takes two parameters. First parameter is the table name and second parameter is the column name.
5. WITH CHANGE_TRACKING_CONTEXT():
This function is used to differentiate between the changes you have made by your application and other applications.
It accepts one parameter called ‘context’ which is provided by the calling application and stored with tracking information. It is varbinary(128) and it do not accept NULL.
Note: Since the above two functions are not much useful in CT, I am not going in deep.
If you remember, around 5minutes back we did two changes to data in our table. i.e. One Delete and one Insert. Now i am going to track those two records by using CHANGETABLE.
------------------------------------------------------------------------
SELECT * FROM CHANGETABLE(CHANGES EmployeeTracking,0) AS ET -- 2 rows
------------------------------------------------------------------------

In the above screenshot, we are having some columns which we do not want. So let’s do some modification to the above query to display only required columns.
-------------------------------------------------------------
SELECT SYS_CHANGE_VERSION,SYS_CHANGE_OPERATION,EmployeeID
FROM CHANGETABLE(CHANGES [dbo].[EmployeeTracking],0)AS ET
ORDER BY SYS_CHANGE_VERSION
-------------------------------------------------------------

Note:
Aliasing is mandatory while using CHANGETABLE. If you run the below statement, it will throw the error.
-------------------------------------------------------
SELECT * FROM CHANGETABLE(CHANGES EmployeeTracking,0)
-------------------------------------------------------

Operation 3:
Let’s update one of the records in our table.
-----------------------------------------------------------
UPDATE [dbo].[EmployeeTracking]
SET EmployeeName ='Shantha' WHERE EmployeeName ='Rukmini'
-----------------------------------------------------------

----------------------------------------------------
SELECT * FROM [dbo].[EmployeeTracking] -- 10 rows
----------------------------------------------------

-------------------------------------------------------------
SELECT * FROM CHANGETABLE(CHANGES EmployeeTracking,0)AS ET 
ORDER BY SYS_CHANGE_VERSION -- 3 rows
-------------------------------------------------------------

In the above example, we updated Non Primary Key column. But when we update non Primary key column, CT works differently compare to when we update Primary key column. Please see the below example.
--------------------------------------------
UPDATE [dbo].[EmployeeTracking]
SET EmployeeID = 12 WHERE EmployeeID = 7 
--------------------------------------------

------------------------------------------------------
SELECT * FROM [dbo].[EmployeeTracking] -- 10 rows
------------------------------------------------------

In this update statement, we are changing EmployeeID from 7 to 12. The EmployeeID 7 is deleted and 12 are inserted in place of 7.
----------------------------------------------------------------
SELECT * FROM CHANGETABLE(CHANGES EmployeeTracking,0) AS ET 
ORDER BY SYS_CHANGE_VERSION -- 5 rows
----------------------------------------------------------------

You may get one more doubt that he is showing only operation details whether it is Insert/Delete /Update.
But what if I want to see the data of all columns in our table? Can I get remaining columns data?
If this is your question, my answer will be YES.
Clue: If you see in the above screenshot, at last we have a column called EmployeeID which is the primary key of our EmployeeTracking table.
I think you got the solution to get those values. Yes you are correct. We should JOIN these two (CHANGETABLE and EmployeeTracking) based on EmployeeID.
------------------------------------------------------------------------
SELECT * FROM CHANGETABLE(CHANGES [dbo].[EmployeeTracking],0) AS ET 
JOIN [dbo].[EmployeeTracking] AS ETT ON ETT.EmployeeID = ET.EmployeeID
------------------------------------------------------------------------

If you observe the above data, it is displaying only Insert and Update data but not delete data. Because these records are already deleted from our EmployeeTracking table.
We are almost done with CT. But before closing I would like to discuss one hidden secret of CT.
To explain this, I am creating a table called ‘[dbo].[EmployeeTrackingNoPK]’.
--------------------------------------------
CREATE TABLE [dbo].[EmployeeTrackingNoPK]
(
  EmployeeID INT,
  EmployeeName Varchar(30),
  Designation Varchar(15)
)
--------------------------------------------

Now I am trying to enable CT on this table. But it is throwing the below error L
---------------------------------------------
ALTER TABLE [dbo].[EmployeeTrackingNoPK]
ENABLE CHANGE_TRACKING
--------------------------------------------- 
Msg 4997, Level 16, State 1, Line 1
Cannot enable change tracking on table 'EmployeeTrackingNoPK'. Change tracking requires a primary key on the table. Create a primary key on the table before enabling change tracking.

I tried to enable using SSMS also. But no luck.

I hope you guys got the hidden secret of CT.
The secret is we can enable CT on a table when table is having Primary Key on it. Otherwise we can’t enable CT.
We hope we are done with Tracking. After completion of Tracking we need to disable CT on our database and table. We cannot disable CT for a database when CT is enabled on at least one table. So first we need to disable in tables and then database.
-----------------------------------------
-- Disabling CT
-----------------------------------------
ALTER TABLE [dbo].[EmployeeTracking]
DISABLE CHANGE_TRACKING
GO
ALTER DATABASE ChangeTracking
SET CHANGE_TRACKING = OFF
GO
----------------------------------------

This is the first article in my life. I hope you guys enjoyed learning CT sorry Change Tracking. If you find any mistakes, please let me know and we are always welcome to have suggestions from you to improve our future articles content quality.

No comments:

Post a Comment