Process each row of sql result Set using Cursor

We generally have resultset of our select statements. Some times we want to perform any logic on the each row of the resultset. In these kind of scenario where we want to process each row and perform any logic on the row we make use of Cursor. A Cursor is always associated with SQL Select statement and will process each row from the resultant resultset.

Syntax

--Declare Variables
DECLARE @ID int
DECLARE @Description nvarchar(100)

--Declare Cursor 
DECLARE PRODUCTCURSOR CURSOR FOR 
SELECT ID,[Description] FROM Product WHERE ID < 2000

--With Open command cursor will point towards first row of result set
OPEN PRODUCTCURSOR 


--Cursor will retrieve first row and set to @Id and @description
FETCH NEXT FROM PRODUCTCURSOR INTO @ID,@Description

--@@FETCH_STATUS global variable to state whether cursor contains rows in the resultset or no
--When all records are looped it will will return -1

WHILE(@@FETCH_STATUS=0)
BEGIN
PRINT ('ID= ' +CAST(@ID AS NVARCHAR(40)) +' Description ='+@Description)

--Cursor will retrieve next row and set to @Id and @description
FETCH NEXT FROM PRODUCTCURSOR INTO @ID,@Description
END

--Close the Cursor
CLOSE PRODUCTCURSOR

--DeAllocate Cursor 
DEALLOCATE PRODUCTCURSOR 

Consider that we have a below table with 450002 number of records with below structure.

Capture

Now business team recognized that there are some customers which has wrong Account Numbers updated in the table corresponding to Pan number and they want this to be rectified, they share the xls file with the Pan Number and there respective Account details as shown below:

14

So now in order to update these AccountNumbers i first store them on database by uploading via excel import in sql server.

Step 1
3

Step 2

4

Step 3

5

Step 4

6

Step 5

7

Step 6

8

Step 7

Edit the table Name if you want to give new name or use the Sheet as given by SQL server. Click on edit mapping if you want to change the data type of column.

9

Step 8

Click on Next

10

Next

11

Status of Import

13

Now in order to update the same we will make use of Cursor to update these 890 records. We will use Cursor for the same.

In order to do the same i have created belo cursor which will update the records of the corresponding records present in Sheet table which we imported from excel.

--Declare Variables
DECLARE @PanNumber	nvarchar(20)
DECLARE @AccountNumber nvarchar(20)
begin transaction

declare @ToUpdateAccountNumber  nvarchar(20)
DECLARE @ToUpdatePanNumber nvarchar(20)
--Declare Cursor 
DECLARE UpdateCustomerCURSOR CURSOR FOR 
SELECT [Pan Number],[Account number] FROM  [dbo].[Sheet1$]

OPEN UpdateCustomerCURSOR 

FETCH NEXT FROM UpdateCustomerCURSOR  INTO @PanNumber,@AccountNumber

WHILE(@@FETCH_STATUS=0)
BEGIN

update mstCustomer set AccountNumber=@AccountNumber where PanNumber=@PanNumber

--Cursor will retrieve next row and set to @Id and @description
FETCH NEXT FROM UpdateCustomerCURSOR INTO @PanNumber,@AccountNumber
END

--Close the Cursor
CLOSE UpdateCustomerCURSOR

--DeAllocate Cursor 
DEALLOCATE UpdateCustomerCURSOR 

So now when we execute the below block to update the records as shown below

14

We can see it took 1 min 18 second to complete the update process, we can say it as time consuming query. Generally it’s said to avoid using cursor due to it’s performance issues as we just saw in above figure. So in order to get the performance benefits generally i solve this situation using Joins.

update mstCustomer set AccountNumber=[dbo].[Sheet1$].[Account number] 
from mstCustomer
inner join [dbo].[Sheet1$] on
mstCustomer.PanNumber= [dbo].[Sheet1$].[Pan Number]

15

We can clearly see the performance we get threw join. The same query where took minutes whereas over it took only 6 seconds to perform the same operation. There is a drastic improvement of performace when we used join over cursor.

Rotating a table using Pivot Operator in SQL

The Pivot operator turns the unique value from one column and transfer the same to multple columns in the output, effectively rotating the table.

Let’s take an example below table which contains batsman stats for the tournaments scores based on date of match as shown below:

1

In order to get the total runs scored by the Player in the tournament, we can do the group by on the table as shown below:

select firstname+' '+secondname PlayerName,tournamentName,sum(total_Score) 'Total Score'
from  [SampleDb].[dbo].[BatsmanStats]
group by firstname,secondname,tournamentName
order by firstname,secondname,tournamentName

2

What if our business wants to get player stats as playername and than tournament names which is actually like converting uniquer Tournament Name into multiple columns with the help of Pivot.

syntax of Pivot as per docs.microsoft.com


SELECT non-pivoted column,  
    [first pivoted column] AS column name,  
    [second pivoted column] AS column name,  
    ...  
    [last pivoted column] AS column name  
FROM  
    (SELECT query that produces the data)   
    AS alias for the source query
PIVOT  
(  
    aggregation function (column being aggregated)  
FOR   
[column that contains the values that will become column headers]   
    IN ( [first pivoted column], [second pivoted column],  
    ... [last pivoted column])  
) AS alias for the pivot table  
optional ORDER BY clause;  

So now in order to achieve the same, let write the query accordingly as per syntax.

SELECT PlayerName,UTU,Zonal  --selected non-pivoted column,pivotedColumns
FROM  
    (SELECT firstname+' '+secondname as PlayerName,tournamentName,total_Score --SELECT query that produces the data 
	from  [SampleDb].[dbo].[BatsmanStats])   
    as TblStats --alias for the source query
PIVOT  
(  
    sum(total_Score) -- aggregation function (column being aggregated)  
	for tournamentName  --[column that contains the values that will become column headers]   
	IN ( [UTU],[Zonal]) --IN ( [first pivoted column], [second pivoted column],  
   
)   
as PvtStatsPerTournament --AS alias for the pivot table 

Output

5

Database Normalization in depth

Database normalization is the rules to design of tables to minimize data redundancy i.e. data duplicate which ensure data consistency. It can also defined as dividing a single table in two or more and defining relationships between those tables.

Benefits of Database Normalization

  1. Avoid Repetitive entries
  2. Avoiding unnecessary disk space wastage
  3. Increase speed and flexible query.

 

 Table without Normalization

We have a Shipment table where whenever the customer books his/her shipment. The information are stored in database table as shown below:

1

So here in this table we store the application name from where the ShipmentRequest is coming, along with its hosted IP Address. These fields are duplicated in the table. We also have Commodity Type column also which remains consistent and duplicated in the table. We can see all these mentioned columns are repeated. Right now we are inserting all these details for each shipment in the table.

Drawbacks of this Design

  • Disk space wastage
  • Inconsistent Data
  • DML queries will become slow due

Disk space wastage

As we insert new records to the table. The disk space automatically increases rapidly for unnecessary data.

Inconsistent Data

If business team tells us that commodity Type Handicraft has been changed to Handicraft/Products and that should reflect in reports as well, we need to update the whole table wherever this Commodity Type is defined.

Slow DML queries

As discussed if we have to update the records we have to update all records which will lead to slow down of DML queries

Normalized structure of table

Below is the normalized tables design. We can normalize the main table to more extent which we will do after a while, But for understanding purpose lets check the same for now.

Main Shipment table Design
2

Commodity Type Table
3

Application table

4

Now we can clearly see all the duplicate records are not inserted into the table rather than we refer their respective id from their respective table. So now if we have to update the Commodity Type we need to update single row.

First Normal Form

The table to be in First Normal form the records should be single unit rather than duplicate. No multiple values or repetitions group of fields.

Sample ShipperDetails table where we have Shipper details and their respective GSTIN Number with state name as shown below:

5

According to 1NF we should not have repetitive group of fields. In our ShipperDetails table we have repetitive GSTIN and City Columns which are repetitive. We have some customer which has GSTIN number in only 1 state than respective other GSTIN columns are set to null hence resulting in wasted disk space. While in future our customer may increase his business and opt for other GST as well in that case we need to alter our tables and add the new GSTIN and city column. In order to implement 1NF we need to modify the table structure and make compliant to 1NF.

6

So we can see we have broken down the the single unit with Single GSTIN, State, CustomerFirstName and CustomerSecondName.

2NF

The table is state to be in Second normal form. If it fulfills the conditions for First Normal form. In Second normal form states that All the non-key columns are dependent on the table’s primary key.

In the above mentioned Table we have column name state as varchar type. If we have millions of records we would have repeative state name present in this table, We know there are 37 States as per GST regulations. So we will move our state column to a new table called statemaster with additional state details for our purpose as shown below:

7

Now we will change the State column from our [ShipperDetails]table to StateId as shown below with respect to the state present in [ShipperDetails] and create relationship between them using foreign key.

8

Create relationship between stateMaster and [ShipperDetails] table using Foreign key

9

Third Normal Form
A table is said to be in Third Normal form if meets all condition of First and Second normal form.

The table doesn’t contains column that are depends on other non-key field. i.e. columns that are not fully dependent upon Primary key for example:Total in our table which is dependent on Unit Price and Qty, Total column is not dependent on our PrimaryKey, We can compute Total by multiplying Unit Price and Qty.

So in order to make our [ShipperDetails] table in Third normal form we have to remove the Total column from the table as shown below:

10

 

Tempory Comman table Expression

A CTE is a temporary table that can be reference within Select, update, or Delete statement.

Syntax

To create Comman table expression we below command:

WITH CTE_NAME (column1, column2, ..)
as
(query)

We have an Order table which contains below mentioned records:

1

We have another table called Stores which have the store information as show below:

2

Now i want to retrieve the number of order placed for the particular store number with use of CTE expression and then retrieving all the details for the store from the Stores table as shown below.

Creating CTE for retrieving StoreOrders

With StoreOrders(STOREID,TotalOrder)
  as
  (select  STOREID,count(STOREID) as TotalOrder
  from [dbo].[Orders]
  group by STOREID
  )	

Now retrieving store information from the CTE Table and Stores table.

select * from StoreOrders
  inner join [dbo].[stores]
  on StoreOrders.STOREID=[stores].[id] 

3

The CTE Table should be referenced within the select command which immediately follow the CTE Table. If not than select expression will not work, as shown below:

4

5

Multiple CTE Tables

We can create multiple CTE Table one after the another and use them within select statement.

In order to demonstrate the same let’s take an example:
We have two order table which is Electronic device while other Order table for Kirana. So now we will retrieve the total no of orders placed based on the storeId, below are the sample table records.

Electronics Orders

3

Kirana Orders

4

Now we will create two cte to retrieve the total number of Order placed for Electronics and Kirana.

with ElectronicOrders(STOREID,TotalOrder)
as
(
select  STOREID,count(STOREID) as TotalOrder
  from [dbo].[Orders] inner join
  [dbo].[stores]
  on [dbo].[Orders].STOREID=[stores].[id]
  group by STOREID
),

 KiranaOrders(STOREID,TotalOrder)
as
(
select   STOREID,count(STOREID) as TotalOrder
  from [dbo].[Order] inner join [dbo].[stores]
  on [dbo].[Order].STOREID=cast ([dbo].[stores].[id] as nvarchar(20)) 
  group by STOREID
)

select * from ElectronicOrders
union 
select * from KiranaOrders

So as shown above we can create two table expression by using single With expression of CTE with (,) separated.

6

Now with the CTE the questions comes whether the CTE are updatable as Views?>

The answer is “Yes and No” depending upon the scenario where we can update the same. Lets take a scenario where we can update the same.

So here we have simple CTE expression


WITH Store_Name_Owner
as
(
select ID,STORENAME,STOREOWNER FROM stores
)
SELECT * FROM Store_Name_Owner

Let’t try and update the same.

WITH Store_Name_Owner
as
(
select ID,STORENAME,STOREOWNER FROM stores
)
update Store_Name_Owner set storeowner='Saillesh Pawar' where id=3267

7

So we have conclude that if have CTE table on one base table than we can update the CTE which in turn updates the base table. If we have multple base table in CTE but update allows updating only one base table than also update is allowed but not as expected.

Trying to update multiple base table CTE

WITH OrdersByStore
as
(
select [Order].ID,ORDERCREATEDDATE,storename,storeowner from [Order] inner join
 stores on [Order].STOREID=CAST (stores.id as nvarchar(10))
)
update OrdersByStore set ORDERCREATEDDATE=GETDATE(),storename='Virender' where id=2

8

So we conclude if CTE is based on multiple table and the update affects multiple table than update doesn’t happen as shown above.

Recursive CTE
CTE provide great advantage by being able to self reference like self join in order to create Recuresive CTE. A recursive CTE is a CTE which executes itself based on the base query and than returning the subset data.

We have following table named tblPlayers of a team which consist of Coaches and players. There is one senior coach, multiple junior coach and rest are players under junior coaches as shown below:

5

So we want to disply the PlayerName,Coach and their Level with help of CTE.

We have a below hierarchy for the Players with their respective Coach. Shailender Panwar is main Head Coach. So Level wise Shailender Panwar level is 1, than Ravindra Negi and Nitin Negi level are 2 while others are at level 3.
6

with 
PlayersCte(PlayerId,PlayerName,CoachId,[Level])
as
(
select PlayerId,PlayerName,CoachId,1 from tblPlayer where coachid is null
union All
select tblplayer.PlayerId,tblplayer.PlayerName,tblplayer.CoachId,PlayersCte.[Level]+1
from tblplayer join PlayersCte
on tblplayer.CoachId=PlayersCte.PlayerId
)
select PlayersCTE.PlayerName,ISNULL(CoachCTE.PlayerName,'Head Coach') as Coach,PlayersCTE.[Level] from 
PlayersCte PlayersCTE
left join PlayersCte CoachCTE on PlayersCTE.CoachId=CoachCTE.PlayerId

In CTE what we did is first separated the Head Coach which has CoachId as null than returned and increment Level for each respective players based on the CTE level value.

Once we receieve the records we need to do self join to display the “Head Coach” in case of Null with self CTE join on coachId eqauls PlayerId.

10