41. ROLLUP, CUBE, and GROUPING SETS operators, and the GROUPING function
a.
Cannot use DISTINCT in any aggregate function any more (WITH ROLLUP / WITH CUBE do not allow DISTINCT aggregates)
b.
ROLLUP Syntax and Example
i.
GROUP BY VendorState, VendorCity WITH ROLLUP
ii.
GROUP BY ROLLUP (VendorState, VendorCity)
c.
Cube Syntax and Example
i.
GROUP BY VendorState, VendorCity WITH CUBE
ii.
GROUP BY CUBE (VendorState, VendorCity)
d.
Grouping SETS
i.
New to 2008
ii.
Summary row only –
create a summary for each specified group
iii.
Examples
GROUP BY GROUPING SETS (VendorState, VendorCity)
GROUP BY GROUPING SETS ((VendorState, VendorCity), VendorZipCode, ())
-- Notice ROLLUP and CUBE can be used within GROUPING SETS. The following
-- achieves a similar result as the 2nd example:
GROUP BY GROUPING SETS (ROLLUP(VendorState, VendorCity), VendorZipCode)
e.
GROUPING(Column_name)
i.
Returns 1 or 0 (1 means the column value in that row is a rolled-up summary, not an actual value)
ii.
Often used with CASE
iii.
Example
SELECT CASE WHEN GROUPING(VendorState) = 1 THEN 'All'
            ELSE VendorState
       END AS VendorState,
       CASE WHEN GROUPING(VendorCity) = 1 THEN 'All'
            ELSE VendorCity
       END AS VendorCity,
       COUNT(*) AS QtyVendors
FROM Vendors
WHERE VendorState IN ('IA', 'NJ')
GROUP BY VendorState, VendorCity WITH ROLLUP
ORDER BY VendorState DESC, VendorCity DESC
42. Simple subquery
a. Often used for WHERE, SELECT, FROM, HAVING
b. Rarely used in GROUP BY and ORDER BY
c. Inside a simple subquery, GROUP BY and HAVING are rarely needed; ORDER BY is allowed only when it is paired with TOP
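A minimal sketch of a simple subquery in a WHERE clause, assuming the AP sample Invoices table used elsewhere in these notes:
-- Invoices larger than the average invoice; the subquery runs only once
SELECT InvoiceNumber, InvoiceTotal
FROM Invoices
WHERE InvoiceTotal >
    (SELECT AVG(InvoiceTotal) FROM Invoices)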
43. Correlated subquery
a.
Executes once for each outer row
b.
A non-correlated subquery executes only once
c.
Therefore often slow
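A sketch of a correlated subquery, again assuming the AP Invoices table; the inner query references the outer row's VendorID, so it is re-evaluated once per outer row:
SELECT VendorID, InvoiceNumber, InvoiceTotal
FROM Invoices AS i
WHERE InvoiceTotal >
    (SELECT AVG(InvoiceTotal)
     FROM Invoices
     WHERE VendorID = i.VendorID)   -- correlation to the outer row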
44. Subquery or JOIN? JOIN!
a.
Subqueries and JOINs are interchangeable in many cases
b.
A JOIN is often faster than a subquery
c.
Columns from a subquery cannot be shown to the user; columns from a joined table can be.
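A hedged illustration of points a–c, assuming the AP Vendors/Invoices tables: both statements return invoices for California vendors, but only the JOIN version can also show Vendors columns.
-- Subquery version
SELECT InvoiceNumber, InvoiceTotal
FROM Invoices
WHERE VendorID IN
    (SELECT VendorID FROM Vendors WHERE VendorState = 'CA')

-- Equivalent JOIN version (often faster; Vendors columns are available)
SELECT i.InvoiceNumber, i.InvoiceTotal, v.VendorName
FROM Invoices AS i
JOIN Vendors AS v ON i.VendorID = v.VendorID
WHERE v.VendorState = 'CA'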
45. The keywords ALL, ANY (or SOME in ANSI) – rarely used; they can often be replaced with the MIN or MAX function
a. x > ALL (1, 2)   =>  x > 2
b. x < ALL (1, 2)   =>  x < 1
c. x = ALL (1, 2)   =>  (x = 1) AND (x = 2)
d. x <> ALL (1, 2)  =>  (x <> 1) AND (x <> 2)
e. x > ANY (1, 2)   =>  x > 1
f. x < ANY (1, 2)   =>  x < 2   (often replaced with the MAX function)
g. x = ANY (1, 2)   =>  (x = 1) OR (x = 2)
h. x <> ANY (1, 2)  =>  (x <> 1) OR (x <> 2)
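A sketch of how > ALL is typically replaced with MAX (AP tables assumed; VendorID 34 is just an illustrative value). The two forms behave the same as long as the subquery returns at least one non-NULL value:
-- x > ALL (subquery)
SELECT InvoiceNumber
FROM Invoices
WHERE InvoiceTotal > ALL
    (SELECT InvoiceTotal FROM Invoices WHERE VendorID = 34)

-- The usual rewrite with MAX
SELECT InvoiceNumber
FROM Invoices
WHERE InvoiceTotal >
    (SELECT MAX(InvoiceTotal) FROM Invoices WHERE VendorID = 34)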
46. EXISTS often is used with correlated subquery
a.
WHERE (NOT) EXISTS (subquery)

SELECT Vendors.VendorID, VendorName, VendorState
FROM Vendors
WHERE NOT EXISTS
    (SELECT *
     FROM Invoices
     WHERE Invoices.VendorID = Vendors.VendorID)
47. How is a CTE similar to a temp table?
WITH cte_name1 AS (query_definition1)   -- cte_name1 (or cte_name1.col defined in
                                        -- query_definition1) can be used in cte_name2 or in sql_statement
[, cte_name2 AS (query_definition2)]
[...]
sql_statement
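A minimal sketch of two chained CTEs (AP tables assumed); like a temp table, each CTE name can be referenced by later CTEs and by the final statement, but it lives only for that one statement:
WITH VendorTotals AS
(
    SELECT VendorID, SUM(InvoiceTotal) AS Total
    FROM Invoices
    GROUP BY VendorID
),
TopVendors AS
(
    SELECT TOP 5 VendorID, Total
    FROM VendorTotals
    ORDER BY Total DESC
)
SELECT v.VendorName, t.Total
FROM TopVendors AS t
JOIN Vendors AS v ON v.VendorID = t.VendorID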
48. Recursive CTE
a.
The structure
WITH cte_name [(column_list)] AS
(
    CTE_query_definition   -- Anchor member is defined.
    UNION ALL
    CTE_query_definition   -- Recursive member is defined referencing cte_name.
)
-- Statement using the CTE
SELECT *
FROM cte_name
b.
Example
USE AdventureWorks2008R2;
GO
WITH DirectReports (ManagerID, EmployeeID, Title, DeptID, Level)
AS
(
-- Anchor member definition
SELECT e.ManagerID, e.EmployeeID, e.Title, edh.DepartmentID,
0 AS Level
FROM dbo.MyEmployees AS e
INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
ON e.EmployeeID = edh.BusinessEntityID AND edh.EndDate IS NULL
WHERE ManagerID IS NULL
UNION ALL
-- Recursive member definition
SELECT e.ManagerID, e.EmployeeID, e.Title, edh.DepartmentID,
Level + 1
FROM dbo.MyEmployees AS e
INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
ON e.EmployeeID = edh.BusinessEntityID AND edh.EndDate IS NULL
INNER JOIN DirectReports AS d
ON e.ManagerID = d.EmployeeID
)
-- Statement that executes the CTE
SELECT ManagerID, EmployeeID, Title, DeptID, Level
FROM DirectReports
INNER JOIN HumanResources.Department AS dp
ON DirectReports.DeptID = dp.DepartmentID
WHERE dp.GroupName = N'Sales and Marketing' OR Level = 0;
GO
49. How to create a test table?
a.
SELECT INTO
SELECT select_list
INTO table_name_on_the_fly
FROM table_source
(WHERE)
(GROUP BY)
(HAVING)
(ORDER BY)
b.
Use a subquery in place of the VALUES clause
INSERT [INTO] table_existing [(col_list)]
SELECT col_list
FROM table_source
(WHERE)
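Two concrete sketches of a and b, assuming the AP Invoices table (InvoiceArchive is assumed to already exist with matching columns and no IDENTITY column):
-- a. SELECT INTO creates InvoiceCopy on the fly
SELECT *
INTO InvoiceCopy
FROM Invoices

-- b. INSERT ... SELECT loads an existing table
INSERT INTO InvoiceArchive
SELECT *
FROM Invoices
WHERE InvoiceTotal - PaymentTotal - CreditTotal = 0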
50. Update table syntax – be aware of FROM etc
a. Syntax structure
UPDATE table_name
SET column_name1 = expression1 (, column_name2 = expression2 ...)
(FROM table_source ((AS) table_alias))
(WHERE search_condition)
b. You can code a subquery in the SET, FROM, or WHERE clause of an UPDATE statement.
c. You can use a subquery in the SET clause to return the value that's assigned to a column.
d. You can use a subquery in the FROM clause to identify the rows that are available for update. Then, you can refer to the derived table in the SET and WHERE clauses.
e. You can code a subquery in the WHERE clause to provide one or more values used in the search condition.
f. Example:
USE AP
-- only the top 10 InvoiceIDs will be updated!
UPDATE InvoiceCopy
SET CreditTotal = CreditTotal + 100
FROM (SELECT TOP 10 InvoiceID
      FROM InvoiceCopy
      WHERE InvoiceTotal - PaymentTotal - CreditTotal >= 100
      ORDER BY InvoiceTotal - PaymentTotal - CreditTotal DESC) AS TopInvoices
WHERE InvoiceCopy.InvoiceID = TopInvoices.InvoiceID
51. Delete rows
DELETE [FROM] table_name
[FROM table_source]   -- the SQL Server extension
[WHERE search_condition]
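A sketch of the FROM extension, assuming the AP tables: delete the invoice copies that belong to California vendors.
DELETE ic
FROM InvoiceCopy AS ic
JOIN Vendors AS v ON ic.VendorID = v.VendorID
WHERE v.VendorState = 'CA'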
52. Merge – the upsert statement
a.
The syntax of the MERGE statement
MERGE [INTO] table_target
USING table_source
    ON join_condition
[WHEN MATCHED [AND search_condition]
    THEN dml_statement]
[WHEN NOT MATCHED [BY TARGET] [AND search_condition]
    THEN dml_statement]
[WHEN NOT MATCHED BY SOURCE [AND search_condition]
    THEN dml_statement]
b.
Example
USE AP
MERGE INTO InvoiceArchive AS ia
USING InvoiceCopy AS ic
    ON ic.InvoiceID = ia.InvoiceID
WHEN MATCHED AND ic.PaymentDate IS NOT NULL
             AND ic.PaymentTotal > ia.PaymentTotal THEN
    UPDATE   -- no table name here
    SET ia.PaymentTotal = ic.PaymentTotal,
        ia.CreditTotal  = ic.CreditTotal,
        ia.PaymentDate  = ic.PaymentDate
WHEN NOT MATCHED THEN
    INSERT   -- no table name here
        (InvoiceID, VendorID, InvoiceNumber, InvoiceTotal, PaymentTotal,
         CreditTotal, TermsID, InvoiceDate, InvoiceDueDate)
    VALUES (ic.InvoiceID, ic.VendorID, ic.InvoiceNumber, ic.InvoiceTotal, ic.PaymentTotal,
            ic.CreditTotal, ic.TermsID, ic.InvoiceDate, ic.InvoiceDueDate)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE   -- no table name here
;
53. Total 26 data types in SQL Server
a.
rowversion is the recommended replacement for the SQL Server timestamp type (which, despite the name, is not the ANSI timestamp)
b.
Order of precedence
(from lower to higher)
i.
String → int types → money → float → datetime (implicit conversion, from lower to higher precedence)
ii.
Conversion from higher to lower precedence has to be explicit
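A small illustration of the precedence rules (the values are arbitrary):
DECLARE @i int = 10, @s varchar(10) = '5'
SELECT @i + @s                        -- varchar converts up to int: 15
SELECT CONVERT(varchar, @i) + @s      -- explicit convert: string concatenation '105'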
54. Some string functions – how to parse a string
a.
CHARINDEX() and PATINDEX() both return an int (the position where the substring/pattern is found, or 0)
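A sketch of parsing a string with these functions (the e-mail value is hypothetical):
DECLARE @email varchar(50) = 'someone@example.com'
SELECT CHARINDEX('@', @email)                            AS AtPos,    -- 8
       LEFT(@email, CHARINDEX('@', @email) - 1)          AS UserName, -- someone
       SUBSTRING(@email, CHARINDEX('@', @email) + 1, 50) AS Domain,   -- example.com
       PATINDEX('%.com%', @email)                        AS PatPos    -- 16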
55. How to parse date and time
a.
Year(), MONTH(), Day()
b.
DatePart() or DateName()
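A sketch of the date/time parsing functions (the value is arbitrary):
DECLARE @d datetime = '2012-06-20 14:30:00'
SELECT YEAR(@d)              AS TheYear,    -- 2012
       MONTH(@d)             AS TheMonth,   -- 6
       DAY(@d)               AS TheDay,     -- 20
       DATEPART(hour, @d)    AS TheHour,    -- 14
       DATENAME(weekday, @d) AS TheDayName  -- e.g. Wednesday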
56. Two CASE FUNCTIONS
a.
Simple CASE
CASE color
    WHEN 'Black' THEN ...
    ELSE ...
END
b.
Searched CASE
CASE
    WHEN ... THEN ...
    WHEN ... THEN ...
    ELSE ...
END AS xxx
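A runnable sketch of both forms against the AP Invoices table (the TermsID codes are assumed for illustration):
SELECT InvoiceNumber,
       CASE TermsID                               -- simple CASE
            WHEN 1 THEN 'Net 10'
            WHEN 2 THEN 'Net 20'
            ELSE 'Other'
       END AS Terms,
       CASE                                       -- searched CASE
            WHEN InvoiceTotal >= 1000 THEN 'Large'
            WHEN InvoiceTotal >= 100  THEN 'Medium'
            ELSE 'Small'
       END AS SizeBand
FROM Invoices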
57. Index
a.
Good for queries but bad for updates, so create no more and no fewer indexes than needed, due to the data/index update overhead
b.
It's more of an art – testing…
c.
Composite index, INCLUDE index, covering index
i.
Composite index - The
composite index (where all of the columns are “key” columns) will carry the
data in all levels of the index;
ii.
Including index (using
Include, one key column, and include others)
iii.
The INCLUDE index will
only carry the “non key” columns in the leaf node, not as a part of the
B-tree. Key takeaway: The INCLUDE index will be smaller than the
equivalent composite index.
iv.
The INCLUDE columns
(non-key) will not count towards the max index size (900 bytes).
v.
Updates to non-key columns
in INCLUDE indexes will not contribute to index fragmentation; updates to
unused “key” columns in the composite index will result in fragmentation
vi.
A covering index is a special type of INCLUDE index: it includes all of the columns used by a query (columns in SELECT, WHERE, ORDER BY). See the sketch below.
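A sketch contrasting the two index types on the AP Invoices table (the index and column choices are assumed for illustration):
-- Composite index: all three columns are key columns and appear at every level
CREATE NONCLUSTERED INDEX IX_Invoices_Composite
    ON Invoices (VendorID, InvoiceDate, InvoiceTotal)

-- INCLUDE index: one key column; the included columns live only in the leaf level
CREATE NONCLUSTERED INDEX IX_Invoices_Include
    ON Invoices (VendorID)
    INCLUDE (InvoiceDate, InvoiceTotal)

-- Either index covers this query (no lookup into the base table is needed)
SELECT VendorID, InvoiceDate, InvoiceTotal
FROM Invoices
WHERE VendorID = 34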
58. The seven normal forms
a.
1st –
scalar value + no repeating columns
b.
2nd –
non-key depends on the entire key
c.
3rd –
non-key only depends on the PK
d.
Boyce-Codd – no
dependency between non-keys
e.
4th –
eliminate multivalue dependencies (e.g. cell phone and home need to be in
separate table)
f.
5th – smaller and smaller tables, no redundancies: a key plus one or two data elements
g.
6th – domain-key normal form – the database contains no constraints other than domain constraints and key constraints (Wikipedia)
i.
A
domain constraint specifies the permissible values for a given attribute, while
a key constraint specifies the attributes that uniquely identify a row in a
given table.
ii.
A violation of DKNF
occurs in the following table:
Wealthy Person
Wealthy Person | Wealthy Person Type   | Net Worth in Dollars
Steve          | Eccentric Millionaire | 124,543,621
Roderick       | Evil Billionaire      | 6,553,228,893
Katrina        | Eccentric Billionaire | 8,829,462,998
Gary           | Evil Millionaire      | 495,565,211
(Assume that the domain for Wealthy Person consists of the names of all wealthy people in a pre-defined sample of wealthy people; the domain for Wealthy Person Type consists of the values 'Eccentric Millionaire', 'Eccentric Billionaire', 'Evil Millionaire', and 'Evil Billionaire'; and the domain for Net Worth in Dollars consists of all integers greater than or equal to 1,000,000.)
There is a constraint linking Wealthy Person
Type to Net Worth in Dollars, even though we cannot deduce one from the other.
The constraint dictates that an Eccentric Millionaire or Evil Millionaire will
have a net worth of 1,000,000 to 999,999,999 inclusive, while an Eccentric
Billionaire or Evil Billionaire will have a net worth of 1,000,000,000 or higher.
This constraint is neither a domain constraint nor a key constraint; therefore
we cannot rely on domain constraints and key constraints to guarantee that an
inconsistent Wealthy Person Type / Net Worth in Dollars combination does not
make its way into the database.
The DKNF violation could be eliminated by
altering the Wealthy Person Type domain to make it consist of just two values,
'Evil' and 'Eccentric' (the wealthy person's status as a millionaire or
billionaire is implicit in their Net Worth in Dollars, so no useful information
is lost).
Wealthy Person
Wealthy Person | Wealthy Person Type | Net Worth in Dollars
Steve          | Eccentric           | 124,543,621
Roderick       | Evil                | 6,553,228,893
Katrina        | Eccentric           | 8,829,462,998
Gary           | Evil                | 495,565,211

Wealthiness Status
Status      | Minimum       | Maximum
Millionaire | 1,000,000     | 999,999,999
Billionaire | 1,000,000,000 | 999,999,999,999
iii.
DKNF is frequently
difficult to achieve in practice.
59. Coding rules
a.
@-local – its scope is
the batch in which it is declared
b.
#-temp, ##-global temp
c.
These 5 CREATE statements must each be in their own batch (they cannot be combined with other T-SQL statements)
i.
View
ii.
Trigger
iii.
Fn
iv.
Sp
v.
Schema
d.
The big T-SQL coding
keywords
i.
IF …ELSE
ii.
BEGIN …END
iii.
WHILE (with BREAK, CONTINUE) for repetition
iv.
TRY CATCH
· Handles errors with severity above 10 that do not close the connection
· The error functions can be used in the CATCH
block
o
ERROR_NUMBER()
o
ERROR_MESSAGE()
o
ERROR_STATE ()
o
ERROR_SEVERITY()
v.
GOTO
vi.
RETURN
vii.
USE
viii.
PRINT
ix.
DECLARE
x.
SET
xi.
EXEC
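A minimal TRY...CATCH sketch using the error functions listed under d.iv above:
BEGIN TRY
    SELECT 1 / 0 AS ForcedError       -- forces a divide-by-zero error
END TRY
BEGIN CATCH
    PRINT 'Error '      + CONVERT(varchar, ERROR_NUMBER())
        + ', severity ' + CONVERT(varchar, ERROR_SEVERITY())
        + ', state '    + CONVERT(varchar, ERROR_STATE())
        + ': '          + ERROR_MESSAGE()
END CATCH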
60. Column-level and table-level constraints
a.
A
statement that creates a table with two column-level check constraints
CREATE TABLE Invoices1
(
InvoiceID INT NOT NULL
IDENTITY
PRIMARY KEY ,
InvoiceTotal MONEY NOT NULL
CHECK ( InvoiceTotal >= 0 ) ,
PaymentTotal MONEY NOT NULL
DEFAULT 0
CHECK ( PaymentTotal >= 0 )
);
b.
The
same statement with the check constraints coded at the table level
CREATE TABLE Invoices2
(
InvoiceID INT NOT NULL
IDENTITY
PRIMARY KEY ,
InvoiceTotal MONEY NOT NULL ,
PaymentTotal MONEY NOT NULL
DEFAULT 0 ,
CHECK (( InvoiceTotal >= 0 )
AND (PaymentTotal >= 0 ) )
);
CREATE TABLE Vendors1
(
VendorCode CHAR(6) NOT NULL
PRIMARY KEY ,
VendorName VARCHAR(50) NOT NULL ,
CHECK ( ( VendorCode LIKE '[A-Z][A-Z][0-9][0-9][0-9][0-9]' )
AND ( LEFT(VendorCode, 2) = LEFT(VendorName, 2) ) )
);
61. ON DELETE CASCADE OR ON UPDATE CASCADE from
PK table cascading to the FK table.
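A sketch with a hypothetical FK table (InvoicesDemo) referencing the AP Vendors table:
CREATE TABLE InvoicesDemo
(
    InvoiceID int NOT NULL PRIMARY KEY,
    VendorID  int NOT NULL
        REFERENCES Vendors (VendorID)
        ON DELETE CASCADE      -- deleting a vendor deletes its InvoicesDemo rows
        ON UPDATE CASCADE      -- changing a VendorID cascades to InvoicesDemo
)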
62. Updatable and read-only Views
a.
Read-only – if using
DISTINCT, TOP, GROUP BY, HAVING, UNION, AGGREGATE functions or calculated value
b.
Updateable – possible
but not recommended, use INSTEAD OF trigger
i.
One base table only
ii.
WITH CHECK OPTION
iii.
For INSERT, need data
for all of the non-null, non-default columns in the base table, not just those
in the view.
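A sketch of an updatable, single-base-table view created with the check option (AP Vendors assumed; the view name is hypothetical):
CREATE VIEW CaVendors AS
    SELECT VendorID, VendorName, VendorState
    FROM Vendors
    WHERE VendorState = 'CA'
WITH CHECK OPTION
-- An UPDATE through the view that would move a row out of the view
-- (e.g., setting VendorState to 'NJ') is rejected.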
63. Catalog views, new in 2005, replacing the INFORMATION_SCHEMA views
64. Table variable (@TableName) vs. temp table
(#TableName) vs. derived table
a.
Table variable
i.
Use similar syntax as
CREATE TABLE, but they are declared, as opposed to being created.
ii.
Used like a standard
table, but not in SELECT INTO
iii.
Scope – current batch
only, and they are destroyed automatically at the end of the batch. They are
not visible across batches in the same level, and are also not visible to inner
levels in the call stack.
iv.
Stored in tempdb as
well. There’s a common misconception that only temporary tables have a
physical representation in tempdb and that table variables reside only in
memory. This isn’t true. Both temporary tables and table variables have a
physical representation in tempdb. But, unlike temporary tables and
table variables, table expressions such as CTE have no physical side to
them.
v.
With table variables,
SQL Server doesn’t allow explicit naming of constraints—not even in a
single session. However, recall that when you define a primary key
constraint, SQL Server enforces its uniqueness by using a unique clustered
index by default. When you define a unique constraint, SQL Server enforces its
uniqueness by using a unique nonclustered index by default. So if you want to
define indexes on a table variable, you can do so indirectly by defining
constraints.
DECLARE @T1 AS TABLE
(
col1 INT NOT NULL
);
INSERT INTO @T1(col1) VALUES(10);
--Error! Must declare the table variable
"@T1".
EXEC('SELECT col1 FROM @T1;');
GO
--The following code demonstrates how to declare
a table variable and then query the sys.objects view.
DECLARE @T1 AS TABLE
(
col1 INT NOT NULL
);
INSERT INTO @T1(col1) VALUES(10);
SELECT name FROM tempdb.sys.objects WHERE name LIKE '#%';
/*
When this code ran on the test system, it
produced the following output.
name
----------
#BD095663
As you can see, SQL Server created a table in
tempdb to implement the table variable you declared.
*/
b.
Temp table
i.
Scope - current
connection session. They are visible throughout the level that created them,
across batches, and in all inner levels of the call stack. So if you create a
temporary table in a specific level in your code and then execute a dynamic
batch or a stored procedure, the inner batch can access the temporary table.
ii.
Stored in temp
database
iii.
Temporary tables are
created in tempdb in the dbo schema. You can create two temporary tables
with the same name in different sessions, because SQL Server internally
adds a unique suffix to each. But if you create temporary tables in different
sessions with the same constraint name, only one will be created and the
other attempts will fail. SQL Server does not allow two occurrences of the
same constraint name within the same schema.
iv.
If you define a
constraint without naming it, SQL Server internally creates a unique
name for it. The recommendation therefore is not to name constraints in
temporary tables.
v.
You can create indexes
on temporary tables after the table is created. You can also alter the
table definition and apply definition changes, like adding a constraint or a
column.
vi.
SQL Server maintains
distribution statistics histograms for temporary tables but not for table
variables. This means that, generally speaking, you tend to get more
optimal plans for temporary tables. This comes at the cost of maintaining
histograms, and at the cost of recompilations that are associated with
histogram refreshes. When the plan efficiency depends on existence
of histograms, you should use temporary tables. Table variables are fine
to use in two general cases. One is when the volume of data is so small,
like a page or two, that the efficiency of the plan isn’t important. The
other case is when the plan is trivial. A trivial plan means that
there’s only one sensible plan and the optimizer doesn’t really need
histograms to come up with this conclusion. An example for such a plan is
a range scan in a clustered index or a covering index. Such a plan is not
dependent on selectivity of the filter.
vii. In Summary, temporary tables and table variables
differ in a number of ways, including scope, DDL and indexing, interaction
with transactions, and distribution statistics.
CREATE TABLE #T1
(
col1 INT NOT NULL
);
INSERT INTO #T1(col1) VALUES(10);
--OK
EXEC('SELECT col1 FROM #T1;');
GO
--OK
SELECT col1 FROM #T1;
GO
DROP TABLE #T1;
GO
c.
Derived Table
i.
Scope – within the
current statement
ii.
Storage – memory
iii.
Recommended
d.
Testing for existence of
a database object
i.
IF OBJECT_ID('object') IS NOT NULL …
ii.
DB_ID('aDB')
iii.
IF EXISTS (SELECT * FROM sys.tables WHERE name = 'XX')
65. The three identity functions
a.
@@IDENTITY – The
@@IDENTITY function returns the last identity value generated in your session
regardless of scope.
b.
IDENT_CURRENT(‘aTable’)
- The IDENT_CURRENT function accepts a table as input and returns the last
identity value generated in the input table regardless of session.
c.
The SCOPE_IDENTITY
function returns the last identity value generated in your session in the
current scope.
d.
Demo the differences:
USE TSQL2012;
IF OBJECT_ID('Sales.MyOrders') IS NOT NULL
DROP TABLE Sales.MyOrders;
GO
CREATE TABLE Sales.MyOrders
(
orderid INT NOT NULL
IDENTITY(1, 1)
CONSTRAINT PK_MyOrders_orderid PRIMARY KEY ,
custid INT NOT NULL
CONSTRAINT CHK_MyOrders_custid CHECK ( custid > 0 ) ,
empid INT NOT NULL
CONSTRAINT CHK_MyOrders_empid CHECK ( empid > 0 ) ,
orderdate DATE NOT NULL
);
INSERT INTO Sales.MyOrders
( custid, empid, orderdate )
VALUES ( 1, 2, '20120620' ),
( 1, 3, '20120620' ),
( 2, 2, '20120620' );
SELECT *
FROM Sales.MyOrders;
SELECT SCOPE_IDENTITY() AS SCOPE_IDENTITY ,
@@IDENTITY AS [@@IDENTITY] ,
IDENT_CURRENT('Sales.MyOrders') AS IDENT_CURRENT;
All three functions return the same values.
SCOPE_IDENTITY  @@IDENTITY  IDENT_CURRENT
--------------  ----------  -------------
3               3           3
Next, open a new query window and run the query again. This time, you get the following result.
SCOPE_IDENTITY  @@IDENTITY  IDENT_CURRENT
--------------  ----------  -------------
NULL            NULL        3
Because you’re issuing the query in a different
session than the one that generated
the identity value, both SCOPE_IDENTITY and
@@IDENTITY return NULLs. As for IDENT_CURRENT, it returns the last value
generated in the input table irrespective of session.
As for the difference between SCOPE_IDENTITY and
@@IDENTITY, suppose that you have a stored procedure P1 with three statements:
■ An INSERT that generates a new identity value
■ A call to a stored procedure P2 that also has
an INSERT statement that generates a new identity value
■ A statement that queries the functions
SCOPE_IDENTITY and @@IDENTITY
The SCOPE_IDENTITY function will return the
value generated by P1 (same session and scope). The @@IDENTITY function will
return the value generated by P2 (same session irrespective of scope).
66. Some common system functions (used to be called
global variables)
a.
@@ROWCOUNT
b.
@@ERROR
c.
@@SERVERNAME
d.
HOST_NAME()
e.
SYSTEM_USER --return current login
67. Change database session settings (e.g., SET
NOCOUNT ON|OFF)
68. The basic syntax of a dynamic SQL
a.
EXEC |EXECUTE
(‘SQL_String’)
b.
EXEC sp_executesql
sp_executesql [ @stmt = ] statement
[
{ , [ @params = ] N'@parameter_name
data_type [ OUT | OUTPUT ][ ,...n ]' }
{ , [ @param1 = ]
'value1' [ ,...n ] }
]
EXEC sp_executesql @InsertString,
N'@InsOrderID INT,
@InsCustID INT, @InsOrdDate DATETIME,
@InsOrdMonth INT, @InsDelDate DATETIME',
@PrmOrderID,
@PrmCustomerID, @PrmOrderDate,
@OrderMonth, @PrmDeliveryDate
69. SQLCMD
a.
Can be used in both in
DOS and SQL Server
b.
New in 2005, replacing
OSQL
70. The four major types of SQL programming Objects
a.
Script
b.
Sp
c.
Fn - Often used in sp
and triggers
d.
Trigger
71. SPs
a.
Parameters
i.
Optional parameter
(having a default)
ii.
OUTPUT parameter
CREATE PROC spInvTotal1
    @DateVar smalldatetime,
    @InvTotal money OUTPUT
AS
SELECT @InvTotal = SUM(InvoiceTotal)
FROM Invoices
WHERE InvoiceDate >= @DateVar
iii.
RETURN (used for a
single integer value) vs. OUTPUT (for other data types or multiple values)
b.
Calling a sp with
parameters (notice OUTPUT)
A CREATE PROC statement that includes three
parameters
CREATE PROC spInvTotal3
@InvTotal MONEY OUTPUT ,
@DateVar SMALLDATETIME = NULL ,
@VendorVar VARCHAR(40) = '%'
AS
IF @DateVar IS NULL
SELECT @DateVar = MIN(InvoiceDate)
FROM Invoices;
SELECT @InvTotal = SUM(InvoiceTotal)
FROM Invoices
JOIN Vendors ON Invoices.VendorID = Vendors.VendorID
WHERE ( InvoiceDate >= @DateVar )
AND ( VendorName LIKE @VendorVar );
Code that passes the parameters by position:
DECLARE @MyInvTotal money
EXEC spInvTotal3 @MyInvTotal OUTPUT, '2008-06-01', 'P%'

Code that passes the parameters by name:
DECLARE @MyInvTotal money
EXEC spInvTotal3 @DateVar = '2008-06-01',
                 @VendorVar = 'P%',
                 @InvTotal = @MyInvTotal OUTPUT

Code that omits one optional parameter:
DECLARE @MyInvTotal money
EXEC spInvTotal3 @VendorVar = 'M%', @InvTotal = @MyInvTotal OUTPUT

Code that omits both optional parameters:
DECLARE @MyInvTotal money
EXEC spInvTotal3 @MyInvTotal OUTPUT
To use an output
parameter in the calling program, you must declare a variable to
store its value. Then, you use the name of that variable
in the EXEC statement, and you code the OUTPUT
keyword after it to identify it as an output
parameter.
c.
Passing a table as a
parameter
i.
New in 2008
ii.
Create Type xxx AS Table
iii.
READONLY for the table
parameter – input only, no OUTPUT
iv.
No FKs can be defined
for the table
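A sketch of a table-valued parameter; the type, procedure, and column names are hypothetical, and the table type must be created in its own batch before the procedure:
CREATE TYPE InvoiceIDList AS TABLE (InvoiceID int NOT NULL PRIMARY KEY)
GO
CREATE PROC spMarkInvoicesPaid
    @IDs InvoiceIDList READONLY      -- table parameters must be READONLY
AS
    UPDATE i
    SET PaymentDate = GETDATE()
    FROM Invoices AS i
    JOIN @IDs AS ids ON i.InvoiceID = ids.InvoiceID
GO
DECLARE @List InvoiceIDList
INSERT INTO @List VALUES (1), (2), (3)
EXEC spMarkInvoicesPaid @IDs = @List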
72. Functions
a.
Return a scalar or a
table, not for modification.
i.
Use sp if u need to
change the data
ii.
But within the function,
you can create a table, change it, but the final result must return a table or
a scalar value.
b.
Simple table-valued vs.
multi-valued function
i.
Both return a table
ii.
Simple – single SELECT
iii.
Multi-valued – Multiple
SELECT statements.
iv.
Multi-statement
table-valued function is rarely needed as a single SELECT with joins and
subqueries can fulfill almost every query need.
c.
Use it with schema – two
parts needed
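A sketch of a scalar function and a simple (inline) table-valued function against the AP tables; both are called with the two-part (schema-qualified) name, as noted in c:
CREATE FUNCTION dbo.fnBalanceDue (@InvoiceID int)
RETURNS money
AS
BEGIN
    RETURN (SELECT InvoiceTotal - PaymentTotal - CreditTotal
            FROM Invoices
            WHERE InvoiceID = @InvoiceID)
END
GO
CREATE FUNCTION dbo.fnVendorInvoices (@VendorID int)
RETURNS TABLE
AS
RETURN (SELECT InvoiceID, InvoiceNumber, InvoiceTotal
        FROM Invoices
        WHERE VendorID = @VendorID)
GO
SELECT dbo.fnBalanceDue(1)
SELECT * FROM dbo.fnVendorInvoices(34)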
73. Triggers
a.
No AFTER trigger for a
view.
b.
Multiple AFTER triggers
for one action is fine for a table, but only one INSTEAD OF TRIGGER FOR each
action
c.
Update and Insert use
the ‘inserted’ virtual table
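A sketch of an AFTER trigger that reads the inserted virtual table; the InvoiceAudit table is hypothetical:
CREATE TRIGGER Invoices_INSERT_UPDATE
ON Invoices
AFTER INSERT, UPDATE
AS
    INSERT INTO InvoiceAudit (InvoiceID, InvoiceTotal, ChangedAt)
    SELECT InvoiceID, InvoiceTotal, GETDATE()
    FROM inserted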
74. Cursors
a.
Two ways
i.
T-SQL Cursor on the
server
ii.
API (ADO, ODBC) Cursor
on the client (both are similar)
b.
Three common uses of T-SQL cursors
i.
For administering your own databases, with SPs or system tables used to generate dynamic SQL
ii.
To do something different to each row in a result set
iii.
When the standard database APIs cannot be used
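A minimal T-SQL cursor sketch (do something to each row of a result set; AP Vendors assumed):
DECLARE @VendorID int, @VendorName varchar(50)
DECLARE VendorCursor CURSOR FOR
    SELECT VendorID, VendorName FROM Vendors WHERE VendorState = 'CA'
OPEN VendorCursor
FETCH NEXT FROM VendorCursor INTO @VendorID, @VendorName
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT CONVERT(varchar, @VendorID) + ': ' + @VendorName
    FETCH NEXT FROM VendorCursor INTO @VendorID, @VendorName
END
CLOSE VendorCursor
DEALLOCATE VendorCursor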
75. Nested tran – Surprising behavior in nested tran
a.
Commit tran - only
decrements @@TRANCOUNT, not actually committing
b.
Example
USE AP
BEGIN TRAN
PRINT 'First Tran @@TRANCOUNT: ' + CONVERT(varchar, @@TRANCOUNT)       --1
DELETE Invoices
    BEGIN TRAN
    PRINT 'Second Tran @@TRANCOUNT: ' + CONVERT(varchar, @@TRANCOUNT)  --2
    DELETE Vendors
    COMMIT TRAN   -- This COMMIT decrements @@TRANCOUNT.
                  -- It doesn't commit 'DELETE Vendors'.
PRINT 'COMMIT @@TRANCOUNT: ' + CONVERT(varchar, @@TRANCOUNT)           --1
ROLLBACK TRAN
PRINT 'ROLLBACK @@TRANCOUNT: ' + CONVERT(varchar, @@TRANCOUNT)         --0
PRINT ' '
DECLARE @VendorsCount int, @InvoicesCount int
SELECT @VendorsCount = COUNT(*) FROM Vendors
SELECT @InvoicesCount = COUNT(*) FROM Invoices
PRINT 'Vendors Count: ' + CONVERT(varchar, @VendorsCount)    --123 (not deleted)
PRINT 'Invoices Count: ' + CONVERT(varchar, @InvoicesCount)  --116 (not deleted)
76. Concurrency and Locking
a.
Not a problem for
querying data, only a problem for modifying data
b.
Four concurrency
problems
i.
Lost updates – last wins
on the same-row update
ii.
Dirty reads – one tran
read data not committed by another tran
iii.
Non-repeatable reads –
two SELECT have different reads as 3rd tran changes the data
between the SELECT
iv.
Phantom reads – one tran
update/delete a set of rows, another tran performs insert/delete affecting data
in the same set of rows.
c.
SQL Server automatically
enforces locking, but the default locking behavior does not handle all of the
problems.
d.
In a large system with many users, especially when a transaction could adversely affect data integrity, you often need to change the default locking behavior. The most common way to do that is to set the transaction isolation level (see the sketch after the table below).
Isolation Level                        | Dirty reads | Lost updates | Non-repeatable reads | Phantom reads
Read Uncommitted                       | Yes         | Yes          | Yes                  | Yes
Read Committed (99% of the time OK!)   | No          | Yes          | Yes                  | Yes
  · READ_COMMITTED_SNAPSHOT OFF (the default)
  · READ_COMMITTED_SNAPSHOT ON (still prevents dirty reads, but uses row versioning to minimize contention)
Repeatable Read                        | No          | No           | No                   | Yes
Serializable                           | No          | No           | No                   | No
Snapshot (transaction-level; uses a snapshot to record what the data was at the beginning of the tran) | No | No | No | No
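A sketch of changing the isolation level for the current session (AP tables assumed):
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
    SELECT * FROM Invoices WHERE VendorID = 34
    -- shared locks on the rows read are now held until the transaction ends,
    -- so another session cannot change them (prevents non-repeatable reads)
COMMIT TRAN
SET TRANSACTION ISOLATION LEVEL READ COMMITTED   -- back to the default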
e.
How SQL Server Lock
Manager works?
i.
Ten levels of lockable
resources from a row to a database
ii.
Try the finest first,
but also do lock escalation to a more coarse level if
necessary
iii.
Lock mode
· Shared and exclusive – the two most common ones
· Update lock
o
For Update
o
Lock manager assigns a U
lock which prevent another transaction from gaining a shared lock
o
Then lock manager
promotes the U lock to X lock when data is changed. In this way, deadlock
is avoided.
· Intent Lock - An intent lock indicates that SQL Server intends to acquire a shared lock or an exclusive lock on a finer-grain resource. For example,
an Intent Shared (IS) lock acquired at the table
level means that the transaction intends to acquire shared
locks on pages or rows within that table. This prevents another transaction
from acquiring an exclusive lock on the table containing that
page or row.
· Schema locks – placed on table’s design
· Bulk Update – for BULK INSERT and bcp.
77. XML
a.
Parent-child elements,
hierarchical, the top one is the root element
b.
XSD (Schema Collection)
can be generated from an XML document in SQL Server
i.
Open the XML document in
the editor
ii.
From the XML Menu|Create
Schema
c.
XML data for DB/table
auditing
i.
Create a table with an
XML column
ii.
Create a DDL trigger and perform the following actions (see Example 2 below):
· Declare an xml type variable
· Populate the variable with the EVENTDATA() function
· Insert the variable into the audit table
d.
XML Schema Collection
i.
For data validation
ii.
In Programmability|Types in
Object Explorer
iii.
Many are on Internet
e.
XML and
Relational data
i.
Relational to XML
FOR XML {RAW | AUTO} [,
ROOT ('RootName')] [ , ELEMENTS]
ii.
XML to Relational
· Value() (with nodes() in some cases)
· OpenXML()
· Demo
--Method 1 - using the Value(XQuery, SQLType) method on the XML
column in the table
USE
AP
SELECT
EventID,
EventData.value('(/EVENT_INSTANCE/EventType)[1]', 'varchar(40)')
AS
EventType,
EventData.value('(/EVENT_INSTANCE/PostTime)[1]', 'smalldatetime')
AS
PostTime,
EventData.value('(/EVENT_INSTANCE/LoginName)[1]', 'varchar(40)')
AS
LoginName,
EventData.value('(/EVENT_INSTANCE/ObjectName)[1]', 'varchar(40)')
AS
ObjectName
FROM
DDLActivityLog
WHERE
EventData.value('(/EVENT_INSTANCE/EventType)[1]', 'varchar(40)')
= 'DROP_TABLE'
--Method 2 - - using
the Value(XQuery, SQLType) method in the trigger to
-- parse the XML
data if there are no XML columns in the table
CREATE TABLE DDLActivityLog2
(
EventID int
NOT NULL IDENTITY PRIMARY KEY,
EventType varchar(40) NOT NULL,
PostTime smalldatetime
NOT NULL,
LoginName varchar(40) NOT NULL,
ObjectName varchar(40) NOT NULL
)
GO
CREATE
TRIGGER Database_CreateTable_DropTable2
ON DATABASE
AFTER
CREATE_TABLE, DROP_TABLE
AS
DECLARE
@EventData XML
SELECT
@EventData = EVENTDATA()
INSERT INTO DDLActivityLog2 VALUES
(
@EventData.value('(/EVENT_INSTANCE/EventType)[1]', 'varchar(40)'),
@EventData.value('(/EVENT_INSTANCE/PostTime)[1]', 'varchar(40)'),
@EventData.value('(/EVENT_INSTANCE/LoginName)[1]', 'varchar(40)'),
@EventData.value('(/EVENT_INSTANCE/ObjectName)[1]', 'varchar(40)')
)
--Method 3 - OpenXML
/*The simplified syntax for the OPENXML statement
OPENXML (xml document-handle_ int, x_ path)
WITH ( table definition)*/
USE
AP
-- Declare an int variable that's a handle for the internal XML
document
DECLARE
@VendorInvoicesHandle int
-- Create an xml variable that stores the XML document
DECLARE
@VendorInvoices xml
SET
@VendorInvoices = '
<Vendors>
<Vendor>
<VendorName>Abbey
Office Furnishings</VendorName>
<Invoice>
<InvoiceNumber>203339-13</InvoiceNumber>
<InvoiceTotal>17.5000</InvoiceTotal>
</Invoice>
</Vendor>
<Vendor>
<VendorName>Bertelsmann Industry Svcs. Inc</VendorName>
<Invoice>
<InvoiceNumber>509786</InvoiceNumber>
<InvoiceTotal>6940.2500</InvoiceTotal>
</Invoice>
</Vendor>
<Vendor>
<VendorName>Blue
Cross</VendorName>
<Invoice>
<InvoiceNumber>547479217</InvoiceNumber>
<InvoiceTotal>116.0000</InvoiceTotal>
</Invoice>
<Invoice>
<InvoiceNumber>547480102</InvoiceNumber>
<InvoiceTotal>224.0000</InvoiceTotal>
</Invoice>
<Invoice>
<InvoiceNumber>547481328</InvoiceNumber>
<InvoiceTotal>224.0000</InvoiceTotal>
</Invoice>
</Vendor>
</Vendors>
'
-- Prepare the internal XML document
EXEC
SP_XML_PREPAREDOCUMENT @VendorInvoicesHandle OUTPUT, @VendorInvoices
-- SELECT the data from the table returned by the OPENXML function
SELECT
*
FROM
OPENXML (@VendorInvoicesHandle, '/Vendors/Vendor/Invoice')
WITH
(
VendorName varchar(50) '../VendorName',
InvoiceNumber varchar(50) 'InvoiceNumber',
InvoiceTotal money 'InvoiceTotal'
)
-- Remove the internal XML document
EXEC
SP_XML_REMOVEDOCUMENT
@VendorInvoicesHandle
/*the output
VendorName
InvoiceNumber InvoiceTotal
Abby Office Furniture
203339-13 17.50
.....
Blue Cross
547481328 224.00
*/
78. Three ways to store binary data
a.
A varchar column as a
pointer in the DB, pointing to the binary file in the file system
b.
Varbinary(max) inside
DB, <2GB
c.
Varbinary(max) with
FileStream attribute
i.
MOST OF BLOB DATA
>1MB
ii.
Fast Read Access is
critical
iii.
NTFS drives
iv.
More work to setup
v.
Key steps
· Enable it at the server
· Create a database with FILESTREAM storage –
including a file group that provides for FILESTREAM storage
USE
master
-- if necessary, drop the MusicStore database
IF
EXISTS(SELECT * FROM sys.databases WHERE name
= 'MusicStore')
DROP DATABASE MusicStore
CREATE
DATABASE MusicStore
ON
PRIMARY
(
NAME =
MusicStore,
FILENAME = 'C:\Murach\SQL Server
2008\Databases\MusicStore.mdf'
),
FILEGROUP
FileStreamImages CONTAINS FILESTREAM DEFAULT
(
NAME =
MusicStoreImages,
FILENAME = 'C:\Murach\SQL Server
2008\Databases\MusicStore_images'
)
·
Create a table with
FILESTREAM
USE
MusicStore
IF
OBJECT_ID('ProductImages') IS NOT NULL
DROP TABLE ProductImages
CREATE
TABLE ProductImages
(
ImageID int
PRIMARY KEY IDENTITY,
ProductID int
NOT NULL,
RowID uniqueidentifier
ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
ProductImage varbinary(max) FILESTREAM NOT NULL
)
INSERT
INTO ProductImages (ProductID,
ProductImage)
VALUES (1, 0)
INSERT
INTO ProductImages
VALUES (2, NEWID(), 0)
INSERT
INTO ProductImages
VALUES (3, NEWID(), CAST('0123456789ABC' AS varbinary(max)))
SELECT
* FROM
ProductImages
--RETURN THE PATH TO THE BINARY FILE STREAM
USE
MusicStore
SELECT
ImageID, ProductImage.PathName() AS FileStreamPath
FROM
ProductImages
79. CLR
a.
SQL Server data types ↔ SqlTypes ↔ .NET types
i.
The System.Data.SqlTypes
is a namespace
ii.
It maps the SQL data
types to .NET types (maps to SQL Server data as well)
iii.
It’s faster and safer
than .NET types.
iv.
So use SQLTypes (not
.NET types) in CLR coding
80. Optimizer Hints and Plan Guide
a.
What are hints? Hints
are orders about how to execute a query
b.
Table hints, which are
hints for which you specify how to use a specific table in a query.
i.
Table hints influence
locking and the access method for a single table or view only. You can use the
table hints in the FROM clause, and introduce them by using the WITH keyword.
ii.
SQL Server supports the
following table hints:
· NOEXPAND
·
INDEX ( index_value [
,...n ] ) | INDEX = ( index_value )
·
FORCESEEK [ (
index_value ( index_column_name [ ,... ] ) ) ]
·
FORCESCAN
·
FORCESEEK
·
KEEPIDENTITY
·
KEEPDEFAULTS
·
IGNORE_CONSTRAINTS
·
IGNORE_TRIGGERS
·
HOLDLOCK
·
NOLOCK
·
NOWAIT
·
PAGLOCK
·
READCOMMITTED
·
READCOMMITTEDLOCK
·
READPAST
·
READUNCOMMITTED
·
REPEATABLEREAD
·
ROWLOCK
·
SERIALIZABLE
·
SPATIAL_WINDOW_MAX_CELLS
= integer
·
TABLOCK
·
TABLOCKX
·
UPDLOCK
·
XLOCK
iii.
Example: Maybe the most
popular optimizer hint is the table hint that forces a specific index usage. The
following two queries show an example of leaving it to SQL Server to choose the
access method and of forcing usage of a nonclustered index.
-- Clustered index scan: 21%
SELECT orderid ,
productid ,
qty
FROM Sales.OrderDetails
WHERE productid BETWEEN
10 AND 30
ORDER
BY productid;
-- Forcing a nonclustered index usage: 79%
SELECT orderid ,
productid ,
qty
FROM Sales.OrderDetails
WITH (INDEX (
idx_nc_productid ))
WHERE productid BETWEEN
10 AND 30
ORDER
BY productid;
c.
Query hints, which are
hints on a query level, for which you specify, for example, which join
algorithms should be used for a specific query.
i.
You specify query hints
as part of the OPTION clause of the SELECT, INSERT, UPDATE, DELETE, and MERGE
statements. You cannot use query hints in subqueries, only in the outermost
query.
ii.
The following query
hints are supported by SQL Server 2012:
·
{ HASH | ORDER } GROUP
·
{ CONCAT | HASH | MERGE
} UNION
·
{ LOOP | MERGE | HASH }
JOIN
·
EXPAND VIEWS
·
FAST number_rows
·
FORCE ORDER
·
IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX
·
KEEP PLAN
·
KEEPFIXED PLAN
·
MAXDOP number_of_processors
·
MAXRECURSION number
·
OPTIMIZE FOR (
@variable_name { UNKNOWN | = literal_constant } [ ,...n ] )
·
OPTIMIZE FOR UNKNOWN
·
PARAMETERIZATION {
SIMPLE | FORCED }
·
RECOMPILE
·
ROBUST PLAN
·
USE PLAN N'xml_plan'
·
TABLE HINT ( exposed_object_name [ , <table_hint> [ [, ]...n ] ] )
iii.
The following two
queries return the same aggregated rowset; the first one allows SQL Server to
decide which aggregation technique to use—SQL Server decides to use the hash aggregation
(39%)—whereas the second one forces the stream aggregation (61%).
-- Hash match aggregate
SELECT qty ,
COUNT(*) AS num
FROM Sales.OrderDetails
GROUP
BY qty;
-- Forcing stream aggregate
SELECT qty ,
COUNT(*) AS num
FROM Sales.OrderDetails
GROUP
BY qty
OPTION (ORDER
GROUP);
In
the second query, SQL Server used the Stream Aggregate operator. However,
because this operator expects ordered input, SQL Server also added the Sort
operator to the plan. Although the stream aggregation might be faster than the
hash aggregation, the second query might be slower because of the additional
sort operation.
d.
Join hints for a single
join only.
i.
SQL Server 2012 also
supports the following join hints in the FROM clause:
·
LOOP
·
HASH
·
MERGE
·
REMOTE
ii.
Example: The following
two queries return the same result set again. For the first query, the selection
of the join algorithm is left to SQL Server—SQL Server decides to use a nested
loops join—and the second query forces a merge join.
-- Nested loops join: 36%
SELECT O.custid ,
O.orderdate
,
OD.orderid
,
OD.productid
,
OD.qty
FROM Sales.Orders
AS O
INNER JOIN Sales.OrderDetails
AS OD ON O.orderid = OD.orderid
WHERE O.orderid < 10250;
-- Forced merge join: 64%
SELECT O.custid ,
O.orderdate
,
OD.orderid
,
OD.productid
,
OD.qty
FROM Sales.Orders
AS O
INNER MERGE JOIN Sales.OrderDetails AS OD ON O.orderid = OD.orderid
WHERE O.orderid < 10250;
e.
Finally, you can
prescribe the complete query execution by using plan guides.
i.
In the plan guide, you
can specify either the OPTION clause or a specific query plan for the statement
you want to optimize. You also specify the T-SQL statement for which the plan guide
is intended. The SQL Server Query Optimizer matches the executing T-SQL
statement with the statement specified in the plan guide and then uses the
guide to create the execution plan.
ii.
You can create the
following types of plan guides:
·
OBJECT plan guides are
used by the Query Optimizer to match queries inside stored procedures, scalar
user-defined functions, multistatement table-valued user-defined functions, and
DML triggers.
· SQL plan guides are used by the Query Optimizer
to match stand-alone queries or queries in ad hoc batches.
· TEMPLATE plan guides are used by the Query
Optimizer to match stand-alone queries that can be parameterized to a specified
form. You can force parameterization with template guides.
iii.
Some important system
stored procedures related to plan guides
· sys.sp_create_plan_guide – create a plan guide
·
sys.sp_control_plan_guide
- disable, enable, or drop a plan guide
·
sys.sp_create_plan_guide_from_handle
- create a plan guide from a cached query plan
·
sys.fn_validate_plan_guide
- validate a plan. A plan guide might become invalid because of a database
schema change.
·
sys.sp_get_query_template
- get the parameterized form of a query. This procedure is especially useful to
get the parameterized query for the TEMPLATE plan guide.
iv.
Example: Consider the
following stored procedure.
CREATE
PROCEDURE Sales.GetCustomerOrders ( @custid INT )
AS
SELECT orderid ,
custid ,
empid ,
orderdate
FROM Sales.Orders
WHERE custid =
@custid;
Suppose
for the vast majority of customers—for example, a customer that has a custid
equal to 71—the query in the procedure is not very selective; therefore, a
table or clustered index scan would be the most appropriate to use. However,
for some rare customers with only a few orders—for example, a customer that has
a custid equal to 13—an index seek with a lookup would be better. If a user
executes the procedure for customer 13 first, then the procedure plan in the
cache would not be appropriate for most of the further executions. By creating
a plan guide that uses a query hint that forces optimization of the query in
the procedure for the customer that has a custid equal to 71, you are
optimizing the stored procedure execution for most of the customers.
The
following code creates the plan guide.
EXEC
sys.sp_create_plan_guide @name
= N'Cust71', @stmt = N'
SELECT orderid, custid, empid, orderdate
FROM Sales.Orders
WHERE custid = @custid;', @type =
N'OBJECT',
@module_or_batch =
N'Sales.GetCustomerOrders', @params = NULL,
@hints = N'OPTION (OPTIMIZE FOR (@custid = 71))';
If
you execute the procedure by using different parameters after you cleaned the
cache to make sure that an older plan for this procedure is not present in the
cache, SQL Server always optimizes the query for the custid value 71, and thus
uses a clustered index scan. This is true even if you execute the query with
value 13 for the custid first, like the following code shows.
-- Clearing the cache
DBCC
FREEPROCCACHE;
-- Executing the procedure with different parameters
EXEC
Sales.GetCustomerOrders
@custid = 13;
EXEC
Sales.GetCustomerOrders
@custid = 71;
You
can always get a list of all plan guides in a database by querying the
sys.plan_guides catalog view. You can also list all of the hints used in each
plan guide, like the following query shows.
SELECT plan_guide_id ,
name ,
scope_type_desc ,
is_disabled ,
query_text ,
hints
FROM sys.plan_guides;
f.
General Recommendations:
Use all other means, such as creating appropriate indexes, creating and
updating statistics, and even using plan guides before moving to the hints. Use
hints as the last resort, and after you use them, validate whether they are
still useful from time to time.