Microsoft Azure SQL Data Warehouse - Architecture and SQL [1 ed.] 9781940540320

One of the most popular databases worldwide is Microsoft’s SQL Server. Now Microsoft has introduced their MPP data wareh

English Pages 767 Year 2015

The Tera-Tom Video Series

Lessons with Tera-Tom: Teradata Architecture and SQL Video Series. These exciting videos make learning and certification much easier.

Three ways to view them:
1. Safari (look up Coffing Studios)
2. CoffingDW.com (sign up on our website)
3. Your company can buy them all for everyone to see (contact [email protected])

Current Books in the Tera-Tom Genius Series

Our Recommended Book In The Tera-Tom Genius Series

Tera-Tom- Author of over 50 Books

Tera-Tom books have been the primary source of Teradata learning for over 20 years. They have helped to teach millions of people all aspects of Teradata. What people love most about the Tera-Tom books is how easy they are to understand. They are so easy that a seven-year-old boy (raised by wolves) can understand them!

The Best Query Tool Works on all Systems

When you possess a tool like Nexus, you have access to every system in your enterprise! The Nexus Query Chameleon is the only tool that works on all systems. Its Super Join Builder allows for the ERwin Logical Model to be loaded, and then Nexus shows tables and views visually. It then guides users to show what joins to what. As users choose the tables and columns they want in their report, Nexus builds the SQL for them with each click of the mouse. Nexus was designed for Teradata and Hadoop, but works on all platforms. Nexus even converts table structures between vendors, so querying and managing multi-vendor platforms is transparent. Even if you only work with one system, you will find that the Nexus is the best query tool you have ever used. If you work with multiple systems, you will be even more amazed. Download a free trial at www.CoffingDW.com.

Trademarks and Copyrights

Microsoft Windows, Windows 2003 Server, SQL Server 2012, SQL Server Compact Edition, .NET, PDW, SQL Server, T-SQL, Azure SQL Data Warehouse and Azure Cloud are trademarks of Microsoft. Teradata, NCR, BYNET and SQL Assistant are registered trademarks of Teradata Corporation, Dayton, Ohio, U.S.A. IBM, DB2 and Netezza are registered trademarks of IBM Corporation. ANSI is a registered trademark of the American National Standards Institute. Ethernet is a trademark of Xerox. UNIX is a trademark of The Open Group. Linux is a trademark of Linus Torvalds. Java and Oracle are trademarks of Oracle. ParAccel is a trademark of ParAccel. Kognitio is a trademark of Kognitio. Greenplum is a trademark of EMC Corporation. Nexus Query Chameleon is a trademark of Coffing Data Warehousing.

Coffing Data Warehousing shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of programs or program segments that are included. The manual is not a publication of Microsoft Corporation, nor was it produced in conjunction with Microsoft Corporation.

Copyright © May 2015 by Coffing Publishing. ISBN 978-1-940540-32-0. All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, neither is any liability assumed for damages resulting from the use of information contained herein.

About Tom Coffing

Tom Coffing, better known as Tera-Tom, is the founder of Coffing Data Warehousing, where he has been CEO for the past 20 years. Tom has written over 50 books on all aspects of Teradata, Netezza, Kognitio, Redshift, ParAccel, Vertica, SQL Server, and Greenplum. Tom has taught over 1,000 Teradata classes in places such as India, Africa, Europe, China, Malaysia, and throughout North America.

Tom is also the owner and designer of the Nexus Query Chameleon, the most sophisticated enterprise query tool in the industry. The Nexus works on all platforms, including Hadoop, converts table structures between all systems, and allows companies to load their ERwin logical model inside Nexus. The Nexus guides users like a GPS system. Users point and click on any table or view from any system, and they are guided to what joins to what. As users choose the columns they want on their report, the SQL is built automatically.

In high school, Tom was the first athlete from his school to ever place at state. He was selected by his school to represent them at Buckeye Boys State, and Tom was inducted into the first class of the Lakota High School Hall of Fame. At the University of Arizona and University of Nevada Las Vegas, Tom was a two-time All-American wrestler, Sophomore Athlete of the Year, and a two-time winner of the 1980 Olympic wrestling trials. Tom graduated with a bachelor’s degree in Speech Communications. After college, Tom became a state and national champion speech winner for Toastmasters and won two orchid awards as an actor.

Tom is the proud father of three wonderful children and has been married for the past 32 years. You can contact Tom at 513 300-0341 or at [email protected].

About Todd Wilson

Todd Wilson is the Chief Technology Officer of Coffing Data Warehousing. As CTO, Todd has overseen the development of CoffingDW's premier data analytics tool, the Nexus Query Chameleon. Under Todd’s leadership, the Nexus has expanded to support data sources from across the data warehousing spectrum, such as Kognitio (in-memory analytics), HP Vertica (columnar data), Hadoop (including such “Big Data” companies as Cloudera and Hortonworks), and cloud service data sources (Amazon’s Redshift), as well as traditional data sources such as Oracle, Teradata, SQL Server, Greenplum, and DB2. An experienced .NET developer, Todd has developed data replication tools, data movement tools, data visualization tools, database management tools, and DDL conversion tools, and is currently overseeing the development of the Nexus Logical Model Loader. As a technical consultant, he has worked with multiple Fortune 500 companies in fields such as telecommunication, PC manufacturing, and health care. Todd is a Teradata Certified Master and a graduate of Pepperdine University.

Table of Contents

Contents Chapter 1 – Introduction to the Azure SQL Data Warehouse ...................................................................................... 1 Introduction to the Family of SQL Server Products .................................................................................................. 2 Introduction to the Family Continued ........................................................................................................................ 3 Microsoft Azure SQL Data Warehouse ..................................................................................................................... 4 Symmetric Multi-Processing (SMP) .......................................................................................................................... 5 What is Parallel Processing? ...................................................................................................................................... 6 The Basics of a Single Computer ............................................................................................................................... 7 Data in Memory is Fast as Lightning ......................................................................................................................... 8 Parallel Processing of Data ....................................................................................................................................... 9 A Table has Columns and Rows............................................................................................................................... 10 The Azure SQL Data Warehouse has Linear Scalability ......................................................................................... 11 The Architecture of the Azure SQL Data Warehouse ............................................................................................. 
12 Nexus is now Available on the Microsoft Azure Cloud ........................................................................................... 13 The MPP Engine is the Optimizer ........................................................................................................................... 14 The Azure SQL Data Warehouse System ................................................................................................................ 15 The Azure SQL Data Warehouse System is Scalable ............................................................................................. 16 The Control Node ..................................................................................................................................................... 17 The Data Rack .......................................................................................................................................................... 18 The Landing Zone .................................................................................................................................................... 19 The Backup Node ..................................................................................................................................................... 20 Software as a Service (SaaS) and the Elastic Database ........................................................................................... 21 Azure Data Lake ...................................................................................................................................................... 22 Azure Disaster Recovery.......................................................................................................................................... 23 Security and Compliance ......................................................................................................................................... 
24 How to Get an EXPLAIN Plan ................................................................................................................................ 25

Table of Contents Chapter 2 – The Azure SQL Data Warehouse Table Structures ................................................................................ 27 The 5 Concepts of Azure SQL Data Warehouse Tables.......................................................................................... 28 Tables are Either Distributed by Hash or Replicated (1 of 5) ................................................................................. 29 Table Rows are Either Sorted or Unsorted (2 of 5) ................................................................................................. 30 Tables are Stored in Either Row or Columnar Format (3 of 5) ............................................................................... 31 Tables can be Partitioned (4 of 5) ............................................................................................................................ 32 There are Permanent, Temporary and External Tables (5 of 5) .............................................................................. 33 Creating a Table With a Distribution Key ............................................................................................................... 34 Creating a Table that is Replicated .......................................................................................................................... 35 Distributed by Hash vs. Replication ........................................................................................................................ 36 The Concept is All About the Joins ......................................................................................................................... 37 Creation of a Hash Distributed Table with a Clustered Index ................................................................................. 38 A Clustered Index Sorts the Data Stored on Disk.................................................................................................... 
39 Each Node Has 8 Distributions ................................................................................................................................ 40 How Hashed Tables are Stored Among a Single Node ........................................................................................... 41 Hashed Tables Will Be Distributed Among All Distributions ................................................................................ 42 Creation of a Replicated Table................................................................................................................................. 43 How Replicated Tables are Stored Among a Single Node ...................................................................................... 44 Replicated Table will be Duplicated among Each Node ........................................................................................ 45 Distributed by Replication ....................................................................................................................................... 46 How Hashed and Replicated Tables Work Together ............................................................................................... 47 Tables are Stored as Row-based or Column-based.................................................................................................. 48 Creation of a Columnar Table that is Hashed .......................................................................................................... 49 How Hashed Columnar Tables are Stored on a Single Node .................................................................................. 50 How Hashed Columnar Tables are Stored on All Distributions .............................................................................. 51 Comparing Normal Table Vs. Columnar Tables ..................................................................................................... 
52 Columnar can move just One Segment to Memory ................................................................................................. 53 Segments on Distributions are Aligned to Rebuild a Row ...................................................................................... 54 Why Columnar? ....................................................................................................................................................... 55

Table of Contents Columnar Tables Store Each Column in Separate Pages ........................................................................................ 56 Visualize the Data – Rows vs. Columns .................................................................................................................. 57 Creation of a Columnar Table that is Replicated ..................................................................................................... 58 Creating a Partitioned Table Per Month .................................................................................................................. 59 A Visual of One Year of Data with Range Per Month ............................................................................................ 60 Another Create Example of a Partitioned Table ...................................................................................................... 61 Creating a Partitioned Table Per Month That is a Columnstore .............................................................................. 62 Visual of Row Partitioning and Columnar Storage ................................................................................................. 63 CREATE TABLE AS (CTAS) Example ................................................................................................................. 64 Creating a Temporary Table .................................................................................................................................... 65 Facts About Tables ................................................................................................................................................... 66 Chapter 3 – Hashing and Data Distribution ................................................................................................................ 68 Distribution Keys Hashed on Unique Values Spread Evenly.................................................................................. 
69 Distribution Keys With Non-Unique Values Spread Unevenly .............................................................................. 70 Best Practices for Choosing a Distribution Key ...................................................................................................... 71 The Hash Map Determines which Distribution owns the Row ............................................................................... 72 The Hash Map Determines which Node will Own the Row ................................................................................... 73 The Hash Map Determines which Node will Own the Row ................................................................................... 74 The Hash Map Determines which Node will Own the Row ................................................................................... 75 The Hash Map Determines which Node will Own the Row ................................................................................... 76 A Review of the Hashing Process ............................................................................................................................ 77 Non-Unique Distribution Keys have Skewed Data ................................................................................................. 78 Chapter 4 – The Technical Details.............................................................................................................................. 80 Every Node has the Exact Same Tables................................................................................................................... 81 Hashed Tables are spread across All Distributions ................................................................................................. 82 The Table Header and the Data Rows are Stored Separately .................................................................................. 
83 A Distribution Stores the Rows of a Table inside a Data Block.............................................................................. 84

Table of Contents To Read a Data Block a Node Moves the Block into Memory ............................................................................... 85 A Full Table Scan Means All Nodes Must Read All Rows..................................................................................... 86 Rows are Organized inside a Page ........................................................................................................................... 87 Moving Data Blocks is Like Checking In Luggage................................................................................................. 88 As Row-Based Tables Get Bigger, the Page Splits ................................................................................................. 89 Data Pages are Processed One at a Time Per Unit................................................................................................... 90 Creating a Table that is a Heap ................................................................................................................................ 91 Heap Page ................................................................................................................................................................. 92 Extents ...................................................................................................................................................................... 93 Creating a Table that has a Clustered Index ............................................................................................................ 94 Clustered Index Page................................................................................................................................................ 95 The Row Offset Array is the Guidance System for Every Row .............................................................................. 
96 The Row Offset Array Provides Two Search Options (1 of 2) ............................................................................... 97 The Row Offset Array Provides Two Search Options (2 of 2) ............................................................................... 98 The Row Offset Array Helps With Inserts .............................................................................................................. 99 B-Trees ................................................................................................................................................................... 100 The Building of a B-Tree for a Clustered Index (1 of 3) ....................................................................................... 101 The Building of a B-Tree for a Clustered Index (2 of 3) ....................................................................................... 102 The Building of a B-Tree for a Clustered Index (3 of 3) ....................................................................................... 103 When Do I Create a Clustered Index? ................................................................................................................... 104 When Do I Create a Non Clustered Index? ........................................................................................................... 105 B-Tree for Non Clustered Index on a Clustered Table (1 of 2) ............................................................................. 106 B-Tree for Non Clustered Index on a Clustered Table (2 of 2) ............................................................................. 107 Adding a Non Clustered Index To A Heap ............................................................................................................ 108 B-Tree for Non Clustered Index on a Heap Table (1 of 2) .................................................................................... 
109 B-Tree for Non Clustered Index on a Heap Table (2 of 2) .................................................................................... 110 Max Levels on the Azure SQL Data Warehouse ................................................................................................... 111 Azure SQL Data Warehouse Data Types .............................................................................................................. 112 Character Data Types for SQL Server ................................................................................................................... 113

Table of Contents Numeric Data Types for SQL Server ..................................................................................................................... 114 Date and Time Data Types for SQL Server ........................................................................................................... 115 Additional Data Types for SQL Server.................................................................................................................. 116 Chapter 5 – CREATE Statistics ............................................................................................................................... 118 CREATE Statistics Syntax..................................................................................................................................... 119 CREATE Statistics on a Percentage of a Table ..................................................................................................... 120 CREATE Statistics on a Sample by Using the System Default ............................................................................ 121 CREATE Statistics on a Multi-Column Join Key ................................................................................................. 122 What to Column(s) to CREATE Statistics On ....................................................................................................... 123 CREATE Statistics Using a WHERE Clause ........................................................................................................ 124 Updating All Statistics on a Table ......................................................................................................................... 125 Updating Only Certain Statistics on a Table.......................................................................................................... 
126 Dropping Statistics on Certain Statistics on a Table .............................................................................................. 127 Showing the Statistics ............................................................................................................................................ 128 DBCC SHOW_STATISTICS ................................................................................................................................ 129 DBCC SHOW_STATISTICS WITH HISTOGRAM ........................................................................................... 130 Chapter 6 - The Basics of SQL ................................................................................................................................. 132 Introduction ............................................................................................................................................................ 133 Naming of Objects ................................................................................................................................................. 134 Setting Your Default Database ............................................................................................................................... 135 SELECT * (All Columns) in a Table ..................................................................................................................... 136 Fully Qualifying a Database, Schema and Table ................................................................................................... 137 SELECT Specific Columns in a Table .................................................................................................................. 138 Commas in the Front or Back? .............................................................................................................................. 
139 Place your Commas in front for better Debugging Capabilities ............................................................................ 140 Sort the Data with the ORDER BY Keyword ....................................................................................................... 141 ORDER BY Defaults to Ascending ....................................................................................................................... 142

Table of Contents Use the Name or the Number in your ORDER BY Statement .............................................................................. 143 Two Examples of ORDER BY using Different Techniques ................................................................................. 144 Changing the ORDER BY to Descending Order................................................................................................... 145 NULL Values sort First in Ascending Mode (Default) ......................................................................................... 146 NULL Values sort Last in Descending Mode (DESC).......................................................................................... 147 Major Sort vs. Minor Sorts .................................................................................................................................... 148 Multiple Sort Keys using Names vs. Numbers ...................................................................................................... 149 Sorts are Alphabetical, NOT Logical ..................................................................................................................... 150 Using A CASE Statement to Sort Logically .......................................................................................................... 151 An Order By That Uses an Expression .................................................................................................................. 152 How to ALIAS a Column Name ............................................................................................................................ 153 Aliasing a Column Name with Spaces or Reserved Words................................................................................... 154 A Missing Comma can by Mistake become an Alias ............................................................................................ 
Comments using Double Dashes are Single Line Comments
Comments for Multi-Lines
Comments for Multi-Lines as Double Dashes Per Line
A Great Technique for Comments to Look for SQL Errors
sp_help at the Database Level
sp_help at the Object Level
Getting System Information
Getting Additional System Information

Chapter 7 – The Where Clause
The WHERE Clause limits Returning Rows
Double Quoted Aliases are for Reserved Words and Spaces
Using A Column ALIAS in a WHERE Clause
Using A Column ALIAS in an ORDER BY Clause
In What Order Does SQL Server Process A Query?
Character Data needs Single Quotes in the WHERE Clause
Character Data needs Single Quotes, but Numbers Don’t
Declaring a Variable
Comparisons against a Null Value
NULL means UNKNOWN DATA so Equal (=) won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
NULL is UNKNOWN DATA so NOT Equal won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
Using Greater Than or Equal To (>=)
AND in the WHERE Clause
Troubleshooting AND
OR in the WHERE Clause
Troubleshooting Or
Troubleshooting Character Data
Using Different Columns in an AND Statement
Quiz – How many rows will return?
Answer to Quiz – How many rows will return?
LIKE command Underscore is Wildcard for one Character
LIKE command using a Range of Values
LIKE command Using a NOT Range of Values
LIKE Command Works Differently on Char Vs Varchar
Troubleshooting LIKE Command on Character Data
Introducing the RTRIM Command
Quiz – What Data is Left Justified and What is Right?
Numbers are Right Justified and Character Data is Left
Answer – What Data is Left Justified and What is Right?
An Example of Data with Left and Right Justification
A Visual of CHARACTER Data vs. VARCHAR Data
RTRIM command Removes Trailing spaces on CHAR Data
Using Like with an AND Clause to Find Multiple Letters
Using Like with an OR Clause to Find Either Letters
Declaring a Variable and Using it with the LIKE Command
Escape Character in the LIKE Command changes Wildcards
Escape Characters Turn off Wildcards in the LIKE Command
Quiz – Turn off that Wildcard
ANSWER – To Find that Wildcard

Chapter 8 – Distinct, Group By and TOP
The Distinct Command
Distinct vs. GROUP BY
Quiz – How many rows come back from the Distinct?
Answer – How many rows come back from the Distinct?
TOP Command
TOP Command is brilliant when ORDER BY is used!
TOP Command with Ties
TOP Command Using a Variable

Chapter 9 – Aggregation
Quiz – You calculate the Answer Set in your own Mind
Answer – You calculate the Answer Set in your own Mind
The 3 Rules of Aggregation
There are Five Aggregates
Quiz – How many rows come back?
Answer – How many rows come back?
Troubleshooting Aggregates
GROUP BY when Aggregates and Normal Columns Mix
GROUP BY delivers one row per Group
Count_Big
Limiting Rows and Improving Performance with WHERE
WHERE Clause in Aggregation limits unneeded Calculations
Keyword HAVING tests Aggregates after they are Totaled
Group By Grouping Sets
Group By Rollup
Answer Set for Group By Rollup Query
Creating a Cube
Answer Set for Cube Query
An Easy Example of Creating a Cube
Quiz - GROUP BY GROUPING SETS Challenge
Answer To Quiz - GROUP BY GROUPING SETS Challenge
Getting the Average Values Per Column
Average Values per Column for all Columns in a Table

Chapter 10 - Join Functions
The Azure SQL Data Warehouse Join Quiz
The Azure SQL Data Warehouse Join Quiz Answer
Redistribution
Big Table Small Table Join Strategy
Duplication of the Smaller Table across All-Distributions
If the Join Condition is the Distribution Key no Movement
Matching Rows That Are On The Same Node Naturally
What if the Join Condition Columns are Not Primary Indexes
Strategy 1 of 4 – The Merge Join
Quiz – Redistribute the Employees by their Dept_No
Quiz – Dept_No landed on Distribution with Matches
Quiz – Redistribute the Orders to the Proper Distribution
Answer to Redistribute the Employees by their Dept_No Quiz
Strategy 2 of 4 – The Hash Join
Strategy 4 of 4 – The Product Join
A Two-Table Join Using Traditional Syntax
A two-table join using Non-ANSI Syntax with Table Alias
You Can Fully Qualify All Columns
A two-table join using ANSI Syntax
Both Queries have the same Results and Performance
Quiz – Can You Finish the Join Syntax?
Answer to Quiz – Can You Finish the Join Syntax?
Quiz – Can You Find the Error?
Answer to Quiz – Can You Find the Error?
Super Quiz – Can You Find the Difficult Error?
Answer to Super Quiz – Can You Find the Difficult Error?
Quiz – Which rows from both tables won’t Return?
Answer to Quiz – Which rows from both tables Won’t Return?
LEFT OUTER JOIN
LEFT OUTER JOIN Results
RIGHT OUTER JOIN
RIGHT OUTER JOIN Example and Results
FULL OUTER JOIN
FULL OUTER JOIN Results
Which Tables are the Left and which Tables are Right?
Answer - Which Tables are the Left and Which are the Right?
INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional WHERE Clause
OUTER JOIN with Additional WHERE Clause
OUTER JOIN with Additional AND Clause
OUTER JOIN with Additional AND Clause Results
Quiz – Why is this considered an INNER JOIN?
Evaluation Order for Outer Queries
The DREADED Product Join
The DREADED Product Join Results
The Horrifying Cartesian Product Join
The ANSI Cartesian Join will ERROR
Quiz – Do these Joins Return the Same Answer Set?
Answer – Do these Joins Return the Same Answer Set?
The CROSS JOIN
The CROSS JOIN Answer Set
The Self Join
The Self Join with ANSI Syntax
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
How would you Join these two tables?
An Associative Table is a Bridge that Joins Two Tables
Quiz – Can you write the 3-Table Join?
Answer to Quiz – Can you Write the 3-Table Join?
Quiz – Can you write the 3-Table Join to ANSI Syntax?
Answer – Can you Write the 3-Table Join to ANSI Syntax?
Quiz – Can you Place the ON Clauses at the End?
Answer – Can you Place the ON Clauses at the End?
The 5-Table Join – Logical Insurance Model
Quiz - Write a Five Table Join Using ANSI Syntax
Answer - Write a Five Table Join Using ANSI Syntax
Quiz - Write a Five Table Join Using Non-ANSI Syntax
Answer - Write a Five Table Join Using Non-ANSI Syntax
Quiz – Re-Write this putting the ON clauses at the END
Answer – Re-Write this putting the ON clauses at the END

Chapter 11 – Date Function
Current_Timestamp
Getdate
Date and Time Keywords
SYSDATETIMEOFFSET Provides the Timezone Offset
SYSDATETIMEOFFSET Provides the Timezone Offset
Using both CAST and CONVERT in Literal Values
Using Both CAST and CONVERT in Literal Values
Using both CAST and CONVERT in Literal Values
The DATEADD Function
The DATEDIFF Function
DATEADD Function
A Real World Example for DateAdd Using the Order Table
DATEPART Function
DATEPART Function Examples
YEAR, MONTH, and DAY Functions
A Better Technique for YEAR, MONTH, and DAY Functions
DATENAME Function
ISDATE Function

Chapter 12 - Temporary Tables
Temporary Tables
CREATING A Derived Table
Naming the Derived Table
Aliasing the Column Names in the Derived Table
Multiple Ways to Alias the Columns in a Derived Table
CREATING a Derived Table using the WITH Command
The Same Derived Query shown Three Different Ways
MULTIPLE Derived Tables using the WITH Command
Column Alias Can Default For Normal Columns
Most Derived Tables Are Used To Join To Other Tables
A Join Example Showing Different Column Alias Styles
The Three Components of a Derived Table
Visualize This Derived Table
Our Join Example With The WITH Syntax
Quiz - Answer the Questions
Answer to Quiz - Answer the Questions
Clever Tricks on Aliasing Columns in a Derived Table
A Derived Table lives only for the lifetime of a single query
An Example of Two Derived Tables in a Single Query
RECURSIVE Derived Table Hierarchy
RECURSIVE Derived Table Query
RECURSIVE Derived Table Definition
WITH RECURSIVE Derived Table Seeding
WITH RECURSIVE Derived Table Looping
RECURSIVE Derived Table Looping in Slow Motion
RECURSIVE Derived Table Looping Continued
RECURSIVE Derived Table Looping Continued
Six rows are added in the third loop. RECURSIVE Derived Table Ends the Looping
RECURSIVE Derived Table Ends the Looping
RECURSIVE Derived Table Definition
RECURSIVE Derived Table Answer Set
What is TEMPDB?
Creating a Temporary Table
The Three Steps to Use a Private Temporary Table
Creating a Temporary Table With a Clustered Index
Creating a Columnstore Temporary Table From a CTAS

Chapter 13 – Sub-query Functions
An IN List is much like a Subquery
An IN List Never has Duplicates – Just like a Subquery
An IN List Ignores Duplicates
The Subquery
The Three Steps of How a Basic Subquery Works
These are Equivalent Queries
The Final Answer Set from the Subquery
Quiz – Answer the Difficult Question
Answer to Quiz – Answer the Difficult Question
Should you use a Subquery or a Join?
Quiz – Write the Subquery
Answer to Quiz – Write the Subquery
Quiz – Write the More Difficult Subquery
Answer to Quiz – Write the More Difficult Subquery
Quiz – Write the Extreme Subquery
Answer to Quiz – Write the Extreme Subquery
Quiz – Write the Subquery with an Aggregate
Answer to Quiz – Write the Subquery with an Aggregate
Quiz – Write the Correlated Subquery
Answer to Quiz – Write the Correlated Subquery
The Basics of a Correlated Subquery
The Top Query always runs first in a Correlated Subquery
Correlated Subquery Example vs. a Join with a Derived Table
Quiz – A Second Chance to Write a Correlated Subquery
Answer - A Second Chance to Write a Correlated Subquery
Quiz – A Third Chance to Write a Correlated Subquery
Answer - A Third Chance to Write a Correlated Subquery
Quiz – Last Chance To Write a Correlated Subquery
Answer – Last Chance to Write a Correlated Subquery
Quiz – Write the Extreme Correlated Subquery
Answer To Quiz – Write the Extreme Correlated Subquery
Quiz- Write the NOT Subquery
Answer to Quiz- Write the NOT Subquery
Quiz- Write the Subquery using a WHERE Clause
Answer - Write the Subquery using a WHERE Clause
Quiz – Write the Triple Subquery
Answer to Quiz – Write the Triple Subquery
Quiz – How many rows return on a NOT IN with a NULL?
Answer – How many rows return on a NOT IN with a NULL?
How to handle a NOT IN with Potential NULL Values
Using a Correlated Exists
How a Correlated Exists matches up
The Correlated NOT Exists
The Correlated NOT Exists Answer Set
Quiz – How many rows come back from this NOT Exists?
Answer – How many rows come back from this NOT Exists?

Chapter 14 – Window Functions OLAP
The Row_Number Command
Quiz – How did the Row_Number Reset?
Quiz – How did the Row_Number Reset?
Using a Derived Table and Row_Number
Ordered Analytics OVER
RANK and DENSE RANK
RANK Defaults to Ascending Order
Getting RANK to Sort in DESC Order
RANK() OVER and PARTITION BY
Cumulative Sum
The ANSI CSUM – Getting a Sequential Number
Troubleshooting The ANSI OLAP on a GROUP BY
Reset with a PARTITION BY Statement
PARTITION BY only Resets a Single OLAP not ALL of them
Sorting in DESC Order
Moving Average
Casting a Moving Average
Partition By Resets an ANSI OLAP
COUNT OVER for a Sequential Number
Quiz – What caused the COUNT OVER to Reset?
Answer to Quiz – What caused the COUNT OVER to Reset?
The MAX OVER Command
MAX OVER with PARTITION BY Reset
MAX OVER Without Rows Unbounded Preceding
The MIN OVER Command
Quiz – Fill in the Blank
Answer – Fill in the Blank
How Ntile Works
Ntile
Ntile Continued
Ntile Percentile
Another Ntile Example
Using Quartiles (Partitions of Four)
NTILE Buckets
NTILE Using a Value of 10
NTILE With a Partition
Using LAG and LEAD
Using LEAD
Using LEAD With an Offset of 2
LEAD
LEAD With Partitioning
Using LAG
Using LAG With an Offset of 2
LAG
LAG with Partitioning
SUM(SUM(n))

Chapter 15 - Working with Strings
The ASCII Function
The CHAR Function
The UNICODE Function
The NCHAR Function
The LEN Function
The DATALENGTH Function
Concatenation
The RTRIM and LTRIM Command trims Spaces
The SUBSTRING Command
Using SUBSTRING to move Backwards
How SUBSTRING Works with a Starting Position of -1
How SUBSTRING Works with an Ending Position of 0
Concatenation and SUBSTRING
SUBSTRING and Different Aliasing
The LEFT and RIGHT Functions
Four Concatenations Together
The DATALENGTH Function and RTRIM
A Visual of the TRIM Command Using Concatenation
CHARINDEX Function Finds a Letter(s) Position in a String
The CHARINDEX Command is brilliant with SUBSTRING
The CHARINDEX Command Using a Literal
PATINDEX Function
PATINDEX Function to Find a Character Pattern
SOUNDEX Function to Find a Sound
DIFFERENCE Function to Quantile a Sound
The REPLACE Function
LEN and REPLACE Functions for Number of Occurrences
REPLICATE Function
STUFF Function
STUFF without Deleting Function
UPPER and lower Functions

Chapter 16 - Interrogating the Data
Quiz – What would the Answer be?
Answer to Quiz – What would the Answer be?
The NULLIF Command
Quiz – Fill in the Answers for the NULLIF Command
Answer– Fill in the Answers for the NULLIF Command
The COALESCE Command – Fill In the Answers
The COALESCE Answer Set
COALESCE is Equivalent to This CASE Statement
The Basics of CAST (Convert and Store)
Some Great CAST (Convert and Store) Examples
Some Great CAST (Convert and Store) Examples
A Rounding Example
Quiz - CAST Examples
Answer to Quiz - CAST Examples
Quiz - The Basics of the CASE Statements
Answer to Quiz - The Basics of the CASE Statements
Using an ELSE in the Case Statement
Using an ELSE as a Safety Net
Rules For a Valued Case Statement
Rules for a Searched Case Statement
Valued Case Vs. A Searched Case
Quiz - Valued Case Statement
Answer - Valued Case Statement
Quiz - Searched Case Statement
Answer - Searched Case Statement
Quiz - When NO ELSE is present in CASE Statement
Answer - When NO ELSE is present in CASE Statement
Quiz -When an Alias is NOT used in a CASE Statement
Answer -When an Alias is NOT used in a CASE Statement
Combining Searched Case and Valued Case
A Trick for getting a Horizontal Case
Nested Case
Put a CASE in the ORDER BY

Chapter 17 – Table Create and Data Types
Creating a Database
Creating a Table that is a Heap
Heap Page
Extents
Creating a Table That Has a Clustered Index
Clustered Index Page
When Do I Create a Clustered Index?
B-Trees
The Building of a B-Tree for a Clustered Index (1 of 3)
The Building of a B-Tree for a Clustered Index (2 of 3)
The Building of a B-Tree for a Clustered Index (3 of 3)
The Row Offset Array is the Guidance System For Every Row
The Row Offset Array Provides Two Search Options (1 of 2)
The Row Offset Array Provides Two Search Options (2 of 2)
The Row Offset Array Helps With Inserts
What is a Uniquefier?
Adding an Index
When Do I Create a Non Clustered Index?
B-Tree for Non Clustered Index on a Clustered Table (1 of 2)
B-Tree for Non Clustered Index on a Clustered Table (2 of 2)
Adding a Non Clustered Index To A Heap
B-Tree for Non Clustered Index on a Heap Table (1 of 2)
B-Tree for a Non Clustered Index on a Heap Table (2 of 2)
Default Values

Chapter 18 – View Functions
The Fundamentals of Views
Creating a Simple View to Restrict Sensitive Columns
Creating a Simple View to Restrict Rows
Basic Rules for Views
Two Exceptions to the ORDER BY Rule inside a View
Views sometimes CREATED for Row Security
Creating a View to Join Tables Together
You Select From a View
Another Way to Alias Columns in a View CREATE
The Standard Way Most Aliasing is done
What Happens When Both Aliasing Options Are Present
Resolving Aliasing Problems in a View CREATE
Answer to Resolving Aliasing Problems in a View CREATE
Aggregates on View Aggregates
Altering a Table
Altering a Table after a View has been Created
A View that Errors After an ALTER
Troubleshooting a View
Loading Data through a View

Chapter 19 – Data Manipulation Language (DML)
INSERT Syntax # 1
INSERT Example with Syntax 1
INSERT Syntax #2
INSERT Example with Syntax 2
INSERT/SELECT Command
INSERT/SELECT Example using All Columns (*)
INSERT/SELECT Example with Less Columns
The UPDATE Command Basic Syntax
Two UPDATE Examples
Subquery UPDATE Command Syntax
Example of Subquery UPDATE Command
Join UPDATE Command Syntax
Example of an UPDATE Join Command
The DELETE Command Basic Syntax
Two DELETE Examples to DELETE ALL Rows in a Table
To DELETE or to TRUNCATE
A DELETE Example Deleting only Some of the Rows
Subquery and Join DELETE Command Syntax
Example of Subquery DELETE Command
MERGE INTO
MERGE INTO

Chapter 20 – Set Operators Functions
Rules of Set Operators
INTERSECT Explained Logically
INTERSECT Explained Logically
UNION Explained Logically
UNION Explained Logically
UNION ALL Explained Logically
UNION ALL Explained Logically
EXCEPT Explained Logically
EXCEPT Explained Logically
Another EXCEPT Example
EXCEPT Explained Logically in Reverse Order
An Equal Amount of Columns in both SELECT List
Columns in the SELECT list should be from the same Domain
The Top Query handles all Aliases
The Bottom Query does the ORDER BY
Great Trick: Place your Set Operator in a Derived Table
UNION Vs UNION ALL
Using UNION ALL and Literals
A Great Example of how EXCEPT works
USING Multiple SET Operators in a Single Request
Changing the Order of Precedence with Parentheses
Building Grouping Sets Using UNION
Three Grouping Sets Using a UNION

Chapter 21 – Stored Procedure Functions
Creating a Stored Procedure

Table of Contents Executing a Stored Procedure ................................................................................................................................ 636 There are Three Ways to Execute a Stored Procedure .......................................................................................... 637 Creating a Stored Procedure with a CASE Statement ........................................................................................... 638 Our Answer Set ...................................................................................................................................................... 639 Dropping a Stored Procedure ................................................................................................................................. 640 Passing an Input Parameter to a Stored Procedure ................................................................................................ 641 Executing With Positional Parameter vs. Named Parameters ............................................................................... 642 Passing an Output Parameter to a Stored Procedure .............................................................................................. 643 Changing a Stored Procedure with an ALTER ...................................................................................................... 644 Answer Set for the Altered Stored Procedure ........................................................................................................ 645 Using a Stored Procedure to Delete a Row ............................................................................................................ 646 A Different Method to Delete a Row ..................................................................................................................... 
647 Deleting a Row Using an Input Parameter ............................................................................................................ 648 Using Loops in Stored Procedures ......................................................................................................................... 649 Stored Procedure Workshop .................................................................................................................................. 650 Looping with a WHILE Statement ........................................................................................................................ 651 Chapter 22 – Statistical Aggregate Functions........................................................................................................... 653 The Stats Table ....................................................................................................................................................... 654 Above, is the Stats_Table data in which we will use in our statistical examples. ................................................. 654 The VAR and VARP Functions ............................................................................................................................. 655 A VAR Example .................................................................................................................................................... 656 A VARP Example .................................................................................................................................................. 657 The STDEV and STDEVP Functions .................................................................................................................... 658 A STDEV Example ................................................................................................................................................ 
659 A STDEVP Example.............................................................................................................................................. 660 Chapter 23 – Systems Views .................................................................................................................................... 662 System Views ......................................................................................................................................................... 663

Table of Contents sys.all_columns ...................................................................................................................................................... 664 sys.all_objects ........................................................................................................................................................ 665 sys.all_sql_modules ............................................................................................................................................... 666 sys.all_views .......................................................................................................................................................... 667 sys.columns ............................................................................................................................................................ 668 sys.data_spaces....................................................................................................................................................... 669 sys.database_files ................................................................................................................................................... 670 sys.database_principals .......................................................................................................................................... 671 sys.database_role_members ................................................................................................................................... 672 sys.databases .......................................................................................................................................................... 673 sys.filegroups.......................................................................................................................................................... 
674 sys.identity_columns .............................................................................................................................................. 675 sys.objects .............................................................................................................................................................. 676 sys.partition_range_values ..................................................................................................................................... 677 sys.schemas ............................................................................................................................................................ 678 sys.server_role_members ....................................................................................................................................... 679 sys.sql_logins ......................................................................................................................................................... 680 Chapter 24 – Nexus ................................................................................................................................................... 682 Nexus is Now Available on the Microsoft Azure Cloud ....................................................................................... 683 Nexus Queries Every Major System ...................................................................................................................... 684 Setup of Nexus is as easy as pie ............................................................................................................................. 685 Setup of Nexus is a Easy as 1, 2, 3 ........................................................................................................................ 686 Nexus Data Visualization ....................................................................................................................................... 
687 Nexus Data Visualization ....................................................................................................................................... 688 Nexus Data Visualization Shows What Tables Can Be Joined ............................................................................. 689 Nexus is doing a Five-Table Join ........................................................................................................................... 690 Nexus Generates the SQL Automatically .............................................................................................................. 691 Nexus Delivers the Report ..................................................................................................................................... 692

Table of Contents Cross-System Joins from Teradata, Oracle and SQL Server ................................................................................. 693 The Tab of the Super Join Builder ......................................................................................................................... 694 The 9 Tabs of the Super Join Builder – Objects Tab 1 .......................................................................................... 695 Selecting Columns in the Objects Tab ................................................................................................................... 696 The 9 Tabs of the Super Join Builder – Columns Tab 2........................................................................................ 697 Removing Columns from the Report in the Columns Tab .................................................................................... 698 The 9 Tabs of the Super Join Builder – Sorting Tab 3 .......................................................................................... 699 The 9 Tabs of the Super Join Builder – Joins Tab 4 .............................................................................................. 700 The 9 Tabs of the Super Join Builder – Where Tab 5 ........................................................................................... 701 Using the WHERE Tab For Additional WHERE or AND .................................................................................... 702 The 9 Tabs of the Super Join Builder – SQL Tab 6............................................................................................... 703 The 9 Tabs of the Super Join Builder – Answer Set Tab 7 ................................................................................... 704 The 9 Tabs of the Super Join Builder – Analytics Tab 9 ....................................................................................... 
705 Analytics Tab ......................................................................................................................................................... 706 Analytics Tab – OLAP Example ........................................................................................................................... 707 Analytics Tab – OLAP Example of SQL Generated ............................................................................................. 708 Analytics Tab – Grouping Sets Example ............................................................................................................... 709 Analytics Tab – Grouping Sets Answer Set .......................................................................................................... 710 Nexus Data Movement ........................................................................................................................................... 711 Moving a Single Table To a Different System ...................................................................................................... 712 The Single Table Data Movement Screen ............................................................................................................. 713 Moving an Entire Database To a Different System ............................................................................................... 714 The Database Mover Screen .................................................................................................................................. 715 The Database Mover Options Tab ......................................................................................................................... 716 Converting DDL Table Structures ......................................................................................................................... 
717 Converting DDL Table Structures ......................................................................................................................... 718 Converting DDL Table Structures ......................................................................................................................... 719 Compare and Synchronize ..................................................................................................................................... 720 Compare Two Different Databases From Different Systems ................................................................................ 721

Table of Contents Comparisons Down to the Column Level .............................................................................................................. 722 The Results Tab...................................................................................................................................................... 723 View Differences ................................................................................................................................................... 724 Synchronizing Differences In the Results Tab ...................................................................................................... 725 Synchronizing Differences In the Results Tab ...................................................................................................... 726 Hound Dog Compression ....................................................................................................................................... 727 Hound Dog Compression on Teradata ................................................................................................................... 728 Hound Dog Compression on Teradata ................................................................................................................... 729

Chapter 1 – Introduction to the Azure SQL Data Warehouse

“The man who has no imagination has no wings.” – Muhammad Ali

Page 1

Introduction to the Family of SQL Server Products

Microsoft SQL Server Compact 4.0

Microsoft SQL Server Compact 4.0 is a compact database that is embedded inside Nexus and other desktop applications around the world. It is also ideal for embedding in web applications. SQL Server Compact 4.0 gives developers a common programming model with the other SQL Server editions, which is important for developing both native and managed applications. SQL Server Compact provides outstanding flexibility in a small footprint.

SQL Server 2014 Express Edition

Microsoft provides this edition for free! This powerful database engine is perfect for embedded applications or for redistribution with other solutions. Independent software vendors (ISVs) use it to build desktop applications. If you need support for databases larger than 10 GB, SQL Server Express is compatible with the other editions of SQL Server.

SQL Server Standard Edition

Microsoft's robust data management and business intelligence database is ideal for departments and small workgroups. It supports common development tools for both on-premises and cloud applications. This edition enables effective database management with minimal IT resources, and it is compatible with the other editions.

Above are the first three SQL Server offerings from Microsoft.

Page 2

Introduction to the Family Continued

Microsoft SQL Server Web Edition

Microsoft's Web edition is a low total-cost-of-ownership option for hosting Web applications. It provides scalability, affordability, and manageability for small to large-scale Web initiatives.

SQL Server 2014 Business Intelligence Edition

Microsoft's Business Intelligence edition delivers a comprehensive platform for the BI community. It empowers organizations to build and deploy secure, scalable and manageable BI solutions. It offers browser-based data exploration and visualization, plus powerful integration capabilities.

SQL Server 2014 Enterprise Edition

Microsoft's SQL Server 2014 Enterprise edition delivers high-end datacenter capabilities, with performance that has been enhanced for virtualization, business intelligence and integration. This enables high service levels for mission-critical workloads and end-user access to data insights.

Above are the next three SQL Server offerings from Microsoft.

Page 3

Microsoft Azure SQL Data Warehouse

Azure SQL Data Warehouse

Microsoft's Azure SQL Data Warehouse is a massively parallel processing (MPP) data warehousing appliance that is built for any volume of relational data and provides integration with Hadoop. Azure SQL Data Warehouse can provide up to 100x performance gains over other SQL Server platforms. This is the MPP platform that provides linear scalability as data volumes grow and the number of users increases.

Azure SQL Data Warehouse is designed to parallelize and distribute processing across multiple Symmetric Multi-Processing (SMP) compute nodes. Azure SQL Data Warehouse is only available as part of Microsoft’s Analytics Platform System (APS) appliance. Azure SQL Data Warehouse is a shared-nothing architecture, which means each processor has its own operating system, memory and set of disks. Nothing is shared! Data is “horizontally partitioned” across nodes. This means that each node has a subset of the rows from each table in the database. Each node is then responsible for processing only the rows on its own disks.

Above is the information about Microsoft's Azure SQL Data Warehouse, which is Microsoft's MPP system.

Page 4

Symmetric Multi-Processing (SMP)

[Figure: four CPUs, each with its own cache, connected by a bus to shared memory and disk I/O.]

A Symmetric Multi-Processing system has multiple processors for extra power, but these processors share a single operating system, a single memory pool, and access to the disks. This is a great architecture for speed, similar to a restaurant that is quick and organized, but it lacks the ability for unlimited expansion. When there are too many cooks in the kitchen, you need an MPP system that scales many SMP systems together as one parallel processing data warehouse.

A Symmetric Multi-Processing (SMP) system is what Microsoft is known for in its SQL Server suite of products. The only product that does not use an SMP design is the new Azure SQL Data Warehouse. It uses a Massively Parallel Processing (MPP) design.

Page 5

What is Parallel Processing?

“After enlightenment, the laundry” - Zen Proverb

[Figure: five panels, each labeled “Tera-Tom’s Parallel Processing Wash and Dry”.]

“After parallel processing the laundry, enlightenment!” - Azure SQL Data Warehouse Zen Proverb

Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said, “What?!” The man explained that if he went to the laundromat the next morning, he would be lucky to get one machine and would be there all day. But if he went on Saturday night, he could get all the machines and do all his wash and dry in two hours. Now that’s parallel processing mixed in with a little dry humor!

Page 6

The Basics of a Single Computer

[Figure: a single computer with a CPU, memory and a disk. Memory asks, “How are we doing on orders today?” The disk, which holds the Orders table, replies, “How would I know? I'm just a disk. I need to transfer the block of data to the memory, and that is a slow process.”]

Orders

Order_No  Customer_No  Order_Date  Order_Total
100       21345679     01/01/2013  12347.53
200       32456733     01/01/2013   8005.91
300       31323134     01/01/2013   5111.47
400       87323456     01/01/2013  15231.62

“When you are courting a nice girl, an hour seems like a second. When you sit on a red-hot cinder, a second seems like an hour. That’s relativity.” – Albert Einstein

Data on disk does absolutely nothing. When data is requested, the computer moves the data one block at a time from disk into memory. Once the data is in memory, it is processed by the CPU at lightning speed. All computers work this way. The "Achilles Heel" of every computer is the slow process of moving data from disk to memory. The real theory of relativity is to find out how to get blocks of data from the disk into memory faster!

Page 7
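The block-at-a-time movement described above can be sketched in Python. This is only an illustration: the 4096-byte block size, the function name `read_in_blocks`, and the temporary file that plays the role of the disk are all assumptions, not details from the book.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def read_in_blocks(path, block_size=BLOCK_SIZE):
    """Yield the file's contents one block at a time (disk to memory)."""
    with open(path, "rb") as disk_file:
        while True:
            block = disk_file.read(block_size)
            if not block:  # an empty bytes object means end of file
                break
            yield block


# Build a small stand-in "disk" file of 10,000 bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10_000)
    disk_path = tmp.name

blocks = list(read_in_blocks(disk_path))
os.remove(disk_path)

block_count = len(blocks)                  # number of disk-to-memory transfers
total_bytes = sum(len(b) for b in blocks)  # everything that reached memory
print(block_count, total_bytes)  # 3 10000
```

Each pass through the loop is one transfer from disk into memory, which is the slow step the text describes. Fewer, larger transfers or more disks working at once are the usual ways to reduce that cost.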

Data in Memory is Fast as Lightning

[Figure: the CPU processes a block of Orders rows that has been copied from the Orders table on disk into memory. Memory and disk hold the same four rows.]

Orders

Order_No  Customer_No  Order_Date  Order_Total
100       21345679     01/01/2013  12347.53
200       32456733     01/01/2013   8005.91
300       31323134     01/01/2013   5111.47
400       87323456     01/01/2013  15231.62

“You can observe a lot by watching.” – Yogi Berra

Once the data block is moved off the disk and into memory, the processing of that block happens as fast as lightning. It is the movement of the block from disk into memory that slows down every computer. Data being processed in memory is so fast that even Yogi Berra couldn't catch it!

Page 8

Parallel Processing of Data

[Figure: the Orders table spread across four distributions. Each distribution holds four rows on its own disk and reads them into its own memory.]

Orders

Distribution  Cust_No   Order_Date  Order_Total
1             21345679  01/01/2013  12347.53
1             32456733  01/01/2013   8005.91
1             31323134  01/01/2013   5111.47
1             87323456  01/01/2013  15231.62
2             34345699  01/01/2013  13347.51
2             41456543  01/01/2013  13005.91
2             51323154  01/01/2013   7611.57
2             67823486  01/01/2013  11671.92
3             87945679  01/01/2013   8347.53
3             98756733  01/01/2013  17005.91
3             35623134  01/01/2013   3451.47
3             97873456  01/01/2013  19871.62
4             44445679  01/01/2013  12447.53
4             32547733  01/01/2013   8055.66
4             57497134  01/01/2013   5651.47
4             87768956  01/01/2013    231.62

"If the facts don't fit the theory, change the facts." -Albert Einstein

Big Data is all about parallel processing. Parallel processing is all about taking the rows of a table and spreading them among many parallel processing units. Above, we can see a table called Orders. There are 16 rows in the table. Each parallel processor holds four rows. Now they can process the data in parallel and be four times as fast. What Albert Einstein meant to say was, “If the theory doesn't fit the dimension table, change it to a fact."
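As a rough sketch of the idea, the Python below lets four workers each total the Order_Total values held by one distribution in the figure, then combines the four partial totals. Threads are only a stand-in for the parallel processing units, and the names `distributions` and `local_total` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from decimal import Decimal

# Order_Total values held by each of the four distributions in the figure.
distributions = [
    ["12347.53", "8005.91", "5111.47", "15231.62"],
    ["13347.51", "13005.91", "7611.57", "11671.92"],
    ["8347.53", "17005.91", "3451.47", "19871.62"],
    ["12447.53", "8055.66", "5651.47", "231.62"],
]


def local_total(rows):
    """Each distribution sums only the four rows it holds."""
    return sum(Decimal(value) for value in rows)


# One worker per distribution; threads stand in for parallel units.
with ThreadPoolExecutor(max_workers=len(distributions)) as pool:
    partial_totals = list(pool.map(local_total, distributions))

# A final, cheap step combines the four partial answers.
grand_total = sum(partial_totals)
print(grand_total)  # 161396.25
```

No worker ever touches more than 4 of the 16 rows, which is where the "four times as fast" claim comes from. Only the last addition of four numbers is done in one place.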

Page 9

A Table has Columns and Rows

Employee_Table

Emp_No  Dept_No  First_Name  Last_Name  Salary
1001    100      Rafael      Minal      90000
1002    200      Maria       Gomez      80000
1003    300      Charl       Kertzel    70000
1004    400      Kyle        Stover     60000
1005    400      Rob         Rivers     50000
1006    300      Inna        Kinski     50000
1007    200      Sushma      Davis      50000
1008    100      Mo          Khan       60000
1009    300      Mo          Swartz     70000

[Figure: the nine rows of Employee_Table spread across three distributions.]

Distribution  Rows held (Emp_No)
1             1001, 1004, 1007
2             1002, 1005, 1008
3             1003, 1006, 1009

The table above has 9 rows. Our small system above has three parallel processing units called distributions, and each distribution holds three rows. In an actual system there are eight distributions per node, so a four-node system will have 32 distributions. Double your nodes and you double your speed and power. The idea of parallel processing is to take the rows of a table and spread them across the distributions so that each distribution can process its portion of the data in parallel.

Page 10
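The arithmetic and the spreading of rows can be sketched as follows. The eight-distributions-per-node figure comes from the text. The round-robin placement rule is an assumption made for illustration (the text does not say how rows are assigned), and the function names are hypothetical.

```python
DISTRIBUTIONS_PER_NODE = 8  # figure quoted in the text


def total_distributions(node_count):
    """Number of distributions in a system with node_count compute nodes."""
    return node_count * DISTRIBUTIONS_PER_NODE


def spread_rows(rows, distribution_count):
    """Deal rows out round-robin (an assumed placement rule)."""
    buckets = [[] for _ in range(distribution_count)]
    for position, row in enumerate(rows):
        buckets[position % distribution_count].append(row)
    return buckets


employee_numbers = [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009]
buckets = spread_rows(employee_numbers, 3)

print(total_distributions(4))  # 32
print(buckets)  # [[1001, 1004, 1007], [1002, 1005, 1008], [1003, 1006, 1009]]
```

With nine rows and three distributions, round-robin happens to reproduce the layout in the figure, with three rows per distribution.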

The Azure SQL Data Warehouse has Linear Scalability

[Figure: the MPP Engine and two rows of distributions (DIST), each row attached to an Infiniband Network.]

"A journey of a thousand miles begins with a single step." - Lao Tzu

The Azure SQL Data Warehouse was born to be parallel. With each query, a single step is performed in parallel by each distribution. An Azure SQL Data Warehouse system consists of a series of distributions that work in parallel to store and process your data. This design allows you to start small and grow infinitely. If your Azure SQL Data Warehouse system provides you with an excellent Return On Investment (ROI), then continue to invest by purchasing more nodes, which adds more Distributions. Most companies start small, but after seeing what an Azure SQL Data Warehouse can do, they continue to grow their ROI from the single step of implementing an Azure SQL Data Warehouse system to millions of dollars in profits. Double your compute nodes and double your speed... forever. The Azure SQL Data Warehouse actually provides a journey of a thousand smiles!

Page 11

The Architecture of the Azure SQL Data Warehouse

The MPP Engine manages the distribution of data and builds the plan for the nodes to follow.

[Figure: the MPP Engine above Azure SQL Data Warehouse Node 1 through Node n. Each node contains eight distributions (DIST).]

“Be the change that you want to see in the world.” - Mahatma Gandhi

The MPP Engine is the brains behind the entire operation. The user logs into the MPP Engine, and for each SQL query, the MPP Engine comes up with a plan to retrieve the data. It passes that compiled plan to each compute node, and each of the node's 8 Distributions processes its portion of the data. If the data is spread evenly, parallel processing works perfectly. This technology is relatively inexpensive. It might not "be the change", but it will help your company "keep the change" because costs are low. Microsoft's Azure SQL Data Warehouse uses both SMP and MPP technology. Each node is an SMP system, but many nodes are linked together to become one big MPP system.

Page 12
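To see why an even spread matters, here is a small Python sketch. It rests on a simplifying assumption that is not stated in the book: a parallel step finishes only when the busiest distribution finishes, so the share of rows on that distribution sets the pace. The function name and the row counts are hypothetical.

```python
def busiest_share(rows_per_distribution):
    """Fraction of all rows that the busiest distribution must process."""
    return max(rows_per_distribution) / sum(rows_per_distribution)


# 800 rows on one node's eight distributions, spread two different ways.
even = [100] * 8
skewed = [450] + [50] * 7

print(busiest_share(even))    # 0.125
print(busiest_share(skewed))  # 0.5625
```

With an even spread, each distribution handles one eighth of the rows. In the skewed case one distribution handles more than half of them while the other seven sit mostly idle, so much of the parallel benefit is lost.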

Nexus is now Available on the Microsoft Azure Cloud

Why the Nexus Chameleon should be your query tool of choice:

1) Queries every major system
2) Provides visualization and automatically writes the SQL
3) Can perform cross-system joins with a few clicks of the mouse
4) Converts table structures and moves the table and data between systems
5) Compares and synchronizes databases
6) Can move an entire database of tables or views between systems
7) Has the "Garden of Analysis" to re-query answer sets inside your PC
8) Provides a dashboard of graphs and charts for answer sets

Download the Nexus for a free trial at www.CoffingDW.com and use Nexus in-house or on the Microsoft Azure cloud.

Page 13

The MPP Engine is the Optimizer

[Figure: the Control Rack, containing the Management Node (active/passive), the Control Node (active/passive) that receives user queries, the Landing Zone, and the Backup Node.]

The control node receives all queries and then the MPP Engine (Optimizer):

1. Parses the SQL text.
2. Validates that all objects exist and that the user has the required access rights.
3. Builds a plan for the nodes to follow.
4. Runs the MPP execution plan by executing SQL SELECT commands in parallel on each compute node.
5. Gathers and merges all the parallel result sets from the compute nodes.
6. Returns a single result set to the client.

The brains behind all user queries lie in the MPP Engine. The MPP Engine receives the query, checks the syntax and the security and then comes up with a plan for the nodes to follow.

Page 14
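The six steps the MPP Engine performs can be sketched as a toy Python pipeline. Everything here is a stand-in invented for illustration: the catalog, the access list, the user name, the per-node data, and the one-statement "parser" are hypothetical, and threads merely imitate compute nodes working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge

# Hypothetical stand-ins for the catalog, user rights, and per-node data.
CATALOG = {"Orders"}
ACCESS = {"tom": {"Orders"}}
NODE_ROWS = [[100, 400], [200], [300]]  # Order_No values held by each compute node


def scan_node(rows):
    """Work done on one compute node: return its rows in sorted order."""
    return sorted(rows)


def run_query(user, sql):
    # 1. Parse the SQL text (toy parser: "SELECT Order_No FROM <table>" only).
    words = sql.strip().rstrip(";").split()
    if len(words) != 4 or words[0].upper() != "SELECT" or words[2].upper() != "FROM":
        raise ValueError("unsupported SQL in this sketch")
    table = words[3]

    # 2. Validate that the object exists and that the user may read it.
    if table not in CATALOG:
        raise LookupError(f"unknown table: {table}")
    if table not in ACCESS.get(user, set()):
        raise PermissionError(f"{user} may not read {table}")

    # 3. Build a plan: one scan task per compute node.
    plan = list(NODE_ROWS)

    # 4. Run the plan in parallel, one worker per compute node.
    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        partial_results = list(pool.map(scan_node, plan))

    # 5. Gather and merge the sorted partial result sets.
    merged = list(merge(*partial_results))

    # 6. Return a single result set to the client.
    return merged


result = run_query("tom", "SELECT Order_No FROM Orders")
print(result)  # [100, 200, 300, 400]
```

The client sees one ordered answer and never learns that three nodes each produced a piece of it. A user without rights on the table is stopped at step 2, before any compute node does work.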

The Azure SQL Data Warehouse System

[Figure: a Control Rack and a Data Rack. The Control Rack holds the Management Node (active/passive), the Control Node (active/passive) that receives user queries, the Landing Zone for data loading, and the Backup Node for data backup. The Data Rack holds active servers running SQL Server, each with dedicated storage, plus a passive server. Labels: Dual Infiniband, Dual Fibre Channel.]

Above is a diagram of an Azure SQL Data Warehouse system. There is one Control Rack and many Data Racks.

Page 15

The Azure SQL Data Warehouse System is Scalable

[Figure: one Control Rack (Management, Control Node, Landing Zone, Backup Node) and two Data Racks, each with active servers, storage, and a passive server.]

The Azure SQL Data Warehouse will take up at least two full racks of space, and you can add storage and compute capacity one data rack at a time. A data rack contains 8 to 10 compute servers. A great advantage of the Azure SQL Data Warehouse is that it works on a wide variety of hardware. Vendors such as Bull, Dell, HP, and IBM provide the hardware, and Fibre Channel storage arrays come from vendors like EMC, HP, and IBM. The control node controls the physical servers and guides them to work together in parallel. The control node acts as the optimizer: it accepts client query requests and then creates the plan. It then calls upon one or more compute nodes to execute different parts of the query, often in parallel. The result set is then sent back to the user.

Page 16

Chapter 1

Introduction

The Control Node

[Figure: the Control Rack, containing the Management Node (active/passive), the Control Node (active/passive), the Landing Zone and the Backup Node.]

The Azure SQL Data Warehouse control node allows users to connect to and query the Azure SQL Data Warehouse database.

It is the control node that comes up with a parallel plan for the nodes to follow to retrieve query results. The control node has an instance of the SQL Server 2014 database for storing metadata.

The control node is also responsible for all intermediate query results in TempDB. It receives intermediate query results from multiple compute nodes, stores them in SQL Server temporary tables, and then merges them into a single result set for final delivery to the client. The control node is an active/passive cluster server. Plus, there is a spare compute node for redundancy and failover capability.

Think of the control node as the optimizer, or as the conductor of the Azure SQL Data Warehouse orchestra of servers.

The Data Rack

[Figure: a Data Rack of active servers, each running SQL with dedicated storage, plus one passive server, connected by Dual Infiniband and Dual Fibre Channel.]

The data rack of the Azure SQL Data Warehouse contains 8 to 10 compute nodes along with their related storage nodes, depending on the hardware vendor.

Each compute node is a physical server that runs a standalone SQL Server 2014 relational engine instance.

The storage nodes are Fibre Channel-connected storage arrays containing 10 to 12 disk drives.

Above is a diagram of an Azure SQL Data Warehouse Data Rack. This is where the data is stored and the parallel processing magic occurs.

The Landing Zone

[Figure: the Control Rack, with the Landing Zone highlighted alongside the Management Node (active/passive), the Control Node (active/passive) and the Backup Node.]

The Landing Zone node is used to load data directly into the Azure SQL Data Warehouse. The load utility named dwloader is used for high-speed parallel loading of large data files into databases.

The brilliance of this design is that there is minimal impact on concurrent queries executing on the Azure SQL Data Warehouse. With this utility, data from a disk or a SQL Server Integration Services (SSIS) pipeline can be loaded, in parallel, to all compute nodes.

The Backup Node

[Figure: the Control Rack, with the Backup Node highlighted alongside the Management Node (active/passive), the Control Node (active/passive) and the Landing Zone.]

The Azure SQL Data Warehouse backup node is used for backing up user databases. These databases are physically spread across all compute nodes and their related storage nodes. When backing up a user database, each compute node backs up, in parallel, its portion of the database.

Software as a Service (SaaS) and the Elastic Database

[Figure: the MPP Engine connected over the Infiniband Network to two rows of distributions (DIST).]

Software-as-a-service (SaaS) applications need versatility and flexibility. They need to focus on end-user solutions without worrying about managing databases, schemas and sizing requirements. The Microsoft Azure SQL Data Warehouse is designed to provide the flexibility to grow as needed at the time. Workloads that continually change over time have unpredictable database resource consumption. That is why the elastic database model lets users pool resources and share them across a single database or a group of databases. The idea of using the resources you need, when you need them, and not having to worry about provisioning and predicting the unpredictable is priceless. The most important thing a system can do for its users is to give them the space they need.

Azure Data Lake

[Figure: the Azure Data Lake alongside the Azure SQL Data Warehouse, traditional on-premise systems (SQL Server), cloud systems and Hadoop (HDFS file system, Hortonworks, Cloudera), spanning structured, semi-structured and unstructured data.]

The future of computing is data! Not data on any particular system or structure, but simply data that resides anywhere in the enterprise. That is why Microsoft has created the Azure SQL Data Warehouse to store relational data in the cloud and the Azure Data Lake to store unstructured data. A data lake consists of raw data of all types in its native format: traditional structured data, unstructured data and semi-structured data. The Azure Data Lake is a data store for big data analytics. This gives users the best of all data worlds, mixing on-premise traditional systems with Hadoop HDFS file systems. In the Azure Data Lake, there are plenty of fish in the sea because this lake can store every type of data with no fixed limits on account size or file size. The Azure Data Lake is a Hadoop file system compatible with HDFS. It is integrated with Azure HDInsight, Revolution-R Enterprise, Hortonworks and Cloudera.

Azure Disaster Recovery

[Figure: Node 1 through Node n, each holding Order_Table, Sales_Table, Student_Table, Course_Table, Cust_Table and Claims_Table.]

The Microsoft Azure cloud provides data availability with built-in replicas and a competitive 99.99% Service Level Agreement at the database level. Instead of worrying about a disaster, you can count on disaster recovery objectives that are an estimated 360x lower. Microsoft uses active geo-replication, which lets users create up to 4 readable secondaries in any Microsoft Azure region and gives them control over when and where to fail over. Currently, users have up to 35 days of backups available for recovery.

Security and Compliance

[Figure: a guarded Data Rack of active and passive servers with storage, captioned "Stop! Who goes there?"]

The Microsoft Azure cloud supports security and compliance tasks through a wide variety of features. Users can implement database-level security such as database auditing, views, row-level security, data masking and encryption. Microsoft Azure's cloud security and compliance have also been independently verified by key cloud auditors as part of the scope of key Azure compliance certifications and approvals such as HIPAA BAA, E.U. Model Clauses, ISO/IEC 27001:2005 and FedRAMP. Some companies actually feel safer on the cloud than in their own on-premises data centers.

How to Get an EXPLAIN Plan

To EXPLAIN a query, just:
1. Type EXPLAIN
2. Or press F6
3. Or click the magnifying glass on Nexus

EXPLAIN SELECT * FROM Employee_Table;

This EXPLAIN plan shows we are utilizing a system with 2 nodes on a table that has spread the rows across 16 distributions

SELECT * FROM Employee_Table

SELECT [T1_1].[Employee_No] AS [Employee_No],
       [T1_1].[Dept_No] AS [Dept_No],
       [T1_1].[Last_name] AS [Last_name],
       [T1_1].[First_name] AS [First_name],
       [T1_1].[Salary] AS [Salary]
FROM [SQL_CLASS].[dbo].[Employee_table] AS T1_1

You can get an EXPLAIN plan by placing the keyword EXPLAIN in front of any SQL. You can also press function key 6 (F6). If you are using the Nexus Chameleon, you can click on the magnifying glass near the EXECUTE button.

Chapter 2 – The Azure SQL Data Warehouse Table Structures

“Let me once again explain the rules. The Azure SQL Data Warehouse Rules!” - Tera-Tom Coffing


The 5 Concepts of Azure SQL Data Warehouse Tables

1. Tables are either distributed by hash or replicated
2. The rows of a table are either sorted or unsorted
3. Tables are stored physically on disk in either a row or columnar format
4. Tables can be partitioned
5. Tables are either permanent, temporary or external tables

Above are the basic concepts for Azure SQL Data Warehouse tables. The next five pages cover each point one at a time, so you can see exactly what is going on.

Tables are Either Distributed by Hash or Replicated (1 of 5)

Hashed: Each distribution holds different rows. Each row is hashed by the values in a certain column, such as Employee_No.

Distribution 1: 1 Joel Davis, 4 Rick Jahns, 7 Lynn Meyer, 11 Seth Rogers
Distribution 2: 2 Mary Lewis, 5 John Miller, 8 Rich Jones, 12 Kyle Watson
Distribution 3: 3 Tony Brady, 6 Lana Payne, 9 Lorie Stewart, 13 Dawn Daily

Replicated: Each node holds all rows of a table. The table is literally duplicated on each and every node.

Every node: 100 Sales, 200 Marketing, 300 Finance, 400 HR

The Azure SQL Data Warehouse gives you two choices for table distribution: hash or replicated. Large fact tables are usually hashed and smaller tables are usually replicated. When a table is hashed, one of the columns is chosen as the distribution key. In our example above, the Employee_Table (top) is hashed by Employee_No. The replicated table (bottom) has only four rows in it, and all four rows are on each node.
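The two choices can be modelled in a few lines of Python. This is only a sketch: the warehouse's real hashing formula is internal, so a plain modulo stands in for it here, and `distribute_by_hash` and `replicate` are hypothetical helper names.

```python
def distribution_for(key, num_distributions):
    """Stand-in hash (an assumption): the real hashing formula is internal."""
    return key % num_distributions

def distribute_by_hash(rows, key_index, num_distributions):
    """Hashed table: each row lands on exactly one distribution."""
    distributions = [[] for _ in range(num_distributions)]
    for row in rows:
        target = distribution_for(row[key_index], num_distributions)
        distributions[target].append(row)
    return distributions

def replicate(rows, num_nodes):
    """Replicated table: every node gets its own full copy."""
    return [list(rows) for _ in range(num_nodes)]

employees = [(1, "Joel Davis"), (2, "Mary Lewis"), (3, "Tony Brady"),
             (4, "Rick Jahns"), (5, "John Miller"), (6, "Lana Payne")]
departments = [(100, "Sales"), (200, "Marketing"), (300, "Finance"), (400, "HR")]

hashed = distribute_by_hash(employees, 0, 3)   # different rows per distribution
copies = replicate(departments, 3)             # same rows on every node
```

Note that the row placement this toy hash produces will not match the figure exactly; the point is only that a hashed row lives in one place while a replicated row lives everywhere.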

Table Rows are Either Sorted or Unsorted (2 of 5)

Sorted: This table is sorted because it was created with a Clustered Index on Employee_No.

Employee_No  Dept_No  Last_Name  First_Name  Salary
1001         100      Rafael     Minal       90000
1004         400      Kyle       Stover      60000
1007         200      Sushma     Davis       50000
1020         200      May        Jones       60000

Not Sorted: This table is unsorted (heap) because it was NOT created with a Clustered Index.

Employee_No  Dept_No  Last_Name  First_Name  Salary
1001         100      Rafael     Minal       90000
1007         200      Sushma     Davis       50000
1020         200      May        Jones       60000
1004         400      Kyle       Stover      60000

The rows of a table are either sorted or unsorted. If the table has a clustered index it is sorted; if it does not, it is unsorted, which is referred to as a heap. You can have only one clustered index per table because you can sort a table only one way. Sorting has nothing to do with a distribution key or a replicated table: once the rows are placed on a distribution, they are then either sorted (clustered index) or unsorted (heap).
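To see why the sort order matters, here is a rough Python sketch that compares a range lookup on a heap with the same lookup on sorted rows. It uses the Employee_No values from the figure; the function names are hypothetical, and a binary search on a list is only a stand-in for what a clustered index does on disk.

```python
from bisect import bisect_left, bisect_right

heap_rows = [1001, 1007, 1020, 1004]   # unsorted (heap), as in the figure
sorted_rows = sorted(heap_rows)        # clustered index on Employee_No

def range_scan_heap(rows, low, high):
    """A heap has no order, so every row must be examined."""
    examined = len(rows)
    return [r for r in rows if low <= r <= high], examined

def range_seek_sorted(rows, low, high):
    """Sorted rows let us jump to the start of the range and stop at its end."""
    start = bisect_left(rows, low)
    end = bisect_right(rows, high)
    return rows[start:end], end - start

heap_hits, heap_examined = range_scan_heap(heap_rows, 1004, 1007)
sorted_hits, sorted_examined = range_seek_sorted(sorted_rows, 1004, 1007)
```

Both lookups find the same two employees, but the heap examines all four rows while the sorted table touches only the two in range.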

Tables are Stored in Either Row or Columnar Format (3 of 5)

[Figure: three distributions, each with memory, each holding a portion of Employee_Row_Based and a portion of Employee_Columnar.]

A table is stored in either a row format or a columnar format. Traditionally, most systems have stored the rows of a table in a row format (row store). When a query is run on the table, the entire block of rows must be moved from disk into memory, where the rows are processed. This works well when all columns (or most columns) are needed to satisfy the query. Modern systems often also offer a column format (column store). This works extremely well for queries that need only a few columns, such as analytics and aggregations. Only the columns needed are then transferred from disk into memory. The Azure SQL Data Warehouse gives you a choice.
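A small Python sketch can make the difference concrete. It lays out the same three employee rows both ways and counts how many values must be read to answer a Salary-only query. The layout and helper names are assumptions for illustration, not the warehouse's actual storage format.

```python
COLUMNS = ["Emp_No", "Dept_No", "First_Name", "Last_Name", "Salary"]
rows = [
    (1001, 100, "Rafael", "Minal", 90000),
    (1004, 400, "Kyle", "Stover", 60000),
    (1007, 200, "Sushma", "Davis", 50000),
]

# Row store: whole rows are kept together.
row_store = list(rows)

# Column store: one list per column, values kept in the same row order.
column_store = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}

def values_read_row_store(store):
    """Every column of every row is read, whatever the query needs."""
    return sum(len(row) for row in store)

def values_read_column_store(store, needed):
    """Only the columns named in the query are read."""
    return sum(len(store[name]) for name in needed)

avg_salary = sum(column_store["Salary"]) / len(column_store["Salary"])
```

For the average salary, the row layout reads all 15 values while the column layout reads just the 3 salaries.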

Tables can be Partitioned (4 of 5)

CREATE TABLE Ord_Tbl_Part
( Order_Number    integer
 ,Customer_Number integer
 ,Order_Date      date
 ,Order_Total     decimal(10,2)
)
WITH
( DISTRIBUTION = HASH (Order_Number),
  PARTITION ( Order_Date RANGE RIGHT FOR VALUES (
    '2015-01-01','2015-02-01','2015-03-01','2015-04-01',
    '2015-05-01','2015-06-01','2015-07-01','2015-08-01',
    '2015-09-01','2015-10-01','2015-11-01','2015-12-01' ))
);

[Figure: Distributions 1 through 4 each hold a portion of Ord_Tbl_Part, divided into monthly partitions 01 (JAN), 02 (FEB), 03 (MAR) through 12 (DEC).]

Above is the CREATE statement for the Ord_Tbl_Part table. This table is a rowstore table that is partitioned by Order_Date. With RANGE RIGHT and the first day of each month as the boundary values, each partition holds one month of data. The distributions each hold different rows, but each stores every month in its own block(s). This physical partitioning allows for faster loads and faster maintenance (inserts, updates and deletes). This is the design you want when users are performing range queries on dates.
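The RANGE RIGHT rule is easy to get wrong, so here is a short Python sketch of how a date maps to a partition number. It assumes the usual SQL Server semantics (a boundary value belongs to the partition on its right, and partitions are numbered from 1); `partition_number` is a hypothetical helper, not a system function.

```python
from bisect import bisect_right
from datetime import date

# Boundary values from the CREATE TABLE statement: the first day of each month of 2015.
BOUNDARIES = [date(2015, month, 1) for month in range(1, 13)]

def partition_number(order_date, boundaries=BOUNDARIES):
    """RANGE RIGHT: a boundary value belongs to the partition on its right.

    Partition 1 holds everything before the first boundary,
    so N boundaries give N + 1 partitions.
    """
    return bisect_right(boundaries, order_date) + 1

p_before = partition_number(date(2014, 12, 31))    # before the first boundary
p_jan_first = partition_number(date(2015, 1, 1))   # exactly on a boundary
p_jan_last = partition_number(date(2015, 1, 31))
p_dec = partition_number(date(2015, 12, 25))
```

Under these assumptions the twelve boundary values produce thirteen partitions: one for any orders dated before 2015, followed by one per month.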

There are Permanent, Temporary and External Tables (5 of 5)

Permanent Tables – These tables reside permanently until a DROP statement removes them (a TRUNCATE statement removes only their rows).

Temporary Tables – These tables reside temporarily on the system. Here is more information:

• Global temp tables are not supported on the Azure SQL Data Warehouse
• When creating a temp table you must specify LOCATION=USER_DB
• Creating nonclustered indexes is not supported on temp tables

External Tables – These tables point to data in a Hadoop cluster or Azure blob storage. External tables are used most often to:

• Query Hadoop data from within the Azure SQL Data Warehouse.
• Import and store Hadoop data into the Azure SQL Data Warehouse by using the CREATE TABLE AS SELECT statement.

The Azure SQL Data Warehouse utilizes permanent tables for permanent data, temporary tables for temporary information and external tables in order to query Hadoop and blobs.

Creating a Table With a Distribution Key

CREATE TABLE Emp_Intl
( Employee_No INTEGER
 ,Dept_No     SMALLINT
 ,First_Name  VARCHAR(12)
 ,Last_Name   CHAR(20)
 ,Salary      DECIMAL(8,2)
)
WITH (DISTRIBUTION = HASH (Employee_No)) ;

Hashed: Each distribution holds different rows. Each row is hashed by the values in a certain column, such as Employee_No.

Distribution 1: 1 Joel Davis, 4 Rick Jahns, 7 Lynn Meyer, 11 Seth Rogers
Distribution 2: 2 Mary Lewis, 5 John Miller, 8 Rich Jones, 12 Kyle Watson
Distribution 3: 3 Tony Brady, 6 Lana Payne, 9 Lorie Stewart, 13 Dawn Daily

Above is a basic CREATE TABLE statement for a table with a Distribution Key. You can use only one column as the Distribution Key in the Azure SQL Data Warehouse. The values in this column are hashed with a hashing formula and used to distribute the rows of the table across the distributions. Picking a good key is essential. An excellent Distribution Key will allow for even distribution among the many distributions.
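What makes a key "good" can be shown with a quick experiment. The Python sketch below counts rows per distribution for two candidate keys, again using modulo as an assumed stand-in for the internal hashing formula; `rows_per_distribution` and `skew` are hypothetical names.

```python
from collections import Counter

def rows_per_distribution(keys, num_distributions):
    """Count how many rows each distribution would receive for these key values."""
    counts = Counter(key % num_distributions for key in keys)  # modulo stands in for the hash
    return [counts.get(d, 0) for d in range(num_distributions)]

def skew(counts):
    """Largest distribution divided by the average; 1.0 means perfectly even."""
    return max(counts) / (sum(counts) / len(counts))

employee_nos = range(1, 10001)             # 10,000 distinct values
dept_nos = [100, 200, 300, 400] * 2500     # 10,000 rows, only 4 distinct values

good = rows_per_distribution(employee_nos, 8)
poor = rows_per_distribution(dept_nos, 8)
```

A column with many distinct values (Employee_No) spreads evenly in this model, while a column with only a handful of values (Dept_No) leaves most distributions empty and overloads the rest.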

Creating a Table that is Replicated

CREATE TABLE Dept_Intl
( Dept_No         INTEGER
 ,Department_Name VARCHAR(30)
)
WITH (DISTRIBUTION = REPLICATE) ;

Replicated: Each node holds all rows of a table. The table is literally duplicated on each and every node.

Node 1: 100 Sales, 200 Marketing, 300 Finance, 400 HR
Node 2: 100 Sales, 200 Marketing, 300 Finance, 400 HR
Node 3: 100 Sales, 200 Marketing, 300 Finance, 400 HR

Above is a basic CREATE TABLE statement for a table that is replicated across all nodes. That means that the entire table, with every row, is copied to each and every node. This should be done only for relatively small tables because you are in essence duplicating the table on each node. It is done so that when a join is performed between this Dept_Intl table and a large Emp_Intl table, the matching rows will be Distribution Local. This means the matching rows are already on the same node and therefore will not have to be shuffled across nodes to make the join happen.

Distributed by Hash vs. Replication

[Figure: two nodes with memory; one shows a replicated table (100 Sales, 200 Marketing, 300 Finance, 400 HR), the other a table distributed by hash.]

Replicated tables are stored once on each node, in a convenient place for querying and for joining to other tables.

Tables distributed by hash are generally stored across all 8 distributions on each node, thus taking advantage of I/O parallelism.

Each node has eight distributions and each distribution has its own set of disks. So, think of this as each node having at least eight disks to place the table rows that it owns. If a table is small, then a node might have all of the rows it owns in a single distribution. This is often the case with a table that is replicated. If a table is huge, then a node might have rows stored in all eight distributions, which is often the case for tables distributed by hash.

The Concept is All About the Joins

Claims Fact Table (HASH, hash by Claim_Id): Claim_Id, Claim_Date, Claim_Service, Subscriber_No, Member_No, Claim_Amt, Provider_No

Dimension tables (REPLICATED):
Providers Dimension Table: Provider_Code, Provider_Name, P_Address, P_City, P_State, P_Zip, P_Error_Rate
Services Dimension Table: Service_Code, Service_Desc, Service_Pay
Subscribers Dimension Table: Subscriber_No, Member_No, Last_Name, First_Name, Gender, SSN
Addresses Dimension Table: Subscriber_No, Street, City, State, Zip, AreaCode, Phone

The Azure SQL Data Warehouse gives you two choices for table distribution: hash or replicated. Large fact tables are usually hashed and smaller tables are usually replicated. The bottom line is that an Azure SQL Data Warehouse needs two joining rows to be on the same node. Even in a 5-table join, an Azure SQL Data Warehouse joins two tables at a time, and each of those joins needs its matching rows together. If tables are replicated, then they are always on the same node as the rows they join. That is why a large fact table will often be distributed by hash and the smaller tables it joins to will be replicated. The setup of tables on MPP systems is all about the joins.
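Here is a toy Python model of a join between a hashed fact table and a replicated dimension table. The claim rows and provider names are made-up sample data, modulo is an assumed stand-in for the hash, and `local_join` is a hypothetical helper. The sketch only illustrates that each node can finish its share of the join without fetching rows from any other node.

```python
NUM_NODES = 2

# (Claim_Id, Provider_No, Claim_Amt): made-up sample rows
claims = [(1, 10, 250.00), (2, 20, 75.50), (3, 10, 120.00), (4, 30, 60.00)]
# Provider_Code -> Provider_Name: made-up sample rows
providers = {10: "Dr. Adams", 20: "Dr. Baker", 30: "Dr. Clark"}

# Fact table: hash-distributed, so each node owns different claim rows.
claims_by_node = [[c for c in claims if c[0] % NUM_NODES == n]
                  for n in range(NUM_NODES)]
# Dimension table: replicated, so each node holds the whole thing.
providers_by_node = [dict(providers) for _ in range(NUM_NODES)]

def local_join(node):
    """Join this node's claims to its own copy of providers; no rows leave the node."""
    return [(claim_id, providers_by_node[node][provider_no], amount)
            for claim_id, provider_no, amount in claims_by_node[node]]

joined = sorted(row for n in range(NUM_NODES) for row in local_join(n))
```

Had the providers been hashed on a different column instead, some claims would find their matching provider on another node and rows would have to be moved first.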

Creation of a Hash Distributed Table with a Clustered Index

Claims Fact Table (hash by Claim_Id): Claim_Id, Claim_Date, Claim_Service, Subscriber_No, Member_No, Claim_Amt, Provider_No

CREATE TABLE Claims
( Claim_ID      int NOT NULL
 ,Claim_Date    int NOT NULL
 ,Claim_Service int NOT NULL
 ,Subscriber_No int NOT NULL
 ,Member_No     int NOT NULL
 ,Claim_Amt     decimal(18,2) NOT NULL
 ,Provider_No   int NOT NULL
)
WITH
( CLUSTERED INDEX(Claim_Date),
  DISTRIBUTION = HASH(Claim_ID));

Above is the CREATE statement for the Claims table. It has DISTRIBUTION = HASH on Claim_ID. It also has a clustered index on Claim_Date. That means that each node will sort the rows by Claim_Date. This is excellent for range queries. Users will often look up claims based on a time frame, such as per day, week, month, quarter or year.

A Clustered Index Sorts the Data Stored on Disk

[Figure: Node 1 through Node n. On each node, data is sorted on disk with a Clustered Index on Claim_Date, with the rows held in ascending date order from 1/1/2014 through 3/1/2014.]

A Clustered Index is created to command the Azure SQL Data Warehouse to sort the actual data on disk according to the sorted order of the column values. Each table can have only one clustered index at a time. For distributed tables, a clustered index affects the way data is stored within each distribution across the nodes. It does not affect which rows are assigned to each distribution. For replicated tables, the clustered index affects the way the data is stored within each replicated table. It does not affect where the replicated tables are stored. A clustered index sorts the data on disk, which is very important for range queries. Above, we created a Clustered Index on Claim_Date, so range queries on that column no longer need a full table scan.

Each Node Has 8 Distributions

[Figure: one node with memory and eight distributions.]

Each node has eight distributions and each distribution has its own set of disks. So, think of this as each node having at least eight disks to place the table rows that it owns. Better yet, think of this as each compute node having eight parallel processes (called distributions) with each parallel process having its own dedicated disk.

How Hashed Tables are Stored Among a Single Node

[Figure: one node with memory and eight distributions holding the rows of the Addresses, Subscribers, Providers, Services and Claims tables.]

In a perfect distribution each hashed table is distributed evenly across all eight distributions.

Think of this node as eight parallel processes simultaneously processing the rows of a table that they own.

Each node has eight distributions and each distribution has its own set of disks. So, think of this as each node having at least eight disks to place the table rows that it owns. If a table is small, then a node might have all of the rows it owns in a single distribution. If a table is huge, then a node might have rows stored in all eight distributions.

Hashed Tables Will Be Distributed Among All Distributions

[Figure: four nodes, each with memory and eight distributions.]

Above, we see four nodes and each node has eight distributions for a total of 32 distributions. We also see our five tables. Each table is hashed (in this example) and each table has spread different rows across all 32 distributions. All five tables above are row-based tables.
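The arithmetic behind "four nodes, 32 distributions" can be sketched as follows. The numbering of distributions and their node-by-node layout are assumptions made for this illustration, and modulo again stands in for the real hash.

```python
DISTRIBUTIONS_PER_NODE = 8

def total_distributions(num_nodes):
    """Four nodes give 4 * 8 = 32 distributions."""
    return num_nodes * DISTRIBUTIONS_PER_NODE

def node_for(distribution_id):
    """Assumed layout: distributions 0-7 on node 0, 8-15 on node 1, and so on."""
    return distribution_id // DISTRIBUTIONS_PER_NODE

def rows_per_node(keys, num_nodes):
    """Hash each key to a distribution (modulo stands in for the hash), then total by node."""
    total = total_distributions(num_nodes)
    counts = [0] * num_nodes
    for key in keys:
        counts[node_for(key % total)] += 1
    return counts

per_node = rows_per_node(range(1, 3201), 4)   # 3,200 rows on a four-node system
```

With an evenly spread key, every node ends up owning the same share of the table, which is what lets all of them finish their part of a query at about the same time.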

Creation of a Replicated Table

Addresses Dimension Table: Subscriber_No, Street, City, State, Zip, AreaCode, Phone

CREATE TABLE Addresses
( Subscriber_No INTEGER
 ,Street        VARCHAR(30)
 ,City          VARCHAR(20)
 ,State         CHAR(2)
 ,Zip           INTEGER
 ,AreaCode      SMALLINT
 ,Phone         INTEGER
)
WITH (DISTRIBUTION = REPLICATE);

Above is the CREATE statement for the Addresses table. It has DISTRIBUTION = REPLICATE. This table's data will be duplicated on each node in its entirety.

How Replicated Tables are Stored Among a Single Node

[Figure: one node with memory holding the rows of the Addresses, Subscribers, Providers, Services and Claims tables.]

Replicated tables are stored only once per node.

Each node has eight distributions and each distribution has its own set of disks. So, think of this as each node having at least eight disks to place the table rows that it owns. Replicated tables are duplicated in their entirety on each node. If a table has 20 rows and there are 4 nodes in the system, then each node has the same 20 rows. A replicated table stores all of its rows only once per node, but the Azure SQL Data Warehouse actually spreads those 20 rows across all eight distributions. This is done using file groups. Above, it appears that the entire table is on only one of the node's disks, but that is just to illustrate that the entire table is copied only once per node.

Replicated Table will be Duplicated among Each Node

[Figure: four nodes, each with memory and eight distributions.]

Above, we see four nodes and each node has eight distributions for a total of 32 distributions. We also see four tables. Each table is replicated, so each table is copied once onto every node. All four tables above are row-based tables.

Distributed by Replication

[Figure: Node 1 through Node 4, each with memory and its own full copy of the Addresses table.]

With replication, a table is copied in its entirety to every Azure SQL Data Warehouse compute node. Is this duplicating the table and data across each compute node? Yes! Why in the world would anyone do this? For one reason: the joins! For two rows to be joined, they need to be on the same compute node. When the Addresses table joins to the Subscribers table, the replication of the Addresses table guarantees that the rows matching the Subscribers rows will be on the same compute node. Take good advice here and replicate all small tables that join to larger tables.

How Hashed and Replicated Tables Work Together

[Figure: one node with memory holding the rows of the Addresses, Subscribers, Providers, Services and Claims tables.]

Replicated tables are stored only once per node.

The hashed table is distributed evenly across all eight distributions.

The Fact table (Claims), which is large, will be spread across all eight distributions. The dimension tables (Addresses, Subscribers, Providers and Services) are replicated once on the node. This will allow for easy joining among the five tables.

Tables are Stored as Row-based or Column-based

[Figure: three nodes, each holding a portion of Employee_Row_Based and a portion of Employee_Columnar; the columnar table is stored as column segments.]

Above is a picture of the same table stored in a row-based design (top) and a column-based design. Notice that either way the node gets the entire row, but the Azure SQL Data Warehouse gives you the option of storing it in either a row-based or column-based design. When a query selects all columns in a table, the row-based storage is faster. For queries that select only a few columns, the column-based storage is faster. The column-based storage also has advanced compression opportunities that save a great deal of space.

Creation of a Columnar Table that is Hashed

CREATE TABLE Sales_Columnar_Hashed
( Product_ID  int NOT NULL,
  Sale_Date   date,
  Daily_Sales decimal(9,2)
)
WITH
( DISTRIBUTION = HASH(Product_ID),
  CLUSTERED COLUMNSTORE INDEX );

[Figure: Distribution 1, Distribution 2 through Distribution n, each holding a portion of the table.]

Above is the CREATE statement for the Sales_Columnar_Hashed table. This table is a columnstore table that is hashed by the Product_ID column. The table has nine rows and three columns. The rows are hashed and the entire row is placed on a distribution, but then it is stored in separate columns. The idea is that when a query can be satisfied by using only one or two of the columns, the system has to move only those one or two columns from disk to memory.

How Hashed Columnar Tables are Stored on a Single Node

[Figure: one node with memory and eight distributions holding the rows of the Addresses, Subscribers and Claims tables.]

In a perfect distribution each hashed table is distributed evenly across all eight distributions.

A Columnar store will store each column in its own page. This is sometimes 10X faster for certain queries with 3X compression.

The Addresses table has four columns in it. The Subscribers table has five columns and the Claims table has nine columns. Each column is stored in its own page.

How Hashed Columnar Tables are Stored on All Distributions

[Figure: four nodes, each with memory and eight distributions.]

Above, we see four nodes and each node has eight distributions for a total of 32 distributions. The Addresses table has four columns in it. The Subscribers table has five columns, and the Claims table has nine columns. All 32 distributions hold a portion of each table, and each table stores each column in a separate page.

Comparing Normal Table Vs. Columnar Tables

Employee_Normal (on one distribution, whole rows stored together):

Emp_No  Dept_No  First_Name  Last_Name  Salary
1001    100      Rafael      Minal      90000.00
1004    400      Kyle        Stover     60000.00
1007    200      Sushma      Davis      50000.00

Employee_Columnar (on one distribution, each column stored separately):

Emp_No:     1001, 1004, 1007
Dept_No:    100, 400, 200
First_Name: Rafael, Kyle, Sushma
Last_Name:  Minal, Stover, Davis
Salary:     90000.00, 60000.00, 50000.00

Above is a picture of the same table stored in a row-based design (top) and a column-based design (bottom). Notice that either way the node gets the entire row, but the Azure SQL Data Warehouse has the option of storing it in either a row-based or column-based design.

Columnar can move just One Segment to Memory

Query: SELECT Emp_No FROM Employee_Columnar ;

[Figure: on a distribution, Employee_Columnar stores Emp_No (1001, 1004, 1007), Dept_No (100, 400, 200), First_Name (Rafael, Kyle, Sushma), Last_Name (Minal, Stover, Davis) and Salary (90000.00, 60000.00, 50000.00) as separate segments. Only the Emp_No segment is moved into memory.]

Segments on Distributions are Aligned to Rebuild a Row

What if the query needed two columns?

SELECT Emp_No, Salary FROM Employee_Columnar ;

[Figure: on a distribution, Employee_Columnar stores Emp_No (1001, 1004, 1007), Dept_No (100, 400, 200), First_Name (Rafael, Kyle, Sushma), Last_Name (Minal, Stover, Davis) and Salary (90000.00, 60000.00, 50000.00) as separate segments. Only the Emp_No and Salary segments are moved into memory.]

Why Columnar?

“Everyone is kneaded out of the same dough but not baked in the same oven.” – Yiddish Proverb

Emp_No  Dept_No  First_Name  Last_Name  Salary
1001    100      Rafael      Minal      90000
1002    200      Maria       Gomez      80000
1003    300      Charl       Kertzel    70000
1004    400      Kyle        Stover     60000
1005    400      Rob         Rivers     50000
1006    300      Inna        Kinski     50000
1007    200      Sushma      Davis      50000
1008    100      Mo          Khan       60000
1009    300      Mo          Swartz     70000

Each data block holds a single column. The row can be rebuilt because everything is aligned perfectly. If someone runs a query that returns the average salary, then only one small data block is moved into memory. The salary block moves into memory, where it is processed as fast as lightning. We just cut down on moving large blocks by 80%! Why columnar? Because, like our Yiddish Proverb says, "All data is not kneaded on every query, so that is why it costs so much dough."
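"Aligned perfectly" means that the value at a given position in every column block belongs to the same row. The Python sketch below uses the nine employees from the table above to show a row being rebuilt by position. The dictionary of lists is an assumed stand-in for the real column blocks, and `rebuild_row` is a hypothetical helper.

```python
# One list per column; position i in every list belongs to the same row.
segments = {
    "Emp_No":     [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009],
    "Dept_No":    [100, 200, 300, 400, 400, 300, 200, 100, 300],
    "First_Name": ["Rafael", "Maria", "Charl", "Kyle", "Rob", "Inna", "Sushma", "Mo", "Mo"],
    "Last_Name":  ["Minal", "Gomez", "Kertzel", "Stover", "Rivers", "Kinski", "Davis", "Khan", "Swartz"],
    "Salary":     [90000, 80000, 70000, 60000, 50000, 50000, 50000, 60000, 70000],
}

def rebuild_row(segments, position):
    """Pick the value at the same position from every column block."""
    return tuple(segments[name][position] for name in segments)

fourth_row = rebuild_row(segments, 3)

# An aggregate on one column never has to touch the other four blocks.
total_salary = sum(segments["Salary"])
```

The fourth position in each block reassembles into employee 1004, while the salary total is computed from the Salary block alone.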

Columnar Tables Store Each Column in Separate Pages

[Figure: three nodes, each with memory, each computing AVG Salary from its own column blocks.]

This is the same data you saw on the previous page! The difference is that the above is a columnar design. I have color coded this for you. There are 8 rows in the table and five columns. Notice that the entire row stays on the same disk, but each column is a separate block. This is a brilliant design for ad hoc queries and analytics because when only a few columns are needed, columnar can move just the columns it needs. Columnar can't be beat for such queries because the pages are so much smaller, and what isn't needed isn't moved.

Visualize the Data – Rows vs. Columns

[Figure, top: 24 rows (five columns) stored in 6 blocks in this row-based system]

[Figure, bottom: 24 rows (five columns) stored in 15 blocks (each column is its own block)]

Both examples above have the same data and the same amount of data. If your applications tend to analyze the majority of columns or read the entire table, then a row-based system (top example) can move more data into memory with each block. Columnar tables are advantageous when only a few columns need to be read. This is just one of the reasons that analytics goes with columnar like bread goes with butter. A row-based system must move the entire page into memory even if it only needs to read one row or even a single column. If a user above needed to analyze the Salary, the columnar system would read only the Salary blocks, one column out of five, and so move 80% less block mass.
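The 80% figure can be checked with a little arithmetic. The sketch below is only an illustration; it assumes the 15 column blocks are equal in size and spread evenly over the five columns, which the text does not state. The function name is made up for this example.

```python
def fraction_of_blocks_read(total_columns: int, columns_needed: int) -> float:
    """Share of the column blocks a columnar scan must move into memory."""
    if not 0 < columns_needed <= total_columns:
        raise ValueError("columns_needed must be between 1 and total_columns")
    return columns_needed / total_columns

# Numbers from the example: five columns stored in 15 blocks,
# and a query that reads only the Salary column.
total_blocks = 15
total_columns = 5
blocks_per_column = total_blocks // total_columns   # assumed even split
blocks_moved = blocks_per_column * 1                # only Salary is needed
savings = 1 - fraction_of_blocks_read(total_columns, 1)

print(f"blocks moved: {blocks_moved} of {total_blocks}, saved: {savings:.0%}")
```

Asking for two columns instead of one would double the blocks moved, which is why the advantage shrinks as a query touches more of the table.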


Creation of a Columnar Table that is Replicated

CREATE TABLE Sales_Columnar_Replicated
( Product_ID   int NOT NULL,
  Sale_Date    date,
  Daily_Sales  decimal(9,2))
WITH
( DISTRIBUTION = REPLICATE,
  CLUSTERED COLUMNSTORE INDEX );

[Figure: the same table stored on Node 1, Node 2 ... Node n]

Above is the CREATE statement for the Sales_Columnar_Replicated table. This table is a columnstore table that is replicated on each node. The table only has nine rows and three columns. Each node holds the exact same data. It is like looking in a mirror. That is what replicated means. The table is replicated, but the storage is a columnar (columnstore) design. This allows single columns to be placed into memory for processing.

Creating a Partitioned Table Per Month

CREATE TABLE Ord_Tbl_Part
( Order_Number     integer
 ,Customer_Number  integer
 ,Order_Date       date
 ,Order_Total      decimal(10,2))
WITH
( DISTRIBUTION = HASH (Order_Number),
  PARTITION ( Order_Date RANGE RIGHT FOR VALUES (
     '2015-01-01','2015-02-01','2015-03-01','2015-04-01',
     '2015-05-01','2015-06-01','2015-07-01','2015-08-01',
     '2015-09-01','2015-10-01','2015-11-01','2015-12-01' )));

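Above is the CREATE statement for the Ord_Tbl_Part table. This table is a rowstore table that is partitioned by Order_Date. With RANGE RIGHT, each boundary value is the first value of a new partition, so using the first day of each month as the boundary values puts a month of data in each partition. Any rows dated before '2015-01-01' land in a partition of their own ahead of January.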
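How a value is matched to a partition can be sketched with a binary search over the sorted boundary values. This is a Python illustration of the rule, not how the warehouse implements it; the function name partition_number is made up for this example.

```python
from bisect import bisect_left, bisect_right
from datetime import date

def partition_number(value, boundaries, range_right=True):
    """Return the 1-based partition a value falls into.

    RANGE RIGHT: a boundary value belongs to the partition on its right.
    RANGE LEFT:  a boundary value belongs to the partition on its left.
    The boundaries list must be sorted in ascending order.
    """
    find = bisect_right if range_right else bisect_left
    return find(boundaries, value) + 1

# The twelve boundary values from Ord_Tbl_Part (first day of each 2015 month).
month_starts = [date(2015, m, 1) for m in range(1, 13)]

before_2015 = partition_number(date(2014, 12, 31), month_starts)
on_boundary = partition_number(date(2015, 1, 1), month_starts)
late_january = partition_number(date(2015, 1, 31), month_starts)
december = partition_number(date(2015, 12, 25), month_starts)

print(before_2015, on_boundary, late_january, december)
```

Twelve boundaries produce thirteen partitions: one for anything earlier than the first boundary, then one per month. Passing range_right=False shows the RANGE LEFT rule, where a boundary value stays with the partition on its left.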


A Visual of One Year of Data with Range Per Month

[Figure: Ord_Tbl_Part on Distribution 1, Distribution 2, Distribution 3 and Distribution 4; each distribution holds its own rows in twelve monthly partitions, 01 (JAN) through 12 (DEC)]

Above is a visual of the Ord_Tbl_Part table that was created on the previous page. This table is a rowstore table that is partitioned by Order_Date. By using RANGE RIGHT and dates for the boundary values, it puts a month of data in each partition. This table is NOT replicated, but hashed. The nodes each hold different rows, but store each month in their own block(s). This physical partitioning allows for faster loads and faster maintenance (Inserts, Updates, Deletes). This is the design you want when users are performing range queries on dates.
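The two-level layout (hash to a distribution, then a monthly partition inside it) can be pictured with a small sketch. The modulo used here is only a stand-in for the system's internal hash function, which the text does not describe, and the order rows are invented for the illustration.

```python
from collections import defaultdict
from datetime import date

DISTRIBUTIONS = 4  # matches the four distributions drawn in the visual

def place_row(order_number: int, order_date: date):
    """Return (distribution, month) for one row.

    The modulo is an assumed stand-in for the real hash on Order_Number.
    """
    distribution = order_number % DISTRIBUTIONS + 1
    return distribution, order_date.month

# Hypothetical orders, invented for this illustration.
orders = [(1001, date(2015, 1, 15)), (1002, date(2015, 1, 20)),
          (1003, date(2015, 2, 3)), (1005, date(2015, 2, 9)),
          (1006, date(2015, 12, 24))]

layout = defaultdict(list)
for number, day in orders:
    layout[place_row(number, day)].append(number)

for (distribution, month), numbers in sorted(layout.items()):
    print(f"Distribution {distribution}, month {month:02d}: {numbers}")
```

The hash decides which distribution owns a row; the Order_Date decides which monthly block it sits in within that distribution.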


Another Create Example of a Partitioned Table

CREATE TABLE Sales_Partitioned
( Product_ID   int NOT NULL,
  Sale_Date    date,
  Daily_Sales  decimal(9,2) )
WITH
( PARTITION ( Product_ID RANGE LEFT FOR VALUES (100, 200, 300, 400 )),
  CLUSTERED COLUMNSTORE INDEX );

In this example of RANGE LEFT, data will be sorted into the following partitions:

Partition 1: col <= 100
Partition 2: 100 < col <= 200
Partition 3: 200 < col <= 300
Partition 4: 300 < col <= 400
Partition 5: 400 < col

This would be the partitioning if this same table was partitioned RANGE RIGHT instead of RANGE LEFT:

Partition 1: col < 100
Partition 2: 100 <= col < 200
Partition 3: 200 <= col < 300
Partition 4: 300 <= col < 400
Partition 5: 400 <= col

Greater than or Equal to (>=)

SELECT * FROM Student_Table WHERE Grade_Pt >= 3.0 ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    231222  Wilson      Susie       SO              3.80
    234121  Thomas      Wendy       FR              4.00
    324652  Delaney     Danny       SR              3.35
    123250  Phillips    Martin      SR              3.00
    322133  Bond        Jimmy       JR              3.95

All rows returned have a Grade_Pt >= 3.0

The WHERE Clause doesn’t just deal with ‘Equals’. You can look for things that are GREATER THAN or LESS THAN, along with asking for things that are GREATER THAN or EQUAL to, or LESS THAN or EQUAL to.

Chapter 7

The WHERE Clause

AND in the WHERE Clause

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Class_Code = 'FR'
AND    First_Name = 'Henry' ;

Notice the WHERE clause and the word AND. In this example, qualifying rows must have a Class_Code = ‘FR’ and also must have a First_Name of ‘Henry’. Notice how the WHERE and the AND are each on their own line. Good practice!

Troubleshooting AND

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 3.0
AND    Grade_Pt = 4.0;

No rows qualify. How can a student have two grade points?

What is going wrong here? You are using an AND to check the same column. What you are basically asking with this syntax is to see the rows that have BOTH a Grade_Pt of 3.0 and a Grade_Pt of 4.0. That is impossible, so no rows will be returned.

OR in the WHERE Clause

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 3.0
OR     Grade_Pt = 4.0;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

Notice above in the WHERE Clause we use OR. OR allows either of the conditions to be TRUE in order for the row to qualify and return.
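The difference between AND and OR on the same column can be tried out in a small sketch. It uses Python's built-in SQLite as a stand-in for the warehouse (an assumption made so the example is runnable anywhere) and loads only the two columns the queries need.

```python
import sqlite3

# SQLite stands in for the warehouse; None represents the NULL Grade_Pt.
rows = [(423400, 0.00), (231222, 3.80), (280023, 1.90), (322133, 3.95),
        (125634, 2.88), (333450, 2.00), (324652, 3.35), (260000, None),
        (234121, 4.00), (123250, 3.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Student_ID INTEGER, Grade_Pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", rows)

# AND on one column: a single row would need two grade points at once.
and_rows = conn.execute(
    "SELECT Student_ID FROM Student_Table "
    "WHERE Grade_Pt = 3.0 AND Grade_Pt = 4.0").fetchall()

# OR on one column: either value lets the row qualify.
or_rows = conn.execute(
    "SELECT Student_ID FROM Student_Table "
    "WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0 "
    "ORDER BY Student_ID").fetchall()

print(len(and_rows))
print(or_rows)
```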


Troubleshooting OR

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT * FROM Student_Table WHERE Grade_Pt = 3.0 OR 4.0;               (error)

SELECT * FROM Student_Table WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0;    (perfect)

Notice above in the WHERE Clause we use OR. OR allows either of the conditions to be TRUE in order for the row to qualify and return. The first example errors and is a common mistake: the column name must be repeated on each side of the OR. The second example is perfect.

Troubleshooting Character Data

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 3.0
AND    Class_Code = SR ;

Error!!! Why?

This query errors! What is WRONG with this syntax? There are no single quotes around SR.

Using Different Columns in an AND Statement

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 3.0
AND    Class_Code = 'SR' ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    123250  Phillips    Martin      SR              3.00

Notice that AND separates conditions on two different columns, and a row will come back only if both are TRUE.

Quiz – How many rows will return?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 4.0
OR     Grade_Pt = 3.0
AND    Class_Code = 'SR' ;

Which Seniors have a 3.0 or a 4.0 Grade_Pt average? How many rows will return?

A) 2
B) 1
C) Error
D) 3


Answer to Quiz – How many rows will return?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Grade_Pt = 4.0
OR     Grade_Pt = 3.0
AND    Class_Code = 'SR' ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

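We had two rows return! Isn’t that a mystery? Why? AND is evaluated before OR, so the query is read as Grade_Pt = 4.0 OR (Grade_Pt = 3.0 AND Class_Code = 'SR'). Wendy Thomas qualifies on her 4.0 alone, even though she is a Freshman.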
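The order of evaluation can be seen by running the quiz query with and without parentheses. This sketch uses Python's built-in SQLite as a stand-in for the warehouse, which is an assumption; AND binding more tightly than OR is standard SQL behavior. The helper name last_names is made up for this example.

```python
import sqlite3

# SQLite stands in for the warehouse; only the columns the quiz needs are loaded.
students = [("Larkins", "FR", 0.00), ("Wilson", "SO", 3.80),
            ("McRoberts", "JR", 1.90), ("Bond", "JR", 3.95),
            ("Hanson", "FR", 2.88), ("Smith", "SO", 2.00),
            ("Delaney", "SR", 3.35), ("Johnson", None, None),
            ("Thomas", "FR", 4.00), ("Phillips", "SR", 3.00)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", students)

def last_names(where_clause):
    """Run the quiz query with the given WHERE clause and return the names."""
    sql = ("SELECT Last_Name FROM Student_Table WHERE " + where_clause +
           " ORDER BY Last_Name")
    return [row[0] for row in conn.execute(sql)]

# As written in the quiz: the AND pair is evaluated first.
as_written = last_names(
    "Grade_Pt = 4.0 OR Grade_Pt = 3.0 AND Class_Code = 'SR'")

# Parentheses force the OR first, so only Seniors can qualify.
with_parentheses = last_names(
    "(Grade_Pt = 4.0 OR Grade_Pt = 3.0) AND Class_Code = 'SR'")

print(as_written)
print(with_parentheses)
```

With the parentheses in place, the query answers the question that was actually asked: which Seniors have a 3.0 or a 4.0.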


LIKE command Underscore is Wildcard for one Character

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE '_a%' ;

Show me anyone with an 'a' as the 2nd letter in their Last_Name

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    125634  Hanson      Henry       FR              2.88

The _ underscore sign is a wildcard for any single character. We are looking for anyone who has an 'a' as the second letter of their last name.
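The underscore and percent wildcards behave the same way in most SQL engines, so the pattern can be tried in a quick sketch. Python's built-in SQLite is used here as a stand-in for the warehouse (an assumption), with Last_Name stored as plain text.

```python
import sqlite3

last_name_values = ["Larkins", "Wilson", "McRoberts", "Bond", "Hanson",
                    "Smith", "Delaney", "Johnson", "Thomas", "Phillips"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT)")
conn.executemany("INSERT INTO Student_Table VALUES (?)",
                 [(name,) for name in last_name_values])

# '_' matches exactly one character, '%' matches any run of characters,
# so '_a%' means: any first letter, then 'a', then anything.
second_letter_a = [row[0] for row in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Last_Name LIKE '_a%' ORDER BY Last_Name")]

print(second_letter_a)
```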


LIKE command using a Range of Values

The above syntax allows us to use a range of values (a-f in this example). Any First_Name that starts with an a, b, c, d, e or f will return. How about that for clever SQL?


LIKE command Using a NOT Range of Values

The ^ sign (Shift 6) acts like a NOT.

The above syntax allows us to use a NOT range of values (a-f in this example). Any First_Name that starts with the letter a, b, c, d, e or f will not return.
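The queries for these two range examples are not shown in this text. In T-SQL, a range in a LIKE pattern is written inside square brackets, for example '[a-f]%', and a leading ^ inside the brackets negates it, as in '[^a-f]%'. The sketch below mimics those two patterns with Python regular expressions; it is an illustration of the matching rule under that assumption, not the statements from the original page.

```python
import re

first_names = ["Michael", "Susie", "Richard", "Jimmy", "Henry",
               "Andy", "Danny", "Stanley", "Wendy", "Martin"]

# Regex stand-ins for the LIKE patterns '[a-f]%' and '[^a-f]%'.
# re.match anchors at the start of the string, like a pattern with no leading %.
in_range = re.compile(r"[a-f]")       # first letter is a through f
not_in_range = re.compile(r"[^a-f]")  # first letter is anything else

# Names are lower-cased first to mimic a case-insensitive comparison.
starts_a_to_f = [n for n in first_names if in_range.match(n.lower())]
starts_elsewhere = [n for n in first_names if not_in_range.match(n.lower())]

print(starts_a_to_f)
print(starts_elsewhere)
```

Every name lands in exactly one of the two lists, which is what makes the ^ form a true NOT of the range.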


LIKE Command Works Differently on Char Vs Varchar

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

First_Name's Data Type is VARCHAR (20)

SELECT *
FROM   Student_Table
WHERE  First_Name LIKE '%y' ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    125634  Hanson      Henry       FR              2.88
    322133  Bond        Jimmy       JR              3.95
    324652  Delaney     Danny       SR              3.35
    333450  Smith       Andy        SO              2.00
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00

It is important that you know the data type of the column you are using with your LIKE command. VARCHAR and CHAR data differ slightly: VARCHAR stores only the characters entered, while CHAR pads each value with trailing spaces.

Troubleshooting LIKE Command on Character Data

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

Last_Name has a Data Type of CHAR (20)

SELECT *
FROM   Student_Table
WHERE  Last_Name LIKE '%n' ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________

No Rows are returned! Why?

This is a CHAR (20) data type. That means that any value under 20 characters is padded with spaces until it reaches 20 characters. You will not get any rows back from this example because technically, no row ends in an ‘n’, but instead ends in a space.

Introducing the RTRIM Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

Last_Name has a Data Type of CHAR (20)

SELECT Last_Name
FROM   Student_Table
WHERE  RTRIM (Last_Name) LIKE '%n' ;

Last_Name
__________
Hanson
Wilson
Johnson

This is a CHAR(20) data type. That means that every Last_Name is going to be 20 characters long. Most names are not really 20 characters long, so spaces are padded at the end to fill up all 20 characters. We need the RTRIM command to remove the trailing spaces. Once the spaces are trimmed, we can find out whose name ends in 'n'.
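The padding described above can be modeled with plain Python strings. This is a sketch of the idea only, not a database test: ljust plays the role of CHAR(20) storage and rstrip plays the role of RTRIM. Whether a particular engine compares padded values this way is not modeled here. The helper name as_char is made up for this example.

```python
def as_char(value: str, width: int = 20) -> str:
    """Pad a string with trailing spaces, the way CHAR(width) is described above."""
    return value.ljust(width)

last_names = ["Larkins", "Wilson", "McRoberts", "Bond", "Hanson",
              "Smith", "Delaney", "Johnson", "Thomas", "Phillips"]
stored = [as_char(name) for name in last_names]

# Without trimming, every stored value ends in a space, not in 'n'.
untrimmed_matches = [s for s in stored if s.endswith("n")]

# rstrip() removes trailing spaces only, like RTRIM.
trimmed_matches = [s.rstrip() for s in stored if s.rstrip().endswith("n")]

print(untrimmed_matches)
print(trimmed_matches)
```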


Quiz – What Data is Left Justified and What is Right?

SELECT *
FROM   Sample_Table
WHERE  Column1 IS NULL
AND    Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Which Column from the Answer Set could have a DATA TYPE of INTEGER, and which could have Character Data?


Numbers are Right Justified and Character Data is Left

SELECT *
FROM   Sample_Table
WHERE  Column1 IS NULL
AND    Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Column1 is Right Justified: Integers are Right Justified!
Column2 is Left Justified: Character Data is Left Justified!

All Integers will start from the right and move left. Thus, Column1 was defined during the table create statement to hold an INTEGER. The next page shows a clear example.

Answer – What Data is Left Justified and What is Right?

SELECT Employee_No, First_Name
FROM   Employee_Table
WHERE  Employee_No = 2000000;

Answer Set

Employee_No  First_Name
___________  __________
    2000000  Squiggy

Integers are Right justified! Characters are Left justified!

All Integers will start from the right and move left. All Character data will start from the left and move to the right.


An Example of Data with Left and Right Justification

SELECT Student_ID, Last_Name
FROM   Student_Table ;

Student_ID  Last_Name
__________  _________
    423400  Larkins
    125634  Hanson
    280023  McRoberts
    260000  Johnson
    231222  Wilson
    234121  Thomas
    324652  Delaney
    123250  Phillips
    322133  Bond
    333450  Smith

Integers are Right justified! Characters are Left justified!

This is how a standard result set will look. Notice that the integer type in Student_ID starts from the right and goes left. Character data type in Last_Name moves left to right like we are used to seeing while reading English.


A Visual of CHARACTER Data vs. VARCHAR Data

Character Data on Disk: Last_Name as a Char(20), spaces padded at the end

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
McRoberts _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

Varchar Data on Disk: Last_Name as a Varchar(20), 2-byte VLI (Variable Length Indicator), no spaces

0 5 Jones
0 6 Hanson
0 9 McRoberts
0 7 Johnson

Character data pads spaces to the right and Varchar uses a 2-byte VLI instead.
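The storage difference in the visual can be sketched as bytes. This is a simplified model under stated assumptions: one byte per character, and a big-endian 2-byte length prefix chosen to match the "0 5 Jones" layout drawn above. Real on-disk formats carry more detail than this. Both function names are made up for this example.

```python
import struct

def char_bytes(value: str, width: int = 20) -> bytes:
    """Fixed-width storage: the value followed by pad spaces."""
    return value.ljust(width).encode("ascii")

def varchar_bytes(value: str) -> bytes:
    """Variable-width storage: a 2-byte length indicator, then the value."""
    data = value.encode("ascii")
    return struct.pack(">H", len(data)) + data

for name in ["Jones", "Hanson", "McRoberts", "Johnson"]:
    print(name, len(char_bytes(name)), len(varchar_bytes(name)))
```

In this model every CHAR(20) value costs 20 bytes, while the VARCHAR value costs its own length plus 2.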


RTRIM command Removes Trailing spaces on CHAR Data

Character Data on Disk: Last_Name as a Char(20), spaces padded at the end

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Wilson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

SELECT Last_Name
FROM   Student_Table
WHERE  RTRIM (Last_Name) LIKE '%n' ;

RTRIM removes spaces at the back

Last_Name
__________
Hanson
Wilson
Johnson

Last_Name has a Data Type of CHAR (20)

By using the RTRIM command on the Last_Name column, you are able to trim off any spaces from the end. Once we use RTRIM on Last_Name, we have eliminated any spaces at the end, so now we are set to bring back anyone with a Last_Name that truly ends in ‘n’!

Using Like with an AND Clause to Find Multiple Letters

The above uses an additional AND clause to find anyone with both an 'M' and an 'S' in their last name. Notice that the Azure SQL Data Warehouse is not case sensitive.


Using Like with an OR Clause to Find Either Letters

The above uses an additional OR clause to find anyone with either an ‘M’ or an 'S' in their last name. Notice that the Azure SQL Data Warehouse is not case sensitive.
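The queries for the AND and OR examples are not shown in this text, so the sketch below assumes the patterns '%m%' and '%s%' on Last_Name. It uses Python's built-in SQLite as a stand-in for the warehouse; SQLite's LIKE also ignores case for these letters by default.

```python
import sqlite3

last_name_values = ["Larkins", "Wilson", "McRoberts", "Bond", "Hanson",
                    "Smith", "Delaney", "Johnson", "Thomas", "Phillips"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT)")
conn.executemany("INSERT INTO Student_Table VALUES (?)",
                 [(name,) for name in last_name_values])

def names_where(condition):
    """Return the last names that satisfy the given WHERE condition."""
    sql = ("SELECT Last_Name FROM Student_Table WHERE " + condition +
           " ORDER BY Last_Name")
    return [row[0] for row in conn.execute(sql)]

# AND: the name must contain both letters.
both_letters = names_where("Last_Name LIKE '%m%' AND Last_Name LIKE '%s%'")

# OR: either letter is enough.
either_letter = names_where("Last_Name LIKE '%m%' OR Last_Name LIKE '%s%'")

print(both_letters)
print(either_letter)
```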


Declaring a Variable and Using it with the LIKE Command

Addresses

Subscriber_No  Street                City      State  Zip        AreaCode  Phone
_____________  ____________________  ________  _____  _________  ________  _______
3333333        2468 Appreciate Ave.  Mytown    MI     123561111  937       3334567
2222222        123 Some St.          Sometown  CA     256781212  475       5651213
1111111        123 Any St.           Anytown   AL     456780000  435       5551213
5555555        121 Jump St.          Big City  NY     334566598  310       4531111
4444444        12 Jump St.           Big City  NY     334566576  310       4530097

DECLARE @StreetVar nvarchar(60) = 'Appreciate';
SELECT Street, City, State
FROM   Addresses
WHERE  street LIKE '%' + @StreetVar + '%';

We declare a variable and give it a value. The + means concatenate. Highlight and run these commands together.

Street                City    State
____________________  ______  _____
2468 Appreciate Ave.  Mytown  MI

The equivalent statement would be:

SELECT Street, City, State FROM Addresses WHERE street LIKE '%Appreciate%' ;

In the above example, we have declared a variable and placed the value 'Appreciate' in it. We use it in our LIKE query in conjunction with concatenation. WARNING: The DECLARE and SELECT must be highlighted and run together.
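The same idea, a value held outside the query and concatenated into the LIKE pattern, can be sketched with a bound parameter. Python's built-in SQLite stands in for the warehouse here (an assumption); SQLite concatenates with || where T-SQL uses +, and the ? placeholder takes the place of @StreetVar.

```python
import sqlite3

addresses = [("2468 Appreciate Ave.", "Mytown", "MI"),
             ("123 Some St.", "Sometown", "CA"),
             ("123 Any St.", "Anytown", "AL"),
             ("121 Jump St.", "Big City", "NY"),
             ("12 Jump St.", "Big City", "NY")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Addresses (Street TEXT, City TEXT, State TEXT)")
conn.executemany("INSERT INTO Addresses VALUES (?, ?, ?)", addresses)

# The Python variable plays the role of @StreetVar.
street_var = "Appreciate"
matches = conn.execute(
    "SELECT Street, City, State FROM Addresses "
    "WHERE Street LIKE '%' || ? || '%'",
    (street_var,)).fetchall()

print(matches)
```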


Escape Character in the LIKE Command changes Wildcards

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    125634  Hanson      Henry       FR              2.88
    280023  McRoberts   Richard     JR              1.90
    260000  Johnson     Stanley     ?                  ?
    231222  Wilson      Susie       SO              3.80
    234121  Thomas      Wendy       FR              4.00
    324652  Delaney     Danny       SR              3.35
    123250  Phillips    Martin      SR              3.00
    322133  Bond        Jimmy       JR              3.95
    333450  Smith       Andy        SO              2.00
    999999  T_          S%          FR              1.90

/* We just pretended to add a new row to the Student_Table */

/* Can you use the LIKE command to find S% above? */

Here you will have to utilize a Wildcard Escape Character. Turn the page for more.

Escape Characters Turn off Wildcards in the LIKE Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    125634  Hanson      Henry       FR              2.88
    280023  McRoberts   Richard     JR              1.90
    260000  Johnson     Stanley     ?                  ?
    231222  Wilson      Susie       SO              3.80
    234121  Thomas      Wendy       FR              4.00
    324652  Delaney     Danny       SR              3.35
    123250  Phillips    Martin      SR              3.00
    322133  Bond        Jimmy       JR              3.95
    333450  Smith       Andy        SO              2.00
    999999  T_          S%          FR              1.90

Can you use the LIKE command to find S% above?

SELECT *
FROM   Student_Table
WHERE  First_Name LIKE 'S@%' Escape '@';

We can pick our Escape character and we have chosen the @ sign. This turns the wildcard off for the one character that follows it, so we find ‘S%’ without bringing back Stanley or Susie.

Quiz – Turn off that Wildcard

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    125634  Hanson      Henry       FR              2.88
    280023  McRoberts   Richard     JR              1.90
    260000  Johnson     Stanley     ?                  ?
    231222  Wilson      Susie       SO              3.80
    234121  Thomas      Wendy       FR              4.00
    324652  Delaney     Danny       SR              3.35
    123250  Phillips    Martin      SR              3.00
    322133  Bond        Jimmy       JR              3.95
    333450  Smith       Andy        SO              2.00
    999999  T_          S%          FR              1.90

Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)

This is a little trickier than you might think so be on your toes... And get a haircut!

ANSWER – To Find that Wildcard

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    125634  Hanson      Henry       FR              2.88
    280023  McRoberts   Richard     JR              1.90
    260000  Johnson     Stanley     ?                  ?
    231222  Wilson      Susie       SO              3.80
    234121  Thomas      Wendy       FR              4.00
    324652  Delaney     Danny       SR              3.35
    123250  Phillips    Martin      SR              3.00
    322133  Bond        Jimmy       JR              3.95
    333450  Smith       Andy        SO              2.00
    999999  T_          S%          FR              1.90

Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)

SELECT *
FROM   Student_Table
WHERE  RTRIM(Last_Name) LIKE 'T@_' Escape '@' ;

You didn’t really need to get a full haircut, but just a RTRIM Command and the Escape!
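The effect of the escape character can be tried in a small sketch. Python's built-in SQLite is used as a stand-in for the warehouse (an assumption); its LIKE supports the same ESCAPE clause. Only a few rows are loaded, stored as plain text, so no RTRIM is needed in this sketch.

```python
import sqlite3

rows = [("Johnson", "Stanley"), ("Wilson", "Susie"),
        ("Thomas", "Wendy"), ("T_", "S%")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT, First_Name TEXT)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", rows)

# Without an escape, % is a wildcard: every First_Name starting with S matches.
wildcard_on = [r[0] for r in conn.execute(
    "SELECT First_Name FROM Student_Table "
    "WHERE First_Name LIKE 'S%' ORDER BY First_Name")]

# With ESCAPE '@', the % after @ is a literal percent sign.
wildcard_off = [r[0] for r in conn.execute(
    "SELECT First_Name FROM Student_Table "
    "WHERE First_Name LIKE 'S@%' ESCAPE '@'")]

# The same trick turns off the underscore wildcard.
t_underscore = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Last_Name LIKE 'T@_' ESCAPE '@'")]

print(wildcard_on)
print(wildcard_off)
print(t_underscore)
```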

Chapter 8

Distinct, Group By and TOP

Chapter 8 – Distinct, Group By and TOP

“A bird does not sing because it has the answers, it sings because it has a song.” - Anonymous


The Distinct Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT Distinct Class_Code
FROM   Student_Table
ORDER BY 1;

Class_Code
__________
?
FR
JR
SO
SR

Distinct won't repeat duplicate values

DISTINCT eliminates duplicates from returning in the Answer Set.


Distinct vs. GROUP BY

SELECT Class_Code
FROM   Student_Table
GROUP BY Class_Code
ORDER BY 1;

SELECT Distinct Class_Code
FROM   Student_Table
ORDER BY 1;

Both examples produce the exact same result

Class_Code
__________
?
FR
JR
SO
SR

Rules for Distinct Vs. GROUP BY
(1) Many Duplicates – use GROUP BY
(2) Few Duplicates – use DISTINCT
(3) Space Exceeded – use GROUP BY

Distinct and GROUP BY in the two examples return the same answer set.
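That the two forms return the same rows can be checked with a short sketch. Python's built-in SQLite stands in for the warehouse (an assumption), and only Class_Code is loaded. The sketch says nothing about which form runs faster; that depends on the engine and the data.

```python
import sqlite3

# Class_Code for each of the ten students; None represents the NULL.
class_codes = ["FR", "SO", "JR", "JR", "FR", "SO", "SR", None, "FR", "SR"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Class_Code TEXT)")
conn.executemany("INSERT INTO Student_Table VALUES (?)",
                 [(code,) for code in class_codes])

distinct_rows = conn.execute(
    "SELECT DISTINCT Class_Code FROM Student_Table ORDER BY 1").fetchall()

group_rows = conn.execute(
    "SELECT Class_Code FROM Student_Table "
    "GROUP BY Class_Code ORDER BY 1").fetchall()

print(distinct_rows)
print(distinct_rows == group_rows)
```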


Quiz – How many rows come back from the Distinct?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT Distinct Class_Code, Grade_Pt
FROM   Student_Table
ORDER BY Class_Code, Grade_Pt;

How many rows will come back from the above SQL?

Answer – How many rows come back from the Distinct?

SELECT Distinct Class_Code, Grade_Pt
FROM   Student_Table
ORDER BY Class_Code, Grade_Pt ;

Class_Code  Grade_Pt
__________  ________
?                  ?
FR              0.00
FR              2.88
FR              4.00
JR              1.90
JR              3.95
SO              2.00
SO              3.80
SR              3.00
SR              3.35

No Rows have the exact same values for both the Class_Code and Grade_Pt. Each row is Distinct!

How many rows will come back from the above SQL? 10. All rows came back. Why? Because no two rows share both the same Class_Code and the same Grade_Pt. Each combination of the columns in the SELECT list is distinct.

TOP Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT TOP (3) Last_Name, Class_Code, Grade_Pt
FROM   Student_Table ;

Last_Name   Class_Code  Grade_Pt
__________  __________  ________
Hanson      FR              2.88
Bond        JR              3.95
Smith       SO              2.00

In the above example, we brought back 3 rows only. This is because of the TOP (3) clause, which means to get an answer set and then bring back the first 3 rows in that answer set. Because this example does not have an ORDER BY clause, there is no guarantee which 3 rows come back; you can consider this example as merely bringing back 3 arbitrary rows.

TOP Command is brilliant when ORDER BY is used!

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT TOP (3) Last_Name, Class_Code, Grade_Pt
FROM   Student_Table
ORDER BY Grade_Pt DESC ;

Last_Name   Class_Code  Grade_Pt
__________  __________  ________
Thomas      FR              4.00
Bond        JR              3.95
Wilson      SO              3.80

In the above example, we brought back 3 rows only. This is because of the TOP (3) clause, which means to get an answer set and then bring back the first 3 rows. Because this example uses an ORDER BY clause, the rows brought back are the 3 students with the highest Grade_Pt. This is the real power of the TOP command. Use it with an ORDER BY!

TOP Command with Ties

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00   (2nd)
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88   (Tie)
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?   (1st)
    234121  Thomas      Wendy       FR              4.00   (Tie)
    123250  Phillips    Martin      SR              3.00

SELECT TOP (2) WITH TIES
       Last_Name
      ,Class_Code
      ,Grade_Pt
FROM   Student_Table
ORDER BY Class_Code ;

Last_Name   Class_Code  Grade_Pt
__________  __________  ________
Johnson     ?                  ?
Larkins     FR              0.00
Thomas      FR              4.00
Hanson      FR              2.88

The TOP WITH TIES command brings in the TOP amount along with ANY ties. So while you might only ask for the top 2 with ties, you might get 4 rows back. Why did 4 rows return here? Which row came back first? Four rows returned, with the first row coming back as a NULL for Class_Code. Then the next row returned was one of the Freshmen. There were two other Freshmen that tie. All ‘FR’ come back in a tie!
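The WITH TIES rule can be written out in a few lines of plain Python. This is a sketch of the logic only, not how the warehouse implements it; the function name top_with_ties is made up for this example, and None is sorted first to mirror the NULL row coming back first in the answer set above.

```python
def top_with_ties(rows, n, key):
    """Return the first n rows by key, plus any rows that tie with the nth row."""
    def sort_key(row):
        value = key(row)
        # None sorts ahead of every real value.
        return (value is not None, value if value is not None else "")

    ordered = sorted(rows, key=sort_key)
    if n <= 0 or not ordered:
        return []
    top = ordered[:n]
    last_value = key(top[-1])
    for row in ordered[n:]:
        if key(row) != last_value:
            break           # first non-tie ends the answer set
        top.append(row)
    return top

# (Last_Name, Class_Code, Grade_Pt) from Student_Table.
students = [("Larkins", "FR", 0.00), ("Wilson", "SO", 3.80),
            ("McRoberts", "JR", 1.90), ("Bond", "JR", 3.95),
            ("Hanson", "FR", 2.88), ("Smith", "SO", 2.00),
            ("Delaney", "SR", 3.35), ("Johnson", None, None),
            ("Thomas", "FR", 4.00), ("Phillips", "SR", 3.00)]

result = top_with_ties(students, 2, key=lambda row: row[1])
for row in result:
    print(row)
```

The second row is a Freshman, so every other Freshman ties with it and is pulled in as well. The order among the tied rows is not something to rely on.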


TOP Command Using a Variable

Highlight the DECLARE and the SELECT statement and hit EXECUTE

You can use the TOP command in conjunction with a variable. Above, we declared a variable called @TOPVAR. We set the variable to 5. If we highlight the DECLARE and the Query we get the answer set providing the TOP 5 salaried employees.

Chapter 9

Aggregation

Chapter 9 – Aggregation

“The Azure SQL Data Warehouse climbed Aggregate Mountain and delivered a better way to Sum It.” – Tera-Tom Coffing


Quiz – You calculate the Answer Set in your own Mind

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT Avg(Grade_Pt)   AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*)        AS "Count *"
FROM   Student_Table
WHERE  Class_Code IS NULL

AVG    Count  Count *
_____  _____  _______

What would the result set be from the above query? The next slide shows answers!


Answer – You calculate the Answer Set in your own Mind

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
    423400  Larkins     Michael     FR              0.00
    231222  Wilson      Susie       SO              3.80
    280023  McRoberts   Richard     JR              1.90
    322133  Bond        Jimmy       JR              3.95
    125634  Hanson      Henry       FR              2.88
    333450  Smith       Andy        SO              2.00
    324652  Delaney     Danny       SR              3.35
    260000  Johnson     Stanley     ?                  ?
    234121  Thomas      Wendy       FR              4.00
    123250  Phillips    Martin      SR              3.00

SELECT Avg(Grade_Pt)   AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*)        AS "Count *"
FROM   Student_Table
WHERE  Class_Code IS NULL

AVG    Count  Count *
_____  _____  _______
    ?      0        1

Here are the correct answers. Aggregates ignore Null values.

The 3 Rules of Aggregation

Aggregation_Table

Employee_No  Salary
___________  _________
     423400  100000.00
     423401  100000.00
     423402       NULL

SELECT AVG(Salary)   as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*)      as RowCnt
FROM   Aggregation_Table ;

1) Aggregates Ignore Null Values.
2) Aggregates WANT to come back in one row.
3) You CAN’T mix Aggregates with normal columns unless you use a GROUP BY.

AVG(Salary) = $100000.00
Count(Salary) = 2
Count(*) = 3
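The first two rules can be confirmed with a short sketch. Python's built-in SQLite stands in for the warehouse here (an assumption); NULL handling in AVG and COUNT follows the same standard SQL rules.

```python
import sqlite3

# The three rows of Aggregation_Table; None represents the NULL Salary.
rows = [(423400, 100000.00), (423401, 100000.00), (423402, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Aggregation_Table (Employee_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Aggregation_Table VALUES (?, ?)", rows)

# AVG and COUNT(Salary) skip the NULL; COUNT(*) counts every row.
result = conn.execute(
    "SELECT AVG(Salary), COUNT(Salary), COUNT(*) "
    "FROM Aggregation_Table").fetchall()

avg_salary, sal_cnt, row_cnt = result[0]
print(avg_salary, sal_cnt, row_cnt)
```

The average is 200000 divided by 2, not by 3, because the NULL salary is never counted. The answer set is a single row, as rule 2 says.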


There are Five Aggregates

There are FIVE AGGREGATES which are the following:

MIN – The Minimum Value.
MAX – The Maximum Value.
AVG – The Average of the Column Values.
SUM – The Sum Total of the Column Values.
COUNT – The Count of the Column Values.

SELECT MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM   Employee_Table ;

“Don’t count the days, make the days count.” – Muhammad Ali

The five aggregates are listed above. Muhammad Ali was way off in his quote. He meant to say, "Don't you count the days, make the data count for you".

Quiz – How many rows come back? Employee_Table Employee_No ________ Dept_No ____________ 2000000 ? 1000234 10 1232578 100 1324657 200 1333454 200 2312225 300 1121334 400 2341218 400 1256349 400

Last_Name __________ First_Name _______ Salary __________ 32800.50 Jones Squiggy 64300.00 Smythe Richard 48850.00 Chambers Mandee 41888.88 Coffing Billy 48000.00 Smith John 40200.00 Larkins Loraine 54500.00 Strickling Cletus 36000.00 Reilly William 54500.00 Harrison Herbert

SELECT MIN (Salary) AS Minsal ,MAX (Salary) AS Maxsal ,SUM (Salary) AS Sumsal ,AVG (Salary) AS Avgsal ,Count(*) AS Countrows FROM Employee_Table

How many rows will the above query produce in the result set? Page 223

How many rows come back?

Answer – How many rows come back?

SELECT MIN (Salary) AS Minsal
      ,MAX (Salary) AS Maxsal
      ,SUM (Salary) AS Sumsal
      ,AVG (Salary) AS Avgsal
      ,Count(*)     AS Countrows
FROM Employee_Table

Minsal    Maxsal    Sumsal     Avgsal        Countrows
________  ________  _________  ____________  _________
32800.50  64300.00  421039.38  46782.153333  9

Only one row comes back

How many rows will the above query produce in the result set? The answer is one.

Troubleshooting Aggregates

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table ;

Dept_No is a NON-Aggregate column with no GROUP BY: Error

If you have a normal column (non aggregate) in your query, you must have a corresponding GROUP BY statement.

GROUP BY when Aggregates and Normal Columns Mix

NON-Aggregate column present: Group By Needed

If you have a normal column (non aggregate) in your query, you must have a corresponding GROUP BY statement.

GROUP BY delivers one row per Group

SELECT Dept_No
      ,MIN (Salary) AS "Min"
      ,MAX (Salary) AS "Max"
      ,SUM (Salary) AS "Sum"
      ,AVG (Salary) AS "Avg"
      ,Count(*)     AS "Count"
FROM Employee_Table
GROUP BY Dept_No
ORDER BY Dept_No ;

Dept_No is a NON-Aggregate column: Group By Needed

Dept_No  Min       Max       Sum        Avg       Count
_______  ________  ________  _________  ________  _____
?        32800.50  32800.50  32800.50   32800.50  1
10       64300.00  64300.00  64300.00   64300.00  1
100      48850.00  48850.00  48850.00   48850.00  1
200      41888.88  48000.00  89888.88   44944.44  2
300      40200.00  40200.00  40200.00   40200.00  1
400      36000.00  54500.00  145000.00  48333.33  3

The GROUP BY Dept_No clause allows the Aggregates to be calculated per Dept_No. The data has also been sorted with the ORDER BY clause, which places the Null Dept_No first.
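To see one row per group for yourself, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in engine and loads only the Dept_No and Salary columns of the Employee_Table.

```python
import sqlite3

# (Dept_No, Salary) pairs from the Employee_Table; None is stored as NULL.
SALARIES = [(None, 32800.50), (10, 64300.00), (100, 48850.00),
            (200, 41888.88), (200, 48000.00), (300, 40200.00),
            (400, 54500.00), (400, 36000.00), (400, 54500.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", SALARIES)

rows = conn.execute("""
    SELECT Dept_No, MIN(Salary), MAX(Salary), SUM(Salary), AVG(Salary), COUNT(*)
    FROM Employee_Table
    GROUP BY Dept_No
    ORDER BY Dept_No
""").fetchall()

for row in rows:
    print(row)  # one row per Dept_No, including the NULL group
```

Nine employee rows collapse into six result rows, one for each distinct Dept_No value, with the Null department forming a group of its own.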

Count_Big

Count_Big has a data type of BIGINT

SELECT Dept_No,
       COUNT(Salary)     AS CountSal,
       COUNT_BIG(Salary) AS CountSalBig
FROM Employee_Table
GROUP BY Dept_No
ORDER BY Dept_No;

Dept_No  CountSal  CountSalBig
_______  ________  ___________
?        1         1
10       1         1
100      1         1
200      2         2
300      1         1
400      3         3

The Count_Big command works the same as a Count, but Count_Big returns a BIGINT while Count returns an Integer. Use Count_Big when the count can exceed 2,147,483,647 (roughly two billion), which is the largest value an Integer can hold.

Limiting Rows and Improving Performance with WHERE

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

SELECT Dept_No, MIN (Salary), MAX (Salary), SUM (Salary), AVG (Salary), COUNT(*)
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
ORDER BY 1 ;

WHERE Clause acts as a filter before any Calculations are done

Will Dept_No 300 be calculated? Of course you know it will…NOT!

WHERE Clause in Aggregation limits unneeded Calculations

SELECT Dept_No
      ,MIN (Salary) as "Min"
      ,MAX (Salary) as "Max"
      ,SUM (Salary) as "Sum"
      ,AVG (Salary) as "Avg"
      ,COUNT(*)     as "Count"
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
ORDER BY 1 ;

WHERE Clause acts as a filter before any Calculations are done

Dept_No  Min       Max       Sum        Avg       Count
_______  ________  ________  _________  ________  _____
200      41888.88  48000.00  89888.88   44944.44  2
400      36000.00  54500.00  145000.00  48333.33  3

The WHERE clause eliminates every Dept_No other than 200 and 400 before any calculation is done. This means that only rows with a Dept_No of 200 or 400 take part in the aggregation.

Keyword HAVING tests Aggregates after they are Totaled

SELECT Dept_No
      ,MIN (Salary) as "Min"
      ,MAX (Salary) as "Max"
      ,SUM (Salary) as "Sum"
      ,AVG (Salary) as "Avg"
      ,COUNT(*)     as "Count"
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
HAVING AVG(Salary) > 45000
ORDER BY 1 ;

HAVING Clause acts as a filter on the totals after the Calculations are done

Dept_No  Min       Max       Sum        Avg       Count
_______  ________  ________  _________  ________  _____
400      36000.00  54500.00  145000.00  48333.33  3

Dept_No 200 is calculated (its Avg is 44944.44), but the HAVING clause then eliminates it because 44944.44 is not greater than 45000.

The HAVING Clause only works on Aggregate Totals. The WHERE filters rows to be excluded from calculation, but the HAVING filters the Aggregate totals after the calculations, thus eliminating certain Aggregate totals.
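The difference in timing between WHERE and HAVING shows up clearly when the same query is run with and without the HAVING clause. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in engine and the Employee_Table salaries used throughout this chapter.

```python
import sqlite3

SALARIES = [(None, 32800.50), (10, 64300.00), (100, 48850.00),
            (200, 41888.88), (200, 48000.00), (300, 40200.00),
            (400, 54500.00), (400, 36000.00), (400, 54500.00)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", SALARIES)

base = """
    SELECT Dept_No, AVG(Salary)
    FROM Employee_Table
    WHERE Dept_No IN (200, 400)      -- filters rows BEFORE aggregation
    GROUP BY Dept_No
"""
where_rows = conn.execute(base + " ORDER BY 1").fetchall()

# HAVING filters the groups AFTER their averages have been calculated.
having_rows = conn.execute(base + " HAVING AVG(Salary) > 45000 ORDER BY 1").fetchall()

print(where_rows)
print(having_rows)
```

With WHERE alone both departments come back. Adding HAVING drops Dept_No 200, whose average falls just short of 45000.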

Group By Grouping Sets

SELECT S.Product_Id
      ,DATEPART (Month, S.Sale_Date) as "Mo"
      ,DATEPART (Year, S.Sale_Date)  as "Year"
      ,SUM (Daily_Sales)             as Total
FROM Sales_Table as S
GROUP BY Grouping Sets (S.Product_Id
                       ,DATEPART(Month, S.Sale_Date)
                       ,DATEPART(Year, S.Sale_Date))
ORDER BY 1, 2, 3

Product_Id  Mo  Year  Total
__________  __  ____  _________
?           ?   2000  862404.35
?           9   ?     418769.36
?           10  ?     443634.99
1000        ?   ?     331204.72
2000        ?   ?     306611.81
3000        ?   ?     224587.82

Each Grouping Set totals 862404.35 in this example

This query does not work in this release of the Azure SQL Data Warehouse

The example above shows the Group By Grouping Sets. This will show the figures from the Sales_Table many different ways. You will see the total sales per year, total sales per month and total sales per Product_Id. Because every sale in this table falls in the year 2000, the single Year row is also the total for all sales combined. There are three of these commands: Group By Grouping Sets, Group By Rollup and Group By Cube.
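When Grouping Sets is not available, one workaround is to write one plain GROUP BY query per grouping set and glue them together with UNION ALL. The sketch below shows the idea under several assumptions. It runs on Python's built-in sqlite3 module rather than the Azure SQL Data Warehouse, it uses strftime() because SQLite has no DATEPART, and the four sales rows are made-up sample data, not the book's Sales_Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_Id INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
conn.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 100.0),   # hypothetical sample rows
    (1000, "2000-10-01", 200.0),
    (2000, "2000-09-28", 300.0),
    (3000, "2000-10-02", 400.0),
])

# One SELECT per grouping set; NULL fills the columns that set does not group by.
sql = """
SELECT Product_Id, NULL AS Mo, NULL AS Yr, SUM(Daily_Sales) AS Total
FROM Sales_Table GROUP BY Product_Id
UNION ALL
SELECT NULL, CAST(strftime('%m', Sale_Date) AS INTEGER), NULL, SUM(Daily_Sales)
FROM Sales_Table GROUP BY CAST(strftime('%m', Sale_Date) AS INTEGER)
UNION ALL
SELECT NULL, NULL, CAST(strftime('%Y', Sale_Date) AS INTEGER), SUM(Daily_Sales)
FROM Sales_Table GROUP BY CAST(strftime('%Y', Sale_Date) AS INTEGER)
ORDER BY 1, 2, 3
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The result has the same shape as the answer set above: one row for the year, two for the months and three for the products, with each set adding up to the same total.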

Group By Rollup

SELECT S.Product_Id
      ,DATEPART (Month, S.Sale_Date) as "Mo"
      ,DATEPART (Year, S.Sale_Date)  as "Year"
      ,SUM (Daily_Sales)             as Total
FROM Sales_Table as S
GROUP BY ROLLUP (S.Product_Id
                ,DATEPART(Month, S.Sale_Date)
                ,DATEPART(Year, S.Sale_Date))
ORDER BY 1, 2, 3

This query does not work in this release of the Azure SQL Data Warehouse

The example above shows the Group By Rollup. This will show the figures from the Sales_Table many different ways. The Answer set is on the following page. There are three of these commands: Group By Grouping Sets, Group By Rollup and Group By Cube. Grouping Sets shows a few different views. Group By Rollup takes it further and Group By Cube even more. Turn the page and see what Rollup does.

Answer Set for Group By Rollup Query

Product_Id  Mo  Year  Total
__________  __  ____  _________
?           ?   ?     862404.35
1000        ?   ?     331204.72
1000        9   ?     139350.69
1000        9   2000  139350.69
1000        10  ?     191854.03
1000        10  2000  191854.03
2000        ?   ?     306611.81
2000        9   ?     139738.91
2000        9   2000  139738.91
2000        10  ?     166872.90
2000        10  2000  166872.90
3000        ?   ?     224587.82
3000        9   ?     139679.76
3000        9   2000  139679.76
3000        10  ?     84908.06
3000        10  2000  84908.06

SELECT S.Product_Id
      ,DATEPART (Month, S.Sale_Date) as "Mo"
      ,DATEPART (Year, S.Sale_Date)  as "Year"
      ,SUM (Daily_Sales)             as Total
FROM Sales_Table as S
GROUP BY ROLLUP (S.Product_Id
                ,DATEPART(Month, S.Sale_Date)
                ,DATEPART(Year, S.Sale_Date))
ORDER BY 1, 2, 3

The answer set from the previous page is above. Each rollup level in the Total column (color-coded in the original) adds up to 862404.35.

Creating a Cube

SELECT S.Product_Id
      ,DATEPART (Month, S.Sale_Date) as "Mo"
      ,DATEPART (Year, S.Sale_Date)  as "Year"
      ,SUM (Daily_Sales)             as Total
FROM Sales_Table as S
GROUP BY CUBE (S.Product_Id
              ,DATEPART(Month, S.Sale_Date)
              ,DATEPART(Year, S.Sale_Date))
ORDER BY 1, 2, 3

This query does not work in this release of the Azure SQL Data Warehouse

The example above shows how to create a cube. This will show the figures from the Sales_Table many different ways. You will see the total sales for all sales combined, total sales per year, total sales per month and more. The following page will show the answer set.

Answer Set for Cube Query

Product_Id  Mo  Year  Total
__________  __  ____  _________
?           ?   ?     862404.35
?           ?   2000  862404.35
?           9   ?     418769.36
?           9   2000  418769.36
?           10  ?     443634.99
?           10  2000  443634.99
1000        ?   ?     331204.72
1000        ?   2000  331204.72
1000        9   ?     139350.69
1000        9   2000  139350.69
1000        10  ?     191854.03
1000        10  2000  191854.03
2000        ?   ?     306611.81
2000        ?   2000  306611.81
2000        9   ?     139738.91
2000        9   2000  139738.91
2000        10  ?     166872.90
2000        10  2000  166872.90
3000        ?   ?     224587.82
3000        ?   2000  224587.82
3000        9   ?     139679.76
3000        9   2000  139679.76
3000        10  ?     84908.06
3000        10  2000  84908.06

SELECT S.Product_Id
      ,DATEPART (Month, S.Sale_Date) as "Mo"
      ,DATEPART (Year, S.Sale_Date)  as "Year"
      ,SUM (Daily_Sales)             as Total
FROM Sales_Table as S
GROUP BY CUBE (S.Product_Id
              ,DATEPART(Month, S.Sale_Date)
              ,DATEPART(Year, S.Sale_Date))
ORDER BY 1, 2, 3

The answer set from the previous page is above. Each grouping level in the Total column (color-coded in the original) adds up to 862404.35.
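Why does Rollup return fewer rows than Cube? Rollup groups by ever shorter prefixes of the column list, while Cube groups by every possible combination of the columns. The sketch below simply lists those combinations in plain Python. The function names rollup_sets and cube_sets are made up for this illustration and are not part of any SQL product.

```python
from itertools import combinations

def rollup_sets(cols):
    """ROLLUP(a, b, c) groups by (a, b, c), (a, b), (a) and () for the grand total."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube_sets(cols):
    """CUBE(a, b, c) groups by every subset of the columns, 2**n sets in total."""
    return [combo for n in range(len(cols), -1, -1)
            for combo in combinations(cols, n)]

cols = ["Product_Id", "Mo", "Year"]
rollup = rollup_sets(cols)
cube = cube_sets(cols)

print(len(rollup), "rollup sets:", rollup)
print(len(cube), "cube sets:", cube)
```

Three columns give four grouping sets for Rollup and eight for Cube. With three products, two months and one year, that works out to the 16 rows of the Rollup answer set and the 24 rows of the Cube answer set.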

An Easy Example of Creating a Cube

This is not a great cube example because there is only one customer who placed one order. It is done only to show you the concept of a cube. At the top of the answer set is what was made in total. Then it is further broken down.

Quiz - GROUP BY GROUPING SETS Challenge

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Write SQL that will perform a Group by Grouping Sets. Your mission is to build a report that will show the Average Grade_Pt for three different sets. Those sets are by Class_Code, by Credits and by Course_ID. Sort the final report first by Class_Code (FR, SO, JR, SR), then by Credits DESC and then by Course_ID DESC. Good Luck!

The answer is on the next page.

Answer To Quiz - GROUP BY GROUPING SETS Challenge

SELECT Class_Code
      ,AVG(Grade_Pt)
      ,Credits
      ,"c".Course_Id
FROM Student_Table s,
     Student_Course_Table sc,
     Course_Table "c"
WHERE s.Student_Id = sc.Student_Id
AND   sc.Course_Id = "c".Course_Id
GROUP BY GROUPING SETS (Class_Code, "c".Course_ID, Credits)
ORDER BY CASE Class_Code
            WHEN 'FR' Then 1
            WHEN 'SO' Then 2
            WHEN 'JR' Then 3
            WHEN 'SR' Then 4
            ELSE 5
         END,
         Credits DESC,
         Course_ID DESC ;

Above is something to enjoy and learn from.


Getting the Average Values Per Column

The first query retrieved the average rows per value for the column Product_ID. The example below did the same, but for the column Sale_Date.


Average Values per Column for all Columns in a Table

The query above retrieved the average rows per value for both columns in the table.


Chapter 10 - Join Functions

“When spider webs unite they can tie up a lion.” - African Proverb

The Azure SQL Data Warehouse Join Quiz

Which Statement is NOT true?

1. Each Table in the Azure SQL Data Warehouse has a Distribution Key, unless it is a replicated table.
2. The Distribution Key is the mechanism that allows the Azure SQL Data Warehouse to physically distribute the rows of a table across the Nodes.
3. For two rows to be Joined together the Azure SQL Data Warehouse insists that both rows are physically in the same memory.
4. The Azure SQL Data Warehouse will either Redistribute one or both of the tables or Duplicate the smaller table across all nodes to ensure matching rows are in the same memory, even if it is only for the life of the Join.

Do you know which statement above is False?

The Azure SQL Data Warehouse Join Quiz Answer

Which Statement is NOT true?

1. Each Table in the Azure SQL Data Warehouse has a Distribution Key, unless it is a replicated table.
2. The Distribution Key is the mechanism that allows the Azure SQL Data Warehouse to physically distribute the rows of a table across the Nodes.
3. For two rows to be Joined together the Azure SQL Data Warehouse insists that both rows are physically in the same memory.
4. The Azure SQL Data Warehouse will either Redistribute one or both of the tables or Duplicate the smaller table across all nodes to ensure matching rows are in the same memory, even if it is only for the life of the Join.

Distribution (Memory)
  Customer_Table row:  ACE Consult  555-1212  31323134
  Order_Table row:     31323134  123552  10/01/1999  5111.47
  Join on Customer_Number

All statements above are TRUE! Two joining rows have to be in the same memory of a single node.

Redistribution

Distribution (Memory)
  Customer_Table row:  ACE Consult  555-1212  31323134
    The Distribution Key for this table is Customer_Number, so it is naturally on this Distribution.
  Order_Table row:     31323134  123552  10/01/1999  5111.47
    Customer_Number is NOT the Distribution Key so this row was re-hashed by Customer_Number.
  Join on Customer_Number

SELECT C.Customer_Number, C.Phone_Number, C.Customer_Name
      ,O.Customer_Number, O.Order_Number, O.Order_Date, Order_Total
FROM Customer_Table as C
INNER JOIN Order_Table as O
ON C.Customer_Number = O.Customer_Number ;

The Azure SQL Data Warehouse can redistribute data (temporarily) by re-hashing Customer_Number from the Order_Table. Now, all joining rows will be in the same node's memory. That is one of two ways to get matching rows together.
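Redistribution can be pictured with a small simulation. This is only a sketch under loud assumptions: the real hash function is internal to the service, so zlib.crc32 stands in for it here, and the count of four distributions is chosen just to keep the output readable.

```python
import zlib

NUM_DISTRIBUTIONS = 4  # hypothetical, kept small for readability

def distribution_for(key, n=NUM_DISTRIBUTIONS):
    """Stand-in hash; the same key always lands on the same distribution."""
    return zlib.crc32(str(key).encode()) % n

def redistribute(rows, key_index, n=NUM_DISTRIBUTIONS):
    """Place every row on the distribution chosen by hashing its key column."""
    buckets = {d: [] for d in range(n)}
    for row in rows:
        buckets[distribution_for(row[key_index], n)].append(row)
    return buckets

# Customer_Table rows: (Customer_Number, Customer_Name)
customers = [(11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
             (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
             (87323456, "Databases N-U")]
# Order_Table rows: (Order_Number, Customer_Number, Order_Total)
orders = [(123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
          (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
          (123777, 57896883, 23454.84)]

cust_by_dist = redistribute(customers, key_index=0)
orders_by_dist = redistribute(orders, key_index=1)  # re-hash by Customer_Number

for d in range(NUM_DISTRIBUTIONS):
    print(d, cust_by_dist[d], orders_by_dist[d])
```

Because both tables are hashed on Customer_Number with the same formula, every order ends up on the distribution that already holds its customer row.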

Big Table Small Table Join Strategy

Distribution
  Employee_Table has 1 million rows
  Department_Table:  100 Marketing, 400 Customer Support

Distribution
  Employee_Table has 1 million rows
  Department_Table:  200 Research and Dev, 300 Sales

The Department_Table is small. It only has four rows

The Azure SQL Data Warehouse has a special way of dealing with big table and small table joins. Turn the page and be prepared to be amazed!

Duplication of the Smaller Table across All Distributions

Duplicate the smaller table in memory

Distribution
  Duplicated in memory:  100 Marketing, 200 Research and Dev, 300 Sales, 400 Customer Support
  Employee_Table has 1 million rows
  Department_Table:  100 Marketing, 400 Customer Support

Distribution
  Duplicated in memory:  100 Marketing, 200 Research and Dev, 300 Sales, 400 Customer Support
  Employee_Table has 1 million rows
  Department_Table:  200 Research and Dev, 300 Sales

The Department_Table is small. It only has four rows

The Azure SQL Data Warehouse gathered up all four rows of the Department_Table and (temporarily, in memory) duplicated the entire four-row table across all Nodes. Now the joins can happen! This is the second way to get rows together. If one table is much bigger than the other, the Azure SQL Data Warehouse will duplicate the smaller table on all Nodes, just for the life of the query.
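Duplication can be sketched the same way. The example below is an illustration only, with two distributions and a handful of employee rows placed by hand. The helper names duplicate_small_table and local_join are made up for the sketch.

```python
# Employee rows (Last_Name, Dept_No) as they happen to sit on two distributions.
employee_by_dist = {
    0: [("Chambers", 100), ("Coffing", 200), ("Larkins", 300)],
    1: [("Smith", 200), ("Reilly", 400), ("Harrison", 400)],
}
# Department rows (Dept_No, Department_Name); each distribution stores only a part.
department_by_dist = {
    0: [(100, "Marketing"), (400, "Customer Support")],
    1: [(200, "Research and Dev"), (300, "Sales")],
}

def duplicate_small_table(small_by_dist):
    """Gather every row of the small table and give each distribution a full copy."""
    full_copy = [row for rows in small_by_dist.values() for row in rows]
    return {d: list(full_copy) for d in small_by_dist}

def local_join(emp_rows, dept_rows):
    """Join inside one distribution; no row leaves the distribution."""
    names = dict(dept_rows)
    return [(last, names[dept]) for last, dept in emp_rows if dept in names]

dept_copy = duplicate_small_table(department_by_dist)
result = [pair
          for d in employee_by_dist
          for pair in local_join(employee_by_dist[d], dept_copy[d])]
print(result)
```

Only the four department rows travel. The big Employee_Table stays exactly where it is, which is the whole point of the strategy.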

If the Join Condition is the Distribution Key, No Movement

SELECT Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Is Dept_No the Distribution Key of Employee_Table? YES

Is Dept_No the Distribution Key of Department_Table? YES

If the above tables (being joined by Dept_No) also had Dept_No as their Distribution Key, then matching rows would naturally be on the same Node together. See this visually by turning the page!

The Azure SQL Data Warehouse knows that it can only JOIN two rows together if they are physically on the same node. This can occur naturally if the join condition columns are the Distribution Keys of their respective tables, but most likely the Azure SQL Data Warehouse will have to move data to get the matching rows on the same Node. What will the Optimizer decide to do next?

Matching Rows That Are On The Same Node Naturally

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

If Dept_No was the Distribution Key of both tables the matching rows would already be on the same Node

Distribution 1
  Department:  100 Marketing
  Employees:   100 Chambers, 10 Smythe

Distribution 2
  Department:  200 Research and Dev
  Employees:   200 Coffing, 200 Smith

Distribution 3
  Department:  300 Sales
  Employees:   300 Larkins

Distribution 4
  Department:  400 Customer Support
  Employees:   400 Harrison, 400 Reilly, 400 Strickling

If both the Employee_Table and the Department_Table (being joined by Dept_No) have Dept_No as their respective Distribution Keys they are considered co-located. Anytime these two tables are joined, there is no need to redistribute or duplicate because the matching rows are naturally on the same Node. That is the brilliance of the Hash Formula.

What if the Join Condition Columns are Not Distribution Keys

SELECT Last_Name, E.Dept_No, Department_Name
FROM Employee_Table as E,
     Department_Table as D
WHERE E.Dept_No = D.Dept_No
ORDER BY 1 ;

Is Dept_No the Distribution Key of Employee_Table? NO

Is Dept_No the Distribution Key of Department_Table? YES

Redistribute the Employee_Table by Dept_No for this join only.

The Optimizer knows that the Dept_No column is the Distribution Key for the Department_Table. It also knows that the Dept_No column is NOT the Distribution Key for the Employee_Table, so the Optimizer commands the Nodes to Redistribute the entire Employee_Table by Dept_No temporarily. This is equivalent to loading the Employee_Table with a Distribution Key of Dept_No. Now all matching rows can join.
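The choices described so far can be boiled down to a tiny decision function. This is a deliberately simplified model for illustration and not the actual Optimizer, which weighs costs and statistics. The function name data_movement and its labels are assumptions made up for this sketch.

```python
def data_movement(left_key, right_key, join_col, right_is_small=False):
    """Return a label for the data movement this simplified model would pick."""
    left_ok = left_key == join_col    # left table already distributed on the join column
    right_ok = right_key == join_col  # right table already distributed on the join column
    if left_ok and right_ok:
        return "none (co-located)"
    if right_is_small:
        return "duplicate right table"
    if right_ok:
        return "redistribute left table"
    if left_ok:
        return "redistribute right table"
    return "redistribute both tables"

# Employee_Table distributed on Employee_No, Department_Table on Dept_No.
print(data_movement("Employee_No", "Dept_No", "Dept_No"))
# Both tables distributed on Dept_No.
print(data_movement("Dept_No", "Dept_No", "Dept_No"))
```

The first call matches the slide above: only the Employee_Table has to be re-hashed by Dept_No, and only for this join.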

Strategy 1 of 4 – The Merge Join

1) The rows to be joined have to be located on a common Distribution's memory
2) Both spools have to be sorted by the ROWID calculated over the join column(s)

[Figure: two spools, each sorted by ROWID, matched in order]

To prepare for the join:
- Re-Distribution of one or both spools by ROWHASH, or Duplication of the smaller spool to all Nodes
- Sorting of one or both spools by the ROWID

Relocation of rows to the common node can be done by redistribution of the rows by the join column(s) ROWHASH or by copying the smaller table as a whole to all nodes.
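Once both spools sit on the same distribution and are sorted, the merge itself is a single pass over each side. Here is a minimal sketch of that pass. As a simplification it sorts directly on the join value (Dept_No) instead of a ROWID, and the row layouts are assumptions made for the example.

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Join two lists that are ALREADY sorted by key; duplicate keys are handled."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1          # left key has no partner yet, move on
        elif kl > kr:
            j += 1          # right key has no partner yet, move on
        else:
            # Find the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            # Pair that run with every left row carrying the same key.
            while i < len(left) and key(left[i]) == kl:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out

departments = sorted([(100, "Marketing"), (200, "Research and Dev"),
                      (300, "Sales"), (400, "Customer Support")])
employees = sorted([(100, "Chambers"), (10, "Smythe"), (200, "Coffing"),
                    (200, "Smith"), (300, "Larkins"), (400, "Harrison"),
                    (400, "Reilly"), (400, "Strickling")])

joined = merge_join(employees, departments)
for emp, dept in joined:
    print(emp[1], dept[1])
```

Smythe in Dept_No 10 finds no partner and is skipped, so seven rows come back.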

Quiz – Redistribute the Employees by their Dept_No

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Distribution 1:  100 Marketing
Distribution 2:  200 Research and Dev
Distribution 3:  300 Sales
Distribution 4:  400 Customer Support

If the Azure SQL Data Warehouse decided to Redistribute the Employee_Table by Dept_No, which nodes will hold which employee rows? Try and place them yourself.

Fill in the quiz above. This is a great opportunity to understand the Azure SQL Data Warehouse engine.

Quiz – Dept_No landed on Distribution with Matches

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

The hashing formula is consistent. Notice that all of the Dept_No 400 rows landed on Node 4. That is because one hash formula ensures matches reside together.

Distribution 1
  Department:  100 Marketing
  Employees:   100 Chambers, 10 Smythe

Distribution 2
  Department:  200 Research and Dev
  Employees:   200 Coffing, 200 Smith

Distribution 3
  Department:  300 Sales
  Employees:   300 Larkins

Distribution 4
  Department:  400 Customer Support
  Employees:   400 Harrison, 400 Reilly, 400 Strickling

Each redistributed row landed on the same Node as its matching row. Notice that Squiggy Jones has a NULL department so the Azure SQL Data Warehouse will not redistribute that row on an Inner Join. Smythe in Dept_No 10 hashes to Distribution 1 but has no match. Turn the page.

Quiz – Redistribute the Orders to the Proper Distribution

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Distribution 1
  Customers:  11111111 Billy's Best Choice

Distribution 2
  Customers:  31313131 Acme Products, 57896883 XYZ Plumbing

Distribution 3
  Customers:  31323134 ACE Consulting, 87323456 Databases N-U

If the Azure SQL Data Warehouse decides to Redistribute the Order_Table by Customer_Number, which nodes will hold which Orders? Place their Customer_Number and Order_Total on the node after Redistribution.

Fill in the quiz above. This is a great opportunity to understand the Azure SQL Data Warehouse engine.

Answer to Quiz – Redistribute the Orders to the Proper Distribution

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Distribution 1
  Customers:  11111111 Billy's Best Choice
  Orders:     11111111 12347.53
              11111111 8005.91

Distribution 2
  Customers:  31313131 Acme Products, 57896883 XYZ Plumbing
  Orders:     57896883 23454.84

Distribution 3
  Customers:  31323134 ACE Consulting, 87323456 Databases N-U
  Orders:     31323134 5111.47
              87323456 15231.62

It is no coincidence that when Customer_Number 11111111 was hashed every 11111111 row went to Distribution 1.

Each row redistributed to the same Distribution as its matching row.

Strategy 2 of 4 – The Hash Join

1) The rows to be joined must be located on a common Distribution
2) The smaller spool is sorted by the ROWHASH calculated over the join column(s) and is kept in the cache (memory)
3) The bigger spool stays unsorted

[Figure: smaller spool "Sorted by Row Hash" next to bigger spool "Unsorted"]

The bigger spool is scanned row by row, and each ROWID from the bigger spool is searched for in the smaller spool (by means of a binary search).

The Hash Join takes advantage of memory and loads the entire smaller spool into Cache memory. Then, each row from the bigger spool is joined one at a time by doing a binary search (on the sorted smaller spool).
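The steps above can be sketched in a few lines of Python. This follows the description on this page (sort the smaller spool, keep it in memory, binary-search it for every row of the bigger spool). As a simplification it sorts on the join value itself rather than on a ROWHASH, and the row layouts are assumptions for the example.

```python
from bisect import bisect_left

def hash_join(small, big, key=lambda row: row[0]):
    """Sort the smaller spool once, then probe it with a binary search per big row."""
    small_sorted = sorted(small, key=key)      # smaller spool, held in memory
    small_keys = [key(r) for r in small_sorted]
    out = []
    for row in big:                            # bigger spool, scanned unsorted
        k = key(row)
        pos = bisect_left(small_keys, k)       # binary search for the first match
        while pos < len(small_keys) and small_keys[pos] == k:
            out.append((row, small_sorted[pos]))
            pos += 1
    return out

departments = [(400, "Customer Support"), (100, "Marketing"),
               (300, "Sales"), (200, "Research and Dev")]
employees = [(300, "Larkins"), (400, "Harrison"), (10, "Smythe"),
             (200, "Coffing"), (100, "Chambers"), (400, "Reilly"),
             (200, "Smith"), (400, "Strickling")]

joined = hash_join(departments, employees)
for emp, dept in joined:
    print(emp[1], dept[1])
```

The bigger spool is never sorted, which is what saves the work compared with the Merge Join.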

Strategy 4 of 4 – The Product Join

1) The rows to be joined have to be located on the same Distribution
2) No spool needs to be sorted!

[Figure: two spools, both "Unsorted"]

A full table scan is done on the smaller spool and each qualifying row of spool 1 is compared against each row of spool 2.

The Product Join is not well received. Keep an eye on it. It could be a bad sign.
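Why is the Product Join a possible bad sign? Every row of one spool is compared with every row of the other, so the work grows with the product of the two row counts. A minimal sketch, with row layouts assumed for the example:

```python
def product_join(spool1, spool2, condition):
    """Compare every row of spool1 with every row of spool2; nothing is sorted."""
    matches, comparisons = [], 0
    for a in spool1:
        for b in spool2:
            comparisons += 1
            if condition(a, b):
                matches.append((a, b))
    return matches, comparisons

employees = [(300, "Larkins"), (400, "Harrison"), (10, "Smythe"),
             (200, "Coffing"), (100, "Chambers"), (400, "Reilly"),
             (200, "Smith"), (400, "Strickling")]
departments = [(100, "Marketing"), (200, "Research and Dev"),
               (300, "Sales"), (400, "Customer Support")]

matches, comparisons = product_join(
    employees, departments, lambda e, d: e[0] == d[0]
)
print(len(matches), "matches after", comparisons, "comparisons")
```

Eight employees against four departments is already 32 comparisons. A million rows against a thousand would be a billion.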

A Two-Table Join Using Traditional Syntax

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Customer_Table.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table,
     Order_Table
WHERE Customer_Table.Customer_Number = Order_Table.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified with the table name or it errors.

Customer_Number is the column that has matching data in both tables. This is called the "Join Condition".

A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION is which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship, so this join will happen on matching Customer_Number columns.

A two-table join using Non-ANSI Syntax with Table Alias

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Cust.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

We alias the table names to shorten the typing when fully qualifying a column.

A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION means which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship.

You Can Fully Qualify All Columns

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Cust.Customer_Number
      ,Cust.Customer_Name
      ,Ord.Order_Number
      ,Ord.Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

A good practice is to fully qualify all columns in the SELECT list for clarity to other users.

Whenever a column is in both tables, you must fully qualify it when doing a join. You don't have to fully qualify columns that are only in one of the tables because the system knows which table that particular column is in. You can choose to fully qualify every column if you like. This is a good practice because it is more apparent which columns belong to which tables for anyone else looking at your SQL.

A two-table join using ANSI Syntax

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number ;

INNER JOIN Keyword replaces the comma

ON Keyword is used instead of WHERE

This is the same join as the previous slide except it is using ANSI syntax. Both will return the same rows with the same performance. Rows are joined when the Customer_Number matches on both tables, but non-matches won't return.

Both Queries have the same Results and Performance

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Traditional Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

ANSI Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number ;

Both of these syntax techniques bring back the same result set and have the same performance. The INNER JOIN is considered ANSI. Which one does Outer Joins?
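The claim about identical result sets is easy to check. The sketch below assumes Python's built-in sqlite3 module as a stand-in engine and loads the Customer_Table and Order_Table shown above. It compares rows only and says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT);
    CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER,
                              Order_Total REAL);
    INSERT INTO Customer_Table VALUES
        (11111111, 'Billy''s Best Choice'), (31313131, 'Acme Products'),
        (31323134, 'ACE Consulting'), (57896883, 'XYZ Plumbing'),
        (87323456, 'Databases N-U');
    INSERT INTO Order_Table VALUES
        (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
        (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
        (123777, 57896883, 23454.84);
""")

traditional = """
    SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
    FROM Customer_Table AS Cust, Order_Table AS Ord
    WHERE Cust.Customer_Number = Ord.Customer_Number
"""
ansi = """
    SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
    FROM Customer_Table AS Cust
    INNER JOIN Order_Table AS Ord
    ON Cust.Customer_Number = Ord.Customer_Number
"""

traditional_rows = sorted(conn.execute(traditional).fetchall())
ansi_rows = sorted(conn.execute(ansi).fetchall())
print(traditional_rows == ansi_rows, len(ansi_rows))
```

Five rows come back either way. Acme Products has no orders, so it does not appear in an inner join.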

Quiz – Can You Finish the Join Syntax?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON    (Finish the Join)

Finish this join by placing the missing SQL in the proper place!

Answer to Quiz – Can You Finish the Join Syntax?

Employee_Table (Dept_No is the Foreign Key)

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table (Dept_No is the Primary Key)

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query is ready to run.

Dept_No is the column that both tables have in common. This is called a Primary Key/Foreign Key relationship


Quiz – Can You Find the Error?

SELECT First_Name
      ,Last_Name
      ,Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query has an error! Can you find it?


Answer to Quiz – Can You Find the Error?

The column Dept_No is in both tables. It needs to be fully qualified as E.Dept_No or D.Dept_No.

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

If a column in the SELECT list is in both tables, you must fully qualify it.


Super Quiz – Can You Find the Difficult Error?

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

This query has an error! Can you find it?


Answer to Super Quiz – Can You Find the Difficult Error?

SELECT First_Name, Last_Name, E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

Once you alias a table (as E), you must fully qualify with E.Dept_No, not Employee_Table.Dept_No. This query thinks there are three tables (E, D, and Employee_Table).

If a column in the SELECT list is in both tables, you must fully qualify it. Once you create an alias, you must use the alias.


Quiz – Which Rows from Both Tables Won't Return?

SELECT E.First_Name
      ,E.Last_Name
      ,D.Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This inner join will return all rows that have a matching Dept_No in both tables. Which rows won't return?

An Inner Join returns matching rows, but did you know an Outer Join returns both matching rows and nonmatching rows? You will understand soon!


Answer to Quiz – Which Rows from Both Tables Won't Return?

SELECT E.First_Name
      ,E.Last_Name
      ,D.Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

1. Squiggy Jones has a NULL Dept_No.
2. Richard Smythe has an invalid Dept_No of 10.
3. No employees work in Department 500.

The bottom line is that the three rows excluded did not have a matching Dept_No.
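One way to list the excluded rows yourself is to outer join in each direction and keep only the rows where the other side came back NULL. The sketch below is not from the book: Emp and Dept are hypothetical inline copies of the sample tables, built with common table expressions so the statement is self-contained, and it assumes your system accepts WITH and UNION ALL in this form.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (Last_Name, Dept_No) AS (
    SELECT 'Jones', CAST(NULL AS INT) UNION ALL
    SELECT 'Smythe', 10 UNION ALL
    SELECT 'Chambers', 100 UNION ALL
    SELECT 'Coffing', 200 UNION ALL
    SELECT 'Smith', 200 UNION ALL
    SELECT 'Larkins', 300 UNION ALL
    SELECT 'Strickling', 400 UNION ALL
    SELECT 'Reilly', 400 UNION ALL
    SELECT 'Harrison', 400
),
Dept (Dept_No, Department_Name) AS (
    SELECT 100, 'Marketing' UNION ALL
    SELECT 200, 'Research and Dev' UNION ALL
    SELECT 300, 'Sales' UNION ALL
    SELECT 400, 'Customer Support' UNION ALL
    SELECT 500, 'Human Resources'
)
-- Employees whose Dept_No finds no department
SELECT 'Employee' AS Row_Source, E.Last_Name AS Row_Name
FROM Emp AS E
LEFT OUTER JOIN Dept AS D
ON E.Dept_No = D.Dept_No
WHERE D.Dept_No IS NULL
UNION ALL
-- Departments that no employee points to
SELECT 'Department', D.Department_Name
FROM Dept AS D
LEFT OUTER JOIN Emp AS E
ON D.Dept_No = E.Dept_No
WHERE E.Last_Name IS NULL ;
-- Returns three rows: Jones, Smythe, and Human Resources.
```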


LEFT OUTER JOIN

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

The 1st table after FROM is always the LEFT table. Since we are doing a Left Outer Join, the Employee_Table is referred to as the outer table.

This is a LEFT OUTER JOIN. That means that all rows from the LEFT table will appear in the report regardless of whether they find a match in the right table.


LEFT OUTER JOIN Results

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev

Nulls show mismatches. The matching rows return just like an inner join, but orphaned rows from the Left table also return.

A LEFT Outer Join returns all rows from the LEFT table, including all matches. If a LEFT row can't find a match, a NULL is placed in the right-table columns that were not found!
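If the NULLs are hard to read in a report, they can be swapped for a label. This is a minimal sketch, not the book's query: Emp and Dept are hypothetical inline copies of the sample tables, and the label 'No Department' is an arbitrary choice made for illustration.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (First_Name, Dept_No) AS (
    SELECT 'Squiggy', CAST(NULL AS INT) UNION ALL
    SELECT 'Richard', 10 UNION ALL
    SELECT 'Mandee', 100 UNION ALL
    SELECT 'Billy', 200 UNION ALL
    SELECT 'John', 200 UNION ALL
    SELECT 'Loraine', 300 UNION ALL
    SELECT 'Cletus', 400 UNION ALL
    SELECT 'William', 400 UNION ALL
    SELECT 'Herbert', 400
),
Dept (Dept_No, Department_Name) AS (
    SELECT 100, 'Marketing' UNION ALL
    SELECT 200, 'Research and Dev' UNION ALL
    SELECT 300, 'Sales' UNION ALL
    SELECT 400, 'Customer Support' UNION ALL
    SELECT 500, 'Human Resources'
)
SELECT E.First_Name
      ,COALESCE(D.Department_Name, 'No Department') AS Department_Name
FROM Emp AS E
LEFT OUTER JOIN Dept AS D
ON E.Dept_No = D.Dept_No ;
-- Nine rows. Squiggy and Richard show 'No Department';
-- the other seven show their real department name.
```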


RIGHT OUTER JOIN

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

The 2nd table after FROM is always the RIGHT table. Since we are doing a Right Outer Join, the Department_Table is referred to as the outer table.

This is a RIGHT OUTER JOIN. That means that all rows from the RIGHT table will appear in the report regardless of whether they find a match in the LEFT table.


RIGHT OUTER JOIN Example and Results

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

Nulls show mismatches. The matching rows return just like an inner join, but orphaned rows from the Right table also return.

All rows from the Right table were returned with matches, but since Dept_No 500 didn't have a match, the system put a NULL value in the Left table's columns.
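A RIGHT OUTER JOIN can always be rewritten as a LEFT OUTER JOIN by swapping the order of the two tables. The sketch below illustrates that equivalence under stated assumptions: Emp and Dept are hypothetical inline copies of the sample tables, not objects from the book.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (First_Name, Dept_No) AS (
    SELECT 'Squiggy', CAST(NULL AS INT) UNION ALL
    SELECT 'Richard', 10 UNION ALL
    SELECT 'Mandee', 100 UNION ALL
    SELECT 'Billy', 200 UNION ALL
    SELECT 'John', 200 UNION ALL
    SELECT 'Loraine', 300 UNION ALL
    SELECT 'Cletus', 400 UNION ALL
    SELECT 'William', 400 UNION ALL
    SELECT 'Herbert', 400
),
Dept (Dept_No, Department_Name) AS (
    SELECT 100, 'Marketing' UNION ALL
    SELECT 200, 'Research and Dev' UNION ALL
    SELECT 300, 'Sales' UNION ALL
    SELECT 400, 'Customer Support' UNION ALL
    SELECT 500, 'Human Resources'
)
-- Dept is now listed first, so it is the LEFT (outer) table.
SELECT E.First_Name
      ,D.Department_Name
FROM Dept AS D
LEFT OUTER JOIN Emp AS E
ON E.Dept_No = D.Dept_No ;
-- Eight rows: the seven matched employees, plus Human Resources
-- with a NULL First_Name. Squiggy and Richard do not appear.
```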


FULL OUTER JOIN

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Since we are doing a Full Outer Join, both tables are referred to as outer tables.

This is a FULL OUTER JOIN. That means that all rows from both the RIGHT and LEFT tables will appear in the report regardless of whether they find a match.


FULL OUTER JOIN Results

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

The FULL Outer Join returns all rows from both tables. NULLs show the flaws!
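Because a full outer join mixes three kinds of rows, it can help to label each one. The following is an illustrative sketch only: Emp and Dept are hypothetical inline copies of the sample tables, and the Match_Status column and its three labels are names invented for this example.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (First_Name, Dept_No) AS (
    SELECT 'Squiggy', CAST(NULL AS INT) UNION ALL
    SELECT 'Richard', 10 UNION ALL
    SELECT 'Mandee', 100 UNION ALL
    SELECT 'Billy', 200 UNION ALL
    SELECT 'John', 200 UNION ALL
    SELECT 'Loraine', 300 UNION ALL
    SELECT 'Cletus', 400 UNION ALL
    SELECT 'William', 400 UNION ALL
    SELECT 'Herbert', 400
),
Dept (Dept_No, Department_Name) AS (
    SELECT 100, 'Marketing' UNION ALL
    SELECT 200, 'Research and Dev' UNION ALL
    SELECT 300, 'Sales' UNION ALL
    SELECT 400, 'Customer Support' UNION ALL
    SELECT 500, 'Human Resources'
)
SELECT E.First_Name
      ,D.Department_Name
      ,CASE WHEN D.Dept_No IS NULL THEN 'Employee only'      -- no department found
            WHEN E.First_Name IS NULL THEN 'Department only' -- no employee found
            ELSE 'Matched'
       END AS Match_Status
FROM Emp AS E
FULL OUTER JOIN Dept AS D
ON E.Dept_No = D.Dept_No ;
-- Ten rows: 7 'Matched', 2 'Employee only' (Squiggy, Richard),
-- and 1 'Department only' (Human Resources).
```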


Which Tables are the Left and which Tables are Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
ON Cla.Subscriber_No = SUB.Subscriber_No
AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       __________
Providers    __________
Services     __________
Subscribers  __________
Addresses    __________

Can you list which tables above are left tables and which tables are right tables?


Answer - Which Tables are the Left and Which are the Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
ON Cla.Subscriber_No = SUB.Subscriber_No
AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       Left
Providers    Right
Services     Right
Subscribers  Right
Addresses    Right

There is always only one Left table (the first table after the FROM clause). All tables after the first table are each Right tables. Tables are joined two at a time, and the result from each join becomes the Left table for the next join.


INNER JOIN with Additional AND Clause

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E,
     Department_Table as D
WHERE E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

The additional AND is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows that don't qualify.


ANSI INNER JOIN with Additional AND Clause

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

The additional AND is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.


ANSI INNER JOIN with Additional WHERE Clause

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE Department_Name like 'Marke%' ;

The additional WHERE is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.


OUTER JOIN with Additional WHERE Clause

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE E.Dept_No = 100 ;

First_Name  Last_Name  Department_Name
__________  _________  _______________
Mandee      Chambers   Marketing

Only Mandee Chambers is in Dept_No 100.

The additional WHERE is performed last on Outer Joins. All rows will be joined first, and then the additional WHERE clause filters after the join takes place.


OUTER JOIN with Additional AND Clause

SELECT First_Name
      ,Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

The additional AND is performed in conjunction with the ON statement on Outer Joins. All rows will be evaluated with the ON clause and the AND combined.


OUTER JOIN with Additional AND Clause Results

SELECT First_Name
      ,Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

First_Name  Dname
__________  _________
Mandee      Marketing
Herbert     ?
William     ?
Loraine     ?
Squiggy     ?
Richard     ?
Cletus      ?
Billy       ?
John        ?

The additional AND is performed in conjunction with the ON statement on Outer Joins. This can surprise you. Only Mandee is in Dept_No 100, so she showed up as expected, but an outer join returns non-matches also. Ouch!!!


Quiz – Why is this considered an INNER JOIN?

SELECT First_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE D.Dept_No = 400 ;

This is considered an INNER JOIN because we are doing a LEFT OUTER JOIN on the Employee_Table and then filtering with the WHERE on a column in the right table! Unmatched employees carry a NULL D.Dept_No, so they fail the filter and only matching rows remain. If the same condition were written as an AND in the ON clause, every employee would still return, with NULLs for everyone outside Dept_No 400.
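Where a filter on a right-table column is placed decides whether a left outer join keeps its unmatched rows. The sketch below compares the two placements side by side. It is an illustration under stated assumptions: Emp and Dept are hypothetical inline copies of the sample tables, and the Variant, Rows_Returned, and Rows_With_Department column names are invented here.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (First_Name, Dept_No) AS (
    SELECT 'Squiggy', CAST(NULL AS INT) UNION ALL
    SELECT 'Richard', 10 UNION ALL
    SELECT 'Mandee', 100 UNION ALL
    SELECT 'Billy', 200 UNION ALL
    SELECT 'John', 200 UNION ALL
    SELECT 'Loraine', 300 UNION ALL
    SELECT 'Cletus', 400 UNION ALL
    SELECT 'William', 400 UNION ALL
    SELECT 'Herbert', 400
),
Dept (Dept_No, Department_Name) AS (
    SELECT 100, 'Marketing' UNION ALL
    SELECT 200, 'Research and Dev' UNION ALL
    SELECT 300, 'Sales' UNION ALL
    SELECT 400, 'Customer Support' UNION ALL
    SELECT 500, 'Human Resources'
)
-- Filter is part of the join condition: unmatched employees are kept.
SELECT 'Filter in ON' AS Variant
      ,COUNT(*) AS Rows_Returned
      ,COUNT(D.Department_Name) AS Rows_With_Department
FROM Emp AS E
LEFT OUTER JOIN Dept AS D
ON E.Dept_No = D.Dept_No
AND D.Dept_No = 400
UNION ALL
-- Filter runs after the join: rows with a NULL D.Dept_No are discarded.
SELECT 'Filter in WHERE'
      ,COUNT(*)
      ,COUNT(D.Department_Name)
FROM Emp AS E
LEFT OUTER JOIN Dept AS D
ON E.Dept_No = D.Dept_No
WHERE D.Dept_No = 400 ;
-- Filter in ON:    9 rows returned, 3 with a department
-- Filter in WHERE: 3 rows returned, 3 with a department
```

The WHERE version returns exactly what an INNER JOIN with the same filter would return.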


Evaluation Order for Outer Joins

SELECT Cou.*, STU1.*
FROM COURSE_TABLE Cou
LEFT OUTER JOIN STUDENT_COURSE_TABLE STU
ON Cou.Course_Id = STU.Course_Id
LEFT OUTER JOIN STUDENT_TABLE STU1
ON STU.Student_Id = STU1.Student_Id;

The order in which the server evaluates Outer Joins:

1. The first ON clause in the query (reading from left to right) is evaluated first.
2. Any ON clause applies to its immediately preceding join operation.
3. Parentheses can be used to override the natural left-to-right order.

When you perform an inner join, the Azure SQL Data Warehouse considers it to be both commutative and associative. That means the tables can be inner joined in any order and the end result will be the same, which allows the optimizer to select the best join order between tables. Outer Joins are different. They follow the three rules above for evaluation order.
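The effect of rule 3 is easiest to see when an inner join follows an outer join. The sketch below uses tiny made-up data, so everything in it is an assumption for illustration: the Course, Student_Course, and Student expressions are hypothetical stand-ins for the three tables above, and student 9 is deliberately enrolled without having a Student row.

```sql
-- All three inline tables and their rows are hypothetical.
WITH Course (Course_Id, Course_Name) AS (
    SELECT 100, 'SQL' UNION ALL
    SELECT 200, 'Physics'
),
Student_Course (Student_Id, Course_Id) AS (
    SELECT 1, 100 UNION ALL
    SELECT 9, 200            -- student 9 has no row in Student
),
Student (Student_Id, Last_Name) AS (
    SELECT 1, 'Hanson'
)
-- Left to right: the outer join runs first, then the inner join
-- throws away the Physics row because student 9 finds no match.
SELECT 'Left to right' AS Variant, COUNT(*) AS Row_Count
FROM Course AS Cou
LEFT OUTER JOIN Student_Course AS STU
ON Cou.Course_Id = STU.Course_Id
INNER JOIN Student AS STU1
ON STU.Student_Id = STU1.Student_Id
UNION ALL
-- Parenthesized: the inner join runs first, then every course
-- is preserved by the outer join.
SELECT 'Parenthesized', COUNT(*)
FROM Course AS Cou
LEFT OUTER JOIN (Student_Course AS STU
                 INNER JOIN Student AS STU1
                 ON STU.Student_Id = STU1.Student_Id)
ON Cou.Course_Id = STU.Course_Id ;
-- Left to right: 1 row
-- Parenthesized: 2 rows
```

The same tables and the same ON clauses give different answers, which is why the optimizer is not free to reorder outer joins the way it reorders inner joins.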


The DREADED Product Join

No Join Condition Linking the Two Tables!

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E,
     Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

This query becomes a Product Join because it does not possess any JOIN Conditions (Join Keys). Every row from one table is compared to every row of the other table, and quite often, the data is not what you intended to get back.


The DREADED Product Join Results

No Join Condition Linking the Two Tables!

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E,
     Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

First_Name  Last_Name   Department_Name
__________  __________  ________________________
Billy       Coffing     Customer Support
Billy       Coffing     Human Resources
Billy       Coffing     Marketing
Billy       Coffing     Research and Development
Cletus      Strickling  Customer Support
Cletus      Strickling  Human Resources
Cletus      Strickling  Marketing
Cletus      Strickling  Research and Development
Herbert     Harrison    Marketing

Not all rows are displayed.

36 rows came back: nine employees, each shown working in four different departments. This data is WRONG! How can Billy Coffing work in 4 different departments?

A Product Join is often a mistake! 4 Department rows had an 'm' in their name, so these were joined to every employee, and the information is worthless.


The Horrifying Cartesian Product Join

No WHERE Clause in the join!

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E,
     Department_Table as D

A Cartesian Product Join is usually a big mistake.

This joins every row from one table to every row of another table. 9 rows multiplied by 5 rows = 45 rows of complete nonsense!
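The arithmetic can be checked with a pair of counts. In this sketch, Emp and Dept are hypothetical inline copies of the sample tables (only the columns needed for counting), and the Variant and Row_Count names are invented for the example.

```sql
-- Emp and Dept are hypothetical stand-ins for Employee_Table and Department_Table.
WITH Emp (Employee_No, Dept_No) AS (
    SELECT 2000000, CAST(NULL AS INT) UNION ALL
    SELECT 1000234, 10 UNION ALL
    SELECT 1232578, 100 UNION ALL
    SELECT 1324657, 200 UNION ALL
    SELECT 1333454, 200 UNION ALL
    SELECT 2312225, 300 UNION ALL
    SELECT 1121334, 400 UNION ALL
    SELECT 2341218, 400 UNION ALL
    SELECT 1256349, 400
),
Dept (Dept_No) AS (
    SELECT 100 UNION ALL
    SELECT 200 UNION ALL
    SELECT 300 UNION ALL
    SELECT 400 UNION ALL
    SELECT 500
)
-- No join condition: every employee pairs with every department.
SELECT 'No join condition' AS Variant, COUNT(*) AS Row_Count
FROM Emp AS E, Dept AS D
UNION ALL
-- With the join condition, only real matches survive.
SELECT 'Joined on Dept_No', COUNT(*)
FROM Emp AS E, Dept AS D
WHERE E.Dept_No = D.Dept_No ;
-- No join condition: 45
-- Joined on Dept_No:  7
```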


The ANSI Cartesian Join will ERROR

No ON Clause in the join!

SELECT First_Name
      ,Last_Name
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D

Error

This query errors because the ANSI INNER JOIN syntax requires an ON clause. ANSI won't let this run unless a join condition is present.


Quiz – Do these Joins Return the Same Answer Set?

Query 1

SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2

SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result?


Answer – Do these Joins Return the Same Answer Set?

Query 1 (this query errors)

SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2 (a Cartesian product join occurs)

SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result? No. Query 1 errors because it uses ANSI syntax with no ON clause, but Query 2 Product Joins to bring back junk!


The CROSS JOIN

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111          8005.91
123552        31323134          5111.47
123585        87323456         15231.62
123777        57896883         23454.84

A Cross Join is the ANSI equivalent of a Product Join. Only a WHERE will work. ON will NOT!

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

This query becomes a Product Join because a Cross Join is an ANSI Product Join. It will compare every row from the Customer_Table to Order_Number 123456 in the Order_Table. Check out the Answer Set on the next page.


The CROSS JOIN Answer Set

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

Answer Set

Customer_Name        Order_Number
___________________  ____________
Billy’s Best Choice  123456
Acme Products        123456
ACE Consulting       123456
XYZ Plumbing         123456
Databases N-U        123456

Quite often, a Cross Join produces information that just isn’t worth anything!


The Self Join

Employee_Table2

Employee_No  Dept_No  Last_Name   First_Name  Salary    Mgr
___________  _______  __________  __________  ________  ___
1232578      100      Chambers    Mandee      48850.00  Y
1256349      400      Harrison    Herbert     54500.00  N
2341218      400      Reilly      William     36000.00  Y
1121334      400      Strickling  Cletus      54500.00  N
2312225      300      Larkins     Loraine     40200.00  Y
2000000      ?        Jones       Squiggy     32800.50  N
1000234      10       Smythe      Richard     32800.00  N
1324657      200      Coffing     Billy       41888.88  N
1333454      200      Smith       John        48000.00  Y

Which Workers make a bigger Salary than their Manager?

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps,
     Employee_Table2 as Mgrs
WHERE Emps.Dept_No = Mgrs.Dept_No
AND Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

A Self Join gives the same table 2 different aliases, which are then seen as two different tables.


The Self Join with ANSI Syntax

Employee_Table2

Employee_No  Dept_No  Last_Name   First_Name  Salary    Mgr
___________  _______  __________  __________  ________  ___
1232578      100      Chambers    Mandee      48850.00  Y
1256349      400      Harrison    Herbert     54500.00  N
2341218      400      Reilly      William     36000.00  Y
1121334      400      Strickling  Cletus      54500.00  N
2312225      300      Larkins     Loraine     40200.00  Y
2000000      ?        Jones       Squiggy     32800.50  N
1000234      10       Smythe      Richard     32800.00  N
1324657      200      Coffing     Billy       41888.88  N
1333454      200      Smith       John        48000.00  Y

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps
INNER JOIN Employee_Table2 as Mgrs
ON Emps.Dept_No = Mgrs.Dept_No
WHERE Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

Which Workers make a bigger Salary than their Manager?

A Self Join gives the same table two different aliases, so the query treats it as two different tables.
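To try the self join without loading any tables, here is a minimal sketch. It builds a handful of rows modeled on Employee_Table2 inside a common table expression; the CTE name Emp and the choice of rows are assumptions made for illustration, not part of the book's sample database.

```sql
-- Hypothetical stand-in for Employee_Table2: five illustrative rows.
WITH Emp (Employee_No, Dept_No, Last_Name, Salary, Mgr) AS
(
    SELECT 1256349, 400, 'Harrison',   54500.00, 'N' UNION ALL
    SELECT 2341218, 400, 'Reilly',     36000.00, 'Y' UNION ALL
    SELECT 1121334, 400, 'Strickling', 54500.00, 'N' UNION ALL
    SELECT 1324657, 200, 'Coffing',    41888.88, 'N' UNION ALL
    SELECT 1333454, 200, 'Smith',      48000.00, 'Y'
)
SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name AS MgrName
      ,Mgrs.Salary    AS MgrSal
      ,Emps.Last_Name AS EmpName
      ,Emps.Salary    AS EmpSal
FROM Emp AS Emps
INNER JOIN Emp AS Mgrs
        ON Emps.Dept_No = Mgrs.Dept_No   -- pair each worker with rows from the same department
WHERE Mgrs.Mgr = 'Y'                     -- keep only pairs where the Mgrs side is a manager
  AND Emps.Salary > Mgrs.Salary ;        -- worker earns more than that manager
```

With these five rows, only Harrison and Strickling qualify, because each earns more than Reilly, the manager of department 400. Coffing earns less than Smith, so department 200 returns nothing.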


Quiz – Will both queries bring back the same Answer Set?

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?


Answer – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? Yes! Both are inner joins, so it makes no difference whether the Customer_Name filter sits in the WHERE clause or in the ON clause.


Quiz – Will both queries bring back the same Answer Set?

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?


Answer – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? NO! The WHERE is performed last, after the outer join, so the first query returns only the rows for Billy’s Best Choice. In the second query the filter is part of the ON clause, so it only decides which rows match. Every customer is still returned, and the customers that are not like 'Billy%' come back with NULLs in the Order_Table columns.
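The difference between filtering in the WHERE clause and filtering in the ON clause of a LEFT OUTER JOIN is easiest to see on a tiny data set. The sketch below is self-contained: the CTE names Cust and Ord, the Variant label column, and the reduced row set are assumptions for illustration, loosely modeled on Customer_Table and Order_Table.

```sql
-- Hypothetical miniature versions of Customer_Table and Order_Table.
WITH Cust (Customer_Number, Customer_Name) AS
(
    SELECT 11111111, 'Billy''s Best Choice' UNION ALL
    SELECT 31313131, 'Acme Products'
),
Ord (Order_Number, Customer_Number) AS
(
    SELECT 123456, 11111111 UNION ALL
    SELECT 123512, 11111111
)
-- Variant 1: the filter is in WHERE, so it runs after the join
-- and removes the Acme Products row entirely.
SELECT 'WHERE filter' AS Variant, Cust.Customer_Name, Ord.Order_Number
FROM Cust
LEFT OUTER JOIN Ord
       ON Cust.Customer_Number = Ord.Customer_Number
WHERE Cust.Customer_Name LIKE 'Billy%'
UNION ALL
-- Variant 2: the filter is in ON, so it only limits which rows can match.
-- Acme Products is preserved with a NULL Order_Number.
SELECT 'ON filter', Cust.Customer_Name, Ord.Order_Number
FROM Cust
LEFT OUTER JOIN Ord
       ON Cust.Customer_Number = Ord.Customer_Number
      AND Cust.Customer_Name LIKE 'Billy%' ;
```

On this data the WHERE variant contributes two rows (both orders for Billy's Best Choice), while the ON variant contributes three: the same two rows plus Acme Products with a NULL order number.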


How would you Join these two tables?

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

How would you join these two tables together? You can't do it. There is no matching column with like data. There is no Primary Key/Foreign Key relationship between these two tables. That is why you are about to be introduced to a bridge table. It is formally called an Associative table or a Lookup table.


An Associative Table is a Bridge that Joins Two Tables

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table (the Associative Table)

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

The Associative Table is a bridge between the Course_Table and Student_Table.


Quiz – Can you write the 3-Table Join?

This quiz uses the same three tables shown above: the Course_Table, the Student_Table, and the Student_Course_Table (the Associative Table) that bridges them.

SELECT ALL Columns from the Course_Table and Student_Table and Join them.


Answer to Quiz – Can you Write the 3-Table Join?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

Notice the * technique of getting ALL columns from both tables!

The Associative Table is a bridge between the Course_Table and Student_Table, and its sole purpose is to join these two tables together.


Quiz – Can you write the 3-Table Join to ANSI Syntax?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

Please re-write the above query using ANSI Syntax.


Answer – Can you Write the 3-Table Join to ANSI Syntax?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

Traditional Syntax

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

ANSI Syntax

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID;

The above queries show both the traditional and the ANSI form of this three table join.


Quiz – Can you Place the ON Clauses at the End?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

ANSI Syntax

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID;

Please re-write the above query and place both ON Clauses at the end.


Answer – Can you Place the ON Clauses at the End?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID
ON SC.Student_ID = S.Student_ID;

The trick is to put the first ON clause for the last join and go backwards.

This is tricky. The only way it works is to place the ON clauses in reverse order. The first ON clause belongs to the last INNER JOIN, and each ON clause after it moves backwards through the joins.
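Stacked ON clauses are easier to follow once the nesting is made visible. The sketch below is self-contained and uses three small CTEs (Student, Student_Course, and Course, with a reduced set of columns and rows chosen for illustration) in place of the book's tables. It writes the same kind of three table join with parentheses around the inner join, which is how the ON-clauses-at-the-end form is resolved.

```sql
-- Hypothetical miniature tables; names and rows are for illustration only.
WITH Student (Student_ID, Last_Name) AS
(
    SELECT 231222, 'Wilson' UNION ALL
    SELECT 280023, 'McRoberts'
),
Student_Course (Student_ID, Course_ID) AS
(
    SELECT 231222, 210 UNION ALL
    SELECT 280023, 210 UNION ALL
    SELECT 231222, 220
),
Course (Course_ID, Course_Name) AS
(
    SELECT 210, 'Advanced SQL' UNION ALL
    SELECT 220, 'V2R3 SQL Features'
)
SELECT S.*, C.*
FROM Student AS S
INNER JOIN
     (Student_Course AS SC
      INNER JOIN Course AS C
              ON C.Course_ID = SC.Course_ID)   -- the last join gets the first ON clause
  ON SC.Student_ID = S.Student_ID ;            -- the outer join gets the final ON clause
```

Removing the parentheses leaves exactly the stacked form, with both ON clauses at the end. For inner joins the result is the same as the conventional layout, so the choice is one of readability.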


The 5-Table Join – Logical Insurance Model

Addresses:    Subscriber_No
Subscribers:  Subscriber_No, Member_No
Claims:       Subscriber_No, Member_No, Claim_Service, Provider_No
Services:     Service_Code
Providers:    Provider_Code

Above is the logical model for the insurance tables showing the Primary Key and Foreign Key relationships (PK/FK).


Quiz - Write a Five Table Join Using ANSI Syntax

Addresses:    Subscriber_No
Subscribers:  Subscriber_No, Member_No
Claims:       Subscriber_No, Member_No, Claim_Service, Provider_No
Services:     Service_Code
Providers:    Provider_Code

Your mission is to write a five table join selecting all columns using ANSI syntax.


Answer - Write a Five Table Join Using ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON cla1.Claim_Service = ser1.Service_Code ;

Above is the five table join written using ANSI syntax.


Quiz - Write a Five Table Join Using Non-ANSI Syntax

Addresses:    Subscriber_No
Subscribers:  Subscriber_No, Member_No
Claims:       Subscriber_No, Member_No, Claim_Service, Provider_No
Services:     Service_Code
Providers:    Provider_Code

Your mission is to write a five table join selecting all columns using Non-ANSI syntax.


Answer - Write a Five Table Join Using Non-ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1,
     SUBSCRIBERS AS sub1,
     ADDRESSES AS add1,
     PROVIDERS AS pro1,
     SERVICES AS ser1
WHERE cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
AND sub1.Subscriber_No = add1.Subscriber_No
AND cla1.Provider_No = pro1.Provider_Code
AND cla1.Claim_Service = ser1.Service_Code ;

Above is the five table join written using Non-ANSI syntax.


Quiz – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON cla1.Claim_Service = ser1.Service_Code ;

Above is the five table join written using ANSI syntax. Re-write it so that all of the ON clauses come at the end.


Answer – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM PROVIDERS AS pro1
INNER JOIN ADDRESSES AS add1
INNER JOIN SUBSCRIBERS AS sub1
INNER JOIN SERVICES AS ser1
INNER JOIN CLAIMS as cla1
ON cla1.Claim_Service = ser1.Service_Code
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
ON sub1.Subscriber_No = add1.Subscriber_No
ON cla1.Provider_No = pro1.Provider_Code ;

Above is the five table join written using ANSI syntax with the ON clauses at the end. We also had to move the tables around to make this happen. Notice that the first ON clause represents the last two tables being joined, and then it works backwards.


Chapter 11 – Date Functions

"An inch of time cannot be bought with an inch of gold." - Chinese Proverb


Current_Timestamp

Above is the keyword Current_Timestamp, which allows a user to get the current timestamp. It is a reserved word, so the system delivers the timestamp to you whenever it is requested.
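As a minimal sketch of the keyword in use (the column alias is an assumption chosen for illustration), the statement below returns the current date and time as a single DATETIME value. No table is needed.

```sql
-- CURRENT_TIMESTAMP is a keyword, so it takes no parentheses.
SELECT CURRENT_TIMESTAMP AS [Current Timestamp] ;
```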


Getdate

This example uses the Getdate() function to return the timestamp.

SELECT Getdate() as "The Date";

The Date
----------------------
03/30/2015 8:46:04.567

“Not all who wander are lost.” – J. R. R. Tolkien

The Getdate function returns today's date and time, just like the Current_Timestamp keyword. Getdate is not ANSI standard.


Date and Time Keywords

SELECT GETDATE() AS [GETDATE]
      ,CURRENT_TIMESTAMP AS [CURRENT_TIMESTAMP]
      ,GETUTCDATE() AS [GETUTCDATE]

GETDATE                 CURRENT_TIMESTAMP       GETUTCDATE
03/30/2015 8:42:04.833  03/30/2015 8:42:04.833  03/30/2015 1:42:04.833
Date and Time           Date and Time ANSI      Date and Time UTC

SELECT SYSDATETIME() AS [SYSDATETIME]
      ,SYSUTCDATETIME() AS [SYSUTCDATETIME]

SYSDATETIME                  SYSUTCDATETIME
2015-03-30 08:42:04.8355769  2015-03-30 13:42:04.8355769
Date and Time                Date and Time UTC

The above examples show how to get the date and time. GETDATE and CURRENT_TIMESTAMP are equivalent, but CURRENT_TIMESTAMP is ANSI compliant. The difference between the top and bottom examples is the data type returned. The top functions return DATETIME and the bottom functions return DATETIME2, which is an expanded form of DATETIME with more fractional-second precision.


SYSDATETIMEOFFSET Provides the Timezone Offset

SELECT SYSUTCDATETIME() AS [SYSUTCDATETIME]
      ,SYSDATETIMEOFFSET() AS [SYSDATETIMEOFFSET];

SYSUTCDATETIME               SYSDATETIMEOFFSET
2015-03-30 13:42:04.8355769  2015-03-30 08:42:04.8355769 -05:00
Date and Time UTC            Date and Time with a Timezone offset

The SYSUTCDATETIME function provides the current timestamp in Coordinated Universal Time (UTC). The SYSDATETIMEOFFSET function provides the local timestamp together with the timezone difference between UTC and local time.


SYSDATETIMEOFFSET Provides the Timezone Offset

This is how you can get just the current date and the current time.
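The query for this page did not survive in the text, so the following is only a sketch of one common way to do it, not necessarily the author's statement. It casts the current timestamp to the DATE and TIME data types; the column aliases are assumptions.

```sql
-- Split the current timestamp into its date portion and its time portion.
SELECT CAST(CURRENT_TIMESTAMP AS DATE) AS [Current Date]
      ,CAST(CURRENT_TIMESTAMP AS TIME) AS [Current Time] ;
```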


Using both CAST and CONVERT in Literal Values

SELECT CAST('20150216' AS DATE) as "Date YMD";

Date YMD
----------
2015-02-16

SELECT CONVERT(CHAR(8), CURRENT_TIMESTAMP, 112) AS "Converted" ;

Converted
---------
20150330

The first SQL example uses the CAST function with a date literal. It converts the character string literal ‘20150216’ to a DATE data type. The second SQL example uses the CONVERT function to change the current date and time to a CHAR(8) data type using style 112, which is the 'YYYYMMDD' format.


Using Both CAST and CONVERT in Literal Values

This example converts the current date and time value to CHAR(12) by using style 114, the 24-hour time format 'hh:mi:ss:mmm'.
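A minimal sketch of that conversion, modeled on the style 112 example earlier in this chapter (the column alias is an assumption):

```sql
-- Style 114 keeps only the time portion, in 24-hour form with milliseconds.
SELECT CONVERT(CHAR(12), CURRENT_TIMESTAMP, 114) AS [Converted] ;
```

CHAR(12) is just wide enough for two-digit hours, minutes, and seconds, three digits of milliseconds, and the three separators.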


The SWITCHOFFSET Function

SELECT SYSDATETIME() as "Local Time Eastern"
      ,SWITCHOFFSET(SYSDATETIMEOFFSET(), '-06:00') as "Timestamp Central"
      ,SWITCHOFFSET(SYSDATETIMEOFFSET(), '-07:00') as "Timestamp Mountain"
      ,SWITCHOFFSET(SYSDATETIMEOFFSET(), '-08:00') as "Timestamp Pacific" ;

Local Time Eastern   2015-03-30 11:03:38.9877064
Timestamp Central    2015-03-30 10:03:38.9877064 -06:00
Timestamp Mountain   2015-03-30 09:03:38.9877064 -07:00
Timestamp Pacific    2015-03-30 08:03:38.9877064 -08:00

The times above are the converted times, but they are displayed vertically to save space on the screen.

The SWITCHOFFSET function can be used to adjust an input DATETIMEOFFSET value to a specified time zone. The example SQL above shows how to convert to Central, Mountain and Pacific time.


The DATEADD Function

The syntax for the DATEADD function is DATEADD (part, n, date_value). Valid values for the part input include year, quarter, month, dayofyear, day, week, weekday, hour, minute, second, millisecond, microsecond, and nanosecond. You can also specify the part in abbreviated form, such as yy instead of year.
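The syntax above can be tried on a literal date. This sketch uses May 4, 1998 (the first Order_Date in the Order_Table examples later in this chapter); the column aliases are assumptions chosen for illustration.

```sql
-- DATEADD (part, n, date_value): n may be negative, and part may be abbreviated.
SELECT DATEADD(day,   60, CAST('19980504' AS DATE)) AS [Plus 60 Days]    -- 1998-07-03
      ,DATEADD(month,  1, CAST('19980504' AS DATE)) AS [Plus 1 Month]    -- 1998-06-04
      ,DATEADD(yy,    -1, CAST('19980504' AS DATE)) AS [Minus 1 Year] ;  -- 1997-05-04
```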


The DATEDIFF Function

The syntax for the DATEDIFF function is DATEDIFF (part, dt_val1, dt_val2). Above, we have used the literal dates of '2014-01-30' (January 30, 2014) and '2015-06-30' (June 30, 2015). We can then see the differences in the number of years, months, days, hours, minutes and seconds.
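The query for this page did not survive in the text, so here is a minimal sketch using the same two literal dates. The column aliases are assumptions. Note that DATEDIFF counts the part boundaries crossed between the two values, which is why the year difference is 1 even though 17 months separate the dates.

```sql
-- DATEDIFF (part, start, end) counts the part boundaries crossed.
SELECT DATEDIFF(year,  CAST('20140130' AS DATE), CAST('20150630' AS DATE)) AS [Years]   -- 1
      ,DATEDIFF(month, CAST('20140130' AS DATE), CAST('20150630' AS DATE)) AS [Months]  -- 17
      ,DATEDIFF(day,   CAST('20140130' AS DATE), CAST('20150630' AS DATE)) AS [Days] ;  -- 516
```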


DATEADD Function

SELECT Order_Date
      ,DateAdd (Day, 60, Order_Date) as "Due Date"
      ,Order_Total
      ,DateAdd (Day, 50, Order_Date) as Discount
      ,Cast(Order_Total * .98 as Decimal(8,2)) as Discount_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date    Order_Total  Discount    Discount_Total
__________  __________  ___________  __________  ______________
05/04/1998  07/03/1998  12347.53     06/23/1998  12,100.58
01/01/1999  03/02/1999  8005.91      02/20/1999  7,845.79
09/09/1999  11/08/1999  23454.84     10/29/1999  22,985.74
10/01/1999  11/30/1999  5111.47      11/20/1999  5,009.24
10/10/1999  12/09/1999  15231.62     11/29/1999  14,926.99

Valid values for the part argument of DATEADD include year, quarter, month, dayofyear, day, week, weekday, hour, minute, second, millisecond, microsecond, and nanosecond. TZoffset and ISO_WEEK are valid parts for DATEPART, but not for DATEADD.


A Real World Example for DateAdd Using the Order Table

SELECT Order_Date
      ,DateAdd (Day, 60, Order_Date) as "Due Date"
      ,Order_Total
      ,DateAdd (Day, 50, Order_Date) as Discount
      ,Cast(Order_Total * .98 as Decimal(8,2)) as Discount_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date    Order_Total  Discount    Discount_Total
__________  __________  ___________  __________  ______________
05/04/1998  07/03/1998  12347.53     06/23/1998  12,100.58
01/01/1999  03/02/1999  8005.91      02/20/1999  7,845.79
09/09/1999  11/08/1999  23454.84     10/29/1999  22,985.74
10/01/1999  11/30/1999  5111.47      11/20/1999  5,009.24
10/10/1999  12/09/1999  15231.62     11/29/1999  14,926.99

The example above uses a real world example from the Order_Table.


DATEPART Function

SELECT Order_Date
      ,DateAdd (Day, 60, Order_Date) as "Due Date"
      ,Order_Total
      ,DateAdd (Day, 50, Order_Date) as Discount
      ,Cast(Order_Total * .98 as Decimal(8,2)) as Discount_Total
FROM Order_Table
WHERE DATEPART(Month, Order_Date) = 10
ORDER BY 1 ;

Order_Date  Due Date    Order_Total  Discount    Discount_Total
__________  __________  ___________  __________  ______________
10/01/1999  11/30/1999  5111.47      11/20/1999  5,009.24
10/10/1999  12/09/1999  15231.62     11/29/1999  14,926.99

This example only looks for orders that happened in October. This is done by using the DATEPART function in the WHERE clause. Valid values for the part argument include year, quarter, month, dayofyear, day, week, weekday, hour, minute, second, millisecond, microsecond, nanosecond, TZoffset, and ISO_WEEK.


DATEPART Function Examples

SELECT * FROM Order_Table WHERE DATEPART(Year, Order_Date) = 1998 ;

Year = 1998

SELECT * FROM Order_Table WHERE DATEPART(Quarter, Order_Date) = 4 ;

Quarter = 4th

SELECT * FROM Order_Table WHERE DATEPART(Month, Order_Date) = 10 ;

Month = October

SELECT * FROM Order_Table WHERE DATEPART(Day, Order_Date) = 4 ;

Day = 4th day of the month

SELECT * FROM Order_Table WHERE DATEPART(DayofYear, Order_Date) = 1 ;

Day of year = January 1st

SELECT * FROM Order_Table WHERE DATEPART(Week, Order_Date) = 1 ;

Week = 1st week of year

SELECT * FROM Order_Table WHERE DATEPART(WeekDay, Order_Date) = 1 ;

Week Day = Sunday (with the default U.S. English DATEFIRST setting)

Above are some excellent examples of the DATEPART function to pull from.


YEAR, MONTH, and DAY Functions

SELECT Order_Date
      ,Year(Order_Date) as "Yr"
      ,Month(Order_Date) as "Mo"
      ,Day(Order_Date) as "Day"
FROM Order_Table
ORDER BY 1 ;

Order_Date  Yr    Mo  Day
__________  ____  __  ___
1998-05-04  1998  5   4
1999-01-01  1999  1   1
1999-09-09  1999  9   9
1999-10-01  1999  10  1
1999-10-10  1999  10  10

The YEAR, MONTH, and DAY functions are shorthand for the DATEPART function with the year, month, and day parts.


A Better Technique for YEAR, MONTH, and DAY Functions

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE YEAR(order_date) = 1999
AND MONTH(order_date) = 10;

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE order_date >= '19991001'
AND order_date < '19991101' ;

The second approach is more efficient for SQL Server and Azure SQL Data Warehouse. Indexes can take advantage of this technique!

Both queries above do the same thing and deliver the same result set, but the bottom query could be much faster.

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123552        31323134         1999-10-01  5111.47
123585        87323456         1999-10-10  15231.62

Above is the tale of two queries. The top query wraps the filtered column in a function. In most cases the Azure SQL Data Warehouse can’t use an index efficiently when a query manipulates the column this way. The bottom query compares the bare column against a date range instead.


DATENAME Function

SELECT Order_Date
      ,DATENAME(Year, Order_Date) as "Yr"
      ,DATENAME(Month, Order_Date) as "Mo"
      ,DATENAME(Day, Order_Date) as "Day"
FROM Order_Table
ORDER BY 1 ;

Order_Date  Yr    Mo         Day
__________  ____  _________  ___
1998-05-04  1998  May        4
1999-01-01  1999  January    1
1999-09-09  1999  September  9
1999-10-01  1999  October    1
1999-10-10  1999  October    10

The DATENAME function returns the requested part as a character string rather than a number. Notice above that only the Month returns an actual name. The Year and the Day still come back as digits, although they are returned as character data.


ISDATE Function

The ISDATE function accepts a character string as input and returns a 1 or a 0. ISDATE returns a 1 if the string is convertible to a date and time data type. It returns a 0 if it is not convertible to a date and time data type. Above, we have used the date of February 29th. This is only a valid date during a leap year, so ISDATE only returns a 1 for February 29th when the year is a leap year.
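The leap year check can be sketched with two literals. The query for this page did not survive in the text, so the specific years and column aliases below are assumptions, and the sketch assumes ISDATE behaves as it does in SQL Server. The unseparated 'YYYYMMDD' form is used so the result does not depend on the session's date format setting.

```sql
-- February 29th exists in 2016 (a leap year) but not in 2015.
SELECT ISDATE('20160229') AS [Leap Year]         -- 1
      ,ISDATE('20150229') AS [Not A Leap Year] ; -- 0
```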


Chapter 12 - Temporary Tables

“Graffiti’s always been a temporary art form. You make your mark and then they scrub it off.” - Banksy


Temporary Tables

Derived Tables
• Is a SELECT Statement within a SELECT Statement
• Is purely logical as opposed to physical
• Exists only within a query
• Has its execution optimized at run time

Temporary Table
• Is always created as #tablename
• Space comes from tempdb
• Can only be used by the connection that created the table
• Can be created by the User, then populated with an INSERT/SELECT
• Table and Data are deleted after the connection that created the table is closed

Derived tables exist for the life of a single query, whereas local temporary tables are stored in the tempdb database by the Azure SQL Data Warehouse system. A local temporary table is created using a pound sign (#) prefix before the table name. Each temporary table can only be accessed by the user who created it, and only in the session that created it.
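The temporary table life cycle described above can be sketched as follows. The table name #Dept_Avg, its columns, and the literal salary rows are assumptions chosen for illustration, and the sketch assumes the default table options are acceptable on your system.

```sql
-- 1. Create the local temporary table (note the # prefix).
CREATE TABLE #Dept_Avg
(
    Dept_No INT
   ,AVGSAL  DECIMAL(10,2)
) ;

-- 2. Populate it with an INSERT/SELECT. The inner SELECT is a small
--    hypothetical stand-in for an employee table.
INSERT INTO #Dept_Avg (Dept_No, AVGSAL)
SELECT Dept_No, AVG(Salary)
FROM (SELECT 200 AS Dept_No, 41888.88 AS Salary UNION ALL
      SELECT 200,            48000.00           UNION ALL
      SELECT 400,            36000.00) AS E
GROUP BY Dept_No ;

-- 3. Use it like any other table, but only from this session.
SELECT Dept_No, AVGSAL
FROM #Dept_Avg
ORDER BY Dept_No ;

-- 4. Optional: drop it early instead of waiting for the connection to close.
DROP TABLE #Dept_Avg ;
```

Unlike a derived table, #Dept_Avg survives across statements, so several later queries in the same session can reuse it.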


CREATING A Derived Table

• Is a SELECT Statement within a SELECT Statement
• Is purely logical as opposed to physical
• Exists only within a query
• Has its execution optimized at run time along with the rest of the query

SELECT *
FROM (SELECT AVG(salary)
      FROM Employee_Table) AS TeraTom(AVGSAL) ;

A query within a query.

Answer Set

AVGSAL
________
46782.15

The SELECT Statement that creates and populates the Derived table is always inside Parentheses.


Naming the Derived Table

SELECT *
FROM (SELECT AVG(salary)
      FROM Employee_Table) AS TeraTom(AVGSAL) ;

The name of the Derived Table is TeraTom

Answer Set

AVGSAL
________
46782.15

In the example above, TeraTom is the name we gave the Derived Table. It is mandatory that you always name the derived table, or the query errors.


Aliasing the Column Names in the Derived Table

SELECT *
FROM (SELECT AVG(salary)
      FROM Employee_Table) AS TeraTom(AVGSAL) ;

AVGSAL is the Column Name in the derived table named TeraTom

Answer Set

AVGSAL
________
46782.15

AVGSAL is the name we gave to the column in our Derived Table that we call TeraTom. Our SELECT (which builds the columns) shows we are only going to have one column in our derived table, and we have named that column AVGSAL.


Multiple Ways to Alias the Columns in a Derived Table

1

SELECT *
FROM (SELECT AVG(salary)
      FROM Employee_Table) AS TeraTom(AVGSAL) ;

The derived table must always be named. Alias CAN be done here, in the column list that follows the table name.

2

SELECT *
FROM (SELECT AVG(salary) AS AVGSAL
      FROM Employee_Table) AS TeraTom ;

The derived table must always be named. Alias CAN be done inside the derived SELECT statement.


CREATING a Derived Table using the WITH Command

Create the Derived Table while we run the query!

WITH TeraTom(AVGSAL) AS
   (SELECT AVG(salary) FROM Employee_Table)
SELECT *
FROM TeraTom ;

Answer Set

AVGSAL
________
46782.15

When using the WITH Command, we can CREATE our Derived table while running the main query.


The Same Derived Query shown Three Different Ways

1

SELECT *
FROM (SELECT AVG(salary)
      FROM Employee_Table) TeraTom (AVGSAL) ;

2

SELECT *
FROM (SELECT AVG(salary) as AVGSAL
      FROM Employee_Table) TeraTom ;

3

WITH TeraTom(AVGSAL) AS
   (SELECT AVG(salary) FROM Employee_Table)
SELECT *
FROM TeraTom ;

The column alias CAN be done in the column list after the table name (examples 1 and 3) or inside the derived SELECT (example 2).


MULTIPLE Derived Tables using the WITH Command

WITH WellPaid(Employee_No, Last_Name) AS
   (SELECT Employee_No, Last_Name
    FROM Employee_Table
    WHERE Salary > (SELECT AVG(Salary) FROM Employee_Table))
,DeptMgr(Mgr_No, Department_Name) AS
   (SELECT Mgr_No, Department_Name
    FROM Department_Table
    INNER JOIN WellPaid
    ON (Employee_No = Mgr_No))
SELECT Last_Name AS WellPaidMgr
      ,Department_Name
FROM WellPaid
INNER JOIN DeptMgr
ON (Employee_No = Mgr_No) ;

WellPaid is the 1st Derived Table and DeptMgr is the 2nd Derived Table.

Using the WITH Command, we can CREATE multiple Derived tables that can be referenced elsewhere in the query.


Column Alias Can Default For Normal Columns

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

I don't need to alias Dept_No because it can default to its current name.

TeraTom

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

The derived table is built first.

In a derived table, you will always have a SELECT query in parentheses, and you will always name the table. You have options when aliasing the columns. As in the example above, you can let normal columns default to their current name.


Most Derived Tables Are Used To Join To Other Tables

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary)
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom (Dept_No, AVGSAL)
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

The SELECT in parentheses is the Derived Table. The derived table name is TeraTom. The columns are aliased as Dept_No and AVGSAL.

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
___________  _______  __________  __________  ________  ________
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33

The first five columns in the Answer Set came from the Employee_Table. AVGSAL came from the derived table named TeraTom.


A Join Example Showing Different Column Alias Styles

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No as Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

Dept_No does not need an alias because it can default to its current name. AVG(salary) must have an alias because it is an aggregate.

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
___________  _______  __________  __________  ________  ________
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33


The Three Components of a Derived Table

SELECT E.*, Salary, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

The derived table is optimized with the rest of the query.

1  A derived table is a SELECT query. The SELECT query always starts with an open parenthesis and ends with a close parenthesis.

2  The derived table must be given a name. Above we called our derived table TeraTom.

3  You will need to define (alias) the columns in the derived table. Above we could allow Dept_No to default to Dept_No, but we had to specifically alias AVG(Salary) as AVGSAL.

Every derived table must have the three components listed above.

Chapter 12

Temporary Tables

Visualize This Derived Table

SELECT E.*, (Salary - AVGSAL) as PlusMinAvg
FROM Employee_Table as E
INNER JOIN
    (SELECT Dept_No, AVG(salary) as AVGSAL
     FROM Employee_Table
     GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

The derived table is built first:

TeraTom

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Answer Set

Employee_No  Dept_No  Last_Name   First_Name  Salary    PlusMinAvg
___________  _______  __________  __________  ________  __________
1000234      10       Smythe      Richard     64300.00  0.00
1232578      100      Chambers    Mandee      48850.00  0.00
1324657      200      Coffing     Billy       41888.88  -3055.56
1333454      200      Smith       John        48000.00  3055.56
2312225      300      Larkins     Loraine     40200.00  0.00
1121334      400      Strickling  Cletus      54500.00  6166.67
1256349      400      Harrison    Herbert     54500.00  6166.67
2341218      400      Reilly      William     36000.00  -12333.33

Our example above shows the data in the derived table named TeraTom. This query lets us see each employee and how far their salary sits above or below the average salary of their department.
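The PlusMinAvg calculation can be tried end to end with a small script. This is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine rather than Azure SQL Data Warehouse. The rows are the Employee_Table from this chapter, and ROUND is added here only to keep the floating-point output tidy.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; table and data follow the chapter.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employee_Table (
    Employee_No INTEGER, Dept_No INTEGER, Last_Name TEXT,
    First_Name TEXT, Salary REAL)""")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)",
    [
        (2000000, None, "Jones", "Squiggy", 32800.50),
        (1000234, 10, "Smythe", "Richard", 64300.00),
        (1232578, 100, "Chambers", "Mandee", 48850.00),
        (1324657, 200, "Coffing", "Billy", 41888.88),
        (1333454, 200, "Smith", "John", 48000.00),
        (2312225, 300, "Larkins", "Loraine", 40200.00),
        (1121334, 400, "Strickling", "Cletus", 54500.00),
        (2341218, 400, "Reilly", "William", 36000.00),
        (1256349, 400, "Harrison", "Herbert", 54500.00),
    ],
)

# The derived table TeraTom holds one average per department; the outer
# query joins each employee to the average of their own department.
rows = conn.execute("""
    SELECT E.Last_Name, E.Dept_No,
           ROUND(E.Salary - TeraTom.AVGSAL, 2) AS PlusMinAvg
    FROM Employee_Table AS E
    INNER JOIN
        (SELECT Dept_No, AVG(Salary) AS AVGSAL
         FROM Employee_Table
         GROUP BY Dept_No) AS TeraTom
    ON E.Dept_No = TeraTom.Dept_No
    ORDER BY E.Dept_No, E.Last_Name
""").fetchall()

for last_name, dept_no, plus_min_avg in rows:
    print(last_name, dept_no, plus_min_avg)
```

Jones does not appear in the output even though TeraTom has a row for the NULL department, because a NULL Dept_No never satisfies the equality in the ON clause.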


Our Join Example With The WITH Syntax

WITH TeraTom (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Now, the lower portion of the query refers to TeraTom almost like it is a permanent table, but it is not!

Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table?
2) How many columns are in the derived table?
3) What are the names of the derived table columns?
4) Is there more than one row in the derived table?
5) What common keys join the Employee_Table and the derived table?
6) Why were the join keys named differently?

Answer to Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table? TeraTom
2) How many columns are in the derived table? 2
3) What are the names of the derived table columns? Depty and AVGSAL
4) Is there more than one row in the derived table? Yes
5) What keys join the tables? Dept_No and Depty
6) Why were the join keys named differently? If both were named Dept_No, the query would fail unless we fully qualified the column names.

Clever Tricks on Aliasing Columns in a Derived Table

1) Alias here: Dept_No as Depty, inside the derived table.

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
    (SELECT Dept_No as Depty, AVG(Salary) as AVGSAL
     FROM Employee_Table
     GROUP BY Dept_No) as TeraTom
ON Dept_No = Depty ;

2) Alias here: Employee_Table as E, so both Dept_No columns can be fully qualified.

SELECT E.Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table as E
INNER JOIN
    (SELECT Dept_No, AVG(Salary) as AVGSAL
     FROM Employee_Table
     GROUP BY Dept_No) as TeraTom
ON E.Dept_No = TeraTom.Dept_No ;

A Derived Table lives only for the lifetime of a single query

1) First query

WITH T (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT T.Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table as E
INNER JOIN T
ON E.Dept_No = T.Dept_No ;

The semi-colon (;) indicates the end of the query.

2) Second query

SELECT * FROM T ;

Error: the query fails because T does not exist.
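The lifetime rule is easy to observe in practice. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine (not Azure SQL Data Warehouse) and a three-row subset of the Employee_Table. The exact error text is SQLite's and will differ on other systems.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; only three employees are loaded.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
    "Last_Name TEXT, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?, ?)",
    [
        (1324657, 200, "Coffing", 41888.88),
        (1333454, 200, "Smith", 48000.00),
        (2312225, 300, "Larkins", 40200.00),
    ],
)

# First query: T exists only while this one statement runs.
first = conn.execute("""
    WITH T (Dept_No, AVGSAL) AS
        (SELECT Dept_No, AVG(Salary)
         FROM Employee_Table
         GROUP BY Dept_No)
    SELECT T.Dept_No, E.Last_Name, T.AVGSAL
    FROM Employee_Table AS E
    INNER JOIN T
    ON E.Dept_No = T.Dept_No
""").fetchall()

# Second query: a separate statement, so T is no longer defined.
try:
    conn.execute("SELECT * FROM T")
    second_error = None
except sqlite3.OperationalError as exc:
    second_error = str(exc)

print(len(first), "rows from the first query")
print("second query error:", second_error)
```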

An Example of Two Derived Tables in a Single Query

WITH T (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT T.Dept_No, E.First_Name, E.Last_Name, T.AVGSAL, S.Counter
FROM Employee_Table as E
INNER JOIN T
ON E.Dept_No = T.Dept_No
INNER JOIN
    (SELECT Employee_No,
            Row_Number() OVER (PARTITION BY Dept_No
                               ORDER BY Dept_No, Last_Name)
     FROM Employee_Table) as S (Employee_No, Counter)
ON E.Employee_No = S.Employee_No
ORDER BY T.Dept_No ;

RECURSIVE Derived Table Hierarchy

TeraTom Coffing, CEO
    Jane Stevens, VP North
        Hitesh Patel, North Manager
            North Analysts: Robert Pantelle, Ming Zao, Constantine Mikas
    Ricardo Gonzales, VP South
        Inquayee Mumba, South Manager
            South Analysts: Betty Boston, Kelly Roberts, Brett Valens

Above is a company hierarchy. This is what we will use to perform our WITH Recursive query.

RECURSIVE Derived Table Query

WITH TeraTom (Emp, Mgr, LastN, Pos_Name, DEPTH) AS
    (SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
     FROM Hierarchy_Table
     WHERE Mgr_Employee_No IS NULL
     UNION ALL
     SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
     FROM TeraTom
     INNER JOIN Hierarchy_Table
     ON Emp = Mgr_Employee_No)
SELECT *
FROM TeraTom
ORDER BY 5,2,1 ;

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the WITH Recursive query.

RECURSIVE Derived Table Definition

This is a recursive query. The recursive derived table's name is TeraTom. The recursive derived table is defined with 5 columns. They are Emp, Mgr, LastN, Pos_Name, DEPTH.

WITH TeraTom (Emp, Mgr, LastN, Pos_Name, DEPTH) AS
    (SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
     FROM Hierarchy_Table
     WHERE Mgr_Employee_No IS NULL
     UNION ALL
     SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
     FROM TeraTom
     INNER JOIN Hierarchy_Table
     ON Emp = Mgr_Employee_No)
SELECT *
FROM TeraTom
ORDER BY 5,2,1 ;

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the WITH Recursive query. The highlighted part (the first line, WITH TeraTom and its column list) is the recursive derived table definition itself.

WITH RECURSIVE Derived Table Seeding

WITH TeraTom (Emp, Mgr, LastN, Pos_Name, DEPTH) AS
    (SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
     FROM Hierarchy_Table
     WHERE Mgr_Employee_No IS NULL
     UNION ALL
     SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
     FROM TeraTom
     INNER JOIN Hierarchy_Table
     ON Emp = Mgr_Employee_No)
SELECT *
FROM TeraTom
ORDER BY 5,2,1 ;

The entire highlighted section (the SELECT before UNION ALL) will produce only a single row in our derived table.

TeraTom

Emp  Mgr  LastN    Pos_Name  Depth
___  ___  _______  ________  _____
1    ?    Coffing  CEO       0

One row is placed in our derived table. That is called "seeding the table".

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the WITH Recursive query. The highlighted part shows how the first row is placed inside the derived table. The only employee with no manager is the CEO, Tom Coffing. His Mgr_Employee_No is NULL. The table is now seeded!

WITH RECURSIVE Derived Table Looping

WITH TeraTom (Emp, Mgr, LastN, Pos_Name, DEPTH) AS
    (SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
     FROM Hierarchy_Table
     WHERE Mgr_Employee_No IS NULL
     UNION ALL
     SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
     FROM TeraTom
     INNER JOIN Hierarchy_Table
     ON Emp = Mgr_Employee_No)
SELECT *
FROM TeraTom
ORDER BY 5,2,1 ;

The highlighted section (the SELECT after UNION ALL) joins the derived table to the Hierarchy_Table and loops until finished.

TeraTom

Emp  Mgr  LastN    Pos_Name  Depth
___  ___  _______  ________  _____
1    ?    Coffing  CEO       0

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the WITH Recursive query. The highlighted part shows how the derived table is joined to the Hierarchy_Table in a looping fashion. It keeps looping and adding rows until a loop adds no rows. Then, it is done.

RECURSIVE Derived Table Looping in Slow Motion

UNION ALL
SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
FROM TeraTom
INNER JOIN Hierarchy_Table
ON Emp = Mgr_Employee_No

The first loop places two more rows inside the derived table.

TeraTom

Emp  Mgr  LastN     Pos_Name  Depth
___  ___  ________  ________  _____
1    ?    Coffing   CEO       0
10   1    Stevens   VP NORTH  1
20   1    Gonzales  VP SOUTH  1

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the looping part of the WITH Recursive query, which joins the derived table to the Hierarchy_Table. It keeps looping and adding rows until a loop adds no rows. Then, it is done. This is the first loop, and as you can see two rows were added. That is because our join condition is Emp = Mgr_Employee_No. Both Stevens and Gonzales report to a manager with an Emp = 1.

RECURSIVE Derived Table Looping Continued

UNION ALL
SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
FROM TeraTom
INNER JOIN Hierarchy_Table
ON Emp = Mgr_Employee_No

The second loop places two more rows inside the derived table.

TeraTom

Emp  Mgr  LastN     Pos_Name       Depth
___  ___  ________  _____________  _____
1    ?    Coffing   CEO            0
10   1    Stevens   VP NORTH       1
20   1    Gonzales  VP SOUTH       1
100  10   Patel     North Manager  2
200  20   Mumba     South Manager  2

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

This is the second loop, and as you can see two rows were added. That is because our join condition is Emp = Mgr_Employee_No. Both Patel and Mumba report to a manager who is already inside our recursive derived table.

RECURSIVE Derived Table Looping Continued

UNION ALL
SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
FROM TeraTom
INNER JOIN Hierarchy_Table
ON Emp = Mgr_Employee_No

The third loop places six more rows inside the derived table.

TeraTom

Emp   Mgr  LastN     Pos_Name       Depth
____  ___  ________  _____________  _____
1     ?    Coffing   CEO            0
10    1    Stevens   VP NORTH       1
20    1    Gonzales  VP SOUTH       1
100   10   Patel     North Manager  2
200   20   Mumba     South Manager  2
1000  100  Mikas     Analyst North  3
3000  100  Zao       Analyst North  3
5000  100  Pantelle  Analyst North  3
2000  200  Valens    Analyst South  3
4000  200  Roberts   Analyst South  3
6000  200  Boston    Analyst South  3

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Six rows are added in the third loop.

RECURSIVE Derived Table Ends the Looping

UNION ALL
SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
FROM TeraTom
INNER JOIN Hierarchy_Table
ON Emp = Mgr_Employee_No

The fourth loop added no rows!

TeraTom

Emp   Mgr  LastN     Pos_Name       Depth
____  ___  ________  _____________  _____
1     ?    Coffing   CEO            0
10    1    Stevens   VP NORTH       1
20    1    Gonzales  VP SOUTH       1
100   10   Patel     North Manager  2
200   20   Mumba     South Manager  2
1000  100  Mikas     Analyst North  3
3000  100  Zao       Analyst North  3
5000  100  Pantelle  Analyst North  3
2000  200  Valens    Analyst South  3
4000  200  Roberts   Analyst South  3
6000  200  Boston    Analyst South  3

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

No rows were added in the fourth loop, so the loop is finished.

RECURSIVE Derived Table Definition

WITH TeraTom (Emp, Mgr, LastN, Pos_Name, DEPTH) AS
    (SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
     FROM Hierarchy_Table
     WHERE Mgr_Employee_No IS NULL
     UNION ALL
     SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, DEPTH+1
     FROM TeraTom
     INNER JOIN Hierarchy_Table
     ON Emp = Mgr_Employee_No)
SELECT *
FROM TeraTom
ORDER BY 5,2,1 ;

When the loop fails to add a row, the system knows it is done looping. Now it runs the final SELECT to get the answer set.

Recursive queries are not supported in the first release of the Azure SQL Data Warehouse.

Above is the WITH Recursive query. The highlighted part (the final SELECT * FROM TeraTom) is now run so the final answer set can be delivered.

RECURSIVE Derived Table Answer Set

The answer set is delivered.
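Because the chapter notes that recursive queries are not supported in the first release of the Azure SQL Data Warehouse, the walkthrough can be reproduced on another engine. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in (SQLite spells the clause WITH RECURSIVE). The Hierarchy_Table rows are taken from the loop tables above, and the aliases T and H are additions for readability.

```python
import sqlite3
from collections import Counter

# Assumption: SQLite stands in for an engine that supports recursion.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Hierarchy_Table (Employee_No INTEGER, "
    "Mgr_Employee_No INTEGER, Last_Name TEXT, Position_Name TEXT)"
)
conn.executemany(
    "INSERT INTO Hierarchy_Table VALUES (?, ?, ?, ?)",
    [
        (1, None, "Coffing", "CEO"),
        (10, 1, "Stevens", "VP North"),
        (20, 1, "Gonzales", "VP South"),
        (100, 10, "Patel", "North Manager"),
        (200, 20, "Mumba", "South Manager"),
        (1000, 100, "Mikas", "Analyst North"),
        (3000, 100, "Zao", "Analyst North"),
        (5000, 100, "Pantelle", "Analyst North"),
        (2000, 200, "Valens", "Analyst South"),
        (4000, 200, "Roberts", "Analyst South"),
        (6000, 200, "Boston", "Analyst South"),
    ],
)

rows = conn.execute("""
    WITH RECURSIVE TeraTom (Emp, Mgr, LastN, Pos_Name, Depth) AS
        (-- Seed: the one employee with no manager
         SELECT Employee_No, Mgr_Employee_No, Last_Name, Position_Name, 0
         FROM Hierarchy_Table
         WHERE Mgr_Employee_No IS NULL
         UNION ALL
         -- Loop: everyone who reports to a row already in TeraTom
         SELECT H.Employee_No, H.Mgr_Employee_No, H.Last_Name,
                H.Position_Name, T.Depth + 1
         FROM TeraTom AS T
         INNER JOIN Hierarchy_Table AS H
         ON T.Emp = H.Mgr_Employee_No)
    SELECT * FROM TeraTom
    ORDER BY 5, 2, 1
""").fetchall()

# Rows per depth: the seed, then what each loop added.
per_depth = Counter(row[4] for row in rows)

for row in rows:
    print(row)
print(dict(per_depth))
```

The per-depth counts line up with the slides: one seed row, two rows from the first loop, two from the second, and six from the third.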


What is TEMPDB?

TEMPDB is a database similar to all other SQL Server databases

It is recreated every time SQL Server is started

Allows for transactions to be rolled back, but does not allow for database recovery

Because of limited logging, operations in TEMPDB can be much faster than in other databases

Is the storage location for private, global, and direct temporary tables; as well as table variables

Like most things in life, TEMPDB is temporary. It is wonderful for temporary data storage. Just do not count on it having data that will be there for you in the future.


Creating a Temporary Table

CREATE TABLE #Emp_Temp
( Employee_No INTEGER
 ,Dept_No     SMALLINT
 ,First_Name  VARCHAR(12)
 ,Last_Name   CHAR(20)
 ,Salary      DECIMAL(8,2)
) WITH (LOCATION = USER_DB, DISTRIBUTION = HASH (Employee_No)) ;

It is mandatory that you put in LOCATION = USER_DB.

Populate the temp table with an INSERT/SELECT statement:

INSERT INTO #Emp_Temp
SELECT * FROM Employee_Table;

SELECT AVG(Salary) as AVGSAL
FROM #Emp_Temp ;

AVGSAL
____________
46782.153333

You create a local temporary table by using the # prefix before the table name. The temporary table can only be accessed from its own session. You cannot create partitions, views, or non-clustered indexes on a temporary table, nor can you have two temporary tables with the same name in the same session.

The Three Steps to Use a Private Temporary Table

1) Create the table:

CREATE TABLE #Dept_Agg_Vol
( Dept_no    Integer
 ,Sum_Salary Decimal(10,2)
) WITH (Location=User_DB) ;

2) Populate it:

INSERT INTO #Dept_Agg_Vol
SELECT Dept_no
      ,SUM(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

3) Query it:

SELECT * FROM #Dept_Agg_Vol
ORDER BY 1;

Only you can see this data because your session number is associated with your Private Temporary Tables. You can’t even see this table if you log in and query it from another session!

A user creates a Private Temporary Table, populates it with an INSERT/SELECT statement, and then queries it until logging off.
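The three steps, and the session privacy they rely on, can be sketched with two connections to the same database. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in: SQLite uses CREATE TEMP TABLE instead of the # prefix and has no LOCATION or DISTRIBUTION options, and the names session_a and session_b are hypothetical.

```python
import os
import sqlite3
import tempfile

# Assumption: two SQLite connections to one file play the role of two sessions.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
session_a = sqlite3.connect(db_path)
session_b = sqlite3.connect(db_path)

session_a.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
session_a.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [(200, 41888.88), (200, 48000.00), (300, 40200.00)],
)
session_a.commit()

# Step 1: create the temporary table in session A.
session_a.execute(
    "CREATE TEMP TABLE Dept_Agg_Vol (Dept_No INTEGER, Sum_Salary REAL)"
)

# Step 2: populate it with an INSERT/SELECT.
session_a.execute(
    "INSERT INTO Dept_Agg_Vol "
    "SELECT Dept_No, SUM(Salary) FROM Employee_Table GROUP BY Dept_No"
)

# Step 3: query it from the session that owns it.
own_rows = session_a.execute(
    "SELECT * FROM Dept_Agg_Vol ORDER BY 1"
).fetchall()

# Another session cannot see the temporary table at all.
try:
    session_b.execute("SELECT * FROM Dept_Agg_Vol")
    other_session_error = None
except sqlite3.OperationalError as exc:
    other_session_error = str(exc)

print(own_rows)
print("other session:", other_session_error)
```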


Creating a Temporary Table With a Clustered Index

CREATE TABLE TEMPDB.#Dept_Agg_Vol3
( Dept_no    Integer
 ,Sum_Salary Decimal(10,2)
) WITH (Location=User_DB) ;

CREATE CLUSTERED INDEX IDX_Dept_Agg_Vol_Dept_no
ON TEMPDB.#Dept_Agg_Vol3 (Dept_No) ;

Temporary Tables can have clustered and non-clustered indexes just like “regular” tables. Both the tables and their indexes are stored in tempdb.

You can have clustered indexes on a temporary table.

Creating a Columnstore Temporary Table From a CTAS

CREATE TABLE #Order_Columnar
WITH ( LOCATION=USER_DB,
       CLUSTERED COLUMNSTORE INDEX,
       DISTRIBUTION = Hash (Order_Number) )
AS SELECT * FROM Order_Table;

This Temporary Table has been created from the Order_Table, but the temporary table is a columnstore. We used a CREATE TABLE AS (CTAS) statement.

You can use a Create Table As (CTAS) statement to create a temporary table that is a columnstore.

Chapter 13 – Sub-query Functions

“An invasion of Armies can be resisted, but not an idea whose time has come.” - Victor Hugo

An IN List is much like a Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200) ;

Employee_No  Dept_No  Last_Name  First_Name  Salary
___________  _______  _________  __________  ________
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00

This query is very simple and easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.


An IN List Never has Duplicates – Just like a Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 100, 200, 200) ;

What is going on with this IN List? Why in the world are there duplicates in it? Will this query even work against the Employee_Table? What will the result set look like? Turn the page!

An IN List Ignores Duplicates

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 100, 200, 200) ;

Duplicate values in a list are irrelevant. The answer set still produced only 3 rows.

Employee_No  Dept_No  Last_Name  First_Name  Salary
___________  _______  _________  __________  ________
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00

Duplicate values are ignored here. We got the same rows back as before, and it is as if the system ignored the duplicate values in the IN List. That is exactly what happened.
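A quick check confirms that the two IN Lists behave identically. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine and loading only the Employee_No and Dept_No columns of the Employee_Table.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; only two columns are loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [
        (2000000, None), (1000234, 10), (1232578, 100),
        (1324657, 200), (1333454, 200), (2312225, 300),
        (1121334, 400), (2341218, 400), (1256349, 400),
    ],
)

# IN List with duplicate values.
with_dups = conn.execute(
    "SELECT Employee_No FROM Employee_Table "
    "WHERE Dept_No IN (100, 100, 200, 200) ORDER BY Employee_No"
).fetchall()

# IN List without duplicate values.
without_dups = conn.execute(
    "SELECT Employee_No FROM Employee_Table "
    "WHERE Dept_No IN (100, 200) ORDER BY Employee_No"
).fetchall()

print(with_dups)
print(with_dups == without_dups)
```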


The Subquery

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

There is a Top Query and a Bottom Query! Which Query Runs First?

The query above is a Subquery, which means there are multiple queries in the same SQL. The bottom query runs first, and its purpose in life is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!

The Three Steps of How a Basic Subquery Works

1) The bottom query runs first!

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

2) The result is passed to the top query!

100 200 300 400 500

3) The top query runs using the bottom query answer set.

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;

The bottom query runs first and builds a distinct IN list. Then the top query runs using the list.
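The three steps can be carried out by hand and compared with the single-statement subquery. The sketch below assumes Python's built-in sqlite3 module as a stand-in engine; it follows the book's mental model, and a real optimizer is free to execute the subquery differently. The names in_list, two_step, and one_step are illustrative only.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; data follows the chapter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER)")
conn.execute("CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [
        (2000000, None), (1000234, 10), (1232578, 100),
        (1324657, 200), (1333454, 200), (2312225, 300),
        (1121334, 400), (2341218, 400), (1256349, 400),
    ],
)
conn.executemany(
    "INSERT INTO Department_Table VALUES (?, ?)",
    [
        (100, "Marketing"), (200, "Research and Dev"), (300, "Sales"),
        (400, "Customer Support"), (500, "Human Resources"),
    ],
)

# Step 1: the bottom query builds a distinct list of values.
in_list = [row[0] for row in conn.execute(
    "SELECT DISTINCT Dept_No FROM Department_Table ORDER BY Dept_No")]

# Steps 2 and 3: pass the list to the top query and run it.
placeholders = ", ".join("?" for _ in in_list)
two_step = conn.execute(
    "SELECT Employee_No FROM Employee_Table "
    f"WHERE Dept_No IN ({placeholders}) ORDER BY Employee_No",
    in_list,
).fetchall()

# The same result from the subquery written as one statement.
one_step = conn.execute(
    "SELECT Employee_No FROM Employee_Table "
    "WHERE Dept_No IN (SELECT Dept_No FROM Department_Table) "
    "ORDER BY Employee_No"
).fetchall()

print(in_list)
print(two_step == one_step, len(one_step))
```

The employees in department 10 and in the NULL department are left out of both results, since neither value appears in the Department_Table.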


These are Equivalent Queries

1)

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

2)

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;

Both queries above are the same. Query 2 has values in an IN list. Query 1 runs a subquery to build the values in the IN list.

The Final Answer Set from the Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1256349      400      Harrison    Herbert     54500.00
2341218      400      Reilly      William     36000.00
1121334      400      Strickling  Cletus      54500.00

Remember that a subquery never has columns returned in the final answer set. Notice that no employees are in dept 500.

Quiz- Answer the Difficult Question

How are Subqueries similar to Joins between two tables?

A great question was asked above about the Employee_Table and the Department_Table. Do you know the key to answering? Turn the page!

Answer to Quiz- Answer the Difficult Question

How are Subqueries similar to Joins between two tables?

Dept_No is the Primary Key of the Department_Table and a Foreign Key in the Employee_Table.

A Subquery between two tables or a Join between two tables will each need a common key that represents the relationship. This is called a Primary Key/Foreign Key relationship.

A Subquery will use a common key linking the two tables together, very similar to a join! When subquerying between two tables, look for the common link between the two tables. Most of the time they both have a column with the same name, but not always.

Should you use a Subquery or a Join?

When do I Subquery?

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

When do I perform a Join?

SELECT E.*, Department_Name
FROM Employee_Table as E
Inner Join Department_Table as D
ON E.Dept_No = D.Dept_No;

If the final result set needs columns from only one table, use a Subquery. If the final result set needs columns from both tables, you have to do a Join.

Quiz- Write the Subquery

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!

Answer to Quiz- Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order!

SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
    SELECT Customer_Number
    FROM Order_Table) ;

Customer_Number  Customer_Name
_______________  ___________________
31323134         ACE Consulting
57896883         XYZ Plumbing
11111111         Billy's Best Choice
87323456         Databases N-U

The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Numbers which the top query uses in the IN List!

Quiz- Write the More Difficult Subquery

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order over $10,000.00!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.

Answer to Quiz- Write the More Difficult Subquery

Select all columns in the Customer_Table if the customer has placed an order over $10,000.00!

SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
    SELECT Customer_Number
    FROM Order_Table
    WHERE Order_Total > 10000.00) ;

Here is your answer!

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
57896883         XYZ Plumbing
87323456         Databases N-U

Quiz – Write the Extreme Subquery

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

Write SQL that will bring back an answer set that selects all columns from the Student_Table if that student is taking a course that has four (4) credits.

Use a subquery to get the answer set requested above. The answer is on the next page.

Answer to Quiz – Write the Extreme Subquery

SELECT S.*
FROM Student_Table as S
WHERE Student_ID IN
    (SELECT Student_ID
     FROM Student_Course_Table
     WHERE Course_ID IN
         (SELECT Course_ID
          FROM Course_Table
          WHERE Credits = 4)) ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
260000      Johnson    Stanley     ?           ?
322133      Bond       Jimmy       JR          3.95
333450      Smith      Andy        SO          2.00

Above is something to enjoy and learn from.
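The nested subquery can be verified against the quiz data. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine and loading only the columns the query touches (Student_ID and Last_Name for students, Course_ID and Credits for courses).

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; tables are trimmed to the
# columns the nested subquery actually needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT)")
conn.execute("CREATE TABLE Course_Table (Course_ID INTEGER, Credits INTEGER)")
conn.execute(
    "CREATE TABLE Student_Course_Table (Student_ID INTEGER, Course_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?)",
    [
        (423400, "Larkins"), (231222, "Wilson"), (280023, "McRoberts"),
        (322133, "Bond"), (125634, "Hanson"), (333450, "Smith"),
        (324652, "Delaney"), (260000, "Johnson"), (234121, "Thomas"),
        (123250, "Phillips"),
    ],
)
conn.executemany(
    "INSERT INTO Course_Table VALUES (?, ?)",
    [(100, 3), (200, 3), (210, 3), (220, 2), (300, 4), (400, 4)],
)
conn.executemany(
    "INSERT INTO Student_Course_Table VALUES (?, ?)",
    [
        (280023, 210), (231222, 210), (125634, 100), (231222, 220),
        (125634, 200), (322133, 220), (125634, 220), (322133, 300),
        (324652, 200), (333450, 500), (260000, 400), (333450, 400),
        (234121, 100), (123250, 100),
    ],
)

# Innermost query: four-credit courses. Middle query: students enrolled in
# them. Outer query: the matching student rows.
students = conn.execute("""
    SELECT S.Student_ID, S.Last_Name
    FROM Student_Table AS S
    WHERE Student_ID IN
        (SELECT Student_ID
         FROM Student_Course_Table
         WHERE Course_ID IN
             (SELECT Course_ID
              FROM Course_Table
              WHERE Credits = 4))
    ORDER BY S.Student_ID
""").fetchall()

print(students)
```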

Quiz- Write the Subquery with an Aggregate Employee_Table Employee_No ________ Dept_No ____________ 2000000 ? 1000234 10 1232578 100 1324657 200 1333454 200 2312225 300 1121334 400 2341218 400 1256349 400

Last_Name __________ First_Name __________ Jones Squiggy Smythe Richard Chambers Mandee Coffing Billy Smith John Larkins Loraine Strickling Cletus Reilly William Harrison Herbert

Salary _______ 32800.50 64300.00 48850.00 41888.88 48000.00 40200.00 54500.00 36000.00 54500.00

Write the Subquery Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.

Another opportunity knocking! Would someone please answer the query door?

Page 392

Answer to Quiz- Write the Subquery with an Aggregate

Write the Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.

SELECT *
FROM Employee_Table
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table) ;

Quiz- Write the Correlated Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Write the Correlated Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (within their own Department).

Another opportunity knocking! This is a tough one, and only the best get this written correctly.

Answer to Quiz- Write the Correlated Subquery

Write the Correlated Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (within their own Department).

SELECT *
FROM Employee_Table as EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table as EEEE
                WHERE EE.Dept_No = EEEE.Dept_No) ;

The Basics of a Correlated Subquery

The Top Query is Co-Related (Correlated) with the Bottom Query. The table name from the top query and the table name from the bottom query are each given a different alias.

The bottom query WHERE clause co-relates Dept_No from Top and Bottom. The top query is run first. The bottom query is run one time for each distinct value delivered from the top query.

SELECT *
FROM Employee_Table as EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table as EEEE
                WHERE EE.Dept_No = EEEE.Dept_No) ;

A correlated subquery breaks all the rules. It is the top query that runs first. Then, the bottom query is run one time for each distinct value of the column compared in the bottom WHERE clause. In our example, that column is Dept_No. After the top query runs and brings back its rows, the bottom query will run one time for each distinct Dept_No. This is the logical order to picture; the system is free to choose a different physical plan as long as the answer set is the same. If this is confusing, it is not you. These take a little time to understand, but I have a plan to make you an expert. Keep reading!
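The evaluation order described above can be checked by hand. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in engine (not Azure SQL Data Warehouse) and loads the nine Employee_Table rows from the quiz. It computes one average per distinct Dept_No, applies the top query's comparison row by row, and then runs the correlated subquery to compare.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; rows are the book's Employee_Table.
EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
    "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# What the bottom query computes: one average per distinct, non-null Dept_No.
dept_avg = dict(conn.execute(
    "SELECT Dept_No, AVG(Salary) FROM Employee_Table "
    "WHERE Dept_No IS NOT NULL GROUP BY Dept_No"))

# Apply the top query's comparison by hand, one row at a time.
by_hand = sorted(last for _, dept, last, _, salary in EMPLOYEES
                 if dept in dept_avg and salary > dept_avg[dept])

# The correlated subquery from the text.
correlated = sorted(row[0] for row in conn.execute("""
    SELECT Last_Name
    FROM Employee_Table AS EE
    WHERE Salary > (SELECT AVG(Salary)
                    FROM Employee_Table AS EEEE
                    WHERE EE.Dept_No = EEEE.Dept_No)"""))

print(by_hand)
print(correlated)
```

Both lists contain the same last names, which is the point: the correlated form and the "one average per department" picture describe the same result.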

The Top Query always runs first in a Correlated Subquery

SELECT *
FROM Employee_Table as EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table as EEEE
                WHERE EE.Dept_No = EEEE.Dept_No)

The Top Query runs first:

SELECT * FROM Employee_Table as EE

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

The bottom Query runs 1 time for each distinct Dept_No (EE.Dept_No = EEEE.Dept_No). The Null Dept_No is skipped.

Dept_No  AVGSAL
_______  ________
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Answer Set

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00

Only these three employees make more than the AVG salary within their own department.

Correlated Subquery Example vs. a Join with a Derived Table

Correlated Subquery

SELECT Last_Name, Dept_No, Salary
FROM Employee_Table as EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table as EEEE
                WHERE EE.Dept_No = EEEE.Dept_No) ;

Last_Name   Dept_No  Salary
__________  _______  ________
Smith       200      48000.00
Harrison    400      54500.00
Strickling  400      54500.00

Join with a Derived Table

SELECT Last_Name, Dept_No, Salary, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(Salary)
    FROM Employee_Table
    GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty
AND Salary > AVGSAL ;

Last_Name   Dept_No  Salary    AVGSAL
__________  _______  ________  ________
Smith       200      48000.00  44944.44
Harrison    400      54500.00  48333.33
Strickling  400      54500.00  48333.33

Both queries above will bring back all employees making a salary that is greater than the average salary in their department. The biggest difference is that the Join with the Derived Table also shows the Average Salary in the result set.
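To see the two forms side by side, here is a small sketch, again assuming Python's sqlite3 module as a stand-in engine rather than the warehouse itself. One adaptation is made for SQLite: the column names Depty and AVGSAL are assigned inside the derived table instead of in an alias list after TeraTom.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; rows are the book's Employee_Table.
EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
    "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

correlated = conn.execute("""
    SELECT Last_Name, Dept_No, Salary
    FROM Employee_Table AS EE
    WHERE Salary > (SELECT AVG(Salary)
                    FROM Employee_Table AS EEEE
                    WHERE EE.Dept_No = EEEE.Dept_No)
    ORDER BY Last_Name""").fetchall()

derived_join = conn.execute("""
    SELECT E.Last_Name, E.Dept_No, E.Salary, TeraTom.AVGSAL
    FROM Employee_Table AS E
    INNER JOIN (SELECT Dept_No AS Depty, AVG(Salary) AS AVGSAL
                FROM Employee_Table
                GROUP BY Dept_No) AS TeraTom
      ON E.Dept_No = TeraTom.Depty
     AND E.Salary > TeraTom.AVGSAL
    ORDER BY E.Last_Name""").fetchall()

# Same employees from both forms; the join carries the average as an extra column.
for last, dept, salary, avgsal in derived_join:
    print(last, dept, salary, round(avgsal, 2))
```

Dropping the last column of the join result leaves exactly the rows the correlated subquery returns.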

Quiz- A Second Chance to Write a Correlated Subquery

Sales_Table (not all rows are displayed)

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        10/02/2000  32800.50
1000        09/30/2000  36000.07
1000        10/01/2000  40200.43
2000        10/04/2000  32800.50
2000        10/02/2000  36021.93
2000        09/28/2000  41888.88
3000        10/04/2000  15675.33
3000        10/02/2000  19678.94
3000        10/03/2000  21553.79

Write the Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID.

Another opportunity knocking! This is your second chance. I will even give you a third chance.

Answer - A Second Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID.

SELECT *
FROM Sales_Table as TopS
WHERE Daily_Sales > (SELECT AVG(Daily_Sales)
                     FROM Sales_Table as BotS
                     WHERE TopS.Product_ID = BotS.Product_ID)
ORDER BY Product_ID, Sale_Date ;

Answer Set

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        09/28/2000  48850.40
1000        09/29/2000  54500.22
1000        10/03/2000  64300.00
1000        10/04/2000  54553.10
2000        09/29/2000  48000.00
2000        09/30/2000  49850.03
2000        10/01/2000  54850.29
3000        09/28/2000  61301.77
3000        09/29/2000  34509.13
3000        09/30/2000  43868.86

Quiz- A Third Chance to Write a Correlated Subquery

Sales_Table (not all rows are displayed)

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        10/02/2000  32800.50
1000        09/30/2000  36000.07
1000        10/01/2000  40200.43
2000        10/04/2000  32800.50
2000        10/02/2000  36021.93
2000        09/28/2000  41888.88
3000        10/04/2000  15675.33
3000        10/02/2000  19678.94
3000        10/03/2000  21553.79

Write the Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date.

Another opportunity knocking! There is just one minor adjustment and you are home free.

Answer - A Third Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date.

SELECT *
FROM Sales_Table as TopS
WHERE Daily_Sales > (SELECT AVG(Daily_Sales)
                     FROM Sales_Table as BotS
                     WHERE TopS.Sale_Date = BotS.Sale_Date)
ORDER BY Sale_Date ;

Answer Set

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
3000        09/28/2000  61301.77
2000        09/29/2000  48000.00
1000        09/29/2000  54500.22
3000        09/30/2000  43868.86
2000        09/30/2000  49850.03
2000        10/01/2000  54850.29
2000        10/02/2000  36021.93
1000        10/02/2000  32800.50
2000        10/03/2000  43200.18
1000        10/03/2000  64300.00
1000        10/04/2000  54553.10

Quiz- Last Chance To Write a Correlated Subquery

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

Write the Correlated Subquery

Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code.

Another opportunity knocking! There is just one minor adjustment and you are home free.

Answer – Last Chance to Write a Correlated Subquery

Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code.

SELECT *
FROM Student_Table as TopS
WHERE Grade_Pt > (SELECT AVG(Grade_Pt)
                  FROM Student_Table as BotS
                  WHERE TopS.Class_Code = BotS.Class_Code)
ORDER BY Class_Code ;

Answer Set

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
234121      Thomas     Wendy       FR          4.00
125634      Hanson     Henry       FR          2.88
322133      Bond       Jimmy       JR          3.95
231222      Wilson     Susie       SO          3.80
324652      Delaney    Danny       SR          3.35

Quiz – Write the Extreme Correlated Subquery

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

Write a correlated subquery that will bring back an answer set that returns all columns from the Course_Table if that course is being taken by a student who has a greater than average grade point within their own class code.

Use a subquery to get the answer set requested above. The answer is on the next page.

Answer To Quiz – Write the Extreme Correlated Subquery

SELECT *
FROM Course_Table
WHERE Course_ID IN
   (SELECT Course_ID
    FROM Student_Course_Table
    WHERE Student_ID IN
       (SELECT Student_ID
        FROM Student_Table AS s1
        WHERE Grade_Pt > (SELECT AVG(Grade_Pt)
                          FROM Student_Table AS s2
                          WHERE s1.Class_Code = s2.Class_Code))) ;

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
200        Introduction to SQL       3        20
100        Database Concepts         3        50
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
210        Advanced SQL              3        22

Above is something to enjoy and learn from.

Quiz- Write the NOT Subquery

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the Customer has NOT placed an order.

Another opportunity knocking! Write the above query.

Answer to Quiz- Write the NOT Subquery

Select all columns in the Customer_Table if the Customer has NOT placed an order.

SELECT *
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

Nulls are a NOT IN nightmare. Notice how I account for them!

Quiz- Write the Subquery using a WHERE Clause

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.

Another opportunity to show your brilliance is ready for you to make it happen.

Answer - Write the Subquery using a WHERE Clause

Write the Subquery

Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.

SELECT *
FROM Order_Table
WHERE Customer_Number IN
   (SELECT Customer_Number
    FROM Customer_Table
    WHERE Customer_Name like '%Bill%') ;

Quiz – Write the Triple Subquery

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

What is the Customer_Name who has the highest dollar order among all customers? This query will have multiple Subqueries!

Good luck in writing this. Remember that this will involve multiple Subqueries.

Answer to Quiz – Write the Triple Subquery

Write the Subquery

What is the Customer_Name who has the highest dollar order among all customers? This query will have multiple Subqueries!

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Order_Total IN
       (SELECT Max(Order_Total)
        FROM Order_Table)) ;

The innermost query runs first and returns 23454.84. The middle query runs second and returns Customer_Number 57896883. The top query runs third and returns XYZ Plumbing.

The query is above and, of course, the answer is XYZ Plumbing.
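The inside-out order can be traced one level at a time. The following sketch is an illustration only, assuming Python's sqlite3 module as a stand-in engine and the Customer_Table and Order_Table rows shown in the quiz. It runs each level separately, feeding the result of one into the next, and then runs the nested query in one piece.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; rows are the book's sample tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Runs first: the innermost query finds the highest order total.
max_total = conn.execute("SELECT MAX(Order_Total) FROM Order_Table").fetchone()[0]

# Runs second: the middle query finds who placed that order.
cust_no = conn.execute(
    "SELECT Customer_Number FROM Order_Table WHERE Order_Total = ?",
    (max_total,)).fetchone()[0]

# Runs third: the top query turns the number into a name.
step_name = conn.execute(
    "SELECT Customer_Name FROM Customer_Table WHERE Customer_Number = ?",
    (cust_no,)).fetchone()[0]

# The same three levels written as one nested query.
nested = conn.execute("""
    SELECT Customer_Name
    FROM Customer_Table
    WHERE Customer_Number IN
        (SELECT Customer_Number
         FROM Order_Table
         WHERE Order_Total IN (SELECT MAX(Order_Total) FROM Order_Table))""").fetchall()

print(max_total, cust_no, step_name, nested)
```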

Quiz – How many rows return on a NOT IN with a NULL?

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table (we added a Null Value to the Order_Table)

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84
000099        NULL             9999.99

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table) ;

How many rows return from the query now that a NULL value is in a Customer_Number? We really didn’t place a new row inside the Order_Table with a NULL value for the Customer_Number column, but in theory, if we had, how many rows would return?

Answer – How many rows return on a NOT IN with a NULL?

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table) ;

How many rows return from the query now that a NULL value is in a Customer_Number? ZERO rows will return.

The answer is no rows come back. When the NOT IN list contains a NULL, the system cannot say that a Customer_Number is different from that NULL, because the value of NULL is unknown. The NOT IN test is therefore never true for any customer, so nothing is returned.
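The effect is easy to reproduce. The sketch below assumes Python's sqlite3 module as a stand-in engine (SQLite also applies three-valued logic to NOT IN) and the sample tables with the extra NULL order row. It runs the NOT IN query as written, then again with the subquery filtered by IS NOT NULL.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; rows are the book's sample tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number TEXT, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    ("123456", 11111111, 12347.53),
    ("123512", 11111111, 8005.91),
    ("123552", 31323134, 5111.47),
    ("123585", 87323456, 15231.62),
    ("123777", 57896883, 23454.84),
    ("000099", None, 9999.99),  # the added row with a NULL Customer_Number
])

# NOT IN against a list that contains a NULL: no customer can pass the test.
with_null = conn.execute("""
    SELECT Customer_Name
    FROM Customer_Table
    WHERE Customer_Number NOT IN (SELECT Customer_Number FROM Order_Table)""").fetchall()

# Same query with the NULL filtered out of the list.
null_filtered = conn.execute("""
    SELECT Customer_Name
    FROM Customer_Table
    WHERE Customer_Number NOT IN (SELECT Customer_Number
                                  FROM Order_Table
                                  WHERE Customer_Number IS NOT NULL)""").fetchall()

print(with_null)
print(null_filtered)
```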

How to handle a NOT IN with Potential NULL Values

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

How many rows return NOW from the query? 1 (Acme Products)

You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.

Using a Correlated Exists

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Use EXISTS to find which Customers have placed an Order.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

The EXISTS command will determine via a Boolean if something is True or False. If a customer placed an order, it EXISTS, and using the Correlated Exists statement, only customers who have placed an order will return in the answer set. EXISTS differs from IN: it is less restrictive, as you will soon understand.

How a Correlated Exists matches up

Acme Products (Customer_Number 31313131) does not Exist in the Order_Table.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Only customers who placed an order return with the above Correlated EXISTS.

The Correlated NOT Exists

Use NOT EXISTS to find which Customers have NOT placed an Order.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

The NOT EXISTS command reverses the Boolean test. For each customer, the bottom query looks for a matching order. If no order EXISTS, the test is True, so only customers who have NOT placed an order will return in the answer set. NOT EXISTS differs from NOT IN: it is less restrictive, as you will soon understand.

The Correlated NOT Exists Answer Set

Use NOT EXISTS to find which Customers have NOT placed an Order.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

Customer_Number  Customer_Name
_______________  _____________
31313131         Acme Products

The only customer who did NOT place an order was Acme Products.

Quiz – How many rows come back from this NOT Exists?

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table (we added a Null Value to the Order_Table)

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84
000099        NULL             9999.99

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

How many rows return from the query?

A NULL value in a list for queries with NOT IN returned nothing, but you must now decide if that is also true for the NOT EXISTS. How many rows will return?

Answer – How many rows come back from this NOT Exists?

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

How many rows return from the query? One row (Acme Products)

NOT EXISTS is unaffected by a NULL in the list. The order row with the NULL Customer_Number never matches any customer in the correlated WHERE clause, so it is simply ignored. That is why NOT EXISTS is more flexible than NOT IN.
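A quick check of this behavior, sketched under the assumption that Python's sqlite3 module can stand in for the warehouse and that the tables hold the sample rows plus the NULL order. The EXISTS and NOT EXISTS queries are run against the same data.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; rows are the book's sample tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number TEXT, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    ("123456", 11111111, 12347.53),
    ("123512", 11111111, 8005.91),
    ("123552", 31323134, 5111.47),
    ("123585", 87323456, 15231.62),
    ("123777", 57896883, 23454.84),
    ("000099", None, 9999.99),  # the added row with a NULL Customer_Number
])

TEMPLATE = """
    SELECT Customer_Number, Customer_Name
    FROM Customer_Table AS Top1
    WHERE {test} (SELECT *
                  FROM Order_Table AS Bot1
                  WHERE Top1.Customer_Number = Bot1.Customer_Number)
    ORDER BY Customer_Number"""

# Customers with at least one matching order row.
placed = conn.execute(TEMPLATE.format(test="EXISTS")).fetchall()

# Customers with no matching order row; the NULL order matches nobody.
not_placed = conn.execute(TEMPLATE.format(test="NOT EXISTS")).fetchall()

print(placed)
print(not_placed)
```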

Chapter 14 – Window Functions OLAP

“Don’t count the days, make the days count.” - Muhammad Ali

The Row_Number Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     8
2000        2000-09-29  48000.00     9
2000        2000-09-30  49850.03     10
2000        2000-10-01  54850.29     11

Not all rows are displayed.

The ROW_NUMBER() Keyword(s) caused Seq_Number to increase sequentially. Notice that this does NOT have a Rows Unbounded Preceding, and it still works!

Quiz – How did the Row_Number Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     1
2000        2000-09-29  48000.00     2
2000        2000-09-30  49850.03     3
2000        2000-10-01  54850.29     4
2000        2000-10-02  36021.93     5
2000        2000-10-03  43200.18     6
2000        2000-10-04  32800.50     7

What Keyword(s) caused StartOver to reset?

Answer – How did the Row_Number Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

What Keyword(s) caused StartOver to reset? It is the PARTITION BY statement. The numbering starts over at 1 each time the Product_ID changes.
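The reset can be seen by running the same ROW_NUMBER() twice, once without and once with PARTITION BY. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in engine (window functions need SQLite 3.25 or later) and the fourteen Sales_Table rows for products 1000 and 2000 shown in this chapter.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for the warehouse; rows are the book's Sales_Table.
SALES = [
    (1000, "2000-09-28", 48850.40), (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07), (1000, "2000-10-01", 40200.43),
    (1000, "2000-10-02", 32800.50), (1000, "2000-10-03", 64300.00),
    (1000, "2000-10-04", 54553.10), (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00), (2000, "2000-09-30", 49850.03),
    (2000, "2000-10-01", 54850.29), (2000, "2000-10-02", 36021.93),
    (2000, "2000-10-03", 43200.18), (2000, "2000-10-04", 32800.50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
conn.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", SALES)

rows = conn.execute("""
    SELECT Product_ID, Sale_Date,
           ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number,
           ROW_NUMBER() OVER (PARTITION BY Product_ID
                              ORDER BY Sale_Date) AS StartOver
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date""").fetchall()

seq_number = [r[2] for r in rows]  # keeps counting across products
start_over = [r[3] for r in rows]  # restarts at 1 for each Product_ID

for r in rows:
    print(r)
```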

Using a Derived Table and Row_Number

WITH Results AS
   (SELECT ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS RowNumber,
           Product_ID, Sale_Date
    FROM Sales_Table)
SELECT *
FROM Results
WHERE RowNumber BETWEEN 8 AND 14

RowNumber  Product_ID  Sale_Date
_________  __________  __________
8          2000        2000-09-28
9          2000        2000-09-29
10         2000        2000-09-30
11         2000        2000-10-01
12         2000        2000-10-02
13         2000        2000-10-03
14         2000        2000-10-04

In the example above, the WITH clause defines a common table expression named Results, which is used just like a derived table. The WHERE clause then takes only Row Numbers 8 through 14.

Ordered Analytics OVER

SELECT TOP (9) Product_ID as Prod, Sale_Date, Daily_Sales
      ,SUM(Daily_Sales)   OVER (PARTITION BY Sale_Date) AS Total
      ,AVG(Daily_Sales)   OVER (PARTITION BY Sale_Date) AS Avg
      ,COUNT(Daily_Sales) OVER (PARTITION BY Sale_Date) AS Cnt
      ,MIN(Daily_Sales)   OVER (PARTITION BY Sale_Date) AS Min
      ,MAX(Daily_Sales)   OVER (PARTITION BY Sale_Date) AS Max
FROM Sales_Table

Prod  Sale_Date   Daily_Sales  Total      Avg       Cnt  Min       Max
____  __________  ___________  _________  ________  ___  ________  ________
1000  2000-09-28  48850.40     152041.05  50680.35  3    41888.88  61301.77
2000  2000-09-28  41888.88     152041.05  50680.35  3    41888.88  61301.77
3000  2000-09-28  61301.77     152041.05  50680.35  3    41888.88  61301.77
3000  2000-09-29  34509.13     137009.35  45669.78  3    34509.13  54500.22
2000  2000-09-29  48000.00     137009.35  45669.78  3    34509.13  54500.22
1000  2000-09-29  54500.22     137009.35  45669.78  3    34509.13  54500.22
1000  2000-09-30  36000.07     129718.96  43239.65  3    36000.07  49850.03
2000  2000-09-30  49850.03     129718.96  43239.65  3    36000.07  49850.03
3000  2000-09-30  43868.86     129718.96  43239.65  3    36000.07  49850.03

Above is an example of the Ordered Analytics using the keyword OVER.

RANK and DENSE RANK

SELECT TOP (5) Product_ID, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales ASC) as [Rank],
       DENSE_RANK() OVER (ORDER BY Daily_Sales ASC) as [DenseRank]
FROM Sales_Table
WHERE Product_ID in (1000, 2000)

Product_ID  Daily_Sales  Rank  DenseRank
__________  ___________  ____  _________
1000        32800.50     1     1
2000        32800.50     1     1
1000        36000.07     3     2
2000        36021.93     4     3
1000        40200.43     5     4

Above is an example of the RANK and DENSE_RANK commands. Notice the difference in the ties and the next ranking.
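The difference after a tie can be reproduced in a few lines. The sketch below assumes Python's sqlite3 module as a stand-in engine (SQLite 3.25 or later) and the Daily_Sales values for products 1000 and 2000 from this chapter. LIMIT 5 with an outer ORDER BY replaces TOP (5), which SQLite does not have.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for the warehouse; values are the book's sample data.
SALES = [
    (1000, 48850.40), (1000, 54500.22), (1000, 36000.07), (1000, 40200.43),
    (1000, 32800.50), (1000, 64300.00), (1000, 54553.10), (2000, 41888.88),
    (2000, 48000.00), (2000, 49850.03), (2000, 54850.29), (2000, 36021.93),
    (2000, 43200.18), (2000, 32800.50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Daily_Sales REAL)")
conn.executemany("INSERT INTO Sales_Table VALUES (?, ?)", SALES)

rows = conn.execute("""
    SELECT Daily_Sales,
           RANK()       OVER (ORDER BY Daily_Sales ASC) AS Rnk,
           DENSE_RANK() OVER (ORDER BY Daily_Sales ASC) AS DenseRnk
    FROM Sales_Table
    ORDER BY Daily_Sales ASC
    LIMIT 5""").fetchall()

ranks = [r[1] for r in rows]        # RANK leaves a gap after the tie
dense_ranks = [r[2] for r in rows]  # DENSE_RANK continues with the next number

for r in rows:
    print(r)
```

The two 32800.50 rows share rank 1 in both columns; RANK then jumps to 3 while DENSE_RANK moves on to 2.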

RANK Defaults to Ascending Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/02/2000  32800.50     1
2000        10/04/2000  32800.50     1
1000        09/30/2000  36000.07     3
2000        10/02/2000  36021.93     4
1000        10/01/2000  40200.43     5
2000        09/28/2000  41888.88     6
2000        10/03/2000  43200.18     7
2000        09/29/2000  48000.00     8
1000        09/28/2000  48850.40     9
2000        09/30/2000  49850.03     10
1000        09/29/2000  54500.22     11
1000        10/04/2000  54553.10     12
2000        10/01/2000  54850.29     13

Not all rows are displayed.

The RANK() OVER command defaults the Sort to ASC.

Getting RANK to Sort in DESC Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000  64300.00     1
2000        10/01/2000  54850.29     2
1000        10/04/2000  54553.10     3
1000        09/29/2000  54500.22     4
2000        09/30/2000  49850.03     5
1000        09/28/2000  48850.40     6
2000        09/29/2000  48000.00     7
2000        10/03/2000  43200.18     8
2000        09/28/2000  41888.88     9
1000        10/01/2000  40200.43     10
2000        10/02/2000  36021.93     11
1000        09/30/2000  36000.07     12
2000        10/04/2000  32800.50     13
1000        10/02/2000  32800.50     13

Utilize the DESC keyword in the ORDER BY statement to rank in descending order.

RANK() OVER and PARTITION BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (PARTITION BY Product_ID
                    ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000  64300.00     1
1000        10/04/2000  54553.10     2
1000        09/29/2000  54500.22     3
1000        09/28/2000  48850.40     4
1000        10/01/2000  40200.43     5
1000        09/30/2000  36000.07     6
1000        10/02/2000  32800.50     7
2000        10/01/2000  54850.29     1
2000        09/30/2000  49850.03     2
2000        09/29/2000  48000.00     3
2000        10/03/2000  43200.18     4
2000        09/28/2000  41888.88     5
2000        10/02/2000  36021.93     6
2000        10/04/2000  32800.50     7

What does the PARTITION Statement in the RANK() OVER do? It resets the rank, so the ranking starts over at 1 for each Product_ID.

Cumulative Sum

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS CsumAnsi
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 AND 2000 ;

Product_ID  Sale_Date   Daily_Sales  CsumAnsi
__________  __________  ___________  _________
      2000  2000-09-28     41888.88   41888.88
      1000  2000-09-28     48850.40   90739.28
      2000  2000-09-29     48000.00  138739.28
      1000  2000-09-29     54500.22  193239.50
      1000  2000-09-30     36000.07  229239.57
      2000  2000-09-30     49850.03  279089.60
      1000  2000-10-01     40200.43  319290.03
      2000  2000-10-01     54850.29  374140.32
      1000  2000-10-02     32800.50  406940.82
      2000  2000-10-02     36021.93  442962.75

Not all rows are displayed in this answer set.

An OVER clause that has an ORDER BY but no frame runs from the first row through the current row, which is what makes this a cumulative sum (CSUM). The keywords ROWS UNBOUNDED PRECEDING spell out that same start-at-the-first-row frame explicitly. The answer set shown accumulates in Sale_Date order; ordered by Product_ID, Sale_Date as the query is written, every Product_ID 1000 row would accumulate before the first 2000 row.
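To make the frame behavior concrete, here is a small sketch that compares the default frame with an explicitly written one. It assumes SQLite (through Python's sqlite3 module, version 3.25 or later) as a stand-in engine and uses a four-row miniature of Sales_Table; neither is taken from the book's environment.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; the four rows are a cut-down Sales_Table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
])

running = con.execute(
    """
    SELECT Product_ID, Sale_Date, Daily_Sales,
           SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS CsumDefault,
           SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CsumRows
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
    """
).fetchall()

# Both columns accumulate identically because the ORDER BY columns are unique per row.
for row in running:
    print(row)
```

The two columns would differ only if several rows tied on the ORDER BY columns, since the default frame treats tied rows as one group.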

The ANSI CSUM – Getting a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS SUMOVER,
       SUM(1) OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SUMOVER    Seq_Number
__________  __________  ___________  _________  __________
      1000  2000-09-28     48850.40   48850.40           1
      1000  2000-09-29     54500.22  103350.62           2
      1000  2000-09-30     36000.07  139350.69           3
      1000  2000-10-01     40200.43  179551.12           4
      1000  2000-10-02     32800.50  212351.62           5
      1000  2000-10-03     64300.00  276651.62           6
      1000  2000-10-04     54553.10  331204.72           7
      2000  2000-09-28     41888.88  373093.60           8
      2000  2000-09-29     48000.00  421093.60           9
      2000  2000-09-30     49850.03  470943.63          10

Not all rows are displayed in this answer set.

With “Seq_Number”, because you placed the number 1 in the area which calculates the cumulative sum, it’ll continuously add 1 to the answer for each row.

Troubleshooting The ANSI OLAP on a GROUP BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date) AS AnsiCsum
FROM Sales_Table
GROUP BY Product_ID ;

Error! Why?

The query fails because Sale_Date and Daily_Sales are selected without being in the GROUP BY or inside an aggregate. Never use a GROUP BY to reset a SUM() OVER or any other ANSI Syntax OLAP command. If you want to reset, use a PARTITION BY statement.

Reset with a PARTITION BY Statement

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS SumANSI
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SumANSI
__________  __________  ___________  _________
      1000  2000-09-28     48850.40   48850.40
      1000  2000-09-29     54500.22  103350.62
      1000  2000-09-30     36000.07  139350.69
      1000  2000-10-01     40200.43  179551.12
      1000  2000-10-02     32800.50  212351.62
      1000  2000-10-03     64300.00  276651.62
      1000  2000-10-04     54553.10  331204.72
      2000  2000-09-28     41888.88   41888.88
      2000  2000-09-29     48000.00   89888.88
      2000  2000-09-30     49850.03  139738.91

Not all rows are displayed in this answer set. The CSUM resets on the Product_ID break.

The PARTITION Statement is how you reset in ANSI. This will cause the SUMANSI to start over (reset) on its calculating for each NEW Product_ID.

PARTITION BY only Resets a Single OLAP not ALL of them

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS Subtotal,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS GrandTotal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  Subtotal   GrandTotal
__________  __________  ___________  _________  __________
      1000  2000-09-28     48850.40   48850.40    48850.40
      1000  2000-09-29     54500.22  103350.62   103350.62
      1000  2000-09-30     36000.07  139350.69   139350.69
      1000  2000-10-01     40200.43  179551.12   179551.12
      1000  2000-10-02     32800.50  212351.62   212351.62
      1000  2000-10-03     64300.00  276651.62   276651.62
      1000  2000-10-04     54553.10  331204.72   331204.72
      2000  2000-09-28     41888.88   41888.88   373093.60
      2000  2000-09-29     48000.00   89888.88   421093.60
      2000  2000-09-30     49850.03  139738.91   470943.63

Not all rows are displayed in this answer set.

Above are two OLAP statements. Only one has PARTITION BY, so only it resets. The other continuously does a CSUM.
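The following sketch shows the same idea on a four-row sample: one windowed SUM is partitioned and resets, the other is not and keeps accumulating. SQLite (via Python's sqlite3, version 3.25 or later) and the reduced table are assumptions made for the illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; the data is a cut-down Sales_Table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
])

totals = con.execute(
    """
    SELECT Product_ID, Sale_Date,
           SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                                  ORDER BY Sale_Date) AS Subtotal,
           SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS GrandTotal
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
    """
).fetchall()

# Subtotal restarts at the first 2000 row; GrandTotal carries the 1000 total forward.
for row in totals:
    print(row)
```

Each OVER clause defines its own window, so adding PARTITION BY to one function never affects another function in the same SELECT list.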

Sorting in DESC Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID DESC, Sale_Date) AS CumulativeTotal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  CumulativeTotal
__________  __________  ___________  _______________
      3000  10/04/2000     15675.33         15675.33
      3000  10/03/2000     21553.79         37229.12
      3000  10/02/2000     19678.94         56908.06
      3000  10/01/2000     28000.00         84908.06
      3000  09/30/2000     43868.86        128776.92
      3000  09/29/2000     34509.13        163286.05
      3000  09/28/2000     61301.77        224587.82
      2000  10/04/2000     32800.50        257388.32
      2000  10/03/2000     43200.18        300588.50
      2000  10/02/2000     36021.93        336610.43
      2000  10/01/2000     54850.29        391460.72

Not all rows are displayed in this answer set.

Above we used the DESC keyword in the ORDER BY statement for the Product_ID. Notice that the Product_ID is reversed. We see the Product_ID of 3000 first. The dates in the answer set shown also run from newest to oldest, which would require Sale_Date DESC as well.

Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MovAvg
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  MovAvg
__________  __________  ___________  ____________
      1000  2000-09-28     48850.40  48850.400000
      1000  2000-09-29     54500.22  51675.310000
      1000  2000-09-30     36000.07  46450.230000
      1000  2000-10-01     40200.43  44887.780000
      1000  2000-10-02     32800.50  42470.324000
      1000  2000-10-03     64300.00  46108.603333
      1000  2000-10-04     54553.10  47314.960000
      2000  2000-09-28     41888.88  46636.700000
      2000  2000-09-29     48000.00  46788.177777
      2000  2000-09-30     49850.03  47094.363000

Not all rows are displayed.

The AVG() OVER allows you to get the average of a certain column as the rows accumulate. With no frame clause, each MovAvg value averages every row from the first through the current one, so this is a running average. A fixed-width moving average requires an explicit frame such as ROWS 2 PRECEDING, which this query does not specify.
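A running average and a fixed-width moving average differ only in the frame. The sketch below puts the two side by side on the first five Product_ID 1000 rows. It assumes SQLite (via Python's sqlite3, version 3.25 or later) as a stand-in engine; whether a given warehouse release accepts the ROWS frame shown here should be checked against its own documentation.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; the rows are the first five 1000 rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07),
    (1000, "2000-10-01", 40200.43),
    (1000, "2000-10-02", 32800.50),
])

averages = con.execute(
    """
    SELECT Sale_Date,
           AVG(Daily_Sales) OVER (ORDER BY Sale_Date) AS RunningAvg,
           AVG(Daily_Sales) OVER (ORDER BY Sale_Date
                                  ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovAvg3
    FROM Sales_Table
    ORDER BY Sale_Date
    """
).fetchall()

# From the fourth row on, MovAvg3 uses only the current row and the two before it.
for sale_date, running_avg, mov_avg3 in averages:
    print(sale_date, round(running_avg, 2), round(mov_avg3, 2))
```

The first three rows agree because fewer than three rows exist before them; the columns diverge on the fourth row (44887.78 running versus 43566.91 over three rows).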

Casting a Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       CAST(AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date)
            AS Decimal(8,2)) AS CastAvg
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  CastAvg
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  51675.31
      1000  2000-09-30     36000.07  46450.23
      1000  2000-10-01     40200.43  43566.91
      1000  2000-10-02     32800.50  36333.67
      1000  2000-10-03     64300.00  45766.98
      1000  2000-10-04     54553.10  50551.20
      2000  2000-09-28     41888.88  53580.66
      2000  2000-09-29     48000.00  48147.33
      2000  2000-09-30     49850.03  46579.64

Not all rows are displayed.

Above, we have used CAST to change the data type of the moving average to a Decimal(8,2) data type. The CastAvg values shown are three-row moving averages; the query as written, with no frame clause, returns the running average cast to Decimal(8,2).

Partition By Resets an ANSI OLAP

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS AVG3,
       AVG(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS Continuous
FROM Sales_Table ;

ANSI resets much like a GROUP BY.

Product_ID  Sale_Date   Daily_Sales  AVG3      Continuous
__________  __________  ___________  ________  ____________
      1000  2000-09-28     48850.40  48850.40  48850.400000
      1000  2000-09-29     54500.22  51675.31  51675.310000
      1000  2000-09-30     36000.07  46450.23  46450.230000
      1000  2000-10-01     40200.43  43566.91  44887.780000
      1000  2000-10-02     32800.50  36333.67  42470.324000
      1000  2000-10-03     64300.00  45766.98  46108.603333
      1000  2000-10-04     54553.10  50551.20  47314.960000
      2000  2000-09-28     41888.88  53580.66  41888.880000
      2000  2000-09-29     48000.00  48147.33  44944.440000

Not all rows are displayed.

Use a PARTITION BY Statement to Reset the ANSI OLAP. The Partition By statement only resets the column using the statement. Notice that only Continuous resets. The AVG3 values shown are three-row moving averages, which would additionally need a frame of ROWS 2 PRECEDING in that OVER clause.

COUNT OVER for a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
      1000  2000-09-28     48850.40           1
      1000  2000-09-29     54500.22           2
      1000  2000-09-30     36000.07           3
      1000  2000-10-01     40200.43           4
      1000  2000-10-02     32800.50           5
      1000  2000-10-03     64300.00           6
      1000  2000-10-04     54553.10           7
      2000  2000-09-28     41888.88           8
      2000  2000-09-29     48000.00           9
      2000  2000-09-30     49850.03          10
      2000  2000-10-01     54850.29          11

Not all rows are displayed.

This is the COUNT OVER. It will provide a sequential number starting at 1. Because the OVER clause has an ORDER BY, the count runs from the first row through the current row (the frame that the keywords ROWS UNBOUNDED PRECEDING spell out), so Seq_Number starts at the beginning and increases sequentially to the end.

Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
      1000  2000-09-28     48850.40          1
      1000  2000-09-29     54500.22          2
      1000  2000-09-30     36000.07          3
      1000  2000-10-01     40200.43          4
      1000  2000-10-02     32800.50          5
      1000  2000-10-03     64300.00          6
      1000  2000-10-04     54553.10          7
      2000  2000-09-28     41888.88          1
      2000  2000-09-29     48000.00          2
      2000  2000-09-30     49850.03          3
      2000  2000-10-01     54850.29          4
      2000  2000-10-02     36021.93          5
      2000  2000-10-03     43200.18          6
      2000  2000-10-04     32800.50          7

What Keyword(s) caused StartOver to reset?

Answer to Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
      1000  2000-09-28     48850.40          1
      1000  2000-09-29     54500.22          2
      1000  2000-09-30     36000.07          3
      1000  2000-10-01     40200.43          4
      1000  2000-10-02     32800.50          5
      1000  2000-10-03     64300.00          6
      1000  2000-10-04     54553.10          7
      2000  2000-09-28     41888.88          1
      2000  2000-09-29     48000.00          2
      2000  2000-09-30     49850.03          3
      2000  2000-10-01     54850.29          4
      2000  2000-10-02     36021.93          5
      2000  2000-10-03     43200.18          6
      2000  2000-10-04     32800.50          7

What Keyword(s) caused StartOver to reset? It is the PARTITION BY statement.

The MAX OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  54500.22
      1000  2000-09-30     36000.07  54500.22
      1000  2000-10-01     40200.43  54500.22
      1000  2000-10-02     32800.50  54500.22
      1000  2000-10-03     64300.00  64300.00
      1000  2000-10-04     54553.10  64300.00
      2000  2000-09-28     41888.88  64300.00
      2000  2000-09-29     48000.00  64300.00
      2000  2000-09-30     49850.03  64300.00
      2000  2000-10-01     54850.29  64300.00
      2000  2000-10-02     36021.93  64300.00
      2000  2000-10-03     43200.18  64300.00
      2000  2000-10-04     32800.50  64300.00

After the sort, the Max() Over shows the Max Value up to that point.

MAX OVER with PARTITION BY Reset

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  54500.22
      1000  2000-09-30     36000.07  54500.22
      1000  2000-10-01     40200.43  54500.22
      1000  2000-10-02     32800.50  54500.22
      1000  2000-10-03     64300.00  64300.00
      1000  2000-10-04     54553.10  64300.00
      2000  2000-09-28     41888.88  41888.88
      2000  2000-09-29     48000.00  48000.00
      2000  2000-09-30     49850.03  49850.03
      2000  2000-10-01     54850.29  54850.29

Not all rows are displayed.

The largest value in the column MaxOver is 64300.00. It does not carry through to the end of the answer set, because the PARTITION BY resets MaxOver when the Product_ID changes to 2000.

MAX OVER Without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  54500.22
      1000  2000-09-30     36000.07  54500.22
      1000  2000-10-01     40200.43  54500.22
      1000  2000-10-02     32800.50  54500.22
      1000  2000-10-03     64300.00  64300.00
      1000  2000-10-04     54553.10  64300.00
      2000  2000-09-28     41888.88  64300.00
      2000  2000-09-29     48000.00  64300.00
      2000  2000-09-30     49850.03  64300.00

Not all rows are displayed.

You don't need the Rows Unbounded Preceding with the MAX OVER.

The MIN OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  48850.40
      1000  2000-09-30     36000.07  36000.07
      1000  2000-10-01     40200.43  36000.07
      1000  2000-10-02     32800.50  32800.50
      1000  2000-10-03     64300.00  32800.50
      1000  2000-10-04     54553.10  32800.50
      2000  2000-09-28     41888.88  32800.50
      2000  2000-09-29     48000.00  32800.50
      2000  2000-09-30     49850.03  32800.50
      2000  2000-10-01     54850.29  32800.50
      2000  2000-10-02     36021.93  32800.50
      2000  2000-10-03     43200.18  32800.50
      2000  2000-10-04     32800.50  32800.50

After the sort, the MIN() Over shows the Min Value up to that point.

Quiz – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  48850.40
      1000  2000-09-30     36000.07  36000.07
      1000  2000-10-01     40200.43  36000.07
      1000  2000-10-02     32800.50  32800.50
      1000  2000-10-03     64300.00  32800.50
      1000  2000-10-04     54553.10  32800.50
      2000  2000-09-28     41888.88  41888.88
      2000  2000-09-29     48000.00  41888.88
      2000  2000-09-30     49850.03  41888.88
      2000  2000-10-01     54850.29  41888.88
      2000  2000-10-02     36021.93  36021.93
      2000  2000-10-03     43200.18
      2000  2000-10-04     32800.50

The last two answers (MinOver) are blank, so you can fill in the blank.

Answer – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
      1000  2000-09-28     48850.40  48850.40
      1000  2000-09-29     54500.22  48850.40
      1000  2000-09-30     36000.07  36000.07
      1000  2000-10-01     40200.43  36000.07
      1000  2000-10-02     32800.50  32800.50
      1000  2000-10-03     64300.00  32800.50
      1000  2000-10-04     54553.10  32800.50
      2000  2000-09-28     41888.88  41888.88
      2000  2000-09-29     48000.00  41888.88
      2000  2000-09-30     49850.03  41888.88
      2000  2000-10-01     54850.29  41888.88
      2000  2000-10-02     36021.93  36021.93
      2000  2000-10-03     43200.18  36021.93
      2000  2000-10-04     32800.50  32800.50

The last two answers (MinOver) are filled in.

How Ntile Works

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID = 1000 ;

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
      1000  10/02/2000     32800.50          1
      1000  09/30/2000     36000.07          1
      1000  10/01/2000     40200.43          2
      1000  09/28/2000     48850.40          2
      1000  09/29/2000     54500.22          3
      1000  10/04/2000     54553.10          3
      1000  10/03/2000     64300.00          4

Changing the argument of the Ntile function changes the number of partitions established. Each Ntile partition is assigned a number starting at 1 and increasing to the number specified. So, with an Ntile of 4 the partitions are 1 through 4. The rows are then distributed as evenly as possible into the partitions, following the ORDER BY sequence. When the rows do not divide evenly, the extra rows go to the lowest numbered partitions, which is why partition 4 above holds only one row.
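The uneven split can be checked with a short sketch: seven rows into four tiles gives the first three tiles two rows each and the last tile one. SQLite (via Python's sqlite3, version 3.25 or later) is assumed here as a stand-in engine, loaded with the seven Product_ID 1000 rows.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; the rows are the seven 1000 rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07),
    (1000, "2000-10-01", 40200.43),
    (1000, "2000-10-02", 32800.50),
    (1000, "2000-10-03", 64300.00),
    (1000, "2000-10-04", 54553.10),
])

quartiles = con.execute(
    """
    SELECT Sale_Date, Daily_Sales,
           NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS Quartiles
    FROM Sales_Table
    ORDER BY Daily_Sales, Sale_Date
    """
).fetchall()

# 7 rows / 4 tiles: one row per tile plus three leftovers, which go to tiles 1, 2 and 3.
for row in quartiles:
    print(row)
```

Including Sale_Date in the ORDER BY keeps the assignment deterministic when two rows share the same Daily_Sales.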

Ntile

SELECT Last_Name, Grade_Pt,
       NTILE(5) OVER (ORDER BY Grade_Pt) AS "Tile"
FROM Student_Table
ORDER BY "Tile" DESC ;

Last_Name  Grade_Pt  Tile
_________  ________  ____
Bond           3.95     5
Thomas         4.00     5
Delaney        3.35     4
Wilson         3.80     4
Hanson         2.88     3
Phillips       3.00     3
McRoberts      1.90     2
Smith          2.00     2
Johnson           ?     1
Larkins        0.00     1

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. For example, the example above has 10 rows, so NTILE(5) splits the 10 rows into five equally sized tiles. There are 2 rows in each tile in the order of the OVER() clause's ORDER BY.

Ntile Continued

SELECT Dept_No, EmployeeCount,
       NTILE(2) OVER (ORDER BY EmployeeCount) AS "Tile"
FROM (SELECT Dept_No, COUNT(*) AS EmployeeCount
      FROM Employee_Table
      GROUP BY Dept_No) AS Q
ORDER BY "Tile" DESC ;

Dept_No  EmployeeCount  Tile
_______  _____________  ____
    300              1     2
    200              2     2
    400              3     2
      ?              1     1
     10              1     1
    100              1     1

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. For example, the example above has 6 rows, so NTILE(2) splits the 6 rows into 2 equally sized tiles. There are 3 rows in each tile in the order of the OVER() clause's ORDER BY.

Ntile Percentile

SELECT Claim_ID, Claim_Date, ClaimCount,
       NTILE(100) OVER (ORDER BY ClaimCount) AS Percentile
FROM (SELECT Claim_ID, Claim_Date, COUNT(*) AS ClaimCount
      FROM Claims
      GROUP BY Claim_ID, Claim_Date) AS Q
ORDER BY Percentile DESC ;

Claim_ID  Claim_Date  ClaimCount  Percentile
________  __________  __________  __________
 1302111  2003-03-01           4          26
 4307444  2003-07-05           3          25
 3306333  2003-06-28           3          24
 1304111  2003-04-28           2          23
 2303222  2003-03-12           2          22
 4305444  2003-05-12           2          21
 4303555  2004-03-01           2          20
 3402222  2004-02-28           2          19
 3308333  2003-08-01           2          18

Not all rows are displayed.

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. Above is a way to get the percentile.

Another Ntile Example

This example determines the percentile for every row in the Sales table based on the daily sales amount and sorts it into sequence by the value being categorized, which here is daily sales.

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(100) OVER (ORDER BY Daily_Sales) AS "Quantile"
FROM Sales_Table
WHERE Product_ID < 2000 ;

Product_ID  Sale_Date   Daily_Sales  Quantile
__________  __________  ___________  ________
      1000  10/02/2000     32800.50         1
      1000  09/30/2000     36000.07         2
      1000  10/01/2000     40200.43         3
      1000  09/28/2000     48850.40         4
      1000  09/29/2000     54500.22         5
      1000  10/04/2000     54553.10         6
      1000  10/03/2000     64300.00         7

Above is another Ntile example.

Using Quartiles (Partitions of Four)

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
      1000  10/02/2000     32800.50          1
      2000  10/04/2000     32800.50          1
      1000  09/30/2000     36000.07          1
      2000  10/02/2000     36021.93          1
      1000  10/01/2000     40200.43          2
      2000  09/28/2000     41888.88          2
      2000  10/03/2000     43200.18          2
      2000  09/29/2000     48000.00          2
      1000  09/28/2000     48850.40          3
      2000  09/30/2000     49850.03          3
      1000  09/29/2000     54500.22          3
      1000  10/04/2000     54553.10          4
      2000  10/01/2000     54850.29          4
      1000  10/03/2000     64300.00          4

Instead of 100, the example above uses a quartile (an NTILE based on 4 partitions).

NTILE Buckets

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
      1000  10/03/2000     64300.00       1
      2000  10/01/2000     54850.29       1
      1000  10/04/2000     54553.10       1
      1000  09/29/2000     54500.22       1
      2000  09/30/2000     49850.03       2
      1000  09/28/2000     48850.40       2
      2000  09/29/2000     48000.00       2
      2000  10/03/2000     43200.18       2
      2000  09/28/2000     41888.88       3
      1000  10/01/2000     40200.43       3
      2000  10/02/2000     36021.93       3
      1000  09/30/2000     36000.07       4
      1000  10/02/2000     32800.50       4
      2000  10/04/2000     32800.50       4

The NTILE() function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause, and then divided into the number of buckets specified. The answer set shown places the highest sales in bucket 1, which corresponds to ORDER BY Daily_Sales DESC; with the ascending sort written in the query, the lowest sales would land in bucket 1.

NTILE Using a Value of 10

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(10) OVER (ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
      1000  10/03/2000     64300.00       1
      2000  10/01/2000     54850.29       1
      1000  10/04/2000     54553.10       2
      1000  09/29/2000     54500.22       2
      2000  09/30/2000     49850.03       3
      1000  09/28/2000     48850.40       3
      2000  09/29/2000     48000.00       4
      2000  10/03/2000     43200.18       4
      2000  09/28/2000     41888.88       5
      1000  10/01/2000     40200.43       6
      2000  10/02/2000     36021.93       7
      1000  09/30/2000     36000.07       8
      1000  10/02/2000     32800.50       9
      2000  10/04/2000     32800.50      10

The NTILE() function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause, and then divided into the number of buckets specified. This example uses a value of 10 in the NTILE, so the 14 rows fill the first four buckets with two rows each and the remaining six buckets with one row each. As shown, the answer set corresponds to a descending sort on Daily_Sales.

NTILE With a Partition

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(3) OVER (PARTITION BY Product_ID
                      ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
      1000  10/02/2000     32800.50       1
      1000  09/30/2000     36000.07       1
      1000  10/01/2000     40200.43       1
      1000  09/28/2000     48850.40       2
      1000  09/29/2000     54500.22       2
      1000  10/04/2000     54553.10       3
      1000  10/03/2000     64300.00       3
      2000  10/04/2000     32800.50       1
      2000  10/02/2000     36021.93       1
      2000  09/28/2000     41888.88       1
      2000  10/03/2000     43200.18       2
      2000  09/29/2000     48000.00       2
      2000  09/30/2000     49850.03       3
      2000  10/01/2000     54850.29       3

The NTILE() function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is listed, the data will first be sorted by Product_ID and then sorted using the ORDER BY clause (within Product_ID), and then divided into the number of buckets specified. This example uses a value of 3 in the NTILE. Notice that the PARTITION BY statement causes the answer set to reset when the Product_ID goes from 1000 to 2000.

Using LAG and LEAD

Compatibility: SQL Server and Azure SQL Data Warehouse Extension

The LAG and LEAD functions allow you to compare different rows of a table by specifying an offset from the current row. You can use these functions to analyze change and variation.

Syntax for LAG and LEAD:

{LAG | LEAD} (expression [, offset [, default]])
    OVER ([PARTITION BY expression [,...]]
          ORDER BY expression [ASC | DESC] [,...] ) ;

The above provides information and the syntax for LAG and LEAD. The offset defaults to 1 row, and the default value is returned when no row exists at that offset.

Using LEAD

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No) OVER (ORDER BY Dept_No, Last_Name) AS "Lead All"
      ,LEAD(Dept_No) OVER (PARTITION BY Dept_No
                           ORDER BY Dept_No, Last_Name) AS "Lead Partition"
FROM Employee_Table ;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
__________  _______  ________  ______________
Jones             ?        10               ?
Smythe           10       100               ?
Chambers        100       200               ?
Coffing         200       200             200
Smith           200       300               ?
Larkins         300       400               ?
Harrison        400       400             400
Reilly          400       400             400
Strickling      400         ?               ?

As you can see, the first LEAD brings back the value from the next row except for the last which has no row following it. The offset value was not specified in this example, so it defaulted to a value of 1 row.

Using LEAD With an Offset of 2

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) AS "Lead All"
      ,LEAD(Dept_No,2) OVER (PARTITION BY Dept_No
                             ORDER BY Dept_No, Last_Name) AS "Lead Partition"
FROM Employee_Table ;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
__________  _______  ________  ______________
Jones             ?       100               ?
Smythe           10       200               ?
Chambers        100       200               ?
Coffing         200       300               ?
Smith           200       400               ?
Larkins         300       400               ?
Harrison        400       400             400
Reilly          400         ?               ?
Strickling      400         ?               ?

Above, each value in the first LEAD comes from the row 2 rows away. The partitioned LEAD returns a value only when its partition holds a row two positions after the current one, so only a partition with at least three rows (one more than the offset value) shows a value.

LEAD

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0)
                     OVER (ORDER BY Product_ID, Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
__________  __________  ___________  _________
      1000  09/28/2000     48850.40   -5649.82
      1000  09/29/2000     54500.22   18500.15
      1000  09/30/2000     36000.07   -4200.36
      1000  10/01/2000     40200.43    7399.93
      1000  10/02/2000     32800.50  -31499.50
      1000  10/03/2000     64300.00    9746.90
      1000  10/04/2000     54553.10   12664.22
      2000  09/28/2000     41888.88   -6111.12
      2000  09/29/2000     48000.00   -1850.03
      2000  09/30/2000     49850.03   -5000.26
      2000  10/01/2000     54850.29   18828.36
      2000  10/02/2000     36021.93   -7178.25
      2000  10/03/2000     43200.18   10399.68
      2000  10/04/2000     32800.50   32800.50

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the next row in the sort order (Product_ID, then Sale_Date). The expression LEAD(Daily_Sales, 1, 0) tells LEAD() to evaluate the expression Daily_Sales on the row that is positioned one row following the current row. If there is no such row (as is the case on the last row of the partition or relation), then the default value of 0 is used.

LEAD With Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0)
                     OVER (PARTITION BY Product_ID
                           ORDER BY Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
__________  __________  ___________  _________
      1000  09/28/2000     48850.40   -5649.82
      1000  09/29/2000     54500.22   18500.15
      1000  09/30/2000     36000.07   -4200.36
      1000  10/01/2000     40200.43    7399.93
      1000  10/02/2000     32800.50  -31499.50
      1000  10/03/2000     64300.00    9746.90
      1000  10/04/2000     54553.10   54553.10
      2000  09/28/2000     41888.88   -6111.12
      2000  09/29/2000     48000.00   -1850.03
      2000  09/30/2000     49850.03   -5000.26
      2000  10/01/2000     54850.29   18828.36
      2000  10/02/2000     36021.93   -7178.25
      2000  10/03/2000     43200.18   10399.68
      2000  10/04/2000     32800.50   32800.50

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the next row in the sort order. We also partitioned the data by Product_ID, so the last row of each product has no following row and subtracts the default value of 0.
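The effect of the partition shows up only at the boundary between products. The sketch below runs both forms of LEAD on four rows that straddle that boundary. It assumes SQLite (via Python's sqlite3, version 3.25 or later) as a stand-in engine and a cut-down Sales_Table.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; four rows straddle the product boundary.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-10-03", 64300.00),
    (1000, "2000-10-04", 54553.10),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
])

leads = con.execute(
    """
    SELECT Product_ID, Sale_Date,
           Daily_Sales - LEAD(Daily_Sales, 1, 0)
                         OVER (ORDER BY Product_ID, Sale_Date) AS LeadAll,
           Daily_Sales - LEAD(Daily_Sales, 1, 0)
                         OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS LeadPart
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
    """
).fetchall()

# On the last 1000 row, LeadAll subtracts the first 2000 row; LeadPart subtracts the default 0.
for product_id, sale_date, lead_all, lead_part in leads:
    print(product_id, sale_date, round(lead_all, 2), round(lead_part, 2))
```

Supplying 0 as the third argument keeps the subtraction from returning a null on the last row; whether 0 is a sensible stand-in depends on what the difference is meant to report.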

Using LAG

SELECT Last_Name, Dept_No
      ,LAG(Dept_No) OVER (ORDER BY Dept_No, Last_Name) AS "Lag All"
      ,LAG(Dept_No) OVER (PARTITION BY Dept_No
                          ORDER BY Dept_No, Last_Name) AS "Lag Partition"
FROM Employee_Table ;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
__________  _______  _______  _____________
Jones             ?        ?              ?
Smythe           10        ?              ?
Chambers        100       10              ?
Coffing         200      100              ?
Smith           200      200            200
Larkins         300      200              ?
Harrison        400      300              ?
Reilly          400      400            400
Strickling      400      400            400

From the example above, you see that LAG uses the value from a previous row and makes it available in the next row. For LAG, the first row(s) will contain a null based on the value in the offset, which here defaulted to 1. The first null comes from the function, because no previous row exists, whereas the second row gets its null from the first row's null Dept_No.
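The two different sources of a null can be seen in a short sketch using the first five employees. SQLite (via Python's sqlite3, version 3.25 or later) is assumed as a stand-in engine; like SQL Server, it sorts nulls first in an ascending ORDER BY, so Jones leads the list.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; five rows of Employee_Table are enough.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Employee_Table (Last_Name TEXT, Dept_No INTEGER)")
con.executemany("INSERT INTO Employee_Table VALUES (?, ?)", [
    ("Jones", None),
    ("Smythe", 10),
    ("Chambers", 100),
    ("Coffing", 200),
    ("Smith", 200),
])

lags = con.execute(
    """
    SELECT Last_Name, Dept_No,
           LAG(Dept_No) OVER (ORDER BY Dept_No, Last_Name) AS LagAll,
           LAG(Dept_No) OVER (PARTITION BY Dept_No
                              ORDER BY Dept_No, Last_Name) AS LagPartition
    FROM Employee_Table
    ORDER BY Dept_No, Last_Name
    """
).fetchall()

# Jones: null because no earlier row exists. Smythe: null copied from Jones's Dept_No.
for row in lags:
    print(row)
```

In the partitioned column only Smith receives a value, because Dept_No 200 is the only partition in this sample with more than one row.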

Using LAG With an Offset of 2

SELECT Last_Name, Dept_No
      ,LAG(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) AS "Lag All"
      ,LAG(Dept_No,2) OVER (PARTITION BY Dept_No
                            ORDER BY Dept_No, Last_Name) AS "Lag Partition"
FROM Employee_Table ;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
__________  _______  _______  _____________
Jones             ?        ?              ?
Smythe           10        ?              ?
Chambers        100        ?              ?
Coffing         200       10              ?
Smith           200      100              ?
Larkins         300      200              ?
Harrison        400      200              ?
Reilly          400      300              ?
Strickling      400      400            400

For this example, the first two rows have a null because there is not a row two rows before these. The number of nulls produced by the function will always be the same as the offset value. There is a third null because Jones's Dept_No is null.

LAG

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0)
                     OVER (ORDER BY Product_ID, Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
__________  __________  ___________  _________
      1000  09/28/2000     48850.40   48850.40
      1000  09/29/2000     54500.22    5649.82
      1000  09/30/2000     36000.07  -18500.15
      1000  10/01/2000     40200.43    4200.36
      1000  10/02/2000     32800.50   -7399.93
      1000  10/03/2000     64300.00   31499.50
      1000  10/04/2000     54553.10   -9746.90
      2000  09/28/2000     41888.88  -12664.22
      2000  09/29/2000     48000.00    6111.12
      2000  09/30/2000     49850.03    1850.03
      2000  10/01/2000     54850.29    5000.26
      2000  10/02/2000     36021.93  -18828.36
      2000  10/03/2000     43200.18    7178.25
      2000  10/04/2000     32800.50  -10399.68

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the previous row in the sort order (Product_ID, then Sale_Date). The expression LAG(Daily_Sales, 1, 0) tells LAG() to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of the partition or relation), then the default value of 0 is used.

LAG with Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0)
                     OVER (PARTITION BY Product_ID
                           ORDER BY Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
__________  __________  ___________  _________
      1000  09/28/2000     48850.40   48850.40
      1000  09/29/2000     54500.22    5649.82
      1000  09/30/2000     36000.07  -18500.15
      1000  10/01/2000     40200.43    4200.36
      1000  10/02/2000     32800.50   -7399.93
      1000  10/03/2000     64300.00   31499.50
      1000  10/04/2000     54553.10   -9746.90
      2000  09/28/2000     41888.88   41888.88
      2000  09/29/2000     48000.00    6111.12
      2000  09/30/2000     49850.03    1850.03
      2000  10/01/2000     54850.29    5000.26
      2000  10/02/2000     36021.93  -18828.36
      2000  10/03/2000     43200.18    7178.25
      2000  10/04/2000     32800.50  -10399.68

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the previous row in the sort order. The expression LAG(Daily_Sales, 1, 0) tells LAG() to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of each Product_ID partition), then the default value of 0 is used.

SUM(SUM(n))

SELECT Product_ID,
       SUM(Daily_Sales) AS Summy,
       SUM(SUM(Daily_Sales)) OVER (ORDER BY SUM(Daily_Sales)) AS Prod_Sales_Running_Sum
FROM Sales_Table
GROUP BY Product_ID ;

Product_ID  Summy      Prod_Sales_Running_Sum
__________  _________  ______________________
      3000  224587.82               224587.82
      2000  306611.81               531199.63
      1000  331204.72               862404.35

Window functions can compute aggregates of aggregates, as in the example above.
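The nesting works because the GROUP BY aggregate is evaluated first and the window function then runs over the grouped rows. A small sketch of that order of evaluation follows. It assumes SQLite (via Python's sqlite3, version 3.25 or later) as a stand-in engine and a four-row sample, so the totals differ from the answer set above.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; two rows per product keep the sums small.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)")
con.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
])

grouped = con.execute(
    """
    SELECT Product_ID,
           SUM(Daily_Sales) AS Summy,
           SUM(SUM(Daily_Sales)) OVER (ORDER BY SUM(Daily_Sales)) AS Prod_Sales_Running_Sum
    FROM Sales_Table
    GROUP BY Product_ID
    ORDER BY Summy
    """
).fetchall()

# The inner SUM yields one row per product; the outer SUM accumulates those rows.
for product_id, summy, running_sum in grouped:
    print(product_id, round(summy, 2), round(running_sum, 2))
```

This is the case where GROUP BY and an OVER clause legitimately appear together: every selected column is either grouped or aggregated before the window is applied.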


Chapter 15 - Working with Strings

“It’s always been and always will be the same in the world: the horse does the work and the coachman is tipped.” - Anonymous


The ASCII Function

The example below shows you how to convert characters into their integer ASCII values.

Syntax: ASCII (string)

SELECT ASCII('H') as AsciiH
      ,ASCII('o') as AsciiO
      ,ASCII('w') as AsciiW
      ,ASCII('d') as AsciiD
      ,ASCII('y') as AsciiY ;

AsciiH  AsciiO  AsciiW  AsciiD  AsciiY
______  ______  ______  ______  ______
    72     111     119     100     121


The CHAR Function

The example below shows you how to convert integer ASCII values back into characters.

Syntax: CHAR (integer)

SELECT CHAR(72)  As CharH
      ,CHAR(111) As CharO
      ,CHAR(119) As CharW
      ,CHAR(100) As CharD
      ,CHAR(121) As CharY ;

CharH  CharO  CharW  CharD  CharY
_____  _____  _____  _____  _____
H      o      w      d      y


The UNICODE Function

The UNICODE function returns the Unicode integer value for the first character of the input expression.

Syntax: UNICODE (string)

SELECT UNICODE('H') AS UniH
      ,UNICODE('o') AS UniO
      ,UNICODE('w') AS UniW
      ,UNICODE('d') AS UniD
      ,UNICODE('y') AS UniY ;

UniH  UniO  UniW  UniD  UniY
____  ____  ____  ____  ____
  72   111   119   100   121

The example above shows you how to convert characters into their UNICODE values.


The NCHAR Function

The NCHAR function takes Unicode integer values and converts them back into characters.

Syntax: NCHAR (integer)

SELECT NCHAR(72)  AS NcaH
      ,NCHAR(111) AS NcaO
      ,NCHAR(119) AS NcaW
      ,NCHAR(100) AS NcaD
      ,NCHAR(121) AS NcaY ;

NcaH  NcaO  NcaW  NcaD  NcaY
____  ____  ____  ____  ____
H     o     w     d     y

The example above shows you how to convert integers back to characters.


The LEN Function

The LEN function returns the number of characters in an input string. (Trailing spaces are automatically excluded, so the padding of a CHAR column is not counted.)

Syntax: LEN (string)

SELECT First_Name
      ,LEN(First_Name) AS Lnth
      ,Last_Name
      ,LEN(Last_Name) AS Lnth
FROM Employee_Table

First_Name  Lnth  Last_Name   Lnth
__________  ____  __________  ____
Richard        7  Smythe         6
Cletus         6  Strickling    10
Mandee         6  Chambers       8
Herbert        7  Harrison       8
Billy          5  Coffing        7
John           4  Smith          5
Squiggy        7  Jones          5
Loraine        7  Larkins        7
William        7  Reilly         6

The LEN function returns the number of characters in the input string and not necessarily the number of bytes.


The DATALENGTH Function

The DATALENGTH function returns the number of bytes used to store an input value. (Trailing spaces are automatically included for CHAR data types.)

Syntax: DATALENGTH (string)

SELECT First_Name
      ,DATALENGTH(First_Name) AS Lnth
      ,Last_Name
      ,DATALENGTH(Last_Name) AS Lnth
FROM Employee_Table

First_Name  Lnth  Last_Name   Lnth
__________  ____  __________  ____
Richard        7  Smythe        20
Cletus         6  Strickling    20
Mandee         6  Chambers      20
Herbert        7  Harrison      20
Billy          5  Coffing       20
John           4  Smith         20
Squiggy        7  Jones         20
Loraine        7  Larkins       20
William        7  Reilly        20

The DATALENGTH function returns the number of bytes in the stored value, which for single-byte character columns is the same as the number of stored characters. The difference between the LEN and the DATALENGTH functions is that the LEN function excludes trailing spaces. The DATALENGTH function counts them. Notice that each Last_Name length is 20, because Last_Name is a CHAR(20) column padded with spaces, while the VARCHAR First_Name column stores only the characters entered.
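To see why a padded CHAR column reports different numbers from the two functions, here is a small Python model. It is only a sketch: sql_len and sql_datalength are hypothetical helper names, and the bytes_per_char parameter is an assumption used to show that byte counts double for two-byte (NCHAR/NVARCHAR style) storage.

```python
def sql_len(value: str) -> int:
    """Model of LEN: count characters, ignoring trailing spaces."""
    return len(value.rstrip(" "))

def sql_datalength(value: str, bytes_per_char: int = 1) -> int:
    """Model of DATALENGTH: count stored bytes, trailing spaces included."""
    return len(value) * bytes_per_char

last_name = "Smythe".ljust(20)   # a CHAR(20) column pads the value to 20 characters
first_name = "Richard"           # a VARCHAR column stores no padding

print(sql_len(last_name), sql_datalength(last_name))
print(sql_len(first_name), sql_datalength(first_name))
print(sql_datalength(first_name, bytes_per_char=2))
```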


Concatenation

The + sign means concatenate.

SELECT First_Name
      ,Last_Name
      ,First_Name + ' ' + Last_Name as Full_Name   -- ' ' is a single space
FROM Employee_Table
WHERE First_Name = 'Squiggy'

First_Name  Last_Name  Full_Name
__________  _________  _____________
Squiggy     Jones      Squiggy Jones

See those + signs? Those represent concatenation. That allows you to combine multiple columns into one column. The + in this example has combined the first name, then a single space, and then the last name to get a new column called Full_Name. We brought back the full name of Squiggy Jones.


The RTRIM and LTRIM Commands Trim Spaces

RTRIM Query

SELECT Last_Name
      ,RTRIM(Last_Name) AS Trim_Trailing_Spaces
FROM Employee_Table ;

LTRIM Query

SELECT Last_Name
      ,LTRIM(Last_Name) AS Trim_Leading_Spaces
FROM Employee_Table ;

Trimming Both Leading and Trailing Spaces Query

SELECT Last_Name
      ,LTRIM(RTRIM(Last_Name)) AS Trim_Spaces_Leading_Trailing
FROM Employee_Table ;

The RTRIM command trims trailing spaces from a character string. The LTRIM command trims leading spaces from a character string. The LTRIM(RTRIM) combination trims both leading and trailing spaces from a character string.


The SUBSTRING Command

SELECT First_Name
      ,SUBSTRING (First_Name, 2, 3) AS Quiz   -- start in position 2, go for 3 positions
FROM Employee_Table ;

First_Name  Quiz
__________  ____
Squiggy     qui
John        ohn
Richard     ich
Herbert     erb
Mandee      and
Cletus      let
William     ill
Billy       ill
Loraine     ora

This is a SUBSTRING. The substring is passed three parameters: the string itself, the starting position in the string, and the number of positions to return (from the starting position). The above example will start in position 2 and go for 3 positions!


Using SUBSTRING to move Backwards

SELECT First_Name
      ,SUBSTRING (First_Name, 0, 6) AS Before1   -- start in position 0 (one space before)
FROM Employee_Table ;

First_Name  Before1
__________  _______
Squiggy     Squig
John        John
Richard     Richa
Herbert     Herbe
Mandee      Mande
Cletus      Cletu
William     Willi
Billy       Billy
Loraine     Lorai

A starting position of zero moves one space in front of the beginning. Notice that our length is 6, but position 0 holds no character, so only five characters come back and 'Squiggy' turns into 'Squig'. The point being made here is that the starting position can move backwards past the beginning of the string, and every position in front of the string uses up part of the length without returning anything. This will come in handy as you see other examples.


How SUBSTRING Works with a Starting Position of -1

SELECT First_Name
      ,SUBSTRING (First_Name, -1, 3) AS Before2   -- start in position -1 (two spaces before)
FROM Employee_Table ;

First_Name  Before2
__________  _______
Squiggy     S
John        J
Richard     R
Herbert     H
Mandee      M
Cletus      C
William     W
Billy       B
Loraine     L

A starting position of -1 moves two spaces in front of the beginning. Notice that our length is 3, so the three positions are -1, 0, and 1, and each name delivers only the first initial. The point being made here is that the starting position can move backwards past the beginning of the string, and every position in front of the string uses up part of the length without returning anything.


How SUBSTRING Works with an Ending Position of 0

SELECT First_Name
      ,SUBSTRING (First_Name, 3, 0) AS WhatsUp   -- go for 0 positions
FROM Employee_Table ;

First_Name  WhatsUp
__________  _______
Squiggy
John
Richard
Herbert
Mandee
Cletus
William
Billy
Loraine

In our example above, we start in position 3, but we go for zero positions, so nothing is delivered in the column. That is what's up!
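The behavior of SUBSTRING with a starting position of 2, 0, -1, and with a length of 0 can be summarized in one small model. The sketch below is a Python approximation of the rule that positions in front of the string count against the length but return nothing. The function name sql_substring is an assumption, and the negative-length check reflects the error T-SQL raises for an invalid length.

```python
def sql_substring(value: str, start: int, length: int) -> str:
    """Model of SUBSTRING(value, start, length) with 1-based positions."""
    if length < 0:
        raise ValueError("length must not be negative")
    end = start + length      # one past the last requested position
    first = max(start, 1)     # positions before 1 hold no characters
    if end <= first:
        return ""
    return value[first - 1:end - 1]

print(sql_substring("Squiggy", 2, 3))    # start inside the string
print(sql_substring("Squiggy", 0, 6))    # one position before the string
print(sql_substring("Squiggy", -1, 3))   # two positions before the string
print(repr(sql_substring("Squiggy", 3, 0)))
```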


Concatenation and SUBSTRING

SELECT First_Name
      ,Last_Name
      ,Substring(First_Name, 1, 1) + '. ' + Last_Name as Full_Name   -- '. ' is a period (.) and a space
FROM Employee_Table

First_Name  Last_Name   Full_Name
__________  __________  _____________
Richard     Smythe      R. Smythe
Cletus      Strickling  C. Strickling
Mandee      Chambers    M. Chambers
Herbert     Harrison    H. Harrison
Billy       Coffing     B. Coffing
John        Smith       J. Smith
Squiggy     Jones       S. Jones
Loraine     Larkins     L. Larkins
William     Reilly      W. Reilly

Of the three items being concatenated together, what is the first item of concatenation in the example above? The first initial of the First_Name. Then, we concatenated a literal period and a space. Next, we concatenated the Last_Name.


SUBSTRING and Different Aliasing

SELECT Phone_Number
      ,First3digits = SUBSTRING(Phone_Number, 1, 3)
      ,Exchange = SUBSTRING(Phone_Number, 5, 4)
FROM Customer_Table
WHERE Phone_Number LIKE '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'

Phone_Number  First3digits  Exchange
____________  ____________  ________
555-1234      555           1234
555-1111      555           1111
555-1212      555           1212
347-8954      347           8954
322-1012      322           1012

Above, we are using the Substring commands to extract certain portions of the Phone_Number. Notice that each column name is written at the beginning of the line, followed by an equal sign and the expression (alias = expression). This is almost like a reverse alias.


The LEFT and RIGHT Functions

The LEFT and RIGHT functions are shortcuts for the SUBSTRING function. They return a requested number of characters from the left or right end of the input string.

Syntax: LEFT(string, n), RIGHT(string, n)

SELECT First_Name
      ,LEFT (First_Name, 1) AS First_Initial
      ,Last_Name
      ,RIGHT (RTRIM(Last_Name), 2) AS "Last Two Letters"
FROM Employee_Table
WHERE Dept_No in (400) ;

First_Name  First_Initial  Last_Name   Last Two Letters
__________  _____________  __________  ________________
Cletus      C              Strickling  ng
Herbert     H              Harrison    on
William     W              Reilly      ly

In our example above, our result set will have the First_Name and Last_Name coming back, but we also use the LEFT and RIGHT functions to produce the first letter of the First_Name and the last two letters of the Last_Name. We filtered the rows with a WHERE clause to only bring back three rows. Notice the RTRIM of Last_Name. This is necessary because the Last_Name column has a data type of CHAR(20), which is padded with spaces. Without the RTRIM, RIGHT would return two of those padding spaces.


Four Concatenations Together

Last_Name is a CHAR(20) column and First_Name is a VARCHAR(12) column.

SELECT First_Name
      ,Last_Name
      ,RTRIM(Last_Name) + ' ' + Substring(First_Name, 1, 1) + '.' AS Last_Name_1st
FROM Employee_Table ;

First_Name  Last_Name   Last_Name_1st
__________  __________  _____________
Richard     Smythe      Smythe R.
Cletus      Strickling  Strickling C.
Mandee      Chambers    Chambers M.
Herbert     Harrison    Harrison H.
Billy       Coffing     Coffing B.
John        Smith       Smith J.
Squiggy     Jones       Jones S.
Loraine     Larkins     Larkins L.
William     Reilly      Reilly W.

Why did we RTRIM the Last_Name? To get rid of the trailing spaces, or the output would have looked odd. How many items are being concatenated in the example above? There are 4 items concatenated. We start with the Last_Name (after we trim it), then we have a single space, then we have the first initial of the First_Name, and then we have a period.


The DATALENGTH Function and RTRIM

The DATALENGTH function returns the number of bytes used to store an input value. (Trailing spaces are automatically included for CHAR data types.)

Syntax: DATALENGTH (string)

SELECT First_Name
      ,DATALENGTH(First_Name) AS Lnth
      ,Last_Name
      ,DATALENGTH(RTRIM(Last_Name)) AS Lnth
FROM Employee_Table

First_Name  Lnth  Last_Name   Lnth
__________  ____  __________  ____
Richard        7  Smythe         6
Cletus         6  Strickling    10
Mandee         6  Chambers       8
Herbert        7  Harrison       8
Billy          5  Coffing        7
John           4  Smith          5
Squiggy        7  Jones          5
Loraine        7  Larkins        7
William        7  Reilly         6

The DATALENGTH function returns the number of bytes in the stored value, which for single-byte character columns is the same as the number of stored characters. The difference between the LEN and the DATALENGTH functions is that the LEN function excludes trailing spaces. However, the DATALENGTH function counts them. Use either the LEN function or merely RTRIM with DATALENGTH.


A Visual of the TRIM Command Using Concatenation

Concatenation without Trim and with Trim

SELECT Last_Name
      ,First_Name
      ,Last_Name + First_Name as NameBackwards
      ,RTRIM(Last_Name) + First_Name as TrimNameBackwards
FROM Employee_Table

Last_Name   First_Name  NameBackwards                TrimNameBackwards
__________  __________  ___________________________  _________________
Jones       Squiggy     Jones               Squiggy  JonesSquiggy
Smith       John        Smith               John     SmithJohn
Smythe      Richard     Smythe              Richard  SmytheRichard
Harrison    Herbert     Harrison            Herbert  HarrisonHerbert
Chambers    Mandee      Chambers            Mandee   ChambersMandee
Strickling  Cletus      Strickling          Cletus   StricklingCletus
Reilly      William     Reilly              William  ReillyWilliam
Coffing     Billy       Coffing             Billy    CoffingBilly
Larkins     Loraine     Larkins             Loraine  LarkinsLoraine

When you use the RTRIM command on a column, that column will have trailing spaces removed.


CHARINDEX Function Finds a Letter(s) Position in a String

Tell this function what character(s) to look for in a string, and optionally, at what position to start looking. If it does not find the character(s) in the string, it returns a 0. It also only reports the first occurrence.

Syntax: CHARINDEX(substring, string[, start_pos])

SELECT Last_Name
      ,CHARINDEX ('e', Last_Name) AS Find_E
      ,CHARINDEX ('f', Last_Name) AS Find_F
      ,CHARINDEX ('th', Last_Name) AS Find_TH
      ,CHARINDEX ('in', Last_Name, 6) AS Find_in_after_6
FROM Employee_Table
WHERE Last_Name IN ('Smith', 'Smythe', 'Strickling', 'Coffing')
ORDER BY 1 DESC;

Last_Name   Find_E  Find_F  Find_TH  Find_in_after_6
__________  ______  ______  _______  _______________
Strickling       0       0        0                8
Smythe           6       0        4                0
Smith            0       0        4                0
Coffing          0       3        0                0

Strickling does not have an 'e', 'f' or 'th' in it, but it does have an 'in' starting in position 8. Coffing shows only the first 'f' in position 3, but notice that Coffing also has an 'in' (in position 5). However, we stated to start looking in position 6, thus a zero was returned to indicate it didn't find an occurrence. Smith and Smythe both have a 'th' starting in position 4.
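The search rules described above (1-based positions, 0 when nothing is found, first occurrence only, optional starting position) can be sketched in a few lines of Python. This is only a model: the function name charindex is borrowed for readability, a case-insensitive collation is assumed, and edge cases such as an empty search string are not modeled.

```python
def charindex(search: str, value: str, start_pos: int = 1) -> int:
    """Return the 1-based position of the first match at or after start_pos, or 0."""
    offset = max(start_pos, 1) - 1                       # convert to a 0-based offset
    found = value.lower().find(search.lower(), offset)   # case-insensitive collation assumed
    return found + 1                                     # str.find gives -1 when absent, so this gives 0

print(charindex("e", "Smythe"))
print(charindex("f", "Coffing"))
print(charindex("in", "Strickling", 6))
print(charindex("in", "Coffing", 6))
print(charindex("May flowers", "April showers bring May flowers"))
```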


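The CHARINDEX Command is brilliant with SUBSTRING

The starting position is a nested function call. Find the first space and subtract two.

SELECT Last_Name
      ,SUBSTRING (Last_Name, CHARINDEX(' ', Last_Name) - 2, 2) as Last_Two_Letters
FROM Employee_Table;

Last_Name   Last_Two_Letters
__________  ________________
Smythe      he
Strickling  ng
Chambers    rs
Harrison    on
Coffing     ng
Smith       th
Jones       es
Larkins     ns
Reilly      ly

What was the starting position of the Substring in the above query? It is calculated by the nested CHARINDEX function (not a subquery), which finds the first space and then subtracts two. This works because Last_Name is a CHAR(20) column padded with spaces, so the first space sits right after the last letter of the name.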


The CHARINDEX Command Using a Literal

SELECT CHARINDEX('May flowers', 'April showers bring May flowers') ;

The first argument is the phrase we are seeking to find. The 1st character of that phrase starts at position 21 of the second argument.

(No column name)
________________
              21

We are looking for the phrase May flowers. This starts in position 21 of the string being searched.


PATINDEX Function

PATINDEX, short for "Pattern Index", will find patterns in an argument, somewhat similar to the LIKE command. The following example will show how to find the first occurrence of a digit within a string.

Syntax: PATINDEX(pattern, string)

SELECT PATINDEX('%[0-9]%', 'July 4th Holiday') as Number_Position;   -- position of any digit between 0-9 in the string

Number_Position
_______________
              6

PATINDEX will look for a pattern in a string and give you the position of the first character in the pattern. Above, we are using the literal 'July 4th Holiday', but we could have used a column value. The number 4 is in the 6th position of the value.


PATINDEX Function to Find a Character Pattern

PATINDEX, short for "Pattern Index", will find patterns in an argument, somewhat similar to the LIKE command. The example below will find any occurrence where the column Street has a 3 before the St.

Syntax: PATINDEX(pattern, string)

SELECT Subscriber_No
      ,Street
      ,PATINDEX('%[3]%St%', Street) As "Street_3"
FROM Addresses

Subscriber_No  Street                Street_3
_____________  ____________________  ________
5555555        121 Jump St.                 0
2222222        123 Some St.                 3
4444444        12 Jump St.                  0
1111111        123 Any St.                  3
3333333        2468 Appreciate Ave.         0

PATINDEX will look for a pattern in a string and give you the position of the first character in the pattern. Above, we are using the column Street to see if there is a 3 before the St. Notice that we have two hits and they are both in the 3rd position of the column Street.


SOUNDEX Function to Find a Sound

The SOUNDEX function converts a string into a four-character code based on how it sounds, so similar-sounding items receive the same code. The example below will find any Last_Name that sounds like 'Smith'.

Syntax: SOUNDEX(string)

SELECT DISTINCT SOUNDEX(Last_Name) SoundsLike1
      ,SOUNDEX('Smith') SoundsLike2
      ,Last_Name
FROM Employee_Table
WHERE SOUNDEX(Last_Name) = SOUNDEX('Smith')

SoundsLike1  SoundsLike2  Last_Name
___________  ___________  _________
S530         S530         Smith
S530         S530         Smythe

Call center employees often look up customers by last name while speaking with the customer on the phone. The employees would like to guess at the spelling of the name to narrow the search results and then work with the customer to determine the appropriate spelling. This is what the SOUNDEX function does. Above, we are looking at anyone who has a name that sounds like 'Smith'. We got two results back in 'Smith' and 'Smythe'.
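To give a feel for how codes such as S530 come about, here is a sketch of the classic Soundex rules in Python: keep the first letter, turn the remaining consonants into digits, skip vowels, collapse repeated codes, and pad with zeros to four characters. This is an illustration of the general algorithm only. The database's own SOUNDEX implementation may differ in edge cases, and the names CODES and soundex are assumptions.

```python
# Digit assigned to each consonant group in classic Soundex
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for letter in letters:
        CODES[letter] = digit

def soundex(name: str) -> str:
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                   # the first letter is kept as-is
    previous = CODES.get(letters[0], "")  # but its code still blocks an adjacent repeat
    for letter in letters[1:]:
        code = CODES.get(letter, "")      # vowels, H, W and Y have no code
        if code and code != previous:
            result += code
        if letter not in "HW":            # H and W do not separate equal codes
            previous = code
    return (result + "000")[:4]           # pad or cut to four characters

for name in ("Smith", "Smythe", "Jones", "Strickling"):
    print(name, soundex(name))
```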


DIFFERENCE Function to Quantile a Sound

The DIFFERENCE function compares how two strings sound and scores them from a high of 4 (high similarity) to a low of 0 (low similarity).

SELECT DISTINCT SOUNDEX(Last_Name) AS Sound1
      ,SOUNDEX('smith') AS Sound_Smith
      ,DIFFERENCE(Last_Name, 'Smith') as High4Low0
      ,Last_Name
FROM Employee_Table
ORDER BY 3 DESC ;

Sound1  Sound_Smith  High4Low0  Last_Name
______  ___________  _________  __________
S530    S530                 4  Smith        (sounds a lot like 'Smith')
S530    S530                 4  Smythe
J520    S530                 2  Jones
R400    S530                 2  Reilly
S362    S530                 2  Strickling
C152    S530                 1  Coffing
C516    S530                 1  Chambers
H625    S530                 1  Harrison     (sounds nothing like 'Smith')
L625    S530                 1  Larkins

Call center employees often look up customers by last name while speaking with the customer on the phone. The employees would like to guess at the spelling of the name to narrow the search results and then work with the customer to determine the appropriate spelling. The SOUNDEX and DIFFERENCE functions can both be used. Above, we are using the DIFFERENCE function to show how close the name 'Smith' is to other Last_Name values.


The REPLACE Function

The REPLACE function replaces all occurrences of substring1 in the string with substring2.

Syntax: REPLACE(string, substring1, substring2)

SELECT Customer_Name
      ,REPLACE (Customer_Name, ' ', '_') AS Under_Score   -- replace spaces with underscores
      ,Phone_Number
      ,REPLACE (Phone_Number, '-', ' ') AS No_Dash        -- replace dashes with spaces
FROM Customer_table

Customer_Name        Under_Score          Phone_Number  No_Dash
___________________  ___________________  ____________  ________
Billy's Best Choice  Billy's_Best_Choice  555-1234      555 1234
Acme Products        Acme_Products        555-1111      555 1111
ACE Consulting       ACE_Consulting       555-1212      555 1212
XYZ Plumbing         XYZ_Plumbing         347-8954      347 8954
Databases N-U        Databases_N-U        322-1012      322 1012

The REPLACE function replaces one value with another in a string. Above, we have replaced the spaces in a Customer_Name with underscores. In the Phone_Number, we have replaced the dashes (-) with a space.


LEN and REPLACE Functions for Number of Occurrences

SELECT Last_Name
      ,LEN(Last_Name) - LEN(REPLACE(Last_Name, 'r', '')) AS Num_of_Occur   -- '' is two single quotes
FROM Employee_Table
WHERE Last_Name LIKE '%r%'

Last_Name   Num_of_Occur
__________  ____________
Strickling             1
Chambers               1
Harrison               2
Larkins                1
Reilly                 1

The LEN function returns the number of characters in an input string.

Syntax: LEN (string)

The REPLACE function replaces all occurrences of substring1 in the string with substring2.

Syntax: REPLACE(string, substring1, substring2)

The REPLACE function and LEN function can be combined to find the number of occurrences of a character within a string. To do this, you replace all occurrences of the character with an empty string (zero characters) and calculate the original length of the string minus the new length. Notice that the capital 'R' in Reilly was counted, which is what a case-insensitive comparison does.
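The length-difference trick can be sketched as follows. This Python model assumes a single, non-space target character and a case-insensitive comparison (which is what the Reilly row in the answer set suggests). The function name count_occurrences is hypothetical.

```python
def count_occurrences(value: str, target: str) -> int:
    """Model of LEN(value) - LEN(REPLACE(value, target, '')) for one character."""
    trimmed = value.rstrip(" ")            # LEN ignores trailing spaces
    lowered = trimmed.lower()              # case-insensitive comparison assumed
    without = lowered.replace(target.lower(), "")
    return len(lowered) - len(without)     # each removed character shortens the string by one

for name in ("Strickling", "Harrison", "Reilly", "Smith"):
    print(name, count_occurrences(name.ljust(20), "r"))
```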


REPLICATE Function

The REPLICATE function replicates a string a requested number of times.

Syntax: REPLICATE(string, n)

SELECT Last_Name
      ,Class_Code
      ,REPLICATE(Class_Code, 3) AS Repeat_3_Times
      ,REPLICATE('Go Wildcats! ', 2) AS UofA
FROM Student_Table

Last_Name   Class_Code  Repeat_3_Times  UofA
__________  __________  ______________  _________________________
Phillips    SR          SRSRSR          Go Wildcats! Go Wildcats!
Hanson      FR          FRFRFR          Go Wildcats! Go Wildcats!
Wilson      SO          SOSOSO          Go Wildcats! Go Wildcats!
Thomas      FR          FRFRFR          Go Wildcats! Go Wildcats!
Johnson     ?           ?               Go Wildcats! Go Wildcats!
McRoberts   JR          JRJRJR          Go Wildcats! Go Wildcats!
Bond        JR          JRJRJR          Go Wildcats! Go Wildcats!
Delaney     SR          SRSRSR          Go Wildcats! Go Wildcats!
Smith       SO          SOSOSO          Go Wildcats! Go Wildcats!
Larkins     FR          FRFRFR          Go Wildcats! Go Wildcats!

The REPLICATE function replicates a string a number of times. Above, notice we replicated the Class_Code column 3 times. Also, notice that we replicated a literal value of 'Go Wildcats! ' 2 times. Did you notice that Johnson had a null value for his Class_Code? The null value did not replicate; it stayed null.


STUFF Function

The STUFF function works on a character string and will put STUFF where you want STUFF after deleting STUFF.

Syntax: STUFF(string, pos, delete_length, insertstring)

SELECT First_Name
      ,Class_Code
      ,STUFF (Class_Code, 2, 1, 'enior') As Full_Class_Code   -- start in 2nd position, delete 1 character, put in 'enior'
FROM Student_Table
WHERE Class_Code = 'SR'

First_Name  Class_Code  Full_Class_Code
__________  __________  _______________
Martin      SR          Senior
Danny       SR          Senior

The STUFF function operates on an input parameter string. It deletes as many characters as the number specified in the delete_length parameter, starting at the character position specified in the pos input parameter. The function inserts the string specified in the insertstring parameter in position pos. If you decide to insert a string and not delete anything, you can specify a length of 0 as the third argument.
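The delete-then-insert behavior can be modeled with two slices. The sketch below is a Python approximation in which None stands in for NULL. The function name stuff mirrors the SQL function for readability, and the guard conditions (a position below 1 or beyond the end of the string, or a negative length, yields NULL) are stated here as this sketch's assumptions about the edge cases.

```python
from typing import Optional

def stuff(value: Optional[str], pos: int, delete_length: int, insert: str) -> Optional[str]:
    """Delete delete_length characters starting at 1-based pos, then insert there."""
    if value is None or pos < 1 or pos > len(value) or delete_length < 0:
        return None                            # modeled as NULL
    head = value[:pos - 1]                     # everything before pos is kept
    tail = value[pos - 1 + delete_length:]     # everything after the deleted run is kept
    return head + insert + tail

print(stuff("SR", 2, 1, "enior"))
print(stuff("Advanced SQL", 1, 0, "Course: "))
```

With a delete_length of 0 nothing is removed, so the call simply inserts the new text in front of position pos.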


STUFF without Deleting Function

The STUFF function works on a character string and will put STUFF where you want STUFF after deleting STUFF.

Syntax: STUFF(string, pos, delete_length, insertstring)

SELECT Course_Name
      ,STUFF (Course_Name, 1, 0, 'Course: ') As Course_Added   -- start in 1st position, delete 0 characters, put in 'Course: '
FROM Course_Table

Course_Name               Course_Added
________________________  ________________________________
Advanced SQL              Course: Advanced SQL
Database Administration   Course: Database Administration
Introduction to SQL       Course: Introduction to SQL
Physical Database Design  Course: Physical Database Design
SQL Server Concepts       Course: SQL Server Concepts
V2R3 SQL Features         Course: V2R3 SQL Features

Above, we decided not to delete anything, but to insert a string called 'Course: ', so we specified a length of 0 as the third argument. The STUFF function operates on an input parameter string. It deletes as many characters as the number specified in the delete_length parameter, starting at the character position specified in the pos input parameter. The function inserts the string specified in the insertstring parameter in position pos.


UPPER and lower Functions

The UPPER and LOWER functions convert the input string to either all uppercase or all lowercase characters.

Syntax: UPPER(string), LOWER(string)

SELECT First_Name
      ,UPPER (First_Name) as "Upper Case"
      ,lower(First_Name) as "Lower Case"
FROM Student_Table

First_Name  Upper Case  Lower Case
__________  __________  __________
Martin      MARTIN      martin
Henry       HENRY       henry
Susie       SUSIE       susie
Wendy       WENDY       wendy
Stanley     STANLEY     stanley
Richard     RICHARD     richard
Jimmy       JIMMY       jimmy
Danny       DANNY       danny
Andy        ANDY        andy
Michael     MICHAEL     michael


Chapter 16 - Interrogating the Data

"The difference between genius and stupidity is that genius has its limits" - Albert Einstein


Quiz – What would the Answer be?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR              0.00
231222      Wilson      Susie       SO              3.80
280023      McRoberts   Richard     JR              1.90
322133      Bond        Jimmy       JR              3.95
125634      Hanson      Henry       FR              2.88
333450      Smith       Andy        SO              2.00
324652      Delaney     Danny       SR              3.35
260000      Johnson     Stanley     ?                  ?
234121      Thomas      Wendy       FR              4.00
123250      Phillips    Martin      SR              3.00

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) as Math1
FROM Student_Table
ORDER BY 1,2 ;

Can you guess what would return in the Answer Set? Use the Student_Table above and try to predict what the answer will be if this query is run on the system.


Answer to Quiz – What would the Answer be?

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) as Math1
FROM Student_Table
ORDER BY 1,2 ;

Error – Division by zero

You get an error when you DIVIDE by ZERO! Larkins has a Grade_Pt of 0.00 in the Student_Table, so Grade_Pt * 2 is zero for that row. Let's turn the page and fix it!


The NULLIF Command

SELECT Class_Code
      ,Grade_Pt / ( NULLIF (Grade_Pt, 0) * 2 ) AS Math1
FROM Student_Table;

SELECT Class_Code
      ,Grade_Pt / ( NULLIF( (Grade_Pt) * 2, 0 ) ) AS Math1
FROM Student_Table;

If you have a calculation where a ZERO could kill the operation, and you don't want that, you can use the NULLIF command to convert any zero value to a null value. Dividing by a null returns a null instead of an error. Both queries above, run against the same Student_Table, bring back the same result.
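Here is a small Python sketch of why NULLIF protects the division. None stands in for NULL, and the helper names nullif and math1 are assumptions made for this illustration. A zero divisor becomes NULL, and any NULL operand makes the whole result NULL rather than raising an error.

```python
def nullif(value, match):
    """Return None (NULL) when value equals match, otherwise value."""
    return None if value == match else value

def math1(grade_pt):
    """Model of Grade_Pt / (NULLIF(Grade_Pt, 0) * 2)."""
    divisor = nullif(grade_pt, 0)
    if grade_pt is None or divisor is None:
        return None                     # any NULL operand makes the result NULL
    return grade_pt / (divisor * 2)

for grade in (0.0, 3.8, None, 4.0):
    print(grade, math1(grade))
```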


Quiz – Fill in the Answers for the NULLIF Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR              0.00
123250      Phillips    Martin      SR              3.00
234121      Thomas      Wendy       FR              4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0)   AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Fill in the Answer Set below after looking at the table and the query.

Last_Name   GP1   GP2   GP3
__________  ____  ____  ____
Larkins
Phillips
Thomas

What would the above Answer Set produce from your analysis?


Answer – Fill in the Answers for the NULLIF Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR              0.00
123250      Phillips    Martin      SR              3.00
234121      Thomas      Wendy       FR              4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0)   AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Last_Name   GP1   GP2   GP3
__________  ____  ____  ____
Larkins     ?     0.00  0.00
Phillips    3.00  ?     3.00
Thomas      4.00  4.00  ?

Look at the answers above, and if they don't make sense, go over them again until they do.


The COALESCE Command – Fill In the Answers

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR              0.00
260000      Johnson     Stanley     ?                  ?
234121      Thomas      Wendy       FR              4.00

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,COALESCE (Grade_Pt, Class_Code) as ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Fill in the Answer Set below after looking at the table and the query.

Last_Name   Grade_Pt  Class_Code  ValidStudents
__________  ________  __________  _____________
Johnson     ?         ?
Larkins     0.00      FR
Thomas      4.00      FR

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.


The COALESCE Answer Set

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR              0.00
260000      Johnson     Stanley     ?                  ?
234121      Thomas      Wendy       FR              4.00

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,COALESCE (Grade_Pt, Class_Code) as ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Last_Name   Grade_Pt  Class_Code  ValidStudents
__________  ________  __________  _____________
Johnson     ?         ?           ?
Larkins     0.00      FR          0.00
Thomas      4.00      FR          4.00

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.


COALESCE is Equivalent to This CASE Statement

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,COALESCE (Grade_Pt, Class_Code) as ValidStudents
FROM Student_Table ;

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,CASE WHEN Grade_Pt IS NOT NULL THEN Grade_Pt
            WHEN Class_Code IS NOT NULL THEN Class_Code
            ELSE NULL
       END as ValidStudents
FROM Student_Table ;

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null. Above are two queries that return the exact same answer set. These examples are designed to give you a better idea of how Coalesce works.
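The same first-non-null logic can be sketched in Python, with None standing in for NULL. The function name coalesce is borrowed for readability, and the sample rows come from the three-row Student_Table used on the previous pages. Note that a Grade_Pt of 0.00 is a real value, not a null, so it is returned.

```python
def coalesce(*values):
    """Return the first value that is not None (NULL), or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None

students = [
    ("Johnson", None, None),
    ("Larkins", 0.00, "FR"),
    ("Thomas", 4.00, "FR"),
]
for last_name, grade_pt, class_code in students:
    print(last_name, coalesce(grade_pt, class_code))
```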


The Basics of CAST (Convert and Store)

CAST will convert a column or value's data type temporarily into another data type. Below is the syntax:

SELECT CAST(<column-name> AS <data-type>[(<length>)]) FROM <table-name> ;

Examples using CAST:

CAST ( <value> AS CHAR(5) )      -- convert smallint to character
CAST ( <value> AS INTEGER )      -- truncates decimals
CAST ( <value> AS VARCHAR(5) )
CAST ( <value> AS FLOAT )

Data can be converted from one type to another by using the CAST function. As long as the data involved does not break any data rules (i.e. placing alphabetic or special characters into a numeric data type), the conversion works. A handy way to remember the CAST function is as the Convert And STore operation that it performs.


Some Great CAST (Convert and Store) Examples

SELECT CAST('ABCDE' AS CHAR(1)) AS Trunc
      ,CAST(128 AS CHAR(3)) AS This_Is_OK
      ,CAST(127 AS INTEGER) AS Bigger ;

Trunc  This_Is_OK  Bigger
_____  __________  ______
A      128            127

The first CAST truncates the five characters (left to right) to form the single character 'A'. In the second CAST, the integer 128 is converted to three characters and left justified in the output. In the third CAST, the value 127 would fit in a SMALLINT (5 digits - up to 32767) but is converted to an INTEGER, so it is displayed right justified as numeric. In T-SQL, a whole-number literal of this size is already typed as an integer, so the value itself does not change.


Some Great CAST (Convert and Store) Examples

SELECT CAST(121.53 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole  Rounder
_____  _______
  121      122

The value of 121.53 was initially stored as a DECIMAL with 5 total digits, 2 of them to the right of the decimal point. In the first column, it is converted to a SMALLINT using CAST to remove the decimal positions. Therefore, it truncates data by stripping off the decimal portion. It does not round data using this data type. On the other hand, the CAST in the second column, called Rounder, converts to a DECIMAL with 3 digits and no digits to the right of the decimal point (3,0), so it will round data values instead of truncating. Since .53 is greater than .5, it is rounded up to 122.


A Rounding Example

SELECT CAST(.014  AS Decimal(3,2)) AS ".014"
      ,CAST(.016  AS Decimal(3,2)) AS ".016"
      ,CAST(.015  AS Decimal(3,2)) AS ".015"
      ,CAST(.0150 AS Decimal(3,2)) AS ".0150"
      ,CAST(.0250 AS Decimal(3,2)) AS ".0250"
      ,CAST(.0159 AS Decimal(3,2)) AS ".0159"

.014  .016  .015  .0150  .0250  .0159
____  ____  ____  _____  _____  _____
0.01  0.02  0.02  0.02   0.03   0.02

.014: Digit to the right of the rounding digit < 5 (no change)
.016: Digit to the right of the rounding digit > 5 (increase 1)

Above is an example of what you might expect to see in similar rounding examples.


Quiz - CAST Examples

SELECT Order_Number as OrdNo
      ,Customer_Number as CustNo
      ,Order_Date
      ,Order_Total
      ,CAST(Order_Total as integer) as Chopped
      ,CAST(Order_Total as Decimal(5,0)) as Rounded
FROM Order_Table
ORDER BY 1 ;

Fill in the Answer Set below after looking at the data and the query.

OrdNo   CustNo    Order_Date  Order_Total  Chopped  Rounded
______  ________  __________  ___________  _______  _______
123456  11111111  1998-05-04     12347.53
123512  11111111  1999-01-01      8005.91
123552  31323134  1999-10-01      5111.47
123585  87323456  1999-10-10     15231.62
123777  57896883  1999-09-09     23454.84

The column Chopped takes Order_Total (a Decimal (10,2)) and CASTs it as an integer, which chops off the decimals. Rounded CASTs Order_Total as a Decimal (5,0), which takes the decimals and rounds up if the decimal is .50 or above.


Answer to Quiz - CAST Examples

SELECT Order_Number as OrdNo
      ,Customer_Number as CustNo
      ,Order_Date
      ,Order_Total
      ,CAST(Order_Total as integer) as Chopped
      ,CAST(Order_Total as Decimal(5,0)) as Rounded
FROM Order_Table
ORDER BY 1 ;

OrdNo   CustNo    Order_Date  Order_Total  Chopped  Rounded
______  ________  __________  ___________  _______  _______
123456  11111111  1998-05-04     12347.53    12347    12348
123512  11111111  1999-01-01      8005.91     8005     8006
123552  31323134  1999-10-01      5111.47     5111     5111
123585  87323456  1999-10-10     15231.62    15231    15232
123777  57896883  1999-09-09     23454.84    23454    23455

The column Chopped takes Order_Total (a Decimal (10,2)) and CASTs it as an integer, which chops off the decimals. Rounded CASTs Order_Total as a Decimal (5,0), which takes the decimals and rounds up if the decimal is .50 or above.


Quiz - The Basics of the CASE Statements

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
          WHEN 1 THEN 'One Credit'
          WHEN 2 THEN 'Two Credits'
          WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ___________
Physical Database Design
SQL Features

This is a CASE STATEMENT which allows you to evaluate a column in your table, and from that, come up with a new answer for your report. Every CASE begins with a CASE, and they all must end with a corresponding END. What would the answer be?


Answer to Quiz - The Basics of the CASE Statements

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
          WHEN 1 THEN 'One Credit'
          WHEN 2 THEN 'Two Credits'
          WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ___________
Physical Database Design  ?
SQL Features              Two Credits

The answer for the Physical Database Design class is null. This is because it fell through the case statement. The answer for the SQL Features course is Two Credits. Once a case statement gets a match, it leaves the statement and gets the next row.


Using an ELSE in the Case Statement

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
          WHEN 1 THEN 'One Credit'
          WHEN 2 THEN 'Two Credits'
          WHEN 3 THEN 'Three Credits'
          ELSE 'Four Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ____________
Physical Database Design  Four Credits
SQL Features              Two Credits

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through.


Using an ELSE as a Safety Net

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
          WHEN 1 THEN 'One Credit'
          WHEN 2 THEN 'Two Credits'
          WHEN 3 THEN 'Three Credits'
          WHEN 4 THEN 'Four Credits'
          ELSE 'Do not know'
       END AS CreditAlias
FROM Course_Table ;

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through. An ELSE should be used in case you forgot a possibility and there was no match.


Rules For a Valued Case Statement

SELECT Course_Name
      ,CASE Credits
          WHEN 1 THEN 'One Credit'
          WHEN 2 THEN 'Two Credits'
          WHEN 3 THEN 'Three Credits'
          ELSE 'Credits not found'
       END AS CreditAlias
FROM Course_Table ;

The column Credits follows the word CASE. This is a valued case statement. The value is the column Credits.

Rules for a Valued CASE:
1. You can only check for equality
2. You can only check the value of the column Credits

There are two types of CASE statements: the Valued CASE and the Searched CASE. Above are the rules for the Valued CASE statement.
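Here is a small sketch that puts the two types of CASE side by side. The two sample rows are copied from Course_Table, but they are supplied inline so the query runs on its own; the aliases Valued_Alias, Searched_Alias and Course_Sample are made up for this illustration.

```sql
SELECT Course_Name
      ,CASE Credits                        -- Valued CASE: equality checks on one column
          WHEN 2 THEN 'Two Credits'
          WHEN 4 THEN 'Four Credits'
          ELSE 'Credits not found'
       END AS Valued_Alias
      ,CASE                                -- Searched CASE: each WHEN has its own condition
          WHEN Credits >= 4 THEN 'Four or more'
          WHEN Credits BETWEEN 2 AND 3 THEN 'Two or three'
          ELSE 'One or none'
       END AS Searched_Alias
FROM (SELECT 'SQL Features' AS Course_Name, 2 AS Credits
      UNION ALL
      SELECT 'Physical Database Design', 4) AS Course_Sample ;
```

The Valued CASE can only ask "is Credits equal to this value?", while the Searched CASE can test ranges or even different columns in each WHEN.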


Rules for a Searched Case Statement

No Value follows the word CASE. This is

SELECT Course_Name
      ,CASE
          WHEN Credits =

1030 is in page 3.


Chapter 17

Table Create and Data Types

The Building of a B-Tree for a Clustered Index (3 of 3)

[Figure: a B-Tree. The Root Node (entries 1001, 3000, 6000) points to three Intermediate Nodes (entries 1001, 1030, 2000; entries 3000, 4000, 5000; entries 6000, 7000, 8000). Each Intermediate Node entry points to a Leaf Page containing the actual data rows. Every page starts with a Header.]

Let's look at this B-Tree starting at the leaf level. Each leaf is an 8 K page that contains data rows. Each data row has a RowID containing the FileID:PageNo:RowNum, which takes up 8 bytes. The rows are sorted in each page by Employee_No. Each Intermediate node has a pointer to the first RowID and Employee_No for every leaf it is responsible for. The Root node has a pointer to the first RowID and Employee_No for each Intermediate node. As a leaf adds rows and expands past 8 K, it splits. As an Intermediate node adds leaves and expands past 8 K, it splits into two Intermediate nodes. As the Root node continues to add Intermediate node pointers and expands past 8 K, it splits as well, and a new Root node is created above the two halves. The reason they call it a B-Tree (Balanced Tree) is that every leaf is the same number of levels from the Root, so every row can be retrieved with the same number of page reads.


The Row Offset Array is the Guidance System For Every Row

PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1 1    1001  100  Rafael  Minal   90000
1000 2 2 1    1004  400  Kyle    Stover  60000
1000 2 3 1    1007  200  Sushma  Davis   50000
1000 2 4 1    1020  200  May     Jones   60000
1000 2 5 1    1030  500  Dawn    Wilson  50000
1000 2 6 1    1040  300  Red     Saylor  40000
1000 2 7 1    1050  300  Rex     Mason   60000
1000 2 8 1    1060  400  Kit     Wagner  50000

Row Offset Array (ROA), 2 Bytes for each ROA slot:
798 698 598 498 398 298 198 98

The Row Offset Array guides every search. It holds the starting position of every row within the page. It is always in perfect descending order. The first slot on the right (yellow) holds the starting position of the first row on the page (also in yellow).

When a page of data is moved from disk into memory it is ready to be searched to produce an answer set. Every read of every row will first go through the Row Offset Array. The page above holds eight rows. In each case, the Row Offset Array will be read and that will guide the Azure SQL Data Warehouse directly to the offset of the row. For example, to read the first row on the page the Azure SQL Data Warehouse will go to the last slot in the Row Offset Array (in yellow) and it will know that the first row starts in byte 98. It will then go to byte 98 and read the row.


The Row Offset Array Provides Two Search Options (1 of 2)

PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1 1    1001  100  Rafael  Minal   90000
1000 2 2 1    1004  400  Kyle    Stover  60000
1000 2 3 1    1007  200  Sushma  Davis   50000
1000 2 4 1    1020  200  May     Jones   60000
1000 2 5 1    1030  500  Dawn    Wilson  50000
1000 2 6 1    1040  300  Red     Saylor  40000
1000 2 7 1    1050  300  Rex     Mason   60000
1000 2 8 1    1060  400  Kit     Wagner  50000

Row Offset Array (ROA), 2 Bytes for each ROA slot:
798 698 598 498 398 298 198 98

1. The first search option, which is the slowest, is a sequential search. Each row will be read starting from the first row to the last. This is done when a query does not use an index. All Full Table Scans are sequential searches.

When a page of data is moved from disk into memory it is ready to be searched to produce an answer set. Every read of every row will first go through the Row Offset Array. The slowest search happens when there is no index being used. This is often called a Full Table Scan. There are eight rows in the above example. The Row Offset Array will be used with each read. The Azure SQL Data Warehouse will read the last offset first (yellow color) and then read the first row in the page starting at byte 98. The Azure SQL Data Warehouse will then read the second offset (pink color) and then read the second row at offset 198, and so on. Stay with me because there are two more reasons this design is always used.


The Row Offset Array Provides Two Search Options (2 of 2)

PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1 1    1001  100  Rafael  Minal   90000
1000 2 2 1    1004  400  Kyle    Stover  60000
1000 2 3 1    1007  200  Sushma  Davis   50000
1000 2 4 1    1020  200  May     Jones   60000
1000 2 5 1    1030  500  Dawn    Wilson  50000
1000 2 6 1    1040  300  Red     Saylor  40000
1000 2 7 1    1050  300  Rex     Mason   60000
1000 2 8 1    1060  400  Kit     Wagner  50000

Row Offset Array (ROA), 2 Bytes for each ROA slot:
798 698 598 498 398 298 198 98

2. The second search option, which is the fastest, is a Binary search. This search uses an index and it is like using a phone book. The first row read will be in the middle of the page. The system will then know whether to move up or down because the rows are sorted.

When a page of data is moved from disk into memory it is ready to be searched to produce an answer set. Every read of every row will first go through the Row Offset Array. When the data on the page is sorted using a clustered index, a binary search is fast. The Azure SQL Data Warehouse reads the Row Offset Array to find the row in the middle. It can then move up or down depending on whether it is too high or too low. It always cuts the remaining search in half. Imagine the query was searching for Employee 1050. The system would first use the Row Offset Array to go to the middle and read the row for employee 1020 (red arrow). It would then realize it was too low. It would then read the Row Offset Array to move to employee 1040. It would still be too low, so it would use the Row Offset Array to continue cutting the remaining rows in half, and then choose the row for Employee 1050. Found it! A binary search can be used on queries that take advantage of an index on a sorted page.


The Row Offset Array Helps With Inserts

PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1 1    1001  100  Rafael  Minal   90000
1000 2 2 1    1004  400  Kyle    Stover  60000
1000 2 3 1    1007  200  Sushma  Davis   50000
1000 2 4 1    1020  200  May     Jones   60000
1000 2 5 1    1030  500  Dawn    Wilson  50000
1000 2 6 1    1040  300  Red     Saylor  40000
1000 2 7 1    1050  300  Rex     Mason   60000
1000 2 8 1    1060  400  Kit     Wagner  50000
1000 2 9 1    1002  100  Bill    Mason   75000

Row Offset Array (ROA), 2 Bytes for each ROA slot:
798 698 598 498 398 298 198 898 98

The row for Employee_No 1002 has just been inserted. The new row logically sorts as row 2. The Azure SQL Data Warehouse physically places it at the end of the page, but logically places it second in the Row Offset Array.

When a page is sorted using a clustered index the rows are sorted physically and logically. Let me explain. The Row Offset Array is always in perfect descending order. Above, you can see that each row is sorted physically by Employee_No. In a perfect world, the Row Offset Array logically lists the rows on the page in perfect descending order and the rows are physically in perfect order within the page. However, when SQL is used for an insert statement, it will often write the row physically as the last row on the page (for speed), but it will still list the row in the Row Offset Array in perfect logical order. Notice, we added a new row (in black) at the end of the page. Since this table is sorted with a clustered index on Employee_No you should notice that the row in black has an Employee_No of 1002. It should physically be the second row on the page. It is the second row according to the Row Offset Array. Any sequential search will read the black row second.


What is a Uniquefier?

PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1 1    1001  100  Rafael  Minal   90000
1000 2 2 1    1004  400  Kyle    Stover  60000
1000 2 3 1    1007  200  Sushma  Davis   50000
1000 2 4 1    1020  200  May     Jones   60000
1000 2 5 1    1030  500  Dawn    Wilson  50000
1000 2 6 1    1040  300  Red     Saylor  40000
1000 2 7 1    1050  300  Rex     Mason   60000
1000 2 8 1    1060  400  Kit     Wagner  50000
1000 2 9 2    1060  300  Will    Day     75000

Row Offset Array (ROA), 2 Bytes for each ROA slot:
898 798 698 598 498 398 298 198 98

The Uniquefier (the last number in the RowID) identifies duplicate values in a clustered index.

When a page is sorted using a clustered index the rows are sorted physically and logically. Since the Azure SQL Data Warehouse does not allow for Unique Clustered Indexes, a Uniquefier is added to the Row ID. Above, we have two individuals who have an Employee_No of 1060. The first employee gets the Uniquefier of 1 and the second of 2.


Adding an Index

CREATE TABLE Emp_Intl
( Employee_No INTEGER
 ,Dept_No     SMALLINT
 ,First_Name  VARCHAR(12)
 ,Last_Name   CHAR(20)
 ,Salary      DECIMAL(8,2)
) ;

1. CREATE CLUSTERED INDEX Idx1 ON Emp_Intl (Employee_No);

2. CREATE INDEX Idx2 ON Emp_Intl (Dept_No);

Above, we have created a table called Emp_Intl. Each row in the table will contain a RowID. The RowID is 8 bytes in size and contains the FileID, PageID, and SlotNo. A table can only have one clustered index. A clustered index sorts the rows of the table by the clustered key column value. In this example, the rows will be sorted in ascending order by Employee_No. A table can have numerous NON-CLUSTERED INDEXES. Think of them as pointers to data. They are implemented as B-TREES. More about B-TREES on the next slide.


When Do I Create a Non Clustered Index?

1. On columns that contain a large number of distinct values. This might be a combination of last name and first name, or a social security number. If there are many duplicates then SQL Server will perform a sequential scan instead.
2. When queries do not return large result sets. This goes back to having many distinct values.
3. On columns that are frequently involved in WHERE clauses that utilize equality searches.
4. Not on OLTP applications, but on large Decision Support System (DSS) applications, where joins and grouping are frequently required. A best practice is to create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on the foreign keys.
5. In Cover query situations. A Cover query uses only the non clustered index to retrieve the data that satisfies the query, instead of utilizing the table. The answer set is said to be covered by the index.

Following the do's and don'ts on this page can enhance performance and prevent difficulties.
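As a rough sketch of a Cover query, consider the Emp_Intl table from this chapter. The index name Idx_Dept_Salary is made up for this illustration, and whether the optimizer actually chooses the index depends on your data and system.

```sql
-- Hypothetical index name; Emp_Intl is the table used in this chapter.
CREATE INDEX Idx_Dept_Salary ON Emp_Intl (Dept_No, Salary);

-- Every column this query touches (Dept_No and Salary) lives in the index,
-- so the answer set can be covered by the index without reading the table.
SELECT Dept_No
      ,Salary
FROM Emp_Intl
WHERE Dept_No = 400 ;
```

If the query also asked for Last_Name, the index would no longer cover it and the system would have to go back to the base table for that column.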


B-Tree for Non Clustered Index on a Clustered Table (1 of 2)

Root (Non Clustered Index)      Previous Page# - null      Next Page# - 2,3

Dept_No  Employee_No (Clustered Index Values)
_______  ____________________________________
100      1001
200      1007  1020
300      1040  1050
400      1004  1060
500      1030

Leaf Page: PAGE 2      Previous Page# - 1      Next Page# - 3

1000 2 1    1001  100  Rafael  Minal   90000
1000 2 2    1004  400  Kyle    Stover  60000
1000 2 3    1007  200  Sushma  Davis   50000
1000 2 4    1020  200  May     Jones   60000

Row Offset Array: 398 298 198 98

Leaf Page: PAGE 3      Previous Page# - 2      Next Page# - Null

1000 3 1    1030  500  Dawn    Wilson  50000
1000 3 2    1040  300  Red     Saylor  40000
1000 3 3    1050  300  Rex     Mason   60000
1000 3 4    1060  400  Kit     Wagner  50000

Row Offset Array: 398 298 198 98

A non clustered index will utilize a B-Tree node and it will always have a root node. A non clustered index will store the index value in order within the index node. When a non clustered index is created on a table that has a clustered index, then the index node will contain two values: index value and clustered index value(s). Above, we created a non clustered index on the column Dept_No. Since the base table also had a clustered index on Employee_No, the Employee_No values are also included. So, if your query wanted to retrieve all rows WHERE the Dept_No was equal to 400, then the Azure SQL Data Warehouse would look in the non clustered index and see that there were two rows, the Employee_No 1004 and 1060. Then, the system would use the clustered index to find them.


B-Tree for Non Clustered Index on a Clustered Table (2 of 2)

[Figure: a B-Tree on Dept_No. The Root Node (entries 100, 300, 500) points to three Intermediate Nodes (starting at Dept_No 100, 300 and 500), which point to the Leaf Pages containing the actual data rows. Every page starts with a Header.]

We created a non clustered index on the column Dept_No on a table with a clustered index on Employee_No. A non clustered index in this example will sort by Dept_No and point to the row(s) Employee_No. A page always allocates 8 K for both disk and memory use. If a leaf, intermediate or even root node reaches the 8 K limit, it will split into two nodes.


Adding a Non Clustered Index To A Heap

CREATE TABLE Emp_Intl
( Employee_No INTEGER
 ,Dept_No     SMALLINT
 ,First_Name  VARCHAR(12)
 ,Last_Name   CHAR(20)
 ,Salary      DECIMAL(8,2)
) ;

1. CREATE INDEX Idxlast ON Emp_Intl (Last_Name);

Above, we have created a table called Emp_Intl and it does not have a clustered index. This means that the rows are unordered and stored in a heap. Each row in the table will contain a RowID. The RowID is 8 bytes in size and contains the FileID, PageID, and SlotNo. This table could have many non clustered indexes, but we only created one on Last_Name. The next pages will show the B-Tree for the newly created non clustered index.


B-Tree for Non Clustered Index on a Heap Table (1 of 2)

Root (Non Clustered Index)      Previous Page# - null      Next Page# - 2,3

Last_Name  RowID
_________  ________
Davis      1000:1:2
Jones      1000:1:3
Minal      1000:1:1
Stover     1000:1:4

Leaf Page (Physical Rows). In a heap rows are not sorted.

Row ID (Row Identifier)
FileID  PageNum  SlotNum
______  _______  _______
1000    1        1          1001  100  Rafael  Minal   90000
1000    1        2          1007  200  Sushma  Davis   50000
1000    1        3          1020  200  May     Jones   60000
1000    1        4          1004  400  Kyle    Stover  60000

FREE SPACE

A non clustered index will utilize a B-Tree node and it will always have a root node. A non clustered index will store the index value in order within the index node. When a non clustered index is created on a table that is a heap, then the index node will contain two values: index value and RowID(s). Above, we created a non clustered index on Last_Name. The index will contain every Last_Name sorted and the RowID.


B-Tree for a Non Clustered Index on a Heap Table (2 of 2)

[Figure: a B-Tree on Last_Name. The Root Node (entries Adams, Jones, Sims) points to three Intermediate Nodes (starting at Adams, Jones and Sims), which point to the Leaf Pages containing the actual data rows. Every page starts with a Header.]

We created a non clustered index on the column Last_Name on a table that had no clustered index, which is considered an unordered heap of rows. A non clustered index in this example will sort by Last_Name and point to the RowID(s) on the leaf page. A page always allocates 8 K for both disk and memory use. If a leaf page, intermediate node or even root node reaches the 8 K limit, it will split into two leaf pages or nodes.


Default Values

CREATE TABLE Emp_Intl
( Employee_No INTEGER NOT NULL DEFAULT ('')
 ,Dept_No     SMALLINT
 ,First_Name  VARCHAR(12)
 ,Last_Name   CHAR(20)
 ,Salary      DECIMAL(8,2)
) ;

Above, we have directed the Azure SQL Data Warehouse to put in an empty string (two single quotation marks with no space between them) as the default value.

If you don’t desire a NULL in a particular column, you can alternatively utilize a default value to indicate that the column has not yet been populated. All you need to do is specify a DEFAULT constraint by adding the DEFAULT clause right after saying NOT NULL.
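The sketch below shows a default value in action on a character column. The table name Emp_Default_Demo is made up for this illustration, and only the Last_Name column carries a default.

```sql
-- Hypothetical demo table: only Last_Name has a DEFAULT.
CREATE TABLE Emp_Default_Demo
( Employee_No INTEGER  NOT NULL
 ,Last_Name   CHAR(20) NOT NULL DEFAULT ('')
) ;

-- Last_Name is left out of the INSERT, so the default (an empty string)
-- is stored instead of a NULL.
INSERT INTO Emp_Default_Demo (Employee_No) VALUES (1001) ;

SELECT Employee_No
      ,Last_Name
FROM Emp_Default_Demo ;
```

Without the DEFAULT clause, this INSERT would fail because Last_Name is NOT NULL and no value was supplied.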


Chapter 18 – View Functions

"Be the change that you want to see in the world." - Mahatma Gandhi


The Fundamentals of Views

View Fundamentals

A view is a virtual table.
A view may define a subset of columns.
A view can even define a subset of rows if it has a WHERE clause.
A view never duplicates data or stores the data separately.
Views provide security.

View Advantages

An additional level of security is provided.
Helps the business user not miss join conditions.
Helps control read and update privileges.
Unaffected when new columns are added to a table.
Unaffected when a column is dropped, unless it is referenced in the view.

View Recommendations

The above is designed to introduce View fundamentals and View advantages.


Creating a Simple View to Restrict Sensitive Columns

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

CREATE View Employee_V AS
SELECT Employee_No
      ,First_Name
      ,Last_Name
      ,Dept_No
FROM Employee_Table ;

The purpose of views is to restrict access to certain columns, derive columns or join tables, and to restrict access to certain rows (if a WHERE clause is used). This view does not allow the user to see the column Salary.


Creating a Simple View to Restrict Rows

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

CREATE VIEW Employee_View AS
SELECT First_Name
      ,Last_Name
      ,Dept_No
      ,Salary
FROM Employee_Table
WHERE Dept_No IN (300, 400) ;

The purpose of views is to restrict access to certain columns, derive columns or join tables, and to restrict access to certain rows (if a WHERE clause is used). This view does not allow the user to see information about rows unless the rows have a Dept_No of either 300 or 400.


Basic Rules for Views

No ORDER BY inside the View CREATE (exceptions exist)
All Aggregation needs to have an ALIAS
Any Derived columns (such as Math) needs an ALIAS

CREATE View DeptSal_V AS
SELECT Dept_No
      ,SUM(Salary) as SumSal
      ,SUM(Salary) / 12 as MonthSal
FROM Employee_Table
GROUP BY Dept_No;

Why do these two columns need aliases? So we can bring them back in the SELECT query.

You don't put an Order By in the view creation. Users put the Order By when selecting from the view.

SELECT Dept_No
      ,SumSal
FROM DeptSal_V
Order By 1 ;

Above are the basic rules of Views with excellent examples.


Two Exceptions to the ORDER BY Rule inside a View

CREATE VIEW Top_Sal_V AS
SELECT TOP 3 *
FROM Employee_Table
ORDER BY Salary DESC;

The TOP command goes with Order By like bread goes with butter.

Create view Sales_Olap_V AS
SELECT Product_ID, Sale_Date, Daily_Sales
      ,Sum(Daily_Sales) OVER (ORDER BY Daily_Sales) as "CSUM"
FROM Sales_Table ;

Every ANSI Ordered Analytic has an Order By in it naturally.

There are EXCEPTIONS to the ORDER BY rule. The TOP command allows a view to work with an ORDER BY inside. ANSI OLAP statements also work inside a View.


Views sometimes CREATED for Row Security

CREATE VIEW empl_200_v AS
SELECT Employee_No AS Emp_No
      ,Last_Name AS Last
      ,Salary/12 AS Mnth_Sal
FROM Employee_Table
WHERE Dept_No = 200 ;

Only Dept_No 200 employees return.

SELECT * FROM Empl_200_v ORDER BY Mnth_Sal ;

Emp_No   Last     Mnth_Sal
_______  _______  ____________
1324657  Coffing  3,490.740000
1333454  Smith    4,000.000000

Views are designed to do many things. In the example above, the view derives data, limits columns, and also limits the rows coming back with a WHERE.


Creating a View to Join Tables Together

This view is designed to join two tables together. By creating a view, we have now made it easier for the user community to join these tables by merely selecting the columns you want from the view. The view exists now in the database sql_views and accesses the tables in sql_class.
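A view that joins two tables might look like the sketch below. The view name EmpDept_V and the column Department_Name are assumptions made for this illustration, so adjust them to match your own tables.

```sql
-- Hypothetical view name; Department_Name is an assumed column
-- of Department_Table.
CREATE VIEW EmpDept_V AS
SELECT E.Employee_No
      ,E.Last_Name
      ,E.Dept_No
      ,D.Department_Name
FROM Employee_Table AS E
INNER JOIN Department_Table AS D
        ON E.Dept_No = D.Dept_No ;
```

Because the join condition lives inside the view, users who select from the view cannot forget it.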


You Select From a View

Once the view is created, users can query it with a SELECT statement. Above, we have queried the view we created to join the employee_table to the department_table (created on the previous page). Users can select all columns with an asterisk, or they can choose individual columns (separated by commas). Above, we selected all columns from the view.


Another Way to Alias Columns in a View CREATE

CREATE VIEW E_View (Emp_Nbr, Last, Mnth_Sal) AS
SELECT Employee_No
      ,Last_Name
      ,Salary/12
FROM Employee_Table
WHERE Dept_No = 200 ;

Option 1: Aliases can be placed in the column list after the view name.

SELECT * FROM E_View ORDER BY Mnth_Sal ;

Emp_Nbr  Last     Mnth_Sal
_______  _______  ________
1324657  Coffing   3490.74
1333454  Smith     4000.00

Will this View CREATE work or will it error? It works fine because it’s aliased above!


The Standard Way Most Aliasing is done

CREATE VIEW emp_v2 AS
SELECT Employee_No
      ,Last_Name
      ,Salary/12 as Sal_Monthly
FROM Employee_Table
WHERE Dept_No = 200 ;

Option 2: The most popular form of aliasing.

SELECT * FROM Emp_v2 ORDER BY Sal_Monthly ;

Employee_No  Last_Name  Sal_Monthly
___________  _________  ___________
1324657      Coffing        3490.74
1333454      Smith          4000.00

The ALIAS for Salary / 12 that’ll be used in this example is Sal_Monthly and this form of aliasing is most often used.


What Happens When Both Aliasing Options Are Present

CREATE VIEW emp_v3 (Emp_Nbr, Last, Mnth_Sal) AS
SELECT Employee_No
      ,Last_Name
      ,Salary/12 as Sal_Mnth
FROM Employee_Table
WHERE Dept_No = 200 ;

Once you alias in the column list after the view name, that is the alias. The alias Sal_Mnth in the SELECT list is not recognized.

SELECT * FROM Emp_v3 ORDER BY 3 ;

Emp_Nbr  Last     Mnth_Sal
_______  _______  _________
1324657  Coffing  $3,490.74
1333454  Smith    $4,000.00

The ALIAS for Salary / 12 that’ll be used in this example is Mnth_Sal. It came first at the top, even though it is aliased in the SELECT list also.


Resolving Aliasing Problems in a View CREATE

CREATE VIEW emp_v3 (Emp_Nbr, Last, Mnth_Sal) AS
SELECT Employee_No
      ,Last_Name
      ,Salary/12 as Sal_Mnth
FROM Employee_Table
WHERE Dept_No = 200 ;

Once you alias in the column list after the view name, that is the alias. The alias Sal_Mnth in the SELECT list is not recognized.

SELECT * FROM Emp_v3 ORDER BY Sal_Mnth ;

What happens when this query runs?


Answer to Resolving Aliasing Problems in a View CREATE

CREATE VIEW emp_v3 (Emp_Nbr, Last, Mnth_Sal) AS
SELECT Employee_No
      ,Last_Name
      ,Salary/12 as Sal_Mnth
FROM Employee_Table
WHERE Dept_No = 200 ;

Once you alias in the column list after the view name, that is the alias. The alias Sal_Mnth in the SELECT list is not recognized.

SELECT * FROM Emp_v3 ORDER BY Sal_Mnth ;

What happens when this query runs?

Error – Sal_Mnth is unrecognized

The query above errors because Sal_Mnth is an unrecognized alias. That is because we did our aliasing at the top, so this makes the alias right after Salary/12 non-valid for use when querying the view.
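One way to resolve the problem is to order by the alias that the view actually exposes. This short sketch assumes the emp_v3 view was created exactly as shown on this page.

```sql
-- Mnth_Sal comes from the column list after the view name,
-- so it is the alias that users of the view can reference.
SELECT *
FROM Emp_v3
ORDER BY Mnth_Sal ;
```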


Aggregates on View Aggregates

CREATE VIEW Aggreg_Order_v AS
SELECT Customer_Number
      ,COUNT(Order_Total) AS Order_Cnt
      ,SUM(Order_Total) AS Order_Sum
      ,AVG(Order_Total) AS Order_Avg
FROM Order_Table
GROUP BY Customer_Number ;

SELECT Customer_Number
      ,Order_Sum
FROM Aggreg_Order_v ;

Customer_Number  Order_Sum
_______________  _________
31323134           5111.47
87323456          15231.62
11111111          20353.44
57896883          23454.84

SELECT SUM (Order_Sum) FROM Aggreg_Order_v ;

SUM(Order_Sum)
______________
64151.37

The examples above show how we put a SUM on the aggregate Order_Sum. Customer 11111111 has two orders (8005.91 and 12347.53), so the GROUP BY in the view returns a single row of 20353.44 for that customer.
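Other aggregates can be stacked on the view's aggregated columns in the same way. The sketch below assumes the Aggreg_Order_v view from this page; the aliases Most_Orders and Avg_Customer_Total are made up for this illustration.

```sql
-- Aggregates applied on top of columns that are already aggregates in the view.
SELECT MAX(Order_Cnt) AS Most_Orders          -- largest number of orders for one customer
      ,AVG(Order_Sum) AS Avg_Customer_Total   -- average of the per-customer totals
FROM Aggreg_Order_v ;
```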


Altering a Table

CREATE TABLE Employee_Table2
WITH (Distribution = Replicate)
AS SELECT * from employee_table;

CREATE VIEW Emp_HR_v AS
SELECT Employee_No
      ,Dept_No
      ,Last_Name
      ,First_Name
FROM Employee_Table2 ;

Altering the actual Table

ALTER TABLE Employee_Table2 ADD Mgr_No INTEGER ;

Will the View STILL run?

SELECT * FROM Emp_HR_v;

YES!

This view will run after the table has added an additional column!


Altering a Table after a View has been Created

CREATE TABLE Employee_Table3
WITH (Distribution = Replicate)
AS SELECT * from employee_table;

CREATE VIEW Emp_HR_v3 AS
SELECT * FROM Employee_Table3 ;

Altering the actual Table

ALTER TABLE Employee_Table3 ADD Mgr_No INTEGER ;

Will the View STILL run?

SELECT * FROM Emp_HR_v3;

YES! Only columns present when the view was created will be visible.

This view runs after the table has added an additional column, but it won’t include Mgr_No in the view results even though there is a SELECT * in the view. The View includes only the columns present when the view was CREATED.
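If you want the view to pick up the new column, one straightforward approach is to drop the view and create it again. This is a sketch that assumes the Emp_HR_v3 view and Employee_Table3 from this page. GO is the batch separator used by tools such as SSMS or sqlcmd; if your tool does not use it, run each statement separately.

```sql
DROP VIEW Emp_HR_v3 ;
GO

-- Re-creating the view makes SELECT * expand to the table's current columns.
CREATE VIEW Emp_HR_v3 AS
SELECT * FROM Employee_Table3 ;
GO

-- Mgr_No is now part of the view results.
SELECT * FROM Emp_HR_v3 ;
```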


A View that Errors After an ALTER

CREATE TABLE Employee_Table5
WITH (Distribution = Replicate)
AS SELECT * from employee_table;

CREATE VIEW Emp_HR_v5 AS
SELECT Employee_No
      ,Dept_No
      ,Last_Name
      ,First_Name
FROM Employee_Table5 ;

Altering the actual Table

ALTER TABLE Employee_Table5 DROP COLUMN Dept_No;

Will the View STILL run?

SELECT * FROM Emp_HR_v5;

ERROR

This view will NOT run after the table has dropped a column referenced in the view.


Troubleshooting a View

CREATE VIEW Emp_HR_v6 AS
SELECT *
FROM Employee_Table6 ;

Altering the actual Table

ALTER TABLE Employee_Table6 DROP COLUMN Dept_No ;

Will the View STILL run?

SELECT * FROM Emp_HR_v6;

Error

This view will NOT run after the table has dropped a column referenced in the view, even though the view was created with a SELECT *. At view CREATE time, the SELECT * was resolved to the columns present in the table, and Dept_No was one of those columns. Once Dept_No was dropped, the view no longer works.

Loading Data through a View

New row Inserted

You can actually utilize a view to load data.


Chapter 19 – Data Manipulation Language (DML)

“I tried to draw people more realistically, but the figure I neglected to update was myself.” - Joe Sacco


INSERT Syntax #1

The following syntax of the INSERT does not use the column names as part of the command. Therefore, it requires that the VALUES portion of the INSERT match each and every column in the table with a data value or a NULL.

INSERT [ INTO ] <table-name> VALUES ( <literal-data-value> [ ..., <literal-data-value> ] ) ;

The INSERT statement is used to put new row(s) into a table. A status is the only returned value from the database; no rows are returned to the user. This INSERT syntax requires either a data value or a NULL for all the columns in a table. When executed, this code places a single new row into a table.

INSERT Example with Syntax 1

INSERT INTO Employee_Table VALUES ( 20, 5, 'Jones', NULL , 45000) ;

Employee_No   Dept_No   Last_Name   First_Name   Salary
___________   _______   _________   __________   ______
         20         5   Jones       NULL          45000

The Employee_Table was created with these columns in this order: Employee_No, Dept_No, Last_Name, First_Name, Salary

After the execution of the above INSERT, there is a new row with the integer value of 20 going into Employee_No, the integer value of 5 going into Dept_No, the character value of Jones going into Last_Name, a NULL value going into First_Name, and the value of 45000 going into Salary. The NULL expressed in the VALUES list is the literal representation for no data.

INSERT Syntax #2

The syntax of the second type of INSERT follows:

INSERT [ INTO ] <table-name> ( <column-name> [..., <column-name> ] ) VALUES ( <literal-data-value> [..., <literal-data-value> ] ) ;

This is another form of the INSERT statement that can be used when some of the data is not available. It allows for the missing values (NULL) to be eliminated from the list in the VALUES clause. It is also the best format when the data is arranged in a different sequence than the CREATE TABLE.

INSERT Example with Syntax 2

INSERT INTO Employee_Table8 (Employee_No, Dept_No, First_Name, Last_Name)
VALUES ( 24, 5, 'Joe', 'Smoe') ;

Notice that only four columns were inserted and that there are five columns in the row. The system filled the omitted column, Salary, with a NULL.

SELECT * FROM Employee_Table8 WHERE Employee_No = 24 ;

Employee_No   Dept_No   Last_Name   First_Name   Salary
___________   _______   _________   __________   ______
         24         5   Smoe        Joe               ?

The above statement incorporates both of the reasons to use this syntax. First, notice that the column names Last_Name and First_Name have been switched in the column list to match the order of the data values. Also, notice that Salary does not appear in the column list; therefore, it is set to NULL.

INSERT/SELECT Command

The syntax of the INSERT / SELECT is:

INSERT [ INTO ] <table-name> SELECT <column-name> [..., <column-name> ] FROM <table-name> ;

Although the INSERT is great for adding a single row not currently present in the system, an INSERT/SELECT is even better when the data already exists within the Azure SQL Data Warehouse. In this case, the INSERT is combined with a SELECT. However, no rows are returned to the user. Instead, they go into the table as new rows.

The SELECT reads the data values from one or more columns in one or more tables and uses them as the values to INSERT into another table. Simply put, the SELECT takes the place of the VALUES portion of the INSERT. This is a common technique for building data marts, interim tables and temporary tables. It is normally a better and much faster alternative than extracting the rows to a data file, then reading the data file and inserting the rows using a utility.

INSERT/SELECT Example using All Columns (*)

CREATE TABLE Employee_table9
(Employee_No integer,
 Dept_No smallint,
 Last_name char(20),
 First_name varchar(12),
 Salary decimal(8,2));

INSERT INTO Employee_Table9
SELECT * FROM Employee_Table;

This is a classic example of an INSERT SELECT statement. Because both tables have the exact same columns in the exact same order, the SELECT * works just fine.

INSERT/SELECT Example with Fewer Columns

When fewer than all the columns are desired or you want to change certain values, any of the following INSERT / SELECT statements will do the job:

INSERT INTO Order_Table4 (Order_Number, Customer_Number, Order_Date, Order_Total)
SELECT Order_Number, Customer_Number, '2015-06-30', Order_Total
FROM Order_Table ;

(Order_Date is a literal value.)

INSERT INTO Order_Table5 (Order_Number, Customer_Number, Order_Date, Order_Total)
SELECT Order_Number, Customer_Number, GetDate(), Order_Total
FROM Order_Table ;

(Order_Date is the system value for the current date.)

INSERT INTO Order_Table6 (Order_Number, Customer_Number)
SELECT Order_Number, Customer_Number
FROM Order_Table ;

(The Order_Date and Order_Total columns have NULL values.)

In the first two examples, Order_Date is supplied by the INSERT itself rather than copied from Order_Table. In the first INSERT, the data is a literal date. The second INSERT uses the GETDATE() function. Both are acceptable, depending on what is needed. As with a normal INSERT that uses column names, the only data values needed are for the listed columns, and they must be in the same sequence as the column list, not the CREATE TABLE. As the final example shows, any column omitted from the column list receives a NULL data value.

The UPDATE Command Basic Syntax

UPDATE <table-name> SET <column-name> = { <expression> | <data-value> } [..., <column-name> = <expression> | <data-value> ] [ FROM <table-name> [ AS <alias-name> ] ] [ WHERE <condition> ] [ AND <condition> … ] [ OR <condition> … ] ;

The UPDATE statement is used to modify data values in one or more columns of one or more existing rows. A status is the only returned value from the database; no rows are returned to the user. When business requirements call for a change to be made in the existing data, then the UPDATE is the SQL statement to use. In order for the UPDATE to work, it must know a few things about the data row(s) involved. Like all SQL, it must know which table to use for making the change, which column or columns to change and the change to make within the data.

Two UPDATE Examples

UPDATE Order_Table6
SET Order_Date = GetDate()
   ,Order_Total = 10500.25
WHERE Order_Number = 123456;

UPDATE Order_Table6
SET Order_Date = '2016/06/30'
   ,Order_Total = 14500.23
WHERE Order_Number = 123512
AND Customer_Number = 11111111;

The first UPDATE command modifies all rows for Order_Number 123456. It changes the values in two columns with new data values provided after the equal sign (=). The next UPDATE uses the same table as the first statement. It determines which row(s) to modify with compound conditions written in the WHERE clause based on values stored in other columns.

Subquery UPDATE Command Syntax

UPDATE <table-name> SET <column-name> = { <expression> | <data-value> } [..., <column-name> = <expression> | <data-value> ] [ FROM <table-name> [ AS <alias-name> ] ] WHERE <column-name> [..., <column-name> ] IN ( SELECT <column-name> [..., <column-name> ] FROM <table-name> [ WHERE … ] ) ;

Sometimes it is necessary to update rows in a table when they match rows in another table. To accomplish this, the tables must have one or more columns in the same domain. The matching process then involves either a subquery or join processing.

Example of Subquery UPDATE Command

Order_Table6 can be changed based on Order_Table. The following UPDATE uses a subquery operation to accomplish the operation:

UPDATE Order_Table6
SET Order_Date = GetDate()
WHERE Order_Number IN
      (SELECT Order_Number
       FROM Order_Table
       WHERE Order_Total > 10000) ;

The UPDATE above sets Order_Date to the current date for every row in Order_Table6 whose Order_Number matches an order in Order_Table with an Order_Total greater than 10000.

Join UPDATE Command Syntax

UPDATE <table-name> SET <column-name> = { <expression> | <data-value> } [..., <column-name> = <expression> | <data-value> ] [ FROM <table-name> [ AS <alias-name> ] ] WHERE [<table-name>.]<column-name> = [<table-name>.]<column-name> [ AND <condition> ] [ OR <condition> ] ;

When adding an alias to the UPDATE, the alias becomes the table name and MUST be used in the WHERE clause when qualifying columns.

Example of an UPDATE Join Command

Order_Table6 can be changed based on the original Order_Table. The following UPDATE uses a join operation to accomplish the operation:

UPDATE Order_Table6
SET Order_Total = 11000
FROM Order_Table AS Orig
WHERE Order_Table6.Customer_Number = Orig.Customer_Number
AND Order_Table6.Customer_Number = 11111111 ;

Above, we join Order_Table6 to Order_Table on Customer_Number, and the additional AND clause limits the update to customer 11111111. Notice that the alias Orig is used to qualify the Order_Table column in the WHERE clause.
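Because the join in this example only checks that the customer exists in Order_Table, the same change can also be written with a subquery. The statement below is a sketch of that alternative using the same tables. It should touch the same rows, since the SET value is a constant and does not depend on which Order_Table row matched.

```sql
-- Subquery form of the join UPDATE above:
-- update customer 11111111 in Order_Table6 only if that customer
-- also appears in Order_Table
UPDATE Order_Table6
SET Order_Total = 11000
WHERE Customer_Number = 11111111
  AND Customer_Number IN (SELECT Customer_Number
                          FROM Order_Table) ;
```

The join form becomes the more natural choice when the SET clause needs a value taken from the other table.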


The DELETE Command Basic Syntax

DELETE [ FROM ] <table-name> [ AS <alias-name> ] [ WHERE condition ] ;

The DELETE statement has one function and that is to remove rows from a table. A status is the only returned value from the database; no rows are returned to the user. One of the fastest things that SQL Server does is to remove ALL rows from a table. Be Very CAREFUL with DELETE. It can come back to bite you if you’re not careful.

Two DELETE Examples to DELETE ALL Rows in a Table

DELETE FROM Order_Table5 ;

DELETE Order_Table5 ;

SELECT * FROM Order_Table5;

Order_Number   Customer_Number   Order_Date   Order_Total
____________   _______________   __________   ___________

Both examples will delete all the rows in the table. Since the FROM is optional, the second example still removes all rows from the table and executes exactly the same as the first statement.

To DELETE or to TRUNCATE

TRUNCATE TABLE Order_Table6 ;

The table being truncated can be distributed or replicated, permanent or temporary, and rowstore or columnstore. Both TRUNCATE and DELETE (without a WHERE clause) will remove all rows from the table. TRUNCATE is faster than DELETE because it does no logging of individual rows. Truncating will instantly remove all pages in the table. SQL statements, batch statements and stored procedures can be used to TRUNCATE.

If a TRUNCATE TABLE is cancelled or there is a system failure before completion, the operation is rolled back and all rows are still present as before. When truncating a table, the Azure SQL Data Warehouse keeps and updates all statistics. Indexes on the table are updated as well. After TRUNCATE TABLE completes, all statistics are updated by using the default row count of 1000.

While TRUNCATE TABLE is running, an Exclusive lock is placed on the table, so no other operations on the table are allowed.

TRUNCATE uses fewer system resources, does not require row logging and is faster. So TRUNCATE if you can.

A DELETE Example Deleting only Some of the Rows

DELETE FROM Employee_Table1
WHERE Employee_No = 1121334 ;

The DELETE example above only removes the row that contained Employee_No of 1121334 and leaves all other rows in the table.

Subquery and Join DELETE Command Syntax

The subquery syntax for the DELETE statement follows:

DELETE <table-name> WHERE <column-name> [..., <column-name> ] IN ( SELECT <column-name> [..., <column-name> ] FROM <table-name> [ AS <alias-name> ] [ WHERE condition … ] ) ;

The join syntax for the DELETE statement follows:

DELETE <table-name> [ FROM <table-name> [ AS <alias-name> ] ] WHERE <table-name>.<column-name> = <table-name>.<column-name> [ AND <condition> ] [ OR <condition> ] ;

Sometimes it is desirable to delete rows from one table based on their existence in, or by matching a value stored in, another table. To access these rows from another table for comparison, a subquery or a join operation can be used.

Example of Subquery DELETE Command

The DELETE below removes rows from Order_Table6 for every customer who has at least one order with an Order_Total > 13000 in the Order_Table. This DELETE uses a subquery operation to accomplish the DELETE.

DELETE FROM Order_Table6
WHERE Customer_Number IN
      ( SELECT Customer_Number
        FROM Order_Table
        WHERE Order_Total > 13000 ) ;

The above uses a Subquery and the DELETE command.
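Since a DELETE returns only a status, it can help to preview the affected rows first. One cautious habit, sketched below with the same tables, is to run the DELETE's WHERE clause as a SELECT and check the rows that come back before removing anything.

```sql
-- Preview: same WHERE clause as the DELETE, but only reads the rows
SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table6
WHERE Customer_Number IN
      ( SELECT Customer_Number
        FROM Order_Table
        WHERE Order_Total > 13000 ) ;
```

If the preview shows the rows you expect, swapping the SELECT list for DELETE gives the statement shown above.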


MERGE INTO

Want to synchronize data in two different tables? MERGE INTO can be used to get your tables in synch.

MERGE INTO involves performing an UPDATE, INSERT, or DELETE on a target table based on data in a source table. The target and source tables are joined on common column(s) or key(s). The target table is modified to reflect the data in the source table.

MERGE merges a source row set into a target table based on whether there is a MATCH or whether there is NOT a MATCH. If there is a MATCH and the data changed, an UPDATE can be made. If there is NOT a MATCH, an INSERT or DELETE can be made.


SELECT Employee_No, Dept_No, Last_Name, First_Name, Salary FROM Employee_Table_Original;

SELECT Employee_No, Dept_No, Last_Name, First_Name, Salary FROM Employee_Table_New;

We are going to update Employee_Table_Original based on data in Employee_Table_New using a MERGE INTO.
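A MERGE for these two tables could look like the sketch below. It is written in standard T-SQL MERGE syntax and makes several assumptions that are not stated in the text: Employee_No uniquely identifies an employee in both tables, every non-key column should be overwritten on a match, and rows missing from Employee_Table_New should be deleted. The aliases tgt and src are illustrative only. MERGE support and its restrictions vary by platform and version, so check that your system accepts it before relying on this form.

```sql
MERGE INTO Employee_Table_Original AS tgt
USING Employee_Table_New AS src
   ON tgt.Employee_No = src.Employee_No
-- MATCH: the employee exists in both tables, so refresh the target row
WHEN MATCHED THEN
    UPDATE SET Dept_No    = src.Dept_No
              ,Last_Name  = src.Last_Name
              ,First_Name = src.First_Name
              ,Salary     = src.Salary
-- NOT a MATCH: the employee is only in the source, so add the row
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Employee_No, Dept_No, Last_Name, First_Name, Salary)
    VALUES (src.Employee_No, src.Dept_No, src.Last_Name, src.First_Name, src.Salary)
-- NOT a MATCH: the employee is only in the target, so remove the row
WHEN NOT MATCHED BY SOURCE THEN
    DELETE ;
```

Leave out the last WHEN clause if rows that exist only in the original table should be kept.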


Chapter 20 – Set Operators Functions

"The man who doesn't read good books has no advantage over the man who can't read them." - Mark Twain


Rules of Set Operators

1. Each query will have at least two SELECT statements separated by a SET operator
2. SET operators are UNION, UNION ALL, INTERSECT and EXCEPT
3. Must specify the same number of columns from the same domain (data type/range)
4. If using aggregates, both SELECTs must have their own GROUP BY
5. Both SELECTs must have a FROM clause
6. The first SELECT is used for all ALIAS and FORMAT statements
7. The ORDER BY goes at the end of the last SELECT and references a column number, column name or alias from the first SELECT
8. With multiple operators, INTERSECT is evaluated first; UNION and EXCEPT are then evaluated from left to right
9. Parentheses can change the order of precedence
10. Duplicate rows are eliminated in the spool unless the ALL keyword is used
11. Set operators consider two NULLs as equal for the purpose of comparison
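Rule 11 is easiest to see next to an ordinary join. The sketch below assumes the sample Employee_Table used later in this chapter, in which employee 2000000 has a NULL Dept_No. The INTERSECT treats the two NULLs as equal and returns one row, while the join compares them with = and returns nothing.

```sql
-- Returns one row containing NULL:
-- the set operator considers the two NULLs equal
SELECT Dept_No
FROM Employee_Table
WHERE Employee_No = 2000000
INTERSECT
SELECT Dept_No
FROM Employee_Table
WHERE Employee_No = 2000000 ;

-- Returns no rows:
-- NULL = NULL is not true in a join condition
SELECT a.Dept_No
FROM Employee_Table AS a
INNER JOIN Employee_Table AS b
   ON a.Dept_No = b.Dept_No
WHERE a.Employee_No = 2000000
  AND b.Employee_No = 2000000 ;
```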


INTERSECT Explained Logically

Table_Red   Table_Blue
_________   __________
        1            3
        2            4
        3            5

SELECT * FROM Table_Red
INTERSECT
SELECT * FROM Table_Blue ;

In this example, what numbers in the answer set would come from the query above?

Answer set: 3

In this example, only the number 3 was in both tables, so that is where they INTERSECT.

UNION Explained Logically

Table_Red   Table_Blue
_________   __________
        1            3
        2            4
        3            5

SELECT * FROM Table_Red
UNION
SELECT * FROM Table_Blue ;

In this example, what numbers in the answer set would come from the query above?

Answer set: 1 2 3 4 5

Both top and bottom queries run simultaneously, then the two different spool files are merged to eliminate duplicates and place the remaining numbers in the answer set.

UNION ALL Explained Logically

Table_Red   Table_Blue
_________   __________
        1            3
        2            4
        3            5

SELECT * FROM Table_Red
UNION ALL
SELECT * FROM Table_Blue ;

In this example, what numbers in the answer set would come from the query above?

Answer set: 1 2 3 3 4 5

Both top and bottom queries run simultaneously, then the two different spool files are merged together to build the answer set. The ALL prevents eliminating duplicates.

EXCEPT Explained Logically

Table_Red   Table_Blue
_________   __________
        1            3
        2            4
        3            5

EXCEPT never adds additional rows, but only takes rows away!

SELECT * FROM Table_Red
EXCEPT
SELECT * FROM Table_Blue ;

In this example, what numbers in the answer set would come from the query above?

Answer set: 1 2

The top query SELECTED 1, 2, 3 from Table_Red. From that point on, only 1, 2, 3 at most could come back. The bottom query is run on Table_Blue, and if there are any matches, they are not ADDED to the 1, 2, 3 but instead take away either the 1, 2, or 3.

Another EXCEPT Example: EXCEPT Explained Logically in Reverse Order

Table_Red   Table_Blue
_________   __________
        1            3
        2            4
        3            5

EXCEPT never adds additional rows, but only takes rows away!

SELECT * FROM Table_Blue
EXCEPT
SELECT * FROM Table_Red ;

In this example, what numbers in the answer set would come from the query above?

Answer set: 4 5

The top query SELECTED 3, 4, 5 from Table_Blue. From that point on, only 3, 4, 5 at most could come back. The bottom query is run on Table_Red, and if there are any matches, they are not ADDED to the 3, 4, 5, but instead take away either the 3, 4, or 5.
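To try the four operators yourself, the two small tables can be built with a script along these lines. The column name Num is an assumption made for illustration, since the text never names the column. Rows are inserted one statement at a time to keep the script portable.

```sql
-- Hypothetical single-column versions of the two example tables
CREATE TABLE Table_Red  (Num INTEGER) ;
CREATE TABLE Table_Blue (Num INTEGER) ;

INSERT INTO Table_Red  VALUES (1) ;
INSERT INTO Table_Red  VALUES (2) ;
INSERT INTO Table_Red  VALUES (3) ;

INSERT INTO Table_Blue VALUES (3) ;
INSERT INTO Table_Blue VALUES (4) ;
INSERT INTO Table_Blue VALUES (5) ;

-- Rows in both tables: 3
SELECT Num FROM Table_Red INTERSECT SELECT Num FROM Table_Blue ;

-- Rows in either table, duplicates removed: 1, 2, 3, 4, 5
SELECT Num FROM Table_Red UNION SELECT Num FROM Table_Blue ORDER BY 1 ;

-- Rows in either table, duplicates kept: 1, 2, 3, 3, 4, 5
SELECT Num FROM Table_Red UNION ALL SELECT Num FROM Table_Blue ORDER BY 1 ;

-- Rows in Table_Red that are not in Table_Blue: 1, 2
SELECT Num FROM Table_Red EXCEPT SELECT Num FROM Table_Blue ORDER BY 1 ;

-- Rows in Table_Blue that are not in Table_Red: 4, 5
SELECT Num FROM Table_Blue EXCEPT SELECT Num FROM Table_Red ORDER BY 1 ;
```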


An Equal Number of Columns in both SELECT Lists

SELECT Dept_No
      ,Employee_No
FROM Employee_Table
INTERSECT
SELECT Dept_No
      ,Mgr_No
FROM Department_Table;

Both queries have the same number of columns in the SELECT list.

Dept_No   Employee_No
_______   ___________
    400       1256349

Rule 1: You must have an equal number of columns in both SELECT lists. This is because data is compared from the two spool files, and duplicates are eliminated. So, for comparison purposes, there must be an equal number of columns in both queries.

Columns in the SELECT list should be from the same Domain

SELECT First_Name
FROM Employee_Table
INTERSECT
SELECT Department_Name
FROM Department_Table;

First_Name
__________

Answer set: No rows returned

You can’t compare First_Name with Department_Name! Different Domains!

Rule 2: The above query works without error, but no data is returned. There are no First Names that are the same as Department Names. This is like comparing Apples to Oranges. That means they are NOT in the same Domain.

The Top Query handles all Aliases

SELECT Dept_No as Depty
      ,Employee_No as "The Mgr"
FROM Employee_Table
INTERSECT
SELECT Dept_No
      ,Mgr_No
FROM Department_Table;

Top query is responsible for the column ALIAS, Title and Formatting.

Answer set

Depty   The Mgr
_____   _______
  400   1256349

Rule 3: The Top Query is responsible for ALIASING.

The Bottom Query does the ORDER BY

SELECT Dept_No as Depty
      ,Employee_No as "The Mgr"
FROM Employee_Table
INTERSECT
SELECT Dept_No
      ,Mgr_No
FROM Department_Table
ORDER BY 1 ;

Rule 4: The bottom query is responsible for the sort with an ORDER BY.

SELECT Dept_No as Depty
      ,Employee_No as "The Mgr"
FROM Employee_Table
INTERSECT
SELECT Dept_No
      ,Mgr_No
FROM Department_Table
ORDER BY Depty ;

Rule 5: The bottom query can use the number, column name or alias.

The Bottom Query is responsible for sorting. Above, we have both examples referencing the ORDER BY column as either the number 1 (column 1) or Depty. We could have also used Dept_No or even "The Mgr", but the ORDER BY statement must come from referencing column names, aliases or the number representing the column from the top query.

Great Trick: Place your Set Operator in a Derived Table

SELECT Employee_No AS MANAGER
      ,RTRIM(Last_Name) + ', ' + First_Name as "Name"
FROM Employee_Table
INNER JOIN
     (SELECT Employee_No
      FROM Employee_Table
      INTERSECT
      SELECT Mgr_No
      FROM Department_Table) AS TeraTom (empno)
ON Employee_No = empno
ORDER BY "Name"

MANAGER   Name
_______   __________________
1256349   Harrison, Herbert
1333454   Smith, John
1000234   Smythe, Richard
1121334   Strickling, Cletus

The Derived Table gave us the empno for all managers, and we were able to join it.

UNION Vs UNION ALL

SELECT Department_Name, Dept_No
FROM Department_Table
UNION ALL
SELECT Department_Name, Dept_No
FROM Department_Table
ORDER BY 1;

UNION Answer Set

Department_Name            Dept_No
________________________   _______
Customer Support               400
Human Resources                500
Marketing                      100
Research and Development       200
Sales                          300

UNION ALL Answer Set

Department_Name            Dept_No
________________________   _______
Customer Support               400
Customer Support               400
Human Resources                500
Human Resources                500
Marketing                      100
Marketing                      100
Research and Development       200
Research and Development       200
Sales                          300
Sales                          300

UNION eliminates duplicates, but UNION ALL does not. If you know that a set operator query does not produce any duplicates, you are still better off using UNION ALL. It does not check for duplicates, so it performs faster.

Using UNION ALL and Literals

SELECT Dept_No AS Dept
      ,'Employee  ' as "Title"
      ,First_Name + ' ' + Last_Name as "Name"
FROM Employee_Table
UNION ALL
SELECT Dept_No
      ,'Department'
      ,Department_Name
FROM Department_Table
ORDER BY 1, 2 ;

Dept   Title        Name
____   __________   ____________________
   ?   Employee     Squiggy Jones
  10   Employee     Richard Smythe
 100   Department   Marketing
 100   Employee     Mandee Chambers
 200   Department   Research and Develop
 200   Employee     Billy Coffing
 200   Employee     John Smith
 300   Department   Sales
 300   Employee     Loraine Larkins
 400   Department   Customer Support
 400   Employee     Cletus Strickling
 400   Employee     Herbert Harrison
 400   Employee     William Reilly
 500   Department   Human Resources

Notice the second SELECT column: in the top query it is the literal 'Employee  ' (with two trailing spaces), and in the bottom query it is the literal 'Department'. These literals match up because they are now both exactly 10 characters long. The UNION ALL brings back all Employees and all Departments and shows the employees in each valid department.

A Great Example of how EXCEPT works

Employee_Table

Employee_No   Dept_No   Last_Name    First_Name   Salary
___________   _______   __________   __________   ________
    2000000         ?   Jones        Squiggy      32800.50
    1000234        10   Smythe       Richard      64300.00
    1232578       100   Chambers     Mandee       48850.00
    1324657       200   Coffing      Billy        41888.88
    1333454       200   Smith        John         48000.00
    2312225       300   Larkins      Loraine      40200.00
    1121334       400   Strickling   Cletus       54500.00
    2341218       400   Reilly       William      36000.00
    1256349       400   Harrison     Herbert      54500.00

Department_Table

Dept_No   Department_Name
_______   ________________
    100   Marketing
    200   Research and Dev
    300   Sales
    400   Customer Support
    500   Human Resources

SELECT Dept_No as Department_Number
FROM Department_Table
EXCEPT
SELECT Dept_No
FROM Employee_Table
ORDER BY 1 ;

Department_Number
_________________
              500

This query brought back all Departments without any employees.

USING Multiple SET Operators in a Single Request

SELECT Dept_No, Employee_No empno
FROM Employee_Table
UNION ALL
SELECT Dept_No, Employee_No
FROM Employee_Table
INTERSECT
SELECT Dept_No, Mgr_No
FROM Department_Table
EXCEPT
SELECT Dept_No, Mgr_No
FROM Department_Table
WHERE Department_Name LIKE '%Sales%'
ORDER BY 1, 2;

Dept_No   Empno
_______   _______
      ?   2000000
     10   1000234
    100   1232578
    200   1324657
    200   1333454
    300   2312225
    400   1121334
    400   1256349
    400   2341218

Above, we use multiple SET Operators. They follow the natural Order of Precedence: INTERSECT is evaluated first, and then UNION and EXCEPT are evaluated from left to right.
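In T-SQL, INTERSECT is evaluated before the other set operators, and UNION and EXCEPT are then evaluated from left to right. As a sketch, the query above can be rewritten with parentheses that spell out that default order. It should return the same nine rows.

```sql
-- Parentheses added only to make the default evaluation order visible
(
  SELECT Dept_No, Employee_No AS empno
  FROM Employee_Table
  UNION ALL
  ( SELECT Dept_No, Employee_No      -- step 1: INTERSECT runs first
    FROM Employee_Table
    INTERSECT
    SELECT Dept_No, Mgr_No
    FROM Department_Table )
)                                    -- step 2: UNION ALL adds that result
EXCEPT                               -- step 3: EXCEPT runs last
SELECT Dept_No, Mgr_No
FROM Department_Table
WHERE Department_Name LIKE '%Sales%'
ORDER BY 1, 2 ;
```

Because EXCEPT returns distinct rows, the duplicate row for employee 1256349 created by the UNION ALL does not appear in the final answer set.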


Changing the Order of Precedence with Parentheses

SELECT Dept_No, Employee_No empno
FROM Employee_Table
UNION ALL
(SELECT Dept_No, Employee_No
 FROM Employee_Table
 INTERSECT
 (SELECT Dept_No, Mgr_No
  FROM Department_Table
  EXCEPT
  SELECT Dept_No, Mgr_No
  FROM Department_Table
  WHERE Department_Name LIKE '%Sales%'))
ORDER BY 1, 2;

Dept_No   Empno
_______   _______
      ?   2000000
     10   1000234
    100   1232578
    200   1324657
    200   1333454
    300   2312225
    400   1121334
    400   1256349
    400   1256349
    400   2341218

Above, we use multiple SET Operators and Parentheses to change the order of precedence. Here the EXCEPT runs first, then the INTERSECT and lastly the UNION ALL. Without parentheses, the natural Order of Precedence is INTERSECT first, followed by UNION and EXCEPT from left to right.

Building Grouping Sets Using UNION

SELECT NULL AS "Year", Product_ID, SUM(Daily_Sales) TotalSales
FROM Sales_Table soh
GROUP BY Product_ID
UNION
SELECT Year(Sale_Date) "Year", NULL Product_ID, SUM(Daily_Sales) TotalSales
FROM Sales_Table soh
GROUP BY Year(Sale_Date)

Year   Product_ID   TotalSales
____   __________   __________
   ?         1000    331204.72
   ?         2000    306611.81
   ?         3000    224587.82
2000            ?    862404.35

The example above shows us that we made $862404.35 in the year 2000. It also shows us what we made for Product_ID 1000, 2000 and 3000. If you totaled up the TotalSales for Product_ID 1000, 2000 and 3000, it would equal $862404.35.

Three Grouping Sets Using a UNION

SELECT NULL AS "Yr_Mnth", Product_ID, SUM(Daily_Sales) TotalSales
FROM Sales_Table soh
GROUP BY Product_ID
UNION
SELECT Year(Sale_Date) "Yr_Mnth", NULL Product_ID, SUM(Daily_Sales) TotalSales
FROM Sales_Table soh
GROUP BY Year(Sale_Date)
UNION
SELECT Month(Sale_Date) "Yr_Mnth", NULL Product_ID, SUM(Daily_Sales) TotalSales
FROM Sales_Table soh
GROUP BY Month(Sale_Date)

Yr_Mnth   Product_ID   TotalSales
_______   __________   __________
      ?         1000    331204.72
      ?         2000    306611.81
      ?         3000    224587.82
      9            ?    418769.36
     10            ?    443634.99
   2000            ?    862404.35

The example above shows us that we made $862404.35 in the year 2000. It also shows us what we made for Product_ID 1000, 2000 and 3000. If you totaled up the TotalSales for Product_ID 1000, 2000 and 3000, it would equal $862404.35. It also shows what we did in the months of September and October. If you total those months up, it would also equal $862404.35.


Chapter 21 – Stored Procedure Functions

“Freedom from effort in the present merely means that there has been effort stored up in the past.” - Theodore Roosevelt


Creating a Stored Procedure

CREATE PROCEDURE dbo.ListStudents AS
SELECT Student_ID
      ,Last_Name
      ,First_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table
ORDER BY Class_Code ;

In dbo.ListStudents, dbo is the schema ("Database Owner") and ListStudents is the name of the Stored Procedure.

The CREATE PROCEDURE command will create a stored procedure. The above procedure will return information about students from the Student_Table. The answer set is sorted by Class_Code, so the rows come back in the order FR, JR, SO and then SR. This procedure is created in the dbo schema. The letters dbo stand for “database owner.” The dbo schema is one that is always present in every database, and it is an excellent standard repository for stored procedures.

Executing a Stored Procedure

EXEC dbo.ListStudents ;

Here dbo.ListStudents is the name of the Stored Procedure.

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
    260000   Johnson      Stanley      ?                   ?
    125634   Hanson       Henry        FR               2.88
    234121   Thomas       Wendy        FR               4.00
    423400   Larkins      Michael      FR               0.00
    280023   McRoberts    Richard      JR               1.90
    322133   Bond         Jimmy        JR               3.95
    231222   Wilson       Susie        SO               3.80
    333450   Smith        Andy         SO               2.00
    123250   Phillips     Martin       SR               3.00
    324652   Delaney      Danny        SR               3.35

You SELECT from a table. You SELECT from a view. You EXECUTE a Stored Procedure with an EXEC statement. This stored procedure queries the contents of a single table, returning a result set. A stored procedure works much like a view, but the query plan will actually be cached once it is executed for the first time. After the first time, the execution time will be consistent in consecutive executions. One reason to use Stored Procedures is consistency in executions.

There are Three Ways to Execute a Stored Procedure

1. ListStudents ;
2. EXECUTE ListStudents ;
3. EXEC ListStudents ;

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
    260000   Johnson      Stanley      ?                   ?
    125634   Hanson       Henry        FR               2.88
    234121   Thomas       Wendy        FR               4.00
    423400   Larkins      Michael      FR               0.00
    280023   McRoberts    Richard      JR               1.90
    322133   Bond         Jimmy        JR               3.95
    231222   Wilson       Susie        SO               3.80
    333450   Smith        Andy         SO               2.00
    123250   Phillips     Martin       SR               3.00
    324652   Delaney      Danny        SR               3.35

You SELECT from a table. You SELECT from a view. You EXECUTE a Stored Procedure with an EXEC statement. Above, you can see the three ways to execute a Stored Procedure. The procedure name is ListStudents. You can merely type in ListStudents or EXECUTE ListStudents or EXEC ListStudents. Typing only the procedure name works when it is the first statement in the batch.

Creating a Stored Procedure with a CASE Statement

CREATE PROC dbo.Employees AS
SELECT Employee_No
      ,Dept_No
      ,Department = CASE DEPT_NO
           WHEN 100 THEN 'Marketing'
           WHEN 200 THEN 'Research and Development'
           WHEN 300 THEN 'Sales'
           WHEN 400 THEN 'Customer Support'
           WHEN 500 THEN 'Human Resources'
           ELSE 'Invalid Department'
       END
      ,First_Name + ' ' + Last_Name AS Fullname
      ,Salary
FROM Employee_Table
ORDER BY Department;

Notice the column name Department and the CASE statement that follows it. Check out the answer set on the following page.

Our Answer Set

EXEC dbo.Employees

Employee_No   Dept_No   Department                 Fullname            Salary
___________   _______   ________________________   _________________   ________
    1256349       400   Customer Support           Herbert Harrison    54500.00
    2341218       400   Customer Support           William Reilly      36000.00
    1121334       400   Customer Support           Cletus Strickling   54500.00
    1000234        10   Invalid Department         Richard Smythe      64300.00
    2000000         ?   Invalid Department         Squiggy Jones       32800.50
    1232578       100   Marketing                  Mandee Chambers     48850.00
    1324657       200   Research and Development   Billy Coffing       41888.88
    1333454       200   Research and Development   John Smith          48000.00
    2312225       300   Sales                      Loraine Larkins     40200.00

How about that answer set?

Dropping a Stored Procedure

DROP PROCEDURE SQL_Class.dbo.ListStudents

Dropping a stored procedure is easy. Just use the DROP PROCEDURE command. Above, we have fully qualified the database, schema and tablename in order to not accidentally drop another procedure named ListStudents in another database or schema. We could have also just ensured our default database was SQL_Class and then just typed, "DROP PROCEDURE ListStudents".


Passing an Input Parameter to a Stored Procedure

CREATE PROCEDURE dbo.Employee_Find (@Employee_Num INTEGER)
AS
SELECT *
FROM Employee_Table
WHERE Employee_No = @Employee_Num;

EXEC Employee_Find 2000000;

Employee_No  Dept_No  Last_Name  First_Name  Salary
___________  _______  _________  __________  ________
2000000      ?        Jones      Squiggy     32800.50

Passing parameters is an integral part of why stored procedures are important. Using parameters, you can pass information into the body of a procedure in order to control how the procedure operates, and dramatically reduce complexity. The example above has a single input parameter. It is an Employee_No. When the stored procedure is executed, you merely supply the Employee_No you are trying to find and the result is a single employee. In this case it is Squiggy Jones.


Executing With Positional Parameters vs. Named Parameters

CREATE PROCEDURE dbo.Employee_Find (@Employee_Num INTEGER)
AS
SELECT *
FROM Employee_Table
WHERE Employee_No = @Employee_Num;

This is how you execute using a positional parameter (here @Employee_Num is equal to 2000000):

EXEC SQL_Class.DBO.Employee_Find 2000000 ;

This is how you execute using a named parameter:

EXEC SQL_Class.DBO.Employee_Find @Employee_Num = 2000000 ;

The example above has a single input parameter. It is an Employee_No. In our examples above we have fully qualified the execute statement with the database.schema.storedprocedurename. In our first example we use a positional parameter; thus, the stored procedure assumes that 2000000 is the @Employee_Num value. In the second example, we state this explicitly by using @Employee_Num = 2000000.
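The difference between positional and named parameters becomes clearer with more than one parameter. The following is a minimal sketch under stated assumptions: the procedure name dbo.Employee_Find_Dept and the @Min_Salary parameter are hypothetical, the DECIMAL(10,2) type for Salary is assumed, and GO is the batch separator understood by client tools such as SSMS or sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical procedure: the name and the second parameter are illustrative only.
CREATE PROCEDURE dbo.Employee_Find_Dept
     @Dept_Num   INTEGER
    ,@Min_Salary DECIMAL(10,2)
AS
SELECT Employee_No, Dept_No, Last_Name, First_Name, Salary
FROM   Employee_Table
WHERE  Dept_No = @Dept_Num
AND    Salary >= @Min_Salary;
GO

-- Positional: values are matched to parameters in declaration order.
EXEC dbo.Employee_Find_Dept 400, 50000.00;

-- Named: each value is tied to its parameter, so the order can change.
EXEC dbo.Employee_Find_Dept @Min_Salary = 50000.00, @Dept_Num = 400;
```

Both calls ask the same question. Named parameters tend to be easier to read and are safer if the parameter list is later reordered.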


Passing an Output Parameter to a Stored Procedure

CREATE PROCEDURE dbo.Student_Count
    @Class_Code CHAR(2) OUTPUT
AS
SELECT Class_Code
FROM Student_Table
WHERE Class_Code = 'FR';

DECLARE @Class_Code CHAR(2);
EXEC dbo.Student_Count @Class_Code OUTPUT;
PRINT @Class_Code;

Class_Code
__________
FR
FR
FR

The stored procedure above begins with a defined output parameter called @Class_Code. Notice, it has a data type of Char(2) and also specifies the keyword OUTPUT. The code that invokes the stored procedure creates the variable to pass as the output parameter (DECLARE @Class_Code Char(2)). The EXEC statement will also specify that the parameter is OUTPUT. Note that the body of this procedure never assigns a value to @Class_Code, so the three rows shown come from the SELECT statement's result set, and the variable is still NULL when PRINT runs. To hand a value back through the parameter, the procedure body must assign it.
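To see a value actually travel back through an OUTPUT parameter, the procedure has to assign it. The following is a minimal sketch, not the book's procedure: the name dbo.Student_Count_Out and the @Student_Cnt parameter are hypothetical, and GO is the client-tool batch separator.

```sql
-- Hypothetical procedure: counts the students in one class and
-- hands the count back through an OUTPUT parameter.
CREATE PROCEDURE dbo.Student_Count_Out
     @Class_Code  CHAR(2)
    ,@Student_Cnt INTEGER OUTPUT
AS
SET @Student_Cnt = (SELECT COUNT(*)
                    FROM   Student_Table
                    WHERE  Class_Code = @Class_Code);
GO

-- The caller declares a variable, passes it with OUTPUT, and reads it afterwards.
DECLARE @Cnt INTEGER;
EXEC dbo.Student_Count_Out @Class_Code = 'FR', @Student_Cnt = @Cnt OUTPUT;
PRINT @Cnt;
```

Using SET with a scalar subquery keeps the assignment explicit. The OUTPUT keyword is needed both in the procedure definition and in the EXEC call.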


Changing a Stored Procedure with an ALTER

ALTER PROCEDURE dbo.ListStudents
AS
SELECT Student_ID
      ,Last_Name
      ,First_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table
ORDER BY CASE Class_Code
           WHEN 'Fr' THEN 1
           WHEN 'So' THEN 2
           WHEN 'Jr' THEN 3
           WHEN 'Sr' THEN 4
           ELSE 5
         END, Grade_Pt DESC

The ALTER PROCEDURE command changes an existing stored procedure; the name that follows it (dbo.ListStudents) is the name of the stored procedure. The above procedure will return information about students from the Student_Table. The answer set is sorted by Freshman, Sophomore, Junior and then Senior. Then, the minor sort is Grade_Pt DESC. This procedure lives in the dbo schema. The letters dbo stand for "database owner." The dbo schema is one that is always present in every database, and it is an excellent standard repository for stored procedures.


Answer Set for the Altered Stored Procedure

EXEC dbo.ListStudents

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
234121      Thomas     Wendy       FR          4.00
125634      Hanson     Henry       FR          2.88
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
333450      Smith      Andy        SO          2.00
322133      Bond       Jimmy       JR          3.95
280023      McRoberts  Richard     JR          1.90
324652      Delaney    Danny       SR          3.35
123250      Phillips   Martin      SR          3.00
260000      Johnson    Stanley     ?           ?

We have altered the stored procedure to sort by Fr, So, Jr, Sr and then the null value. The minor sort is Grade_Pt DESC. It is very easy to create, execute and alter a stored procedure.


Using a Stored Procedure to Delete a Row

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
31313131  Acme Products        555-1111
31323134  ACE Consulting       555-1212
57896883  XYZ Plumbing         347-8954
87323456  Databases N-U        322-1012

CREATE PROCEDURE Del_Cust
AS
DECLARE @Cust_No INT;
SET @Cust_No = 31313131;
DELETE FROM SQL_Class.dbo.Customer_Table
WHERE Customer_Number = @Cust_No;

Exec Del_Cust ;

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
31323134  ACE Consulting       555-1212
57896883  XYZ Plumbing         347-8954
87323456  Databases N-U        322-1012

The example above demonstrates how to delete a row from a table using a stored procedure.


A Different Method to Delete a Row

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
31323134  ACE Consulting       555-1212
57896883  XYZ Plumbing         347-8954
87323456  Databases N-U        322-1012

CREATE PROCEDURE Del_Cust2
AS
DECLARE @Cust_No INT = 87323456;
DELETE FROM SQL_Class.dbo.Customer_Table
WHERE Customer_Number = @Cust_No;

Exec Del_Cust2 ;

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
31323134  ACE Consulting       555-1212
57896883  XYZ Plumbing         347-8954

The example above demonstrates another way to declare a variable (declaring and initializing it in one statement) and to delete a row from a table using a stored procedure.


Deleting a Row Using an Input Parameter

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
31323134  ACE Consulting       555-1212
57896883  XYZ Plumbing         347-8954

CREATE PROCEDURE Del_Cust_Parm @Cust_No INT
AS
DELETE FROM SQL_Class.dbo.Customer_Table
WHERE Customer_Number = @Cust_No;

Del_Cust_Parm 31323134;

SELECT Customer_Number as Cust
      ,Customer_Name   as Name
      ,Phone_Number    as Phone
FROM Customer_Table;

Cust      Name                 Phone
________  ___________________  ________
11111111  Billy's Best Choice  555-1234
57896883  XYZ Plumbing         347-8954

The example above demonstrates how to delete a row from a table using a stored procedure via an input parameter. This is the preferred method because you can use this stored procedure over and over again. You just supply a different Customer_Number each time you execute it.


Using Loops in Stored Procedures

1. Create the table. Use your initials for the XYZ piece of the table name.

CREATE TABLE My_Tbl_XYZ
( Cntr    Integer
 ,TheDate Date
);

2. Create the procedure.

CREATE PROCEDURE Inserter_Five
AS
DECLARE @Cntr INTEGER = 0;
WHILE @Cntr < 5
BEGIN;
    SET @Cntr = @Cntr + 1;
    INSERT INTO My_Tbl_XYZ VALUES (@Cntr, '2015-06-30') ;
END;

3. Execute the procedure.

Inserter_Five ;

There are now five rows in the table.

We created a table called My_Tbl_XYZ. Then, we inserted five rows inside the table using a WHILE loop. The WHILE loop ran five times, inserting one row on each pass.
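A loop body can change more than the counter. The following is a minimal sketch of a variant in which every row receives the next calendar day. The procedure name Inserter_Dates_XYZ is hypothetical, GO is the client-tool batch separator, and the date is advanced in a variable before the INSERT so that the VALUES clause only ever sees variables, as in the example above.

```sql
-- Hypothetical variant of Inserter_Five: each row gets the next calendar day.
CREATE PROCEDURE Inserter_Dates_XYZ
AS
DECLARE @Cntr    INTEGER = 0;
DECLARE @TheDate DATE    = '2015-06-30';
WHILE @Cntr < 5
BEGIN
    SET @Cntr = @Cntr + 1;
    INSERT INTO My_Tbl_XYZ VALUES (@Cntr, @TheDate);
    -- Advance the date only after the current row is written.
    SET @TheDate = DATEADD(day, 1, @TheDate);
END;
GO

EXEC Inserter_Dates_XYZ;

-- Confirm what the loop inserted.
SELECT Cntr, TheDate
FROM   My_Tbl_XYZ
ORDER BY Cntr, TheDate;
```

Row-by-row loops are fine for small demonstrations like this one; for large volumes a set-based INSERT ... SELECT is usually the better fit on an MPP system.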


Stored Procedure Workshop

Create the table below, but substitute the XYZ with your initials:

CREATE TABLE Table_XYZ
( Col1 INTEGER
 ,Col2 INTEGER
);

Now, create a stored procedure called Final_XYZ (again substituting the XYZ with your initials) that places 1,000 rows inside the table:

Col1 should have 1,000 unique values
Col2 should have 250 different values

Your mission is to create the table above and then create a stored procedure that will insert 1,000 rows. The tricky part is that Col1 should have 1,000 unique values, but Col2 should have only 250 different values.


Looping with a WHILE Statement

CREATE TABLE SQL_Class.dbo.Table_XYZ
( Col1 INTEGER
 ,Col2 INTEGER) ;

CREATE PROCEDURE Final_XYZ
AS
DECLARE @Cntr  INT = 0;
DECLARE @Cntr2 INT = 0;
WHILE @Cntr < 1000
BEGIN;
    SET @Cntr  = @Cntr + 1;
    SET @Cntr2 = @Cntr2 + 1;
    IF @Cntr2 = 251
    BEGIN;
        SET @Cntr2 = 1;
    END;
    INSERT INTO SQL_Class.dbo.Table_XYZ VALUES (@Cntr, @Cntr2);
END;

Exec Final_XYZ;

The assignment is to create a table called Table_XYZ. It has two columns (Col1 and Col2). Their data types are integer. The next part of the assignment is to insert 1,000 rows inside the table. The first column (Col1) will have 1,000 unique values. The second column (Col2) will have only 250 different values, because @Cntr2 is reset to 1 every time it reaches 251. Above is how it is done.
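The wrap-around counter can also be written with the modulo operator, and the result can be checked with a single query. The following is a minimal sketch: the procedure name Final_Mod_XYZ is hypothetical, and GO is the client-tool batch separator. It is an alternative to the solution above, not a replacement for it.

```sql
-- Hypothetical variant of Final_XYZ that derives Col2 with the modulo operator.
CREATE PROCEDURE Final_Mod_XYZ
AS
DECLARE @Cntr  INT = 0;
DECLARE @Cntr2 INT;
WHILE @Cntr < 1000
BEGIN
    SET @Cntr  = @Cntr + 1;
    -- (@Cntr - 1) % 250 cycles through 0..249, so adding 1 yields 1..250.
    SET @Cntr2 = ((@Cntr - 1) % 250) + 1;
    INSERT INTO SQL_Class.dbo.Table_XYZ VALUES (@Cntr, @Cntr2);
END;
GO

-- Check the workshop requirements after running either procedure once
-- against an empty table: the three counts should be 1000, 1000 and 250.
SELECT COUNT(*)             AS Row_Cnt
      ,COUNT(DISTINCT Col1) AS Col1_Distinct
      ,COUNT(DISTINCT Col2) AS Col2_Distinct
FROM   SQL_Class.dbo.Table_XYZ;
```

The modulo form removes the IF block, at the cost of being slightly less obvious to a reader who is new to the % operator.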


Chapter 22 – Statistical Aggregate Functions

"You can make more friends in two months by becoming interested in other people than you will in two years by trying to get other people interested in you." - Dale Carnegie


The Stats Table

Col1  Col2  Col3  Col4  Col5  Col6
____  ____  ____  ____  ____  ____
   1     1     1    30     1     0
   2     1     1    29     2     5
   3     3    10    28     3    10
   4     3    10    27     4    15
   5     3    10    26     5    20
   6     4    10    25     6    30
   7     5    10    24     7    30
   8     5    10    23     8    30
   9     5    10    22     9    35
  10     5    20    21    10    35
  11     7    20    20    22    40
  12     7    20    19    12    40
  13     9    20    18    13    45
  14     9    20    17    14    45
  15     9    20    16    15    50
  16     9    20    15    14    55
  17    10    20    14    13    55
  18    10    20    13    12    60
  19    10    20    12    11    60
  20    10    20    11     9    65
  21    10    20    10     8    65
  22    10    20     9     7    65
  23    13    20     8     6    70
  24    13    30     7     5    70
  25    13    30     6     4    80
  26    14    40     5     3    85
  27    15    40     4     2    90
  28    15    50     3     1    90
  29    16    50     2     1    95
  30    16    60     1     1   100

Above is the Stats_Table data that we will use in our statistical examples.


The VAR and VARP Functions

SELECT VAR(col1)  AS Variance_Example
      ,VARP(col1) AS Var_EntirePopulation
FROM Stats_Table ;

Variance_Example  Var_EntirePopulation
________________  ____________________
77.5              74.92

The VAR and VARP functions return the statistical variance of all the values in the specified expression. VAR returns the sample variance: it treats the rows as a sample of a larger population and divides the sum of squared deviations by n - 1. VARP returns the population variance: it treats the rows as the entire population and divides by n. The expression parameter must be one of the exact or approximate numeric data types, except for the bit data type.
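The two results can be cross-checked by computing the variance by hand. The following is a minimal sketch that uses the shortcut formula (sum of squares minus the squared sum divided by n); the column aliases and the CAST to FLOAT, which avoids integer division, are choices made for this illustration.

```sql
-- Recompute the sample and population variance of Col1 by hand.
-- X is cast to FLOAT so that none of the divisions truncate.
SELECT (SUM(X * X) - SUM(X) * SUM(X) / COUNT(*)) / (COUNT(*) - 1) AS Var_By_Hand
      ,(SUM(X * X) - SUM(X) * SUM(X) / COUNT(*)) / COUNT(*)       AS Varp_By_Hand
FROM  (SELECT CAST(Col1 AS FLOAT) AS X
       FROM   Stats_Table) AS S;
```

For Col1, which holds the values 1 through 30, the sum is 465 and the sum of squares is 9455, so the numerator is 9455 - 7207.5 = 2247.5. Dividing by 29 gives 77.5 and dividing by 30 gives about 74.92, matching VAR and VARP above.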


A VAR Example

SELECT VAR(col1) AS Col1
      ,VAR(col2) AS Col2
      ,VAR(col3) AS Col3
      ,VAR(col4) AS Col4
      ,VAR(col5) AS Col5
      ,VAR(col6) AS Col6
FROM Stats_Table;

Col1   Col2   Col3    Col4   Col5   Col6
_____  _____  ______  _____  _____  ______
77.5   19.95  197.65  77.5   20.25  747.73

The VAR function returns the sample variance of all the values in the specified expression, computed here for each of the six columns of the Stats_Table. It treats the rows as a sample of a larger population.


A VARP Example

SELECT VARP(col1) AS Col1
      ,VARP(col2) AS Col2
      ,VARP(col3) AS Col3
      ,VARP(col4) AS Col4
      ,VARP(col5) AS Col5
      ,VARP(col6) AS Col6
FROM Stats_Table;

Col1   Col2   Col3    Col4   Col5   Col6
_____  _____  ______  _____  _____  ______
74.92  19.29  191.06  74.92  19.58  722.81

The VARP function returns the population variance of all the values in the specified expression, computed here for each of the six columns of the Stats_Table. It treats the rows as the entire data population.


The STDEV and STDEVP Functions

SELECT STDEV(Col1)  AS StandDev
      ,STDEVP(Col1) AS StandDevPop
FROM Stats_Table

StandDev  StandDevPop
________  ___________
8.8       8.66

The STDEV function returns the sample standard deviation of all the values in the specified expression: it treats the rows as a sample of a larger population and divides by n - 1 before taking the square root. The STDEVP function returns the population standard deviation: it treats the rows as the entire data population and divides by n. The expression parameter must be one of the exact or approximate numeric data types, except for the bit data type.
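Standard deviation is the square root of variance, so each function can be checked against its variance counterpart. The following is a minimal sketch; the Sqrt_Of_Var and Sqrt_Of_VarP aliases are chosen for this illustration.

```sql
-- Each standard deviation should equal the square root of the matching variance.
SELECT STDEV(Col1)      AS StandDev
      ,SQRT(VAR(Col1))  AS Sqrt_Of_Var
      ,STDEVP(Col1)     AS StandDevPop
      ,SQRT(VARP(Col1)) AS Sqrt_Of_VarP
FROM   Stats_Table;
```

With the Col1 variances shown earlier in this chapter (77.5 and 74.92), the square roots come out at roughly 8.80 and 8.66, in line with the STDEV and STDEVP results above.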


A STDEV Example

SELECT STDEV(col1) AS Col1
      ,STDEV(col2) AS Col2
      ,STDEV(col3) AS Col3
      ,STDEV(col4) AS Col4
      ,STDEV(col5) AS Col5
      ,STDEV(col6) AS Col6
FROM Stats_Table;

Col1  Col2  Col3   Col4  Col5  Col6
____  ____  _____  ____  ____  _____
8.8   4.47  14.06  8.8   4.5   27.34

The STDEV function returns the sample standard deviation, computed here for each of the six columns of the Stats_Table. It treats the rows as a sample of a larger population.


A STDEVP Example

SELECT STDEVP(col1) AS Col1
      ,STDEVP(col2) AS Col2
      ,STDEVP(col3) AS Col3
      ,STDEVP(col4) AS Col4
      ,STDEVP(col5) AS Col5
      ,STDEVP(col6) AS Col6
FROM Stats_Table;

Col1  Col2  Col3   Col4  Col5  Col6
____  ____  _____  ____  ____  _____
8.66  4.39  13.82  8.66  4.42  26.89

The STDEVP function returns the standard deviation based on the entire population, computed here for each of the six columns of the Stats_Table.


Chapter 23 – Systems Views

“In the end we’ll remember not the words of our enemies, but the silence of our friends.” - Martin Luther King, Jr.


System Views

System views describe three things:
1. Metadata
2. System catalog
3. Dynamic processes for the Azure SQL Data Warehouse

There are three types of views within system views:
1. Catalog views
2. Dynamic management views (DMVs)
3. Information schema views

Catalog Views show information about metadata, such as table and column names. The name of each Azure SQL Data Warehouse catalog view begins with sys.pdw_ (note the trailing underscore).

Dynamic Management Views (DMVs) show information about dynamic processes, such as the queries that are currently in progress and memory usage for each node. The name of each Azure SQL Data Warehouse DMV begins with sys.dm_pdw_ (note the trailing underscore).

Information Schema Views show metadata for the data objects in a particular database. These views have a special schema named INFORMATION_SCHEMA. This schema is contained in each database.

The basics of the Azure SQL Data Warehouse system views are listed above.
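One query against each of the three types makes the naming conventions concrete. The following is a minimal sketch: the views and columns named here are taken from Microsoft's documentation for the platform and are assumed to be available on your system and version, and Employee_Table is the sample table used earlier in this book.

```sql
-- Catalog view (sys.pdw_ prefix): how each table is distributed.
SELECT object_id, distribution_policy_desc
FROM   sys.pdw_table_distribution_properties;

-- Dynamic management view (sys.dm_pdw_ prefix): the most recent requests.
SELECT request_id, status, submit_time, command
FROM   sys.dm_pdw_exec_requests
ORDER BY submit_time DESC;

-- Information schema view: the columns of one table in the current database.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_NAME = 'Employee_Table';
```

The rest of this chapter covers the sys. views that are shared with SQL Server, which carry neither of the pdw prefixes.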


sys.all_columns

SELECT name
      ,max_length
      ,precision
      ,scale
FROM sys.all_columns

Name           max_length  precision  scale
_____________  __________  _________  _____
Subscriber_No  4           10         0
Street         30          0          0
City           20          0          0
State          2           0          0
Zip            4           10         0
AreaCode       2           5          0
Phone          4           10         0

The sys.all_columns view shows columns for both user-defined and system objects.


sys.all_objects

SELECT Name
      ,Type_Desc
      ,Create_Date
      ,Modify_Date
FROM sys.all_objects
WHERE name = 'Emp_Intl'

Name      Type_Desc   Create_Date              Modify_Date
________  __________  _______________________  _______________________
Emp_Intl  USER_TABLE  04/30/2015 9:23:16.317   04/30/2015 9:23:16.317

The sys.all_objects view shows user-defined and system objects, including tables and views.


sys.all_sql_modules

The sys.all_sql_modules view shows SQL Server modules (such as stored procedures and views whose definitions are written in SQL), including user-defined and system objects.
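A typical use of this view is to read back the source text of a module. The following is a minimal sketch that joins it to sys.all_objects on object_id; it assumes the ListStudents procedure from the stored procedure chapter still exists, and the column aliases are chosen for this illustration.

```sql
-- Show the stored source text of one module by name.
SELECT o.name       AS Object_Name
      ,o.type_desc  AS Object_Type
      ,m.definition AS Source_Text
FROM   sys.all_sql_modules AS m
INNER JOIN sys.all_objects AS o
       ON  m.object_id = o.object_id
WHERE  o.name = 'ListStudents';
```

The definition column holds the text of the statement that created the module, which is useful when the original script is no longer at hand.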


sys.all_views

SELECT Name
      ,Type_Desc
      ,Create_Date
      ,Modify_Date
FROM sys.all_views
WHERE name = 'Employee_V'

Name        Type_Desc  Create_Date              Modify_Date
__________  _________  _______________________  _______________________
Employee_V  VIEW       04/30/2015 10:37:20.077  04/30/2015 10:37:20.077

The sys.all_views view shows both user-defined and system views.


sys.columns

SELECT Object_ID
      ,Name
      ,Max_Length
      ,Precision
      ,Scale
FROM sys.columns
WHERE name = 'First_Name'

Object_ID   Name        Max_Length  Precision  Scale
__________  __________  __________  _________  _____
690101499   First_Name  12          0          0
754101727   First_name  12          0          0
850102069   First_Name  12          0          0
882102183   First_Name  20          0          0
978102525   First_Name  12          0          0
994102582   First_Name  12          0          0
1010102639  First_Name  12          0          0

The sys.columns system view shows columns for user-defined tables and user-defined views. What a great way to check to see if columns with the same name exist in other tables and if they are consistently defined. In the answer set above, one First_Name column has a Max_Length of 20 while the rest are 12.
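An Object_ID on its own does not say which table owns the column. The following is a minimal sketch that joins sys.columns to sys.objects so the owning table or view is named; the column aliases are chosen for this illustration.

```sql
-- List every First_Name column together with the object that owns it,
-- widest definition first, to spot inconsistent lengths quickly.
SELECT o.name AS Table_Or_View
      ,c.name AS Column_Name
      ,c.max_length
FROM   sys.columns AS c
INNER JOIN sys.objects AS o
       ON  c.object_id = o.object_id
WHERE  c.name = 'First_Name'
ORDER BY c.max_length DESC, o.name;
```

Sorting by max_length brings the odd definition to the top of the answer set.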


sys.data_spaces

The sys.data_spaces system view contains a row for each data space. This can be a filegroup or partition scheme.


sys.database_files

The sys.database_files system view returns one row per file of the current Azure SQL Data Warehouse database as stored in the database itself. This is a per-database view.


sys.database_principals

SELECT Name
      ,Type
      ,Type_Desc
      ,Default_Schema_Name as "Schema"
      ,Create_Date
FROM sys.database_principals
WHERE name in ('public', 'dbo')

Name    Type  Type_Desc      Schema  Create_Date
______  ____  _____________  ______  ______________________
public  R     DATABASE_ROLE  ?       04/08/2003 9:10:42.317
dbo     S     SQL_USER       dbo     04/08/2003 9:10:42.287

The sys.database_principals system view returns a row for each security principal in a database.


sys.database_role_members

The sys.database_role_members system view returns one row for each member of each database role for the Azure SQL Data Warehouse.
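Because this view stores only principal IDs, it is usually joined back to sys.database_principals twice, once for the role and once for the member. The following is a minimal sketch; the table and column aliases are chosen for this illustration.

```sql
-- Translate role and member IDs into names.
SELECT r.name AS Role_Name
      ,m.name AS Member_Name
FROM   sys.database_role_members AS rm
INNER JOIN sys.database_principals AS r
       ON  rm.role_principal_id   = r.principal_id
INNER JOIN sys.database_principals AS m
       ON  rm.member_principal_id = m.principal_id
ORDER BY r.name, m.name;
```

Each row of the answer set reads as "this member belongs to this role".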


sys.databases

The sys.databases system view is aligned with the corresponding view exposed by SQL Server. The Azure SQL Data Warehouse exposes logical databases rather than the actual physical databases on the various Compute node instances. This view will show you the databases in the system. However, because some features are not supported on the Azure SQL Data Warehouse, some columns have fixed return values.


sys.filegroups

The sys.filegroups system view contains a row for each data space that is a filegroup.


sys.identity_columns

The sys.identity_columns system view shows identity columns.


sys.objects

SELECT name, type_desc, create_date
FROM sys.objects
WHERE name like '%table%'

name                  type_desc   create_date
____________________  __________  ______________________
Customer_table        USER_TABLE  03/17/2015 8:29:41.820
Order_table           USER_TABLE  03/17/2015 8:29:43.283
Student_table         USER_TABLE  03/17/2015 8:29:43.760
Course_table          USER_TABLE  03/17/2015 8:29:44.257
Student_Course_table  USER_TABLE  03/17/2015 8:29:44.713
Sales_table           USER_TABLE  03/17/2015 8:29:45.180
Employee_table        USER_TABLE  03/17/2015 8:29:45.563
Department_table      USER_TABLE  03/17/2015 8:29:45.983
Stats_table           USER_TABLE  03/17/2015 8:29:46.387
Job_table             USER_TABLE  03/17/2015 8:29:46.940
Emp_Job_table         USER_TABLE  03/17/2015 8:29:47.450
Names_table           USER_TABLE  03/17/2015 8:29:47.943
Hierarchy_table       USER_TABLE  03/17/2015 8:29:48.350

The sys.objects system view returns a row for each user-defined object that is created within a database.


sys.partition_range_values

The sys.partition_range_values view contains a row for each range boundary value of a partition function of type R.


sys.schemas

The sys.schemas system view contains a row for each database schema.


sys.server_role_members

The sys.server_role_members system view returns one row for each member of each fixed server role in the Azure SQL Data Warehouse.


sys.sql_logins

The sys.sql_logins system view returns one row for every SQL Server authentication login.


Chapter 24 – Nexus

“I might be just a little bit biased, but because of our long term vision and incredible determination, the Nexus has become what some consider the greatest BI tool on planet Earth!” - Tera-Tom Coffing


Nexus is Now Available on the Microsoft Azure Cloud

Why the Nexus Chameleon should be your query tool of choice:
1) Queries every major system
2) Provides visualization and automatically writes the SQL
3) Can perform cross-system joins with a few clicks of the mouse
4) Converts table structures and moves the table and data between systems
5) Compares and synchronizes databases
6) Can move an entire database of tables or views between systems
7) Has the "Garden of Analysis" to re-query answer sets inside your PC
8) Provides a dashboard of graphs and charts for answer sets

Download the Nexus for a free trial at www.CoffingDW.com and use Nexus in-house or on the Microsoft Azure cloud.


Nexus Queries Every Major System

Nexus is connected to each of the systems above

Priority number one for us was to build the best BI tool and then get it working on every major platform.


Setup of Nexus is as easy as pie

To add a system just right click on the Systems Tree and choose Add data source connection

Two of the reasons Nexus is so popular on cloud platforms are that it queries every major platform and that it is so easy to set up. Just right click on the systems tree and choose "Add data source connection". You can then add all of your systems one by one and before you know it you are ready to query them all.


Setup of Nexus is as Easy as 1, 2, 3

1 Choose your system type from the drop down menu

2 Hit the Add New Button

3 Pick your driver. The Nexus Chameleon drivers are already installed for you.

Once you have right clicked on the Systems Tree and selected "Add new data source", you will come to the Data Source Connection page (see above). First, choose your Source Type from the drop down menu. Hit the Add New button and choose your driver from the System DSN tab (The Nexus Chameleon drivers are outstanding). Then, hit the CONFIGURE button and put in your IP address, login and password. You are ready to begin querying.


Nexus Data Visualization

“It never made sense to me that the data scientist and the business user couldn't work together on the same playing field. We developed a way for them to work together, by building the Super Join Builder.” - Tera-Tom Coffing


Nexus Data Visualization

1

Right Click on any table and choose Super Join Builder

You can write the SQL yourself and Nexus will bring back an answer set, but why not let Nexus write the SQL for you? The Nexus has the best data visualization and it took years of work and millions of lines of code. Just right click on a table in any of your systems trees (above we chose the Addresses table) and then choose SUPER JOIN BUILDER. The table will appear visually and in color. It will show the table name, the columns and their data types. Just check the columns you want on your report and Nexus will build the SQL for you automatically!


Nexus Data Visualization Shows What Tables Can Be Joined

Left Click on the Add Join drop down

The menu shows what tables can be joined together.

1

2

Once you see your first table in the Super Join Builder the fun is just beginning. Left Click on the top right of the visible table and select the drop down menu where it says "Add Join". Nexus will show you what tables can be joined. The Addresses table above can be joined to the Subscribers table. The Subscribers table can be joined to the Claims table. The Claims table can be joined to the Providers and Services tables. Be prepared to be amazed at the next page!


Nexus is doing a Five-Table Join

The "Add Join" button showed us the tables that could be joined and we chose them all. Notice that we can now see each table visually (and in color) and their respective columns and data types. Also, notice that we have checked the columns we want on our report. The Nexus has already built the SQL instantly and automatically for you and it does so perfectly. This technology puts the business user on the same level with the data scientist. The next page will show the SQL generated!


Nexus Generates the SQL Automatically

1 By clicking on the SQL tab you can see the SQL that Nexus has generated automatically

This is the SQL that was built automatically from the previous page. Since we are querying an Azure SQL Data Warehouse system, the Nexus built T-SQL to satisfy the query. It does not matter whether the system you are querying is Hadoop, Oracle, SQL Server, Teradata or any other system, the Nexus will build the SQL perfectly for that system. All you have to do now is hit the EXECUTE button and you will receive your answer set.


Nexus Delivers the Report 1

By hitting the EXECUTE button the report was delivered in the Answer Set tab

When you hit the EXECUTE button Nexus executes the query and delivers the report.


Cross-System Joins from Teradata, Oracle and SQL Server

The three top tables are from Teradata and the bottom tables are from Oracle and SQL Server.

The three tables at the top are from Teradata, but the tables at the bottom are from Oracle and SQL Server. When you hit EXECUTE, the Nexus will deliver the report. Nexus not only builds the SQL needed, but the table conversions and data movement to make it happen. Nexus does all of the difficult things for you. You just point and click on the columns you want from the tables and Nexus does the rest. Is that amazing or what?


The Tabs of the Super Join Builder

“The 9 tabs of the Super Join Builder are each dedicated to a single query, but each provides a different function. This makes the automatic writing of the SQL so easy, intuitive, quick and yet powerful.” - Tera-Tom Coffing


The 9 Tabs of the Super Join Builder – Objects Tab 1

The Objects tab is the first screen you see whenever you right click on any table and choose Super Join Builder. The Objects tab (in red) shows the table, columns and their data types. The Objects tab also allows you to left click on the right corner of any table on the ADD JOIN dropdown to see what other tables are joinable. If you click on a joinable table in the ADD JOIN menu then that table will appear in the Objects tab as well. If you check mark any of the columns from any tables in the Objects tab the SQL will be built and include those selected columns in the report. Since we have not selected any columns yet the SQL has not been built. Once we begin to checkmark columns the SQL will be built.

Above, we first entered the Super Join Builder by right clicking on the Addresses table in our systems tree and we chose Super Join Builder. We then left clicked on the right corner of the Addresses table on the ADD JOIN drop down and we selected Subscribers. Both tables then appeared in the Objects tab. We then left clicked on the Subscribers table on the ADD JOIN drop down and we can see that Claims joins to Subscribers.


Selecting Columns in the Objects Tab

The Objects tab is the first screen you see whenever you right click on any table and choose Super Join Builder. The Objects tab (in red) shows the table, columns and their data types, and allows you to left click on the right corner of any table on the ADD JOIN dropdown to see what other tables are joinable. We have chosen a two table join between the Addresses table and the Subscribers table. Notice that we clicked on the checkbox on the columns Street, City and State of the Addresses table, and also notice that we clicked on the SELECT * of the Subscribers table which auto-clicked all columns. Our answer set (as of now) will come back with 9 columns (Street, City, State, Last_Name, First_Name, Gender, SSN, Member_No and Subscriber_No). The SQL has automatically been generated with each check of a column.


The 9 Tabs of the Super Join Builder – Columns Tab 2

These are the columns that will be on the report. This is because you checked them in the Objects Tab. You can rearrange these columns like a Rubik's cube (click and drag) and the SQL will reflect any and all changes.

These are the columns that you did not check in the Object Tab. You can drag them up if you decide you want any of them on the report.

The columns tab displays the columns that you selected in the Object tab that will be on the report (at the top). Notice the colors correspond to their respective tables. It is here that you can change the order of the columns by dragging them to the order that you prefer. Notice at the bottom are the columns that you did not select in the Objects tab. These will not be on the report. You can however drag these up to the top and then they will be on the report. The columns at the top are on the report and the columns at the bottom are not, but you can rearrange these columns until the report is exactly what you want.


Removing Columns from the Report in the Columns Tab

Drag any column to the trashcan (blue arrow) and it is no longer on the report.

You can remove a series of columns also. Above, we did a CTRL click on Gender and a SHIFT Click on Member_No and all columns between highlight. Keep the SHIFT key down and move them all to the trashcan.

If you want to delete a column from the report, just drag it to the trashcan and it will appear at the bottom in the list in the area of non-selected columns. You can also remove a series of columns. Above, we did a CTRL click on the Gender column and then did a SHIFT click on the Member_No column. If we keep the SHIFT key down, and drag them all together to the trash can, then all three of these columns are removed from the report.


The 9 Tabs of the Super Join Builder – Sorting Tab 3

The Sorting tab allows you to sort the answer set by simply double clicking on a column or by dragging it up. The columns are listed near the bottom (in color) and the columns at the very bottom were not selected to be on the report, but you can still sort by them. In our example above, we chose the column State to be the major sort key. State was selected previously to be part of the report. The column Zip is the minor sort key, but as you can see it was not previously selected to be on the report. Now, the Nexus will automatically place an ORDER BY clause at the end of the query. That clause will be ORDER BY State, Zip DESC.
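
Here is a small sketch of what sorting by a column that is not on the report looks like in SQL. The table alias and database name follow the generated SQL shown elsewhere in this chapter; the shortened column list is only for illustration. Most SQL dialects allow an ORDER BY column that is missing from the SELECT list, as long as the query does not use DISTINCT.

```sql
-- Sketch only: State is on the report, Zip is not, yet both drive the sort.
SELECT "Add".Street, "Add".City, "Add"."State"
FROM SQL_CLASS.ADDRESSES "Add"
ORDER BY "Add"."State" ASC,   -- major sort key (on the report)
         "Add".Zip DESC;      -- minor sort key (not in the SELECT list)
```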

The 9 Tabs of the Super Join Builder – Joins Tab 4

The drop down box allows you to change the join from an INNER JOIN to a:

LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN

The Nexus defaults all joins to INNER JOIN, but the Joins tab will allow you to change any of the joins from an INNER JOIN to any OUTER JOIN. Just hit the drop down box (red arrow) and your outer join options await you.
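
As an illustration of what changing the drop down means, the sketch below rewrites the chapter's two-table join as a LEFT OUTER JOIN. This is an assumed example, not captured Nexus output; the aliases and the Subscriber_No join key come from the generated SQL shown in this chapter.

```sql
-- Sketch only: every Addresses row is returned, even when no Subscribers row matches.
-- For unmatched rows the SUB columns come back as NULL.
SELECT "Add".Street, "Add".City, "Add"."State",
       SUB.Last_Name, SUB.First_Name
FROM SQL_CLASS.ADDRESSES "Add"
LEFT OUTER JOIN SQL_CLASS.SUBSCRIBERS SUB
   ON "Add".Subscriber_No = SUB.Subscriber_No;
```

A RIGHT OUTER JOIN keeps every Subscribers row instead, and a FULL OUTER JOIN keeps the unmatched rows from both tables.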

The 9 Tabs of the Super Join Builder – Where Tab 5

Columns on the report
Indexed columns
Columns not on the report

The WHERE tab is designed to do two things. First, it shows you the indexes of the tables even if you are using views. This allows users to click on indexed columns and utilize an additional WHERE clause. Secondly, it shows all of the columns already on the report and those that are not on the report. Either way, you can double click on any column and write the WHERE or AND clause. I will demonstrate that on the next page.

Using the WHERE Tab For Additional WHERE or AND

By double clicking on the Subscriber_No, that column is placed down below. I entered the Subscriber_No = 123456778.

SELECT "Add".Street, "Add".City, "Add"."State",
       SUB.Last_Name, SUB.First_Name, SUB.Subscriber_No
FROM SQL_CLASS.ADDRESSES "Add"
INNER JOIN SQL_CLASS.SUBSCRIBERS SUB
ON "Add".Subscriber_No = SUB.Subscriber_No
WHERE "Add".Subscriber_No = 123456778
ORDER BY "Add"."State" ASC, "Add".Zip DESC;

That will automatically be placed in the SQL. Since that column is an index the system will retrieve the answer set faster.

If you went to the SQL tab you would see this SQL. Notice it reflects everything we have done over the past several examples.

The example above shows us double clicking on the Subscriber_No column. Notice (follow the arrow) that an additional WHERE clause was added. The clause Subscriber_No = was waiting for us to supply a value, and we typed 123456778.

The 9 Tabs of the Super Join Builder – SQL Tab 6

The SQL tab shows the SQL that Nexus has automatically generated. Every click from every tab can cause a change to the SQL. We first went to the Objects tab, where we chose the Addresses table and the Subscribers table. We chose our columns in the Objects tab, but we then went to the Columns tab and deleted some of the columns. We then went to the Sorting tab and chose our ORDER BY keys of State and Zip DESC. We then went to the WHERE tab and added an additional WHERE clause, choosing the column Subscriber_No and typing in = 123456778. The SQL reflects everything we did.

The 9 Tabs of the Super Join Builder – Answer Set Tab 7

By hitting the EXECUTE button, the report was delivered in the Answer Set tab.

When you hit EXECUTE, the generated SQL is run on your system and you receive an answer set. The example above is different from our previous examples; it reflects just a two-table join.

The 9 Tabs of the Super Join Builder – Analytics Tab 9

Select all of the columns and then click on the Analytics tab

The Analytics tab is used for Rank, OLAP, and for Group by Grouping Sets, Group by Rollup and Group by Cube queries. It is usually used with a single table. Above, we have right clicked on the Sales_Table and chosen Super Join Builder. We will next click on the Analytics tab to show you how to generate analytics quickly. Turn the page and let's get started.

Analytics Tab

Your three Analytics options are OLAP, Rank, Grouping Sets

We are in the OLAP tab.

Analytics Tab – OLAP Example

This report will generate an OLAP (Online Analytical Processing) report such as a Cumulative Sum, Moving Sum, etc. We dragged the Daily_Sales column from the bottom to the OLAP column (top left). We dragged the Product_ID and Sale_Date columns to the sorting area. We dragged the Product_ID column to the Partitioning area and we set our moving window to 3. We then checked all of the OLAP functions on the top right, including the OLAP with Partitioning. The next slide will show the SQL automatically generated in the SQL tab.
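
Before looking at the full generated query, it may help to see just two of these functions in isolation. The sketch below is an assumed, trimmed-down example rather than Nexus output; the column aliases Cumulative_Sum and Moving_Sum_3 are hypothetical names added for readability. A moving window of 3 is written as ROWS 2 PRECEDING, because the window is the current row plus the two rows before it.

```sql
-- Sketch only: a cumulative sum and a 3-row moving sum, restarting for each Product_ID.
SELECT Sal.Product_ID, Sal.Sale_Date, Sal.Daily_Sales,
       SUM(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID
                                  ORDER BY Sal.Sale_Date ASC
                                  ROWS UNBOUNDED PRECEDING) AS Cumulative_Sum,
       SUM(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID
                                  ORDER BY Sal.Sale_Date ASC
                                  ROWS 2 PRECEDING) AS Moving_Sum_3
FROM SQL_CLASS.Sales_table Sal;
```

Dropping the PARTITION BY clause gives the non-partitioned versions, where the sums run across all products in sort order.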

Analytics Tab – OLAP Example of SQL Generated

SELECT Sal.Product_ID, Sal.Sale_Date, Sal.Daily_Sales, SUM(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS UNBOUNDED PRECEDING) , SUM(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS 2 PRECEDING) , AVG(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS 2 PRECEDING) , Sal.Daily_Sales - SUM(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS BETWEEN 2 PRECEDING AND 2 PRECEDING) , COUNT(*) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS UNBOUNDED PRECEDING) , MIN(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ) , MAX(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ) , SUM(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS UNBOUNDED PRECEDING) , SUM(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS 2 PRECEDING) , AVG(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS 2 PRECEDING) , AVG(Sal.Daily_Sales) OVER ( ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS UNBOUNDED PRECEDING) , Sal.Daily_Sales - SUM(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS BETWEEN 2 PRECEDING AND 2 PRECEDING) , COUNT(*) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ROWS UNBOUNDED PRECEDING) , MIN(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ) , MAX(Sal.Daily_Sales) OVER (PARTITION BY Sal.Product_ID ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC ) FROM SQL_CLASS.Sales_table Sal;

The SQL above might take hours to write, but with the Nexus it can be generated in 30 seconds.

Analytics Tab – Grouping Sets Example

This report will generate Grouping Sets that also include Rollup and Cube. Notice that we dragged the Product_ID to the Product. We dragged the Sale_Date column to the Date Column and we dragged the Daily_Sales column to the Sum. We then checked the Grouping Sets, Rollup and Cube on the top right. The report is now ready to be executed.

Analytics Tab – Grouping Sets Answer Set

Notice now that there are three Result Sets. The picture above shows Result Set 3 which is the Group by Grouping Sets. The Result 1 tab will show the Group by Rollup and the Result 2 tab will show the Group by Cube.
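
The three result sets correspond to three separate queries. The sketch below shows what such queries typically look like; it is an assumption about their shape, not the SQL Nexus actually emits. The alias Total_Sales is hypothetical, the table and column names come from the Sales_table example in this chapter, and support for these GROUP BY extensions varies by database.

```sql
-- Result 1 (sketch): subtotals per Product_ID plus a grand total.
SELECT Product_ID, Sale_Date, SUM(Daily_Sales) AS Total_Sales
FROM SQL_CLASS.Sales_table
GROUP BY ROLLUP (Product_ID, Sale_Date);

-- Result 2 (sketch): every combination of the two grouping columns.
SELECT Product_ID, Sale_Date, SUM(Daily_Sales) AS Total_Sales
FROM SQL_CLASS.Sales_table
GROUP BY CUBE (Product_ID, Sale_Date);

-- Result 3 (sketch): one set of totals by product and one by date.
SELECT Product_ID, Sale_Date, SUM(Daily_Sales) AS Total_Sales
FROM SQL_CLASS.Sales_table
GROUP BY GROUPING SETS (Product_ID, Sale_Date);
```

In each answer set, a NULL in Product_ID or Sale_Date marks a row that is a subtotal across that column.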

Nexus Data Movement

“If you have ever had to build a load script or convert table structures between different systems, you have experienced the impossible. We spent 7 years to make sure our users could do it with a single click of a button.” - Tera-Tom Coffing

Moving a Single Table To a Different System

Just Right Click on a table and choose "Move Data".

Just right click on any single table and select Move Data. The data movement screen will appear. Check out the next slide.

The Single Table Data Movement Screen

Lite Speed is for smaller tables, but Warp Speed is for large tables.

This button will show the size and number of rows of the source table

Choose your Target system and put in your login information once and Nexus will remember the next time. Above, we are moving the Addresses table from SQL Server to Teradata. When the EXECUTE button is hit, the table is converted automatically and moved. Simple and easy! Wait until you see the Database Mover! It is next.

Moving an Entire Database To a Different System

Just Right Click on a Database and choose "Move Data".

Just right click on any database and select Move Data. The database mover screen will appear. Check out the next slide.

The Database Mover Screen

Lite Speed is for small tables. Warp Speed is for large tables. Auto chooses Lite or Warp based on size parameters in the Options tab.

Check the tables or move through the Views and then press the blue button.

Select all the tables, a single table, some of the tables, or choose to move through the views (bottom left), and then press the blue arrow button. We are moving 19 tables from SQL Server to Teradata. Hit EXECUTE and all of the tables move. Don't forget, though, to check out the Options tab. You can set your parameters there. The next slide will show the Options tab.

The Database Mover Options Tab

The Row Count and Table size parameters take effect for Lite or Warp Speed if you select AUTO on the previous screen.

The Options tab allows you to set more detailed parameters. Once you set them the first time, the Nexus Chameleon will remember them the next time (as defaults). You can change them as you see fit.

Converting DDL Table Structures

1. Right Click on a Database and a menu appears.
2. Choose CONVERT TABLE STRUCTURES.
3. Pick the Database you want to convert to.

Right Click on a Database

We right clicked on a Teradata database and chose Convert Table Structures and then chose to convert the Teradata tables to Hadoop. Check out the next couple of screens on the following pages.

Converting DDL Table Structures

Check the tables you want converted and press the big blue button.

You can click on the table’s box (red arrow) and it checks all the tables. You can uncheck any table you don't want, but once you have the tables you want converted checked, then you just press the big blue arrow. The tables will move over to the right in the To Be Converted area. Just hit Execute (at the top) and the table structures (DDL) will be converted. This example has converted 19 Teradata tables to Hadoop table structures. Check out the DDL Nexus creates on the next page.

Converting DDL Table Structures

Nexus converts and creates the new DDL. You can log on to the target system and paste in the table structures. This complex and difficult project sometimes takes a month, but with the Nexus it takes a minute.
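
To give a feel for what a table structure conversion involves, here is a hand-written sketch of one table in Teradata DDL and a possible Hadoop (Hive) equivalent. Nothing here is Nexus output. The column data types and lengths, the choice of Subscriber_No as the primary index, and the ORC storage format are all assumptions made for illustration, and the two statements run on different systems.

```sql
-- Teradata (sketch): data types and the primary index are assumed.
CREATE TABLE SQL_CLASS.Addresses
( Subscriber_No INTEGER NOT NULL,
  Street        VARCHAR(30),
  City          VARCHAR(20),
  "State"       CHAR(2),
  Zip           INTEGER
) PRIMARY INDEX (Subscriber_No);

-- Hive (sketch): no primary index; INTEGER becomes INT; a storage format is chosen.
CREATE TABLE sql_class.addresses
( subscriber_no INT,
  street        VARCHAR(30),
  city          VARCHAR(20),
  state         CHAR(2),
  zip           INT
) STORED AS ORC;
```

The work in a real conversion is in the details that do not map one to one, such as index definitions, data types without a direct counterpart, and reserved words.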

Compare and Synchronize

“Cloud computing will provide a necessity for companies to compare and synchronize tables and data across platforms. Nexus has once again shown that it is ahead of the curve.” - Tera-Tom Coffing

Compare Two Different Databases From Different Systems

Our source system is a SQL Server system

Our target system is a Teradata system

Drag up or down the table names so the target and source tables are aligned

Uncheck any boxes you don't want in the Comparison

Nexus can compare two separate databases table by table across different platforms. Choose your source system and the database and then choose your target system and the database. Line up the tables for comparison and hit EXECUTE!

Comparisons Down to the Column Level

Hit the Columns Tab and see a comparison of each table's columns

Blue column colors indicate the compare key(s)

You can drag columns to rearrange them to align perfectly from source to target

Nexus even shows you the column by column comparisons. You can also move the column names up or down to make sure that everything is aligned down to the column level. Check the columns you want compared and hit EXECUTE!

The Results Tab

These tables had some differences. Hit Go to see the differences.

No Differences between these tables

Scroll Bar so you can see all your tables

The Results Tab will show you all of the table comparisons. If two tables being compared were exactly the same then the result will be NO DIFFERENCES. If there are differences between two tables you can VIEW DIFFERENCES.

View Differences

The In Both With Differences Tab shows the rows that exist in both tables, but differ in one or more column values

The Full Differences Tab shows the differences in both tables

The In Source, Not Target Tab shows rows that are in the Source, but Not in the Target

The In Target, Not Source Tab shows rows that are in the Target, but Not in the Source

The View Differences Tab will show you the differences between two tables being compared. Above, you can see that there are two extra rows in the Target (Teradata table) that do not reside in the Source (SQL Server table).
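
If you wanted to check the same thing by hand, the set operator EXCEPT expresses the "In Target, Not Source" and "In Source, Not Target" ideas directly. The sketch below assumes that copies of both tables are reachable from one system and have identical column lists; the table names Addresses_Source and Addresses_Target are hypothetical. It says nothing about how Nexus performs its comparison across platforms.

```sql
-- Rows that are in the target but not in the source (sketch).
SELECT * FROM Addresses_Target
EXCEPT
SELECT * FROM Addresses_Source;

-- Rows that are in the source but not in the target (sketch).
SELECT * FROM Addresses_Source
EXCEPT
SELECT * FROM Addresses_Target;
```

Because EXCEPT compares whole rows, a row whose key matches but whose other column values differ shows up in both answer sets.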

Synchronizing Differences In the Results Tab

We are synchronizing the Source to the Target.

Hit GO and see the next screen!

The Results tab will either show Differences or No Differences. The Drop Down arrow next to the GO button will allow you to Synchronize the Source to the Target or the Target to the Source. Above, we have chosen to Sync the Source to the Target.

Synchronizing Differences In the Results Tab

We are synchronizing the Source to the Target. You can now hit the Perform Synchronization button.

The final Synchronization screen gives you more options, but when you are ready to perform the synchronization you can hit the Perform Synchronization button and the magic will happen.

Hound Dog Compression

“Using Multi-Value compression on a Teradata table is a win-win. Large tables are about 35% smaller, 35% faster and take up about 35% less network traffic. The only negative is you have to figure out the correct algorithm and write the DDL. We spent a long time making this all happen automatically.” - Tera-Tom Coffing

Hound Dog Compression on Teradata

Right Click on a Teradata Database

Save yourself an enormous amount of money by using the Hound Dog Compression tool. It is as easy as a right click on a Teradata database and then choosing Hound Dog Compress Database. Check out the next screen to see how easily it is done.

Hound Dog Compression on Teradata

Check the tables and press the blue button.

You can click on the table’s box (red arrow) and it checks all the tables. You can uncheck any table you don't want, but once the tables you want compressed are checked, just press the big blue arrow. The tables will move over to the right in the Table to Compress area. Just hit Execute (at the top) and the tables will be compressed. You can then look at the dashboard tab (top right) and see the compression savings for every table.
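
For readers who have not seen Multi-Value compression, this is roughly what the DDL looks like when written by hand in Teradata. The sketch is not Hound Dog output: the table name Addresses_Compressed, the data types and the compressed value lists are all hypothetical. In practice the value lists come from analyzing which values occur most often in each column.

```sql
-- Sketch only: each COMPRESS list names frequent values to be stored once
-- in the table header instead of in every row.
CREATE TABLE SQL_CLASS.Addresses_Compressed
( Subscriber_No INTEGER NOT NULL,
  Street        VARCHAR(30),
  City          VARCHAR(20) COMPRESS ('Dallas', 'Chicago', 'Atlanta'),
  "State"       CHAR(2)     COMPRESS ('CA', 'NY', 'TX')
) PRIMARY INDEX (Subscriber_No);
```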
