Table of contents : Cover Title Page Copyright Page About the Author About the Reviewer Acknowledgement Preface Table of Contents 1. Writing Basic SQL Queries Introduction Structure Objectives Databricks The Databricks Web interface Getting started with Databricks notebooks Cells Spark SQL Databricks databases (schemas) and tables Activating a database Displaying the data in a table How does it work Help writing SQL Limiting the number of records displayed How does it work Displaying data from a specific field How does it work Finding the columns in a table SQL writing style Displaying data from a specific set of fields How does it work Columns or fields Modifying the field name in the output using aliases How does it work Removing duplicates from query output How does it work Undo and redo Sorting data How does it work Sorting data in reverse order How does it work Applying multiple sort criteria How does it work Running SQL queries Displaying the available tables Finding all the views in a database Resizing book cells Overriding the 1,000-row output limit Conclusion 2. Filtering Data Introduction Structure Objectives Filtering text How does it work Applying multiple text filters How does it work SQL writing style Excluding an element How does it work Using multiple exclusion filters How does it work Filtering numbers over a defined threshold How does it work Filtering numbers under a defined threshold How does it work Filtering on values up to and including a specific number How does it work Filtering on a range of values How does it work Using Boolean filters (true or false) How does it work Conclusion 3. Applying Complex Filters to Queries Introduction Structure Objectives Using either/or filters How does it work Using multiple separate criteria concurrently How does it work Using multiple filters and an exclusion How does it work Filtering on both text and numbers simultaneously How does it work Applying complex alternative filters at the same time How does it work Removing case-sensitivity in filters How does it work Using wildcard searches How does it work Forcing case-insensitivity in wildcard filters How does it work Using wildcards to exclude data How does it work Applying alternative wildcard patterns How does it work Using a specific part of the text to filter data How does it work Filtering NULLs or nonexistent data How does it work Searching using regular expressions How does it work Conclusion 4. Simple Calculations Introduction Structure Objectives Doing simple math How does it work Examining data types in SQL tables and views Isolating sections of formulas when applying math How does it work Calculating ratios How does it work Increasing values by a defined percentage How does it work Ordering the output of calculations How does it work NULLs Handling missing data How does it work Filtering on a calculation How does it work Using complex calculated filters How does it work Operator precedence Conclusion 5. Aggregating Output Introduction Structure Objectives Calculating table totals How does it work Using calculated aggregations How does it work Using grouped aggregations How does it work Using multiple levels of grouping How does it work Calculating averages How does it work Counting grouped elements How does it work Counting unique elements How does it work Displaying upper and lower numeric thresholds How does it work Aggregating across multiple columns How does it work Aggregating across columns and rows How does it work Filtering groups How does it work Filtering on aggregated results How does it work Selecting data based on aggregated results as well as specific filter criteria How does it work Query analysis Sorting by aggregated results How does it work Finding elements where every Boolean value is true How does it work Conclusion 6. Working with Dates in Databricks Introduction Structure Objectives Dates in Databricks Filtering records by date How does it work Date datatypes in tables Using a range of dates to filter data How does it work Finding the number of days between two dates How does it work Aggregating data over a date range How does it work Filtering by year How does it work Filtering records over a series of years How does it work Find sales for a specific day of the month How does it work Isolating data for a specific year and month How does it work Finding data for a given quarter How does it work Filtering data by weekday How does it work Finding records for a specific week of the year How does it work Aggregating data by the day of the week in a given year How does it work Analyzing data by day of year How does it work Grouping data by the full weekday How does it work Displaying cumulative data over a period of months to a specific date How does it work Displaying cumulative data over 90 days up to a specific date How does it works Displaying the data for the previous three months How does it work Finding the current system date Conclusion 7. Formatting Text in Query Output Introduction Structure Objectives Adding text to the output How does it work Adding text to numbers How does it work Amalgamating columns How does it work Avoiding NULLs in text-based data How does it work Concatenating and grouping How does it work Adding multiple pieces of text to numbers How does it work Converting text to uppercase How does it work Converting text to lowercase How does it work Converting text to initial capitals Extracting the first few characters from a field How does it work Displaying the three characters at the right of the text How does it work Displaying a given number of characters at a specific place in the text How does it work Replacing NULLs with the contents of another field How does it work Filtering records based on the part of a field How does it work Filtering data using specific characters at a given position inside a field How does it work Conclusion 8. Formatting Numbers and Dates Introduction Structure Objectives Removing the decimals from the output How does it work Rounding a field up to the nearest whole number How does it work Rounding a value to the nearest whole number How does it work Rounding to a specific number of decimals Rounding a value up or down to the nearest thousand How does it work Banker’s rounding Displaying a value in a specific numeric format How does it work Displaying a value in a specific currency How does it work Outputting a date in a specific date format How does it work Presenting the time in a specific format Conclusion 9. Using Basic Logic to Enhance Analysis Introduction Structure Objectives Generating an alert when a value is too high How does it work Shortening text and adding ellipses to indicate truncation How does it work Designing complex calculated alerts How does it work Creating key performance indicators How does it work Classifying a series of elements without the necessary categories present in your data How does it work Creating ad hoc category groupings How does it work Applying multiple ad hoc categories How does it work Categorizing data using nested classifications How does it work Choosing elements from a list How does it work Placing nulls at the start or end of a list How does it work Conclusion 10. Using Multiple Tables When Querying Data Introduction Structure Objectives Joining tables How does it work Other join syntax Joining multiple tables How does it work Join fields Join subtleties How does it work Using table aliases How does it work Joining many tables How does it work Visualizing databases Querying across databases Conclusion 11. Using Advanced Table Joins Introduction Structure Objectives Filtering data using inner joins Filtering data using multiple tables join How does it work Semi joins How does it work Filtering data output using intermediate tables How does it work Using left joins to return all the data in one table but not from the other table How does it work Right joins to return all the data in one table but not from the other How does it work Full joins to return all the data from both tables in a join How does it work Intermediate table joins How does it work Using multiple fields in joins How does it work Joining a table to itself How does it work Joining tables on ranges of values How does it work Cross joins How does it work Join concepts Conclusion 12. Subqueries Introduction Structure Objectives Adding aggregated fields to detailed datasets How does it work Displaying a value as the percentage of a total How does it work Using a subquery to filter data How does it work Using a subquery as part of a calculation to filter data How does it work Filtering on an aggregated range of data using multiple subqueries How does it work Filtering on aggregated output using a second aggregation How does it work Using multiple results from a subquery to filter data How does it work Complex aggregated subqueries How does it work Nested subqueries How does it work Using subqueries to exclude data How does it work Filtering across queries and subqueries How does it work Applying separate filters to the Subquery and the main query How does it work Conclusion 13. Derived Tables Introduction Structure Objectives Using a derived table to create intermediate calculations How does it work Grouping and ordering data using a custom classification How does it work Joining derived tables with other tables How does it work Joining multiple derived tables How does it work Using multiple derived tables for complex aggregations How does it work Data visibility Using derived tables to join unconnected tables How does it work Compare year-on-year data using a derived table How does it work Synchronizing filters between a derived table and the main query How does it work Conclusion 14. Common Table Expressions Introduction Structure Objectives Simplifying complex queries with CTEs A basic common table expression How does it work Calculating averages across multiple values using a CTE How does it work CTE or derived table? Reusing CTEs in a query How does it work Using a CTE in a derived table to deliver two different levels of aggregation How does it work Using a CTE to isolate data from a separate dataset at a different level of detail How does it work Multiple common table expressions in a single query How does it work Nested CTEs How does it work Using multiple CTEs to compare disparate datasets How does it work Conclusion 15. Correlated Subqueries Introduction Structure Objectives Simple correlated subqueries How does it work Correlated subqueries to display percentages of a specific total How does it work Comparing datasets using a correlated subquery How does it work Avoid correlated subqueries in certain cases Duplicating the output of a correlated subquery in the query results How does it work Using correlated subqueries to filter data on an aggregate value How does it work What makes a query correlated Using correlated subqueries to detect if records exist How does it work Using a correlated subquery to exclude data How does it work Conclusion 16. Dataset Manipulation Introduction Structure Objectives Read data from multiple identical tables using the UNION operator How does it work Isolate identical data in multiple tables using the INTERSECT operator How does it work Isolating nonidentical records using the EXCEPT operator How does it work Joining multiple identical tables in a subquery How does it work Conclusion 17. Using SQL for More Advanced Calculations Introduction Structure Objectives Calculating the percentage represented by each record in a dataset How does it work Replacing multiple subqueries How does it work Remove decimals in calculations How does it work Numeric datatypes Converting between numeric datatypes How does it work Avoiding divide-by-zero errors How does it work Finding the remainder in a division using the modulo function How does it work Creating financial calculations How does it work Using a tally table to produce a sequential list of numbers How does it work Generating completely random sample output from a dataset How does it work Handling source data where figures are stored as text How does it work Conclusion 18. Segmenting and Classifying Data Introduction Structure Objectives Organizing data by rank How does it work Creating multiple groups of rankings How does it work Creating multiple ranked groups and subgroups How does it work Filtering data by ranked items How does it work Classifying data by strict order of rank How does it work Segment data into deciles How does it work Plot values for a percentile How does it work Extracting data from a specific quintile How does it work Display median values How does it work Conclusion 19. Rolling Analysis Introduction Structure Objectives Adding a running total How does it work Using windowing functions in an aggregated query How does it work Grouping running totals How does it work Applying windowing functions to a subquery How does it work Adding unique ids on the fly using ROW_NUMBER() How does it work Displaying records for missing data How does it work Displaying a complete range of dates and relevant data How does it work Comparing data with the data from a previous record How does it work Comparing data over time using the FIRST_VALUE() and LAST_VALUE() functions How does it work Displaying rolling averages over a specified number of records How does it work Show the first sale and last four sales per client How does it work Calculating cumulative distribution How does it work Classifying data using the PERCENT_RANK() function How does it work Using the LAG() function with alphabetical data How does it work Conclusion 20. Analyzing Data Over Time Introduction Structure Objectives Aggregating values for the year to date How does it work Isolating data for the previous month How does it work Using a derived table to compare data with values from a previous year How does it work Finding the total amount for sales each weekday over a year How does it work Count the number of weekend days between two dates How does it work Aggregate data for the last day of the month How does it work Aggregate data for the last Friday of the month How does it work Analyzing timespans as years, months, and days How does it work Isolate time periods from date and time data How does it work Listing data by time of day How does it work Aggregating data by hourly bandings How does it work Aggregate data by a quarter of an hour How does it work Reading dates and times stored as strings How does it work Conclusion 21. Complex Data Output Introduction Structure Objectives Creating a pivot table How does it work Creating a pivot table displaying multiple row groupings How does it work Adding totals to aggregate queries How does it work Creating subtotals and totals in aggregated queries How does it work Creating clear tables that include totals and subtotals How does it work Replace acronyms with full text in the final output How does it work Conclusion Index