SQL for Data Engineering

My full course to help you build production data pipelines with SQL

All video lessons are free on YouTube
Supporter Access unlocks structure, guided practice, and community

Unlock Full Access

Pricing

FREE OPTION

Free

Features:

🎥 Full YouTube Course (14+ hours)

🧑‍💻 Two Complete Portfolio Projects

🔗 Links to Required Materials & Resources

📊 Real-World Dataset (2023 to mid-2025)

SUPPORTERS

$49

One-time payment — Supporter Access adds:

🧪 170+ Interview-Level SQL Problems

📺 Playlist-Style Lesson Videos

⏳ Progress Tracking

💬 Community Access

📝 Course Notes

📋 Cheat Sheets

🏆 Certificate of Completion

🎁 Full Real-World Dataset (2023–Present)

Course Outline

0️⃣ Course Intro

⏱️ 19 mins 📚 3 Lessons

Course Intro

⏱️ 6 mins 📦 1 concept

Course Intro

⏱️ 6 mins

What is SQL

⏱️ 6 mins 📦 1 concept

What is SQL

⏱️ 6 mins

Data & Pipeline Intro

⏱️ 6 mins 📦 1 concept

Data & Pipeline Intro

⏱️ 6 mins

1️⃣ SQL Foundations

⏱️ 4 hrs 12 mins 📚 12 Lessons

SQL & Dataset Setup

⏱️ 14 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

Where are we running SQL

⏱️ 3 mins

Create a MotherDuck Account

⏱️ 2 mins

MotherDuck UI Walkthrough

⏱️ 5 mins

Database Setup

⏱️ 4 mins

Basic Keywords

⏱️ 19 mins 📦 9 concepts

Lesson Intro

⏱️ <1 min

SELECT * / FROM

⏱️ 4 mins

LIMIT

⏱️ 1 min

DISTINCT

⏱️ 2 mins

WHERE

⏱️ 3 mins

IS NULL / IS NOT NULL

⏱️ 2 mins

Commenting Code

⏱️ 2 mins

ORDER BY

⏱️ 2 mins

Order of Commands

⏱️ 2 mins

DuckDB - Friendly Syntax

⏱️ 2 mins

Comparison & Logical Operators

⏱️ 23 mins 📦 5 concepts

Lesson Intro

⏱️ <1 min

Intro to Operators

⏱️ 1 min

Comparison Operators - Pt.1 (=, !=, <, >)

⏱️ 6 mins

Logical Operators (AND, OR, NOT)

⏱️ 6 mins

Comparison Operators - Pt.2 (BETWEEN, IN)

⏱️ 3 mins

Final Example

⏱️ 6 mins

Wildcards & Aliases

⏱️ 11 mins 📦 3 concepts

Lesson Intro

⏱️ <1 min

Wildcards w/ LIKE

⏱️ 4 mins

Alias w/ AS

⏱️ 2 mins

Final Example

⏱️ 4 mins

Arithmetic Operators

⏱️ 12 mins 📦 4 concepts

Lesson Intro

⏱️ <1 min

Arithmetic Operators Intro

⏱️ <1 min

Addition & Subtraction

⏱️ 5 mins

Multiplication & Division

⏱️ 3 mins

Modulus (%)

⏱️ 3 mins

Aggregate Functions

⏱️ 17 mins 📦 9 concepts

Lesson Intro

⏱️ <1 min

Aggregate Function Intro

⏱️ 1 min

COUNT()

⏱️ 2 mins

COUNT(DISTINCT)

⏱️ 1 min

SUM()

⏱️ 1 min

AVG()

⏱️ <1 min

GROUP BY

⏱️ 3 mins

MIN() / MAX()

⏱️ 2 mins

MEDIAN()

⏱️ 3 mins

HAVING

⏱️ 3 mins

Terminal Intro

⏱️ 33 mins 📦 5 concepts

Lesson Intro

⏱️ 1 min

Intro to the Terminal

⏱️ 4 mins

Installing / Opening the Terminal

⏱️ 7 mins

Basic Terminal Commands (pwd, ls, cd)

⏱️ 4 mins

Working with Files & Folders (mkdir, touch, rm)

⏱️ 15 mins

Getting Help

⏱️ 2 mins

Local DuckDB Intro

⏱️ 38 mins 📦 10 concepts

Lesson Intro

⏱️ <1 min

Local DuckDB Intro

⏱️ 2 mins

Install DuckDB (Disclaimer)

⏱️ 4 mins

Install DuckDB - Windows Users ("Easy" Option)

⏱️ 4 mins

Install DuckDB - Windows Users (Pro Option)

⏱️ 2 mins

Install DuckDB - Mac Users (Easy Option)

⏱️ 6 mins

Install DuckDB - Mac Users (Pro Option)

⏱️ 3 mins

Local DuckDB Terminal Intro

⏱️ 2 mins

Local DuckDB Database

⏱️ 6 mins

Local DuckDB UI

⏱️ 4 mins

Local DuckDB Connect to MotherDuck

⏱️ 4 mins

VS Code Intro

⏱️ 23 mins 📦 5 concepts

Lesson Intro

⏱️ 1 min

Why VS Code?

⏱️ 3 mins

VS Code Install & Intro

⏱️ 4 mins

VS Code SQL Setup

⏱️ 5 mins

Setting up DuckDB & MotherDuck

⏱️ 8 mins

Getting Help w/ GitHub Copilot

⏱️ 3 mins

Data Modeling Pt.1

⏱️ 20 mins 📦 3 concepts

Lesson Intro

⏱️ 1 min

Databases, Schemas, & Tables

⏱️ 6 mins

Entity Relationship Diagram (ERD)

⏱️ 7 mins

Database Metadata (information_schema)

⏱️ 7 mins

JOINs

⏱️ 22 mins 📦 6 concepts

Lesson Intro

⏱️ <1 min

What are JOINs?

⏱️ 1 min

LEFT JOIN

⏱️ 8 mins

RIGHT JOIN

⏱️ 2 mins

INNER JOIN

⏱️ 2 mins

FULL OUTER JOIN

⏱️ 2 mins

Final Example

⏱️ 7 mins

Order of Execution

⏱️ 19 mins 📦 5 concepts

Lesson Intro

⏱️ <1 min

Query Processing 101

⏱️ 2 mins

SQL Clause Order

⏱️ 1 min

Order of Execution

⏱️ 2 mins

Final Example Pt.1 - Query Order Execution

⏱️ 7 mins

Final Example Pt.2 - Execution w/ EXPLAIN

⏱️ 8 mins

📊 SQL Exploratory Data Analysis — Project 1

⏱️ 1 hr 40 mins 📚 7 Lessons

Project #1 Intro

⏱️ 8 mins 📦 3 concepts

Lesson Intro

⏱️ 1 min

Background: Data Warehouse

⏱️ 2 mins

Project #1 Goal

⏱️ 2 mins

Project #1 Scope

⏱️ 3 mins

EDA #1 - In-Demand Skills

⏱️ 9 mins 📦 1 concept

EDA #1 - In-Demand Skills

⏱️ 9 mins

EDA #2 - Highest Paying Skills

⏱️ 9 mins 📦 1 concept

EDA #2 - Highest Paying Skills

⏱️ 9 mins

EDA #3 - Most Optimal Skills

⏱️ 15 mins 📦 1 concept

EDA #3 - Most Optimal Skills

⏱️ 15 mins

README.md Build

⏱️ 16 mins 📦 3 concepts

Lesson Intro

⏱️ <1 min

README.md Intro

⏱️ 2 mins

Markdown Basics

⏱️ 7 mins

README.md Build

⏱️ 7 mins

Git & GitHub Pt.1

⏱️ 36 mins 📦 6 concepts

Lesson Intro

⏱️ 1 min

Git vs. GitHub

⏱️ 5 mins

Homebrew & Git Install (Mac Users Only)

⏱️ 3 mins

Git Setup (git config)

⏱️ 2 mins

Create Local Repository (git init, add, commit)

⏱️ 13 mins

GitHub Setup w/ Remote Repository

⏱️ 2 mins

Push & Pull Repo w/ GitHub (git push, git pull)

⏱️ 10 mins

Share Project #1

⏱️ 7 mins 📦 2 concepts

Lesson Intro

⏱️ <1 min

GitHub Final Push - Add README.md

⏱️ 3 mins

LinkedIn - Project Share & Post

⏱️ 3 mins

2️⃣ Production SQL

⏱️ 6 hrs 22 mins 📚 13 Lessons

Data Types

⏱️ 17 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

Data Types Intro

⏱️ 2 mins

Common Data Types

⏱️ 4 mins

Check Column Data Type

⏱️ 2 mins

CAST Operator

⏱️ 8 mins

DDL & DML Pt.1

⏱️ 38 mins 📦 8 concepts

Lesson Intro

⏱️ 2 mins

DDL vs. DML Intro

⏱️ 4 mins

CREATE / DROP DATABASE

⏱️ 4 mins

CREATE / DROP SCHEMA

⏱️ 4 mins

CREATE / DROP TABLE

⏱️ 9 mins

INSERT INTO

⏱️ 5 mins

ALTER TABLE - ADD / DROP COLUMN

⏱️ 2 mins

UPDATE

⏱️ 2 mins

ALTER TABLE - RENAME TABLE & RENAME/ALTER COLUMN

⏱️ 6 mins

DDL & DML Pt.2

⏱️ 25 mins 📦 6 concepts

Lesson Intro

⏱️ 1 min

DDL & DML - Refresher

⏱️ 3 mins

CTAS - CREATE TABLE AS SELECT

⏱️ 5 mins

CREATE VIEW

⏱️ 5 mins

CREATE TEMP TABLE

⏱️ 4 mins

DELETE

⏱️ 3 mins

TRUNCATE

⏱️ 4 mins

Subqueries and CTEs

⏱️ 36 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

What are Subqueries & CTEs?

⏱️ 4 mins

Subquery

⏱️ 10 mins

CTEs - Common Table Expressions

⏱️ 9 mins

Final Example - Existence Filtering w/ EXISTS

⏱️ 11 mins

DDL & DML Pt.3

⏱️ 39 mins 📦 6 concepts

Lesson Intro

⏱️ 1 min

Batch vs. Continuous Processing

⏱️ 5 mins

priority_roles - Table Load

⏱️ 3 mins

priority_jobs_snapshot - Initial Load

⏱️ 7 mins

UPDATE / INSERT / DELETE (Refresher)

⏱️ 14 mins

MERGE INTO

⏱️ 7 mins

CTAS vs. MERGE

⏱️ 3 mins

Data Modeling Pt.2

⏱️ 23 mins 📦 6 concepts

Lesson Intro

⏱️ 1 min

Data Modeling - Refresher

⏱️ 1 min

Why Data Modeling Matters?

⏱️ 3 mins

Source Systems to Analytical Systems

⏱️ 3 mins

Choosing a Database: OLTP vs OLAP

⏱️ 3 mins

Core Design Patterns

⏱️ 8 mins

SCDs - Slowly Changing Dimensions

⏱️ 4 mins

CASE Expressions

⏱️ 21 mins 📦 3 concepts

Lesson Intro

⏱️ <1 min

CASE Expressions

⏱️ 2 mins

CASE: Engineering Use Cases

⏱️ 12 mins

Final Example

⏱️ 6 mins

Date Functions

⏱️ 22 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

Intro to Dates

⏱️ 3 mins

EXTRACT()

⏱️ 5 mins

DATE_TRUNC()

⏱️ 5 mins

AT TIME ZONE

⏱️ 9 mins

SET Operators

⏱️ 17 mins 📦 2 concepts

Lesson Intro

⏱️ 1 min

UNION / INTERSECT / EXCEPT

⏱️ 6 mins

Final Example

⏱️ 10 mins

Text & NULL Functions

⏱️ 18 mins 📦 4 concepts

Lesson Intro

⏱️ <1 min

Text Functions - REPLACE / CONCAT

⏱️ 7 mins

Final Example - Text Functions

⏱️ 3 mins

NULL Functions - NULLIF / COALESCE

⏱️ 6 mins

Final Example - NULL Functions

⏱️ 2 mins

Window Functions

⏱️ 34 mins 📦 8 concepts

Lesson Intro

⏱️ 1 min

What are Window Functions?

⏱️ 3 mins

Window Function Syntax

⏱️ 3 mins

PARTITION BY

⏱️ 5 mins

ORDER BY

⏱️ 3 mins

PARTITION & ORDER BY

⏱️ 6 mins

Aggregation Functions

⏱️ 3 mins

Row & Rank Functions

⏱️ 5 mins

Navigation Functions

⏱️ 6 mins

Nested Functions

⏱️ 50 mins 📦 8 concepts

Lesson Intro

⏱️ 1 min

Intro to Nested Data Structures

⏱️ 5 mins

Arrays

⏱️ 7 mins

Structs

⏱️ 6 mins

Array of Structs

⏱️ 4 mins

Maps

⏱️ 6 mins

JSON - JavaScript Object Notation

⏱️ 6 mins

Final Example - Arrays

⏱️ 9 mins

Final Example - Array of Structs

⏱️ 6 mins

Git & GitHub Pt.2

⏱️ 42 mins 📦 7 concepts

Lesson Intro

⏱️ 1 min

What is a Branch?

⏱️ 4 mins

Managing Branches (git branch, git switch)

⏱️ 5 mins

Making Changes on a Branch

⏱️ 4 mins

Merging Branches: Fast Forward Merge

⏱️ 6 mins

Merging Branches: Three-Way Merge

⏱️ 10 mins

Pull Requests (PRs)

⏱️ 3 mins

.gitignore File

⏱️ 9 mins

🏗️ End-to-End Data Pipeline — Project 2

⏱️ 2 hrs 12 mins 📚 7 Lessons

Project #2 Intro

⏱️ 11 mins 📦 3 concepts

Lesson Intro

⏱️ 1 min

Data Warehouse vs. Data Mart - Recap

⏱️ 2 mins

Project #2 Goals

⏱️ 2 mins

Project #2 Scope

⏱️ 5 mins

Build Data Warehouse

⏱️ 38 mins 📦 5 concepts

Lesson Intro

⏱️ <1 min

Project #2 Git Workflow

⏱️ 2 mins

Create Star Schema Tables

⏱️ 11 mins

Load Data into Data Warehouse

⏱️ 16 mins

Data Validation

⏱️ 5 mins

Merge Feature Branch to Development

⏱️ 2 mins

Build Flat Table Mart

⏱️ 19 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

Why Build a Flat Table Mart?

⏱️ 2 mins

Build Flat Table Mart

⏱️ 11 mins

Data Validation

⏱️ 3 mins

Commit & Merge

⏱️ 2 mins

Build Skills Mart

⏱️ 28 mins 📦 4 concepts

Lesson Intro

⏱️ 1 min

Why Build a Skill Demand Mart?

⏱️ 3 mins

Building Skill Demand Mart

⏱️ 19 mins

Data Validation

⏱️ 2 mins

Update Master Build Script & Commit/Merge

⏱️ 2 mins

Build Priority Mart

⏱️ 22 mins 📦 6 concepts

Lesson Intro

⏱️ 1 min

Why Build This Priority Mart?

⏱️ 1 min

Create the Priority Mart

⏱️ 7 mins

Incremental Updates to Mart

⏱️ 6 mins

Update Master Build Script & Commit/Merge

⏱️ 2 mins

MotherDuck Deployment of DW & Mart

⏱️ 4 mins

Optional Exercise: Build Company Mart

⏱️ 1 min

README.md Build

⏱️ 11 mins 📦 3 concepts

Lesson Intro

⏱️ 1 min

Build Project #2 README.md

⏱️ 6 mins

Update Main Repo README.md

⏱️ 1 min

Commit & Merge

⏱️ 3 mins

Share Project #2

⏱️ 4 mins 📦 1 concept

Lesson Intro

⏱️ 1 min

LinkedIn Updates

⏱️ 4 mins

Course Resources

💽 Course Dataset — SQL Environment

This is the primary dataset used throughout the entire course. It contains real-world data engineering & analytics job postings (2023 to mid-2025) and is hosted in MotherDuck for instant querying.

🔗 Step 1 — Sign in to MotherDuck

Create your free account 👉 https://lukeb.co/motherduck

💻 Step 2 — Attach Database

Run this SQL inside the MotherDuck editor:

ATTACH 'md:_share/data_jobs/87603155-cdc7-4c80-85ad-3a6b0d760d93';

📊 Project 1 — SQL Exploratory Data Analysis

Explore real-world job data using SQL to uncover in-demand skills, salary trends, and hiring patterns. You’ll practice EDA techniques and build your first portfolio-ready project.

🔗 Project #1 Repo

👉 https://lukeb.co/sql-de-project1
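
As a rough illustration of the kind of EDA query Project 1 describes (ranking skills by how often they appear in postings, alongside average salary), here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for DuckDB/MotherDuck so it runs anywhere, and the table names, column names, and sample rows are all hypothetical; the real course dataset may be shaped differently.

```python
import sqlite3

# Hypothetical miniature of a job-postings dataset. The real course tables
# and column names may differ; sqlite3 stands in for DuckDB here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_postings (job_id INTEGER PRIMARY KEY, title TEXT, salary_year_avg REAL);
CREATE TABLE skills (skill_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job_skills (job_id INTEGER, skill_id INTEGER);

INSERT INTO job_postings VALUES
  (1, 'Data Engineer', 120000),
  (2, 'Data Engineer', 140000),
  (3, 'Data Analyst', 90000),
  (4, 'Data Engineer', NULL);
INSERT INTO skills VALUES (1, 'sql'), (2, 'python'), (3, 'spark');
INSERT INTO job_skills VALUES (1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (4, 2);
""")

# Count postings per skill and average the salary, most in-demand first.
# AVG() skips NULL salaries, so posting 4 counts toward demand but not pay.
in_demand = conn.execute("""
SELECT s.name AS skill,
       COUNT(*) AS posting_count,
       AVG(j.salary_year_avg) AS avg_salary
FROM job_skills AS js
INNER JOIN skills AS s ON s.skill_id = js.skill_id
INNER JOIN job_postings AS j ON j.job_id = js.job_id
GROUP BY s.name
ORDER BY posting_count DESC, skill ASC
""").fetchall()

for row in in_demand:
    print(row)
```

The same SELECT should carry over to DuckDB with little or no change, since it relies only on standard joins, GROUP BY, and aggregates.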

🏗️ Project 2 — Data Pipeline: Warehouse + Mart

Build a production-style SQL pipeline — modeling a data warehouse and creating analytical marts. You’ll apply data modeling, transformations, and best practices to deliver a second portfolio project.

🔗 Project #2 Repo

👉 https://lukeb.co/sql-de-project2
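
The warehouse-then-mart flow described for Project 2 can be sketched in miniature as follows. This is an assumption-heavy illustration, not the course's actual pipeline: it uses Python's built-in `sqlite3` instead of DuckDB, and `raw_jobs`, `company_dim`, `job_postings_fact`, and `company_mart` are hypothetical names with made-up rows.

```python
import sqlite3

# sqlite3 stands in for DuckDB; every table and column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Raw landing table.
CREATE TABLE raw_jobs (job_id INTEGER, company TEXT, title TEXT, salary_year_avg REAL);
INSERT INTO raw_jobs VALUES
  (1, 'Acme', 'Data Engineer', 120000),
  (2, 'Acme', 'Data Analyst', 90000),
  (3, 'Globex', 'Data Engineer', 140000);

-- Warehouse layer: one dimension and one fact table (a tiny star schema).
CREATE TABLE company_dim (
  company_id INTEGER PRIMARY KEY,
  name TEXT UNIQUE
);
CREATE TABLE job_postings_fact (
  job_id INTEGER PRIMARY KEY,
  company_id INTEGER REFERENCES company_dim(company_id),
  title TEXT,
  salary_year_avg REAL
);

-- Load the dimension first, then the fact rows that point at it.
INSERT INTO company_dim (name)
SELECT DISTINCT company FROM raw_jobs ORDER BY company;

INSERT INTO job_postings_fact
SELECT r.job_id, c.company_id, r.title, r.salary_year_avg
FROM raw_jobs AS r
INNER JOIN company_dim AS c ON c.name = r.company;

-- Mart layer: a pre-aggregated table built with CREATE TABLE AS SELECT.
CREATE TABLE company_mart AS
SELECT c.name AS company,
       COUNT(*) AS posting_count,
       AVG(f.salary_year_avg) AS avg_salary
FROM job_postings_fact AS f
INNER JOIN company_dim AS c ON c.company_id = f.company_id
GROUP BY c.name;
""")

# Simple validation: no rows lost between the raw table and the fact table.
raw_count = conn.execute("SELECT COUNT(*) FROM raw_jobs").fetchone()[0]
fact_count = conn.execute("SELECT COUNT(*) FROM job_postings_fact").fetchone()[0]
mart_rows = conn.execute(
    "SELECT company, posting_count, avg_salary FROM company_mart ORDER BY company"
).fetchall()

print(raw_count, fact_count)
print(mart_rows)
```

Rebuilding the mart with CTAS on every run keeps the sketch simple; an incremental approach such as MERGE would update only changed rows, at the cost of more logic.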

Supporter Resources

📝 Practice Problems

🧩 170+ Interview-Level Problems: Learn SQL faster with carefully designed exercises ranging from easy to challenging

🔍 Detailed Solutions and Results: Every problem comes with a comprehensive solution and the expected query results

📺 Structured Video Lessons

🚢 Navigate with Ease: Jump instantly to any lesson or specific topic within the course – no more wasting time scrubbing through hours of video to find what you need

🧠 Focused Learning: Master concepts more effectively with dedicated, bite-sized videos for each distinct lesson, allowing for better concentration and easier review

🗒️ Lesson Notes & Cheat Sheets

📖 Structured Lesson Notes: Step-by-step walkthroughs for every topic, helping you follow along with each lesson and understand why queries and pipelines are built the way they are

📋 Practical Cheat Sheets: Quick-reference guides for core SQL syntax, transformations, and data engineering concepts you’ll reuse across projects

✨ Certificate of Completion

🎖️ Certificate of Completion: Receive a certificate to validate your new skills and enhance your LinkedIn profile

🧑‍💻 Showcase Experience: Share how you used real-world data to help solve a problem for data professionals

About the Instructors

Luke Barousse - Course Instructor

🌎 Real-world Experience with SQL

Spearheaded innovative projects in collaboration with MrBeast's team, integrating popular tools like SQL & Python.

💡🤖 Sharing Knowledge about Data & AI

Guides a community of 600,000+ data nerds in using analytical tools to improve their professional workflows.

🎓 Trusted Course Developer

Taught 30,000+ learners on DataCamp how to use analytical tools to work more efficiently.

Kelly Adams - Course Producer

🕹️ Hands-on Experience with SQL

Driving strategic decisions within the social gaming industry at Golden Hearts Games, using popular tools like Google BigQuery (SQL), Dataform (dbt) and Looker.

📊 Analytics Engineering & Data Pipelines

Building scalable data pipelines that support product, finance, and growth teams with reliable, decision-ready metrics.

📹 Course Producer for Data Analytics Content

Educating an audience of 600,000+ analysts about the latest data analytics tools to improve their professional skill sets.

Rikki Singh - Content Developer

🧑‍💻 Hands-on SQL & Analytics

Works across gaming, entertainment, and marketing, using Redshift and BigQuery to query and model data and building decision-ready dashboards in Looker and Tableau.

💼 Director-Level Operator

Leads analytics initiatives—bringing a “what matters to the business” lens to every lesson and project.

🎬 Course Producer for Data Analytics Content

Builds high-signal practice problems by benchmarking a wide range of learning platforms and question styles, then translating the best patterns into realistic, interview-ready exercises.

100% Satisfaction Guarantee or Your Money Back

⏱️ If you don’t feel the course problems and notes help you learn this tool as they have for countless others, I’ll refund your money!

📫 Email me within 30 days of purchase explaining why you are unsatisfied, and I’ll refund the full purchase price ASAP.

FAQ

What is “SQL for Data Engineering”?

Who is this course for?

Should I take this course or "SQL for Data Analytics"?

Are the course problems and notes required?

What are the computer requirements?

How long should this course take to complete?
