In-Class Activity - SQL Joins Practice with Bend Blockbuster Database

DATA 351: Data Management with SQL

Prerequisites: Setup

Before we begin, load the database schema and data:

-- Connect to your database
\c blockbuster

From command line (import the schema and data using the files from Canvas for Assignment 3):

psql -U postgres -d blockbuster -f blockbuster-schema.sql
psql -U postgres -d blockbuster -f blockbuster-data.sql

Prerequisites: Verify Setup

Run these queries to confirm your database is ready:

SELECT COUNT(*) FROM film;      -- Expected: 1000
SELECT COUNT(*) FROM customer;  -- Expected: 599
SELECT COUNT(*) FROM rental;    -- Expected: 16044

If you see these counts, you’re ready to go!

Today’s Focus: JOIN Types

┌─────────────────────────────────────────────────────┐
│                    JOIN Types                       │
├─────────────────────────────────────────────────────┤
│  INNER JOIN  │ Only matching rows from both tables  │
│  LEFT JOIN   │ All from left + matches from right   │
│  RIGHT JOIN  │ All from right + matches from left   │
│  FULL JOIN   │ All rows from both tables            │
└─────────────────────────────────────────────────────┘

Today we’ll practice INNER JOIN and LEFT JOIN.

Problem 1: Film Titles with Language

Task: Display film titles with the language name.

Requirements:

  • Select title from film
  • Select name as language_name from language
  • Join using language_id
  • Order by title, limit 10

Problem 1 output screenshot

Problem 1: Strategy

Step 1: Identify the tables and their relationship

┌──────────────────┐         ┌──────────────────┐
│      film        │         │    language      │
├──────────────────┤         ├──────────────────┤
│ film_id (PK)     │         │ language_id (PK) │
│ title            │         │ name             │
│ language_id (FK) │────────▶│                  │
└──────────────────┘         └──────────────────┘

The language_id in film references the language table.

Problem 1: Join Visualization

INNER JOIN keeps only rows where both tables match:

    film                          language
┌────┬───────────┬────┐       ┌────┬─────────┐
│ id │ title     │lang│       │ id │ name    │
├────┼───────────┼────┤       ├────┼─────────┤
│ 1  │ Ace Gold  │ 1  │──┐    │ 1  │ English │
│ 2  │ Academy   │ 1  │──┼───▶│ 2  │ French  │
│ 3  │ Affair    │ 1  │──┘    └────┴─────────┘
└────┴───────────┴────┘

         Result: title + language.name

Problem 1: Solution

SELECT f.title, l.name AS language_name
FROM film f
JOIN language l ON f.language_id = l.language_id
ORDER BY f.title
LIMIT 10;

Key Points:

  • JOIN (same as INNER JOIN)
  • Table aliases: f for film, l for language
  • ON clause specifies the matching condition

Problem 1: Output

Problem 1 output screenshot

Problem 2: Customers and Their Cities

Task: Show customer names with their city.

Requirements:

  • Concatenate first_name and last_name as customer_name
  • Select city from the city table
  • Join chain: customeraddresscity
  • Order by last_name, then first_name, limit 10

Problem 2 output screenshot

Problem 2: Strategy - The Join Chain

Step 1: Map out the path between tables

┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   customer   │      │   address    │      │     city     │
├──────────────┤      ├──────────────┤      ├──────────────┤
│ customer_id  │      │ address_id   │      │ city_id (PK) │
│ first_name   │      │ address      │      │ city         │
│ last_name    │      │ city_id (FK) │─────▶│              │
│ address_id   │─────▶│              │      └──────────────┘
└──────────────┘      └──────────────┘

We need two JOINs to connect customer to city.

Problem 2: Building the Chain

Think of it as stepping stones:

Step 1: Start at customer
        customer ─────┐
                      │ JOIN on address_id
Step 2: Land on address
        address ──────┤
                      │ JOIN on city_id
Step 3: Arrive at city|
        city ─────────┘

Each JOIN connects to the next table in the chain.

Problem 2: Solution

SELECT c.first_name || ' ' || c.last_name AS customer_name,
       ci.city
FROM customer c
JOIN address a ON c.address_id = a.address_id
JOIN city ci ON a.city_id = ci.city_id
ORDER BY c.last_name, c.first_name
LIMIT 10;

Key Points:

  • String concatenation: || operator with ' ' space
  • Each JOIN adds one more table to the result
  • Order by the original columns (not the alias) for consistent sorting
  • Alias ci for city (avoid conflict with c for customer)

Problem 2: Output

Problem 2 output screenshot

Problem 3: Rental Count per Customer

Task: Count how many rentals each customer has made.

Requirements:

  • Customer name as customer_name
  • Count rentals as rental_count
  • Join customer to rental
  • Group by customer, order by count DESC
  • Limit to top 10

Problem 3 output screenshot

Problem 3: Strategy - JOIN + Aggregation

Concept: Combine rows first, then aggregate

    customer              rental
┌────┬──────────┐    ┌────────┬──────────┐
│ id │ name     │    │rent_id │ cust_id  │
├────┼──────────┤    ├────────┼──────────┤
│ 1  │ Mary     │◀───│ 101    │ 1        │
│    │          │◀───│ 102    │ 1        │
│    │          │◀───│ 103    │ 1        │
│ 2  │ John     │◀───│ 104    │ 2        │
└────┴──────────┘    └────────┴──────────┘

After JOIN: Mary appears 3 times, John appears 1 time
After GROUP BY + COUNT: Mary=3, John=1

Problem 3: The GROUP BY Rule

Important: When using aggregates, all non-aggregated columns must be in GROUP BY.

SELECT name, COUNT(*)     ← name is not aggregated
FROM ...
GROUP BY name             ← so name must be here

For customer_id + first_name + last_name:

GROUP BY c.customer_id, c.first_name, c.last_name

Problem 3: Solution

SELECT c.first_name || ' ' || c.last_name AS customer_name,
       COUNT(r.rental_id) AS rental_count
FROM customer c
JOIN rental r ON c.customer_id = r.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY rental_count DESC
LIMIT 10;

Key Points:

  • COUNT(r.rental_id) counts matching rentals
  • ORDER BY rental_count DESC puts highest first
  • Group by ID ensures unique customers

Problem 3: Output

Problem 3 output screenshot

Problem 4: Films Never Rented (LEFT JOIN)

Task: Find films that have never been rented.

Requirements:

  • Select film_id and title
  • Use LEFT JOIN: filminventoryrental
  • Filter where rental_id IS NULL
  • Order by title

Problem 4 output screenshot

Problem 4: Strategy - LEFT JOIN Concept

LEFT JOIN keeps all rows from the left table:

      film                    inventory
┌────┬───────────┐       ┌─────┬────────┐
│ id │ title     │       │inv_id│film_id│
├────┼───────────┤       ├─────┼────────┤
│ 1  │ Ace Gold  │◀──────│ 101 │ 1      │
│ 2  │ Academy   │◀──────│ 102 │ 2      │
│ 3  │ Affair    │       └─────┴────────┘
└────┴───────────┘       

LEFT JOIN Result:
┌────┬───────────┬─────────┐
│ id │ title     │ inv_id  │
├────┼───────────┼─────────┤
│ 1  │ Ace Gold  │ 101     │  ← has inventory
│ 2  │ Academy   │ 102     │  ← has inventory
│ 3  │ Affair    │ NULL    │  ← NO inventory!
└────┴───────────┴─────────┘

Problem 4: Finding the NULLs

The “anti-join” pattern:

LEFT JOIN + WHERE right_side IS NULL
= rows with NO match
┌────────────────────────────────────────────┐
│  LEFT JOIN keeps unmatched rows            │
│  WHERE ... IS NULL filters TO them         │
│                                            │
│  Result: Only rows that DON'T have         │
│          a matching record                 │
└────────────────────────────────────────────┘

This is how we find “films without rentals.”

Problem 4: Full Join Path

film ──LEFT JOIN──▶ inventory ──LEFT JOIN──▶ rental
                                              │
                                              ▼
                                    WHERE rental_id IS NULL

A film might have inventory but never been rented, so we need to check all the way to rental.

Problem 4: Solution

SELECT f.film_id, f.title
FROM film f
LEFT JOIN inventory i ON f.film_id = i.film_id
LEFT JOIN rental r ON i.inventory_id = r.inventory_id
WHERE r.rental_id IS NULL
ORDER BY f.title;

Key Points:

  • LEFT JOIN preserves all films
  • Films with no inventory OR no rentals will have NULL
  • IS NULL filter finds the unmatched rows

Problem 4: Output

Problem 4 output screenshot

Problem 5: Revenue by Category

Task: Calculate total revenue for each film category.

Requirements:

  • Category name
  • SUM(amount) as total_revenue
  • COUNT(rental_id) as rental_count
  • 6-table join chain
  • Group by category, order by revenue DESC

Problem 5 output screenshot

Problem 5: Strategy - The Long Chain

Map the complete path from payment to category:

payment ──▶ rental ──▶ inventory ──▶ film ──▶ film_category ──▶ category
   │          │           │          │            │               │
   └──rental_id           │          │            │               │
              └──inventory_id        │            │               │
                          └──film_id─┴──film_id───┘               │
                                                  └──category_id──┘

Each arrow = one JOIN operation.

Problem 5: Visual Table Map

┌─────────┐   ┌─────────┐   ┌────────────┐   ┌───────┐
│ payment │──▶│ rental  │──▶│ inventory  │──▶│ film  │
│         │   │         │   │            │   │       │
│ amount  │   │rental_id│   │inventory_id|   │film_id|
│rental_id│   │         │   │film_id     │   │       │
└─────────┘   └─────────┘   └────────────┘   └──┬────┘
                                                │
                            ┌───────────────────┘
                            ▼
                     ┌──────────────┐   ┌──────────┐
                     │film_category │──▶│ category │
                     │              │   │          │
                     │film_id       │   │ name     │
                     │category_id   │   │          │
                     └──────────────┘   └──────────┘

Problem 5: Building It Step by Step

-- Start with payment (has the money!)
FROM payment p

-- Connect to rental to know what was rented
JOIN rental r ON p.rental_id = r.rental_id

-- Connect to inventory to know which copy
JOIN inventory i ON r.inventory_id = i.inventory_id

-- Connect to film to know which movie
JOIN film f ON i.film_id = f.film_id

-- Connect to junction table for categories
JOIN film_category fc ON f.film_id = fc.film_id

-- Finally get the category name
JOIN category cat ON fc.category_id = cat.category_id

Problem 5: Solution

SELECT cat.name,
       SUM(p.amount) AS total_revenue,
       COUNT(r.rental_id) AS rental_count
FROM payment p
JOIN rental r ON p.rental_id = r.rental_id
JOIN inventory i ON r.inventory_id = i.inventory_id
JOIN film f ON i.film_id = f.film_id
JOIN film_category fc ON f.film_id = fc.film_id
JOIN category cat ON fc.category_id = cat.category_id
GROUP BY cat.name
ORDER BY total_revenue DESC;

Problem 5: Key Insights

Why start from payment?

  • Payment has the amount we need to SUM
  • Starting here ensures we only count paid rentals

Junction table pattern:

  • film_category connects films to categories (many-to-many)
  • Has film_id AND category_id as composite key

Problem 5: Output

Problem 5 output screenshot

Summary: JOIN Patterns

Pattern Use Case
Simple JOIN Connect 2 related tables
Join Chain Navigate through multiple tables
JOIN + GROUP BY Aggregate across relationships
LEFT JOIN + IS NULL Find unmatched records

Tips for Complex Joins

  1. Draw the schema - visualize table relationships
  2. Start from the data you need - work backwards
  3. One JOIN at a time - build incrementally
  4. Test each step - run partial queries
  5. Use aliases - keep your SQL readable

Questions?

Take time to practice these patterns.

Join mastery = SQL mastery!