DATA 503: FUNDAMENTALS OF DATA ENGINEERING
  • Lectures

On this page

  • From Design to Implementation
    • The Data Engineering Pipeline
      • What Comes Next
    • Naming Conventions
      • Why Naming Matters
      • PostgreSQL Naming Rules
      • The Case Sensitivity Trap
      • Best Practices for Naming
      • Naming: Tables vs Columns
    • Primary Keys
      • Recap: What Is a Primary Key?
      • Two Approaches to Primary Keys
      • Natural Keys
      • Natural Keys: Testing Uniqueness
      • Composite Natural Keys
      • Composite Keys: Testing Uniqueness
      • Surrogate Keys
      • Surrogate Keys: Auto-Increment in Action
      • Natural vs Surrogate: When to Use Which
      • Two Syntax Styles for PRIMARY KEY
    • Foreign Keys
      • Connecting Tables with Foreign Keys
      • Creating Foreign Key Relationships
      • Foreign Keys: Enforcing Referential Integrity
      • What Happens When You Delete a Parent Row?
      • ON DELETE Options
      • Foreign Key Design Patterns
    • CHECK Constraints
      • Validating Data with CHECK
      • CHECK: Practical Examples
      • When to Use CHECK Constraints
    • UNIQUE Constraints
      • Enforcing Uniqueness Beyond the Primary Key
      • UNIQUE: Testing the Constraint
      • UNIQUE vs PRIMARY KEY
    • NOT NULL Constraints
      • Requiring Values with NOT NULL
      • When to Use NOT NULL
    • Modifying Tables with ALTER TABLE
      • Adding and Removing Constraints
      • ALTER TABLE: NOT NULL
      • Common ALTER TABLE Operations
      • ALTER TABLE in Practice
    • Speeding Things Up: Indexes
      • What Is an Index?
      • Without an Index: Sequential Scan
      • Creating an Index
      • With an Index: Index Scan
      • When to Create Indexes
      • EXPLAIN ANALYZE: Your Performance Detective
      • Managing Indexes
    • Activity: Build a Veterinary Clinic Database
      • The Scenario
      • Part 1: Write the CREATE TABLE Statements (10 min)
      • Part 2: Compare and Discuss (5 min)
      • Part 3: One Possible Solution
      • Part 4: Extension Questions
    • Putting It All Together
      • The Complete Table Design Workflow
      • Example: Building a Music Database
      • Constraints Summary
      • What Is Next?
    • References
      • Sources

Other Formats

  • RevealJS
  • PDF

Lecture 05-1: Table Design and Constraints

DATA 503: Fundamentals of Data Engineering

Author
Affiliation

Lucas P. Cordova, Ph.D.

Willamette University

Published

February 9, 2026

Abstract

This lecture covers the practical implementation of database table design in PostgreSQL. After learning to normalize data and design relational schemas, we now focus on building those tables with proper constraints. Topics include naming conventions, primary keys (natural vs surrogate), foreign keys and referential integrity, CHECK constraints, UNIQUE constraints, NOT NULL constraints, modifying tables with ALTER TABLE, and speeding up queries with indexes.

From Design to Implementation

You have learned to import raw data into a single table and design a normalized schema. Now we build the tables that bring that design to life. Think of it as the difference between an architect’s blueprint and actually pouring the concrete.

The Data Engineering Pipeline

What Comes Next

Today we learn the DDL (Data Definition Language) skills to implement your normalized designs:

After today, the next step is migrating data from your staging table into the new structure using INSERT, UPDATE, and DELETE (Chapter 9).

Naming Conventions

Why Naming Matters

Good naming conventions make your database self-documenting. A well-named schema tells you what it contains without reading a single comment.

Bad naming leads to:

  • Confusion when writing queries
  • Bugs from misremembering column names
  • Onboarding headaches for new team members
  • Passive-aggressive comments in code reviews

PostgreSQL Naming Rules

PostgreSQL has specific rules for identifiers (table and column names):

  • Can contain letters, digits, and underscores
  • Must begin with a letter or underscore
  • Are case-insensitive by default (folded to lowercase)
  • Maximum length of 63 characters
-- These all refer to the SAME table:
CREATE TABLE customers (...);
CREATE TABLE Customers (...);  -- Error: already exists!
CREATE TABLE CUSTOMERS (...);  -- Error: already exists!

PostgreSQL treats your shouting the same as your whispering.

The Case Sensitivity Trap

PostgreSQL folds unquoted identifiers to lowercase. If you use double quotes, the name becomes case-sensitive:

CREATE TABLE "Customers" (...);  -- Creates "Customers" (capital C)
SELECT * FROM customers;          -- Looks for "customers" (lowercase)
SELECT * FROM "Customers";        -- Finds "Customers" (capital C)
Warning

Avoid double-quoted identifiers. They create maintenance headaches because every query must use the exact casing with quotes. You will curse your past self at 2 AM.

Best Practices for Naming

Convention Example Avoid
Use snake_case trees_planted treesPlanted, TreesPlanted
Be descriptive arrival_time arv_tm
Use plurals for tables teachers, vehicles teacher, vehicle
Include context tire_sizes_2017 data_table_1
Prefix dates report_2026_01_15 15_01_2026_report

Naming: Tables vs Columns

  • Tables
  • Columns
  • Junction Tables

Tables represent collections of entities. Use plural nouns:

  • students (not student)
  • departments (not department)
  • order_items (not order_item)

Columns represent attributes. Use singular, descriptive names:

  • first_name (not first_names)
  • hire_date (not dates_hired)
  • salary_usd (not sal)

For many-to-many relationships, combine both table names:

  • student_courses
  • employee_projects
  • author_books

Primary Keys

Recap: What Is a Primary Key?

A primary key uniquely identifies each row in a table. It provides:

  • Uniqueness: No two rows share the same key value
  • Non-nullability: The key value cannot be NULL
  • Identity: A reliable way to reference a specific row

Every table in a well-designed database should have a primary key.

Two Approaches to Primary Keys

Natural Keys

A natural key uses data that already exists and naturally identifies the entity.

CREATE TABLE natural_key_example (
    license_id varchar(10) CONSTRAINT license_key PRIMARY KEY,
    first_name varchar(50),
    last_name varchar(50)
);

Here license_id is a real-world identifier. Each person has exactly one, and it is unique. In theory. In practice, natural keys have a habit of being less unique than you were promised.

Natural Keys: Testing Uniqueness

Let us see what happens when we violate the primary key:

INSERT INTO natural_key_example (license_id, first_name, last_name)
VALUES ('T229901', 'Lynn', 'Malero');

INSERT INTO natural_key_example (license_id, first_name, last_name)
VALUES ('T229901', 'Sam', 'Tracy');

The second INSERT fails:

ERROR: duplicate key value violates unique constraint "license_key"
DETAIL: Key (license_id)=(T229901) already exists.

The database enforces uniqueness automatically. It is polite about it, but firm.

Composite Natural Keys

Sometimes no single column is unique, but a combination is:

CREATE TABLE natural_key_composite_example (
    student_id varchar(10),
    school_day date,
    present boolean,
    CONSTRAINT student_key PRIMARY KEY (student_id, school_day)
);

A student can only have one attendance record per day. Neither student_id nor school_day is unique alone, but together they form a unique identifier.

Composite Keys: Testing Uniqueness

INSERT INTO natural_key_composite_example (student_id, school_day, present)
VALUES(775, '1/22/2017', 'Y');

INSERT INTO natural_key_composite_example (student_id, school_day, present)
VALUES(775, '1/23/2017', 'Y');  -- OK: different day

INSERT INTO natural_key_composite_example (student_id, school_day, present)
VALUES(775, '1/23/2017', 'N');  -- FAILS: same student + day
ERROR: duplicate key value violates unique constraint "student_key"
DETAIL: Key (student_id, school_day)=(775, 2017-01-23) already exists.

Surrogate Keys

A surrogate key is a system-generated value with no real-world meaning:

CREATE TABLE surrogate_key_example (
    order_number bigserial,
    product_name varchar(50),
    order_date date,
    CONSTRAINT order_key PRIMARY KEY (order_number)
);

PostgreSQL’s serial types auto-generate incrementing integers:

Type Range
smallserial 1 to 32,767
serial 1 to 2,147,483,647
bigserial 1 to 9.2 quintillion

Surrogate Keys: Auto-Increment in Action

INSERT INTO surrogate_key_example (product_name, order_date)
VALUES ('Beachball Polish', '2015-03-17'),
       ('Wrinkle De-Atomizer', '2017-05-22'),
       ('Flux Capacitor', '1985-10-26');

SELECT * FROM surrogate_key_example;
 order_number |    product_name     | order_date
--------------+---------------------+------------
            1 | Beachball Polish    | 2015-03-17
            2 | Wrinkle De-Atomizer | 2017-05-22
            3 | Flux Capacitor      | 1985-10-26

Notice we never specified order_number. PostgreSQL generated it automatically. One less thing to argue about in a design meeting.

Natural vs Surrogate: When to Use Which

Factor Natural Key Surrogate Key
Meaning Has real-world meaning Meaningless identifier
Stability Can change (email, name) Never changes
Size Varies (could be long) Fixed, compact
Performance Depends on data type Fast (integer)
Universality Not always available Always available
Tip

Practical guidance: Use surrogate keys (serial/bigserial) as primary keys for most tables. If a natural key exists and is truly stable (ISBN, SSN), consider it. When in doubt, surrogate wins. Nobody has ever been fired for using a serial primary key.

Two Syntax Styles for PRIMARY KEY

You can declare a primary key inline or as a table constraint:

  • Inline (Column Level)
  • Table Level
CREATE TABLE example (
    id serial CONSTRAINT example_key PRIMARY KEY,
    name varchar(50)
);

Best for single-column keys.

CREATE TABLE example (
    id serial,
    name varchar(50),
    CONSTRAINT example_key PRIMARY KEY (id)
);

Required for composite keys. Also works for single-column keys.

Foreign Keys

Connecting Tables with Foreign Keys

A foreign key is a column in one table that references the primary key of another table. It enforces referential integrity: you cannot reference a row that does not exist.

Creating Foreign Key Relationships

CREATE TABLE licenses (
    license_id varchar(10),
    first_name varchar(50),
    last_name varchar(50),
    CONSTRAINT licenses_key PRIMARY KEY (license_id)
);

CREATE TABLE registrations (
    registration_id varchar(10),
    registration_date date,
    license_id varchar(10) REFERENCES licenses (license_id),
    CONSTRAINT registration_key PRIMARY KEY (registration_id, license_id)
);

The REFERENCES keyword creates the foreign key relationship. It is essentially a contract: “I promise this value exists over there, and I would like the database to hold me to it.”

Foreign Keys: Enforcing Referential Integrity

-- This works: license T229901 exists
INSERT INTO licenses (license_id, first_name, last_name)
VALUES ('T229901', 'Lynn', 'Malero');

INSERT INTO registrations (registration_id, registration_date, license_id)
VALUES ('A203391', '3/17/2017', 'T229901');
-- This FAILS: license T000001 does not exist
INSERT INTO registrations (registration_id, registration_date, license_id)
VALUES ('A75772', '3/17/2017', 'T000001');
ERROR: insert or update on table "registrations" violates foreign key
constraint "registrations_license_id_fkey"
DETAIL: Key (license_id)=(T000001) is not present in table "licenses".

What Happens When You Delete a Parent Row?

By default, PostgreSQL prevents deleting a row from the parent table if child rows reference it. This protects data integrity but can be inconvenient.

ON DELETE CASCADE tells PostgreSQL to automatically delete child rows when the parent is deleted:

CREATE TABLE registrations (
    registration_id varchar(10),
    registration_date date,
    license_id varchar(10) REFERENCES licenses (license_id)
        ON DELETE CASCADE,
    CONSTRAINT registration_key PRIMARY KEY (registration_id, license_id)
);

ON DELETE Options

Option Behavior
RESTRICT (default) Prevent deletion if children exist
CASCADE Delete children automatically
SET NULL Set foreign key to NULL in children
SET DEFAULT Set foreign key to default value in children
Warning

Use CASCADE carefully. Deleting one parent row could cascade through multiple tables and remove far more data than intended. CASCADE is the database equivalent of pulling a loose thread on a sweater.

Foreign Key Design Patterns

One-to-Many: A department has many employees. Put department_id in the employees table.

Many-to-Many: Students take many courses; courses have many students. Create student_courses with FKs to both.

One-to-One: A person has one passport. Put person_id in passports with a UNIQUE constraint.

CHECK Constraints

Validating Data with CHECK

A CHECK constraint ensures that column values meet a logical condition. If the condition evaluates to false, the row is rejected.

CREATE TABLE check_constraint_example (
    user_id bigserial,
    user_role varchar(50),
    salary integer,
    CONSTRAINT user_id_key PRIMARY KEY (user_id),
    CONSTRAINT check_role_in_list
        CHECK (user_role IN('Admin', 'Staff')),
    CONSTRAINT check_salary_not_zero
        CHECK (salary > 0)
);

CHECK: Practical Examples

CHECK constraints can enforce a wide variety of business rules:

-- Graduation requirements
CONSTRAINT grad_check
    CHECK (credits >= 120 AND tuition = 'Paid')

-- Pricing logic
CONSTRAINT sale_check
    CHECK (sale_price < retail_price)

-- Date validation
CONSTRAINT date_check
    CHECK (end_date > start_date)

-- Rating boundaries
CONSTRAINT rating_check
    CHECK (rating BETWEEN 1 AND 5)

When to Use CHECK Constraints

Scenario Example
Enumerated values status IN ('active', 'inactive', 'pending')
Numeric ranges age BETWEEN 0 AND 150
Comparison between columns end_date > start_date
Non-negative values quantity >= 0
String patterns email LIKE '%@%.%'
Tip

CHECK constraints catch bad data at the database level, regardless of which application inserts it. This is your last line of defense. Applications come and go, but the database remembers.

UNIQUE Constraints

Enforcing Uniqueness Beyond the Primary Key

A UNIQUE constraint ensures no duplicate values exist in a column (or combination of columns), separate from the primary key.

CREATE TABLE unique_constraint_example (
    contact_id bigserial CONSTRAINT contact_id_key PRIMARY KEY,
    first_name varchar(50),
    last_name varchar(50),
    email varchar(200),
    CONSTRAINT email_unique UNIQUE (email)
);

UNIQUE: Testing the Constraint

INSERT INTO unique_constraint_example (first_name, last_name, email)
VALUES ('Samantha', 'Lee', 'slee@example.org');

INSERT INTO unique_constraint_example (first_name, last_name, email)
VALUES ('Betty', 'Diaz', 'bdiaz@example.org');

-- This fails: duplicate email
INSERT INTO unique_constraint_example (first_name, last_name, email)
VALUES ('Sasha', 'Lee', 'slee@example.org');
ERROR: duplicate key value violates unique constraint "email_unique"
DETAIL: Key (email)=(slee@example.org) already exists.

UNIQUE vs PRIMARY KEY

Feature PRIMARY KEY UNIQUE
Uniqueness Yes Yes
Allows NULL No Yes (one NULL per column)
Per table Only one Multiple allowed
Creates index Yes Yes

A table has one primary key but can have many UNIQUE constraints. Use UNIQUE for candidate keys that are not the primary key.

NOT NULL Constraints

Requiring Values with NOT NULL

NOT NULL prevents a column from containing NULL values. This is essential for columns that must always have data:

CREATE TABLE not_null_example (
    student_id bigserial,
    first_name varchar(50) NOT NULL,
    last_name varchar(50) NOT NULL,
    CONSTRAINT student_id_key PRIMARY KEY (student_id)
);

Any INSERT that omits first_name or last_name (or sets them to NULL) will fail.

When to Use NOT NULL

Apply NOT NULL to columns where missing data would be meaningless or harmful:

Always NOT NULL Often Nullable
Names (first, last) Middle name
Email (if required) Phone number
Created timestamp Updated timestamp
Foreign keys (usually) Optional description
Status fields Notes, comments
Tip

Default to NOT NULL and only allow NULLs when there is a legitimate reason for missing data. This catches bugs early. Future you will appreciate the strictness, even if present you finds it annoying.

Modifying Tables with ALTER TABLE

Adding and Removing Constraints

You do not always get the design right on the first try. Nobody does. If you did, you would be suspicious. ALTER TABLE lets you modify constraints after creation:

-- Remove a constraint
ALTER TABLE not_null_example
    DROP CONSTRAINT student_id_key;

-- Add a constraint back
ALTER TABLE not_null_example
    ADD CONSTRAINT student_id_key PRIMARY KEY (student_id);

ALTER TABLE: NOT NULL

NOT NULL constraints use a different syntax because they are column properties, not named constraints:

-- Remove NOT NULL
ALTER TABLE not_null_example
    ALTER COLUMN first_name DROP NOT NULL;

-- Add NOT NULL back
ALTER TABLE not_null_example
    ALTER COLUMN first_name SET NOT NULL;

Common ALTER TABLE Operations

Operation Syntax
Drop constraint ALTER TABLE t DROP CONSTRAINT c;
Add constraint ALTER TABLE t ADD CONSTRAINT c ...;
Drop NOT NULL ALTER TABLE t ALTER COLUMN col DROP NOT NULL;
Set NOT NULL ALTER TABLE t ALTER COLUMN col SET NOT NULL;
Add column ALTER TABLE t ADD COLUMN col type;
Drop column ALTER TABLE t DROP COLUMN col;
Rename column ALTER TABLE t RENAME COLUMN old TO new;
Rename table ALTER TABLE t RENAME TO new_name;

ALTER TABLE in Practice

A common workflow when evolving a database:

-- Scenario: You realize email should be unique AND required
ALTER TABLE contacts
    ALTER COLUMN email SET NOT NULL;

ALTER TABLE contacts
    ADD CONSTRAINT email_unique UNIQUE (email);

Speeding Things Up: Indexes

What Is an Index?

An index is a data structure that speeds up data retrieval at the cost of additional storage and slower writes.

Think of it like the index at the back of a textbook: instead of reading every page to find “normalization,” you look it up in the index and jump directly to the right page. Databases without indexes are just very patient.

Without an Index: Sequential Scan

PostgreSQL reads every single row in the table:

EXPLAIN ANALYZE SELECT * FROM new_york_addresses
WHERE street = 'BROADWAY';
Seq Scan on new_york_addresses
  (cost=0.00..20730.68 rows=3730 width=46)
  (actual time=0.055..289.426 rows=3336 loops=1)
  Filter: ((street)::text = 'BROADWAY'::text)
  Rows Removed by Filter: 937038
Planning time: 0.617 ms
Execution time: 289.838 ms

289 ms to scan ~940,000 rows. Not terrible for a single query. Now imagine a thousand users running it simultaneously. Suddenly 289 ms feels less charming.

Creating an Index

CREATE INDEX street_idx ON new_york_addresses (street);

This builds a B-tree index (the default) on the street column.

With an Index: Index Scan

EXPLAIN ANALYZE SELECT * FROM new_york_addresses
WHERE street = 'BROADWAY';
Bitmap Heap Scan on new_york_addresses
  (cost=65.80..5962.17 rows=2758 width=46)
  (actual time=1.792..9.816 rows=3336 loops=1)
  -> Bitmap Index Scan on street_idx
     (cost=0.00..65.11 rows=2758 width=0)
     (actual time=1.253..1.253 rows=3336 loops=1)
     Index Cond: ((street)::text = 'BROADWAY'::text)
Planning time: 0.163 ms
Execution time: 5.887 ms

5.9 ms vs 289 ms. That is a 49x speedup from one line of SQL. Few things in life offer that kind of return on investment.

When to Create Indexes

Create Index When Skip Index When
Column used in WHERE clauses Table is small (< 1000 rows)
Column used in JOIN conditions Column has few distinct values
Column used in ORDER BY Table has heavy INSERT/UPDATE load
Foreign key columns You rarely query the column
Tip

PostgreSQL automatically creates indexes on PRIMARY KEY and UNIQUE columns. You only need to manually create indexes on other frequently queried columns.

EXPLAIN ANALYZE: Your Performance Detective

EXPLAIN ANALYZE shows you exactly how PostgreSQL executes a query:

EXPLAIN ANALYZE SELECT * FROM my_table
WHERE some_column = 'value';

Key things to look for:

  • Seq Scan: Reading every row (potentially slow)
  • Index Scan / Bitmap Index Scan: Using an index (fast)
  • Execution time: Total query time in milliseconds
  • Rows Removed by Filter: How many rows were checked but not returned

Managing Indexes

-- Create an index
CREATE INDEX idx_name ON table_name (column_name);

-- Create a multi-column index
CREATE INDEX idx_name ON table_name (col1, col2);

-- Remove an index
DROP INDEX idx_name;

Indexes are not free:

  • They consume disk space
  • They slow down INSERT, UPDATE, and DELETE operations
  • Too many indexes can hurt overall performance

Balance is key. Index the columns you query most. Indexing everything is like highlighting every word in a textbook. At that point, nothing is highlighted.

Activity: Build a Veterinary Clinic Database

The Scenario

A small veterinary clinic needs a database to track their patients, owners, and visits. They currently have everything in a single spreadsheet:

owner_name owner_email owner_phone pet_name species breed visit_date reason cost
Maria Lopez maria@email.com 503-555-0101 Luna Dog Labrador 2026-01-15 Checkup 75.00
Maria Lopez maria@email.com 503-555-0101 Luna Dog Labrador 2026-02-01 Vaccination 120.00
Maria Lopez maria@email.com 503-555-0101 Whiskers Cat Siamese 2026-01-20 Dental 250.00
James Park james@email.com 541-555-0202 Buddy Dog Golden 2026-01-18 Surgery 800.00

A previous data engineer already normalized this into three entities:

owner(owner_id, owner_name, email, phone)

pet(pet_id, pet_name, species, breed, owner_id)
    FK: owner_id -> owner(owner_id)

visit(visit_id, visit_date, reason, cost, pet_id)
    FK: pet_id -> pet(pet_id)

Your job: implement this design in PostgreSQL with proper constraints.

Part 1: Write the CREATE TABLE Statements (10 min)

Working individually, write the SQL to create all three tables. For each table, decide:

  1. Primary key: Natural or surrogate? What type?
  2. Foreign keys: Which columns reference other tables? What ON DELETE behavior?
  3. NOT NULL: Which columns must always have a value?
  4. CHECK: Are there any value constraints? (Think about cost, species, etc.)
  5. UNIQUE: Should any non-key columns be unique?
Tip

Remember to create tables in the right order. You cannot reference a table that does not exist yet.

Part 2: Compare and Discuss (5 min)

With a partner, compare your CREATE TABLE statements. Discuss:

  • Did you make the same choices for primary keys?
  • Where did your NOT NULL decisions differ?
  • Did either of you add CHECK constraints the other missed?
  • What ON DELETE behavior did you choose for the foreign keys, and why?

Part 3: One Possible Solution

  • owners
  • pets
  • visits
  • indexes
CREATE TABLE owners (
    owner_id bigserial,
    owner_name varchar(100) NOT NULL,
    email varchar(200) NOT NULL,
    phone varchar(20),
    CONSTRAINT owner_key PRIMARY KEY (owner_id),
    CONSTRAINT owner_email_unique UNIQUE (email)
);

Key decisions:

  • Surrogate key (bigserial) because names are not unique
  • Email is UNIQUE and NOT NULL (primary contact method)
  • Phone is nullable (not everyone provides one)
CREATE TABLE pets (
    pet_id bigserial,
    pet_name varchar(100) NOT NULL,
    species varchar(50) NOT NULL,
    breed varchar(100),
    owner_id bigint NOT NULL REFERENCES owners (owner_id),
    CONSTRAINT pet_key PRIMARY KEY (pet_id),
    CONSTRAINT check_species CHECK (species IN ('Dog', 'Cat', 'Bird',
        'Rabbit', 'Reptile', 'Other'))
);

Key decisions:

  • owner_id is NOT NULL (every pet must have an owner)
  • CHECK on species limits to known values
  • Breed is nullable (not always known, especially for mixed breeds)
  • No ON DELETE CASCADE: we do not want deleting an owner to silently remove their pets
CREATE TABLE visits (
    visit_id bigserial,
    visit_date date NOT NULL,
    reason text NOT NULL,
    cost numeric(8,2) NOT NULL,
    pet_id bigint NOT NULL REFERENCES pets (pet_id),
    CONSTRAINT visit_key PRIMARY KEY (visit_id),
    CONSTRAINT check_cost_positive CHECK (cost > 0),
    CONSTRAINT check_date_not_future
        CHECK (visit_date <= CURRENT_DATE)
);

Key decisions:

  • Cost is numeric(8,2) for precise currency, with a CHECK ensuring positive values
  • Date cannot be in the future (catches data entry errors)
  • pet_id is NOT NULL (every visit is for a specific pet)
-- Speed up lookups by owner
CREATE INDEX idx_pets_owner ON pets (owner_id);

-- Speed up lookups by pet (for visit history)
CREATE INDEX idx_visits_pet ON visits (pet_id);

-- Speed up searches by visit date
CREATE INDEX idx_visits_date ON visits (visit_date);

Foreign key columns and frequently filtered columns are good index candidates.

Part 4: Extension Questions

If time allows, discuss with your partner:

  1. How would you handle a pet that changes owners (adopted/transferred)?
  2. What if a visit involves multiple pets (e.g., wellness check for all of an owner’s animals)?
  3. Should reason be a free-text field or a lookup table? What are the trade-offs?
  4. How would you add a veterinarian entity to track which vet handled each visit?

Putting It All Together

The Complete Table Design Workflow

Example: Building a Music Database

From a normalized design, here is the implementation:

CREATE TABLE albums (
    album_id bigserial,
    catalog_code varchar(100),
    title text NOT NULL,
    artist text NOT NULL,
    release_date date,
    genre varchar(40),
    description text,
    CONSTRAINT album_key PRIMARY KEY (album_id),
    CONSTRAINT catalog_unique UNIQUE (catalog_code)
);

CREATE TABLE songs (
    song_id bigserial,
    title text NOT NULL,
    artist text NOT NULL,
    album_id bigint REFERENCES albums (album_id),
    CONSTRAINT song_key PRIMARY KEY (song_id)
);

CREATE INDEX idx_songs_album ON songs (album_id);

Constraints Summary

Constraint Purpose Syntax
PRIMARY KEY Unique row identifier CONSTRAINT name PRIMARY KEY (col)
FOREIGN KEY Referential integrity col type REFERENCES table (col)
CHECK Value validation CONSTRAINT name CHECK (expr)
UNIQUE No duplicates CONSTRAINT name UNIQUE (col)
NOT NULL Requires a value col type NOT NULL

What Is Next?

Now that you can build tables with proper constraints, the next step is migrating data from your staging table into the new structure.

Chapter 9 covers:

  • INSERT INTO ... SELECT for data migration
  • UPDATE for modifying existing data
  • DELETE for removing rows
  • Transaction safety with BEGIN / COMMIT / ROLLBACK

References

Sources

  1. DeBarros, A. (2022). Practical SQL: A Beginner’s Guide to Storytelling with Data (2nd ed.). No Starch Press. Chapter 7: Table Design That Works for You.

  2. PostgreSQL Documentation. “CREATE TABLE.” https://www.postgresql.org/docs/current/sql-createtable.html

  3. PostgreSQL Documentation. “Indexes.” https://www.postgresql.org/docs/current/indexes.html

  4. PostgreSQL Documentation. “Constraints.” https://www.postgresql.org/docs/current/ddl-constraints.html