
DATA 351: Data Management with SQL
January 12, 2026
Data science is fundamentally tied to the data that it analyzes.
Before any analysis can be done, we need to know:
All data storage solutions need to be:
This is the work of a data engineer.
There are many different ways information can be stored, with varying trade-offs.
Suppose you wanted to keep track of your friends’ birthdays:

| First Name | Last Name | Birthday |
|---|---|---|
| Eleven | Hopper | 4/2/2004 |
| Dustin | Henderson | 8/23/2003 |
| Max | Mayfield | 12/14/2005 |
You could store the information in a table or comma-separated values (CSV) file:
| First Name | Last Name | Birthday |
|---|---|---|
| Eleven | Hopper | 4/2/2004 |
| Dustin | Henderson | 8/23/2003 |
| Max | Mayfield | 12/14/2005 |
Alternatively, you might use some other form of common data structure like JSON:
Suppose now you would also like to keep track of what courses they are currently taking at Hawkins High, and what times those courses are held.
Suppose each friend is taking 2-3 classes, some of which overlap.
This significantly complicates both storage methods.
We cannot store tables inside of tables, so we usually need to duplicate information over multiple rows:
| First Name | Last Name | Birthday | Class | Day | Time |
|---|---|---|---|---|---|
| Eleven | Hopper | 4/2/2004 | CHEM101 | MWF | 1:00pm |
| Eleven | Hopper | 4/2/2004 | AV101 | MWF | 9:00am |
| Dustin | Henderson | 8/23/2003 | CHEM101 | MWF | 1:00pm |
| Dustin | Henderson | 8/23/2003 | PHYS201 | TTh | 1:00pm |
Duplication is generally bad!
We still have duplication issues with JSON as well:
[
{"First Name": "Eleven", "Last Name": "Hopper",
"Birthday": "4/2/2004",
"Classes": [
{"class": "CHEM101", "day": "MWF", "time":"1:00pm"},
{"class": "AV101", "day": "MWF", "time":"9:00am"}
]},
{"First Name": "Dustin", "Last Name": "Henderson",
"Birthday": "8/23/2003",
"Classes": [
{"class": "CHEM101", "day": "MWF", "time":"1:00pm"},
{"class": "PHYS201", "day": "TTh", "time":"1:00pm"}
]}
]Class information like day and time is duplicated across friends.
One solution is realizing that we are trying to actually keep track of two things: friends and classes.
So we break things up into two tables, and then create relationships between them.
This is the core of what occurs in a relational database.
Friends Table:
| First Name | Last Name | Birthday |
|---|---|---|
| Eleven | Hopper | 4/2/2004 |
| Dustin | Henderson | 8/23/2003 |
| Max | Mayfield | 12/14/2005 |
Classes Table:
| Class | Day | Time |
|---|---|---|
| CHEM101 | MWF | 1:00pm |
| AV101 | MWF | 9:00am |
| PHYS201 | TTh | 1:00pm |
| HIST150 | MWF | 12:00pm |
In general, you would use a third table to represent all the linkages.
Unique ID keys are used to connect the different tables.

In a relational database we can specify what type of information is allowed in each column:
int for integersvarchar for textdate for datestime for timesboolean for true/false valuesnumeric for decimal numbersThis keeps information consistent and predictable.
SQL is a language that allows you to define and query relational databases.
Pronunciation:
SQL is often called “Structured Query Language,” but the name is misleading:
SQL comes in several variants, though the core standards are governed by ANSI and ISO, so none stray too far from the standards.
Common SQL variants include:
Skills transfer easily between variants.
PostgreSQL operates on a server model where clients contact the server and ask it to manipulate or query a particular database.
Several ways you can interact with the server:
We will use Beekeeper Studio or pgAdmin in this course.
SQL has commands to help with administration as well as creating, manipulating, and querying tables.
New installs come with a database called postgres, but it is good practice to create a new one and leave the default untouched.
Creating tables is one of the more fundamental actions you may need to take with a database.
Need to specify:
SQL requires no special formatting for capitalization or tabbing, but conventions help readability:
Tables are initially empty. You add data by inserting new values into the columns.
New rows are concatenated to the end of the table.
Once data is in the table, you can retrieve it using SELECT:
| first_name | last_name | birthday |
|---|---|---|
| Eleven | Hopper | 2004-04-02 |
| Dustin | Henderson | 2003-08-23 |
| Max | Mayfield | 2005-12-14 |
Data Management Fundamentals:
SQL Basics:
Reading: Chapter 1 and Chapter 2 of Practical SQL
Topics: