Lecture 05: Math Operators and Functions

DATA 351: Data Management with SQL

Lucas P. Cordova, Ph.D.

Willamette University

January 26, 2026

Math Operators and Functions

SQL as a Calculator

PostgreSQL can do math! Let’s start simple:

SELECT 2 + 2;

?column?
4

No table needed. SQL evaluates the expression and returns the result.

Basic Math Operators

Operator	Operation	Example	Result
`+`	Addition	`5 + 3`	8
`-`	Subtraction	`10 - 4`	6
`*`	Multiplication	`6 * 7`	42
`/`	Division	`15 / 4`	3
`%`	Modulo (remainder)	`15 % 4`	3

Wait, what? 👀

15 / 4 = 3? d Yes! Integer division truncates. 🤯

Integer Division Trap

SELECT 15 / 4;        -- Returns 3 (integer division)
SELECT 15.0 / 4;      -- Returns 3.75 (decimal division)
SELECT 15 / 4.0;      -- Returns 3.75 (decimal division)
SELECT 15::NUMERIC / 4;  -- Returns 3.75 (cast to numeric)

Note

Rule: If both operands are integers, result is integer.

Make at least one operand a decimal to get decimal results.

Hands-On: Math with Employee Data

Calculate annual bonus (10% of salary):

SELECT 
    full_name,
    salary_usd,
    salary_usd * 0.10 AS annual_bonus
FROM employees;

full_name	salary_usd	annual_bonus
Michael Scott	75000.00	7500.0000
Dwight Schrute	62000.00	6200.0000
Pam Beesly	42000.00	4200.0000

Calculating Percent Change

The formula: ((new - old) / old) * 100

Example: If salary goes from $50,000 to $55,000:

SELECT ((55000 - 50000) / 50000.0) * 100 AS pct_increase;

pct_increase
10.0

A 10% raise. (Nice!)

Hands-On: Salary Per Year of Experience

How much does each employee earn per year of experience?

SELECT 
    full_name,
    salary_usd,
    years_experience,
    ROUND(salary_usd / years_experience, 2) AS salary_per_year
FROM employees
ORDER BY salary_per_year DESC;

full_name	salary_usd	years_experience	salary_per_year
Pam Beesly	42000.00	8	5250.00
Jim Halpert	61000.00	10	6100.00
Dwight Schrute	62000.00	12	5166.67

The ROUND() Function

ROUND(value, decimal_places) rounds to specified precision:

SELECT ROUND(3.14159, 2);   -- Returns 3.14
SELECT ROUND(3.14159, 0);   -- Returns 3
SELECT ROUND(3.5, 0);       -- Returns 4 (rounds up)
SELECT ROUND(2.5, 0);       -- Returns 3 (banker's rounding)

Note

Always ROUND money calculations for clean output!

The ABS() Function

ABS() returns the absolute value (distance from zero):

SELECT ABS(-42);    -- Returns 42
SELECT ABS(42);     -- Returns 42
SELECT ABS(0);      -- Returns 0

Using ABS()

Use case: Finding differences regardless of direction:

SELECT 
    full_name,
    salary_usd,
    ABS(salary_usd - 55000) AS distance_from_average
FROM employees;

Hands-On: Distance from Average Salary

Let’s find how far each employee’s salary is from $55,000:

SELECT 
    full_name,
    salary_usd,
    salary_usd - 55000 AS difference,
    ABS(salary_usd - 55000) AS absolute_difference
FROM employees
ORDER BY absolute_difference DESC;

full_name	salary_usd	difference	absolute_difference
Michael Scott	75000.00	20000.00	20000.00
Pam Beesly	42000.00	-13000.00	13000.00
Dwight Schrute	62000.00	7000.00	7000.00

Exponents and Roots

Function	Description	Example	Result
`^`	Exponentiation	`2 ^ 3`	8
`\\|/`	Square root	`\\|/ 16`	4
`\\|\\|/`	Cube root	`\\|\\|/ 27`	3
`SQRT()`	Square root (function)	`SQRT(16)`	4

SELECT 
    |/ 25 AS square_root,
    SQRT(25) AS also_square_root,
    2 ^ 10 AS two_to_the_tenth;

Exercise: Calculate Commission

Sales employees have a commission rate. Calculate their potential commission on a $10,000 sale:

SELECT 
    full_name,
    commission_rate,
    -- Your calculation here: commission on $10,000 sale
FROM employees
WHERE commission_rate IS NOT NULL;

Hint: Commission = sale_amount * commission_rate

Take 2 minutes.

Exercise Solution: Commission Calculation

SELECT 
    full_name,
    commission_rate,
    10000 * commission_rate AS commission_on_10k
FROM employees
WHERE commission_rate IS NOT NULL;

full_name	commission_rate	commission_on_10k
Dwight Schrute	0.08	800.00
Jim Halpert	0.07	700.00
Stanley Hudson	0.05	500.00

Dwight’s higher rate reflects his #1 salesman status. Obviously.

Aggregate Functions

What Are Aggregate Functions?

Aggregate functions compute a single result from multiple rows.

The Big Five Aggregates

Function	Description	Example
`SUM()`	Total of all values	Total payroll
`AVG()`	Average (mean)	Average salary
`COUNT()`	Number of rows	How many employees
`MIN()`	Smallest value	Lowest salary
`MAX()`	Largest value	Highest salary

`SUM()`: Total Values

What is our total payroll?

SELECT SUM(salary_usd) AS total_payroll
FROM employees;

total_payroll
452000.00

We spend $452,000 on salaries. (Michael probably thinks he deserves half.)

`AVG()`: Calculate the Mean

What is the average salary?

SELECT AVG(salary_usd) AS avg_salary
FROM employees;

avg_salary
56500.000000

That is a lot of decimal places. Let’s fix that:

SELECT ROUND(AVG(salary_usd), 2) AS avg_salary
FROM employees;

avg_salary
56500.00

COUNT(): How Many?

Three ways to count:

-- Count all rows
SELECT COUNT(*) FROM employees;

-- Count non-NULL values in a column
SELECT COUNT(commission_rate) FROM employees;

-- Count distinct values
SELECT COUNT(DISTINCT department) FROM employees;

count(*)	count(commission)	count(distinct dept)
8	3	4

Only 3 employees have commission rates!

MIN() and MAX(): The Extremes

Salary range:

SELECT 
    MIN(salary_usd) AS lowest_salary,
    MAX(salary_usd) AS highest_salary,
    MAX(salary_usd) - MIN(salary_usd) AS salary_range
FROM employees;

lowest_salary	highest_salary	salary_range
42000.00	75000.00	33000.00

Pam makes the least, Michael makes the most. Shocking.

Combining Multiple Aggregates

You can calculate several aggregates in one query:

SELECT 
    COUNT(*) AS num_employees,
    SUM(salary_usd) AS total_payroll,
    ROUND(AVG(salary_usd), 2) AS avg_salary,
    MIN(salary_usd) AS min_salary,
    MAX(salary_usd) AS max_salary
FROM employees;

num_employees	total_payroll	avg_salary	min_salary	max_salary
8	452000.00	56500.00	42000.00	75000.00

GROUP BY: Aggregates by Category

What if we want average salary BY DEPARTMENT?

SELECT 
    department,
    COUNT(*) AS num_employees,
    ROUND(AVG(salary_usd), 2) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

department	num_employees	avg_salary
Management	1	75000.00
Sales	3	60333.33
Accounting	3	51333.33
Reception	1	42000.00

How GROUP BY Works

GROUP BY splits data into buckets, then aggregates each bucket separately.

HAVING: Filter After Grouping

Show only departments with more than 1 employee:

SELECT 
    department,
    COUNT(*) AS num_employees,
    ROUND(AVG(salary_usd), 2) AS avg_salary
FROM employees
GROUP BY department
HAVING COUNT(*) > 1
ORDER BY avg_salary DESC;

department	num_employees	avg_salary
Sales	3	60333.33
Accounting	3	51333.33

WHERE filters rows BEFORE grouping. HAVING filters AFTER grouping.

Exercise: Department Analysis

Write a query that shows for each department:

Department name
Number of employees
Total salary expense
Average performance rating (rounded to 1 decimal)

Only include departments where average performance rating > 3.5

Take 4 minutes.

Exercise Solution

SELECT 
    department,
    COUNT(*) AS num_employees,
    SUM(salary_usd) AS total_salary,
    ROUND(AVG(performance_rating), 1) AS avg_rating
FROM employees
GROUP BY department
HAVING AVG(performance_rating) > 3.5
ORDER BY avg_rating DESC;

department	num_employees	total_salary	avg_rating
Accounting	3	154000.00	3.8
Sales	3	181000.00	3.6

Management and Reception did not make the cut!

10 Minute Break

When we return: Statistical functions and percentiles!

Up next: Finding the median (and why it matters more than average).

Part 3: Statistics and Percentiles

The Problem with Averages

Pop quiz: A company has 5 employees with these salaries:

$40,000, $42,000, $45,000, $48,000, $500,000

What is the average salary?

SELECT AVG(salary) FROM company;
-- Returns: $135,000

Is $135,000 a good representation of “typical” salary? No!

The CEO’s salary skews the average dramatically.

Median: The Middle Value

The median is the middle value when data is sorted.

$40,000, $42,000, $45,000, $48,000, $500,000

The median is $45,000, which better represents “typical.”

Tip

When to use which:

Mean (average): Data is normally distributed, no outliers
Median: Data is skewed or has outliers

Finding Median with percentile_cont()

PostgreSQL does not have a built-in MEDIAN() function, but we can use:

SELECT 
    percentile_cont(0.5) WITHIN GROUP (ORDER BY salary_usd) AS median_salary
FROM employees;

median_salary
55000

The 0.5 means “50th percentile” which is the median.

The WITHIN GROUP Syntax

percentile_cont(0.5) WITHIN GROUP (ORDER BY salary_usd)

Let’s break this down:

Part	Meaning
`percentile_cont(0.5)`	Find the 50th percentile (median)
`WITHIN GROUP`	Required syntax for ordered-set aggregates
`ORDER BY salary_usd`	Which column to calculate percentile on

Hands-On: Median vs Average

Compare median and average for our employee salaries:

SELECT 
    ROUND(AVG(salary_usd), 2) AS mean_salary,
    percentile_cont(0.5) WITHIN GROUP (ORDER BY salary_usd) AS median_salary
FROM employees;

mean_salary	median_salary
56500.00	55000

Pretty close! Our data does not have extreme outliers.

What is a Percentile?

A percentile tells you what percentage of values fall below a point.

25th percentile: 25% of values are below this
50th percentile: 50% are below (the median)
75th percentile: 75% are below this
90th percentile: 90% are below this

If you score in the 90th percentile on a test, you beat 90% of test-takers.

Calculating Quartiles

Quartiles divide data into four equal parts:

SELECT 
    percentile_cont(0.25) WITHIN GROUP (ORDER BY salary_usd) AS q1,
    percentile_cont(0.50) WITHIN GROUP (ORDER BY salary_usd) AS q2_median,
    percentile_cont(0.75) WITHIN GROUP (ORDER BY salary_usd) AS q3
FROM employees;

q1	q2_median	q3
49000	55000	61250

Using Arrays for Multiple Percentiles

Instead of separate function calls, use an array:

SELECT 
    percentile_cont(ARRAY[0.25, 0.5, 0.75]) 
        WITHIN GROUP (ORDER BY salary_usd) AS quartiles
FROM employees;

quartiles
{49000,55000,61250}

The result is an array. Curly braces indicate array values.

percentile_cont vs percentile_disc

Two versions exist:

Function	Behavior	Best For
`percentile_cont`	Interpolates between values	Continuous data
`percentile_disc`	Returns actual value from data	Discrete data

SELECT 
    percentile_cont(0.5) WITHIN GROUP (ORDER BY salary_usd) AS cont,
    percentile_disc(0.5) WITHIN GROUP (ORDER BY salary_usd) AS disc
FROM employees;

Tip

For salaries, percentile_cont is usually more appropriate.

Exercise: Salary Percentile Analysis

Write a query that shows:

The 10th percentile salary (low end)
The median salary
The 90th percentile salary (high end)

Use a single percentile_cont call with an array.

Take 2 minutes.

Exercise Solution

SELECT 
    percentile_cont(ARRAY[0.10, 0.50, 0.90]) 
        WITHIN GROUP (ORDER BY salary_usd) AS salary_percentiles
FROM employees;

salary_percentiles
{43400,55000,69550}

Interpretation:

10% of employees make less than $43,400
50% make less than $55,000 (median)
90% make less than $69,550

Date Arithmetic

Working with Dates in PostgreSQL

PostgreSQL makes date math intuitive:

SELECT '2026-01-15'::DATE - '2025-01-15'::DATE AS days_between;

days_between
365

Subtracting dates gives you the number of days between them.

Hands-On: Calculate Tenure

How long has each employee been with the company?

SELECT 
    full_name,
    hire_date,
    CURRENT_DATE AS today,
    CURRENT_DATE - hire_date AS days_employed
FROM employees
ORDER BY days_employed DESC;

full_name	hire_date	today	days_employed
Stanley Hudson	2004-11-20	2026-01-19	7730
Michael Scott	2005-03-24	2026-01-19	7606
Jim Halpert	2005-10-05	2026-01-19	7411

EXTRACT(): Pull Parts from Dates

EXTRACT(part FROM date) gets specific components:

SELECT 
    full_name,
    hire_date,
    EXTRACT(YEAR FROM hire_date) AS hire_year,
    EXTRACT(MONTH FROM hire_date) AS hire_month,
    EXTRACT(DOW FROM hire_date) AS day_of_week
FROM employees;

full_name	hire_date	hire_year	hire_month	day_of_week
Michael Scott	2005-03-24	2005	3	4

DOW = Day of Week (0=Sunday, 1=Monday, etc.)

DATE_PART(): Alternative Syntax

DATE_PART('part', date) does the same thing:

SELECT 
    full_name,
    DATE_PART('year', hire_date) AS hire_year,
    DATE_PART('quarter', hire_date) AS hire_quarter
FROM employees;

full_name	hire_year	hire_quarter
Michael Scott	2005	1
Angela Martin	2006	3

Use whichever syntax you prefer. They are equivalent.

Grouping by Date Parts

How many employees were hired each year?

SELECT 
    EXTRACT(YEAR FROM hire_date) AS hire_year,
    COUNT(*) AS num_hired
FROM employees
GROUP BY EXTRACT(YEAR FROM hire_date)
ORDER BY hire_year;

hire_year	num_hired
2004	1
2005	3
2006	2
2007	2

Interval Arithmetic

You can add intervals to dates:

SELECT 
    hire_date,
    hire_date + INTERVAL '1 year' AS one_year_later,
    hire_date + INTERVAL '90 days' AS ninety_days_later
FROM employees
WHERE full_name = 'Jim Halpert';

hire_date	one_year_later	ninety_days_later
2005-10-05	2006-10-05	2006-01-03

Exercise Solution

SELECT 
    full_name,
    last_login,
    CURRENT_DATE - last_login::DATE AS days_since_login
FROM employees
WHERE last_login IS NOT NULL
ORDER BY last_login DESC;

full_name	last_login	days_since_login
Dwight Schrute	2026-01-16 08:01:00	3
Jim Halpert	2026-01-16 08:03:00	3
Kevin Malone	2026-01-16 09:30:00	3

Note: I cast last_login to DATE to get whole days.

Homework 2 Preview

Assignment Overview

Homework 2 covers everything we learned today:

Q1: CREATE TABLE with correct data types
Q2: Import data with \COPY
Q3-Q5: Math operators and aggregates
Q6-Q7: Percentile functions
Q8-Q10: Date arithmetic and grouping

You will use the stackoverflow database with three tables:

currency_rates (CSV). Create table and import a CSV.
country_stats (SQL to execute)
response_timeline (SQL to execute)

Q1 and Q2 Tips: Creating and Importing

You will define a currency_rates table based on sample CSV data.

Sample data you will see:

rate_date,currency_code,currency_name,exchange_rate,is_major_currency
2025-01-01,USD,US Dollar,1.000000,true
2025-01-01,EUR,Euro,0.8523,true

Choose types carefully:

rate_date needs a DATE type
currency_code is always 3 characters
exchange_rate needs decimal precision

Q3 and Q4 Tips: Math Functions

Q3 asks for ABS():

Remember that ABS gives the distance from zero:

ABS(exchange_rate - 1.0)  -- Distance from USD rate

Q4 asks for ROUND() and percentages:

ROUND((part / whole) * 100, 2)  -- Two decimal places

Watch for integer division! Cast if needed.

Q5 Tips: GROUP BY with HAVING

You will group by currency and filter with HAVING.

Common mistake: Using WHERE instead of HAVING for aggregate conditions.

-- WRONG: WHERE cannot filter on aggregates
SELECT currency_code, COUNT(*)
FROM currency_rates
WHERE COUNT(*) = 12  -- ERROR!
GROUP BY currency_code;

-- RIGHT: Use HAVING for aggregate conditions
SELECT currency_code, COUNT(*)
FROM currency_rates
GROUP BY currency_code
HAVING COUNT(*) = 12;  -- Correct!

Q6 and Q7 Tips: Percentiles

Q6: Finding the median

percentile_cont(0.5) WITHIN GROUP (ORDER BY column_name)

Do not forget WITHIN GROUP! It is required.

Q7: Using arrays for quartiles

percentile_cont(ARRAY[0.25, 0.5, 0.75]) WITHIN GROUP (ORDER BY column_name)

The ARRAY keyword is essential.

Q8 and Q9 Tips: Date Arithmetic

Q8: Subtracting dates gives days:

response_date - survey_start_date AS days_since_start

Q9: EXTRACT for grouping by time parts:

EXTRACT(MONTH FROM rate_date) AS month_num

Remember to GROUP BY the same expression you SELECT.

Q10 Tips: Self-Join

The hardest question! You need to compare January and December rates.

Strategy: Join the table to itself:

FROM currency_rates jan
JOIN currency_rates dec 
    ON jan.currency_code = dec.currency_code
WHERE jan.rate_date = '2025-01-01'
  AND dec.rate_date = '2025-12-01'

Use table aliases (jan, dec) to distinguish the two instances.

Common Mistakes to Avoid

Forgetting to alias calculated columns

-- Bad: No alias
SELECT salary * 0.10 FROM employees;

-- Good: Clear alias
SELECT salary * 0.10 AS bonus FROM employees;

Integer division giving wrong results

-- Returns 0 (integer division)
SELECT 3 / 4;

-- Returns 0.75
SELECT 3.0 / 4;

Using WHERE instead of HAVING with aggregates

Testing Your Queries Locally

Before submitting:

Run your query in Beekeeper Studio or psql
Verify the output looks reasonable
Check that column aliases match what the question asks
Make sure ORDER BY direction is correct (ASC vs DESC)

Pro tip: Start simple, then add complexity.

Questions?

Topics we covered today:

Data types (INTEGER, NUMERIC, TEXT, DATE, BOOLEAN)
Importing data with \COPY
Math operators (+, -, *, /, %, ABS, ROUND)
Aggregates (SUM, AVG, COUNT, MIN, MAX)
Percentiles (percentile_cont with WITHIN GROUP)
Date arithmetic (subtraction, EXTRACT)

What questions do you have?

Lecture 05: Math Operators and Functions

Math Operators and Functions

SQL as a Calculator

Basic Math Operators

Integer Division Trap

Hands-On: Math with Employee Data

Calculating Percent Change

Hands-On: Salary Per Year of Experience

The ROUND() Function

The ABS() Function

Using ABS()

Hands-On: Distance from Average Salary

Exponents and Roots

Exercise: Calculate Commission

Exercise Solution: Commission Calculation

Aggregate Functions

What Are Aggregate Functions?

The Big Five Aggregates

SUM(): Total Values

AVG(): Calculate the Mean

COUNT(): How Many?

MIN() and MAX(): The Extremes

Combining Multiple Aggregates

GROUP BY: Aggregates by Category

How GROUP BY Works

HAVING: Filter After Grouping

Exercise: Department Analysis

Exercise Solution

10 Minute Break

Part 3: Statistics and Percentiles

The Problem with Averages

Median: The Middle Value

Finding Median with percentile_cont()

The WITHIN GROUP Syntax

Hands-On: Median vs Average

What is a Percentile?

Calculating Quartiles

Using Arrays for Multiple Percentiles

percentile_cont vs percentile_disc

Exercise: Salary Percentile Analysis

Exercise Solution

Date Arithmetic

Working with Dates in PostgreSQL

Hands-On: Calculate Tenure

EXTRACT(): Pull Parts from Dates

DATE_PART(): Alternative Syntax

Grouping by Date Parts

Interval Arithmetic

Exercise: Login Analysis

Exercise Solution

Homework 2 Preview

Assignment Overview

Q1 and Q2 Tips: Creating and Importing

Q3 and Q4 Tips: Math Functions

Q5 Tips: GROUP BY with HAVING

Q6 and Q7 Tips: Percentiles

Q8 and Q9 Tips: Date Arithmetic

Q10 Tips: Self-Join

Common Mistakes to Avoid

Testing Your Queries Locally

Questions?

`SUM()`: Total Values

`AVG()`: Calculate the Mean