The Problem Left Behind by Empty Values
Understanding NULL in Databases
I created a table and filled it with member data. I wanted to find people who hadn't entered a phone number, so I wrote this:
SELECT name
FROM users
WHERE phone = NULL;
The result came back empty. Yet I knew there were rows where the phone number field was blank.
No typo. The table name was right. Still, it didn't work. To understand why this query fails, I need to reconsider what NULL actually is.
NULL is not emptiness
When you first encounter NULL, it feels like a blank space. Nothing there. Something like 0 or an empty string.
But in a database, these three things are completely different.
0 is the number zero. A clearly defined value. '' is an empty string. It has no content, but it is itself a value. NULL is different. It doesn't mean the absence of a value—it represents a state that is unknown or unrecorded.

When someone leaves the phone number field blank on a registration form, some systems save that as an empty string (''), while others save it as NULL. The crucial point is that they don't mean the same thing. A value stored as NULL doesn't quite mean "there is no phone number"—it's closer to "the phone number is still unknown or hasn't been recorded."
This distinction may seem small, but it becomes a problem in far more places when writing queries than you'd expect.
Why you can't compare with an equals sign
This is why the earlier query didn't work.
WHERE phone = NULL
This condition asks, "Is phone equal to NULL?" But NULL is an unknown value. Whether an unknown value equals another unknown value is itself unknowable.
So the result of NULL = NULL is not true. It's NULL.
The database treats this comparison as "unknown." In a WHERE clause, if the result is NULL, that row doesn't appear in the results. All rows without phone numbers get filtered out.
To find NULL, you must use IS NULL instead of an equals sign.
SELECT name
FROM users
WHERE phone IS NULL;
If you want to see only rows with values, use IS NOT NULL.
SELECT name
FROM users
WHERE phone IS NOT NULL;
IS NULL is not an equals sign. It's a separate operation that asks, "Is this field in an unknown state?"
Unknowability spreads
The most important characteristic of NULL to watch for is this: when NULL appears in an operation, the result becomes NULL too.
SELECT 100 + NULL; -- NULL
SELECT NULL * 5; -- NULL
The reason is simple. The result of any calculation involving an unknown value is itself unknown.
Say you have an items table where you multiply price and quantity to find the total.
SELECT item_id,
price * quantity AS total
FROM items;
| item_id | price | quantity | total |
|---|---|---|---|
| 1 | 100 | 2 | 200 |
| 2 | NULL | 3 | NULL |
| 3 | 200 | NULL | NULL |
| 4 | NULL | NULL | NULL |

If either price or quantity is NULL, the total becomes NULL. This isn't a calculation error. It simply means that a result containing an unknown value must itself remain unknown.
When this behavior becomes problematic, you handle it with COALESCE. COALESCE(value, replacement) returns the replacement value when value is NULL.
SELECT item_id,
COALESCE(price, 0) * COALESCE(quantity, 0) AS total
FROM items;
But replacing NULL with 0 across the board isn't always correct. The moment you change "unknown" to 0, the meaning of the data shifts. What value to substitute depends on what the data actually represents.
Aggregate functions handle it differently
Unlike ordinary operations where NULL spreads, aggregate functions like SUM, AVG, MAX, and MIN typically skip NULL values and calculate based only on rows with values.
-- values in scores column: 80, NULL, 90, NULL, 70
SELECT SUM(score), -- 240 (NULL excluded)
AVG(score), -- 80 (240 ÷ 3, NULL excluded)
MAX(score), -- 90
MIN(score) -- 70
FROM scores;
SUM doesn't treat NULL as 0. It simply skips it. AVG doesn't divide by the total row count—it divides only by the count of rows with values. If there are 5 rows and 2 are NULL, it divides by 3.
Among aggregate functions, COUNT works in two different ways.
SELECT COUNT(*), -- 5 (total rows, including NULL)
COUNT(score) -- 3 (only rows with values)
FROM scores;

COUNT(*) counts the rows themselves. It includes rows even if they contain NULL. COUNT(score) counts only rows where the score column has a value. Rows with NULL are excluded.
If you miss this distinction, you easily confuse the total number of members with the number of members who have a phone number. They're different questions.
The problem that empty values leave behind
Handling NULL is not a matter of syntax before it's a matter of how to read data.
When a table contains NULL, it's a record of a moment when that data was unknown. Someone didn't enter a phone number during signup, or a delivery address hadn't been set yet, or a measurement was skipped. NULL preserves that situation exactly.
The fact that WHERE phone = NULL doesn't work, that NULL + 1 becomes NULL, that COUNT(*) and COUNT(col) give different results—all of it stems from the same reason. An unknown value remains unknown.
Whether to fill that unknowability with 0, exclude it, or leave it as is falls to the person writing the query. The database speaks honestly: it doesn't know. How you read that honesty is your responsibility.
