TL;DR: In this post we’ll introduce PostgreSQL’s architecture and walk through how SQL queries are executed under the hood. We’ll explore why some queries are slow, and use tools like EXPLAIN, EXPLAIN ANALYZE, and pg_stat_statements to diagnose performance. In the second half, we get hands-on: spinning up a Dockerized PostgreSQL, loading sample data, and optimizing some slow queries (with a bit of wit and lots of technical insight along the way).
PostgreSQL (aka “Postgres”) is a powerful open-source relational database known for its robustness and extensibility. It has a client-server architecture: your application (the client) connects to the PostgreSQL server which manages your data. Unlike some systems that use threads, Postgres uses a process-per-connection model. This means each new client connection forks a new process on the server.
When Postgres starts up, a postmaster (master process) is launched to handle initialization and then listen for connections. For every client that connects, the postmaster forks a backend process to handle that session’s queries. In addition, Postgres runs a handful of background processes to maintain the database: a write-ahead log (WAL) writer to ensure durability, a background writer and checkpointer to flush cached data to disk, an autovacuum daemon to clean up old rows, etc. All these processes communicate via shared memory for efficiency. The shared memory contains the buffer cache (shared buffers) where frequently used data pages are kept, as well as the WAL buffers for transactional logs. In short, the architecture ensures that Postgres can handle multiple clients at once, each with their own process, while sharing common data and coordinating through background workers.
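If you're curious, you can see these processes from inside the database: the pg_stat_activity view shows one row per client backend and per background worker (the backend_type column is available in PostgreSQL 10 and later):

SELECT pid, backend_type, state, query
FROM pg_stat_activity
ORDER BY backend_type;

You'll typically see one "client backend" row per connection, plus rows for the checkpointer, background writer, WAL writer, and autovacuum launcher.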
Why does this matter for performance? Understanding the architecture helps explain behaviors like: if you have too many connections, the OS has to juggle many Postgres processes (which can exhaust CPU/RAM and lead to context-switching overhead). It also explains why data caching is so important — Postgres will keep frequently accessed rows in shared memory (shared buffers) to avoid slow disk reads. And because each backend is isolated in its own process, heavy queries on one connection won’t directly crash others, but they might contend for shared resources (like buffer memory or CPU).
SQL Queries Under the Hood: SELECT, WHERE, JOIN, ORDER BY
SQL Query Lifecycle: SQL → Parser → Planner (plan tree) → Executor
Writing an SQL query in Postgres is declarative — you describe what data you want, and the database engine figures out how to get it. Under the hood, PostgreSQL parses your SQL, generates a query plan, and executes that plan step by step. To a developer, SELECT * FROM users WHERE id = 123; is one line of code; to Postgres, it’s a series of operations: scan a table (or index), filter rows, and return results. Let’s break down the basic SQL operations and what the database does under the hood:
SELECT-FROM (Table Scan): When you select data, Postgres needs to retrieve rows from the table (the FROM clause). The simplest approach is a sequential scan (“Seq Scan”) which reads all rows of the table one-by-one. This is fine for small tables or when you need most of the data, but it’s slow if you only need a few rows out of a large table. If an appropriate index exists, Postgres can do an index scan to find specific rows quickly (Supabase documentation explains these operations well). Think of a sequential scan like reading an entire phone book to find one name, versus an index scan like flipping directly to the “Smith” section. PostgreSQL supports different index types (B-tree indexes are the common default) which are optimized tree structures that allow logarithmic-time lookups (PostgreSQL documentation on indexes). An index scan uses these structures to jump directly to matching data rather than scanning everything.
Table Scan vs Index Scan: all rows read vs targeted lookup via B-tree
WHERE (Filtering): The WHERE clause is a filter on rows. If there’s an index on the filtered column(s), the planner may use an Index Scan with an index condition to retrieve only matching rows (e.g. using a B-tree index to find value = 100). Without an index, the executor will do a Seq Scan and check each row against the WHERE condition, discarding those that don’t match (this is essentially a filter operation on each tuple). Under the hood, each row is pulled from storage and the condition is evaluated; rows that fail the condition are skipped. This is why missing indexes on frequently searched columns can cause big slowdowns – the database ends up doing a lot of extra work scanning irrelevant rows.
JOIN (Combining Tables): SQL joins combine rows from multiple tables. PostgreSQL’s query planner can choose among several join algorithms: Nested Loop, Hash Join, or Merge Join. A Nested Loop Join is simple: for each row in table A, scan table B for matches (perhaps using an index on B if available). This works fine for small tables or when an index can limit the inner loop, but it can become slow if both tables are large (O(N×M) in the worst case). A Hash Join is often faster for larger data sets: Postgres will read the smaller table, build an in-memory hash table by the join key, then scan the larger table and probe the hash table for matches (PgMustard has a great explanation of join types). This way, each table is scanned just once (ideal for equality joins). A Merge Join is used when both inputs are already sorted on the join key (or can be sorted cheaply). The database can then step through two sorted lists in order, like merging two sorted files, which is very efficient. Each algorithm has its sweet spot: nested loops for small or index-driven joins, hash for large unsorted joins, and merge for pre-sorted data. The planner will estimate costs and pick what it thinks is best – and it usually does a good job, though occasionally it might guess wrong.
Join Algorithms: Nested Loop, Hash Join, Merge Join — visualized with two tables and arrows showing matching process
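EXPLAIN will name which algorithm the planner picked for a given query, and you can experiment by temporarily disabling a strategy for your session with the enable_* planner settings (a diagnostic trick, not a production setting). A sketch, using two hypothetical tables orders and customers joined on customer_id:

EXPLAIN
SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;

-- Discourage hash joins for this session to see what the planner falls back to
SET enable_hashjoin = off;
EXPLAIN
SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
RESET enable_hashjoin;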
ORDER BY (Sorting): After getting the result rows, if an ordering is required, Postgres must ensure the output is sorted. This might involve a sort operation (an in-memory quicksort, or an external merge sort if the data doesn’t fit in memory) on the result set. Sorting is O(N log N) so it adds latency, especially for large N. However, if there is an index on the column(s) used in ORDER BY, the planner can sometimes skip an explicit sort by scanning the index in order. For example, if you do SELECT * FROM items ORDER BY created_at, and created_at has an index, Postgres can perform an Index Scan that returns rows already sorted by created_at – essentially killing two birds with one stone (getting data and sorting it). In execution, an ORDER BY without an index will pull all results then sort them, which can be memory-intensive (and may spill to disk if the data is huge and memory is limited). This is a common culprit for slow queries if not planned carefully.
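To see whether a sort actually happens, look at the plan: an explicit Sort node means Postgres sorted the rows, while an index scan returning rows in order means it didn’t have to. A minimal sketch, assuming the hypothetical items table with a created_at column from the example above (the index name is made up; the LIMIT just keeps the query selective enough that the planner favors the index):

-- Without an index on created_at, expect a Sort node in the plan
EXPLAIN SELECT * FROM items ORDER BY created_at LIMIT 10;

-- With an index, the planner can walk the index in order and skip the sort
CREATE INDEX idx_items_created_at ON items (created_at);
EXPLAIN SELECT * FROM items ORDER BY created_at LIMIT 10;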
Other operations: There’s more under the hood (GROUP BY aggregations, LIMIT/OFFSET pagination, etc.), but they build on similar principles. For example, a GROUP BY may use hashing (Hash Aggregate) or sorting (Sort + Aggregate) depending on context. A LIMIT 10 can allow the planner to do less work (it might use an index to avoid scanning the whole table if only a few rows are needed). Understanding the basic scans, joins, and sorts gives you a foundation for reasoning about these as well.
In summary, a SQL query is translated into a tree of plan nodes: at the bottom are scan nodes pulling data from tables (via sequential or index scans) and above them are join nodes combining data, filter nodes applying WHERE conditions, and possibly sort nodes, etc. The executor pulls rows through this tree. Knowing this, we can already guess why some queries are slower than others: a query that does a full table scan of 10 million rows and sorts them will be orders of magnitude slower than one that uses indexes to fetch 5 rows and does no sort. But there are also more subtle factors, which we’ll cover next.
Why Queries Can Be Slow: Understanding Query Latency
Even with the same data and schema, query performance can differ dramatically based on how the query is written and how Postgres executes it. Here are the key factors that contribute to query latency:
Amount of Data Scanned: The more rows the database has to examine, the longer it takes. This is why indexing is so crucial: a query that scans 100,000 rows (because of no useful index) will usually be slower than a query that uses an index to scan 100 rows. Sequential I/O through a large table is slow (it may involve lots of disk reads unless the data is fully cached in RAM). If your query is filtering or joining but not using indexes effectively, you’ll end up reading a ton of unnecessary rows. For example, a sequential scan reads all rows from a table, so if your table is huge and you only needed a few, that’s wasted work (Citus Data has a good article on this).
Complexity of Joins and Operations: As described, join algorithms have different costs. A poorly planned join (e.g., nested loop over two large tables without indexes) can be extremely slow. Likewise, operations like sorting or aggregation can be costly if they involve many rows. A query that joins multiple tables and sorts the result will inherently do more work than a simple single-table lookup. The number of steps (plan nodes) and the nature of those steps (e.g. hashing, sorting, etc.) add to latency.
Disk I/O vs Memory: Reading from memory is orders of magnitude faster than reading from disk. If your data is already in Postgres’s shared buffers (or the OS cache), a scan can be quite fast. But if the data is not cached and the query triggers a lot of disk reads, you pay a heavy penalty for disk I/O (PgCasts has a great episode on this). This is why the first run of a query might be slower than subsequent runs (warm cache effect). Large result sets also mean more data to transfer over the network to the client, which can be a bottleneck by itself.
Query Execution Plans (Bad Estimates): Postgres uses a cost-based optimizer. It makes estimates of how many rows will result from each step. If the statistics are off (e.g., you haven’t run ANALYZE recently, or data is very skewed), the planner might pick a suboptimal plan. For instance, if the planner thinks a filter will match 100 rows but in reality it matches 100,000, it might choose a nested loop join expecting 100 iterations, but ends up doing 100k – ouch. Mismatches between estimated rows vs actual rows can lead to bad decisions and thus slow execution (AWS has a good article on statistics). Regularly analyzing tables helps the optimizer make informed choices.
Locks and Contention: If your query is waiting for another transaction’s lock (e.g., trying to read a row that’s being modified), it will be blocked until the lock is released. This appears as “latency” from the outside. High write activity or frequent locking (e.g., updates on the same rows) can therefore slow down queries due to waiting, even if the plan itself is fine (PostgreSQL Wiki on lock monitoring).
Connection Overhead: Each new client connection in Postgres is a new process, which isn’t free. If an application opens and closes connections for each query (without pooling), the connection setup/teardown can add latency. Typically, this is mitigated by using connection pools so queries reuse existing backend processes rather than constantly forking new ones. Without pooling, excessive connections or frequent connect/disconnect can bog down the server (PgBouncer documentation).
Server Resources and Configuration: Lastly, the hardware and config matter. If your server is low on RAM, it can’t cache much data. If work_mem (memory for operations like sorts or hashes) is too low, queries might spill to disk during sorting or hashing, drastically slowing those operations. If the CPU is slow or overloaded, query execution waits behind other work. If the disk is slow, I/O-bound queries suffer. And as mentioned, if you have, say, 100 active heavy queries at once (high concurrency), they compete for CPU, I/O and memory, each running slower than if they ran alone (PostgreSQL documentation on server configuration).
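A quick way to check whether a sort is spilling is to raise work_mem for your session only and re-run the query with EXPLAIN (ANALYZE, BUFFERS); a spilling sort reports "Sort Method: external merge", while an in-memory one reports "Sort Method: quicksort". A sketch (the table is illustrative):

SHOW work_mem;                    -- commonly 4MB by default

SET work_mem = '64MB';            -- affects this session only
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders ORDER BY order_date;
RESET work_mem;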
To sum up, query latency = time spent reading data + processing it + waiting on any resources. Big data volume, inefficient plan (lack of indexes or bad join method), and disk I/O are the big three for slow queries. In the real world, performance issues often come down to “the database is doing more work than it ideally should” – our job is to find out why and reduce the workload (or give it more resources). Thankfully, PostgreSQL provides great tools to analyze and tune query performance.
Tools for Analyzing Query Performance
Query Plan Tree Example: nodes for scan, join, filter, sort
How do we peek under the hood to see what Postgres is doing with our SQL? This is where performance analysis tools come in. The three primary tools we’ll discuss are:
EXPLAIN
EXPLAIN ANALYZE
pg_stat_statements (an extension for tracking query stats across the system)
EXPLAIN: Viewing the Query Plan
An annotated EXPLAIN output shows cost estimates, estimated rows, and plan node types. Learning to read the plan tree this way is how you spot the key details for performance analysis.
The EXPLAIN command in PostgreSQL shows the planned execution for a given query, without actually running the query. It’s like asking the planner, “Hey, how would you execute this if I ran it?” The output of EXPLAIN is the query plan: a tree of steps with cost estimates. Each line in an EXPLAIN output is one node in the plan (indented to show tree structure). For example:
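Suppose we ask for the plan of a simple lookup on a users table (an illustrative query; the numbers match the discussion below):

EXPLAIN SELECT * FROM users WHERE user_id = 1;

Seq Scan on users  (cost=0.00..19.00 rows=1 width=240)
  Filter: (user_id = 1)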
This tells us PostgreSQL plans to do a Sequential Scan on the users table, with an estimated cost between 0.00 and 19.00, expecting about 1 row to match (PostgreSQL documentation on EXPLAIN). Presumably user_id isn’t indexed in this scenario, so Postgres has to scan the whole table. The cost numbers are in arbitrary units, not milliseconds, and are only meaningful relative to each other. The first number is the cost to get the first row, the second is the cost to get all rows. Here the planner thinks the first matching row will be found almost immediately (0.00 startup cost), and finishing the scan would cost 19.00 units. The rows=1 is the estimated number of rows that will pass the filter, and width=240 is the estimated row size (in bytes). Don’t worry about width too much; it mostly matters for memory footprint. The key info from basic EXPLAIN: what plan was chosen (Seq Scan vs Index Scan, Hash Join vs Nested Loop, etc.) and the cost/row estimates.
Reading a complex plan can be intimidating at first, but it’s logical. For instance:
Hash Join (cost=... rows=...)
  ->  Seq Scan on orders ...
  ->  Hash (cost=...)
        ->  Seq Scan on customers ...
This means it will scan customers to build a hash table, then scan orders and probe that hash to perform the join. If you see an Index Scan in a plan, that’s usually a sign the planner is using an index (good for selective queries). A Seq Scan on a very large table for a selective query is a red flag (maybe an index is missing). EXPLAIN by itself doesn’t run the query, so it’s quick and safe to use on even slow queries — but it shows only estimates, not actual runtime.
Tip: You can use EXPLAIN (VERBOSE, BUFFERS) and other options for more detail. However, an even more powerful variant is EXPLAIN ANALYZE.
EXPLAIN ANALYZE: Timing the Query Execution
EXPLAIN ANALYZE actually runs your query and then shows the plan with actual execution statistics. This is incredibly useful because it tells you what really happened, not just what was expected. The output includes additional columns like (actual time=… rows=… loops=…) for each plan node. For example:
Seq Scan on users (cost=0.00..19.00 rows=1 width=240)
(actual time=0.026..0.030 rows=1 loops=1)
Filter: (user_id = 1)
Rows Removed by Filter: 999
This tells us the seq scan actually took ~0.03 ms and found 1 row, scanning 1000 rows (999 were filtered out) (Depesz’s EXPLAIN ANALYZE visualizer can help interpret these). The planner expected 1 row and got 1 – good estimate. The loops=1 indicates how many times this node was executed (loops > 1 appear in nested loops or repeated subplans). The actual time is the wall-clock time spent in that operation (it shows a start..end time range per node). The difference between actual and estimated can highlight issues: e.g., if a node estimated 100 rows but actual rows=10000, that discrepancy can explain why the query might be slower than expected (planner didn’t anticipate so much work). We often look for the slowest portion of the plan – usually a node with a high actual time.
Keep in mind that EXPLAIN ANALYZE does execute the query fully, so don’t use it on an UPDATE/DELETE unless you’re okay with it affecting data (or wrap it in a transaction and roll back). For SELECTs, it will actually pull all the data (though not print it out), so it takes as long as the query itself takes (plus some overhead for measurement). But the insights are gold. For performance tuning, you typically run an EXPLAIN ANALYZE on a problematic query to see where the time is going. Are we doing sequential scans that could be indexed? Is a hash join consuming a lot of time or memory? Is the actual row count way off from estimate? EXPLAIN ANALYZE will answer that.
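For a data-modifying statement, that wrap-and-rollback pattern looks like this (the users table and last_login column are placeholders):

BEGIN;
EXPLAIN ANALYZE
UPDATE users SET last_login = now() WHERE user_id = 1;
ROLLBACK;  -- the UPDATE really executed, but its changes are discarded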
Interpreting a real example: Suppose we have a slow query joining two tables. EXPLAIN ANALYZE might show:
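Something along these lines (an illustrative plan; the shape and timings match the interpretation below):

Hash Join  (actual time=45.0..120.0 rows=500000 loops=1)
  Hash Cond: (o.customer_id = c.customer_id)
  ->  Seq Scan on orders o  (actual time=0.01..80.0 rows=500000 loops=1)
  ->  Hash  (actual time=40.0..40.0 rows=100000 loops=1)
        ->  Seq Scan on customers c  (actual time=0.01..25.0 rows=100000 loops=1)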
This indicates it scanned 500k orders, 100k customers, built a hash of customers, and joined – the hash join took about 120 ms. If we see that and think “hmm, maybe there’s an index on customer_id we could use instead”, we could try to optimize it (more on that soon). The EXPLAIN ANALYZE helps pinpoint that the bulk of time was spent scanning orders (80 ms) and then hashing+joining (another 40 ms or so). Perhaps if we had an index and the planner chose nested loops, it might scan 100k customers and do 500k index lookups – which could be better or worse. Only testing would tell.
The key point: Use EXPLAIN ANALYZE to get ground truth on query performance. It will show you exactly which step is the slowest and whether the planner’s estimates are on target or not.
pg_stat_statements: Tracking Query Performance Over Time
pg_stat_statements is a PostgreSQL extension that tracks execution statistics for all SQL statements. While EXPLAIN is great for one-off analysis, pg_stat_statements is useful to monitor and find slow queries in a running system over time. It records metrics like total time spent, number of calls, I/O counts, etc., for each query (normalized).
To use it, you need to enable it in the Postgres config (shared_preload_libraries = 'pg_stat_statements') and install the extension in your database (CREATE EXTENSION pg_stat_statements;). Once that’s done, Postgres will start tracking stats. You can then query the view pg_stat_statements. For example, to find the most time-consuming queries overall:
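A query along these lines lists the top offenders (column names as of PostgreSQL 13+, where the timing columns are total_exec_time and mean_exec_time):

SELECT query,
       calls,
       total_exec_time,
       mean_exec_time,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

The key columns: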
query: The text of the query (with literals normalized to placeholders).
calls: How many times that query was executed.
total_exec_time: Cumulative time spent executing it (in milliseconds).
mean_exec_time: Average time per execution (i.e. total_exec_time/calls).
rows: Total number of rows returned or affected across all calls.
(There are also columns for I/O like shared_blks_hit/read, which show cache hits vs reads.)
Using this, you can identify, say, “Query X has been called 1000 times and averages 200 ms – that’s a slow query we should optimize.” It’s extremely helpful in production systems to find the slow outliers or the most frequently run heavy queries.
For example, you might find an entry like:
SELECT * FROM products WHERE category_id = $1 – called 15240 times, total_exec_time 42578 ms, mean_exec_time ~2.79 ms. That might be fine (2.79 ms each). Another might be a complex join with mean_exec_time 200 ms – which is worth investigating. Without pg_stat_statements, you’d have to guess where the slow queries are. With it, you can just query the stats view to pinpoint them.
One caveat: pg_stat_statements tracks queries by their normalized text. So WHERE id = 1 and WHERE id = 2 are considered the same query. This is usually what you want. Just be mindful that if your application generates lots of dynamic SQL (with text differences), the view can have many entries. Typically, though, you look at the top times or those exceeding a threshold.
Reading pg_stat_statements: If we see a query with high mean_exec_time, we’d then take that query and run EXPLAIN/ANALYZE on it to see why. pg_stat_statements is a starting point to find what is slow. It also lets you verify improvements: after adding an index or rewriting a query, you can see if its mean_exec_time goes down on subsequent executions.
Lastly, note that pg_stat_statements accumulates stats since the server started (or since you reset the stats). You might want to SELECT pg_stat_statements_reset(); to clear stats, run a workload, then check, to focus on a specific timeframe.
Other Tools Briefly
We won’t dive deep, but be aware there are other helpful tools:
- auto_explain (an extension to automatically log EXPLAIN ANALYZE plans for slow queries; a quick taste below),
- pgBadger (a log analyzer that summarizes slow queries),
- third-party monitoring systems, etc.
But the foundation of manual performance analysis is usually EXPLAIN/ANALYZE and pg_stat_statements.
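As a quick taste of auto_explain: it can be loaded into a single session (superuser required) and told to log the plan of anything slower than a threshold. These are real auto_explain settings, though in production you’d normally configure them in postgresql.conf:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '250ms';  -- log plans for queries slower than this
SET auto_explain.log_analyze = true;          -- include actual timings, like EXPLAIN ANALYZE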
Now that we have the theory and tools in mind, let’s put this knowledge into practice with a hands-on example.
Enough talk – time to get our hands dirty. In this section, we’ll set up a local PostgreSQL environment using Docker, load some sample data, and walk through optimizing a query. Follow along step-by-step to reproduce the tests.
1. Setting Up PostgreSQL in Docker
If you have Docker installed, spinning up a PostgreSQL instance is a one-liner. We’ll run the official Postgres image with a password for the default postgres user:
docker run --name pg-test -e POSTGRES_PASSWORD=mysecretpass -p 5432:5432 -d postgres:latest
This downloads (if not already) the latest Postgres image and runs it in a container named pg-test. It exposes Postgres on port 5432 of your machine (the default port). The database is ready within a few seconds. You can connect to it using any Postgres client. For simplicity, let’s use the psql command-line or a tool like DBeaver/TablePlus if you prefer GUIs.
Once the container is up, connect to it (the user is “postgres” and password “mysecretpass” as set above):
# Using psql (you might need to install it if not present):
psql -h localhost -U postgres
It should prompt for the password and then give you a postgres=# prompt. Now we’re in!
2. Sample Dataset Preparation
We need some data to play with. You can use an existing sample database (like the PostgreSQL DVD Rental sample or Pagila). But for transparency, we’ll create our own simple schema and generate some test data using SQL. This way, we know exactly what’s in it and can easily tweak it.
Let’s create two tables to simulate a scenario of a simple store:
CREATE TABLE items (
  item_id  SERIAL PRIMARY KEY,
  name     TEXT,
  price    INT,
  category TEXT
);

CREATE TABLE orders (
  order_id   SERIAL PRIMARY KEY,
  item_id    INT REFERENCES items(item_id),
  quantity   INT,
  order_date TIMESTAMP
);
items: holds products with a name, price, and category.
orders: each order references an item and has a quantity and date.
Now, we’ll insert data. We want enough rows to see performance differences (tens of thousands). We can use PostgreSQL’s generate_series function to quickly create lots of rows. For example, to insert 100,000 items and 100,000 orders:
-- Insert 100k items
INSERT INTO items (name, price, category)
SELECT
  'Item ' || g,
  (random() * 1000)::int,
  CASE
    WHEN g % 4 = 0 THEN 'electronics'
    WHEN g % 4 = 1 THEN 'books'
    WHEN g % 4 = 2 THEN 'clothing'
    ELSE 'home'
  END
FROM generate_series(1, 100000) AS g;

-- Insert 100k orders (each for a random item)
INSERT INTO orders (item_id, quantity, order_date)
SELECT
  floor(random() * 100000)::int + 1,              -- random item_id between 1 and 100000 (floor keeps it in range)
  (random() * 5)::int + 1,                        -- random quantity 1 to 6
  NOW() - (random() * 365 || ' days')::interval   -- random date within the past year
FROM generate_series(1, 100000);
(This might take a few seconds to run given 200k inserts, but it’s all happening inside the database, which is pretty fast in modern Postgres.)
We now have: 100k items and 100k orders, with orders distributed randomly across items.
Before proceeding, let’s ensure stats are up-to-date (especially because we generated data in a somewhat unusual way). Run:
ANALYZE;
This will update the planner statistics for our tables, so our upcoming EXPLAINs are based on fresh info.
3. Measuring Query Performance with EXPLAIN and EXPLAIN ANALYZE
Scenario: Imagine we frequently need to query for orders of a particular item by name. Perhaps our app is asking: “Find all orders for the item named ‘Item 50000’.” A straightforward SQL for that might be:
SELECT o.order_id, o.quantity, o.order_date
FROM orders o
JOIN items i ON o.item_id = i.item_id
WHERE i.name = 'Item 50000';
Let’s analyze this query’s performance.
First, use EXPLAIN to see the planned execution:
EXPLAIN
SELECT o.order_id, o.quantity, o.order_date
FROM orders o
JOIN items i ON o.item_id = i.item_id
WHERE i.name = 'Item 50000';
You might get an output (formatted for clarity) like:
Nested Loop (cost=0.00..???? rows=?)
-> Seq Scan on items i (cost=0.00..???? rows=1)
Filter: (name = 'Item 50000')
-> Index Scan on orders (cost=0.00..???? rows=?)
Index Cond: (item_id = i.item_id)
What does this mean? The planner sees a filter on i.name = 'Item 50000'. We did NOT create an index on items.name, so by default, it has to Seq Scan items to find that item. Since there’s only one matching item (if names are unique), it estimates 1 row. For each item found (just one), it then does an Index Scan on orders… but wait, do we have an index on orders.item_id? We did not explicitly create one. However, recall we made item_id a foreign key. Postgres does not automatically index foreign keys (unlike primary keys). So currently, orders.item_id has no index. That means the planner cannot actually do an Index Scan on orders by item_id. Instead, likely it will do a Seq Scan on orders and filter each by matching item_id. The plan might actually be:
Hash Join (cost=...)
  ->  Seq Scan on orders o ...
  ->  Hash (cost=...)
        ->  Seq Scan on items i ... (Filter: name = 'Item 50000')
Because without an index on orders.item_id, a Nested Loop using index scan isn’t possible. The planner probably chooses a Hash Join: it will read the one matching item, hash its item_id, then seq scan orders and for each order check if its item_id matches the hash (i.e., equals 50000). Since orders is 100k, that’s a 100k scan – not too bad, but not great if this query is frequent.
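If you want to confirm that orders.item_id really has no index backing it, the pg_indexes view lists what exists on the table (at this point only the primary key index should appear):

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'orders';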
Let’s confirm by running EXPLAIN ANALYZE to see actual behavior and timing:
EXPLAIN ANALYZE
SELECT o.order_id, o.quantity, o.order_date
FROM orders o
JOIN items i ON o.item_id = i.item_id
WHERE i.name = 'Item 50000';
You might see output indicating a hash join and the time, for example (pseudo-output):
Hash Join (actual time=0.2..15.0 rows=Random loops=1)
  Hash Cond: (o.item_id = i.item_id)
  ->  Seq Scan on orders o (actual time=0.02..10.00 rows=100000 loops=1)
  ->  Hash (actual time=0.1..0.1 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on items i (actual time=0.01..0.01 rows=1 loops=1)
              Filter: (name = 'Item 50000')
              Rows Removed by Filter: 99999
Interpreting: it scanned 100k items to find the one with that name (took ~0.01 ms per the above – actually likely a bit more, but anyway). It found 1 row, built a tiny hash. Then it seq scanned 100k orders, and for each, applied the hash condition (which essentially checks item_id = 50000). It found, say, a handful of orders that matched (depends how random our data was; could be around 1 order on average since random distribution). Total time ~15 ms (most of which is scanning orders).
Now, 15 ms isn’t terrible, but remember this is a contrived smallish dataset on a local machine. Scale it up or have many concurrent queries and it adds up. Also, scanning 100k rows to get maybe 1 result is wasteful. We can do better by adding an index.
4. Optimizing the Query: Adding an Index
| Aspect | Before Index | After Index (on orders.item_id) |
| --- | --- | --- |
| Plan Type | Hash Join | Nested Loop Join |
| Plan Details | Seq Scan on items + Seq Scan on orders | Seq Scan on items + Index Scan on orders |
| Rows Scanned (orders) | 100,000 (all rows) | ~1-3 (only matching rows) |
| Execution Time | ~15 ms | ~0.5 ms |
| Performance Gain | - | 30x faster |
| Scalability | Poor (O(N) with table size) | Good (logarithmic with table size) |
The obvious optimization is to add an index on the items.name or on orders.item_id (or both). Which one helps more?
Index on items.name would make finding “Item 50000” very fast (index lookup instead of seq scan on items). But items is only 100k rows and fairly static, and scanning 100k in memory is like ~0.01 seconds, not huge. The bigger cost was scanning orders.
Index on orders.item_id would allow Postgres to directly find all orders for item_id 50000 instead of scanning all orders. That seems more beneficial in this case.
Let’s create an index on orders.item_id:
CREATE INDEX idx_orders_item ON orders(item_id);
After creating the index, run ANALYZE orders; (to update stats for the planner to consider the new index). Now, re-run our query with EXPLAIN:
EXPLAIN SELECT ... (same query) ...;
Expected new plan:
Nested Loop (cost=... rows=...)
-> Seq Scan on items i (cost=... rows=1)
Filter: (name = 'Item 50000')
-> Index Scan using idx_orders_item on orders o (cost=... rows=??)
Index Cond: (item_id = i.item_id)
Because now the planner knows it can find orders by item_id quickly using the index. The Seq Scan on items remains (we didn’t index name yet), but that’s minor. The big change is an Index Scan on orders. With EXPLAIN ANALYZE, we’d likely see:
Nested Loop (actual time=0.1..0.5 rows=X loops=1)
-> Seq Scan on items i (actual time=0.01..0.02 rows=1 loops=1)
Filter: (name = 'Item 50000')
-> Index Scan using idx_orders_item on orders o (actual time=0.05..0.40 rows=X loops=1)
Index Cond: (item_id = i.item_id)
The index scan will only fetch orders where item_id matches our item. If, say, there were 3 orders of Item 50000, then rows=3 for that index scan. The actual time would be very small (the index lookup is very fast, plus fetching 3 table rows). The total time might drop to ~0.5 ms or so (plus the overhead of client display). A dramatic improvement from 15 ms to ~0.5 ms, which is 30x faster in this toy example. More importantly, the workload scales: even if orders table had 100 million rows, an index scan to find a few orders would remain fast (logarithmic time to traverse the index, then retrieving a few rows). The seq scan approach would balloon to scanning all 100 million rows, which is obviously much slower.
This demonstrates the classic index vs no-index difference. We addressed the performance by adding a missing index on the join/filter key. In real scenarios, you’d apply the same logic: find what the bottleneck is (e.g., large seq scan or sort) and see if an index or query rewrite can eliminate or reduce that heavy operation.
What about indexing items.name? If we also did CREATE INDEX idx_items_name ON items(name);, then the planner would likely do an Index Scan on items to find “Item 50000” instead of a seq scan. That would shave that last 0.01 ms to maybe 0.005 ms – negligible for our scale. But if items were millions of rows and you frequently search by name, it’s worth indexing. The general rule: index columns that are frequently filtered or joined on. Here, orders.item_id was crucial to index for the join, and possibly items.name is worth indexing for direct lookups by name.
Another potential improvement, if we really expect this specific query a lot, might be to denormalize (store item name in orders). But that has its own trade-offs and usually not necessary with proper indexes.
5. Using pg_stat_statements to Verify Improvement (Optional)
In our simple test, we can just measure timing directly. But imagine this was a production system. Before adding the index, pg_stat_statements would have shown the query with a higher mean time. After adding the index, that mean time should drop.
If you had pg_stat_statements enabled, you could do something like:
SELECT mean_exec_time, rows
FROM pg_stat_statements
WHERE query ILIKE '%JOIN items i ON o.item_id = i.item_id%Item 50000%';
(We use ILIKE with a snippet of the query to find it; in real use, you’d probably identify by an id or just look at the top slow queries.) You would see the mean_exec_time before vs after. Or simply reset stats, run the query 100 times before the index and 100 times after, and then compare the averages in pg_stat_statements. The difference should be evident (we saw ~15 ms vs ~0.5 ms per execution).
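The reset-and-compare workflow boils down to a few statements (run the workload, e.g. the order-lookup query 100 times, between the reset and the final SELECT):

SELECT pg_stat_statements_reset();

-- ... run the workload here ...

SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;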
6. Other Optimization Techniques
In our hands-on scenario, adding an index was the obvious fix. In other cases, you might consider:
Query rewriting: e.g., if a query is doing something inefficient like SELECT in a loop or unnecessary subqueries, you can rewrite it in a set-based way or use CTEs/materialized views, etc. That’s outside our current example, but an important tool.
Adjusting Postgres settings: For instance, enabling parallel query for large scans, or increasing work_mem for big sorts can help certain queries. Those are more advanced tuning steps once you identify the nature of the slow operation.
Caching results: Sometimes if an application repeatedly runs an expensive query with the same parameters, caching the result in the application or using a materialized view in the database can bypass repeated work. Again, situational, but worth noting.
Adding appropriate indexes: As we did, but with care – too many indexes slow down writes. Focus on the ones that matter for read performance hotspots. Also consider index types: e.g. use a GIN index for text search, etc., if that’s the bottleneck.
Vacuum and bloat: Ensure tables aren’t bloated (which can slow scans) – autovacuum usually handles this, but if you have a lot of churn, you may need to tune it.
Connection pooling: As mentioned, if connection overhead is an issue, using a pooler like PgBouncer can drastically cut latency for lots of short queries by reusing backends.
Our example was straightforward and common: speeding up a selective query by indexing. In practice, many performance problems have similar straightforward solutions once identified, while some require more thought and testing.
Conclusion
| Performance Factor | Impact | Tools for Analysis | Optimization Techniques |
| --- | --- | --- | --- |
| Query Planning | High | EXPLAIN, EXPLAIN ANALYZE | Query rewriting, schema optimization |
| Indexing | Very High | EXPLAIN (look for Seq Scans) | Create appropriate indexes on filter/join columns |
| I/O Operations | High | EXPLAIN ANALYZE (actual rows/time) | Optimize buffer cache, reduce disk reads |
| Table Statistics | Medium | ANALYZE, pg_stat* views | Regular ANALYZE, monitor data changes |
| Join Methods | High | EXPLAIN (Nested Loop vs Hash Join) | Choose right join type, index join columns |
| Memory Usage | Medium | EXPLAIN BUFFERS | Tune work_mem, maintenance_work_mem |
| Connection Overhead | Low-Medium | pg_stat_activity | Connection pooling (PgBouncer) |
| Query Complexity | High | Query parsing | Simplify complex queries, use CTEs |
| Concurrency | Medium | pg_stat_activity | Manage connection limits, query timeouts |
| Data Volume | Very High | Table size statistics | Partitioning, archiving old data |
PostgreSQL is a brilliant piece of software that usually does the right thing automatically – but as engineers, understanding why it does what it does allows us to write better queries and design better database schemas. We saw that under the hood, Postgres executes queries through plan nodes (scans, joins, sorts) and each has implications for performance. We used EXPLAIN and EXPLAIN ANALYZE to reveal those inner workings, and pg_stat_statements to find slow queries systematically. Our hands-on example showed how a seemingly innocent query can be slow without proper indexing, and how a simple index can improve it by an order of magnitude.
As you apply this knowledge to your own applications, keep a few things in mind (yes, a TL;DR of the TL;DR):
Always profile your slow queries – use EXPLAIN ANALYZE to see what’s really happening. It’s the first step in any SQL performance investigation.
Ensure you have the right indexes for your access patterns. It’s one of the most impactful tweaks for query speed (just don’t go index-crazy without reason).
Watch out for large sequential scans, hash joins, or sorts in plans on big tables – they often indicate areas to optimize (either via indexes or query changes).
Use pg_stat_statements or logging to catch slow queries in production. You can’t optimize what you don’t know about.
Think holistically: sometimes the fix is in the application (e.g., querying less often or in bulk), sometimes in the DB (indexes, config), sometimes in schema design (normalization vs denormalization).
With a bit of practice, reading EXPLAIN output and tweaking queries becomes an immensely satisfying debugging activity – it’s like being a detective diving into the database’s brain. So the next time your API call or report is sluggish due to a database query, you know the tools and concepts to crack the case. Happy querying, and may your Postgres queries be ever speedy!