How to solve any SQL problem
Progressing from syntax to problem-solving
I’ve been pondering the most common problem I notice with my coaching clients, when it comes to SQL. They know the syntax of SQL, they know the difference between a LEFT JOIN and INNER JOIN, they know each component of a window function.
But they struggle with the “how”. How to apply the theoretical SQL knowledge to actually solving SQL problems. How do we go from a question and a dataset to an answer.
In this article, I will share my framework that I use to solve any SQL problem. It doesn’t matter if it’s a straightfoward query, or if it requires multiple CTEs. This framework works because teaches you how to problem-solve. This would be helpful if you’re applying to any Data jobs, or if you’re doing SQL on the job.
🎄 BTW, I'm hosting a free SQL challenge this December -- the SQL Advent Calendar 2025! It is 24 days of daily SQL challenges, but everything is Christmas-themed. Sign up here: www.interviewmaster.ai/advent
For today’s walk-through, we’re going to use a question from my Christmas-themed SQL challenge.
Question: The Grinch has brainstormed a ton of pranks for Whoville, but he only wants to keep the top prank per target, with the highest evilness score. Return the most evil prank for each target.
Table name and columns: grinch_prank_ideas (prank_id, target_name, prank_description, evilness_score, created_at)
Step 1: Understand the question
This is the most important step of solving any SQL problem. We need to understand what we’re working with and what we’re trying to achieve, before we can create a solution.
Let’s start by pulling out the keywords from the question (highlighted in bold).
The Grinch has brainstormed a ton of pranks for Whoville, but he only wants to keep the top prank per target, with the highest evilness score. Return the most evil prank for each target.
Everything else in the question is a distraction, we can ignore that :)
Step 2: Find the relevant data
Next up, let’s find the relevant columns in the table(s) that we are given.
Table name and columns: grinch_prank_ideas (prank_id, target_name, prank_description, evilness_score, created_at)
So mapping our keywords to the relevant columns:
target is in the target_name column
evilness score is in the evilness_score column
In the real world, the data might not be so readily available or so intuitive. In which case, I would read the data documentation or talk to the database owners for more information.
Step 3: Break the question into a series of steps
Now that we’ve highlighted the keywords, we know what to focus our attention on. Our next step would be the reframe the question into a series of steps.
For this question, these are the series of steps that I would take (+ a quick note on the SQL syntax that we would use).
We want to rank the pranks by evilness score in descending order using
RANK() & ORDER BY evilness_score DESCBut we want to restart the ranking based on the target, so we’ll need a window function with a
PARTITION BY target_nameThen we want to filter down to only where RANK = 1 using a
WHERE rank = 1clause
Step 4: Visualize the intermediate tables (This step is a game-changer for my clients!)
Quite often, the data that we need isn’t directly available in the tables that we have. We need to clean, join and aggregate.
In this example, we know that we want to get a rank of the pranks for each target, but data is not already available in the given tables. So this is definitely an intermediate table that we’ll need.
Let’s visualize this intermediate table that we can wrap into a subquery or a CTE. You can see below that we added a prank_rank column to our intermediate table.
Original: grinch_prank_ideas (prank_id, target_name, prank_description, evilness_score, created_at)
Intermediate table: grinch_prank_ideas_ranked (prank_id, target_name, prank_description, evilness_score, created_at, prank_rank)
Then start to write the query blocks for the intermediate tables. This should be a little easier since in Step 3, we already broke down the steps that we need, and jotted down the relevant syntax.
SELECT
*,
RANK() OVER (
PARTITION BY target_name
ORDER BY evilness_score DESC
) AS prank_rank
FROM grinch_prank_ideasStep 5: Query the intermediate tables to get the final query
This is our last step! Now that we have created one or more intermediate tables, we can put together our final SELECT & final query.
At this point, we should have intermediate CTEs that we can query with simple SELECT, WHERE, ORDER BY, LIMIT etc. If you find that you are still not able to finish off your query, then go back to Steps 3 & 4 — break the steps down further and fill in any required intermediate queries.
Note: In the real-world, you often have to create multiple, interdependent CTEs. That’s totally normal! Keep everything organized by using descriptive names and avoid correlated subqueries. Sometimes, when I find it hard to keep track of my many CTEs, I’ll draw out a flow chart so that I can trace the flow & transformation of the data.
Here’s the final query for this example!
WITH grinch_prank_ideas_ranked AS (
SELECT
*,
RANK() OVER (
PARTITION BY target_name
ORDER BY evilness_score DESC
) AS prank_rank
FROM grinch_prank_ideas
)
SELECT *
FROM grinch_prank_ideas_ranked
WHERE prank_rank = 1Hope this is helpful! Would love to hear any questions or feedback that you have :)
ICYMI (in case you missed it!)
As I mentioned before, I have a free SQL challenge running in December 2025. It’s super fun, and 1,500 people have already joined! Join the challenge & level up your SQL skills here.
I am making daily videos breaking down the challenge question (in painful detail) on my Instagram. Check out the first video of my series here.
A case study on how AI is used in sports, and leading to better performance!
Data cleaning in Python cheatsheet
Tableau learning roadmap cheatsheet
Quick video walk-through of a Data Analysis in Python, focusing on F1 data



This is such a clear breakdown of SQL problem-solving. The step about visualizing intermediate tables before writing code is genuinely clever, especially when you're dealing with complex queries that need multiple CTEs. I've noticed a lot of people (including me at times) just start writing SELECT statements without mapping out what transformations they actually need first. The way you named that intermediate table as grinch_prank_ideas_ranked makes it so much easier to reason about what's happening in the query flow.