SQL Skills for Product Managers
The power of data
It’s painful, it’s brutal and it’s a little sadistic. But it’s wonderful.
Data to product managers is like ammunition to an assassin; nothing gives us more pleasure than gently annihilating a loud, forceful stakeholder with an incisive bullet of data which obliterates their stance in a few seconds.
And then the silence.
Those short, awkward seconds of silence after a killer piece of data is presented in a meeting is almost as satisfying as the utterly dismayed looks on people’s faces which typically follows.
The people in the meeting room shift around in their seats, twiddle their pens and look at their own notes; grasping at any last straws that might salvage their arguments.
Without the data, the meeting would inevitably have been a long and arduous slog of opinion-based drivel, ending with the either the loudest person or the highest paid person (HIPPO)’s winning the day.
Am I saying opinions don’t count? No. Of course they do. But ultimately, a potent blend of opinion, persuasion and data will ensure you win the day.
Typically, as product managers we’re comfortable using 3rd party tools such as Google Analytics, MixPanel or KISS metrics to get our own data, but when it comes to interacting with databases directly, we often rely on other people to get the data we need before meeting with stakeholders or putting together decks for a herd of HIPPOS.
In larger corporates, there are entire teams set up specifically to deal with and manage your data. If you want a bit of data, you need to make a formal request and you might get it a few days later. In smaller companies, it’s usually the engineering team’s responsibility to retrieve the data – or the business analysts if you have them.
But what if you didn’t need to rely on other people? It’s an extremely useful skill to be able to interact directly with a database yourself and retrieve the data you need, whenever you want to.
One of the most popular languages used for creating, retrieving, updating and deleting data from a database is SQL.
With even a little bit of SQL knowledge you’ll be able to write queries, generate reports and completely bypass your data teams in some instances.
So you want to know more about SQL? OK, great.
What is SQL?
SQL stands for structured query language. People also refer to it as ‘sequel’ – just to confuse things a bit. You may have heard your engineering team refer to ‘sequel’ – and this is what they’re referring to.
SQL is a database specific programming language – meaning it can only be used with databases. SQL works with MySQL databases.
What’s a MySQL database? Well, it’s a popular type of database known as a relational database. What’s a relational database? A relational database is a set of multiple data sets organised by tables, records and columns. Each table is known as a relation. Here’s a diagram which explains the concept:
Data is stored in tables. Just like data in your excel sheets is stored in tables.
In a relational database these tables can communicate with each other based on specific fields in the database which are known as keys. But let’s leave that for another day and not get bogged down in that. Instead, let’s focus on the language for querying MySQL databases.
SQL is the language used to create, retrieve, update and delete data from a database. The acronym CRUD is used for these actions. As product managers, I’d argue that we are primarily concerned with the ability to read information from a database; getting the data we need for our reports and leaving the creating, updating and deleting to the engineers or database managers who are responsible for outlining the schema and structure of databases.
Why is SQL useful for product managers?
As a skillset, there are some who think that SQL is an essential skill for product managers.
I remember being sat in a meeting with a few of my product colleagues in a previous role a few years ago and the Head of Product at the time balked at the idea that any product manager might not have SQL skills. My SQL skills were a little bit ropey at the time, so I smiled and sheepishly nodded along in painful agreement, knowing that I’d need to take a crash course in SQL immediately.
But as with most things product related, it’s not clear where the boundaries lie between essential skills and nice to have skills, since we’re a blend of all sorts of disciplines. The mutants of the tech world.
I don’t think SQL skills are completely essential; your focus as a product manager is and always should be on delivering value for your customers – and your business. And when it comes to metrics in particular, your focus as a product manager should be on the product discoveries that metrics provide you with and not on the mechanism of getting the metrics themselves. So SQL is just a nice to have. But nice to haves are nice to have.
The benefits are clear:
- Don’t rely on engineers or data teams to building reports pull the data you need for yourself.
- Set your metrics, measure your results without involving your engineering team too much
- Particularly helpful to know if you’re a startup or small business since you’re less likely to have a dedicated data / reporting team.
- Get the data you need – some queries might be complex but even the most basic SQL queries will allow you to get whatever the data you need.
- Understand where the data lives – if you understand where the data lives, you have a deeper understanding of your product – and you can ensure that in your next planning session that a relevant field is added to the relevant table so that you can generate reports to measure the performance of your feature.
How does it work?
As with most things, sometimes concepts are better explained using words and pictures than text. Here’s a video we’ve put together which explains what SQL is and how it works:
SQL retrieves information from a database. This process is known as ‘querying’ a database.
Querying says ‘hey database, show me all the customers on this package’ and the database returns a result, known as a ‘result set’. The result is the answer to your query.
How querying a database works
Connecting to a database
In order to query a database you need to connect to it. To do this you need to establish a connection. What do we mean by a ‘connection’?
Your databases contain highly sensitive information, which means you don’t want anyone to be able to rummage around and grab whatever they want.
If you want to access a database, you’ll need a username and password. You’ll also have administrative rights determined by your administrator e.g. read-only, or read-only access to tables that you think you might need.
There are various ways to connect to a database. Your organisation may have it’s own tools. The easiest way to connect to a database is to use a ‘GUI’ tool such as MySQL Workbench or Sequel Pro (Mac users).
Once you’re connected, you can start running queries against your database to retrieve the information you want and run the reports you want. Whoop.
Basic SQL syntax explained
How do you select what you want from the database? You use the word
SELECT. Fairly self explanatory, really.
SELECT tells the database that you’re performing a query and you expect a set of data to be returned as a result. Your database could contain hundreds of different tables, each with different pieces of information. You need to tell the database what you want and where in the database you want to get it from. You’ll need to ensure you know which table contains the data you need.
SQL SELECT statement
|SQL syntax||What it does||Example|
||Selects everything; the * sign means all in SQL||
||Selects everything from a specific table||
||Selects a specific column from a table||
||Selects 2 columns (column1 and column2) from the table||
The most useful SQL queries to know as a product manager
Now that you’ve got a basic understanding of how SQL works, let’s take a look at some of the most useful SQL queries to know as a product manager. This is by no means an exhaustive list of all the relevant queries you should know since every product is different but if you know how to construct these types of SQL queries you’ll have a solid foundation to build upon.
As product managers, it’s highly unlikely that we’ll ever need to create databases in our day to day roles (although there are always exceptions of course!); instead, we’ll typically be using SQL to create reports and retrieve information from a database so that we can put together presentations or keep stakeholders and our teams up to date with the latest product metrics and KPIs so we’re focusing on queries which allow us to read data from a database – not create, update or delete.
1. Select specific columns from a table
SELECT keyword is used to tell the database which columns you want to select from the database.
“I want to select all transactions from the payments table”
SELECT transactions FROM payments;
“I want to select the users and the transactions from the payments table”
SELECT users, transactions FROM payments;
Commas are used to separate the 2 different columns you want to select from the table.
2. Apply conditions to filter your results
You won’t always want to return every single record from a specific column. Sometimes you’ll want to limit it to records which meet a specific criteria. Let’s look at some examples to help explain what I mean.
“I want to select all customers who live in London”
In this example, we want to limit our results to only customers who live in London. To do this, we use the WHERE clause. You say to the database, OK, show me all the customers, but only WHERE the customers live in a specific city. Like this:
SELECT * FROM customers WHERE city = ‘London’;
Notice that following the WHERE clause we specify the column followed by the condition using the ‘=’ symbol – known as the equality operator; just a pompous way of saying ‘equals’. You can use any logical operator here.
You might know about operators; if not, here’s an overview of the operators you can use with SQL.
|!=||Not equal. Note: In some versions of SQL this operator may be written as <>|
|>=||Greater than or equal
|<=||Less than or equal
|BETWEEN||Between an inclusive range
|LIKE||Search for a pattern
WHERE column_name LIKE pattern;
|IN||To specify multiple possible values for a column
Notice also that the city – ‘London’ – is put into inverted commas since this is a string. In SQL, if the condition you’re filtering against is a word or a string of letters then this needs to be put into inverted commas. If it’s a number and not a string of letters, no inverted commas are required.
Some other examples of using the WHERE clause to apply conditions include:
“Show me all customers on the premium package”
SELECT * FROM customers WHERE package = ’Premium’
“Show me all customers who are paying more than $200 a month”
SELECT * FROM customers WHERE monthly_price > 200;
Notice that we’re used the greater than symbol here >. You could change this to less than (<), less than and equal to (<=) or any other logical operator. That’s the neat thing about SQL; once you know the foundations you can then mix together different operators and clauses to get the data set you need.
It is, dare I say it, almost fun.
3. Retrieve data in a specific order
Just like you do in your Excel spreadsheets, you can use SQL to filter your results.
You might want to get a list of all your customers in ascending order of how much they pay per month or a list of your most popular packages, ordered by country alphabetically.
How do you do it? Let’s look at some examples.
Example – filter by first name
“I want a list of all my customers ordered by their first name alphabetically”
SELECT * FROM customers ORDER BY FirstName ASC;
There’s a few new things here. Let’s look at each one individually:
ORDER BY– This, as you will no doubt not be surprised to find out, allows you to order you results by a specific column. In this example we’re saying we want to order our results by the FirstName column.
FirstName– The column you want to order by is what comes immediately after the ORDER BY keywords. In this example, we’re ordering by the FirstName column.
ASC / DESC– these allow you to specify whether you want the results to be in ascending or descending order. Where the column includes text (such as this example with FirstName), SQL will order the results alphabetically. If the column was numbers, it would be in ascending numerical order.
Make sense? Here’s a visual example:
Example 2 – order by payment amount
If, instead of ordering by first name, you wanted to retrieve a list of your customers, ordered by customers who are paying the most per month, in descending order you’d do the following:
SELECT * FROM customers ORDER BY monthly_payment DESC;
This tells the database, ‘show me everything from the customers table and ORDER it BY the monthly_payment column in DESCending order.’
4. Rename results using aliasing
Often when you’re working with sets of data, particularly in larger corporates, the column names in the tables that you’re working with in the database make absolutely no sense.
You might need be looking for users on a specific subscription package but have no idea which column in the table refers to monthly subscription. In these scenarios, you’d need to do 2 things:
- Speak to your data person to find out which column matches the column you’re trying to find in the database
- Use aliasing to rename your results when performing queries
Here’s a real life example of a table I once worked with at as a product manager:
Notice how the columns don’t make a great deal of sense if you don’t know what they relate to.
Let’s assume you’ve had a chat with your database person and you now know which columns in the database matches the columns you need for your report. You then need to rename the columns in your results to ensure everyone knows what it means.
Not making sense? Let’s look at an example.
“I want to create a report showing customers by package”
SELECT customers, pkge_legacy_3 FROM customers;
Let’s say that in this example, you’ve looked at the tables in the database and you can’t find a field which matches ‘packages’. You have, however, spotted a column called ‘pkge_legacy_3’ but you’re not 100% sure if this is the correct column.
You speak to your database person and s/he confirms that this is indeed the column you need for your report.
Great. Thanks for the help.
But for your own sanity, you’d like to rename this column so that in your weekly report you don’t have to remind yourself that ‘pkge_legacy_3’ means package. You’re going to be sharing this with stakeholders and you want to avoid the inevitable avalanche of questions:
- ‘What’s this ‘‘pkge_legacy_3’ column?’
- ‘Why is there a column called ‘‘pkge_legacy_3’
- ‘Are you SURE this is the right data source? I can’t remember there being a package called legacy3 – pls confirm’
So what do you do?
SELECT customers, pkge_legacy_3 AS ‘Package’ FROM customers;
Using the ‘AS’ keyword, you rename the column in your result set. In this example, you’re renaming the hideously named pkge_legacy_3 as ‘Package’ by using the AS keyword followed by what you’d like to rename it as.
You can rename it in any way that makes sense for you and your report. For example, if I wanted to rename the package column as ‘Shirley’ I would do the following:
Example 2 – renaming multiple columns
You can rename multiple columns in 1 single query if it makes sense for your report.
Let’s say not only is there a hideously named ‘pkge_legacy_3’ but there’s also the monthly payment amount which is currently named ‘mnth_pkge_pxmnt’.
Your stakeholders will have a mild meltdown if you send this around. It’s going to need to change. How do you do it?
pkge_legacy_3 AS ‘Package’,
mnth_pkge_pxmnt AS ‘Monthly payment amount’
To do this, we’ve added a comma after ‘Package’ to indicate that in addition to package column we want to select another column from the database. In this case, we’re adding the mnth_pkge_pxmnt column and using the AS keyword to rename it to ‘Monthly payment amount’.
5. Joining data together
We’ve covered quite a few pieces of useful SQL and this is all wonderful, but we’re making 1 rather bold assumption throughout this little journey so far and that assumption is this:
The truth? It’s not.
So what does that mean?
Well, that means that in order to retrieve the data you need, you’ll often be joining data together from multiple different tables. It sounds rather complicated, and it is at first.
Your brain really won’t want to want you to understand joins. It will tell you that they are complicated and that you’ll never understand them (at least that’s what mine did).
But here’s a video which will hopefully explain how joins work:
To summarise, the process of joining data together from separate tables is intuitively called a ‘Join’.
Remember we mentioned briefly that MySQL is an example of a ‘relational database’? Well, it turns out that one of the main benefits of ‘relational databases’ is that since there is a relationship between the tables you can join them together using ‘joins’.
There are different types of joins but let’s focus on the most frequently used ones for now:
- Inner Join
Inner joins can be used with the words INNER JOIN or simply JOIN. Inner joins return only the bits in the middle that match the field you’re joining the tables on. A venn diagram with 2 intersecting circles is used to demonstrate this. We’re only interested in the bits in the middle – the bits that match the criteria upon which you’re joining the 2 tables together. Inner joins exclude everything else from the results.
Let’s suppose you have 2 tables:
- A users table – which is a list of users
- A hobbies table – which is a list of hobbies. Hobbies include fishing and walking.
Here’s a video which explains how you could join the 2 tables together:
In order to join the 2 tables together you need to:
- Tell the database what you want to select (using the SELECT statement)
- Tell the database which columns you want to select
- Tell the database FROM which tables you want to select the data
- Join the 2 tables using the JOIN keyword
- Tell the database ON which columns you want to join the 2 tables together
- Tell the database which columns from both tables you should be matched, using the ‘=’ equality criteria
Here’s the syntax for doing that:
SELECT users.name, hobbies.hobby FROM users
JOIN hobbies ON users.id = hobbies.UserID;
A left joins return everything in the first table, table1, along with the matching rows in the second table, table2.
Here’s a diagram which demonstrates how left joins work, using the circles. You can see that with left joins, not only do you get back the bit in the middle (the matching intersection), but you also get back everything in the ‘left’ table – even if there is no match between the tables.
If there is no match found for a particular column the result is displayed as NULL (meaning nothing, nil, nada).
OK, check out this visual example, which shows you in real time how different joins can impact the output of a particular SQL query. It’s put together by a developer and it’s super useful to be able to visualise it all in one place.
A real world example
This is all wonderful in theory, but how would joins work in the real world of product management?
Let’s look at a real life example.
“Build a report which shows monthly package prices and descriptions”
This sounds like a fairly straight forward report. However, the data you need for your report is kept in 2 separate tables:
- The customers table
- The package table
To generate your report you need to join together the 2 tables together on a specific column to ensure you get the correct data.
Here’s a video which explains how we’d tackle this problem:
The final step – persuading your stakeholders that your data point is relevant
Just knowing or sharing the the data with others is likely not enough to get the other side to listen to why your data point is important. Humans are irrational primates and facts, figures and data alone are not good enough.
To truly harness the power of data, you need to be able to persuade your stakeholders of the relevance of your data in ways which resonate with them.
One technique which is particularly powerful is what we call the Paper Bullet technique. It sounds deadly – and it is. Well, it’s not, but it is useful.
Here’s how it works.
You call a meeting with the relevant people to discuss the feature, backlog or problem that you’re trying to solve, and throughout the meeting you pick your moment to explain why the data is telling a different story.
But, instead of just telling people about the data point, you prepare a printed handout which contains the numbers and the graphs you’re referring to. When it comes to your time to pounce, you hand out a few pre-printed copies and confidently explain why the the data is telling a different story.
This technique is powerful for 2 reasons:
- It disarms – using a printed handout gives you the upper hand, disarms your audience and demonstrates that you’ve done your homework. The other participants will think to themselves ‘WTF, this guy has prepared printed documents?! I look like an idiot’. With your audience disarmed, they’ll feel less confident about challenging your point.
- It reinforces – The piece of paper itself acts as a tangible reinforcer of the point you’re making. Touching, feeling, interacting with the paper will make the data you’re presenting seem more authentic; it adds credibility and makes it feel more real. Looking at a graph on a screen is effective sometimes, but picking it up and holding it in your hands makes it 10X more effective. It’s almost as though the data is alive (it’s not).
Bringing it all together
So that’s a short introduction into the world of structured query language, or SQL.
If you’re not a product manager from a computer science background at all, using SQL can be daunting at first. But, as we discussed earlier, SQL is a nice to have – so if you’d rather not have it, there’s no mandatory requirement for you to do so as a product person.
That being said, with even the most basic foundation of SQL skills, you can create reports which combine tables from multiple databases – and cut out the annoyance of having to ask someone else to get the data for you – so that you can get what you need – and get on with your day.