Overview: SQL Joins

Today we’ll briefly discuss SQL joins—the method used to join two tables in a database together based on a given column (key) that they have in common. By the end of this article you should be able to explain the concept of a join as it applies to SQL and querying a relational database, as well as identify the different types of joins available and their particular purposes.

If you want to learn more about relational databases and SQL click here.

If you’re nervous about how to present yourself when faced with questions about SQL and databases in an interview, click here.

 

1. What is a join?

The join is an important tool for instances which require that you put two or more tables together when retrieving a dataset from your database. Tables in a relational database are usually designed with columns that serve as keys—the keys in question are usually some form of unique identifier, i.e. a client name or ID number. Tables that have the same keys can be matched up based on those values and the data related to those values can then be linked up as well:

RDSQL_img_10

2. What are the different types of joins, and why do we have different types of joins?

There are several different types of joins—we’ll be focusing on the big 4 today: the inner join, left join, right join, and full outer join. Data incongruence can be an issue when joining tables and the different joins were built to address this issue.

The Full Outer Join

Let’s start with an example: What if two of the tables (table_1 and table_2) I need to join are different sizes? That means there’s a different number of clients in table_1 than in table_2, which also means that if I just join both tables in their entirety, I might end up with some blank cells. This is called a “full outer join”:

SQLjoin_img_1

The full outer join basically smashes two tables together regardless of the content of their key columns. As long as you have two CLIENT columns, you can join the tables, regardless of whether the names in those columns actually match up. This can be useful in some cases, but in most cases it’s a hassle because you usually end up with records (rows) in your dataset that you don’t want or need.

SQLjoin_img_full_outer_diag
A full outer join combines all records in both tables, as long as there’s a key column.

The Inner Join

The inner join was specifically designed to address the problem above—it throws out records where there’s no matching key:

SQLjoin_img_inner

Take a moment to notice how the new table above is different from the one before it.  The key we joined on was “CLIENT”, and since Will Pastrami’s name didn’t show up in both “CLIENT” columns, it was removed from the result set. The traditional way of explaining the inner join is with the diagram below, which makes way more sense when combined with the diagram above than it does on its own:

SQLjoin_img_inner_diag
An inner join throws out all the records where the key column values don’t match.

Now, what if we wanted to make sure we included all of the records in one table, while throwing out unrelated records from the other?

The Left and Right Joins

These are generally the trickiest joins to wrap your head around. We grouped left and right joins together because they are essentially mirror images of each other. Left and right joins, as the name implies, keep all the records in either the left, or right table and only the records that match from the other one:

SQLjoin_img_left_join
A left join keeps all the content from Table 1, but throws out content from table 2 where the records don’t match.
SQLjoin_img_right_join
A right join keeps all the content from Table 2, but throws out content from Table 1 where the records don’t match.

One more quick note about left and right joins: you may have asked yourself the question—“in a database there’s no ‘left’ or ‘right’, so how do I know which join to use where?

SQLjoin_img_left_diag
Left Join
SQLjoin_img_right_join_diag
Right Join

 

 

 

                  

 

The simple way to remember this is that the left join always keeps the whole first table you list and only part of the second.

 

Left join= whole 1st table, right join = whole 2nd table.

 

If you make sure to always list your tables like this:

 

1st Table = whole table I need, 2nd table = partial table I need.

 

Then you’ll only ever have to use left joins and you can ignore right ones!

 

3. How do you actually write a join?

The syntax in a join is actually pretty simple. If you remember, in our previous article we talked about the from  statement.

You will remember that there are many different types of query languages. Generally speaking, SQL joins are always written as part of the from statement, since the from statement reflects the base tables you’re using to create your dataset.

There are three steps to writing a complete join statement. The first is to specify the type of join you want to use (inner/left/right/outer), then identify the table you want to join, and select the column that you want to join on. Let’s use our example from our last article again here, but with a join added in:

SQLjoin_img_syntax

 

That’s about it on the basics of SQL joins. If you want to learn more about relational databases and SQL click here. If you’re nervous about how to present yourself when faced with questions about SQL and databases in an interview, click here.

We hope you found this segment helpful and useful—if you liked it, please feel free to leave a comment below!

 

One thought on “Overview: SQL Joins

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s