It is entirely possible to try to JOIN two tables on almost any field, as long as the two data types can, in some way, be made to reconcile to each other, you can try to join the tables. But, most database designs assume a much more directly relationship and provide a column or columns in one table that match the identifying column or columns in the other table.
INNER JOIN
The INNER JOIN will return the parts of both data sets that match. Frequently, what you’ll see when joining two tables is the same column name in each table. With that, you have to be sure to identify the owner of each column. I’ve introduced what is called an alias to make it so I don’t have to type as much. This query will return all data from both tables where there is a match between the appropriate columns.
SELECT * FROM Personnel.Person AS p JOIN Finance.FinancialTransaction AS ft ON p.PersonID = ft.PersonID;
The INNER JOIN operation doesn’t require the use of the word INNER. I find it more readable to just leave it off. You can see that you list the first table in the FROM clause and then you add the JOIN operator and list the second table. Then you have to supply the criteria through the ON clause. Note that both tables have aliases and the query uses the appropriate alias in the ON clause to identify the matching column on both tables.
Returning all the columns from both tables just doesn’t make sense, even if you really want all the data from both tables. Just as an example, do you really need two copies of the PersonID column? It’s always best to limit the SELECT list, but I wanted to show you what happens when you use the star.
To add another table, you just add it to the FROM clause like you did the first one. This query includes the Company table and limits the select list as appropriate.
SELECT c.CompanyName, p.LastName + ', ' + p.FirstName AS 'PersonName', ft.TransactionAmount, ft.TransactionDate FROM Personnel.Person AS p JOIN Finance.FinancialTransaction AS ft ON p.PersonID = ft.PersonID JOIN Management.Company AS c ON ft.CompanyID = c.CompanyID;
The SELECT statement operates in the same way as you’ve used it previously. The only difference is that in order to designate where each column came from you have to include the table alias. The additional table added as a relationship just required another JOIN operator and another ON clause to define the unique relationship. It doesn’t matter the order of the JOIN operations or which column is listed first in the ON clause. SQL Server understands the JOIN syntax and will figure out how best to put the tables together based on the information you provided in the TSQL statement.
You can just keep going from there. You can continue adding tables as long as there are relationships to support them. You can also use derived tables in the same way. It all comes down to being able to establish an appropriate relationship between the sets of data, or tables, and then using the syntax to define that relationship.
Conclusion
This is just the beginning of joining tables together. You may have tables that have more than one column that define the JOIN. There you simply add an AND to the ON clause and define that additional relationship. There’s much more to this than immediately meets the eye. I’ll address a lot more in the next Fundamentals post which will be on the OUTER JOIN.
Brilliant Grant. Love the simplicity of the article, it’s straight to the point, it’s in plain English and easy to absorb. Hopefully next articles will follow the same pattern! Thanks.
Thanks. I’ll sure try.
I usually leave off the “AS” as well when using an alias for a table name.
While I know it’s just syntactic sugar (I’ll never be able to thank Phil Factor enough for that phrase), I like using the AS. I feel it clarifies the intent of the code.
Grant, about the AND. If I was to say SELECT (whatever) FROM TableA INNER JOIN TableB ON TableA.Id = TableB.Id
AND TableA.MyVal = 0
What goes on there exactly? Does TableB basically join onto a TableA where MyVal is only 0? How does that work?
Yeah. That’s what happens. Effectively, the ON clause is just another way to write a WHERE clause (I haven’t posted that one yet). So, if you restricted it through that AND clause, it would be where MyVal = 0 AND the Id values matched.
I lost you on “So, if you restricted it through that AND clause, it wouldn’t be where MyVal = 0 AND the Id values matched”.
In the code I posted, would MyVal (column) in TableA come back with just 0 or not? Got a bit confused…
Sorry, typo or auto-correct. It WOULD be… I’ll edit it to make it more clear and accurate.
Crystal clear. Thanks Grant.