Similarly, the columns that came from the right source table contain all the rows that were in the right source table. For those rows, the columns that came from the left source table all have the null value. Thus, the table resulting from a union join contains all the columns of both source tables — and the number of rows it contains is the sum of the number of rows in the two source tables.
The result of a SQL union join by itself is not immediately useful in most cases; it produces a result table with many nulls in it. But you can get useful information from a union join when you use it in conjunction with the COALESCE
expression. Look at an example.
Suppose that you work for a company that designs and builds experimental rockets. You have several projects in the works. You also have several design engineers who have skills in multiple areas. As a manager, you want to know which employees, having which skills, have worked on which projects. Currently, this data is scattered among the EMPLOYEE table, the PROJECTS table, and the SKILLS table.
The EMPLOYEE table carries data about employees, and EMPLOYEE.EmpID
is its primary key. The PROJECTS table has a row for each project that an employee has worked on. PROJECTS.EmpID
is a foreign key that references the EMPLOYEE table. The SKILLS table shows the expertise of each employee. SKILLS.EmpID
is a foreign key that references the EMPLOYEE table.
The EMPLOYEE table has one row for each employee; the PROJECTS table and the SKILLS table have zero or more rows.
The following tables show example data.
EmpID | Name |
1 | Ferguson |
2 | Frost |
3 | Toyon |
ProjectName | EmpID |
X-63 Structure | 1 |
X-64 Structure | 1 |
X-63 Guidance | 2 |
X-64 Guidance | 2 |
X-63 Telemetry | 3 |
X-64 Telemetry | 3 |
Skill | EmpID |
Mechanical Design | 1 |
Aerodynamic Loading | 1 |
Analog Design | 2 |
Gyroscope Design | 2 |
Digital Design | 3 |
R/F Design | 3 |
Now suppose that, as a manager, you want to see all the information about all the employees. You decide to apply an equi-join to the EMPLOYEE, PROJECTS, and SKILLS tables:
SELECT * FROM EMPLOYEE E, PROJECTS P, SKILLS S WHERE E.EmpID = P.EmpID AND E.EmpID = S.EmpID ;You can express this same operation as an inner join by using the following syntax:
Both formulations give the same result.
E.EmpID | Name | P.EmpID | ProjectName | S.EmpID | Skill |
1 | Ferguson | 1 | X-63 Structure | 1 | Mechanical Design |
1 | Ferguson | 1 | X-63 Structure | 1 | Aerodynamic Loading |
1 | Ferguson | 1 | X-64 Structure | 1 | Mechanical Design |
1 | Ferguson | 1 | X-64 Structure | 1 | Aerodynamic Loading |
2 | Frost | 2 | X-63 Guidance | 2 | Analog Design |
2 | Frost | 2 | X-63 Guidance | 2 | Gyroscope Design |
2 | Frost | 2 | X-64 Guidance | 2 | Analog Design |
2 | Frost | 2 | X-64 Guidance | 2 | Gyroscope Design |
3 | Toyon | 3 | X-63 Telemetry | 3 | Digital Design |
3 | Toyon | 3 | X-63 Telemetry | 3 | R/F Design |
3 | Toyon | 3 | X-64 Telemetry | 3 | Digital Design |
3 | Toyon | 3 | X-64 Telemetry | 3 | R/F Design |
SELECT
statements, to produce a more suitable result. You begin with the basic SQL union join:Notice that the union join has no ON
clause. It doesn't filter the data, so an ON
clause isn’t needed. This statement produces the result shown in the following table.
E.EmpID | Name | P.EmpID | ProjectName | S.EmpID | Skill |
1 | Ferguson | NULL | NULL | NULL | NULL |
NULL | NULL | 1 | X-63 Structure | NULL | NULL |
NULL | NULL | 1 | X-64 Structure | NULL | NULL |
NULL | NULL | NULL | NULL | 1 | Mechanical Design |
NULL | NULL | NULL | NULL | 1 | Aerodynamic Loading |
2 | Frost | NULL | NULL | NULL | NULL |
NULL | NULL | 2 | X-63 Guidance | NULL | NULL |
NULL | NULL | 2 | X-64 Guidance | NULL | NULL |
NULL | NULL | NULL | NULL | 2 | Analog Design |
NULL | NULL | NULL | NULL | 2 | Gyroscope Design |
3 | Toyon | NULL | NULL | NULL | NULL |
NULL | NULL | 3 | X-63 Telemetry | NULL | NULL |
NULL | NULL | 3 | X-64 Telemetry | NULL | NULL |
NULL | NULL | NULL | NULL | 3 | Digital Design |
NULL | NULL | NULL | NULL | 3 | R/F Design |
Notice that the table has three ID columns, two of which are null in any row. You can improve the display by coalescing the ID columns. The COALESCE
expression takes on the value of the first non-null value in a list of values. In the present case, it takes on the value of the only non-null value in a column list:
SELECT COALESCE (E.EmpID, P.EmpID, S.EmpID) AS ID, E.Name, P.ProjectName, S.Skill FROM EMPLOYEE E UNION JOIN PROJECTS P UNION JOIN SKILLS S ORDER BY ID ;The
FROM
clause is the same as in the previous example, but now the three EMP_ID
columns are coalesced into a single column named ID
. You're also ordering the result by ID
. The following table shows the result.
ID | Name | ProjectName | Skill |
1 | Ferguson | X-63 Structure | NULL |
1 | Ferguson | X-64 Structure | NULL |
1 | Ferguson | NULL | Mechanical Design |
1 | Ferguson | NULL | Aerodynamic Loading |
2 | Frost | X-63 Guidance | NULL |
2 | Frost | X-64 Guidance | NULL |
2 | Frost | NULL | Analog Design |
2 | Frost | NULL | Gyroscope Design |
3 | Toyon | X-63 Telemetry | NULL |
3 | Toyon | X-64 Telemetry | NULL |
3 | Toyon | NULL | Digital Design |
3 | Toyon | NULL | R/F Design |
ProjectName
column has a non-null value, the row names a project on which the employee has worked. If the Skill
column is not null, the row names one of the employee's skills.You can make the result a little clearer by adding another COALESCE
to the SELECT
statement, as follows:
SELECT COALESCE (E.EmpID, P.EmpID, S.EmpID) AS ID, E.Name, COALESCE (P.Type, S.Type) AS Type, P.ProjectName, S.Skill FROM EMPLOYEE E UNION JOIN (SELECT "Project" AS Type, P.* FROM PROJECTS) P UNION JOIN (SELECT "Skill" AS Type, S.* FROM SKILLS) S ORDER BY ID, Type ;In this union join, the PROJECTS table in the previous example is replaced with a nested
SELECT
that appends a column named P.Type
with a constant value "Project"
to the columns coming from the PROJECTS table. Similarly, the SKILLS table is replaced with a nested SELECT
that appends a column named S.Type
with a constant value "Skill"
to the columns coming from the SKILLS table. In each row, P.Type
is either null or "Project"
, and S.Type
is either null or "Skill"
.The outer SELECT
list specifies a COALESCE
of those two Type
columns into a single column named Type
. You then specify Type
in the ORDER BY clause
, which sorts the rows that all have the same ID in an order that puts all projects first, followed by all skills. The result is shown in the following table.
ID | Name | Type | ProjectName | Skill |
1 | Ferguson | Project | X-63 Structure | NULL |
1 | Ferguson | Project | X-64 Structure | NULL |
1 | Ferguson | Skill | NULL | Mechanical Design |
1 | Ferguson | Skill | NULL | Aerodynamic Loading |
2 | Frost | Project | X-63 Guidance | NULL |
2 | Frost | Project | X-64 Guidance | NULL |
2 | Frost | Skill | NULL | Analog Design |
2 | Frost | Skill | NULL | Gyroscope Design |
3 | Toyon | Project | X-63 Telemetry | NULL |
3 | Toyon | Project | X-64 Telemetry | NULL |
3 | Toyon | Skill | NULL | Digital Design |
3 | Toyon | Skill | NULL | R/F Design |
Considering the number of SQL JOIN
operations available, relating data from different tables shouldn't be a problem, regardless of the tables’ structure. You can trust that if the raw data exists in your database, SQL has the means to get it out and display it in a meaningful form.