Understanding Row Relationships in Joins: Mastering Outer Joins for Relational Databases

Understanding Row Relationships in Joins

When working with databases, particularly relational databases like MySQL or PostgreSQL, joining tables is a common operation. However, understanding how to join rows from different tables can be challenging. In this article, we’ll explore the basics of joins and how to use them effectively.

Table Schema and Data

To better understand the problem, let’s examine the table schema and data provided in the question:

-- Create tables
drop table person;
drop table interest;
drop table relation;

create table person (
  pid int primary key,
  fname varchar2(20),
  age int,
  interest int references interest(intID),
  relation int references relation(relID)
);

create table interest (
  intID int primary key,
  intName VARCHAR2(20)
);

create table relation (
  relID int primary key,
  relName varchar2(20)
);
-- Insert data
insert into person values(1, 'Rahul', 18, null, 1);
insert into person values(2, 'Sanjay', 19, 2, null);
insert into person values(3, 'Ramesh', 20, 4, 5);
insert into person values(4, 'Ajay', 17, 3, 4);
insert into person values(5, 'Edward', 18, 1, 2);

insert into interest values(1, 'Cricket');
insert into interest values(2, 'Football');
insert into interest values(3, 'Food');
insert into interest values(4, 'Books');
insert into interest values(5, 'PCGames');

insert into relation values(1, 'Friend');
insert into relation values(2, 'Friend');
insert into relation values(3, 'Sister');
insert into relation values(4, 'Mom');
insert into relation values(5, 'Dad');

The Original Query

The query provided in the question is:

select person.fname, interest.intName, relation.relName
from person, interest, relation
where person.interest = interest.intID and relation.relID = person.relation;

This query attempts to retrieve the first name of individuals from the person table who have a matching intID in the interest table and a corresponding relID in the relation table.

Expected Output

The expected output is:

FNAME                INTNAME              RELNAME
-------------------- -------------------- --------------------
Rahul                                     Friend
Edward               Cricket              Friend
Ajay                 Food                 Mom
Ramesh               Books                Dad
Sanjay               Football

However, the actual output from the original query is different.

The Issue: Null Values

The issue lies in how SQL handles NULL values when performing joins. In a join operation, if there are no matching rows between two tables, then all columns with NULL values will be marked as NULL.

Let’s examine what happens in each table:

  • Person Table: Ramesh, Edward and Sanjay have corresponding rows in the interest table. However, Ajay has a NULL value for relation since there is no matching row in the person table.
  • Interest Table: The tables are populated with data.

The Solution: Using Outer Joins

To resolve this issue and retrieve all rows from both tables (even if they don’t match), we can use an outer join. Specifically, using a LEFT JOIN will allow us to include all records from the left table (person) even if there is no matching record in the right table (interest or relation). A RIGHT JOIN would be equivalent but may not provide the same results due to the default behavior of SQL.

Here’s how we can modify our query to use a LEFT JOIN:

SELECT p.fname, i.intName, r.relName
FROM person p
LEFT JOIN interest i ON p.interest = i.intID
LEFT JOIN relation r ON r.relid = p.relation;

By using LEFT JOIN, we ensure that all rows from the person table are included in our results, along with their corresponding values from the interest and relation tables.

Using Different Types of Outer Joins

We can also use RIGHT JOIN or FULL OUTER JOIN to achieve a similar result. However, it’s worth noting that:

  • RIGHT JOIN is not supported by all databases (for example, PostgreSQL does not support RIGHT JOIN, but you can achieve the same result using FULL OUTER JOIN).
  • FULL OUTER JOIN returns all records when there is no match between two tables. This might be useful in certain situations.

Conclusion

The main takeaway here is that SQL joins require careful consideration of null values and table relationships. Using outer joins like LEFT JOIN, RIGHT JOIN, or FULL OUTER JOIN can help you retrieve more complete results, including rows with NULL values. Remember to consider the different database options and limitations when working with joins.

Best Practices

  • Always verify your data by checking the database schema, data types, and nulls.
  • When using joins, start with an inner join and then add outer joins as needed.
  • Don’t assume that a left or right join will always work because both tables have matching columns.

Last modified on 2023-08-11