New Sberry

Sberry came out with a sensible understanding of inner join this week, let’s go and learn from it


Joins ( especially Inner join as in my case) when not performed scrupulously can raise suspicion about the data and cause a whole lot of mess. We may end up trying to figure out what the heck is. I encountered this scenario and was startled by the result set produced. To demonstrate the point let’s consider the following tables with their associations.

If OBJECT_ID(N'Customers', N'U') is NOT NULL DROP table dbo.Customers;
CREATE TABLE [dbo].[Customers] (
[Id] int NOT NULL Primary key,
[Name] nvarchar(50) NULL,
[ReferredBy] int NULL);

If OBJECT_ID(N'Invoices', N'U') is NOT NULL DROP table dbo.Invoices;
CREATE TABLE [dbo].[invoices] (
[Id] int NOT NULL primary key,
[BillingDate] datetime NULL,
[CustomerId] int NULL);

If OBJECT_ID(N'Payments',N'U') is NOT NULL DROP table dbo.Payments;
CREATE TABLE [dbo].[Payments](
[id] int primary key,
[money] int,
[CustomerId] int references Customers(Id),
[InvoiceId] int references Invoices(Id)
);

Let’s fill some data to play within the tables
insert into [dbo].[Customers]([Id],[Name],[ReferredBy])
values (20,'topal',null)
insert into [dbo].[Customers]([Id],[Name],[ReferredBy])
values(20,'topal',null)
insert into [dbo].[Customers]([Id],[Name],[ReferredBy]) values(21,'Audacioius',20)
insert into [dbo].[Customers]([Id],[Name],[ReferredBy])
values (22,'riveting',20)
insert into [dbo].[Customers]([Id],[Name],[ReferredBy])values (23,'Smack',20)
insert into [dbo].[Customers]([Id],[Name],[ReferredBy])
values (24,'sack',21)

insert into [dbo].[invoices]([Id],[BillingDate],[CustomerId])
values (3,'2017-07-30 17:32:29',20),(4,'2017-07-30 17:32:29',21)
(5,'2017-07-30 17:32:29',21),(6,'2017-07-30 17:32:29',21)
(7,'2017-07-30 17:32:29',22)

insert into [dbo].[Payments]([id],[money],[CustomerId],[InvoiceId]) values (100,50,20,3),
(101,55,21,4)

Let’s examine the simple queries step and step and analyze what is the cause of the mess. Suppose we want to find the details of the customers along with their invoice details whose payments have been completed.
Here goes Query 1.

SELECT c.Id AS CustomerId, c.[Name] AS Customer_Name,i.BillingDate,i.Id
FROM Customers c
INNER JOIN invoices i
ON i.CustomerId = c.Id

We are all familiar with this form of simple inner join. This produces the result as expected and is provided below. This result set indicates the customers whose invoices are generated.

Query1Now comes Query2

SELECT c.Id AS CustomerId, c.[Name] AS Customer_Name, p.Money AS Money_Paid,i.Id,i.BillingDate
FROM Customers c
INNER JOIN invoices i
ON i.CustomerId = c.Id
INNER JOIN Payments p
ON p.CustomerId = c.Id

This is nothing hard fast query, instead, it’s a simple query as Query1 with an additional join on Payments table.

Query2

Notice this result set, we got four rows. This is so because the condition for the Join with Payments i.e p.CustomerId=c.Id matches for every row in the result set 1(Inner join of Customers and Invoices) with every row in payments and produces the above result set. I expected the result set would show Money_Paid for invoice 5 and 6 to be zero or null but which is not the case.
This indicates that the Customer with code 21 and Invoice Id 5 and 6 have paid money 55. (Good Luck bestowed 21 ;))
This, in fact, is the fallacious result set for our requirement. This happened since we didn’t specify the other foreign key relationship of  Payments with invoices when joining it.
Query 3 outputs the desired reliable result.

SELECT c.Id AS CustomerId, c.[Name] AS Customer_Name, p.Money AS Money_Paid,i.Id,i.BillingDate
FROM Customers c
INNER JOIN invoices i
ON i.CustomerId = c.Id
INNER JOIN Payments p
ON p.CustomerId = c.Id
and p.InvoiceId = i.Id

Query3

The above result set is flawless with correct data. Hence we need to be meticulous when dealing with joins on tables and provide the exact conditions on what we are intended for achieving.

Happy Learning 🙂

Advertisements