SQL Exists Syntax

About this post: This post is designed to help you understand how to use the EXISTS keyword in your SQL query environment. All of the examples will be based on a fictional three table database for posting messages to an Internet message board. The examples will be generic so that they can apply to any database (Oracle, SQL Server, etc) that supports the EXISTS syntax.

As part of my job, I regularly deal with SQL coding. My current focus is on Oracle databases, but I occasionally get to play in DB2 and other platforms. I recently encountered some SQL code that I was not familiar with. In the WHERE clause of the code was an EXISTS statement with a sub query listed inside. It looked something like the following

SELECT 	*
FROM	Users U
WHERE 	ID EXISTS (SELECT NULL
		                  FROM PostRating PR
		                  WHERE PR.UserID = U.ID
		                  AND MarkedAsSpam >= 5)

Hard as I tried, I could not make sense of the query listed above. Why would you select NULL from the the table and what does it mean. In order to explore this further, we need to have a simple problem to solve.

First, there will be two database tables for this example. I’m going to write generic syntax to get the point across. You’ll have to modify this to your platform. The tables only have the columns that are relevant to the example. Were this a production table, they would have many more columns.

 

CREATE TABLE Users AS
ID          Number
Name     String

INSERT INTO Users (ID, Name) VALUES (1, 'Nice Guy')
INSERT INTO Users (ID, Name) VALUES (2, 'Bad Guy')

CREATE TABLE PostRating AS
UserID           Number
PostID           Number
SpamVotes    Number

INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (1, 1, 1)
INSERT INTO PostRating (UserID, PostID, SpamVotes) VALUES (2, 2, 20)

 

For the tables above, the Users table lists the valid users in the system. PostRatings table is used for people on the message board to mark a message as SPAM. Using the community to mark messages as spam can help clean up an open message board from unwanted content.

Before digging into the original query above, lets explore with some basic testing how EXISTS works.

 

--Returns 2 rows.
SELECT *
FROM Users

--Still Returns 2 rows
SELECT *
FROM   Users
WHERE  EXISTS(SELECT null
                          FROM PostRating)
             
--Returns 0 rows. 
--Note that the sub query returns zero rows.      
SELECT *
FROM   Users
WHERE  EXISTS(SELECT null
                          FROM   PostRating
	                  UserID = 999)  

--Returns 2 rows. 
--Note that the sub query returns 1 rows.      
SELECT *
FROM   Users
WHERE  EXISTS(SELECT null
                          FROM   PostRating
	                  WHERE  UserID = 1)


--Still return  2 rows  
--The sub query if for an Oracle database, but it works
--the same for any table that has rows in it.             
SELECT *
FROM   Users
WHERE  EXISTS (SELECT null
                           FROM dual)
                    
--Blows up, must be a query inside of the parentheses.                    
SELECT *
FROM   Users
WHERE  EXISTS (NULL)

 

Hopefully at this point, you have come to the same conclusion that I did after all the testing. If the query listed inside the EXISTS returns anything [including NULL] then the statement is true. If no records come back, then the record is false and don’t return anything in the whole query.

Now you might be asking why anybody would use this syntax. The power of this statement comes when you join the sub query of the EXISTS to the main query. You can quickly exclude records that don’t meet the criteria. In the original example above, I wanted to list all users that have had one or more posts listed as spam 5 or more times.

SELECT 	*
FROM	Users U
WHERE 	ID EXISTS (SELECT NULL
		                  FROM PostRating PR
		                  WHERE PR.UserID = U.ID
		                  AND MarkedAsSpam >= 5)

By joining the Users Table to the PostRating table in the WHERE statement of the sub query inside the Exists, I’m essentially asking, “Does this user have their MarkedAsSpam for any posted listed 5 or more times?” The result is that only 1 record will return for the ‘Bad Guy’ user, but it will ignore ‘Good Guy’ as their count is only at 1.

This would be great for an admin report to help know which users might be removed from the system for abuse, and it protects the good users are not abusing the system.

One final thought. Upon my initial research for EXISTS I could only thing that there are other ways to write the same query and get the result you’re looking for. However, I inherited a large code base where the developers preferred the EXISTS syntax. I don’t have the time to re-write and test working code, so I have learned to work with it. Now that I understand it, I have added it to my skills and even find it quite useful!

Let me know if you have questions or comments.

Hogan Haake