Recently, I faced a very simple task which in the end does not seem so obvious to me. I was told to rewrite an existing report, in which SQL query amongst others, was filtering by values of a column which had to be in a given set of values. And I needed to add another column for the same check.
Ok, let's go step by step to make it easier. Imagine a situation when you need to retrieve all rows from
table1 where values of
column1 are in a given set of values. I bet you have already written a query in your head and it looks something like this:
select * from table1 where column1 in (1, 2, 3);
But what if your task was a bit harder - and you need to fetch all rows where the value of any
column2 is in the given set.
The obvious answer is to use
select * from table1 where column1 in (1, 2, 3) or column2 in (1, 2, 3);
Looks ugly, doesn't it? And is this query efficient? Especially, when this action is only one of many you need to do with your data (so your query is a bit more complex)? I doubt it.
If you think this example is something unnatural and simulated, let's consider this - imagine, you have a trading platform and for each order you track identifiers of a buyer and a seller. But these two are of the same type. I mean, both the buyer and the seller are organizations, for example. And you need to show your users all their orders - the orders where the buyer OR the seller is their organization.
So, you have the table
order with the columns
seller_id and the query to get a user's orders would be:
select * from order where buyer_id in (select organization_id from employee where employee_id = :employee_id) or seller_id in (select organization_id from employee where employee_id = :employee_id);
And it looks even worse. If you think the same way, let's sort out how to improve it together. To do so, we need to remember about SET operators available in Oracle. Indeed, they are a great feature, but a bit underused to my mind. To rewrite our query, I will be using the
intersect operator and the query transforms to:
select * from order o where exists ( ( select o.buyer_id from dual union all select o.seller_id from dual ) intersect ( select organization_id from employee where employee_id = :employee_id ) );
Better? For sure! Why? Look:
- This query is more efficient in terms of performance (using
ORin a complex query in an anti-pattern since it's hard for the optimizer to cope with such a query).
- I do not repeat the filtering subquery anymore.
- Need another column to be checked (i.e.
reseller_id)? Not a problem - just add it into the first checking subquery.
- It is much easier to integrate such a subquery into a complex query (imagine you have many filters, joins and so on).
- This approach is suitable for a dynamically constructed report (using dynamic SQL) - this was my case, actually.
Note: Mind nullable columns - you may need to add
and <column_name> is not nullin your
whereclauses in the subqueries if you don't want to intersect on
I also tried to suggest the same in a question on StackOverflow.
Hope it'll help somebody else when they face the same task.