MySQL

‘EXISTS’ is better than ‘IN’

While working on my last project, I was facing a query which was taking 30 second to execute. The number of tables involved were 10 and total records were around 20k.  It was a search functionality for a web-app project.  There were lot of sub-queries involved in the first draft, all of them using ‘IN’. After doing some research ( within MySQL Community ) , I just replaced all ‘IN’  by ‘EXISTS’ and voila !  The query took 11 seconds to execute.

For example, you could change this query  :

SELECT *

FROM invitees i

WHERE i.event_id IN ( SELECT e.id

FROM event e

WHERE e.city LIKE = ‘ABC’ )

TO

SELECT *

FROM invitees i

WHERE EXISTS ( SELECT 1

FROM event e

WHERE e.city LIKE = ‘ABC’

AND e.id = i.event_id ).

Try it and let me know if you find the same difference in performance as I have. There were many more modifications  to the original query to bring it further down to 1 sec. But I will talk about that later.  For more information on  EXISTS and IN Sub Queries see Reference below.

References :

http://dev.mysql.com/doc/refman/5.0/en/optimizing-subqueries.html

http://dev.mysql.com/doc/refman/5.0/en/in-subquery-optimization.html

9 Comments

  • Jay Pipes

    1) You are using LIKE with no wildcard, and so it really should be = ‘ABC’

    2) You would get sub-second performance with a standard join:

    SELECT i.*
    FROM invitees i
    INNER JOIN event e
    ON e.id = i.event_id
    AND e.city = ‘ABC’;

  • Roland Bouman

    Hi there,

    in my opinion, these

    SELECT *
    FROM invitees i
    WHERE i.event_id IN (
    SELECT id
    FROM event e
    WHERE city LIKE ‘ABC’
    )

    Should all be rewritten to joins

    SELECT i.*
    FROM invitees i
    INNER JOIN event e
    ON i.event_id = e.id
    WHERE e.city LIKE ‘ABC’

    Also, the LIKE does not contain a wildcard, we can just as well write:

    WHERE e.city = ‘ABC’

    instead of

    WHERE e.city LIKE ‘ABC’

    Roland Bouman
    http://rpbouman.blogspot.com/

  • Rishi Agarwal

    Oh , I am sorry. Should have put LIKE ‘%ABC%’ or = ‘ABC’ for the example.

    Second, I agree with your example of using a standard join, it will be better. The 2 examples given were to show the difference between IN and EXISTS , so didn’t focus on JOIN. But yeah, there are ways to replace ‘IN’ Sub Queries with JOIN for better performance.

  • Andrew


    SELECT *

    FROM invitees i

    WHERE i.event_id IN ( SELECT e.id

    FROM event e

    WHERE e.city LIKE = ‘ABC’ )

    I believe this is slow because the inner query is evaluated, and then the outer query is evaluated – you do not reference the inner and outer queries directly.

    If you rewrote the query to be:

    SELECT *

    FROM invitees i

    WHERE i.event_id IN ( SELECT e.id

    FROM event e

    WHERE e.city LIKE = ‘ABC’ AND i.event_id = e.id )

    You would find the query to be much faster.

  • Ahmed

    Hi,

    I have this SQL query but instead of using IN i like to use the EXISTS operator, now how can I do that? My brain is hanging on this and need help. Thanks

    SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME
    FROM CUSTOMER C, ORDERS O
    WHERE
    C.CUSTOMER_NUM = O.CUSTOMER_NUM
    AND
    O.ORDER_DATE IN (‘2007-10-21’);

    • Rishi Agarwal

      Hi Ahmed,

      You can try this query :

      SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME
      FROM CUSTOMER C
      WHERE EXISTS (
      SELECT 1
      FROM ORDERS O
      WHERE O.ORDER_DATE = ‘2007-10-21’
      AND C.CUSTOMER_NUM = O.CUSTOMER_NUM
      )

  • Ajit

    Hi….
    can some buddy help me…to use EXISTS …how i can use EXISTS in following query insted of IN

    ……………………….
    …………………….
    ………..WHERE application_opus.company_id = ’50’
    AND application_opus.application_id
    IN (
    SELECT event_log_link_opus.event_log_link_reference_id
    FROM event_log_link_opus
    LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id
    LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id
    LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id
    WHERE event_log_link_type_opus.name_short = ‘application_id’
    AND event_log_type_opus.event_type IN (‘dteen_fraud_rule_130’)
    )

  • SacTiw

    After looking at what Rishi has to offer, you this might work:

    WHERE application_opus.company_id = ’50′
    AND EXISTS (
    SELECT 1
    FROM event_log_link_opus
    LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id
    LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id
    LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id
    WHERE event_log_link_type_opus.name_short = ‘application_id’
    AND event_log_type_opus.event_type = ‘dteen_fraud_rule_130′
    AND application_opus.application_id = event_log_link_opus.event_log_link_reference_id
    );

    I, btw, know it’s too late to answer this but I saw it today and thought of giving it a try!!! 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *