Mysql – Large table random select with some peculiarity


I have a table with 800k rows and I need to select 20 random rows that not are in other 2 tables (tickets already purchased).

I already tried a SQL with NOT IN syntax, but it's still (sometimes 1s in development machine).

Actual SQL:

SELECT id, description 
FROM ticket
                  FROM cart_item 
                  WHERE DATE(added) = CURDATE() ) 
  AND id NOT IN ( SELECT id 
                  FROM valid_ticket )
  AND id BETWEEN 1 AND 200000

I need some faster. Because grabbing all table and select 20 rand rows is very slow. That function is executed is all pages of a website and I'm thinking how to optimize and cache that query.

Thanks all for support

Best Answer

First, ORDER BY RAND() will collect all the rows, tack on RAND(), sort, and then start work.

Actually, it is not that bad, since you have filtering. But it may not help a lot.

AND id BETWEEN 1 AND 200000

limits the work to 200K instead of 800K.

The other filters involve what might be a terribly inefficient construct: NOT IN ( SELECT ... ). It may be turned into a different construct. Please provide EXPLAIN SELECT ...

DATE(added) = CURDATE()

Assuming added is DATETIME or TIMESTAMP, and there are not values in the future, then change to

added >= CURDATE()

and have INDEX(added, id).

After I see the EXPLAIN, I may suggest using EXISTS or LEFT JOIN.