Postgresql – Left join without duplicate rows


I have two tables called record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,

SELECT,, AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON ( = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON ( = rech2.record_id AND rech2.ts > rech1.ts)

Here, I am getting the latest one by ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?

Best Answer

Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.

The duplicate results can be avoided in your method by adding a second condition besides the = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):

SELECT,, AS last_history_data
FROM record AS rec
     ( SELECT
       FROM record_history AS rech
       WHERE = rech.record_id
       ORDER BY rech.ts DESC
                -- , DESC               -- optional
       LIMIT 1 
     ) AS rech
     ON TRUE

Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:

LEFT OUTER JOIN record_history rech2 
ON = rech2.record_id 
   AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND >

By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).