PostgreSQL – Very slow PostgreSQL aggregation


EDIT: docker image for testing with the same data (obfuscated)

$ docker run --rm --name pg -d homer5439/pgagg
$ docker exec -ti pg bash
# createdb -U postgres test; zcat /tmp/corr.sql.gz | psql -U postgres test
# psql -U postgres test

and you can run the examples below.

I was wondering why the following query takes more than 30 seconds on PostgreSQL 12.2 (the SELECT and FROM lines are reconstructed from the query plan below):

    SELECT contract_id,
      array_agg(corr) AS corr
    FROM corr
    GROUP BY contract_id;

More info: the corr table has about 150000 records. Each record has the following structure:

gse=# \d corr ;
                                         Table "public.corr"
    Column   |  Type   | Collation | Nullable |                             Default
 ------------+---------+-----------+----------+-----------------------------------------------------------------
 corr_id     | integer |           | not null | nextval('corr_corr_id_seq'::regclass)
 contract_id | integer |           |          | 
 start_date  | date    |           |          | 
 corr_type   | text    |           |          | 
 descr       | text    |           |          | 
 credit      | numeric |           |          | 
 detail      | text    |           |          | 
Indexes:
    "corr_pkey" PRIMARY KEY, btree (corr_id)
    "corr_contract_id_idx" btree (contract_id)

The detail field contains text up to 2-3 MB long (about 10% of the records have detail that large; the rest have around 10-20 KB).
The number of distinct contract_id values is currently 2317.

Following other suggestions found on this site, I've tried changing the value of work_mem up to 10GB without noticeable changes.
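For reference, work_mem can be raised for a single session before re-running the query, so no server-wide change is needed (the 10GB value here is just the one tried above):

    -- applies only to the current session
    SET work_mem = '10GB';
    -- confirm the new value
    SHOW work_mem;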

Here's the query plan:

                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=9883.29..9911.57 rows=2263 width=36) (actual time=1184.971..1357.309 rows=2317 loops=1)
   Output: contract_id, array_agg(corr.*)
   Group Key: corr.contract_id
   Buffers: shared hit=78012 read=49899
   ->  Seq Scan on public.corr  (cost=0.00..9320.19 rows=112619 width=571) (actual time=0.057..959.359 rows=112619 loops=1)
         Output: contract_id, corr.*
         Buffers: shared hit=78012 read=49899
 Planning Time: 0.131 ms
 Execution Time: 1357.747 ms

Best Answer

If by running it "alone" you mean running it in psql just without the EXPLAIN ANALYZE, then psql is reading the entire result set into memory and scanning it to find the longest entry in each column, so that it can format the output to that width. That can be slow, though I would not expect 30 seconds for 2317 rows!
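One way to test whether this client-side formatting is the bottleneck is to compare the default aligned output against unaligned output, which skips the column-width pass; a sketch using standard psql metacommands (substitute the aggregation query from the question):

    \timing on               -- report elapsed time for each query
    -- aligned (default): psql scans the whole result to size columns
    SELECT contract_id, array_agg(corr) AS corr FROM corr GROUP BY contract_id;
    -- unaligned: no width computation
    \pset format unaligned
    SELECT contract_id, array_agg(corr) AS corr FROM corr GROUP BY contract_id;

If the unaligned run is dramatically faster, the time is going into formatting rather than the query itself.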

Except I see you don't have a "corr" column in your "corr" table, so what you are aggregating is entire rows, which you already said include a column that can be very wide. So you are slinging a huge amount of data at psql, and it takes a long time to process it, perhaps swapping along the way.
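If you don't actually need the full detail text in the aggregated result, the usual fix is to aggregate only the narrow columns instead of the whole row; a sketch (the column choice here is illustrative):

    SELECT contract_id,
           array_agg(corr_id) AS corr_ids,                 -- just the keys
           array_agg((start_date, credit)) AS summaries    -- a few narrow columns as records
    FROM corr
    GROUP BY contract_id;

This keeps the multi-megabyte detail values out of the result set entirely, shrinking both the server-side aggregation and the data psql has to format.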