Postgresql – Filter and append data to row

aggregatejsonpostgresqlpostgresql-9.3

My query and my data set (copied from sqlfiddle):

CREATE TABLE one (
    id int4 primary key, 
    p_id int4, 
    k_id int4
);
CREATE TABLE two(
   id int4 primary key, 
   p_id int4, 
   k_id int4,
   t_id int4,
   pos int4
);

INSERT INTO one(id, p_id, k_id) VALUES 
    (1, 1, 1), 
    (2, 1, 2), 
    (3, 1, 3), 
    (4, 1, 4);
INSERT INTO two(id, p_id, k_id, t_id, pos) VALUES 
    (1, 1, 1, 1, 1), -- t_id = 1 and pos = 1
    (2, 1, 2, 1, 2),
    (3, 1, 3, 1, 1), -- t_id = 1 and pos = 1
    (4, 1, 4, 1, 3),
    (5, 1, 1, 2, 3), -- shares p_id and k_id with row 1
    (6, 1, 2, 2, 1),
    (7, 1, 3, 2, 5), -- shares p_id and k_id with row 3
    (8, 1, 4, 2, 6);

Table one is a join table, I've added it because in the future I might add more columns into it, and then it will be easier to modify this report.

What I want is to be able to filter my dataset like so:

  1. Get all the rows where pos = 1 for t_id = 1

Expected output:

p_id, k_id, stats
1,     1,    [{p_id: 1, k_id: 1, t_id: 1, pos: 1}, {p_id: 1, k_id: 1, t_id: 2, pos: 3}
1,     3,    [{p_id: 1, k_id: 3, t_id: 1, pos: 1}, {p_id: 1, k_id: 3, t_id: 2, pos: 5}

As you can see I want to append to my results the data for others t_ids where p_id and k_id are the with the found results.

Update:

What will be the best way of removing duplicates?

Adding (9, 1, 1, 1, 20) to my dataset will result in having the same type twice in a row;

One solution to get only uniq types and the smallest pos values is:

WITH uniqt AS (
  SELECT   p_id, k_id, t_id, min(pos) as pos
  FROM     two
  GROUP BY 1, 2, 3

)

Because I am selecting from table one 50 rows (limit 50) and then join a table with SELECT like in @Erwin Brandstetter example, having this CTE will slow down my query?

Best Answer

You could use an inner join to filter for rows that share a (p_id, k_id) value with rows that have a (t_id, pos) of (1,1) (sqlfiddle):

select  base.p_id
,       base.k_id
,       json_agg(base order by base.p_id, base.k_id, base.t_id, base.pos) as stats
from    (
        select  p_id
        ,       k_id
        ,       t_id
        ,       pos
        from    two
        ) base
join    two as filter
on      filter.t_id = 1
        and filter.pos = 1
        and filter.p_id = base.p_id
        and filter.k_id = base.k_id
group by
        base.p_id
,       base.k_id;

Or an equivalent way using an exists subquery (sqlfiddle):

select  base.p_id
,       base.k_id
,       json_agg(base order by base.p_id, base.k_id, base.t_id, base.pos) as stats
from    (
        select  p_id
        ,       k_id
        ,       t_id
        ,       pos
        from    two
        ) base
where   exists
        (
        select  *
        from    two as filter
        where   filter.t_id = 1
                and filter.pos = 1
                and filter.p_id = base.p_id
                and filter.k_id = base.k_id
        )
group by
        base.p_id
,       base.k_id;

Updated to use json_agg from @ErwinBrandstetter's answer.