Postgresql – add the missing rows to each group of year/date


i have this table –>

and there is another table –>

what i need is to show ALL categories regardless of the quantity. if there have been no sales in a certain category, I also want to show a record where the quantity is zero

The result I want to get is like the following –>

I have a table with many records… the sales_per_month table is the result of a query to get the subtotals.
the categories table has an ID column, I just haven't put it here for simplicity's sake

with this query

select b.year, b.month, a.category, coalesce(c.quantity, 0) as quantity
from categories a
cross join sales_per_month b
left join sales_per_month c on c.year = b.year and c.month = b.month and c.category = a.category
group by b.year, b.month, a.category
order by b.year desc, b.month desc, a.category

I get all the records from the categories table, but only once, but what I need is all categories to be repeated for each year/month group.

i need all categories to always appear in each group, as my ultimate goal is to add six more columns to this table from other tables that are exactly the same but with different totals. in short, i want to normalize the number of rows. i want the seven tables to have exactly the same number of rows (the same number of categories) so that i can add the other columns of totals.

expected result –>

Best Answer

The first step is to get the data into a usable format - you should be using fiddles rather than the image format in your various pastes - for this reason. The fiddle for this answer is to be found here.

Create the category and sale tables (recommendation, only use singular names for tables - they are (or should be) sets).

  cat_id   SERIAL,
  cat_name TEXT,
  CONSTRAINT category_pk PRIMARY KEY (cat_id)


CREATE TABLE sale -- Note the added CONSTRAINTs!
  sale_year  INTEGER NOT NULL,
  sale_month INTEGER NOT NULL,
  sale_cat   INTEGER NOT NULL,
  quantity   INTEGER NOT NULL CONSTRAINT sale_qty_gt_0 CHECK (quantity > 0),
  CONSTRAINT sale_category_fk FOREIGN KEY (sale_cat) REFERENCES category (cat_id),
  CONSTRAINT natural_key_uq UNIQUE (sale_year, sale_month, sale_cat)

and the data:

INSERT INTO category (cat_name) VALUES ('hat'), ('shoe'), ('umbrella'), ('shirt');  

INSERT INTO sale VALUES (2018, 12, 1, 100);
INSERT INTO sale VALUES (2018, 12, 2, 200);
INSERT INTO sale VALUES (2018, 12, 3, 300);
INSERT INTO sale VALUES (2018, 11, 2, 200);
INSERT INTO sale VALUES (2018, 11, 3, 300);
INSERT INTO sale VALUES (2018, 11, 4, 400);
INSERT INTO sale VALUES (2018, 10, 1, 100);
INSERT INTO sale VALUES (2018, 10, 3, 100);
INSERT INTO sale VALUES (2018, 10, 4, 400);

Then, we have to have a CROSS JOIN getting all the years, months and category ids. The problem with a triple CROSS JOIN is that there'll be many duplicates - hence you use the DISTINCT clause in the first table:

SELECT DISTINCT s.sale_year, s.sale_month, c.cat_id
FROM sale s, category c;

which gives results like this:

sale_year   sale_month  cat_id
2018    10  1
2018    10  2
2018    10  3
... results snipped for brevity - see fiddle

which becomes the resultset (x in this example) for a further query as follows:

SELECT x.sale_year, x.sale_month, COALESCE(s1.quantity, 0) AS "Qty",  x.cat_id
FROM sale s1
  SELECT DISTINCT s.sale_year, s.sale_month, c.cat_id
  FROM sale s, category c
  ORDER BY s.sale_year, s.sale_month, c.cat_id
) x
ON  s1.sale_year = x.sale_year
AND s1.sale_month =  x.sale_month
AND s1.sale_cat = x.cat_id
ORDER BY sale_year, sale_month, cat_id;


sale_year   sale_month  Qty     cat_id
2018                10  100          1
2018                10    0          2  <<--- notice 0 instead of NULL/blank
2018                10  100          3
... again, snipped for brevity - again see fiddle

Note the use of the RIGHT JOIN in order to get every year, month and category and the use of the COALESCE function to get 0 for no sales instead of NULLs

There are a few articles on my profile about how to ask questions here - you might want to add a fiddle to any future questions you may have. However, I've given you an upvote for a decent first question. p.s. welcome to the forum! :-)

p.p.s. there's another way of doing this - by making use of a CTE (COMMON TABLE EXPRESSION (also known as the WITH clause)) - I'm not sure of the performance implications of both solutions - if I get a chance, I'll have a look at this too - but here is the solution (with fiddle to boot):

WITH cte1 AS
  SELECT DISTINCT s.sale_year, s.sale_month, c.cat_id
  FROM sale s, category c
SELECT c1.sale_year, c1.sale_month, c1.cat_id, COALESCE(s1.quantity, 0) AS "Qty"
FROM cte1 c1
LEFT JOIN sale s1 
ON  c1.sale_year  = s1.sale_year
AND c1.sale_month = s1.sale_month
AND c1.cat_id     = s1.sale_cat
ORDER BY sale_year, sale_month, cat_id;

Same result!