I have a Table which tracks User activity (i-e user started a session at what time ?). This table contains the data from Dec 2018 Till now. I need to calculate monthly retention (not based on signup date) based on user Activity( i-e In December 2018, 500 users were active. Then, how many of them were active in Jan,Feb,Mar….Till Now?Same activity should be performed for Jan 2019,Feb 2019… till now users).
I have tried the hard coded way which is to get the users of Dec 2018 in on Table and then Get the Jan 2019 users in other Table and join both tables based on the user_ids, but for that purpose i have to write a lot of joins. Need a dynamic way to check month over month user retention and for all months after Dec 2018(because the data is available from this month onward).
Postrgesql Code #
select A.year_month_id,count(distinct A.user_id) as November_Users,count(distinct B.user_id) as December_Retained_Users FROM ( select date_trunc('month', ua.created_at) as monthly, ua.user AS user_id FROM user_activity ua WHERE ua.event_type='StartSession' and cast(ua.created_at as date) between cast('20181201' as date) and cast('20181231' as date) GROUP BY 1,2 ) AS A left Join ( select date_trunc('month', ua.created_at) as monthly, ua.user AS user_id FROM user_activity ua WHERE ua.event_type='StartSession' and cast(ua.created_at as date) between cast('20190101' as date) and cast('20190131' as date) GROUP BY 1,2 ) AS B on A.user_id=B.user_id group by 1
user_activity Table #
id | user | event_type | created_at 1 | A1 | StartSession | April 29, 2019, 3:59 AM 2 | A2 | StartSession | December 29, 2018, 1:07 AM 3 | A3 | StartSession | December 9, 2018, 4:59 PM 49 | A31 | StartSession | May 25, 2019, 11:59 AM 100| A46 | StartSession | April 29, 2019, 3:56 AM
Expected Output #
Month |Monthly_Active_Users| Jan_Retained|Feb_Retained|Mar_Retained|....... Dec | 500 | 300 | 200 | 330 Jan | 700 | N/A | 450 | 410 Feb | 1000 | N/A | N/A | 820 Mar | 920 | N/A | N/A | N/A . . . . Aug | 100 | N/A | N/A | N/A