I'm pretty new to MySQL (used to programming in R/Stata), and getting stuck on this – I'm probably going about it the wrong way. Any pointers would be appreciated!
I have a large database (a couple million rows) and want to select records where at least two fields from a selection of variables are non-missing (not NULL). I can find lots on counting missing values by column, but not by row.
As an example, I would like to write a select statement for the table below, based only on var1-var5, that would select ID 1 and 3 as they have at least 2 non-NULL values.
ID | var1 | var2 | var3 | var4 | var5 | var6 | var7 1 | NULL | 4 | X | NULL | NULL | NULL | R69 2 | NULL | NULL | NULL | NULL | 1 | X | J10 3 | 4 | 5 | 23 | 1jan03| 2 | 22 | Z38
I was thinking creating a user-defined variable to count the number of non-missing values and delete rows where this is <2, but it doesn't seem to work.
SET @include = 0; SELECT var1, var2, var3, var4, var5 FROM db1; @include := @include + 1 if var1 is not null; DELETE FROM db1 WHERE @include<2;
I'm guessing this is something I can do with a WHERE statement in the SELECT line, but I can only think of writing it so it only selects rows where everything is non-missing, not at least 2 values.