Primary key in resulting relation from union operation in relational algebra


I have a question regarding the resulting table/relation from a union operation in relational algebra. Does the resulting relation not have a primary key? Or does it return a table with a hidden primary key?

Example: Suppose we take the union of the following two tables:

id | name
1    a
2    b
3    c

id | name
1    b
2    c
3    d

Where the attribute "id" is primary key in both relation R1 and R2, now the union of those two will be:

id | name
1     a
1     b
2     b
2     c
3     c
3     d

Now none of the attributes in the resulting relation can be primary key since both id and name contain duplicates. How is this resolved by the DBMS?

Best Answer

By definition the union of two relation contains no duplicates. So it contains at least one candidate key, the composition of all attributes. In your example (id, name) is a candidate key.

DBMS do not "resolve" the issue by assigning a primary key because in almost all DBMS (at least those that use SQL), the union of two tables is a derived table (or a view or a CTE but not a base table) and primary keys(1) can be defined only for base tables (not for views, not for derived tables and not for common table expressions).

Still, the DBMS optimizer is often aware that the derived table that results from a union has no duplicates and that the composition of all attributes uniquely defines a row and may use this fact in subsequent operations on the table (e.g. in a more complex query where the union is joined to or unioned with other tables).

(1) : and constraints in general. There is an generalized SQL constraint construction (CREATE ASSERTION) that allows constraints across several tables that could be used for creating a unique constraint on a union but assertions have not been implemented by any major DBMS - probably because it's a very complicated issue. Firebird claims to have them but not sure how well or what the restrictions are.