Why repetition in join queries

relational-theory

Very dumb question, but bear with me, I am learning and am interested in the principles behind.

So: I have a users table and a pages table. In this case, 1 user and 4 pages. If I join the two tables, I get four times the user data, along with the pages data. The result is little legible, when I see a result stream like this I think wtf?

I know 'databases are just made this way'. I'd like to learn some context and meaning.

Best Answer

I think that you are confusing the presentation layer with the database layer.

In the presentation layer, repetitive information would be removed or reduced, perhaps a group header indicating the user and then a details section indicating the pages associated with that user. This is the way that the information would best be presented to a person, in any case.

However, the dataset that underlies that presentation would look as you describe with the user information repeated for each page. You get the same repetition of data if you reverse the joins.

SELECT P.page, U.user FROM Pages P INNER JOIN Users U ON U.user_id = P.user_id.

is equivalent to

SELECT P.page, U.user FROM Users U INNER JOIN Pages P ON P.user_id = U.user_id

So yes, it is a little illegible but it's meant to be fast and accurate. Formatting the data relies on a deeper understanding of what the data actually means and desired presentation.