Does anybody have any conjectures as to why this quirk is so common? For an example, see this TED talk by Kevin Slavin.

The standard grammatical explanation of this is that it’s a variant of constructions like:

What the reason is is that she’d just returned from Guatemala.

These are entirely standard grammatically, analogous to, e.g.:

What I know is that capuchins are a kind of monkey.

This construction acts in some respects as a fixed idiom, with slightly different connotations from plain old “The reason is that she’d just returned…”, and as such, it’s started to evolve independently. In particular, it’s developed a variant that omits the “what”, and this variant occurs frequently enough that descriptive linguists happily accept it as grammatical, though slightly nonstandard.

Those different connotations are subtle; the following is my subjective impression, but if someone can find a proper corpus-based analysis of them, that would be better.

The form “The problem is that I don’t know why he’s angry.” can be the first mention of the fact that there’s a problem; it puts the focus on this assertion. By contrast, “[What] the problem is, is that I don’t know why he’s angry.” is typically used when the listener or reader is already aware that there’s a problem; it emphasises the delineation of precisely what the problem is, possibly in contrast to other things it could be:

The problem isn’t that they’re stupid. What the problem is, is that they’re overspecialised.

(Mark Liberman also discusses the “The X is, is” construction on Language Log, and partially disagrees with this standard analysis, linking to an alternative proposed explanation. I’ve not read the linked paper, I’m afraid, and the standard analysis makes sense to me, so I’m leaving it at this for now.)