Chrome – Why does Chrome sometimes download a PDF instead of opening it?

google-chrome pdf

When I go to certain addresses of PDF files, Chrome downloads the PDF instead of opening it using its built-in PDF viewer. The page is then blank white.

There is no problem with my Chrome settings: I try addresses of other PDF files, and Chrome behaves as expected (I have it set to use Chrome's built-in PDF viewer). But every time I try the same problematic address, Chrome downloads the PDF and then displays a blank page.

I am using Windows 10 and Chrome Version 63.0.3239.84 (Official Build) (64-bit).

My specific problematic URL this time is here (a Google search result).

Best Answer

Basically, this happens because the website tells the browser to do it. Occasionally, it's because the website developer decides they want this behaviour (common on file-sharing sites, for example). Other times, it's because it's a default option for whatever software they're using (e.g. forum or blogging software). Sometimes it's because the site dev has no idea what they're doing.


Content-Disposition

That's usually because the site sends a Content-Disposition header in the response. Specifically, it can send either inline or attachment.

inline is the default if not otherwise specified, and means the browser will open the file within the browser window if it is able to.

attachment means to always download the file, never attempt to open it inside the browser.


If you open your browser's developer tools, you'll see that particular link sends the following response headers:

Content-Disposition: attachment; filename="Schubert-Sonata-21-B-flat.pdf"
Content-Type: application/pdf

This tells the browser to always download (attachment) the file, and to give it the default filename of Schubert-Sonata-21-B-flat.pdf rather than inferring it from the URL. Additionally, it does tell the browser (correctly) that it's an application/pdf file - but since it's an attachment the browser will still default to downloading.
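To make the header's effect concrete, here is a minimal Python sketch that splits a Content-Disposition value into its disposition and suggested filename. It uses the standard library's email.message.Message purely as a convenient header parser. The function name parse_content_disposition is made up for this illustration, and treating a missing header as inline simply follows the default described above; it is not how any particular browser is implemented.

```python
from email.message import Message


def parse_content_disposition(value):
    """Return (disposition, filename) for a Content-Disposition header value.

    `value` is the raw header string, or None if the server sent no such
    header. A missing header is treated as "inline" (assumption: this mirrors
    the default behaviour described in the answer).
    """
    if value is None:
        return "inline", None

    # Message is used here only to parse "token; key=value" style headers.
    msg = Message()
    msg["Content-Disposition"] = value

    disposition = msg.get_content_disposition()  # lower-cased, parameters dropped
    filename = msg.get_filename()                # None if no filename parameter
    return disposition, filename


header = 'attachment; filename="Schubert-Sonata-21-B-flat.pdf"'
print(parse_content_disposition(header))
```

With the header from the link in question, this reports the attachment disposition together with the suggested filename, which is exactly the combination that makes the browser download rather than display the file.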


Inline handling details

When a Content-Disposition is inline (or unspecified), the browser will try to open the file in the default embedded viewer. This only works when the browser knows what file type it is, and the browser knows how to open that type.

Type detection

The file type can be specified by the server with a Content-Type header. For example, the most common inline types are text/html, application/javascript and text/css, making up the three major parts of a modern website. You can also have more esoteric types like application/pdf.

Another possibility is that the server has specified a Content-Type of application/octet-stream. This is the most generic type, and it tells the browser that the file is just arbitrary data - at which point the only thing the browser can do is download it (in theory - we'll get to that).

When a Content-Type is not specified by the server (and sometimes even when it is), the browser can perform what is known as sniffing to try to guess the type by reading the file and looking for patterns.
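As a rough illustration of sniffing, the sketch below guesses a type from the first few bytes of a file. It only knows a handful of well-known signatures (PDF files start with %PDF-, for instance), and the function name sniff_type and the tiny signature table are assumptions made for this example. Real browsers follow a much longer and more careful set of rules.

```python
# A tiny, deliberately incomplete table of leading byte patterns.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_type(first_bytes):
    """Guess a MIME type from the start of a file's contents.

    Falls back to application/octet-stream ("arbitrary data") when no
    known signature matches.
    """
    for magic, mime in SIGNATURES:
        if first_bytes.startswith(magic):
            return mime
    return "application/octet-stream"


print(sniff_type(b"%PDF-1.4\n%..."))      # matches the PDF signature
print(sniff_type(b"just some bytes"))     # no match, generic fallback
```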

Type handling

Upon receiving a file with an inline or unspecified disposition, the browser needs to try to open it within the browser if possible. To do this, it looks at the file type, and if it recognises the type it will try to open it. Most browsers will open any text/ type in a simple text viewer, will try to render text/html as a webpage, might open application/json in a special syntax-highlighted viewer, etc.

The type application/octet-stream is handled specially. Since it's supposed to be the most generic type, denoting an arbitrary stream of bytes, there isn't supposed to be any handler that can apply to all files of this "type". For example, in Firefox, this manifests as an inability to set the default handler for application/octet-stream.

Some websites have also used non-standard types. I've seen application/force-download used - which ends up as a download because the browser does not recognise or know what else to do with the type, but does not enjoy the special handling that application/octet-stream does.
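Putting the disposition and type rules together, the decision can be sketched as a toy model in Python. Everything here is hypothetical: the function name browser_action, the VIEWABLE set and the exact ordering of checks are assumptions for illustration, and real browsers have more exceptions than this.

```python
# Hypothetical set of non-text types this imaginary browser can display itself.
VIEWABLE = {"application/json", "application/pdf"}


def browser_action(content_type, disposition="inline"):
    """Return "download" or "open inline" for a response (simplified model)."""
    # An attachment disposition wins regardless of the type.
    if disposition.lower() == "attachment":
        return "download"

    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";")[0].strip().lower()

    # Arbitrary bytes: no handler is supposed to apply.
    if mime == "application/octet-stream":
        return "download"

    # Any text/ type, or a type with a built-in viewer, is shown in the browser.
    if mime.startswith("text/") or mime in VIEWABLE:
        return "open inline"

    # Unrecognised types (e.g. application/force-download) fall through.
    return "download"


print(browser_action("application/pdf", "attachment"))
print(browser_action("application/pdf"))
print(browser_action("application/force-download"))
```

Note that application/force-download ends up as a download only because nothing recognises it, not because of any special rule, which is the distinction made above.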


A bit of a history lesson

To see how PDFs are handled, we can delve a bit into web history. See, in the past, browsers had no idea what a PDF is. So they could not open it. But we've seen PDFs being opened in browsers long before built-in PDF viewers were a thing, so how did that work?

It used to be possible to extend browser functionality with far more control than what you can do with limited extensions/addons these days. Those were most generically known as plugins. In Internet Explorer, they were ActiveX controls; in Mozilla Firefox and later Google Chrome they were NPAPI plugins. These plugins were capable of doing everything any other program could, and could additionally register themselves as a handler for a specific file type that might be otherwise unrecognised by the browser. (Incidentally, this was later found to be a huge security risk and support for these powerful plugins was gradually dropped...)

In the days of plugins, you would go and install Adobe Acrobat Reader, which would then install an ActiveX or NPAPI plugin that would register the application/pdf MIME type and tell the browser to open those types inline using the plugin.

Of course, after a number of security and performance issues caused by these plugins, the major browser vendors decided to incorporate their own PDF viewers while phasing out support for most plugins. The only one we still see is Adobe Shockwave Flash, which handles application/x-shockwave-flash.

There are actually still some leftover controls for this, e.g. in Firefox the Preview in Firefox option still exists:

Screenshot of option

In the past, this would have allowed the choice between multiple plugins that registered that type. For example, the list of registered types for Flash:

Screenshot of registered types

Those days were also before a lot of the media support that came with HTML5. It wasn't just PDFs - your browser would have no idea how to handle an MP4 container or H.264 video, no idea how to play an MP3 file, etc. You would see plugins provided by media players like VLC or even Windows Media Player, or websites would embed a media player built in Flash.