Ubuntu – How to download a link containing Unicode characters using wget


Sometimes a link has unicode characters in it, such as http://www.example.com/файл.zip

If you point your browser at it, the browser correctly prompts you to save the file as файл.zip. With wget, however, the saved filename is a mix of ? characters and percent-encoded sequences (like %D0%BB), followed by the string (invalid encoding).

What parameters can I add to wget, or what other command-line tricks can I use, so that it behaves like Chrome and Firefox and saves the file exactly as specified in the rendered link – in this case, as файл.zip?

The solution should work without having to write the filename explicitly in the command, so an explicit wget -O файл.zip http://www.example.com/файл.zip is not a good solution.

I realize that as soon as you run wget http://www.example.com/файл.zip, it tries to retrieve http://www.example.com/%D1%84%D0%B0%D0%B9%D0%BB.zip; that is, it converts the link to percent encoding, which may be the reason why it doesn't save the file under the "properly" rendered name.
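To see where that percent-encoded form comes from, here is a minimal shell sketch that writes each UTF-8 byte of a string as %XX. The function name percent_encode_bytes is made up for illustration, and unlike a real URL encoder it encodes every byte, ASCII included, so it is only applied to the non-ASCII part of the name here.

```shell
# Hypothetical helper: print every byte of the argument as %XX (uppercase hex).
# Assumes the argument is UTF-8, as in a typical Ubuntu terminal.
percent_encode_bytes() {
    printf '%s' "$1" |
        od -An -tx1 |        # dump the raw bytes as two-digit hex values
        tr -d ' \n' |        # join the hex digits into one string
        sed 's/../%&/g' |    # put a % in front of each byte
        tr 'a-f' 'A-F'       # uppercase the hex digits
}

# Encode only the Cyrillic part and keep the ASCII extension as-is.
printf '%s.zip\n' "$(percent_encode_bytes 'файл')"
# prints %D1%84%D0%B0%D0%B9%D0%BB.zip
```

Each Cyrillic letter takes two bytes in UTF-8, which is why the four letters of файл become eight %XX groups.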

I posted a somewhat related question here, whose answer may or may not be of help to this one.

Best Answer

You can use curl instead, as follows:

curl -O http://www.example.com/файл.zip

It will save it to файл.zip.
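If you would rather stay with wget, a possible alternative (not part of the answer above) is its --restrict-file-names=nocontrol option, which the wget manual describes as useful for URLs whose names contain UTF-8 characters on systems that can save and display UTF-8 filenames. The sketch below wraps it in a small function; the name wget_keep_name is hypothetical, and whether the result matches the browser's filename may depend on your wget version and locale.

```shell
# Hypothetical wrapper: run wget without escaping "control" bytes in filenames,
# so UTF-8 names are not rewritten as percent-encoded sequences.
wget_keep_name() {
    wget --restrict-file-names=nocontrol "$@"
}

# Example call (needs network access, so it is left commented out):
# wget_keep_name 'http://www.example.com/файл.zip'
```

The same option can also be passed directly on the wget command line without the wrapper.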
