# Linux – Proper encoding for file names in zip archives created in Windows and unpacked in Linux

Tags: character encoding, linux, special characters, windows, zip

I have problems with different charsets in Windows and Linux (CentOS).

I have files with special characters in their file names, from many different languages. The zip archive is created under Win7 and uploaded to a Linux server. Under Windows all characters are displayed correctly. But after uploading and extracting, with either PHP's ZipArchive() or the Linux unzip command, some special characters are replaced by wrong characters.

I know that this is a known problem in the interplay between Windows and Linux, but I have not been able to solve it. I've tried to unzip my zip file with different charsets, but nothing worked. In Portuguese, for example, the character õ causes a lot of problems, but ç is fine.

After unzipping, aplicações.txt becomes aplicaçΣes.txt.
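This exact garbling can be reproduced in a few lines of Python, which hints at a possible cause. The sketch below assumes (this is a guess, not something confirmed for WinRAR) that the name was stored in the Western European OEM code page CP850 and then decoded on Linux as CP437. Under that assumption, ç has the same byte value in both code pages and survives, while the byte for õ maps to Σ in CP437.

```python
# Assumption: the archiver stored the name as CP850 bytes,
# and the extracting side decoded those bytes as CP437.
original = "aplicações.txt"

stored = original.encode("cp850")   # bytes as (hypothetically) written on Windows
garbled = stored.decode("cp437")    # what a CP437-decoding extractor would show

print(garbled)  # aplicaçΣes.txt

# Byte values of the two characters in CP850
o_tilde = "õ".encode("cp850")[0]
c_cedilla = "ç".encode("cp850")[0]

# The same bytes interpreted as CP437
print(hex(o_tilde), bytes([o_tilde]).decode("cp437"))
print(hex(c_cedilla), bytes([c_cedilla]).decode("cp437"))
```

If the output matches what appears on the server, CP850 is a reasonable candidate for the source charset, although other code pages may produce the same result for this particular name.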

As far as I understand, Windows uses the charset IBM860, but I have also read that it uses windows-1257, and I do not know which charset is actually used when the zip archive is created with WinRAR under Win7. Is there a way to check this, or to tell WinRAR to use UTF-8?

When the zip archive is uploaded to a Linux system and unzipped with PHP's ZipArchive() or with unzip in bash, the file names are wrong. I think this is because Linux uses UTF-8.
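One partial check is possible from the archive itself: the ZIP format has a general-purpose flag (bit 11, value 0x800) that marks an entry's name as UTF-8. A minimal sketch using Python's standard `zipfile` module follows; the helper name `name_encoding_flags` is made up for illustration. If the flag is not set, the archive does not say which charset was used, so this only tells whether the names are declared as UTF-8.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general-purpose bit 11: file name is UTF-8


def name_encoding_flags(archive):
    """Map each entry name to True if the entry is flagged as UTF-8.

    `archive` may be a path or a file-like object. Names without the
    flag are decoded by zipfile as CP437, so they may look garbled.
    """
    with zipfile.ZipFile(archive) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_NAME_FLAG)
            for info in zf.infolist()
        }


if __name__ == "__main__":
    # Small in-memory demo; Python itself sets the flag for non-ASCII names.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("plain.txt", "x")
        zf.writestr("aplicações.txt", "x")
    buf.seek(0)
    for name, is_utf8 in name_encoding_flags(buf).items():
        print(name, "UTF-8 flag set" if is_utf8 else "no UTF-8 flag")
```

For the archive in question, the call would be `name_encoding_flags("uploaded.zip")`.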

On the Linux command line I tried each of the following:

    unzip -O windows-1257 uploaded.zip -d zipout/
    unzip -O IBM860 uploaded.zip -d zipout/
    unzip -O IBM437 uploaded.zip -d zipout/
    unzip -O UTF-8 uploaded.zip -d zipout/
    unzip -O UTF-16 uploaded.zip -d zipout/
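The same trial-and-error can be done without extracting anything, by listing how each entry name would look under several candidate charsets. The following is only a sketch: the candidate list and the helper names `redecode` and `candidate_names` are assumptions chosen for illustration. It relies on the fact that Python's `zipfile` decodes names without the UTF-8 flag as CP437, so re-encoding as CP437 gives back the raw stored bytes.

```python
import zipfile

# Candidate charsets to try; this list is a guess, extend it as needed.
CANDIDATES = ["cp850", "cp860", "cp437", "cp1252", "cp1257"]


def redecode(name, encoding):
    """Recover the raw bytes of a CP437-decoded name and decode them
    with `encoding`. Returns None if the bytes are invalid in it."""
    try:
        return name.encode("cp437").decode(encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


def candidate_names(zip_path):
    """Yield (name_as_read, {encoding: redecoded_name}) for every entry
    whose name is not flagged as UTF-8."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.flag_bits & 0x800:
                continue  # name is declared UTF-8, nothing to guess
            yield info.filename, {
                enc: redecode(info.filename, enc) for enc in CANDIDATES
            }


if __name__ == "__main__":
    for raw, guesses in candidate_names("uploaded.zip"):
        print(raw)
        for enc, guess in guesses.items():
            print(f"  {enc:8} {guess}")
```

Whichever charset yields readable names for all entries is the one to pass to the extraction step. Archives that mix several charsets would need the choice made per entry.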