# macOS – Fix corrupted Unicode file names in a zip archive

macos unicode zip

A colleague gave me a zip archive of some data I need to analyse, but the file names were corrupted somewhere along the way, either when the archive was created or when I extracted it on my machine.

The archive contains 3000 files whose filenames contain Japanese characters. He zipped it on a Windows machine, and I'm using a Mac. If I double-click on the archive then I get file names that look like this:

0001_rt_ñºéå-ïÅí ñºéå-àÍî _ÉAÅ[ÉãÉeÉBÅ[.dat


On the other hand, if I use 'unzip' at the command line, the same file comes out as:

0001_rt_%FB+%C4%EE-%F2%FC%C6-%FB+%C4%EE-%EA%DB%F6-_%E2A%FC[%E2%EF%E2e%E2B%FC[.dat


The content of the files is fine (they don't contain any Japanese characters, only numbers), but I need to get at the original file names.

Is there some way I can restore the correct file names without having access to the original files, which are on another computer in another city? I'm up for writing a quick Python script if that's a possibility, but I don't know much about character encodings, so I'm not sure how to go about it.
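
For what it's worth, here is a minimal sketch of the kind of script I have in mind. It rests on an assumption I can't confirm: that the names were stored as Shift-JIS (Python's `cp932` codec, the usual legacy encoding on Japanese Windows). Python's `zipfile` decodes any member name not flagged as UTF-8 using `cp437`, so encoding the name back to `cp437` should recover the raw bytes, which can then be decoded with the assumed encoding. The function names `fix_name` and `extract_with_fixed_names` are made up for illustration.

```python
import zipfile
from pathlib import Path


def fix_name(raw_name: str, original_encoding: str = "cp932") -> str:
    """Undo zipfile's cp437 decoding and re-decode with the assumed encoding.

    If the round trip fails, the name is returned unchanged.
    """
    try:
        return raw_name.encode("cp437").decode(original_encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return raw_name


def extract_with_fixed_names(archive_path, dest_dir, original_encoding="cp932"):
    """Extract every file in the archive, writing it under its repaired name."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Flag bit 0x800 means the name is already stored as UTF-8,
            # so only names without that flag need repairing.
            if info.flag_bits & 0x800:
                name = info.filename
            else:
                name = fix_name(info.filename, original_encoding)
            # Note: no protection against ".." in names; this sketch
            # assumes the archive comes from a trusted source.
            target = dest / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
```

Calling `extract_with_fixed_names("data.zip", "out")` (both paths are placeholders) would write the files into `out`. If the assumed encoding is wrong, the names will still look garbled, and a different codec can be passed as `original_encoding`. On Python 3.11 or later, `zipfile.ZipFile` also accepts a `metadata_encoding` argument for reading, which might make the manual round trip unnecessary.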