# Ubuntu – Chinese encoding in names of compressed files in zip

encoding

Sorry for asking a question similar to my previous one. The difference this time is that the problem is inside a zip archive: the Chinese characters in the names of the compressed files are not recognized, either when listing the contents of the archive or after extracting it:

    $ unzip -l "严蔚敏数据结构(c语言版)教材及答案.zip"
    Archive:  严蔚敏数据结构(c语言版)教材及答案.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
        25600  2000-01-04 23:27   ?+?+i- ??-?.doc
        80896  2000-01-04 23:27   ?+??i- -+.doc
        41984  2000-01-04 23:27   ?++?i- i+????-?.doc
        52224  2000-01-04 23:27   ?+?+i- ??i?.doc
        50688  2000-01-04 23:27   ?+??i- ??????.doc
        54272  2000-01-04 23:27   ?++?i- -????-??????.doc
        26112  2000-01-04 23:27   ?+?-i- ?????????_+?.doc
        76288  2000-01-04 23:27   ?+-?i- -??-????-?.doc
        53760  2000-01-04 23:27   ?+-?i- -+?+++?=.doc
        53760  2000-01-04 23:27   ?+--i- ??.doc
      7929077  2009-02-26 22:49   -???????+C????+??+?+?+pdf.pdf
    ---------                     -------
      8444661                     11 files

How can I deal with this problem? Thanks and regards!

Update: I have uploaded this zip archive, and it can be downloaded from http://www.mediafire.com/?dw87ee72m56evy9

I tried to use chardet to determine the encoding of the names of the compressed files:

    $ unzip -l "严蔚敏数据结构(c语言版)教材及答案.zip" | chardet
    <stdin>: utf-8 (confidence: 0.99)


But are the file names really encoded in UTF-8? Aren't they supposed to be in a foreign encoding? I suspect the output of unzip -l contains too much besides the file names (the archive name on the command line is itself UTF-8, for instance). How can I single out just the file names in its output as input to chardet?
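One way to look only at the file names is to read them from the archive directly instead of parsing the listing. The following is a minimal sketch, not a definitive method: it assumes Python's standard `zipfile` module, which decodes names without the UTF-8 flag as cp437, so re-encoding with cp437 recovers the stored bytes. The helper names `raw_names` and `plausible_encodings` and the candidate list are hypothetical, and trying a fixed list of decoders is a simple stand-in for a real detector such as chardet.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def raw_names(zip_source):
    """Return each entry name as the bytes stored in the archive."""
    names = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.flag_bits & UTF8_FLAG:
                names.append(info.filename.encode("utf-8"))
            else:
                # zipfile decoded these bytes as cp437; encoding with
                # cp437 again gives back the original bytes unchanged.
                names.append(info.filename.encode("cp437"))
    return names


def plausible_encodings(raw, candidates=("utf-8", "gbk", "big5", "shift_jis")):
    """Return the candidate encodings that decode `raw` without error."""
    ok = []
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue
        ok.append(enc)
    return ok
```

More than one candidate may decode without error, so the result only narrows the choice; printing `raw.decode(enc)` for each survivor and reading the names is the final check. If the chardet package is installed, the same bytes could instead be passed to `chardet.detect`.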

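The file names are most likely stored in GBK (code page 936, the usual legacy encoding for Simplified Chinese), which unzip does not assume by default. If your unzip supports the -O option, as the version packaged for Ubuntu does, pass the encoding explicitly:

    unzip -O cp936 "严蔚敏数据结构(c语言版)教材及答案.zip"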
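Where unzip has no -O option, roughly the same result can be had from Python. This is a sketch under assumptions rather than an equivalent of unzip: the function name `extract_with_encoding` is hypothetical, the default encoding `cp936` is taken from the command above, and the code relies on `zipfile` decoding unflagged names as cp437 so that the stored bytes can be recovered and decoded again.

```python
import shutil
import zipfile
from pathlib import Path

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def extract_with_encoding(zip_source, dest, encoding="cp936"):
    """Extract an archive, decoding legacy entry names with `encoding`."""
    dest = Path(dest).resolve()
    extracted = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            name = info.filename
            if not (info.flag_bits & UTF8_FLAG):
                # Undo zipfile's cp437 decoding, then decode properly.
                name = name.encode("cp437").decode(encoding)
            target = (dest / name).resolve()
            # Refuse entries that would land outside the destination.
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {name!r}")
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(name)
    return extracted
```

A call such as `extract_with_encoding("archive.zip", "out")` returns the list of corrected file names. If the names are not actually in the given encoding, the decode step raises `UnicodeDecodeError` instead of writing garbled names.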