rejetto forum

Cyrillic in file and folder names when downloading .tar

marribi · 8 · 12723


Offline marribi

  • Occasional poster
  • *
    • Posts: 9
    • View Profile
Good day, everyone!
I know there is already a similar question on the forum, but the answers don't quite match what I would like to achieve.

The situation is this:
If I download a folder containing files with Cyrillic in their names with the "OEM file names for tar archives" option checked, the file name encoding is broken when the archive is opened with WinRAR (fig. 1), but everything is fine when it is opened with 7-Zip 16.01 (fig. 2).

If I download the same kind of folder with the "OEM file names for tar archives" option unchecked, everything is fine in WinRAR (fig. 3), but the encoding is broken in 7-Zip 16.01 (fig. 4).

So, is there any way to make Cyrillic display correctly when the archive is opened with either of these archivers?


Offline bmartino1

  • Tireless poster
  • ****
    • Posts: 911
  • I'm only trying to help i mean no offense.
    • View Profile
    • My HFS Google Drive Shared Link
I'm slightly confused: are these files that were downloaded from HFS? What you describe sounds like an issue with the archive program not reading the file correctly, and I would recommend an MD5 check of the file.
Files I have snagged and share can be found on my google drive:

https://drive.google.com/drive/folders/1qb4INX2pzsjmMT06YEIQk9Nv5jMu33tC?usp=sharing


Offline marribi

  • Occasional poster
  • *
    • Posts: 9
    • View Profile
Quote
I'm slightly confused: are these files that were downloaded from HFS? What you describe sounds like an issue with the archive program not reading the file correctly, and I would recommend an MD5 check of the file.

Yes, these files were downloaded from the same HFS server, and the MD5 is OK.

The only difference between the two files is that one was downloaded with the "OEM file names for tar archives" option and the other without it.


Offline LeoNeeson

  • Tireless poster
  • ****
    • Posts: 857
  • Status: On hiatus       (sporadically here)
    • View Profile
    • twitter.com/LeoNeeson
I still don't understand your question. If you use "OEM file names for tar archives" does it work? If you don't use that option, does it work? Which option works best?...

HFS in Spanish (HFS en Español) / How to compile HFS (Tutorial)
» Currently taking a break, until HFS v2.4 get his stable version.


Offline marribi

  • Occasional poster
  • *
    • Posts: 9
    • View Profile
Quote
I still don't understand your question. If you use "OEM file names for tar archives" does it work? If you don't use that option, does it work? Which option works best?...

Hello LeoNeeson!
My question is that I can't get Cyrillic file names to display correctly in both WinRAR and 7-Zip.
If I download files from HFS using "download all files from this folder", HFS makes a tar archive.

If I check the option "OEM file names for tar archives",
7-Zip 16.01 unpacks the contents correctly, but WinRAR breaks the file name encoding.

If I uncheck the option "OEM file names for tar archives",
WinRAR unpacks the contents correctly, but 7-Zip 16.01 breaks the file name encoding.

This is what the screenshots show.

I just got an answer from WinRAR tech support.

Quote
Last years tar archives tend to use UTF-8 encoding for file names.

WinRAR, beginning from version 5.40, attempts to use UTF-8 for tar first. If conversion from UTF-8 fails, WinRAR switches to ANSI encoding, which is CP1251 for Russian.

As far as I know, 7-Zip also attempts to use UTF-8 for tar first.
If conversion from UTF-8 fails, 7-Zip switches to OEM encoding, which is CP866 for Russian.

There is no reliable way to detect the file name encoding in TAR, so both WinRAR and 7-Zip just take a wild guess. This time the archive was in CP866 encoding, so 7-Zip guess was correct. In case of ANSI encoding WinRAR guess would be correct. Both would be correct for UTF-8 encoding.

Beginning from WinRAR 5.40 it is possible to override the code page selected by WinRAR and specify another code page. Open this archive in WinRAR first, then open "Options/Name encoding" WinRAR menu and select "866 (OEM - Russian)". Then browse or unpack the archive.
This name encoding menu is not available in WinRAR 5.20.

If they are right, it would be better for HFS to encode names in tar archives as UTF-8, and the problem would be solved!
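
To make the support reply concrete, here is a minimal Python sketch of the guessing it describes: try UTF-8 first, then fall back to a legacy code page. The function `guess_name` and the sample file name are made up for illustration; this is not the actual code of WinRAR or 7-Zip, only a model of the behaviour quoted above.

```python
def guess_name(raw: bytes, fallback: str) -> str:
    """Decode a tar member name as described in the quoted reply:
    UTF-8 first, then a legacy code page if UTF-8 decoding fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


name = "Привет.txt"  # hypothetical sample file name

# The same name as it would be stored under each encoding.
oem_bytes = name.encode("cp866")    # "OEM file names" option checked
ansi_bytes = name.encode("cp1251")  # ANSI code page for Russian
utf8_bytes = name.encode("utf-8")

for label, raw in [("cp866", oem_bytes), ("cp1251", ansi_bytes), ("utf-8", utf8_bytes)]:
    # First column: OEM fallback (as described for 7-Zip).
    # Second column: ANSI fallback (as described for WinRAR).
    print(label, "->", guess_name(raw, "cp866"), "|", guess_name(raw, "cp1251"))
```

Only the UTF-8 row comes out readable in both columns, which matches the reply: each legacy encoding satisfies one archiver and breaks the other.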


Offline bmartino1

  • Tireless poster
  • ****
    • Posts: 911
  • I'm only trying to help i mean no offense.
    • View Profile
    • My HFS Google Drive Shared Link
Quote
I just got answer from WinRaR tech support.

That is correct. By default, when a tar is made it uses the system locale for file names and packs the files into one archive, so on a UTF-8 system the encoding is UTF-8.

quote: "Actually tar doesn't encode/decode filenames at all, It simply copies them out of the filesystem as-is. If your locale is UTF-8-based"
http://superuser.com/questions/60379/how-can-i-create-a-zip-tgz-in-linux-such-that-windows-has-proper-filenames
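
The point that a tar header just holds raw name bytes can be checked with a small sketch using Python's standard `tarfile` module. The helper `name_field` and the sample name are made up for this example, and it says nothing about how HFS itself builds its archives.

```python
import io
import tarfile


def name_field(name: str, encoding: str) -> bytes:
    """Build a one-member tar in memory and return the raw bytes stored
    in the 100-byte name field at the start of the header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.GNU_FORMAT, encoding=encoding) as tar:
        tar.addfile(tarfile.TarInfo(name=name))  # empty file, header only
    # The name field is padded with NUL bytes; strip the padding.
    return buf.getvalue()[:100].rstrip(b"\0")


# Same name, different bytes in the archive; nothing in the header
# records which encoding was used.
print(name_field("Привет.txt", "cp866"))
print(name_field("Привет.txt", "utf-8"))
```

Since the header carries no encoding marker, the reader has to guess, which is the situation the WinRAR reply describes.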


I'm not sure how HFS sends its commands to tar, but on Linux it is possible to create a tar and change the encoding:
http://www.howtogeek.com/248780/how-to-compress-and-extract-files-using-the-tar-command-on-linux/

In your case you want a different encoding:
http://community.sharpdevelop.net/forums/p/10702/29494.aspx

With 7-Zip (possibly WinRAR) you would have to open the command line (cmd/terminal) and run it with a certain command to extract the files you want...
https://sevenzip.osdn.jp/chm/cmdline/syntax.htm
http://community.sharpdevelop.net/forums/p/10702/29494.aspx

Maybe it can be scripted?...
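
One way such a script could look, as a rough sketch using Python's standard `tarfile` module: read the downloaded tar with the code page it was written in and write a copy whose names are UTF-8. The function `reencode_tar` and the in-memory sample archive are made up for illustration, and the source encoding (CP866 here) has to be known in advance because the archive does not record it.

```python
import io
import tarfile


def reencode_tar(src: bytes, src_encoding: str) -> bytes:
    """Return a copy of the tar archive in `src` with member names
    stored as UTF-8 instead of `src_encoding`."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src), mode="r:",
                      encoding=src_encoding) as tin, \
         tarfile.open(fileobj=out, mode="w",
                      format=tarfile.GNU_FORMAT, encoding="utf-8") as tout:
        for member in tin:
            # Only regular files carry data; directories etc. are header-only.
            data = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, data)
    return out.getvalue()


# Build a small sample archive with a CP866 name (stand-in for a download
# made with "OEM file names for tar archives" checked).
sample_buf = io.BytesIO()
with tarfile.open(fileobj=sample_buf, mode="w",
                  format=tarfile.GNU_FORMAT, encoding="cp866") as t:
    payload = b"hello"
    info = tarfile.TarInfo(name="Привет.txt")
    info.size = len(payload)
    t.addfile(info, io.BytesIO(payload))
sample = sample_buf.getvalue()

fixed = reencode_tar(sample, "cp866")
with tarfile.open(fileobj=io.BytesIO(fixed), encoding="utf-8") as t:
    print(t.getnames())
```

According to the WinRAR reply quoted earlier, both archivers try UTF-8 first, so a copy converted this way should open correctly in either.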


Offline Rapid

  • Occasional poster
  • *
    • Posts: 49
    • View Profile
    • R&Q Portal
...
So, is there any way to make Cyrillic display correctly when the archive is opened with either of these archivers?
Try my version:
https://github.com/drapid/HFS/releases/tag/2.3m
It seems it should work better with Russian file names.