New content plugin cputil (chars from different codepage)
Moderators: Hacker, petermad, Stefan2, white
-
- Power Member
- Posts: 556
- Joined: 2006-04-01, 00:11 UTC
ghisler(Author) wrote:You can either post your additions/corrections here in the forum (which supports Unicode), or send them to me by e-mail in zipped form. Thanks!
//Romanian (Unicode 3.0, 1999)
î=i
Î=I
â=a
Â=A
ă=a
Ă=A
ș=s
ț=t
Ș=S
Ț=T
//Romanian Legacy (Unicode 1.1, 1993)
ş=s
ţ=t
Ş=S
Ţ=T
aNDreas Bolotă
The truth always carries the ambiguity of the words used to express it. (Frank Herbert, God Emperor of Dune)
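A minimal sketch of how a list like the one above could be applied to a file name. It assumes the table is plain text with one `character=replacement` pair per line and `//` comment lines, exactly as posted; the real cputil table format may differ. `parse_table` and `apply_table` are hypothetical helper names, not part of the plugin.

```python
# Table text copied from the post above (assumed format: "char=replacement").
ROMANIAN_TABLE = """\
//Romanian (Unicode 3.0, 1999)
î=i
Î=I
â=a
Â=A
ă=a
Ă=A
ș=s
ț=t
Ș=S
Ț=T
//Romanian Legacy (Unicode 1.1, 1993)
ş=s
ţ=t
Ş=S
Ţ=T
"""


def parse_table(text):
    """Build a {character: replacement} dict, skipping blank and '//' lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        source, sep, replacement = line.partition("=")
        if sep and source:
            table[source] = replacement
    return table


def apply_table(name, table):
    """Replace every character found in the table; leave the rest untouched."""
    return "".join(table.get(ch, ch) for ch in name)


table = parse_table(ROMANIAN_TABLE)
print(apply_table("Științe și Țări.txt", table))
```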
- ghisler(Author)
- Site Admin
- Posts: 50383
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Thanks for your list of characters. I just checked it - the following are already handled by Windows itself:
î Î â Â ă Ă
ş ţ Ş Ţ
The following are converted to '_', so I will add them:
ș ț Ș Ț
Author of Total Commander
https://www.ghisler.com
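The two groups of s/t letters look almost identical but are different code points, which is easy to check. The sketch below does not reproduce the conversion done by Windows; it only prints each character's code point and Unicode name, and tests whether Python's `cp1250` codec can encode it. That codepage is chosen here purely for illustration. `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(ch, codepage="cp1250"):
    """Return (code point, Unicode name, encodable in `codepage`?) for one character."""
    try:
        ch.encode(codepage)
        in_codepage = True
    except UnicodeEncodeError:
        in_codepage = False
    return f"U+{ord(ch):04X}", unicodedata.name(ch), in_codepage


# Cedilla forms first, then the comma-below forms.
for ch in "şţŞŢșțȘȚ":
    print(ch, *describe(ch))
```

In Python's codec the cedilla forms are encodable in codepage 1250 while the comma-below forms are not, which matches the split ghisler reports, although the thread itself does not state the reason.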
ghisler(Author) wrote:Thanks for your list of characters. I just checked it - the following are already handled by Windows itself:
î Î â Â ă Ă
ş ţ Ş Ţ
The following are converted to '_', so I will add them:
ș ț Ș Ț
I was using [=cputil.NameReplaceUserUnicode] and [=cputil.NameReplaceUserAll], which don't make those replacements, but I have now switched to [=cputil.NameReplaceNoEnglish], which I think is the one you mean ("handled by Windows itself").
aNDreas Bolotă
The truth always carries the ambiguity of the words used to express it. (Frank Herbert, God Emperor of Dune)
- ghisler(Author)
- Site Admin
- Posts: 50383
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Yes, that's what I meant! The first two do the following:
NameReplaceUserUnicode: Replaces unicode characters from a DIFFERENT codepage only, so that ANSI-only programs like Irfanview can access the files.
NameReplaceUserUnicode: Replaces all characters from other code pages using the external tables only. For example, on a Russian system, only non-cyrillic text is converted to Latin/English.
NameReplaceUserAll: Replaces all characters, also from the same code page, using external tables only.
Author of Total Commander
https://www.ghisler.com
-
- Junior Member
- Posts: 6
- Joined: 2009-07-13, 06:04 UTC
- ghisler(Author)
- Site Admin
- Posts: 50383
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
It depends on which of the options you use, and which language settings. If you choose to replace only Unicode characters, and are using German or French settings where the 'ä' is part of the language, then the 'ä' will not be changed to 'a'. If you use e.g. Russian locale, then 'ä' isn't part of the current codepage and will be changed.
Author of Total Commander
https://www.ghisler.com
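The locale dependence described above can be sketched roughly as follows. This is not the plugin's code: it models "part of the current codepage" as "encodable with the corresponding Python codec", uses a tiny made-up table, and falls back to '_' for characters missing from the table, as mentioned earlier in the thread. All function names are hypothetical.

```python
def is_in_codepage(ch, codepage):
    """True if the character can be encoded in the given ANSI codepage."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def replace_foreign_only(name, table, codepage):
    """Replace only characters outside `codepage`; unknown ones become '_'."""
    return "".join(
        ch if is_in_codepage(ch, codepage) else table.get(ch, "_")
        for ch in name
    )


def replace_all(name, table):
    """Replace every character listed in the table, whatever the codepage."""
    return "".join(table.get(ch, ch) for ch in name)


TABLE = {"ä": "a", "ж": "zh"}  # illustrative entries only

print(replace_foreign_only("Bär_ж.txt", TABLE, "cp1252"))  # Bär_zh.txt (Western European)
print(replace_foreign_only("Bär_ж.txt", TABLE, "cp1251"))  # Bar_ж.txt (Cyrillic)
print(replace_all("Bär_ж.txt", TABLE))                      # Bar_zh.txt
```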
-
- Junior Member
- Posts: 6
- Joined: 2009-07-13, 06:04 UTC
Hi,
This is a small contribution to complete the "cputil1.tbl" conversion table for Serbian Cyrillic.
As far as I can see, the conversion table for the Cyrillic alphabet is based on Russian.
Serbian shares many Cyrillic letters with Russian and has a few more of its own. (Some letters exist only in Russian, but that's not relevant for now.)
The existing conversion table for the Cyrillic alphabet follows a 'phonetic' logic.
This is not ideal for Serbian, because the converted names are harder to read.
That's because in Serbia both the Latin and Cyrillic alphabets are used concurrently, and there is an established conversion table between them; the key part of it is presented below.
In that scenario, the best readability would be achieved with a 'cropped' Latin replacement (Serbian Latin with the diacritical marks removed).
I'm aware that supporting two different schemes would be overkill, so I just present the table of missing letters to complete the Serbian alphabet.
Anyway, this is a great plug-in (because of problems with packers and burners), and whatever Christian Ghisler decides to include in the final plug-in will be OK.
Uppercase/Lowercase conversion table for Serbian Cyrillic:
Already included in "cputil1.tbl":
(cyrillic) Ж == (latin) Ž == (phonetic) ZH == (cropped latin) Z
(cyrillic) ж == (latin) ž == (phonetic) zh == (cropped latin) z
(cyrillic) Ш == (latin) Š == (phonetic) SH == (cropped latin) S
(cyrillic) ш == (latin) š == (phonetic) sh == (cropped latin) s
(cyrillic) Ч == (latin) Č == (phonetic) CH == (cropped latin) C
(cyrillic) ч == (latin) č == (phonetic) ch == (cropped latin) c
Does not exist in "cputil1.tbl":
(cyrillic) Ђ == (latin) Đ == (phonetic) DJ == (cropped latin) DJ
(cyrillic) ђ == (latin) đ == (phonetic) dj == (cropped latin) dj
(cyrillic) Ћ == (latin) Ć == (phonetic) TJ == (cropped latin) C
(cyrillic) ћ == (latin) ć == (phonetic) tj == (cropped latin) c
(cyrillic) Џ == (latin) DŽ == (phonetic) DZH == (cropped latin) DZ
(cyrillic) џ == (latin) dž == (phonetic) dzh == (cropped latin) dz
(cyrillic) Љ == (latin) LJ == (phonetic) LJ == (cropped latin) LJ
(cyrillic) љ == (latin) lj == (phonetic) lj == (cropped latin) lj
(cyrillic) Њ == (latin) NJ == (phonetic) NJ == (cropped latin) NJ
(cyrillic) њ == (latin) nj == (phonetic) nj == (cropped latin) nj
(cyrillic) Ј == (latin) J == (phonetic) J == (cropped latin) J
(cyrillic) ј == (latin) j == (phonetic) j == (cropped latin) j
(I'm not sure why the letter "J/j" is omitted from the existing conversion table, but it's needed to avoid underscores after conversion.)
Best regards,
Vladimir Stefanovic
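To make the difference between the two schemes concrete, here is a small sketch that applies the "phonetic" and "cropped Latin" columns for the letters listed as missing. The pairs are copied from the table above; the `transliterate` helper and the scheme names are hypothetical, and letters not in the dict are passed through unchanged.

```python
# Cyrillic letter -> (phonetic, cropped Latin), copied from the post above.
SERBIAN = {
    "Ђ": ("DJ", "DJ"),  "ђ": ("dj", "dj"),
    "Ћ": ("TJ", "C"),   "ћ": ("tj", "c"),
    "Џ": ("DZH", "DZ"), "џ": ("dzh", "dz"),
    "Љ": ("LJ", "LJ"),  "љ": ("lj", "lj"),
    "Њ": ("NJ", "NJ"),  "њ": ("nj", "nj"),
    "Ј": ("J", "J"),    "ј": ("j", "j"),
}

SCHEMES = {"phonetic": 0, "cropped": 1}


def transliterate(text, scheme):
    """Replace listed Serbian letters using the chosen column of the table."""
    column = SCHEMES[scheme]
    return "".join(SERBIAN[ch][column] if ch in SERBIAN else ch for ch in text)


for letter in "ЋћЏџ":
    print(letter, transliterate(letter, "phonetic"), transliterate(letter, "cropped"))
```

Only Ћ/ћ and Џ/џ differ between the two columns; the other listed letters convert identically under either scheme.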
- ghisler(Author)
- Site Admin
- Posts: 50383
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Thanks very much!
Maybe you could send me two separate cputil.tbl by e-mail?
1. One which can be used as a replacement for the current included file, which keeps the Russian characters and adds Serbian
2. One which I could offer as a separate download which would be optimized for Serbian?
Please send them to beta at ghisler dot com. Thanks!
Author of Total Commander
https://www.ghisler.com
- ghisler(Author)
- Site Admin
- Posts: 50383
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Thanks a lot for this plugin, which seems to be exactly what I am looking for. (I need to automatically "translate" 50000+ filenames in a branched folder structure of several hundred folders, as one of my programs cannot read those files.)
Now, as I am completely new to Total Commander, can anybody please give a short "for-dummies" description of how I would install and use this plugin to rename all files in a folder structure (replacing "unreadable" codepage letters with any readable Latin ones)?
2bejoscha
1. Download and install the plugin.
2. In the left panel go to the root folder of your files.
3. Press <Ctrl+B> for branch view.
4. Press <Ctrl+A> to select all files.
5. Press <Ctrl+M> to start multi rename tool.
6. In the "Rename mask:" field type [=cputil.NameReplaceUserAll]
7. Press start button.
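For readers who want to see what the steps above amount to, here is a rough standalone analogue in Python. It is not how Total Commander or the plugin works internally: it walks a folder tree, computes a new name for each file with a caller-supplied `convert` function (for example a table lookup), and renames only when the name changes and no file with the new name exists. `plan_renames` and `apply_renames` are hypothetical names.

```python
import os


def plan_renames(root, convert):
    """Return (old_path, new_path) pairs for files whose converted name differs."""
    plan = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            new_name = convert(name)
            if new_name != name:
                plan.append(
                    (os.path.join(folder, name), os.path.join(folder, new_name))
                )
    return plan


def apply_renames(plan):
    """Rename each pair, skipping targets that already exist; return the skipped pairs."""
    skipped = []
    for old, new in plan:
        if os.path.exists(new):
            skipped.append((old, new))  # never overwrite an existing file
            continue
        os.rename(old, new)
    return skipped
```

Printing the plan before calling `apply_renames` gives a dry run, similar to checking the preview list in the rename dialog before starting.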