LineBreakInfo - Content plugin for information about line break type, BOM type, number of CR/LF/CRLF occurrences

AntonyD · Post by *AntonyD » 2023-12-11, 12:20 UTC

I didn't quite understand the depth of the problem. These are the officially existing designations: (\f) and (\v).
They are also used in regular expressions, like \r and \n.
They are not used in any special way anywhere in your plugin, but for translation purposes they are sufficient and must be specified.

Dalai · Post by *Dalai » 2023-12-11, 13:41 UTC

AntonyD wrote: 2023-12-11, 12:20 UTC
These are the officially existing designations. (\f) and (\v)

That's what I want to make sure because I haven't seen them so far. Is there any source where I can read more about them?

In the Notepad++ documentation I can only find \f and \x0B (which is VT), but there \v means vertical space, which is explained like this:

https://npp-user-manual.org/docs/searching/#character-escape-sequences wrote:
Vertical space: This encompasses all the [[:space:]] characters that aren't [[:blank:]] characters: The LF, VT, FF, CR, NEL control characters and the LS and PS format characters

This could be specific to Notepad++ but it's also possible that \v always means vertical space and not just vertical tab.
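As a quick illustration of that ambiguity, here is a minimal sketch in Python (Python is an assumption; it is not mentioned anywhere in this thread). In Python's `re` module, `\v` is the single VT character, while the "vertical space" reading from the Notepad++ manual corresponds roughly to a whole character class. The names `sample`, `vt_only` and `vertical_space` are made up for this example.

```python
import re

# Letters separated by LF, VT, FF and CR (built piecewise for readability).
sample = "A" + "\n" + "B" + "\v" + "C" + "\f" + "D" + "\r" + "E"

# In Python's re module, \v is the single control character VT (0x0B).
vt_only = re.findall(r"\v", sample)

# An engine that treats \v as "vertical space" would match roughly this
# class instead: LF, VT, FF, CR, NEL, LS, PS (per the manual quote above).
vertical_space = re.findall("[\n\x0b\x0c\r\x85\u2028\u2029]", sample)

print(vt_only)         # only the VT character
print(vertical_space)  # LF, VT, FF and CR
```

So the same `\v` can match one character or several, depending on the regex engine.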

Regards
Dalai

AntonyD · Post by *AntonyD » 2023-12-12, 08:28 UTC

To be honest, I still don’t quite understand what level of documentation you want to specify for this.
And why? The same phrases are used in the translation, and the translator is, let’s say, responsible for the translation.
And among us, these escape sequences for control characters have had these meanings for a very long time.
https://www.gnu.org/software/emacs/manual/html_node/elisp/Basic-Char-Syntax.html is one example.
This is a version of the “original” relationship between the escape “symbols” and their meanings.
https://en.cppreference.com/w/cpp/language/escape is also a mature reference site.
https://google.github.io/styleguide/tsguide.html#special-escape-sequences does not list a table relating each “symbol” to its meaning, but it still specifically points out that they use these widely in their coding practice.
https://en.wikipedia.org/wiki/Control_character
https://developer.mozilla.org/en/docs/Web/JavaScript/Guide/Grammar_and_types
http://es5.github.io/x7.html#x7.8.4 (see the 'table 4')

As for Notepad++: internally it is based on the Scintilla engine, the same one SciTE uses, so you have to look at the docs for SciTE

))
https://www.scintilla.org/SciTERegEx.html
Search for the line: "\a, \b, \f, \n, \r, \t, \v match the corresponding C escape char, respectively BEL, BS, FF, LF, CR, TAB and VT;"
Of course, we are used to the RU version: https://scite-ru.bitbucket.io/pack/doc/SciTERegEx_rus.html which adds a liiiitle bit more clarification in parentheses to that same line.
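The correspondence in that SciTE line can be checked with a short sketch. Python is assumed here only because its string literals accept the same C-style escapes; the `ESCAPES` table is a hypothetical helper for illustration and is not part of the plugin or of SciTE.

```python
# C-style escape as written -> (actual character, ASCII abbreviation),
# following the SciTE documentation line quoted above.
ESCAPES = {
    "\\a": ("\a", "BEL"),
    "\\b": ("\b", "BS"),
    "\\f": ("\f", "FF"),
    "\\n": ("\n", "LF"),
    "\\r": ("\r", "CR"),
    "\\t": ("\t", "TAB"),
    "\\v": ("\v", "VT"),
}

# Print each escape with its abbreviation and its code point in hex.
for text, (char, name) in ESCAPES.items():
    print(f"{text}  {name:<3}  0x{ord(char):02X}")
```

Here \f comes out as 0x0C (form feed) and \v as 0x0B (vertical tab), matching the ASCII table.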

Dalai · Post by *Dalai » 2023-12-12, 11:49 UTC

AntonyD wrote: 2023-12-12, 08:28 UTC
To be honest, I still don’t quite understand what level of documentation you want to specify for this.
And why?

I don't want to specify anything. I want to make sure that adding these escape sequences (\f and \v) to the language file doesn't end up being misleading in case they're not commonly used for these line break types (or even mean something else). Simple as that.

Now that I've seen multiple sources saying the same thing I can add them to the translation. But I'm going to do that only for languages that already have them for LF, CR and CRLF.

Regards
Dalai

AntonyD · Post by *AntonyD » 2023-12-12, 12:48 UTC

I'll try to clear up the earlier misunderstanding.
Historically, these correspondences are rooted in the ASCII standard of 1963.
And nothing has changed since then for any particular programming language!
And it’s even more strange to see doubts when using these symbols in 'only' the translation!!!!
I repeat - the translator himself is responsible for its meaning and logic of application))). Simple as that.

Here is the original table from this standard, on the website of one of the well-respected universities of the world:
https://redirect.cs.umbc.edu/courses/undergraduate/313/fall07/burt/CMSC313_lectures/Introduction/ASCII.html
And here someone presents the same table in a more modern form that is easier to read:
https://github.com/ardnew/chars#ascii-atbl-5934

P.S.
By the way, if we are so deeply immersed in the analysis of these symbols, IMHO this combination is logical to indicate for any translation by default!
And not only in my translation. Because these symbols are useful to know and remember for everyone who will use this plugin extensively ;)

Dalai · Post by *Dalai » 2023-12-12, 13:37 UTC

AntonyD wrote: 2023-12-12, 12:48 UTC
And it’s even more strange to see doubts when using these symbols in 'only' the translation!!!!

Well, I wasn't aware of these escape sequences before you brought them up. It might have to do with the fact that FF and VT are used so rarely.

AntonyD wrote:
I repeat - the translator himself is responsible for its meaning and logic of application))). Simple as that.

Well, it's probably me who will be asked about these escape sequences by users who don't know anything about translation or language files. So I'm trying to avoid misunderstandings that might come up, as much as I can anyway.

AntonyD wrote:
By the way, if we are so deeply immersed in the analysis of these symbols, IMHO this combination is logical to indicate for any translation by default!

I'm not going to do that. Why? I considered adding these escape sequences a while ago after seeing them in a translation (probably your Russian one) and thought to myself "Oh, that's a good idea". But after testing I found these strings to be too long and not particularly clearly represented with the parentheses and the backslash and all that - too many characters in too little space. In TC's search they are of help though.

Instead I've included them in a commented line of the English and German sections. That way translators and users alike can use them if they want to by just moving the semi-colon character to the other line. I could do that for all other translations. OTOH I don't want to mess with the translations too much because I don't know these languages, so I kind of want to leave it to the translators what they think is better.

Regards
Dalai

Dalai · Post by *Dalai » 2023-12-16, 17:21 UTC

Just FYI, I updated the plugin archive a couple of days ago:

Version 0.3.0  [2023-12-12]
[+]  Added Ukrainian translation, thanks to beb!
[!]  Plugin package is unchanged except for the .lng file and history.txt,
     hence the version is the same

The language file also adds the \f and \v escape sequences to the languages that already had \r, \n and \r\n. As I said above, I've also added them to the English template in a commented line, but I left the remaining language sections untouched.

Regards
Dalai
