makinero wrote:
Emoticons:
regex:[\x{1F600}-\x{1F64F}]

Yes, but it is an educated guess, because the Unicode range U+1F600 to U+1F64F primarily contains emoticons.
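For anyone who wants to try the range outside of a PCRE-style search field, here is a minimal sketch in Python. Python's re module does not accept the \x{...} escape, so the same code points are written as \U0001F600 and \U0001F64F. The helper name find_emoticons and the sample string are made up for illustration only. Keep in mind that the range only covers the "Emoticons" block; many other emoji live elsewhere.

```python
import re

# Python's re module has no \x{...} escape; \U0001F600 names the same code point.
EMOTICONS = re.compile("[\U0001F600-\U0001F64F]")

def find_emoticons(text):
    """Return every character of text that falls in U+1F600..U+1F64F."""
    return EMOTICONS.findall(text)

# Hypothetical sample: two characters inside the block, one (U+2764) outside it.
sample = "ok \U0001F600 fine \U0001F64F done \u2764"
found = find_emoticons(sample)
print([f"U+{ord(ch):04X}" for ch in found])
```

The heart at U+2764 is not reported, which shows why the range is only a starting point.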

Usher wrote: 2019-01-03, 04:06 UTC
Your information also seems to be incomplete or misleading, for newbies at least.

A good starting point for newbies is:
https://unicode.org/faq/utf_bom.html#General
It is better than anything you or I could put into this forum.

Usher wrote: 2019-01-03, 04:06 UTC
UTF-32 is NOT a synonym for Unicode, it's only one of Unicode encodings

Yes, but it is the one encoding where every single Unicode code point (more precisely, every scalar value) is mapped 1:1 to its 32-bit value.
https://www.unicode.org/versions/Unicode11.0.0/ch03.pdf#G7404 wrote:
D76 Unicode scalar value:
Any Unicode code point except high-surrogate and low-surrogate code points.
• As a result of this definition, the set of Unicode scalar values consists of the
ranges 0 to D7FF and E000 to 10FFFF, inclusive.
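
To make the 1:1 mapping concrete, here is a small Python sketch. The helper names is_scalar_value and utf32_value are my own and not from the standard. It assumes Python 3.4 or later, where the UTF-32 codec refuses lone surrogates.

```python
def is_scalar_value(cp):
    """True for code points in 0..D7FF or E000..10FFFF (definition D76)."""
    return 0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF

def utf32_value(ch):
    """Encode one character as big-endian UTF-32 and read the 4 bytes back as a number."""
    return int.from_bytes(ch.encode("utf-32-be"), "big")

grinning = "\U0001F600"
# The 32-bit value is the code point itself.
print(hex(utf32_value(grinning)), hex(ord(grinning)))

# A high surrogate is a code point, but not a scalar value.
print(is_scalar_value(0xD800))

# The codec rejects a lone surrogate instead of writing it out.
try:
    "\ud800".encode("utf-32-be")
    surrogate_encodable = True
except UnicodeEncodeError:
    surrogate_encodable = False
print(surrogate_encodable)
```

So the mapping is 1:1 for scalar values, and the surrogate range is simply left out.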

Usher wrote: 2019-01-03, 04:06 UTC
There is lack of info about endianness (byte order) in the table you have quoted and you also ignore the byte order problem. Some people may not understand why they can't see the number from the table when using hex view in Lister.

Don't you think that you leave any newbie far behind with this point?
Is this important if you enter Unicode symbols or search for Unicode symbols via regex?
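
For those who do want to see what the byte order point is about, here is a rough Python sketch that prints the bytes of U+1F600 in several encodings. This is roughly what a hex viewer would display, assuming the file is stored without a BOM. The variable names are only for illustration.

```python
ch = "\U0001F600"  # GRINNING FACE

# Byte sequences per encoding form and byte order, as hex digits.
encodings = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]
hex_views = {name: ch.encode(name).hex() for name in encodings}

for name, hex_bytes in hex_views.items():
    print(f"{name:10} {hex_bytes}")

# Once decoded, every variant gives back the same single character.
decoded = {ch.encode(name).decode(name) for name in encodings}
print(decoded == {ch})
```

Only utf-32-be shows the digits 1F600 in reading order. A regex works on the decoded characters, so as long as the tool decodes the file correctly (an assumption here), the byte order should not matter for entering or searching.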

BTW: I don't use Unicode emoticons, except for the set of smileys that can be written with ASCII characters:
8-) :-( ;-) :-) ... :-|
Unfortunately, the forum software automatically translates these character sequences into single Unicode characters.

Holger