Topic: Tag Alias: pi -> 𝜋

Posted under Tag Alias and Implication Suggestions

memeboy said:
why is pi being aliased to a square?

Granberia said:
It's wrong! We should alias it to 3.2 instead.

Seriously though I'm for the alias to 휋 or π (TBH they look almost the same on my browser).

Actually, using 𝜋 might not be a good idea. The font choice seems to screw with the character and it looks like Windows XP or similar doesn't have support for it and it shows up as a square (what memeboy said).

For compatibility, I think using the regular greek pi character (π) might be a better choice. Can you see this character memeboy? (this is what it's supposed to look like)

Updated by anonymous

I'm not really sure about using symbols as tags. I think sticking to text based tags (like pi) wherever possible is a better option in the long run. Not only are there compatibility issues, but I'm sure there are a lot of people who don't know how to use the alt<numbers> function (Windows, not sure about Mac) to get the desired symbols.

I think the simplistic approach would be more useful to the bulk of users.

Updated by anonymous

parasprite said:
Actually, using 휋 might not be a good idea. The font choice seems to screw with the character and it looks like Windows XP or similar doesn't have support for it and it shows up as a square (what memeboy said).

For compatibility, I think using the regular greek pi character (π) might be a better choice. Can you see this character memeboy? (this is what it's supposed to look like)

Okay, I guess you're right. My browser also displays it differently.

Updated by anonymous

Genjar

Former Staff

Yeah, I don't see that symbol either.

And I've never understood why we use symbol tags anyway. It's always a hassle to add (or search for) those: either you have to check the alias (if there even is one), or copypaste the symbol from somewhere.

Updated by anonymous

I think that simple symbol tags are ok , like <3 =I =3 , faces and such but i wouldn't even know where to begin to make a Pi (𝜋) character unless i copy paste from somewhere else.

Updated by anonymous

Just_Another_Dragon said:
I think that simple symbol tags are ok , like <3 =I =3 , faces and such but i wouldn't even know where to begin to make a Pi (휋) character unless i copy paste from somewhere else.

I don't personally find this to always be a better solution than using something more friendly, however it's pretty easy to set up a tag like to have something more friendly aliased to it like female_symbol (you can do this already). However, this only works for adding it as a tag obviously.

A few more on symbol:

However, most of these have been aliased the other way round over the years, so I'm beginning to agree with using the more friendly versions.

Updated by anonymous

The standard is to avoid non-unicode characters.
So, nope.

Updated by anonymous

Halite said:
The standard is to avoid non-unicode characters.
So, nope.

It's actually standard since unicode 6. But it is definitely hard to type on a keyboard.

Just curious (since I know you've been here for a while), do you remember any sort of reasoning on why female_symbol/male_symbol wasn't chosen over their fancy symbol variants?

Updated by anonymous

Halite said:
The standard is to avoid non-unicode characters.

What exactly is non-unicode character?
This page contains 𝜋, so is it unicode character or not?
Also if 𝜋 is not allowed then is é (pokémon) allowed?

Also should we rename artist names?
For example bell☆ -> bell_don't_use_strange_symbols_you_asshole

Updated by anonymous

Perhaps it's best if the tags were things you could type on most keyboards - standard alphanumeric and punctuation characters, as well as characters that can be accessed with ctrl-alt such as é. Then they'd be much easier to type.

Updated by anonymous

Granberia said:
What exactly is non-unicode character?
This page contains 휋, so is it unicode character or not?
Also if 휋 is not allowed then is é (pokémon) allowed?

Also should we rename artist names?
For example bell☆ -> bell_don't_use_strange_symbols_you_asshole

Avoid isn't the same thing as "not allowed".

Updated by anonymous

Granberia said:
Also should we rename artist names?
For example bell☆ -> bell_don't_use_strange_symbols_you_asshole

I missed this. Perhaps artist's names could use non-standard characters if they're integral to the name.

Updated by anonymous

Kaeetayel said:
Perhaps it's best if the tags were things you could type on most keyboards - standard alphanumeric and punctuation characters, as well as characters that can be accessed with ctrl-alt such as é. Then they'd be much easier to type.

That's pretty much what people mean when they say "non-unicode characters" or (ironically) "unicode characters". Truth is everything from „̵̱̱̋̊̊̊ to emoji are all part of unicode standards. The ☆ Granberia mentioned is almost certainly part of the U2600 block, which has been part of unicode since 1991.

Although just because it's part of unicode doesn't mean the font supports it (which is really the biggest issue).

Updated by anonymous

Halite said:
Avoid isn't the same thing as "not allowed".

The point here AFAICS is that your concept of 'non-unicode character' is just incorrect -- all of the characters thus far shown are part of UTF-8, which is an encoding that.."encodes each of the 1,112,064 valid code points in the Unicode code space". The HTML rendered by E621 is encoded in UTF8 -- any character that E621 can send to your browser as part of a valid document is by definition a Unicode character.

Trying to assume you weren't intentionally being obtuse here, I'd find it far more likely that you were trying to refer to a specific plane or block of Unicode, probably block 0000-0FFF , which contains all of ASCII plus characters for most moderately-common languages (Greek, Russian, Arabic, French, German..).
Or maybe just ASCII itself.
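
For anyone who wants to check this rather than take my word for it, here is a minimal Python sketch. The `describe` helper is a made-up name for illustration and has nothing to do with how E621 works internally; it just prints the code point and the UTF-8 bytes for a few of the characters that came up in this thread.

```python
# Characters mentioned in this thread, written as escapes so the
# source itself stays plain ASCII:
#   U+03C0  Greek pi, U+1D70B the pi from the OP,
#   U+00E9  e with acute accent, U+2606 the star from bell's name.
samples = ["\u03c0", "\U0001d70b", "\u00e9", "\u2606"]


def describe(ch: str) -> str:
    """Return the code point and UTF-8 byte sequence of one character."""
    code_point = ord(ch)
    utf8_hex = ch.encode("utf-8").hex(" ")  # bytes.hex(sep) needs Python 3.8+
    return f"U+{code_point:04X} -> {utf8_hex}"


for ch in samples:
    print(describe(ch))
```

Every one of them has a code point and a valid UTF-8 byte sequence, so they are all Unicode characters. The only real difference is that the OP's pi needs four bytes while the others need two or three.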


Updated by anonymous

I'm just wondering what kind of dark magic you people are using to type these things

Updated by anonymous

Durandal said:
I'm just wondering what kind of dark magic you people are using to type these things

Character Map

Updated by anonymous

Durandal said:
I'm just wondering what kind of dark magic you people are using to type these things

Magic of copy-paste. Especially when you're using laptop and even getting your numpad working is annoying.

Updated by anonymous

Durandal said:
I'm just wondering what kind of dark magic you people are using to type these things

Character map program is the easiest way if you're just messing around.


Also, in any ₲TK+ based program I can type ₶trl+Ⓢℎift+U followed by a heᳲ character ⋐ode to enter that characteℛ.
In Windows I understand it is done with Alt+keypad numbers (which, I guess, means the character code is in decima∠ rather than hex)

(this collection of silliness was done with the gucharmap program, FWIW)

Updated by anonymous

savageorange said:
The point here AFAICS is that your concept of 'non-unicode character' is just incorrect -- all of the characters thus far shown are part of UTF-8, which is an encoding that.."encodes each of the 1,112,064 valid code points in the Unicode code space". The HTML rendered by E621 is encoded in UTF8 -- any character that E621 can send to your browser as part of a valid document is by definition a Unicode character.

Trying to assume you weren't intentionally being obtuse here, I'd find it far more likely that you were trying to refer to a specific plane or block of Unicode, probably block 0000-0FFF , which contains all of ASCII plus characters for most moderately-common languages (Greek, Russian, Arabic, French, German..).
Or maybe just ASCII itself.


Except the character used in the OP of the thread isn't unicode, as is evidenced by the fact that it's japanese Kanji on my screen.
π is the unicode pi not 휋

Updated by anonymous

Halite said:
Except the character used in the OP of the thread isn't unicode, as is evidenced by the fact that it's japanese Kanji on my screen.

So the definition of unicode is "character that displays correctly on Halite's screen"?
Someone should make corrections in its wikipedia page.

Updated by anonymous

Yeah, I don't see the symbol either. You'd think Chrome could handle it... or whatever it is that handles fonts on webpages... I'm not good at under the hood computer stuff, I leave all that kinda stuff to Xch3l.

I'm +1 for keeping it as pi, as I think π is too easy to mistake for n or something.

Updated by anonymous

Halite said:
Except the character used in the OP of the thread isn't unicode, as is evidenced by the fact that it's japanese Kanji on my screen.
π is the unicode pi not 휋

There are multiple Pi's in unicode : [ℼℿ𝚷𝛑𝛡𝜫𝝅𝝥𝝿𝞏𝞟𝞹𝟉Ⲡⲡπ𝛱] (and probably others -- that was just a quick check). The character used in OP is U+1D70B MATHEMATICAL ITALIC SMALL PI. (this stuff is easy to check with a character map program)
Whatever software is doing your font rendering is probably incorrect -- characters your font doesn't support should normally be rendered as a box, possibly with the character code shown inside. Or a diamond with a question mark. This link goes into it a little.

Wikibooks provides a reference for the block of Unicode that contains OP's chosen character : I guess most or all of those characters will render wrongly for you.

In general, lower numbered unicode characters tend to be better supported throughout a variety of fonts, so I'd personally support a policy of 'No characters that aren't in the 0000-ffff plane'.
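
If you don't have a character map program handy, a few lines of Python do the same job. This is only a sketch: `is_bmp` is a hypothetical helper name, and the 0000-ffff cutoff is just the policy suggested above, not anything the site enforces.

```python
import unicodedata


def is_bmp(ch: str) -> bool:
    """True if the character falls in the 0000-FFFF range."""
    return ord(ch) <= 0xFFFF


# U+03C0 is the regular Greek pi, U+213C is the first pi in the list
# above, and U+1D70B is the character used in the OP.
for ch in ["\u03c0", "\u213c", "\U0001d70b"]:
    where = "inside 0000-FFFF" if is_bmp(ch) else "outside 0000-FFFF"
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), where)
```

Under that policy the first two would be acceptable and the OP's character would not.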

@ Tokaido
Your browser actually has surprisingly little to do with it; It's basically your OS (or to be precise, your font rendering libraries and the fonts they have available to them). Most modern font rendering systems will use a 'multiple-fallbacks' system, where they look through multiple fonts to see if any of them define that character; If no character is found, you get a box (or something else that is supposed to represent 'undefined character', like a black diamond containing a question mark).
So mainly, it depends on how well the fonts you have installed cover the defined list of Unicode characters. If you haven't specifically tried to make your Unicode font support good, it's probably fairly marginal.

(The main involvement your browser has, is in working out what encoding the text coming to you is using. See eg. View->Character Encoding menu in Firefox. Most often on English sites this will be Unicode (UTF-8), though)
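
To make the 'multiple-fallbacks' idea concrete, here is a toy Python sketch of it. The font names and their coverage sets are invented for illustration, and real font systems (fontconfig and friends) are far more involved than this; it only shows the lookup order.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical fonts and the code points each one defines.
FALLBACK_CHAIN: List[Tuple[str, Set[int]]] = [
    ("BodyFont", {0x70, 0x69, 0xE9}),  # p, i, e with acute accent
    ("GreekFont", {0x3C0}),            # regular Greek pi
    ("MathFont", {0x1D70B}),           # mathematical italic small pi
]


def pick_font(ch: str, chain: List[Tuple[str, Set[int]]] = FALLBACK_CHAIN) -> Optional[str]:
    """Return the first font in the chain that defines ch.

    None means no font has the glyph, which is when the renderer
    should draw a box or some other 'undefined character' mark.
    """
    for name, coverage in chain:
        if ord(ch) in coverage:
            return name
    return None


print(pick_font("\u03c0"))                          # found in the second font
print(pick_font("\U0001d70b"))                      # found in the third font
print(pick_font("\U0001d70b", FALLBACK_CHAIN[:2]))  # no math font installed
```

The last call is the situation most people in this thread are in: nothing in the chain covers the character, so a box is the correct result.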

Updated by anonymous

  ▲
▲ ▲

Newfags can't even triforce.

In my opinion, using symbols that can only be typed using either copy-paste or alt-xxx, etc, is not a good idea. Most people aren't computer literate enough to know what unicode is, much less how to type a given symbol.

Although for Pokemon, that is a copyright name, so I'm not really sure how that stands.

Updated by anonymous

Moon_Moon said:
  ▲
▲ ▲

Newfags can't even triforce.

In my opinion, using symbols that can only be typed using either copy-paste or alt-xxx, etc, is not a good idea. Most people aren't computer literate enough to know what unicode is, much less how to type a given symbol.

Although for Pokemon, that is a copyright name, so I'm not really sure how that stands.

Pokemon is not the only case where it may be unavoidable. We also use unicode for artists who provide no ASCII name (mainly, asian artists).
I don't have any problems in principle with things like pi -> 𝜋 , used as a sort of 'ad-hoc "input method"', but I agree it's problematic in the sense that if you see 𝜋, you won't necessarily think to type 'pi' to tag other images with that same picture element. Not really a problem with pi, since it's so uncommon. Would be more of a serious problem with common tags, like if we chose to alias <3 to ❤.

Updated by anonymous

well pokemon doesn't cause any problems because you can just tag or search pokemon and the alias fixes it for you.

Updated by anonymous

Moon_Moon said:
  ▲
▲ ▲

Newfags can't even triforce.

In my opinion, using symbols that can only be typed using either copy-paste or alt-xxx, etc, is not a good idea. Most people aren't computer literate enough to know what unicode is, much less how to type a given symbol.

Although for Pokemon, that is a copyright name, so I'm not really sure how that stands.

I hold the letter e then select the accent mark. I'm amazed that Windows hasn't come up with something more useful than the character map or alt codes for typing these things.

Halite said:
well pokemon doesn't cause any problems because you can just tag or search pokemon and the alias fixes it for you.

Yeah, it's really only an issue when someone links it on the wiki.

savageorange said:
Pokemon is not the only case where it may be unavoidable. We also use unicode for artists who provide no ASCII name (mainly, asian artists).
I don't have any problems in principle with things like pi -> 휋 , used as a sort of 'ad-hoc "input method"', but I agree it's problematic in the sense that if you see 휋, you won't necessarily think to type 'pi' to tag other images with that same picture element. Not really a problem with pi, since it's so uncommon. Would be more of a serious problem with common tags, like if we chose to alias <3 to ❤.

That's actually why I chose a more stylized one. I didn't realize that I was pulling it from the mathematical set which is one of the most inconsistent sets between fonts (display-wise). I'm not entirely sure why.

Updated by anonymous

Generally speaking, we avoid using most non-standard symbols because of 1, compatibility/display concerns (not everyone's computer can or will display it the same [if at all] for too many symbols), 2, difficulties typing it for searches and tagging.

But we do have a few exceptions where the symbol is common enough that most computers can and will display it correctly, as well as the usefulness or clearness of having the symbol or accent mark was seen to be important. In those cases, some sort of text-based version of the tag has been aliased to it in order to make it easily typeable since the alias will handle it. Which then handles most of the concerns. Like pokemon to --> pokémon. And peace_symbol to --> since otherwise it and "peace_sign"/v_sign tended to get mixed up even more than they do now. So we've been known to use it sparingly and selectively. But for the most part, we try to avoid using non-standard symbols.

In this case, based on how the screencaps of what these symbols should look like vs how it looks in my browser (and quite a few other people's, judging by the comments), I don't think the compatibility is high enough to keep the special symbols for pi. Pi as a word is clear enough on its own, and I think we'd be better off aliasing the symbols to the word in this instance.

Updated by anonymous

parasprite said:
That's actually why I chose a more stylized one. I didn't realize that I was pulling it from the mathematical set which is one of the most inconsistent sets between fonts (display-wise). I'm not entirely sure why.

Did you notice how the pi character 𝜋 changed to 휋 (I guess this is the character Halite saw) when you quoted me? This looks like a bug either in e621's character encoding handling or both our browsers (I'm using Firefox 34.0.5, you?)

Updated by anonymous

Genjar

Former Staff

savageorange said:
There are multiple Pi's in unicode : [ℼℿ횷훑훡휫흅흥흿힏힟ힹ퟉Ⲡⲡπ훱] (and probably others -- that was just a quick check).

Except for the first two and second to last, those don't look like pi to me...

Updated by anonymous

savageorange said:
Did you notice how the pi character 휋 changed to 휋 (I guess this is the character Halite saw) when you quoted me? This looks like a bug either in e621's character encoding handling or both our browsers (I'm using Firefox 34.0.5, you?)

Genjar said:
Except for the first two and second to last, those don't look like pi to me...

For some reason the encoding changes when you reply to it regardless of browser (Firefox, Safari, iOS Safari, and Chrome here). I don't know enough about the backend to know why this is.

I've definitely come to the conclusion that of all the characters I could have picked, this was one of the worse choices. o.O

Updated by anonymous

Genjar

Former Staff

parasprite said:
For some reason the encoding changes when you reply to it regardless of browser (Firefox, Safari, iOS Safari, and Chrome here). I don't know enough about the backend to know why this is.

Even without replying, those don't look like pi. All I see is a row of boxes.

Updated by anonymous

Looks like someone other than me has japanese fonts installed :3

Updated by anonymous

Genjar said:
Even without replying, those don't look like pi. All I see is a row of boxes.

Just curious, what OS are you using? (I think you were the one that used Linux...?)

Updated by anonymous

Genjar

Former Staff

parasprite said:
Just curious, what OS are you using? (I think you were the one that used Linux...?)

Mostly Windows 7 64-bit, browsing with Chrome or Nightly. Haven't used Linux for a while, been too busy to get my second rig working again. And yep, I'm using the Japanese fonts for compatibility reasons.

And on my tablet (tried both Chrome and Dolphin), most of those symbols show up as blank. Only the second to last pi works (π).

Updated by anonymous

Genjar said:
Mostly Windows 7 64-bit, browsing with Chrome or Nightly. Haven't used Linux for a while, been too busy to get my second rig working again.

And on my tablet (tried both Chrome and Dolphin), most of those symbols show up as blank. Only the second to last pi works (π).

Halite said:
I'm on Windows 8 with Chrome.

I get mixed results on my iPad, but I get 3/4 Japanese with the replied version. On the original line of them I see pretty much what Genjar sees.

Edit: Except for the title bar which seems to be falling into Last Resort lol

Updated by anonymous

Ok, here's what I'm going to do. Since both 𝜋 and Π appear to sometimes get displayed as other characters (including boxes and kanji and an alien face), to keep it from confusing anything I manually put a note in their wiki to refer to pi instead of using an alias. This is to prevent an alias displaying as something unrelated like an alien face or kanji to some people and confusing things. And since both of them are empty at the moment, I doubt many people use that form of it on here anyways.

And I aliased π to --> pi since that form of it was being used as a tag. And from the comments here it was also one of the few symbols for pi that showed any consistency in being recognizable on various computers as the same type of thing. So hopefully this solution will work.

Updated by anonymous

furrypickle said:
Ok, here's what I'm going to do. Since both and Π appear to sometimes get displayed as other characters (including boxes and kanji), to keep it from confusing anything I manually put a note in their wiki to refer to pi instead using an alias. This is to prevent an alias displaying as something unrelated like an alien face or kanji to some people and confusing things. And since both of them are empty at the moment, I doubt many people use that form of it on here anyways.

And I aliased π to --> pi since that form of it was being used as a tag. And from the comments here it was also one of the few symbols for pi that showed any consistency in being recognizable on various computers as the same type of thing. So hopefully this solution will work.

👍

That's a unicode thumbs up, but I'm betting it's just a box for most, if not all of you.

Updated by anonymous

Halite said:
👍

That's a unicode thumbs up, but I'm betting it's just a box for most, if not all of you.

Non-Google Chrome master race reporting in. 👍

Updated by anonymous

parasprite said:
Just curious, what OS are you using? (I think you were the one that used Linux...?)

Yeah, Arch Linux, 64bit. That -could- potentially change things (fontconfig vs whatever Windows uses to resolve glyphs), but it seems so far that there isn't any notable difference.

I have, however, got a set of fonts installed that have quite good coverage of the Unicode character set, including fonts designed for mathematical notation.

It's funny that you have an alien face as your 'unknown character' glyph. Kind of understandable though.

I get mixed results on my iPad, but I get 3/4 Japanese with the replied version. On the original line of them I see pretty much what Genjar sees.

I didn't realize Unicode support was still this broken. Yikes. (although not working on IPad is more to be expected)
And I'm still not sure where the problem is. Do we need to specifically tell browsers that the encoding of the message input box is UTF-8, somehow?

(I'm not really seeing 'lumps of japanese' as a valid fallback character rendering strategy, here ;)

Updated by anonymous

savageorange said:
Yeah, Arch Linux, 64bit. That -could- potentially change things (fontconfig vs whatever Windows uses to resolve glyphs), but it seems so far that there isn't any notable difference.

I have, however, got a set of fonts installed that have quite good coverage of the Unicode character set, including fonts designed for mathematical notation.

It's funny that you have an alien face as your 'unknown character' glyph. Kind of understandable though.

I didn't realize Unicode support was still this broken. Yikes. (although not working on IPad is more to be expected)
And I'm still not sure where the problem is. Do we need to specifically tell browsers that the encoding of the message input box is UTF-8, somehow?

(I'm not really seeing 'lumps of japanese' as a valid fallback character rendering strategy, here ;)

I'm not sure. I know that it changes to a monospace font, but if you copy and paste the character into the box it displays fine. I think it gets converted somewhere in the middle. It does actually call for UTF-8 specifically in the code.

However, using http://www.babelstone.co.uk/Unicode/whatisit.html I think I may have figured it out:

U+1D70B : MATHEMATICAL ITALIC SMALL PI
U+D70B : HANGUL SYLLABLE HWELH

It looks like it is dropping the 1 at the beginning somehow.

This is definitely a bug (albeit a pretty minor one).
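
Here is a small Python sketch of what I think is happening. Masking with 0xFFFF is only my guess at the mechanism (I haven't seen the site's code); it is just the simplest operation that turns the first code point into the second.

```python
import unicodedata

original = "\U0001d70b"                  # the pi from the OP
truncated = chr(ord(original) & 0xFFFF)  # keep only the low 16 bits

print(f"U+{ord(original):04X}", unicodedata.name(original))
print(f"U+{ord(truncated):04X}", unicodedata.name(truncated))
```

The two names it prints match what the babelstone page reports for the original and the quoted character.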

Updated by anonymous

parasprite said:
I'm not sure. I know that it changes to a monospace font, but if you copy and paste the character into the box it displays fine. I think it gets converted somewhere in the middle. It does actually call for UTF-8 specifically in the code.

However, using http://www.babelstone.co.uk/Unicode/whatisit.html I think I may have figured it out:

U+1D70B : MATHEMATICAL ITALIC SMALL PI
U+D70B : HANGUL SYLLABLE HWELH

It looks like it is dropping the 1 at the beginning somehow.

This is definitely a bug (albeit a pretty minor one).

It probably only affects characters with a value above 0xffff, then (if some foolish bit of code is using a uint16 to store the character code instead of a uint32, that could be the cause).
Testing:

These characters should be fine when quoting: £� (EBE4, EE15, F087, F80C, FFE1, FFFD) -- it is correct for the final character to show as a 'replacement character' : in my font it is a ? in a black diamond. Some of these characters may not show as anything but a box for the average viewer

These characters should break when quoting: 𐄗𝄊𝛀🌷💛󰀀󲁡􏿿 (10117, 1D10a, 1D6C0, 1F337, 1F49B, F0000, F2061, 10FFFF) . Most or all of these characters may not show as anything but a box for the average viewer.

EDIT: Confirmed. All of the above results were as predicted -- anything beyond the bottom 16 bits was dropped.
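
For reference, this is roughly how the predictions above can be worked out in Python. `truncate_to_16_bits` is a hypothetical stand-in for whatever the site actually does; the uint16 explanation is still an assumption until someone looks at the backend.

```python
def truncate_to_16_bits(code_point: int) -> int:
    """Simulate storing a code point in a 16-bit unsigned integer."""
    return code_point & 0xFFFF


# The two groups of code points from the test above.
should_survive = [0xEBE4, 0xEE15, 0xF087, 0xF80C, 0xFFE1, 0xFFFD]
should_break = [0x10117, 0x1D10A, 0x1D6C0, 0x1F337, 0x1F49B,
                0xF0000, 0xF2061, 0x10FFFF]

for cp in should_survive + should_break:
    result = truncate_to_16_bits(cp)
    status = "unchanged" if result == cp else f"becomes U+{result:04X}"
    print(f"U+{cp:04X} {status}")
```

Everything at or below FFFF comes through unchanged, and everything above it collapses onto some unrelated character in the 0000-FFFF range.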

Updated by anonymous

savageorange said:
It probably only affects characters with a value above 0xffff, then (if some foolish bit of code is using a uint16 to store the character code instead of a uint32, that could be the cause).
Testing:

These characters should be fine when quoting: £� (EBE4, EE15, F087, F80C, FFE1, FFFD) -- it is correct for the final character to show as a 'replacement character' : in my font it is a ? in a black diamond. Some of these characters may not show as anything but a box for the average viewer

These characters should break when quoting: ė턊훀⁡￿ (10117, 1D10a, 1D6C0, 1F337, 1F49B, F0000, F2061, 10FFFF) . Most or all of these characters may not show as anything but a box for the average viewer.

EDIT: Confirmed. All of the above results were as predicted -- anything beyond the bottom 16 bits was dropped.

Yeah, that's essentially what I got

Nice job. Any chance you could send a message to Toni about this since you actually know what you're talking about? Else I can just ask them to look at the thread.

Updated by anonymous

It's like a myriad of special effects in here, every post has a handful of symbols I've never seen before.

Updated by anonymous

parasprite said:
Yeah, that's essentially what I got

Nice job. Any chance you could send a message to Toni about this since you actually know what you're talking about? Else I can just ask them to look at the thread.

Not sure what is to be gained by specifically PMing him, but I've added a bug report to the bug report thread. Thanks for confirming my results.

Updated by anonymous
