Can't post with non-BMP characters such as emoji

by on (#161973)
In this post, I was trying to make a distinction between SNROM with battery and SNROM without battery by including the Unicode character BATTERY (U+1F50B). I can preview the post just fine:
[Attachment: sgemoji.png. File comment: Preview works fine.]

But when I try to post I get this error:
General Error
SQL ERROR [ mysqli ]

Incorrect string value: '\xF0\x9F\x94\x8B\x0A\x0A...' for column 'post_text' at row 1 [1366]

It appears that MySQL by default does not support UTF-8 code unit sequences that correspond to Unicode code points outside the Basic Multilingual Plane (U+0000 through U+FFFF). In GTK+ applications, this character can be typed with Ctrl-Shift-U 1f50b Space. A related question on Stack Overflow, "“Incorrect string value” when trying to insert UTF-8 into MySQL via JDBC?", implies that certain settings will need to be flipped from utf8 (BMP-only UTF-8) to utf8mb4 (UTF-8 including extra planes, the NES 2.0 of Unicode), which was introduced in MySQL 5.5.3.
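For reference, the bytes in the error message are exactly the four-byte UTF-8 encoding of U+1F50B. A minimal Python sketch of that, plus a check for which characters in a string fall outside the BMP (the helper name `non_bmp_chars` is made up for illustration and has nothing to do with phpBB itself):

```python
def non_bmp_chars(text):
    """Return the characters of text whose code points lie above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


battery = "\U0001F50B"  # BATTERY
encoded = battery.encode("utf-8")

# The same four bytes that MySQL complains about in the error above.
print(" ".join(f"{b:02x}" for b in encoded))  # f0 9f 94 8b

# Any character listed here needs four bytes in UTF-8, which a
# CHARACTER SET utf8 column cannot store.
print(non_bmp_chars("SNROM " + battery))
```

A check like this could in principle run before the INSERT to give a friendlier message than the raw SQL error, though that would be a workaround rather than a fix.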
Re: Can't post with non-BMP characters such as emoji
by on (#162037)
I would have mistaken the battery graphic for a logic gate or something.
Re: Can't post with non-BMP characters such as emoji
by on (#166365)
It's not just the battery. Other characters, such as the emoji that inspired several emblems in the NES homebrew game Concentration Room, can't be posted because the column for the text of a post is set to CHARACTER SET utf8 instead of CHARACTER SET utf8mb4. Part of the reason for this is that this forum's server runs MySQL 5.1.62, not 5.5.3 or later. From a guide to enabling full Unicode in MySQL:

Step 2: Upgrade the MySQL server

Upgrade the MySQL server to v5.5.3+, or ask your server administrator to do it for you.

I hereby request that the server administrator do it for me.
Re: Can't post with non-BMP characters such as emoji
by on (#213992)
Today I discovered that MySQL had been upgraded to 5.5.53, but the tables still had not been upgraded from utf8 to utf8mb4.
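As a rough sketch of what the remaining table conversion might look like, the following Python snippet only builds the `ALTER TABLE ... CONVERT TO CHARACTER SET` statements and prints them; it does not connect to any database. The table name `phpbb_posts` assumes phpBB's default table prefix, the collation `utf8mb4_unicode_ci` is just one plausible choice, and the function name `conversion_statements` is hypothetical. None of this is taken from phpBB's own upgrade procedure.

```python
import re


def conversion_statements(tables, charset="utf8mb4",
                          collation="utf8mb4_unicode_ci"):
    """Build one ALTER TABLE statement per table name (strings only)."""
    statements = []
    for name in tables:
        # Refuse anything that is not a plain identifier, so that the
        # generated SQL cannot be altered by an odd table name.
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            raise ValueError(f"unexpected table name: {name!r}")
        statements.append(
            f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# Assumption: the posts table uses phpBB's default "phpbb_" prefix.
for statement in conversion_statements(["phpbb_posts"]):
    print(statement)
```

The printed statements would still need to be reviewed and run by the server administrator after a full backup, and the connection character set would have to be switched to utf8mb4 as well for 4-byte characters to arrive intact.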
Re: Can't post with non-BMP characters such as emoji
by on (#214033)
tepples, please remember: just because a Unicode glyph exists doesn't mean everyone can view it. Not everyone's devices have every version of Unicode on them. For example, my mobile phone is from 2013 and there are many present-day glyphs that show up as [X]. Battery symbol is Unicode 6.0, so my phone has it, but just because *my* device has it doesn't mean it's wise to use. It also can cause problems when such content is copy-pasted into non-Unicode mediums.

In other words: use of said glyph (vs. an actual word) brings nothing to the table (pun not intended) content-wise. You're just being pedantic / obsessed.

That said: upgrading of the MySQL tables from utf8 to utf8mb4 should probably happen anyways, but you should do some digging to see if there are negative ramifications of that. Examples include: bugs in MySQL client, bugs in MySQL server, performance implications in MySQL server, disk space concerns, incompatibility/breakage with future phpBB upgrades (these upgrades often do ALTER TABLE or other table mangling), and so on. It's always best to follow what the software (phpBB) recommends, even if that means giving up something you personally want. One must be practical.
Re: Can't post with non-BMP characters such as emoji
by on (#221206)
Hopefully some of the performance problems have been worked out in the eight years since utf8mb4 was introduced. When is a full backup scheduled in preparation for an upgrade to MySQL 5.5.3 or later so that the tables can be converted to utf8mb4? Or is phpBB generally structured such that the only way to convert it from utf8 to utf8mb4 is by wiping all posts and starting over?