So I've just committed a major revision to my bot that lets it send encrypted Unicode characters. I've introduced a new ByteArray class, which implicitly does the String->UTF-8 conversion and vice versa. This allows my encryption modules to work on the UTF-8 encoded byte arrays directly, so they no longer have to worry about Unicode characters at all.
I've got a chat splitter in my bot; the idea is that if you type a really long line of text into the chat box and hit enter, or if a command has a long response, the core will automatically split it up into multiple properly formatted SID_CHATCOMMAND messages.
Here's the catch: non-ASCII Unicode characters show up in UTF-8 byte arrays as sequences of two, three, or even four bytes. To calculate how much data to pull out of the buffer, I have to have already converted the string to a UTF-8 byte array, because bnet limits the size of the byte array, not the length of the Unicode string. This means the chat splitter can cut a multi-byte character in half, leaving its first bytes at the end of one message and the rest at the start of the next.
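To make the problem concrete, here's a minimal sketch in Python (used purely for illustration, this is not my bot's code). The `naive_split` helper and the tiny 2-byte limit are made up for the example; the real limit would be whatever bnet enforces.

```python
from typing import List


def naive_split(data: bytes, limit: int) -> List[bytes]:
    """Cut a byte array every `limit` bytes, ignoring character boundaries."""
    return [data[i:i + limit] for i in range(0, len(data), limit)]


text = "héllo"
data = text.encode("utf-8")  # 'é' becomes the two bytes 0xC3 0xA9

# Hypothetical 2-byte limit, chosen only so the cut lands inside 'é'.
chunks = naive_split(data, 2)

for chunk in chunks:
    try:
        print(chunk, "->", chunk.decode("utf-8"))
    except UnicodeDecodeError:
        # The chunk starts or ends in the middle of a multi-byte character.
        print(chunk, "-> not valid UTF-8 on its own")
```

The first chunk ends with the lead byte of 'é' and the second begins with its trailing byte, so neither decodes cleanly by itself.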
Any ideas on how to work around this situation?
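One direction that might work, as a rough sketch rather than anything I've put in the bot: in UTF-8 every continuation byte has the bit pattern 10xxxxxx, so the splitter can back the cut point up until the byte at the cut is not a continuation byte. The function name `split_utf8` and the `limit` parameter are hypothetical, Python is used only for illustration, and the sketch assumes the input is valid UTF-8.

```python
from typing import List


def split_utf8(data: bytes, limit: int) -> List[bytes]:
    """Split UTF-8 bytes into chunks of at most `limit` bytes
    without cutting through a multi-byte character."""
    if limit < 4:
        # A single UTF-8 character can take up to 4 bytes.
        raise ValueError("limit must be at least 4 bytes")

    chunks = []
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        # If the next chunk would begin with a continuation byte
        # (10xxxxxx), the cut is inside a character: move it back.
        while start < end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            # Only reachable with malformed input; fall back to a hard cut
            # so the loop always makes progress.
            end = min(start + limit, len(data))
        chunks.append(data[start:end])
        start = end
    return chunks


message = "héllo wörld ☃ 😀"
for chunk in split_utf8(message.encode("utf-8"), 4):
    # Every chunk decodes on its own because cuts fall between characters.
    print(chunk.decode("utf-8"))
```

Chunks can come out slightly shorter than the limit, which seems like an acceptable trade for never sending half a character.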