Clan x86

Topic started by: Camel on June 26, 2008, 03:42:08 PM

Title: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 26, 2008, 03:42:08 PM
So I've just committed a major revision to my bot that lets it send encrypted unicode characters. I've introduced a new ByteArray class, which implicitly does the String->UTF-8 conversion and vice-versa. This allows my encryption modules to work on the UTF-8 encoded byte arrays directly, thus circumventing the issue of worrying about unicode characters.

I've got a chat splitter in my bot; the idea is that if you type a really long line of text into the chat box and hit enter, or if a command has a long response, the core will automatically split it up into multiple properly formatted SID_CHATCOMMAND messages.

Here's the catch: non-ASCII unicode characters manifest in UTF-8 byte arrays as multi-byte sequences (pairs, trios, or occasionally four bytes), but in order to calculate how much data to pull out of the buffer, I have to have already converted the unicode characters to a UTF-8 byte array, because bnet limits you on the size of the byte array, not the unicode string. This means that the chat splitter can potentially break a unicode character across two lines of text.

Any ideas on how to work around this situation?
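
To make the problem concrete, here is a minimal sketch in plain Java. It is not the bot's code: the class and method names are made up for illustration, and it uses String and byte[] directly instead of the ByteArray class. It shows that the char count and the UTF-8 byte count differ, and that cutting the byte array at an arbitrary index can land in the middle of a character.

```java
import java.nio.charset.StandardCharsets;

class Utf8SplitProblem {
    /** Number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Decodes the first n bytes of the UTF-8 encoding, ignoring character boundaries. */
    static String naiveHead(String s, int n) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, 0, Math.min(n, bytes.length), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "na\u00efve \u20ac";      // "naive" with a diaeresis, a space, and a euro sign
        System.out.println(text.length());      // 7 chars
        System.out.println(utf8Length(text));   // 10 bytes
        // The third byte is the first half of the two-byte i-diaeresis, so the
        // decoder substitutes the replacement character U+FFFD for it.
        System.out.println(naiveHead(text, 3).equals("na\uFFFD")); // true
    }
}
```

The receiving side would see the same kind of damage if one message ended with the first byte of a character and the next message started with the rest.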
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Chavo on June 26, 2008, 05:03:20 PM
split before you convert the string (even if it means doing so outside your normal message splitter)
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 27, 2008, 03:04:46 AM
I can't do that, because I don't know how long the unicode string will be once utf8 encoded. In the worst case, it will triple in size, but it would be foolish to limit the user to 66 characters.
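
For what it's worth, the encoded width of each character can be worked out without encoding the whole string, because UTF-8 width depends only on the code point value. The following is a rough sketch under that assumption, not something from the bot: CodePointSplitter, utf8Width and split are hypothetical names, and the ellipsis and crypto handling are left out. It splits a String into pieces that each fit a byte budget, so the limit stays in bytes while the cut points stay on character boundaries.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointSplitter {
    /** UTF-8 width in bytes of a single Unicode code point. */
    static int utf8Width(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    /** Splits text into pieces whose UTF-8 encodings are each at most maxBytes long. */
    static List<String> split(String text, int maxBytes) {
        if (maxBytes < 4) {
            throw new IllegalArgumentException("maxBytes must fit at least one character");
        }
        List<String> pieces = new ArrayList<>();
        int start = 0; // char index where the current piece begins
        int bytes = 0; // UTF-8 bytes accumulated in the current piece
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int width = utf8Width(cp);
            if (bytes + width > maxBytes) {
                // The next character would overflow the budget; close the piece here.
                pieces.add(text.substring(start, i));
                start = i;
                bytes = 0;
            }
            bytes += width;
            i += Character.charCount(cp); // 2 for surrogate pairs, otherwise 1
        }
        if (start < text.length()) {
            pieces.add(text.substring(start));
        }
        return pieces;
    }

    public static void main(String[] args) {
        // With a 4-byte budget this yields three pieces of 4, 3 and 3 bytes.
        System.out.println(split("na\u00efve \u20ac", 4).size()); // 3
    }
}
```

With this approach a line of plain ASCII still gets the full byte budget, and only text that really is multi-byte gets shorter pieces.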

This is roughly what I have, in very very simplified form

sendChat(prefix="/w bnu-camel ", text="some really long unicode string", crypto=DM_ENCRYPTION) {
  int MAX_CHAT_LENGTH = 200; // just for shits
  int length_to_pull_from_buffer = MAX_CHAT_LENGTH - prefix.toUtf8.length;
  if(crypto == DM_ENCRYPTION)
    length_to_pull_from_buffer = (length_to_pull_from_buffer - 1) / 2; // DM has a prefix and doubles length

  for(int i = 0; i < text.length; i += length_to_pull_from_buffer) {
    part = prefix + encrypt(text.substr(i, length_to_pull_from_buffer));
  }
}
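
As a worked example of the budget arithmetic above, here is a small runnable sketch. The 200-byte limit is the illustrative value from the pseudocode, not a verified bnet limit, and PieceBudget and plaintextBudget are hypothetical names. It assumes, as the pseudocode's comment implies, that DM adds a one-byte prefix and doubles the payload.

```java
import java.nio.charset.StandardCharsets;

class PieceBudget {
    static final int MAX_CHAT_LENGTH = 200; // illustrative value taken from the pseudocode

    /** Bytes of plaintext that fit in one message after the prefix and optional DM expansion. */
    static int plaintextBudget(String prefix, boolean dm) {
        int budget = MAX_CHAT_LENGTH - prefix.getBytes(StandardCharsets.UTF_8).length;
        if (dm) {
            budget = (budget - 1) / 2; // one byte for the DM prefix, then half for the doubling
        }
        return budget;
    }

    public static void main(String[] args) {
        String prefix = "/w bnu-camel "; // 13 bytes
        System.out.println(plaintextBudget(prefix, false)); // 187
        System.out.println(plaintextBudget(prefix, true));  // 93
    }
}
```

Checking the DM case by hand: 93 bytes of plaintext become 1 + 2 * 93 = 187 bytes, which together with the 13-byte prefix is exactly 200.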

Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 27, 2008, 03:10:33 AM
Expanded:
private void enqueueChat(ByteArray prefix, ByteArray text, int priority) {
    //Split up the text in to appropriate sized pieces
    int pieceSize = MAX_CHAT_LENGTH;
    if(prefix != null)
        pieceSize -= prefix.length();
    if(enabledCryptos != 0) {
        if((enabledCryptos & GenericCrypto.CRYPTO_REVERSE) != 0)
            pieceSize--; // Reverse has a prefix
        if((enabledCryptos & GenericCrypto.CRYPTO_MC) != 0)
            pieceSize--; // MC has a prefix
        if((enabledCryptos & GenericCrypto.CRYPTO_DM) != 0)
            pieceSize = (pieceSize - 1) / 2; // DM doubles in size and has a prefix
        if((enabledCryptos & GenericCrypto.CRYPTO_HEX) != 0)
            pieceSize = (pieceSize - 1) / 2; // Hex doubles in size and has a prefix
        if((enabledCryptos & GenericCrypto.CRYPTO_BASE64) != 0)
            pieceSize = (pieceSize - 1) * 3 / 4; // B64 increases 33% and has a prefix
    }

    ChatQueue cq = profile.getChatQueue();
    for(int i = 0; i < text.length(); i += pieceSize) {
        ByteArray piece = text.substring(i);
        if(i > 0) {
            // This is not the first piece; prepend ellipsis
            piece = new ByteArray("...").concat(piece);
            i -= 3;
        }
        if(piece.length() > pieceSize) {
            // This is not the last piece; append ellipsis
            piece = piece.substring(0, pieceSize - 3).concat("...".getBytes());
            i -= 3;
        }

        // Cryptos
        if(enabledCryptos != 0)
            piece = GenericCrypto.encode(piece, enabledCryptos);

        // Prepend the prefix
        if(prefix != null)
            piece = prefix.concat(piece);

        cq.enqueue(this, piece, priority);
    }
}
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 27, 2008, 03:19:07 AM
I had an idea; ByteArrayEx extends ByteArray, and keeps track of the width of each unicode char, thus allowing the splitter to obtain a hint about where it's safe to split.
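
A related sketch, using a different technique from the ByteArrayEx idea: rather than tracking widths separately, it relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so the splitter can back up from a proposed cut until it reaches a byte that starts a character. This assumes the array is valid UTF-8 and has not been through any of the cryptos yet. Utf8Boundary, isContinuation and safeCut are hypothetical names, and plain byte[] stands in for ByteArray.

```java
import java.nio.charset.StandardCharsets;

class Utf8Boundary {
    /** True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
    static boolean isContinuation(byte b) {
        return (b & 0xC0) == 0x80;
    }

    /**
     * Returns the largest index not exceeding limit at which data can be cut
     * without splitting a multi-byte character. Assumes data is valid UTF-8.
     */
    static int safeCut(byte[] data, int limit) {
        if (limit >= data.length) {
            return data.length;
        }
        int cut = limit;
        // data[cut] would become the first byte of the next piece,
        // so it must not be a continuation byte.
        while (cut > 0 && isContinuation(data[cut])) {
            cut--;
        }
        return cut;
    }

    public static void main(String[] args) {
        byte[] data = "na\u00efve \u20ac".getBytes(StandardCharsets.UTF_8); // 10 bytes
        System.out.println(safeCut(data, 3)); // 2: index 3 is inside the two-byte character
        System.out.println(safeCut(data, 4)); // 4: already on a boundary
        System.out.println(safeCut(data, 9)); // 7: index 9 is inside the three-byte character
    }
}
```

A caller would need a piece size of at least four bytes, otherwise the cut could back up all the way to the start of the piece and make no progress.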
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 30, 2008, 09:16:35 AM
Quote from: Camel on June 27, 2008, 03:19:07 AM
I had an idea; ByteArrayEx extends ByteArray, and keeps track of the width of each unicode char, thus allowing the splitter to obtain a hint about where it's safe to split.

I got about halfway into considering how ugly this solution would be before I marked ByteArray as a final class.

Any other ideas? :)
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Joe on June 30, 2008, 12:26:59 PM
I swear this wasn't supposed to look like VB. But, what's wrong with the idea of doing it like this?

Const MaxLength = 200 //i did it for the lulz?

Proc SendText(Utf8Text)
    If Length(Utf8Text) < MaxLength Then Enqueue(ToUnicode(Utf8Text)); Return

    For I = 0 To Length(Utf8Text) - 1 Step MaxLength
        Enqueue Substring(ToUnicode(Utf8Text), I, MaxLength)
    Next I
End Proc
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 30, 2008, 01:22:56 PM
That doesn't prevent a utf8 encoded unicode character from being split in half

[edit] and, additionally, is exactly what i already have :)
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: MyndFyre on June 30, 2008, 05:09:36 PM
What if you prepend the byte length of the message?
Title: Re: [Java] What happens when you combine support for Unicode, DM encryption, and...
Post by: Camel on June 30, 2008, 11:22:40 PM
Quote from: MyndFyre on June 30, 2008, 05:09:36 PM
What if you prepend the byte length of the message?

That doesn't address the issue either.