Author Topic: Re-writing JBLS  (Read 17533 times)


Offline warz

  • Hero Member
  • *****
  • Posts: 1134
    • View Profile
    • chyea.org
Re: Re-writing JBLS
« Reply #15 on: June 12, 2008, 02:00:51 am »
Quote
I don't think the algorithms need worker threads, the algorithm stuff can be done as part of the connection's thread. People shouldn't be requesting a bunch of things anyways.

exactly. like i said, no matter what you do, it's going to be a "queue style process" - one thing will always be before the next step. make your algorithms into some class, and instantiate an object that handles those algorithms in each connection's thread. in the grand scheme of things, this isn't a terribly difficult design concept. it's your typical server style app.
http://www.chyea.org/ - web based markup debugger

Offline MyndFyre

  • Boticulator Extraordinaire
  • x86
  • Hero Member
  • *****
  • Posts: 4540
  • The wait is over.
    • View Profile
    • JinxBot :: the evolution in boticulation
Re: Re-writing JBLS
« Reply #16 on: June 12, 2008, 05:54:07 am »
Quote
What could be done in parallel for a given Crev anyway?  Each result is dependent on the last.
Whoever said they were dependent on each other? The only thing crevs use from the last ones is their results in the cache [if it's in the cache it doesn't need to be computed].

It's been a while since I've worked on crev, but I'm pretty sure that, as a given crev is calculated, the values are order-dependent.  Meaning, you can't hash starcraft.exe, storm.dll, and battle.snp and expect it to be the same as hashing storm.dll, battle.snp, then starcraft.exe.  So, you can't parallelize a given crev calculation.

You can, of course, parallelize multiple requests, and you could even add into a queue that you're currently calculating a given formula/product pair and, rather than running multiple requests for the same, wait until an existing request is completed.
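A minimal sketch of the "wait for the existing request" idea, assuming each formula/product pair can be reduced to a string key. The class name `CrevCoalescer`, the key format, and the `Integer` result type are hypothetical; only the `java.util.concurrent` calls are real API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical helper: concurrent requests for the same key share one calculation.
class CrevCoalescer {
    private final ConcurrentMap<String, FutureTask<Integer>> inFlight =
            new ConcurrentHashMap<String, FutureTask<Integer>>();

    // key is assumed to identify one formula/product pair
    int compute(String key, Callable<Integer> work)
            throws InterruptedException, ExecutionException {
        FutureTask<Integer> mine = new FutureTask<Integer>(work);
        FutureTask<Integer> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Another thread is already calculating this pair: wait for its result.
            return existing.get();
        }
        try {
            mine.run();        // we registered first, so do the work on this thread
            return mine.get();
        } finally {
            // Drop the in-flight marker; a real server would store the result in its cache here.
            inFlight.remove(key, mine);
        }
    }
}
```

Each connection thread would call `compute(key, work)`; only the first caller for a key runs `work`, and the rest block until it finishes.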
I have a programming folder, and I have nothing of value there

Running with Code has a new home!

Our species really annoys me.

Offline iago

  • Leader
  • Administrator
  • Hero Member
  • *****
  • Posts: 17914
  • Fnord.
    • View Profile
    • SkullSecurity
Re: Re-writing JBLS
« Reply #17 on: June 12, 2008, 08:22:13 am »
Incidentally, parallelizing within an algorithm will only speed things up if multiple CPUs are being used. If that isn't the case, then there's little point, other than to handle other requests while a slow one is going.

Offline Joe

  • B&
  • Moderator
  • Hero Member
  • *****
  • Posts: 10319
  • In Soviet Russia, text read you!
    • View Profile
    • Github
Re: Re-writing JBLS
« Reply #18 on: June 12, 2008, 10:43:51 am »
Quote
Incidentally, parallelizing within an algorithm will only speed things up if multiple CPUs are being used. If that isn't the case, then there's little point, other than to handle other requests while a slow one is going.

I feel it's still good coding practice, especially with dual-cores being the norm now and quad-cores emerging on the consumer market.
I'd personally do as Joe suggests

You might be right about that, Joe.


Offline iago

  • Leader
  • Administrator
  • Hero Member
  • *****
  • Posts: 17914
  • Fnord.
    • View Profile
    • SkullSecurity
Re: Re-writing JBLS
« Reply #19 on: June 12, 2008, 10:54:47 am »
Quote
Incidentally, parallelizing within an algorithm will only speed things up if multiple CPUs are being used. If that isn't the case, then there's little point, other than to handle other requests while a slow one is going.

I feel it's still good coding practice, especially with dual-cores being the norm now and quad-cores emerging on the consumer market.
I disagree, this is the type of thing that falls under premature optimization. I think it'd be more useful to have it working cleanly in one thread/connection, and let it process 2 or 4 connections simultaneously.
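One way to read "one thread per connection, 2 or 4 at a time" is a fixed-size thread pool. This is only a sketch under that assumption: `ConnectionPoolSketch` and `handleAll` are made-up names, and each `Callable` stands in for a whole connection's work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConnectionPoolSketch {
    // Runs each connection's work start-to-finish on one thread, at most poolSize at a time.
    static List<Integer> handleAll(List<Callable<Integer>> connections, int poolSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Integer> results = new ArrayList<Integer>();
            // invokeAll blocks until every task is done and returns futures in submission order
            for (Future<Integer> f : pool.invokeAll(connections)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Nothing inside a single task is parallelized, so the per-connection code stays simple; the pool size is the only tuning knob.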

Offline Hdx

  • The Hdx!
  • Full Member
  • ***
  • Posts: 311
  • <3 Java/Cpp/VB/QB
    • View Profile
Re: Re-writing JBLS
« Reply #20 on: June 12, 2008, 10:58:18 am »
Quote
It's been a while since I've worked on crev, but I'm pretty sure that, as a given crev is calculated, the values are order-dependent.  Meaning, you can't hash starcraft.exe, storm.dll, and battle.snp and expect it to be the same as hashing storm.dll, battle.snp, then starcraft.exe.  So, you can't parallelize a given crev calculation.

You can, of course, parallelize multiple requests, and you could even add into a queue that you're currently calculating a given formula/product pair and, rather than running multiple requests for the same, wait until an existing request is completed.
Ah, that's what you meant; this is correct. I never planned on doing it at that low of a level.
http://img140.exs.cx/img140/6720/hdxnew6lb.gif
09/08/05 - Clan SBs @ USEast
 [19:59:04.000] <DeadHelp> We don't like customers.
 [19:59:05.922] <DeadHelp> They're assholes
 [19:59:08.094] <DeadHelp> And they're never right.

Offline Camel

  • Hero Member
  • *****
  • Posts: 1703
    • View Profile
    • BNU Bot
Re: Re-writing JBLS
« Reply #21 on: June 12, 2008, 05:57:05 pm »
Quote
I don't think the algorithms need worker threads, the algorithm stuff can be done as part of the connection's thread. People shouldn't be requesting a bunch of things anyways.
People do request a ton of crap. It's just them being stupid.
Right now all the 'one run' algorithms are in static form, meaning one (1) instance of the function is in memory for all threads to use. Unless I'm being retarded and this isn't how it works.
Now the simple thing to do is to make everything non-static, and create/destroy them as needed.

It sounds like you might be misunderstanding something here. Instantiating a type doesn't create copies of the methods in memory; it only allocates an object big enough for the non-static fields of that class (plus a small header). All methods are of a static nature in memory, but methods that are not marked static can reference "this," since they are always called with respect to that object.

I'm not sure exactly what the Java bytecode looks like, but in C++, the traditionally generated asm for a member function takes a pointer to the object as a hidden first parameter, on top of the parameters in the function's declaration. The theory is exactly the same with Java.


[edit] That said, it still seems appropriate for the CheckRevision classes to have static methods, since the lifecycle of an object would be one method call, and because the object would be stateless.
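To make the "this" point concrete, here is a small sketch. The `Counter` class is made up for illustration; the two methods do the same thing, one with the object reference implicit and one with it spelled out.

```java
class Counter {
    int count; // per-object state: the only thing each instance has to store

    // Instance method: the object reference arrives implicitly as "this".
    void increment() {
        this.count++;
    }

    // Hand-written static equivalent with the reference passed explicitly.
    static void incrementStatic(Counter self) {
        self.count++;
    }
}
```

Creating a thousand `Counter` objects allocates a thousand `count` fields, but there is still only one copy of each method.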
« Last Edit: June 12, 2008, 06:00:42 pm by Camel »

<Camel> i said what what
<Blaze> in the butt
<Camel> you want to do it in my butt?
<Blaze> in my butt
<Camel> let's do it in the butt
<Blaze> Okay!

Offline Camel

  • Hero Member
  • *****
  • Posts: 1703
    • View Profile
    • BNU Bot
Re: Re-writing JBLS
« Reply #22 on: June 12, 2008, 06:25:56 pm »
The more I think about it, the more I think that you may not want to use asynchronous sockets. The lifecycle of a BNLS connection is pretty short - it lasts, in the worst case, the full duration of one login cycle, plus some idling if the bot doesn't close the socket. Asynchronous sockets might be valueless, since even the (probably unrealistic) upper limit of 100 simultaneous connections wouldn't really bog down a thread-per-connection design.

You absolutely should limit the maximum number of connections to a fairly small number, and perhaps consider booting people who are in an idle state if you wish to accept a new connection while at capacity. This is just good practice.

So, with that said and done, here's part one of my proposed design: in your main thread, read the config file in (consider using my property file reader, and store all the settings in static fields of a settings class), and then bring up the listener socket. Every time you accept(), create a new instance of a class extending Thread and start() it, handing off the socket to the thread.

In the constructor of your connection thread, call setDaemon(true), which you must do **before** you start() it. Daemon threads are not taken into consideration when the JVM is deciding whether the program is done and should exit, so a leftover connection thread can't keep the process alive at shutdown. Any performance improvement from this is theoretical at best, even in the scenario where you have lots of threads (which you shouldn't allow to happen anyways).
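The accept loop and daemon setup described above could look roughly like this. It is a sketch under assumptions: the `listen` method, the `open` counter, and `MAX_CONNECTIONS = 100` are hypothetical, and it refuses new connections at capacity rather than booting idle ones.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionThread extends Thread {
    static final int MAX_CONNECTIONS = 100;              // assumed cap, tune to taste
    static final AtomicInteger open = new AtomicInteger();
    private final Socket sck;

    ConnectionThread(Socket sck) {
        this.sck = sck;
        setDaemon(true);                                  // must happen before start()
    }

    @Override
    public void run() {
        try {
            // ... the packet loop goes here ...
        } finally {
            open.decrementAndGet();                       // free the slot however we exit
            try { sck.close(); } catch (IOException ignored) { }
        }
    }

    // Accepts connections forever, handing each socket to its own thread.
    static void listen(ServerSocket listener) throws IOException {
        while (true) {
            Socket sck = listener.accept();
            if (open.incrementAndGet() > MAX_CONNECTIONS) {
                open.decrementAndGet();                   // at capacity: refuse this one
                sck.close();
                continue;
            }
            new ConnectionThread(sck).start();
        }
    }
}
```

The main thread would read the settings, create the `ServerSocket`, and then call `ConnectionThread.listen(listener)`.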

Use this model for your connection thread:
Code: [Select]
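@Override
public void run() {
    // sck is assumed to be a java.net.Socket; loop until closed or idle for too long
    while(!idleForTooLong && connectionOpen) {
        try {
            if(sck.getInputStream().available() >= PACKET_HEADER_SIZE) {
                parse();
            } else {
                // nothing to read yet; sleeping already gives up the CPU, so no yield() is needed
                Thread.sleep(x);
            }
        } catch(Throwable t) {
            // report the exception
            t.printStackTrace();
        }
    }
}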

Why do I wrap everything? Because uncaught exceptions in worker threads will make your server crash, which JBLS currently does about 3 times a week on my server. This is not to say it's acceptable to mask exceptions and allow the fallback to handle everything; always handle IO exceptions in an appropriate place. This will just ensure that, if you missed one, you can still recover a stack trace and fix the bug, and it will also prevent the server from being able to crash, which is a good thing.
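As a complement to the catch-all (this is an addition, not something from the posts above), Java also lets you attach a handler that is called when an exception does escape a thread. A minimal sketch, with the hypothetical class name `CrashReporter`:

```java
import java.util.concurrent.atomic.AtomicReference;

// Records and prints any exception that escapes a thread's run() method.
class CrashReporter implements Thread.UncaughtExceptionHandler {
    final AtomicReference<Throwable> last = new AtomicReference<Throwable>();

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        last.set(e);                                   // keep it for later inspection
        System.err.println("Thread " + t.getName() + " died:");
        e.printStackTrace();                           // at least the stack trace is not lost
    }
}
```

Install it per thread with `thread.setUncaughtExceptionHandler(new CrashReporter())` before calling `start()`.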

Oh, also, idleForTooLong and connectionOpen should be primitive booleans, so that they can't be null and cause a NullPointerException, or be methods that interact with sockets which could cause an IOException.

Offline Hdx

  • The Hdx!
  • Full Member
  • ***
  • Posts: 311
  • <3 Java/Cpp/VB/QB
    • View Profile
Re: Re-writing JBLS
« Reply #23 on: June 12, 2008, 06:31:19 pm »
Quote
It sounds like you might be misunderstanding something here. Instantiating a type doesn't create copies of the methods in memory; it only allocates an object big enough for the non-static fields of that class (plus a small header). All methods are of a static nature in memory, but methods that are not marked static can reference "this," since they are always called with respect to that object.

I'm not sure exactly what the Java bytecode looks like, but in C++, the traditionally generated asm for a member function takes a pointer to the object as a hidden first parameter, on top of the parameters in the function's declaration. The theory is exactly the same with Java.


[edit] That said, it still seems appropriate for the CheckRevision classes to have static methods, since the lifecycle of an object would be one method call, and because the object would be stateless.
I must be misunderstanding something. If this is wrong I don't know where it got stuck in my head/who said it. But the way I thought Java's objects worked was that on init all files were loaded and all 'static' methods and vars were mapped out into memory. Then, when an object was created, it loaded/mapped new instances of the non-static parts into memory, and updated them to point the calls/references to static functions to wherever they were loaded.

Ugh. Well, today is my 4th day, which means I have the next 4 days off. So after some much needed sleep [meaning ~14 hours] I'll make a benchmark app real quick: run 200 static crevs, then 200 with worker threads.

Did I ever mention how bored I am?

[needs to read your 2nd post]

-------------------------------------------------
JBLS shouldn't crash 3 times a week. If I were planning on modifying the current version, I'd have you do some error reporting. But alas I'm not, so it will die soon.

One of the things I really do not want to deal with right now is configuration. It's not really a priority.
The ONLY thing I want the server to do is work, and be efficient. I would let others make plugins that dealt with configuration/extra features.
As for the basic outline you gave, that's pretty much how JBLS works now. I guess it works fine. I don't really care about async socket handling [that part is negligible, they are delayed the 1/2 ms it takes to create a thread...]; I'm more thinking async PACKET handling.
As for the exception catching, I'd definitely make sure I was careful with that.

The Only file IO I want to have JBLS do is hash files. And *possibly* a Verbyte file.
« Last Edit: June 12, 2008, 06:52:57 pm by HdxBmx27 »

Offline Camel

  • Hero Member
  • *****
  • Posts: 1703
    • View Profile
    • BNU Bot
Re: Re-writing JBLS
« Reply #24 on: June 12, 2008, 07:11:04 pm »
Part 2: synchronization.

Consider a case where you have a huge number of simultaneous requests for CheckRevision(). If you try to process all of them at the same time, you're most likely going to cause every one of the people who requested CheckRevision() to time out on battle.net. So, the idea of a queue comes in to play.

Queues are great if you're at an amusement park. Queues are not great when you have a maximum time to live (TTL); for this scenario, I'm going to say that is 30s. You do not want to do CheckRevision()s all at once because everyone will die, and you do not want to do CheckRevision()s one at a time, because you'll finish the request for fewer than the maximum possible number of people in the window.

The bottom line is that you will never be able to eliminate the case where someone can timeout. For this reason, you should time how long someone is waiting for CheckRevision() to start. If they are waiting 20s for it to start with a 30s TTL, just close their socket - it's a lost cause. By not attempting to do a CheckRevision() for the person who can't use it, you're making it more likely that the next person in line will have a chance to succeed. The bot might even detect the socket is closed in time to abort their bnet connection, thus avoiding the ridiculous ip ban that bnet gives for taking too long to complete CheckRevision().
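The "give up after 20s" rule boils down to a timestamp comparison. A minimal sketch, assuming the 20s figure from the paragraph above; the class `CrevTicket` and its method name are hypothetical.

```java
// Tracks how long one request has been waiting for CheckRevision() to start.
class CrevTicket {
    static final long GIVE_UP_MS = 20000; // past this wait, a 30s TTL can no longer be met

    private final long enqueuedAtMs;

    CrevTicket(long nowMs) {
        this.enqueuedAtMs = nowMs;
    }

    // True if starting CheckRevision() now would be pointless: close the socket instead.
    boolean isLostCause(long nowMs) {
        return nowMs - enqueuedAtMs >= GIVE_UP_MS;
    }
}
```

A connection thread would create the ticket with `System.currentTimeMillis()` when the request arrives and check `isLostCause(System.currentTimeMillis())` right before starting the calculation.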

Now, on to how to design a queue. I won't bore you with how to choose or construct a queue class, but I will bore you with several strategies for using them.

The first, and simplest:
Code: [Select]
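import java.util.LinkedList;
import java.util.Queue;

// LinkedList is not thread-safe, which is part of the problem discussed below
static Queue<ConnectionThread> queue = new LinkedList<ConnectionThread>();
void waitInLineForCheckRevision() throws InterruptedException {
    queue.add(this);
    // a worker thread removes us from the queue once our CheckRevision() is finished
    while(queue.contains(this))
        Thread.sleep(10);

    // we're done
}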

Seems nice, and allows you to have a few worker threads, but as we've already concluded, that's not the goal, and what you'll find if you try to actually implement this method is that you're going to hit synchronization problems that a synchronized() block won't solve. A more complex design pattern including semaphore-y constructs would solve the synchronization problem, but this model still is pretty bad. It can't accommodate very well the case where you want to give up on waiting for a CheckRevision() result and close the socket, because the worker thread can't remove the item from the queue until it's finished. This argument pretty much applies to any use of worker threads, though you could hack your way around it.

[ IF YOU KNOW WHAT SEMAPHORES ARE START SKIPPING NOW ]

I mentioned semaphores in the previous paragraph, but if you don't know what they are, here's a brief summary:
Code: [Select]
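private static int LIST_SEMAPHORE = 0;
private static List<Object> myList = new ArrayList<Object>(); // java.util.ArrayList; List itself is only an interface
void synchronizedIteration() throws InterruptedException {
    while(LIST_SEMAPHORE > 0)
        Thread.sleep(1);
    // a thread switch right here lets a second thread past the check
    LIST_SEMAPHORE++;
    for(Object o : myList)
        someOperation(o);
    LIST_SEMAPHORE--;
}
void synchronizedAdd(Object o) throws InterruptedException {
    while(LIST_SEMAPHORE > 0)
        Thread.sleep(1);
    LIST_SEMAPHORE++;
    myList.add(o);
    LIST_SEMAPHORE--;
}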

This code tries to prevent multiple threads from accessing the list at the same time using a semaphore. You might see that there is a flaw in this code, which is that the thread could yield between the end of the while loop and the increment operation, causing the undesired behavior to occur anyways. That's an excellent argument. On a single core, threads don't yield very often when they don't explicitly ask to, and the fact that the threads will be sleep()ing if the list is under heavy pressure makes it unlikely that a thread will yield before it gets a chance to increment the semaphore. Unlikely is not a guarantee, though: on a multi-core machine two threads can pass the check at literally the same time, and ++ on a plain int is neither atomic nor guaranteed to be visible to other threads. There is always that corner case, so don't guard anything that must be correct this way.
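For comparison, here is a sketch of the same list guarded by Java's built-in monitor lock, which closes the gap between the check and the increment. `GuardedList` is a made-up name, and this is one standard alternative rather than what the post proposes.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedList {
    private final List<Object> myList = new ArrayList<Object>();

    // Entering the synchronized block checks and takes the lock in one atomic step.
    void synchronizedAdd(Object o) {
        synchronized (myList) {
            myList.add(o);
        }
    }

    int size() {
        synchronized (myList) {
            return myList.size();
        }
    }
}
```

The trade-off is that only one thread at a time gets in, which is fine for a list but not for limiting an operation to "a few" concurrent callers.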

[ STOP SKIPPING NOW ]

Synchronization is a really nasty thing to have to worry about. It's always better to avoid having to deal with it at all than to deal with it correctly; it's rare that you can plug every hole in a sinking ship. There is something to be admired about semaphores, though: they allow you to limit [to a number greater than one] the number of threads that can do a certain operation at one time without creating extra unnecessary worker threads. This is PERFECT for what you are doing. You want to limit the number of simultaneous CheckRevision()s to a pretty low number, but that number isn't really all that well defined, so it's not a big deal if an extra one falls through.

Consider:
Code: [Select]
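private static int CHECKREVISION_SEMAPHORE = 0;
// Object stands in for whatever type doCheckRevision() returns
public static Object CheckRevision() throws InterruptedException {
    while(CHECKREVISION_SEMAPHORE > 2) // I have a dual-core machine, so I want 3 threads to do CheckRevision() concurrently
        Thread.sleep(10);
    CHECKREVISION_SEMAPHORE++;
    Object result = null;
    try {
        result = doCheckRevision();
    } catch(Throwable t) {
        // handle the exception
    } finally {
        // always runs, so the count can't leak even if the handler above throws
        CHECKREVISION_SEMAPHORE--;
    }
    return result;
}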

Remember, it's absolutely imperative that the decrement always runs, whatever doCheckRevision() throws (catch everything first, or better yet decrement in a finally block), or the semaphore will just keep going up until no threads are ever allowed to doCheckRevision().
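If an exact limit is wanted, `java.util.concurrent.Semaphore` does the same job without the race, and its timed `tryAcquire` fits the "give up if you waited too long" rule. A sketch under assumptions: `CrevGate`, `run`, and the null-on-timeout convention are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class CrevGate {
    private final Semaphore permits;

    CrevGate(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    // Returns the result, or null if no slot became free within maxWaitMs.
    <T> T run(Callable<T> doCheckRevision, long maxWaitMs) throws Exception {
        if (!permits.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS)) {
            return null; // lost cause: the caller should close the socket
        }
        try {
            return doCheckRevision.call();
        } finally {
            permits.release(); // always give the slot back, even on an exception
        }
    }
}
```

With `new CrevGate(3)` shared by all connection threads, at most three CheckRevision() calls run at once and nobody waits longer than the deadline passed in.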


Well, I think I've written enough for one day. I'm going to take a nap.
« Last Edit: June 12, 2008, 07:12:56 pm by Camel »

Offline Camel

  • Hero Member
  • *****
  • Posts: 1703
    • View Profile
    • BNU Bot
Re: Re-writing JBLS
« Reply #25 on: June 12, 2008, 07:25:33 pm »
Quote
But the way I thought Java's objects worked was that on init all files were loaded and all 'static' methods and vars were mapped out into memory. Then, when an object was created, it loaded/mapped new instances of the non-static parts into memory, and updated them to point the calls/references to static functions to wherever they were loaded.

Every part of that is incorrect. Classes are loaded lazily (on first use), and in the following order:
* The first time you touch them, the JVM reads the bytecode and maps all methods (static and non-static) and static fields (including static final constants) in to memory.
* Static fields and static { ... } blocks are initialized/executed in the order in which they appear.
* When you construct an object, the JVM allocates memory for ONLY the non-static fields (plus a small object header), and uses that as the reference to the object.
* The non-static field initializers and instance initializer blocks run in order of appearance.
* The constructor body finishes initialization.

Mapping methods multiple times would be a waste of resources, and have no advantage whatsoever.
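The order can be observed with a small logging class. This is an illustrative sketch; `InitOrder` and `note` are made-up names.

```java
class InitOrder {
    // Declared first so it exists before the other static initializers run.
    static final StringBuilder log = new StringBuilder();

    static int s = note("static-field ");
    static { note("static-block "); }

    int f = note("instance-field ");

    InitOrder() {
        note("constructor ");
    }

    static int note(String what) {
        log.append(what);
        return 0;
    }
}
```

After a single `new InitOrder()`, `InitOrder.log` reads `static-field static-block instance-field constructor `; a second `new` appends only the instance-field and constructor entries, because the static part runs once per class.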

Quote
JBLS shouldn't crash 3 times a week
Well, it doesn't really crash; it still accepts new connections. It just refuses to complete CheckRevision()s, which is worse than a crash because I can't write a script to automate detecting/restarting it.

Quote
One of the things I really do not want to deal with right now is configuration. It's not really a priority.
The ONLY thing I want the server to do is work, and be efficient. I would let others make plugins that dealt with configuration/extra features.
You don't want users to be able to pick what port to use? That's rather selfish of you. The classes are already there, and free to use.

[edit] I noticed that there's a small discrepancy between what I said and how the JVM acts; nested and anonymous classes get compiled in to their own .class files, which the JVM will load lazily as separate classes. Other than that, it's the same.
« Last Edit: June 12, 2008, 07:27:16 pm by Camel »

Offline Hdx

  • The Hdx!
  • Full Member
  • ***
  • Posts: 311
  • <3 Java/Cpp/VB/QB
    • View Profile
Re: Re-writing JBLS
« Reply #26 on: June 12, 2008, 07:44:54 pm »
Quote
Every part of that is incorrect. Classes are loaded lazily (on first use), and in the following order:
....
Meh. Like I said, I don't know where it got planted into my brain that that was how it worked. I never did bother to research it.

Quote
You don't want users to be able to pick what port to use? That's rather selfish of you. The classes are already there, and free to use.
I never said I didn't want them to be able to pick the port. What I meant was I didn't want the CORE of JBLS to have to worry about that crap.
I'm looking to make something lean, clean, and efficient. No bells/whistles, All that should be able to be added by non-core projects.


But, you are giving me great ideas to mull over. I'm off work in 3.2 hours and I'll see what I can do after I set up my dev tools at home.
« Last Edit: June 12, 2008, 07:49:49 pm by HdxBmx27 »

Offline Warrior

  • supreme mac daddy of trolls
  • Hero Member
  • *****
  • Posts: 7503
  • One for a Dime two for a Quarter!
    • View Profile
Re: Re-writing JBLS
« Reply #27 on: June 12, 2008, 09:13:07 pm »
Quote
You don't want users to be able to pick what port to use? That's rather selfish of you. The classes are already there, and free to use.
I never said I didn't want them to be able to pick the port. What I meant was I didn't want the CORE of JBLS to have to worry about that crap.
I'm looking to make something lean, clean, and efficient. No bells/whistles, All that should be able to be added by non-core projects.


But, you are giving me great ideas to mull over. I'm off work in 3.2 hours and I'll see what I can do after I set up my dev tools at home.

I really don't think a plugin system would lend itself well to this type of project. You want it to be as lean as possible, and through some good design you can take a lot of the pain out of interfacing with configuration files.

I know .NET has XML serialization stuff that makes things like configurations a cinch, maybe Java has this as well?
One must ask oneself: "do I will trolling to become a universal law?" And then when one realizes "yes, I do will it to be such," one feels completely justified.
-- from Groundwork for the Metaphysics of Trolling

Offline Sidoh

  • x86
  • Hero Member
  • *****
  • Posts: 17634
  • MHNATY ~~~~~
    • View Profile
    • sidoh
Re: Re-writing JBLS
« Reply #28 on: June 12, 2008, 09:29:17 pm »
I'd check out xmlbeans.  There's a bit of overhead involved with authoring a schema, but it's really straightforward.  It accomplishes much of what serialization does, but I wouldn't be surprised to learn there's a more "direct" way of doing this.  I know Java "natively" supports serialization stuff, but I don't know about serialization to XML.

Offline Camel

  • Hero Member
  • *****
  • Posts: 1703
    • View Profile
    • BNU Bot
Re: Re-writing JBLS
« Reply #29 on: June 12, 2008, 11:36:14 pm »
Quote
I'd check out xmlbeans.  There's a bit of overhead involved with authoring a schema, but it's really straightforward.  It accomplishes much of what serialization does, but I wouldn't be surprised to learn there's a more "direct" way of doing this.  I know Java "natively" supports serialization stuff, but I don't know about serialization to XML.

I don't think that xmlbeans would be an appropriate thing to put in a server which could potentially be quite bogged down to begin with. It's no secret that xmlbeans is very very slow; I tried to use it for a client-side xml deserializer, and it was more of a bottleneck than hibernate.

What purpose would it serve, anyways? The protocol is already set in stone, and it's not XML.

[edit] Ok, I just re-read the relevant posts, and I retract my question in favor of a new one: Why the f*ck would you use XML for a list of properties? Use a properties file!
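For reference, a properties file needs very little code with the JDK's own `java.util.Properties`. This sketch is not the property file reader mentioned earlier in the thread; the `ServerSettings` class, the `port` key, and the default value are assumptions for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

class ServerSettings {
    static int port = 9367; // assumed default; overridden by the file if present

    // Reads "key=value" lines; keys that are missing keep their current values.
    static void load(Reader in) throws IOException {
        Properties p = new Properties();
        p.load(in);
        port = Integer.parseInt(p.getProperty("port", String.valueOf(port)));
    }
}
```

Calling `ServerSettings.load(new java.io.FileReader("settings.properties"))` once at startup (the file name is hypothetical) is enough; the rest of the server only reads the static fields.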
« Last Edit: June 12, 2008, 11:38:17 pm by Camel »