!=============================================================!
!=  	     Client Emulation and Data Parsing               =!
!=                                                           =!
!= Article written by: Feanor (FeanorxL@gmail.com) 7/10/2005 =!
!=                                                           =!
!=============================================================!
[Contents]
1. Introduction
2. Client/Server Basics
3. Packet logging and Data Analysis
4. Implementation: Sending and Receiving Packets
5. Client Emulation
6. Finishing Up
7. References
	
-- 1. Introduction
Since its birth in the 1960s, the internet has moved to the forefront of 
our society. Data can travel around the world in a fraction of a 
second, and the internet will only grow more influential as time 
goes on. This is why client and server internet applications are so 
important. 
This tutorial is meant to introduce people (with some background in 
programming) to writing code for client/server based software. Some 
of the material covered, however, is very basic (especially the 
Client/Server Basics section), so please skip sections if you need to. 
All code examples in this tutorial are in Java, but the concepts 
demonstrated are very general and can easily be carried over to any 
other language. I hope that by reading this tutorial you can gain 
some knowledge and useful programming skills. 
Enjoy.
-- 2. Client/Server Basics
	
Most operating systems use the TCP/IP protocol suite 
(Transmission Control Protocol/Internet Protocol) to transmit data 
over the internet. In TCP/IP (version 4), each computer receives a 
32-bit numeric identifying address (IP address), telling other 
computers where to send data. IP addresses are written as four 
numbers, each between 0 and 255 and separated by periods, for 
example 127.0.0.1. You can find out your own IP address by visiting 
www.whatismyip.com. [1] 
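To make the idea of a 32-bit address concrete, here is a small sketch that packs a dotted address into a single int and unpacks it again. The class and method names (IpAddressDemo, toInt, toDotted) are made up for this illustration and are not part of any library.

```java
// Hypothetical helper for illustration only.
class IpAddressDemo {
    // Packs "a.b.c.d" into one 32-bit int, first number in the top byte.
    static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four numbers: " + dotted);
        }
        int result = 0;
        for (String part : parts) {
            int value = Integer.parseInt(part);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("out of range: " + part);
            }
            result = (result << 8) | value;
        }
        return result;
    }

    // Unpacks a 32-bit int back into dotted notation.
    static String toDotted(int address) {
        return ((address >>> 24) & 0xFF) + "." + ((address >>> 16) & 0xFF) + "."
                + ((address >>> 8) & 0xFF) + "." + (address & 0xFF);
    }

    public static void main(String[] args) {
        int packed = toInt("127.0.0.1");
        System.out.println(Integer.toHexString(packed)); // 7f000001
        System.out.println(toDotted(packed));            // 127.0.0.1
    }
}
```

Each of the four numbers fills exactly one byte, which is why none of them can be larger than 255.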
Along with IP addresses, internet applications also direct data by 
using a specific port number. Ports ensure that data sent to a certain 
IP address will be handled and analyzed by the correct program. 
Here is a list of some standard servers/services and the ports that 
they generally run on [2]:
20 FTP data (File Transfer Protocol)
21 FTP (File Transfer Protocol)
22 SSH (Secure Shell)
23 Telnet
25 SMTP (Simple Mail Transfer Protocol)
43 whois
53 DNS (Domain Name System)
68 DHCP (Dynamic Host Configuration Protocol)
79 Finger
80 HTTP (HyperText Transfer Protocol)
110 POP3 (Post Office Protocol, version 3)
115 SFTP (Simple File Transfer Protocol)
119 NNTP (Network News Transfer Protocol)
123 NTP (Network Time Protocol)
137 NetBIOS-ns
138 NetBIOS-dgm
139 NetBIOS
143 IMAP (Internet Message Access Protocol)
161 SNMP (Simple Network Management Protocol)
194 IRC (Internet Relay Chat)
220 IMAP3 (Internet Message Access Protocol 3)
389 LDAP (Lightweight Directory Access Protocol)
443 HTTPS (HTTP over SSL, Secure Sockets Layer)
445 SMB (Server Message Block, directly over TCP)
666 Doom
993 IMAPS (IMAP over SSL)
995 POP3S (POP3 over SSL)
1243 SubSeven (Trojan - security risk!)
1352 Lotus Notes
1433 Microsoft SQL Server
1494 Citrix ICA Protocol
1521 Oracle SQL
1604 Citrix ICA / Microsoft Terminal Server
2049 NFS (Network File System)
3306 mySQL
4000 ICQ
5010 Yahoo! Messenger
5190 AOL Instant Messenger
5632 PCAnywhere
5800 VNC
5900 VNC
6000 X Windowing System
6699 Napster
6776 SubSeven (Trojan - security risk!)
7070 RealServer / QuickTime
7778 Unreal
8080 HTTP
26000 Quake
27010 Half-Life
27960 Quake III
31337 BackOrifice (Trojan - security risk!)
When opening a connection to a server, the IP address (or host name) 
and port number must be specified. In most programming languages, 
you open a connection to another computer with a socket, which is 
defined as "an endpoint for communication between two 
machines." In Java, you would do this using the Socket class 
(java.net.Socket), and you would specify the host and port 
number in the constructor [3], for example:
Socket testconnection = new Socket("google.com", 80);
Now you have an open connection and are ready to send and receive 
data.
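The one-line example leaves out two things that real code needs: opening a Socket can throw an IOException, and the socket should be closed when you are done with it. Here is a minimal sketch of one way to handle both, assuming Java 7 or newer for try-with-resources. The class name, the method name and the five second timeout are my own choices.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class ConnectDemo {
    // Connects to host:port, gives up after timeoutMillis, and returns
    // the remote address as text. The socket is closed automatically
    // when the try block ends.
    static String connect(String host, int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket.getInetAddress().getHostAddress() + ":" + socket.getPort();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println("Connected to " + connect("google.com", 80, 5000));
        } catch (IOException e) {
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Without a timeout, a connection attempt to a host that never answers can hang for a long time, so setting one is usually worth the extra line.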
	
--3. Packet logging and Data Analysis
	
A packet is a sequence of binary digits, used to send data over the 
internet. Client and server applications communicate through 
sending and receiving packets. It is very easy to analyze the 
information being sent to/from your computer through the use of a 
packet logger. I recommend Ethereal: (http://www.ethereal.com/). 
Here is a sample TCP packet (used in the AIM OSCAR protocol) 
that I received on my computer:
0000   00 0d 61 14 b9 d0 00 05 00 e4 ae a1 08 00 45 00  ..a...........E.
0010   00 32 70 b8 40 00 6c 06 a4 ab cd bc 99 79 18 bd  .2p.@.l......y..
0020   79 6f 14 46 13 20 54 05 49 16 79 d4 87 4f 50 18  yo.F. T.I.y..OP.
0030   40 00 ad 5f 00 00 2a 01 d9 54 00 04 00 00 00 01  @.._..*..T......
Left: Offset of the first byte on each line (in hex)
Middle: Raw packet data (in hex)
Right: Packet data displayed as a string
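If you want to print your own data in this three-column layout, something like the sketch below will do it. It is only an approximation of what a packet logger shows, and the class name HexDump is hypothetical. The hex notation itself is explained in the next paragraph.

```java
class HexDump {
    // Formats data as offset / hex bytes / text, 16 bytes per line.
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += 16) {
            StringBuilder hex = new StringBuilder();
            StringBuilder text = new StringBuilder();
            for (int i = offset; i < offset + 16; i++) {
                if (i < data.length) {
                    int value = data[i] & 0xFF; // bytes are signed in Java
                    hex.append(String.format("%02x ", value));
                    // Printable ASCII is shown as-is, everything else as a dot.
                    text.append(value >= 0x20 && value <= 0x7E ? (char) value : '.');
                } else {
                    hex.append("   "); // pad a short last line
                }
            }
            out.append(String.format("%04x   %s %s%n", offset, hex, text));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x55, 0x53, 0x52, 0x20, 0x31, 0x36, 0x0d, 0x0a};
        System.out.print(dump(sample));
    }
}
```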
First off, this data is not displayed in binary. Instead, most packet 
loggers display data in hexadecimal (base 16), since a two-digit 
hex number neatly represents an entire byte of data (a number 
between 0 and 255). Since hexadecimal requires 16 digits, and our 
numbering system only offers ten (0123456789), we use A, B, C, D, 
E, and F to express 10 - 15 in a single digit. Here are the first 16 
numbers in base two, ten and sixteen:
(base 2)   (base 10)   (base 16)
0          0           0
1          1           1
10         2           2
11         3           3
100        4           4
101        5           5
110        6           6
111        7           7
1000       8           8
1001       9           9
1010       10          A
1011       11          B
1100       12          C
1101       13          D
1110       14          E
1111       15          F
etc.
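Java can do these conversions for you. The sketch below uses only standard library calls (String.format, Integer.toBinaryString, Integer.parseInt). The class name BaseDemo and its two helper methods are made up for the example.

```java
class BaseDemo {
    // Renders a byte value (0-255) as two hex digits, e.g. 10 -> "0a".
    static String toHex(int value) {
        return String.format("%02x", value & 0xFF);
    }

    // Renders a byte value (0-255) as eight binary digits.
    static String toBinary(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return "00000000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        System.out.println(toHex(255));                 // ff
        System.out.println(toBinary(10));               // 00001010
        System.out.println(Integer.parseInt("2a", 16)); // 42
    }
}
```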
Now we can analyze the data in the packet. First, every packet has a 
header. The header contains important information such as the 
sender's IP, the recipient's IP, protocol information and the port 
number that the packet is meant to be received on. Some packet 
loggers don't display the header information, because we rarely 
need to analyze it. It is only there so that computers can direct 
packets and make sure that they arrive at the correct 
destination. The header on my packet is:
0000   00 0d 61 14 b9 d0 00 05 00 e4 ae a1 08 00 45 00  ..a...........E.
0010   00 32 70 b8 40 00 6c 06 a4 ab cd bc 99 79 18 bd  .2p.@.l......y..
0020   79 6f 14 46 13 20 54 05 49 16 79 d4 87 4f 50 18  yo.F. T.I.y..OP.
0030   40 00 ad 5f 00 00 
Some important data in here is: 
cd bc 99 79 - Sender IP (205.188.153.121)
18 bd 79 6f - Recipient IP (24.189.121.111)
14 46 - Sender's Port number (5190)
13 20 - Recipient's port number (4896)
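You can check these values yourself by pulling the bytes out of the header. The sketch below assumes the layout of this particular capture: a 14 byte Ethernet header followed by a 20 byte IP header, which puts the addresses at offsets 26 and 30 and the ports at 34 and 36. Other captures may be laid out differently, and the class and method names are hypothetical.

```java
class HeaderFields {
    // Turns "cd bc 99 79" style text into bytes.
    static byte[] fromHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Reads four bytes starting at pos as a dotted IP address.
    static String ipAt(byte[] data, int pos) {
        return (data[pos] & 0xFF) + "." + (data[pos + 1] & 0xFF) + "."
                + (data[pos + 2] & 0xFF) + "." + (data[pos + 3] & 0xFF);
    }

    // Reads two bytes starting at pos as a big-endian port number.
    static int portAt(byte[] data, int pos) {
        return ((data[pos] & 0xFF) << 8) | (data[pos + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] header = fromHex(
              "00 0d 61 14 b9 d0 00 05 00 e4 ae a1 08 00 45 00 "
            + "00 32 70 b8 40 00 6c 06 a4 ab cd bc 99 79 18 bd "
            + "79 6f 14 46 13 20 54 05 49 16 79 d4 87 4f 50 18 "
            + "40 00 ad 5f 00 00");
        // Offsets assume a 14-byte Ethernet header and a 20-byte IP header.
        System.out.println(ipAt(header, 26));   // 205.188.153.121
        System.out.println(ipAt(header, 30));   // 24.189.121.111
        System.out.println(portAt(header, 34)); // 5190
        System.out.println(portAt(header, 36)); // 4896
    }
}
```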
The header information, however, is not something that you really 
have to worry about. Moving on, data over the internet is sent in 
much the same manner that it is stored on the computer. There are a 
few basic data types:
byte - a two-digit hexadecimal number or an eight-digit binary 
number
word - equivalent of a "short," two bytes of data
dword - "double word," equivalent to an "int," four bytes of data
These three data types are generally used to represent numbers. 
Although these data types are standard, protocols use them 
differently. A multi-byte number has to be sent one byte at a time, 
and protocols disagree on which byte goes first. Say we want to 
send (dword) 0x01. In some protocols you would send the most 
significant byte first, as (00 00 00 01). This type of protocol is 
called big-endian. In a little-endian protocol you would send the 
least significant byte first, so the same 0x01 dword goes out as 
(01 00 00 00). The AIM OSCAR protocol, which we are using as an 
example, is big-endian.
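Here is a short sketch of both byte orders using java.nio.ByteBuffer from the standard library. The class name EndianDemo and its two methods are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class EndianDemo {
    // Encodes a dword (int) as four bytes in the given byte order.
    static byte[] encodeDword(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Decodes four bytes back into a dword using the given byte order.
    static int decodeDword(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] big = encodeDword(0x01, ByteOrder.BIG_ENDIAN);
        byte[] little = encodeDword(0x01, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.toString(big));    // [0, 0, 0, 1]
        System.out.println(Arrays.toString(little)); // [1, 0, 0, 0]
        // Reading with the wrong byte order gives a very different number.
        System.out.println(decodeDword(big, ByteOrder.LITTLE_ENDIAN)); // 16777216
    }
}
```

The last line shows why it matters to know which kind of protocol you are dealing with: the same four bytes read the wrong way around turn 1 into 16777216.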
Besides numbers, packet data can also represent a string. For an 
example of a packet containing a string, here is data I received 
when signing on to my MSN Instant Messenger account 
(FeanorxL@hotmail.com):
0000   00 0d 61 14 b9 d0 00 50 f2 c8 8e a2 08 00 45 00  ..a....P......E.
0010   00 52 02 2d 00 00 72 06 ae 6c cf 2e 06 24 c0 a8  .R.-..r..l...$..
0020   02 12 07 47 0b 71 9d ba 01 55 2d f2 11 4e 50 18  ...G.q...U-..NP.
0030   fe 25 db 8a 00 00 55 53 52 20 31 36 20 4f 4b 20  .%....USR 16 OK 
0040   66 65 61 6e 6f 72 78 6c 40 68 6f 74 6d 61 69 6c  feanorxl@hotmail
0050   2e 63 6f 6d 20 4a 61 6d 65 73 20 31 20 30 0d 0a  .com James 1 0..
Removing the packet header, we are left with:
0030                     55 53 52 20 31 36 20 4f 4b 20        USR 16 OK 
0040   66 65 61 6e 6f 72 78 6c 40 68 6f 74 6d 61 69 6c  feanorxl@hotmail
0050   2e 63 6f 6d 20 4a 61 6d 65 73 20 31 20 30 0d 0a  .com James 1 0..
Most computers use ASCII, which stands for American Standard 
Code for Information Interchange. In ASCII, every character is 
represented by a number, for example the letter "A" is 65 (0x41). [4] 
A good ASCII table with decimal and hex conversion can be 
accessed at http://www.lookuptables.com/. The hex packet data 
converted to a string using ASCII is displayed on the right. It is easy 
to see that in this case, most of the data in the packet was meant to be 
parsed as a string and not numerically. This packet contains one 
long string, with words separated by spaces and the line ended by 
a carriage return and line feed (0d 0a). Some other protocols treat 
strings differently, ending each string with a null character 
(0x00). In essence, a protocol either uses null-terminated strings 
or non-null-terminated strings. Some protocols even use both! 
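Both styles of string are easy to read in code. The sketch below splits a space-separated line like the MSN one into words, and reads a null-terminated string out of a byte array. The class and method names are hypothetical, and the second byte array is invented sample data, not a captured packet.

```java
import java.nio.charset.StandardCharsets;

class StringFields {
    // Decodes a space-separated text line (as in the MSN packet) into words.
    static String[] splitLine(byte[] payload) {
        String line = new String(payload, StandardCharsets.US_ASCII).trim();
        return line.split(" ");
    }

    // Reads a null-terminated string starting at pos. If no null byte is
    // found, everything up to the end of the data is returned.
    static String readNTString(byte[] data, int pos) {
        int end = pos;
        while (end < data.length && data[end] != 0x00) {
            end++;
        }
        return new String(data, pos, end - pos, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] msn = "USR 16 OK feanorxl@hotmail.com James 1 0\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        String[] words = splitLine(msn);
        System.out.println(words[0] + " / " + words[3]); // USR / feanorxl@hotmail.com

        byte[] nt = {0x68, 0x69, 0x00, 0x79, 0x6f, 0x00}; // "hi", 0x00, "yo", 0x00
        System.out.println(readNTString(nt, 0)); // hi
        System.out.println(readNTString(nt, 3)); // yo
    }
}
```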
Let's look back at the first packet we were looking at, the AIM 
OSCAR one. The important packet data was: 
2a 01 d9 54 00 04 00 00 00 01
Now with a little knowledge of the protocol [5], we can break it 
down:
0x2A           – (byte) OSCAR header
0x01           – (byte) channel number
0xD954         – (word) sequence number
0x0004         – (word) length of the rest of the packet
0x00000001     – (dword) version
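The same breakdown can be done in code. This is a minimal sketch that reads the ten bytes in order with java.nio.ByteBuffer, which is big-endian by default. The class name OscarPacket and its field names simply follow the labels in the list above and are not taken from any OSCAR library.

```java
import java.nio.ByteBuffer;

class OscarPacket {
    final int header;
    final int channel;
    final int sequence;
    final int length;
    final int version;

    OscarPacket(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data); // big-endian by default
        header = in.get() & 0xFF;              // (byte)
        channel = in.get() & 0xFF;             // (byte)
        sequence = in.getShort() & 0xFFFF;     // (word), masked to stay unsigned
        length = in.getShort() & 0xFFFF;       // (word)
        version = in.getInt();                 // (dword)
    }

    public static void main(String[] args) {
        byte[] data = {0x2a, 0x01, (byte) 0xd9, 0x54, 0x00, 0x04, 0x00, 0x00, 0x00, 0x01};
        OscarPacket p = new OscarPacket(data);
        System.out.printf("header=0x%02X channel=%d sequence=0x%04X length=%d version=%d%n",
                p.header, p.channel, p.sequence, p.length, p.version);
    }
}
```

Note that the length field (4) matches the four bytes of the version dword that follow it.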
Along with packet logging a program, it always helps to look up 
documentation of the protocol, in case someone has already worked 
out exactly what the data is supposed to mean. Information on the 
OSCAR AIM protocol can be found at 
http://joust.kano.net/wiki/oscar/moin.cgi/FrontPage
--4. Implementation: Sending and Receiving Packets
Now we can talk about writing some code. I have written some 
classes in Java to make this easy for myself. First off, I have a simple 
"Packet" class. Once I receive data, I put it in a Packet, where it can 
easily be parsed. Here is the class:
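import java.util.Vector;

// A growable buffer of bytes that makes up one packet.
public class Packet {
	private Vector<Byte> buffer = new Vector<Byte>();

	// Creates an empty packet.
	public Packet(){}

	// Creates a packet holding a copy of the given data.
	public Packet(byte[] data){
		for(int pos = 0; pos < data.length; pos++){
			addByte(data[pos]);
		}
	}

	// Inserts a byte at the given position, shifting later bytes up.
	public void addByte(int pos, byte element){
		buffer.add(pos, Byte.valueOf(element));
	}

	// Appends a byte to the end of the packet.
	public void addByte(byte element){
		buffer.add(Byte.valueOf(element));
	}

	// Returns the byte at the given position.
	public byte getByte(int pos){
		return buffer.get(pos).byteValue();
	}

	// Returns the whole packet as a plain byte array.
	public byte[] getBuffer(){
		byte returnbytes[] = new byte[buffer.size()];
		for(int pos = 0; pos < buffer.size(); pos++){
			returnbytes[pos] = getByte(pos);
		}
		return returnbytes;
	}

	// Removes the byte at the given position.
	public void removeByte(int pos){
		buffer.remove(pos);
	}
}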
Next, I have a simple class that I use to create packets called 
"Packetbuilder." The class below is written for a big-endian 
protocol. This class will probably need to be tweaked a little, 
depending on which protocol you are using. Here it is:
// Builds up a packet one field at a time for a big-endian protocol.
public class Packetbuilder{
	private Packet buffer = new Packet();

	public Packetbuilder(){}

	// Appends a dword (four bytes), most significant byte first.
	public void insertDword(int value){
		buffer.addByte((byte) ((value >> 24) & 0xFF));
		buffer.addByte((byte) ((value >> 16) & 0xFF));
		buffer.addByte((byte) ((value >> 8) & 0xFF));
		buffer.addByte((byte) (value & 0xFF));
	}

	// Appends a word (two bytes), most significant byte first.
	public void insertWord(int value){
		buffer.addByte((byte) ((value >> 8) & 0xFF));
		buffer.addByte((byte) (value & 0xFF));
	}

	// Appends a single byte.
	public void insertByte(byte value){
		buffer.addByte(value);
	}

	// Appends an array of bytes as-is.
	public void insertBytes(byte[] info){
		for(int pos = 0; pos < info.length; pos++){
			buffer.addByte(info[pos]);
		}
	}

	// Appends a non null terminated string. Each character is cut down
	// to its low byte, so this is only safe for ASCII text.
	public void insertNonNTString(String instr){
		for(int pos = 0; pos < instr.length(); pos++){
			buffer.addByte((byte)instr.charAt(pos));
		}
	}

	// Appends a null terminated string (the text followed by 0x00).
	public void insertNTString(String instr){
		insertNonNTString(instr);
		buffer.addByte((byte)0x00);
	}

	// Returns the finished packet as a byte array, ready to send.
	public byte[] getPacketBuffer(){
		return buffer.getBuffer();
	}
}
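As a point of comparison, the Java standard library can also build big-endian packets: DataOutputStream always writes most significant byte first. The sketch below rebuilds the ten byte OSCAR packet from section 3 that way. It is an alternative to the Packetbuilder class, not a replacement for it, and the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BuildDemo {
    // Builds the ten-byte OSCAR packet from section 3 for a given
    // sequence number.
    static byte[] buildVersionPacket(int sequence) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeByte(0x2A);      // (byte) OSCAR header
            out.writeByte(0x01);      // (byte) channel number
            out.writeShort(sequence); // (word) sequence number
            out.writeShort(4);        // (word) length of the rest of the packet
            out.writeInt(1);          // (dword) version
        } catch (IOException e) {
            // Not expected when writing to memory.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : buildVersionPacket(0xD954)) {
            System.out.printf("%02x ", b & 0xFF);
        }
        System.out.println(); // 2a 01 d9 54 00 04 00 00 00 01
    }
}
```

With the Packetbuilder class the same packet would be two insertByte calls, two insertWord calls and one insertDword call, in that order.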
I recommend that you guys try to write your own Packet and 
Packetbuilder style classes, so that you can make sure you have a 
good understanding of the material. The best way to learn is through 
practice and experience. Start out with a basic protocol; maybe try to 
make a POP3 or FTP client. You can download some source code to 
help you out along the way and see how other people have done it. If 
you are programming in Java, you should take a good look at the 
online documentation, paying close attention to the Socket, 
OutputStream, and InputStream classes. You can find them at:
http://java.sun.com/j2se/1.4.2/docs/api/
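To show how Socket, OutputStream and InputStream fit together, here is a minimal sketch that sends a packet and reads the reply. So that it runs without any real server, it starts a throwaway echo server on the local machine. That server, the class name and the method names are all assumptions made for the example, and a real client would connect to the actual server instead. It assumes Java 8 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SendReceiveDemo {
    // Reads exactly count bytes, looping because read() may return fewer.
    static byte[] readFully(InputStream in, int count) throws IOException {
        byte[] data = new byte[count];
        int done = 0;
        while (done < count) {
            int n = in.read(data, done, count - done);
            if (n < 0) {
                throw new IOException("connection closed after " + done + " bytes");
            }
            done += n;
        }
        return data;
    }

    // Sends a packet to a throwaway echo server on this machine and
    // returns whatever comes back.
    static byte[] echoRoundTrip(byte[] packet) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    byte[] received = readFully(peer.getInputStream(), packet.length);
                    peer.getOutputStream().write(received);
                    peer.getOutputStream().flush();
                } catch (IOException e) {
                    // The client side will notice the failure when it reads.
                }
            });
            echo.start();
            try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
                OutputStream out = socket.getOutputStream();
                out.write(packet);
                out.flush();
                return readFully(socket.getInputStream(), packet.length);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] reply = echoRoundTrip(new byte[]{0x2a, 0x01, (byte) 0xd9, 0x54});
        System.out.println("Received " + reply.length + " bytes back");
    }
}
```

The loop in readFully matters: a single read() call is allowed to return only part of a packet, so code that assumes one read equals one packet will break sooner or later.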
--5. Client Emulation
The main reason why people packet log applications and internet 
connections is to emulate a client logging on to the server. The two 
examples that we have used thus far, packets from the MSN and 
AIM protocols, come from two clients that are often emulated. Many 
people packet log AIM and MSN in order to make their own clients, 
or to take advantage of exploits within the protocol. 
With the knowledge you now have, emulating some clients can be 
very easy, but there are a few things to remember. First off, many 
servers have security built into them. Constantly connecting or 
sending packets out of order can result in having your IP banned 
from the server. Some companies might even complain to your ISP 
(internet service provider) if the problem continues. 
Of course, there are some people or companies that created protocols 
for public use, and encourage programmers to write clients for their 
servers. There are some companies, however, that do everything in 
their power to prevent people from emulating their clients. They try 
to preserve the uniqueness of their software through endeavors such 
as creating special hashing or encryption algorithms to protect 
their packets, or pressing charges against the makers of third party 
software clients for copyright infringement.
Basically, be careful and know exactly whose software you are 
emulating.  
--6. Finishing Up
I sincerely hope that this tutorial has helped you learn more about 
client/server communication. If you are dedicated enough to have 
read this, I am sure that you will put this knowledge to good use. Don't 
get frustrated, just keep on working. Write some great servers or 
clients, and give back to the internet community. 
Good Luck!
--7. References
 
[1] Find out your IP - www.whatismyip.com
[2] Commonly used port list -  
http://www.governmentsecurity.org/articles/CommonPorts.php
[3] Java documentation - http://java.sun.com/j2se/1.4.2/docs/api/
[4] Hex conversion table - http://www.lookuptables.com/
[5] OSCAR documentation -
http://joust.kano.net/wiki/oscar/moin.cgi/FrontPage