Author Topic: P2P VLAN (Read 13027 times)

nslay · « **on:** August 11, 2006, 07:56:43 am »

Peer-to-peer (P2P) is a buzzword these days. It is an umbrella term for any sort of decentralized communication, from small-scale direct connections on AIM to large-scale file sharing. A Virtual LAN (VLAN) is a logical network that exists on top of an existing network. It is similar to a VPN but is independent of the real network. It's usually used to network a host and multiple virtual machines together, an invaluable test tool.
Combining the two ideas could yield an invaluable networking tool that transcends network topologies such as NATs and firewalls, yet is independent of dedicated servers. It would allow friends and family to network their machines near and far. A similar tool called Hamachi exists, but it isn't truly P2P: it still relies on the vendor's server to coordinate networks. A P2P VLAN also has more powerful implications than transcending network topologies. It can offer features such as true anonymity, the use of multiple peers as one logical tunnel, and encryption of the underlying communication between two nodes.
The concept of a P2P VLAN is based on a very simple and straightforward analogy. Suppose there is a room full of people, and suppose each person has friends. Now suppose that each person holds a concealed card from a standard 52-card deck. Only the owner of a card knows which card it is. If one wants to pass a message to the owner of the Ace of Diamonds (AD), one needs to know where to send it. To solve this problem, one asks each of one's friends who has the AD, and those friends repeat the process. The owner of the AD can respond that he/she knows who has the AD without revealing that he/she has it. Eventually that response reaches the original querier, most likely through multiple friends. Note that one cannot conclude who has the AD, since the response could have been forwarded. Now the original querier can pass a message by having friends relay it. Remember, he/she may have many different routes and can randomize which friend relays each message. The effect? Nobody knows who has the AD, yet communication can take place. In this analogy, the friends are directly connected peers and the cards represent IP addresses.
To test this idea, I wrote a demonstration that uses the tun(4) pseudo network interface. It initializes a tun interface and then attempts to connect to each peer, up to M peers, in a list file. The demonstration code does not choose an IP; one must manually configure the tun interface. Since our primary concern is routing, I conducted a test with three machines and forced a specific network configuration.
Our three machines:
LIGHTBULB - 172.23.0.1
BLENDER - 172.23.0.2
BOTTLE - 172.23.0.3

I forced a configuration of:
BOTTLE <-> LIGHTBULB <-> BLENDER

I did this by stripping the peer list file from BOTTLE and BLENDER. I chose this network configuration so BLENDER and BOTTLE are forced to route through LIGHTBULB when communicating with each other. This would demonstrate the routing technique described in our analogy above.
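As a rough illustration of the query flooding in this topology, here is a small simulation. It is only a sketch and not the demonstration's code: the peer indices, the links matrix and find_route() are names made up for this example, and everything runs in one process with a global view, which a real peer would not have.

```c
#include <assert.h>

#define NPEERS 3
enum { LIGHTBULB, BLENDER, BOTTLE };   /* indices are illustrative */

/* links[a][b] != 0 means peers a and b hold a direct connection. */
static const int links[NPEERS][NPEERS] = {
    /* LIGHTBULB */ {0, 1, 1},
    /* BLENDER   */ {1, 0, 0},
    /* BOTTLE    */ {1, 0, 0},
};

/*
 * Flood a who-has query for target, starting at origin.
 * On return, next_hop[p] is the neighbour peer p should relay through
 * to reach target, or -1 if p learned no route.
 * Returns 1 if target was reached, 0 otherwise.
 */
int find_route(int origin, int target, int next_hop[NPEERS])
{
    int asked_by[NPEERS];             /* who forwarded the query to each peer */
    int queue[NPEERS], head = 0, tail = 0;
    int p, q;

    assert(origin >= 0 && origin < NPEERS);
    assert(target >= 0 && target < NPEERS);

    for (p = 0; p < NPEERS; p++) {
        asked_by[p] = -1;
        next_hop[p] = -1;
    }
    asked_by[origin] = origin;
    queue[tail++] = origin;

    /* Each peer forwards the query to friends that have not seen it yet. */
    while (head < tail) {
        p = queue[head++];
        if (p == target)
            break;
        for (q = 0; q < NPEERS; q++) {
            if (links[p][q] && asked_by[q] == -1) {
                asked_by[q] = p;
                queue[tail++] = q;
            }
        }
    }
    if (asked_by[target] == -1)
        return 0;

    /* The response travels back along the path the query took. */
    for (p = target; p != origin; p = asked_by[p])
        next_hop[asked_by[p]] = p;
    return 1;
}
```

Calling find_route(BLENDER, BOTTLE, hop) leaves hop[BLENDER] set to LIGHTBULB, which is the forced relay described above.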
In order to coordinate the network, I designed a simple protocol based on battle.net's message format.

Code: [Select]

/* This header defines various events. 
 * Protocol format:
 * <uint8_t event><uint16_t size><void>
 * void is event specific ... all comments below refer to void format. 
 * The protocol is still in development.
 */

#ifndef PROTO_H
#define PROTO_H

/* Authorization technique.
 * peer1 connects to peer2
 * peer1 -> uint8_t clientid
 * peer2 -> uint8_t clientid
 * peer2 -> P2PI_CLIENTINFOS 
 * peer1 -> P2PI_CLIENTINFOR
 */

/* Other semantics not yet determined */

/* Non-existent Client ID */
#define P2PI_NULLCLIENT	0x00

/* These refer to the event byte */
#define P2PI_CLIENTINFOS	0x01 /* Client information exchange <struct client_info> */
#define P2PI_CLIENTINFOR	0x02 /* Response event <struct client_info> */
#define P2PI_WHOHASS	0x03 /* Route request <in_addr><void> ... void data is for anything arbitrary (e.g. public key) */
#define P2PI_WHOHASR	0x04 /* Route response <in_addr><void> ... void data is for anything arbitrary (e.g. public key) */
#define P2PI_PACKET	0x05 /* Send packet <in_addr><packet> */

#endif

The comments should be self-explanatory.
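To make the <uint8_t event><uint16_t size><void> framing concrete, here is one way the encode and decode steps might look. This is a sketch under assumptions the header does not state: the size field is taken to be big-endian and to count only the payload bytes, and p2pi_encode() and p2pi_decode() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P2PI_WHOHASS 0x03   /* copied from the header above */
#define P2PI_HDRSZ   3      /* 1 event byte + 2 size bytes */

/* Write <event><size><payload> into out.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t p2pi_encode(uint8_t *out, size_t outsz, uint8_t event,
                   const void *payload, uint16_t size)
{
    assert(out != NULL);
    if (outsz < (size_t)P2PI_HDRSZ + size)
        return 0;
    out[0] = event;
    out[1] = (uint8_t)(size >> 8);     /* assumed big-endian size field */
    out[2] = (uint8_t)(size & 0xff);
    if (size > 0)
        memcpy(out + P2PI_HDRSZ, payload, size);
    return (size_t)P2PI_HDRSZ + size;
}

/* Parse one message from in.
 * Returns the number of bytes consumed, or 0 if the message is incomplete. */
size_t p2pi_decode(const uint8_t *in, size_t insz, uint8_t *event,
                   const uint8_t **payload, uint16_t *size)
{
    uint16_t sz;

    if (insz < P2PI_HDRSZ)
        return 0;
    sz = (uint16_t)((in[1] << 8) | in[2]);
    if (insz < (size_t)P2PI_HDRSZ + sz)
        return 0;
    *event = in[0];
    *payload = in + P2PI_HDRSZ;
    *size = sz;
    return (size_t)P2PI_HDRSZ + sz;
}
```

Returning 0 for an incomplete message lets a caller keep appending to a receive buffer until a whole message has arrived.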

The tun interface on FreeBSD has a mode (TUNSLMODE) in which if_tun prepends a sockaddr to each packet. I make use of this so that I can easily extract the in_addr without examining the packet in any way. The struct in_addr is used for routing and route requests instead of sockaddr_in so as not to reveal the destination port (although the demonstration does not encrypt the packets, for simplicity).
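Splitting a TUNSLMODE read into the destination address and the packet could look roughly like this. It is a sketch, not the demonstration's code: split_tunslmode() is a hypothetical helper, and it hard-codes the BSD sockaddr_in layout (sin_len, sin_family, sin_port, sin_addr) as byte offsets so that it compiles anywhere. On FreeBSD one would use struct sockaddr_in from <netinet/in.h> instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BSD_AF_INET      2   /* value of AF_INET on the BSDs */
#define BSD_SIN_ADDR_OFF 4   /* sin_len(1) + sin_family(1) + sin_port(2) */

/*
 * Split one TUNSLMODE read into destination address and IP packet.
 * dst receives the 4 address bytes in network byte order.
 * Returns the offset of the IP packet within buf, or 0 if the read
 * does not start with an IPv4 sockaddr.
 */
size_t split_tunslmode(const uint8_t *buf, size_t len, uint8_t dst[4])
{
    size_t salen;

    assert(buf != NULL && dst != NULL);
    if (len < 2)
        return 0;
    salen = buf[0];                      /* sa_len: size of the sockaddr */
    if (buf[1] != BSD_AF_INET)
        return 0;
    if (salen < BSD_SIN_ADDR_OFF + 4 || salen > len)
        return 0;
    memcpy(dst, buf + BSD_SIN_ADDR_OFF, 4);
    return salen;
}
```

Only the four address bytes are copied out, so the port field of the sockaddr never has to leave the node.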

To handle routing and route requests, the demonstration uses a hash table and struct route_info:

Code: [Select]

struct route_info {
	int route;	/* Can route */
	int routereq;	/* Route request in progress */
};

The hash table relates in_addr -> route_info.
Furthermore, each connection is associated with a struct con_info which holds the file descriptor, hash table and other miscellaneous necessary information.

Code: [Select]

struct con_info {
	int fd, mode;
	struct sockaddr_in sin;
	struct client_info cname;
	uint8_t cid, buff[P2PI_BUFF];
	size_t buffsz;
	struct hash_table route;
};

Upon receiving a route request, the connection requesting an address is marked in its hash table as requesting a route, while the demonstration forwards the route request to peers. Upon a response, the demonstration marks the address as routable on each connection that responded, forwards the response to all connections waiting for a route, and then clears the route request flag. While this is happening, for sanity reasons, packets read from the tun interface destined for an unroutable address are simply discarded. Queueing the packets and sending them once a response arrives could have serious consequences, since TCP might react to the delay while a route is being sought by queueing more packets, which might confuse the destination when they are finally sent.
Once an address is marked routable, packets read from tun are dispatched through connections able to route to the destination address. Peers who receive the packets should behave accordingly.
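The bookkeeping described above can be sketched as follows. This is not the demonstration's code: a small linearly searched array stands in for the hash table, addresses are plain uint32_t values, and the function names and the NCON and NADDR limits are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCON  4   /* connections per node (illustrative) */
#define NADDR 8   /* tracked addresses per connection (illustrative) */

struct route_info {
    int route;      /* Can route */
    int routereq;   /* Route request in progress */
};

/* Stand-in for the per-connection hash table in_addr -> route_info. */
struct route_table {
    uint32_t addr[NADDR];
    struct route_info info[NADDR];
    int used;
};

static struct route_info *find(struct route_table *t, uint32_t addr, int create)
{
    int i;
    for (i = 0; i < t->used; i++)
        if (t->addr[i] == addr)
            return &t->info[i];
    if (!create || t->used == NADDR)
        return NULL;
    t->addr[t->used] = addr;
    t->info[t->used].route = 0;
    t->info[t->used].routereq = 0;
    return &t->info[t->used++];
}

/* The peer on connection from asked who has addr.
 * The caller would then forward the request to its other peers. */
void on_route_request(struct route_table con[NCON], int from, uint32_t addr)
{
    struct route_info *ri;
    assert(from >= 0 && from < NCON);
    ri = find(&con[from], addr, 1);
    if (ri)
        ri->routereq = 1;
}

/* The peer on connection from answered for addr. Fills waiting[] with the
 * connections the response must be forwarded to and returns their count. */
int on_route_response(struct route_table con[NCON], int from, uint32_t addr,
                      int waiting[NCON])
{
    struct route_info *ri;
    int c, n = 0;

    assert(from >= 0 && from < NCON);
    ri = find(&con[from], addr, 1);
    if (ri)
        ri->route = 1;                 /* this connection can route to addr */
    for (c = 0; c < NCON; c++) {
        ri = find(&con[c], addr, 0);
        if (ri && ri->routereq) {
            ri->routereq = 0;          /* request satisfied */
            waiting[n++] = c;
        }
    }
    return n;
}

/* Returns a connection able to route to addr, or -1 (discard the packet). */
int pick_route(struct route_table con[NCON], uint32_t addr)
{
    int c;
    for (c = 0; c < NCON; c++) {
        struct route_info *ri = find(&con[c], addr, 0);
        if (ri && ri->route)
            return c;
    }
    return -1;
}
```

pick_route() simply returns the first usable connection. Choosing randomly among all connections with the route flag set would be closer to the randomized relaying from the analogy.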

One might ask what would happen if a peer in the middle of a route died. Since each connection is associated with a hash table that holds route_info, the hash table for that connection would have been cleared, and the packet would be discarded since it is no longer routable. The peer will then repeat the route request procedure above to try to establish a new route.

In the experiment, I did a simple test by pinging BOTTLE from BLENDER. The route was established through LIGHTBULB almost instantaneously, and I got ping times as low as 50ms. I was even able to ssh to BOTTLE from BLENDER over this P2P VLAN, all the while LIGHTBULB was routing under the covers. To test the dead-route scenario above, I restarted BOTTLE's demonstration, which killed the route to BOTTLE. Soon after, ping packets routed to LIGHTBULB were discarded until LIGHTBULB re-established its connection with BOTTLE and the route between LIGHTBULB and BOTTLE was re-established.

Here is a picture of the transactions:

That experiment demonstrates the feasibility of a P2P VLAN and shows how one might structure the software to handle the routing.

The real software will probably use UDP, since it is probably a bad idea to encapsulate a reliable protocol (TCP) inside TCP connections. But the bigger issue is how to deal with liars. A liar is a peer that behaves badly, responds falsely to requests, and perhaps uses other malicious tactics to damage the network.

One of the major security tactics, aside from encryption, is to use multiple and random routes to prevent any one peer from accumulating all packets. That is, we treat multiple peers as one logical tunnel. This serves purposes beyond security: it allows for more robust communication and distributes bandwidth usage among peers. Most peers will not want to dedicate a large portion of their bandwidth to routing.

The first and foremost problem to be dealt with is automatic determination of a node's IP. A node could ask its peers if an IP is in use, but a peer could lie. To overcome this problem, a node could generate a public key, a time-consuming process, and then take a hash of the key. If we were to use the 10.x.y.z IP block, we would want a 24-bit hash; perhaps I will write a CRC24 method. Although mapping 256-bit keys onto a 24-bit hash means many keys share each hash value, the chance that two given nodes collide on an IP address would be near one in sixteen million (2^24). This method ensures a) most likely a unique IP is chosen, b) easy detection of lying route responses (the public key is appended to responses), and c) difficulty in targeting any one specific IP address. CRC24 might not be a good enough hash; a real cryptographic hash might have to be used, since CRC can supposedly be reversed to a degree. Even though many 256-bit keys map to each CRC value, one could possibly reverse the process and pick a 256-bit number from one of the possibilities. But that only fixes one of many problems.
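For reference, the CRC24 idea might look like the sketch below. It uses the CRC-24 defined for OpenPGP in RFC 4880 (section 6.1), which is only one possible choice and, as noted above, not a cryptographic hash. addr_from_key() is a hypothetical helper name, and the key bytes are whatever encoding of the public key the nodes agree on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC24_INIT 0xB704CEu
#define CRC24_POLY 0x1864CFBu

/* CRC-24 as specified for OpenPGP (RFC 4880, section 6.1). */
uint32_t crc24(const uint8_t *data, size_t len)
{
    uint32_t crc = CRC24_INIT;
    int i;

    assert(data != NULL || len == 0);
    while (len--) {
        crc ^= (uint32_t)(*data++) << 16;
        for (i = 0; i < 8; i++) {
            crc <<= 1;
            if (crc & 0x1000000u)
                crc ^= CRC24_POLY;
        }
    }
    return crc & 0xFFFFFFu;
}

/* Map a public key to an address in 10.0.0.0/8 (host byte order):
 * the top byte is fixed to 10, the low 24 bits come from the hash. */
uint32_t addr_from_key(const uint8_t *pubkey, size_t len)
{
    return (10u << 24) | crc24(pubkey, len);
}
```

A real implementation would presumably also have to reject keys that hash to 10.0.0.0 or 10.255.255.255 and generate a new key in that case.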

Other measures could be heuristics (using known information) and tolerance. Tolerance assumes that a majority of peers are not liars. It assumes that the majority of matching responses is the correct response, and if there is no majority response, the responses are discarded. Note that this would require each peer to have a minimum of three direct peers. For small networks, however, this could be easily overcome. As the number of fully functional peers grows, the network can tolerate more liars.
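The tolerance rule can be sketched as a strict majority vote. This is only an illustration: each response is reduced to a single uint32_t (for example a digest of the response body), and majority_response() is a made-up name. It uses the Boyer-Moore voting trick followed by a confirming count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 and store the winner in *out if one value accounts for more
 * than half of the n responses; return 0 (discard everything) otherwise.
 */
int majority_response(const uint32_t *resp, size_t n, uint32_t *out)
{
    uint32_t cand = 0;
    size_t i, count = 0;

    assert(out != NULL);

    /* First pass: find the only value that could be a majority. */
    for (i = 0; i < n; i++) {
        if (count == 0) {
            cand = resp[i];
            count = 1;
        } else if (resp[i] == cand) {
            count++;
        } else {
            count--;
        }
    }

    /* Second pass: confirm that it really is a strict majority. */
    count = 0;
    for (i = 0; i < n; i++)
        if (resp[i] == cand)
            count++;
    if (n == 0 || count * 2 <= n)
        return 0;
    *out = cand;
    return 1;
}
```

With three direct peers, two matching answers win and three different answers are discarded, which is why three is the minimum mentioned above.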

I will compile a list of experiments to test each security measure or a combination of the security measures. When I get more time, I will code the actual project with a friend. It's lots of fun.

A more interesting aspect of a P2P VLAN is that it could be used not only for small things but on a wide scale. Perhaps it could be a full-fledged virtual Internet ... all that would be needed is a P2P-driven DNS system. Again, such a P2P VLAN ensures anonymity, security and privacy.

Please comment ... especially on how to deal with lying peers.
Until next time

See also
http://freenet.sourceforge.net/

Newby · « **Reply #1 on:** August 12, 2006, 09:59:45 pm »

Quote from: nslay on August 11, 2006, 07:56:43 am

The real software will probably use UDP, since it is probably a bad idea to encapsulate a reliable protocol (TCP) inside TCP connections. But the bigger issue is how to deal with liars. A liar is a peer that behaves badly, responds falsely to requests, and perhaps uses other malicious tactics to damage the network.

There's no real way to deal with a liar. You could send a bogus packet through (in your example, let's say you want to communicate with someone who has the Joker card, but it was removed), and if someone responds with "yes, I know who has the Joker card," you could simply drop him as a friend and route nothing through him, since he's not trustworthy. If everyone on the net does this, he'll eventually be cut off from the network and no one can talk to him. At that point, you can adjust the route table accordingly. Sadly, one could write a decent malicious program that recognizes bogus packets. Another method would be to replace the bogus card with your own card and see if your friends know who has it. Unfortunately, this could fail because your "friends" should know you have it, and if a malicious person is your friend, he could easily act properly.

Those are just two suggestions.

Nice write-up, by the way.

nslay · « **Reply #2 on:** August 13, 2006, 06:06:24 am »

Quote from: Newby on August 12, 2006, 09:59:45 pm

Quote from: nslay on August 11, 2006, 07:56:43 am
The real software will probably use UDP, since it is probably a bad idea to encapsulate a reliable protocol (TCP) inside TCP connections. But the bigger issue is how to deal with liars. A liar is a peer that behaves badly, responds falsely to requests, and perhaps uses other malicious tactics to damage the network.

There's no real way to deal with a liar. You could send a bogus packet through (in your example, let's say you want to communicate with someone who has the Joker card, but it was removed), and if someone responds with "yes, I know who has the Joker card," you could simply drop him as a friend and route nothing through him, since he's not trustworthy. If everyone on the net does this, he'll eventually be cut off from the network and no one can talk to him. At that point, you can adjust the route table accordingly. Sadly, one could write a decent malicious program that recognizes bogus packets. Another method would be to replace the bogus card with your own card and see if your friends know who has it. Unfortunately, this could fail because your "friends" should know you have it, and if a malicious person is your friend, he could easily act properly.

Those are just two suggestions.

Nice write-up, by the way.

Yeah. It's a difficult problem, especially since even the malicious users are perfectly anonymous. It's really a pity the creators of TCP/IP didn't use a public key address system. This would solve nearly every problem.
However, as mentioned above a simple solution would be to:
1) Generate a public/private key pair
2) Take a cryptographic hash of the public key
3) Mask the hash with ~netmask (hash & ~netmask), keeping only the host bits

The result is the remaining bits of the IP address, and the public key is used for negotiating encryption between two nodes. On a route response, a node must append its public key so that the querier can encrypt a session key to use for communication (with a cipher like 3DES, AES, Blowfish, etc.). Note that it is now very easy to verify a genuine response: a peer forwarding a response need only perform steps 2 and 3 to verify it.
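Steps 2 and 3, plus the check a forwarding peer would run, could be sketched like this. The hash here is 32-bit FNV-1a, which is NOT cryptographic and only marks the spot where a real digest would go. derive_addr() and response_is_genuine() are hypothetical names, and addresses are in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (32-bit FNV-1a). Not cryptographic; a placeholder only. */
static uint32_t toy_hash(const uint8_t *p, size_t n)
{
    uint32_t h = 2166136261u;
    while (n--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* Steps 2 and 3: hash the public key and keep only the host bits. */
uint32_t derive_addr(uint32_t network, uint32_t netmask,
                     const uint8_t *pubkey, size_t len)
{
    assert(pubkey != NULL);
    return (network & netmask) | (toy_hash(pubkey, len) & ~netmask);
}

/* A forwarding peer accepts a route response only if the appended
 * public key actually maps to the address being claimed. */
int response_is_genuine(uint32_t claimed, uint32_t network, uint32_t netmask,
                        const uint8_t *pubkey, size_t len)
{
    return derive_addr(network, netmask, pubkey, len) == claimed;
}
```

A liar who wants to answer for someone else's address would need a key that hashes to that address, which is the property a cryptographic hash is meant to make hard to achieve.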

You might wonder why a cryptographic hash is desirable. Consider a simple mathematical function such as f(x)=x^4 (x to the 4th power).
If we hash the number 2 we get f(2)=16.
Now note that this hash can be reversed to a degree.
f(x)=16 when x=+/-2, +/-2i
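Written out, those four preimages come from factoring the equation f(x)=16:

```latex
x^4 - 16 = (x^2 - 4)(x^2 + 4) = (x - 2)(x + 2)(x - 2i)(x + 2i) = 0
\quad\Longrightarrow\quad x \in \{\, 2,\; -2,\; 2i,\; -2i \,\}
```

So knowing the output 16 narrows the input down to just four candidates, which is what "reversed to a degree" means here.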

I believe ssh uses 2048-bit keys, so suppose your domain is [0,2^2048) (2^2048 has 617 digits in base 10).
Try that with MD5 or SHA-1.

One could use rainbow tables to search for possible inputs, but generating the rainbow table would be time-consuming, and one would certainly need deep pockets to do it. Secondly, a 2048-bit public key is a product of two very large prime numbers, circa 2^1024 each. What are the odds that any old number you pick has that property?

So, it's extremely difficult to spoof route responses with this idea.

Joe · « **Reply #3 on:** August 13, 2006, 10:32:24 pm »

Couldn't you implement AES above TCP and have the same effect as under it? I suppose you still have to pass the key, though..

nslay · « **Reply #4 on:** August 15, 2006, 01:22:38 am »

Quote from: Joe[x86] on August 13, 2006, 10:32:24 pm

Couldn't you implement AES above TCP and have the same effect as under it? I suppose you still have to pass the key, though..

No, because each peer is responsible for routing packets.
In the experiment above:
BOTTLE <-> LIGHTBULB <-> BLENDER

If we assume there is encryption between BOTTLE and LIGHTBULB and between LIGHTBULB and BLENDER, what happens if BOTTLE wants to communicate with BLENDER? BOTTLE sends encrypted packets to LIGHTBULB, and LIGHTBULB decrypts the packets and then encrypts them again to send to BLENDER. Now, note that LIGHTBULB is still able to read the packet addressed to BLENDER. What needs to be done is to encrypt packets with a key only BLENDER and BOTTLE know, so that LIGHTBULB can't read routed packets. Encryption on each connection alone isn't enough, although it will be used along with the above method.
Again, and I stress, we do not connect directly to another peer for communication, in order to preserve anonymity and transcend network topologies; we rely on peers to do the routing for us.
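The two layers can be illustrated with a toy example. The "cipher" below is a repeating-key XOR, which offers no real security and is only a stand-in for something like AES. The function names are invented, and seen is just a buffer recording what the relay gets to look at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy cipher: XOR with a repeating key. Applying it twice with the same
 * key restores the input. Stand-in only; it provides no real security. */
void toy_crypt(uint8_t *buf, size_t n, const uint8_t *key, size_t klen)
{
    size_t i;
    assert(klen > 0);
    for (i = 0; i < n; i++)
        buf[i] ^= key[i % klen];
}

/* What a relay such as LIGHTBULB does: strip the inbound link layer,
 * then apply the outbound link layer. seen records what the relay can
 * observe in between. */
void relay(uint8_t *pkt, size_t n, const uint8_t *k_in, const uint8_t *k_out,
           size_t klen, uint8_t *seen)
{
    toy_crypt(pkt, n, k_in, klen);    /* decrypt BOTTLE <-> LIGHTBULB layer */
    memcpy(seen, pkt, n);             /* still protected by the end-to-end key */
    toy_crypt(pkt, n, k_out, klen);   /* encrypt LIGHTBULB <-> BLENDER layer */
}
```

The sender applies the end-to-end key first and the link key second. After the relay step, only the holder of the end-to-end key (BLENDER) can recover the plaintext.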

nslay · « **Reply #5 on:** August 21, 2006, 12:02:50 am »

Today, my friend found 3 old machines lying around outside.
One AMD K6 and two Pentium II machines. This is going to give me a total of 6 test machines!

Newby · « **Reply #6 on:** August 21, 2006, 12:20:53 am »

Fun!

Now you can test out the app over a larger network, versus between two.

Sidoh · « **Reply #7 on:** August 21, 2006, 12:28:47 am »

Quote from: Newby on August 21, 2006, 12:20:53 am

Fun!

Now you can test out the app over a larger network, versus between two.

Ooooh, now I get it. Thanks Newby.

rabbit · « **Reply #8 on:** August 21, 2006, 09:43:30 am »

Quote from: nslay on August 21, 2006, 12:02:50 am

Today, my friend found 3 old machines lying around outside.
One AMD K6 and two Pentium II machines. This is going to give me a total of 6 test machines!

I have a Compaq Proliant 1500, and a bunch of other non-server boxes. If you pay S&H, they are yours.

nslay · « **Reply #9 on:** November 01, 2006, 12:49:43 pm »

I've started working on a prototype of the tunnel project. Because of a combination of my inexperience with large projects and the oddity of this software, it is hard for me to create a concrete design. It's not clear what should be made separate components and what should be internal to the software. The prototype should give me some ideas on what the real design should look like, as well as give me a tool to do some analysis. Aside from that, some math has to be derived to describe the characteristics and phenomena of the network. So, I hope to have a fully working prototype by December that offers the features described above, along with some mathematics that describe the network's workings.
By Spring of next year, I should hopefully be working on the real project. I invite anybody who has experience in any of the following:

Network programming
Linux tun
Windows VPN
OpenSSL and Cryptography
Graph theory and Statistics

Tun doesn't appear to be documented for Linux, given the poor state of the Linux documentation and the LDP. I have no idea where to look for the Windows VPN hooks. I have taught myself OpenSSL and have the math background to understand cryptography, but because of OpenSSL's poor documentation, I could potentially misuse it. I have no knowledge of graph theory ... it would be nice if I didn't have to teach myself the subject. When I say statistics, I mean someone who has taken the college equivalent of Mathematical Statistics (calculus-based statistics).

MyndFyre · « **Reply #10 on:** November 01, 2006, 01:28:12 pm »

I'm solid in stats, and I've virtually-privately networked my home PC with others. Does that qualify me?

nslay · « **Reply #11 on:** November 01, 2006, 03:09:22 pm »

Quote from: MyndFyre[x86] on November 01, 2006, 01:28:12 pm

I'm solid in stats, and I've virtually-privately networked my home PC with others. Does that qualify me?

Sure does...although, when I note "Windows VPN", I mean the VPN interface in Windows that lets software create a fake network interface. I'm talking on a programming level.
Take for example the FreeBSD tun device:
http://www.freebsd.org/cgi/man.cgi?query=tun

tun is a pseudo network interface. It acts like a network interface except that everything is handled in software. It lets a process open a tunN interface for reading/writing. Any packets the OS sends through the tunN interface are intercepted by the process, and any packets the process writes to the tunN interface are received by the OS. That's the basic functionality of tun, but it lets you do a whole lot more. For example, the process can set the tunN interface to be a Token Ring interface that supports Broadcast, Multicast or Point-to-point mode of operation. Of course, I set up tun to be an IFT_TUNNEL with an MTU of TUNMTU, and it supports IFF_BROADCAST. For convenience, I enable TUNSLMODE so that each packet received from tun has a sockaddr prepended to it.
The reason I am so interested in using tun is that it can be configured as a normal Ethernet interface and allows existing TCP/IP applications to use it indirectly. The daemon that has the tun device open intercepts packets and routes them accordingly over UDP.

For more information on network interfaces in FreeBSD, you can see:
http://www.freebsd.org/cgi/man.cgi?query=netintro

d&q · « **Reply #12 on:** November 02, 2006, 09:13:21 pm »

I took AP Statistics and got an A...does that count?

nslay · « **Reply #13 on:** November 02, 2006, 09:23:35 pm »

Quote from: Deuce on November 02, 2006, 09:13:21 pm

I took AP Statistics and got an A...does that count?

Depends on what topics you covered. To be honest, I'm not quite sure what kind of stats will be done just yet.

nslay · « **Reply #14 on:** December 29, 2006, 05:26:01 am »

It's been nearly 5 months since embarking on this project, and I'm pleased to announce that I'm on the verge of a prototype. As it stands, I've written over 2000 lines of code (22 headers and C files combined), and that number will continue to climb. I've written everything from the configuration parser to the secure UDP library, all the way to the tunnel manager; now all that remains is threading all these components together.

Some of the design goals of the prototype include:
- Low level encryption (over UDP)
- High level encryption (over peer-to-peer route)
- Medium independence (so this can extend to, say, ad-hoc Wi-Fi and other mediums besides UDP)
- An API to allow applications to communicate with the daemon
- An API to allow applications to manipulate the underlying protocol (For example, for a larger scale p2p tunnel network, you might want to create a list daemon that keeps tabs on new peers and dead peers ... to do this you'd need to be able to jack into the low level protocol)

So far I've only accomplished one of these goals. I'm not sure I can accomplish the medium-independence goal because it's very hard to design an API that is general enough to work with all sorts of quirks in mediums.

EDIT: Note that when I say "medium independence," it means that the core code will use a generalized medium API to invoke communications, but the actual support for a medium still has to be written.

Overall goals:
I'd like to design, as mentioned above, a list daemon, and a p2p-DNS daemon. These will of course use the above mentioned API.
The list daemon's function is fairly simple: it collects peers' addresses and synchronizes with other peers running the list daemon. This is useful if you are on a fairly large and dynamic tunnel network (where lots of people come and go). The p2p-DNS daemon will listen for DNS queries on localhost on the frontend and, on the backend, coordinate domain name ownership and handle domain name queries over the P2P protocol. Since it's possible that people will want to be on multiple tunnel networks (much like how everyone wants to be on all their friends' networks on social/cell phone/etc. networks), tunnel networks will need to have identifiers like names. The DNS server will handle the ambiguity by accepting an optional networkname|server.com format. This way, if you want to browse x86labs.org on Myndfyre's network specifically, you would type Myndfyre|www.x86labs.org. That's the idea, anyway. How to handle colliding network blocks is still a mystery. How does the OS behave if you have more than one tunnel using the same network block? In Unix it's possible for applications to bind to a specific interface, but that's not a very portable solution and would require applications to add such support.
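Parsing the optional networkname|server.com format could be as simple as the sketch below. split_query() is a made-up helper, not part of the prototype. It assumes the first '|' separates the two parts and that a query without '|' has no network name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "network|host" into its two parts. net is set to "" when the
 * query carries no network name.
 * Returns 0 on success, -1 if a part does not fit or the host is empty.
 */
int split_query(const char *q, char *net, size_t netsz,
                char *host, size_t hostsz)
{
    const char *bar, *h;
    size_t nlen, hlen;

    assert(q != NULL && net != NULL && host != NULL);
    bar = strchr(q, '|');
    nlen = bar ? (size_t)(bar - q) : 0;
    h = bar ? bar + 1 : q;
    hlen = strlen(h);

    if (nlen >= netsz || hlen >= hostsz || hlen == 0)
        return -1;
    memcpy(net, q, nlen);
    net[nlen] = '\0';
    memcpy(host, h, hlen + 1);   /* includes the terminating NUL */
    return 0;
}
```

For "Myndfyre|www.x86labs.org" this yields the network name "Myndfyre" and the host "www.x86labs.org". A plain "www.x86labs.org" yields an empty network name, leaving the choice of network to the daemon.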
