I just sort of skimmed the article, so sorry if I'm repeating what's already been said, but here's my $0.02:
It depends on whether accuracy or getting through walls matters more to you. If you want to go through walls, you're locked into RF (and relatively high power, so you'd have to use the 2.4 GHz band, which would probably demolish any 802.11 networks in the area), and your accuracy will suffer severely. I also sincerely doubt that you have sufficient background to attempt such a thing yet; I've been in the wireless industry for two years, and I still only understand the basic principles involved.
I don't think radio is plausible for your situation; the one exception would be to re-use an existing TPS solution. Beware, though: they (cdmaOne, 802.11n, etc.) are all inaccurate, because they depend on timing a signal that moves at the speed of light between reference points that are really close together. The accuracy of GPS is critically dependent on the fact that GPS satellites are far away.
What it really comes down to is how important accuracy is to you. If you just want to tell what room someone's in, you can get away with an 802.11n system -- and you don't even need sensors on people's bodies; you can "see" a human in the 2.4 GHz band. There are some intrusion detection systems in development that can run on any WRT-based 802.11n router.
If, however, you need to know accurately where in the room a person is, I think you should just use sound; it's probably your only hope if accuracy is desirable. As iago hinted earlier, there are roughly six orders of magnitude between the speed of light and the speed of sound, so your propagation delay will also be six orders of magnitude larger than that of an RF-based system, but it also means that your reference points can be six orders of magnitude closer for the same accuracy.
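To put rough numbers on that, here's a quick sketch of the timing resolution you'd need for a given distance accuracy. The 1 cm target and the 343 m/s figure for sound (dry air at about 20 C) are my own assumptions, not anything from the original question.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # vacuum value; close enough for air
SPEED_OF_SOUND_M_S = 343.0          # assumed: dry air at about 20 C


def timing_resolution_s(accuracy_m, speed_m_s):
    """Time-of-flight resolution needed to resolve accuracy_m of distance."""
    return accuracy_m / speed_m_s


accuracy_m = 0.01  # hypothetical target: 1 cm
rf_s = timing_resolution_s(accuracy_m, SPEED_OF_LIGHT_M_S)
sound_s = timing_resolution_s(accuracy_m, SPEED_OF_SOUND_M_S)

print(f"RF needs    {rf_s:.3e} s timing resolution")
print(f"sound needs {sound_s:.3e} s timing resolution")
print(f"ratio       {sound_s / rf_s:.3e}")
```

That works out to tens of picoseconds for RF versus tens of microseconds for sound, and the latter is something a cheap microcontroller can time.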
Have you determined how to locate your reference points? If you build a sensor into each of the transmitters, you can build an ad hoc relative positioning system, where one sensor is simply defined as the origin and another as the master reference, and all the other points are given relative to the positions defined by the first two.
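Here's a minimal sketch of that idea, assuming a flat (2D) layout and noise-free distance measurements between every pair of sensors. Sensor 0 is the origin, sensor 1 is the master reference on the positive x axis, and the first sensor found off that axis is arbitrarily given positive y to settle the mirror ambiguity. The function name and the example layout are made up for illustration.

```python
import math


def relative_positions(dist):
    """Place sensors in a 2D frame from pairwise distances dist[i][j].

    Sensor 0 becomes the origin and sensor 1 defines the +x axis.
    """
    d01 = dist[0][1]
    pos = [(0.0, 0.0), (d01, 0.0)]
    anchor = None  # first sensor placed off the x axis; fixes the sign of y
    for i in range(2, len(dist)):
        # Intersect the circles around sensor 0 and sensor 1.
        x = (dist[0][i] ** 2 + d01 ** 2 - dist[1][i] ** 2) / (2 * d01)
        y = math.sqrt(max(dist[0][i] ** 2 - x ** 2, 0.0))
        if anchor is None:
            if y > 1e-9:
                anchor = i  # convention: this sensor gets positive y
        else:
            # Pick the sign of y that agrees with the distance to the anchor.
            ax, ay = pos[anchor]
            err_pos = abs(math.hypot(x - ax, y - ay) - dist[anchor][i])
            err_neg = abs(math.hypot(x - ax, -y - ay) - dist[anchor][i])
            if err_neg < err_pos:
                y = -y
        pos.append((x, y))
    return pos


# Hypothetical layout, used only to generate consistent distances.
layout = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (3.0, -2.0)]
dist = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in layout] for a in layout]
pos = relative_positions(dist)
for i, p in enumerate(pos):
    print(i, p)
```

With real, noisy measurements you'd want something more forgiving than this direct construction, but it shows why two defined sensors are enough to pin down the frame.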
I'm pretty sure you only need 3, even in 3 dimensions, but I'm bad at visualizing 3D, so you might be right.
You need not visualize 3d, just understand that each reference point knocks out one dimension (though the reference frame is not fixed). When you get down to 1 dimension, you still need two reference points - and that should be easy to visualize.
If you have only as many reference points as dimensions, there are two points at which the object could be. In the case of 3D, think of the plane defined by the 3 reference points, and mirror the point across that plane.
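If it helps, here's a small sketch of the 3D case with exactly three reference points, using the usual textbook trilateration construction. It assumes exact ranges and references that aren't collinear; the function name and the example coordinates are invented. It returns both candidates, which are mirror images across the plane through the three references.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def trilaterate_3d(p1, p2, p3, r1, r2, r3):
    """Return the two points at distances r1, r2, r3 from p1, p2, p3."""
    # Build an orthonormal frame with p1 at the origin and p2 on the x axis.
    d = math.sqrt(dot(sub(p2, p1), sub(p2, p1)))
    ex = scale(sub(p2, p1), 1.0 / d)
    i = dot(ex, sub(p3, p1))
    ey_raw = sub(sub(p3, p1), scale(ex, i))
    ey = scale(ey_raw, 1.0 / math.sqrt(dot(ey_raw, ey_raw)))
    ez = cross(ex, ey)  # normal of the plane through the references
    j = dot(ey, sub(p3, p1))

    x = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)
    y = (r1 ** 2 - r3 ** 2 + i ** 2 + j ** 2) / (2 * j) - (i / j) * x
    z = math.sqrt(max(r1 ** 2 - x ** 2 - y ** 2, 0.0))

    def to_world(zz):
        return tuple(p1[k] + x * ex[k] + y * ey[k] + zz * ez[k] for k in range(3))

    return to_world(z), to_world(-z)


# Hypothetical references on the floor and a target 2 m above it.
refs = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
target = (1.0, 1.0, 2.0)
ranges = [math.dist(target, p) for p in refs]
solutions = trilaterate_3d(*refs, *ranges)
print(solutions)
```

One candidate is the real target and the other sits the same distance below the floor. A fourth reference, or simply knowing which side of the plane the person is on, picks between them.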
RF would be the preferred technology for this application (low power, easy to implement, transmitters/receivers are cheap, low interference issues).
Interference in the 2.4 GHz band is not low; the thing about WiFi is that the AP doesn't really have to know where you are accurately, just where it looks like you are from its perspective.
Even with 4 reference points, you're going to have dead zones, distortion, and interference. However, with enough redundancy, you should be able to almost disregard those problems. You may have to fill every channel in every room.
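As a rough illustration of how redundancy gets used, here's a sketch of a linearized least-squares position fix from more than four references. This is one common way to do it, not necessarily what you'd end up building. It assumes the references aren't all in one plane, and the coordinates and the 5 cm error on one range are invented for the example.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def locate(refs, ranges):
    """Least-squares position from reference points and measured ranges."""
    p0, r0 = refs[0], ranges[0]
    rows, rhs = [], []
    # Subtracting the first range equation from the others makes them linear.
    for p, r in zip(refs[1:], ranges[1:]):
        rows.append([2 * (p[k] - p0[k]) for k in range(3)])
        rhs.append(r0 ** 2 - r ** 2 + sum(c * c for c in p) - sum(c * c for c in p0))
    # Normal equations: (A^T A) x = A^T b.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * b for row, b in zip(rows, rhs)) for i in range(3)]
    return solve3(ata, atb)


# Hypothetical room with six references (not all coplanar).
refs = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 4.0, 0.0),
        (0.0, 0.0, 3.0), (5.0, 4.0, 3.0), (5.0, 4.0, 0.0)]
target = (1.0, 2.0, 1.0)
exact = [math.dist(target, p) for p in refs]
estimate = locate(refs, exact)

noisy = exact[:]
noisy[2] += 0.05  # pretend one range is 5 cm long
print("exact ranges:", estimate)
print("one bad range:", locate(refs, noisy))
```

The more references you add, the less any single bad range can drag the answer around, which is the whole point of the redundancy.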