Well, two ways come to mind immediately (keeping in mind that I didn't really read your ideas, so this will probably overlap):
1) Echolocation, or something similar -- the object/person is equipped with a directional beacon. But that won't get you an x, y, z very easily, especially if they turn; it's more useful for navigating unknown surroundings (that is, not running into a wall).
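2) Triangulation (strictly speaking trilateration, since it works from distances rather than angles) -- have three separate units that can each tell the distance to the object/person (by radio, sound, throwing ping pong balls, or whatever). The farther apart the better. From the ping time between each base and the object, you can get the distance. Conceptually, draw a sphere around each base with that distance as its radius; where all three come closest to meeting is where the object is. Three spheres generally meet at two points, mirrored across the plane the sensors sit in, so you'll need either a fourth unit or a known constraint (say, sensors on the ceiling and the object always below them) to pick the right one. You'll need to know the x/y/z of each of the sensors, but not of the object itself. You also don't have to know the shape of the room, although this will break if the person goes behind a wall, or if the signals bounce off walls.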
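For what it's worth, here is a rough Python sketch of option 2, assuming exactly three sensors at known positions and ideal distance readings. The function names (`distance_from_ping`, `trilaterate`) and the sensor layout are made up for illustration, and the ping is assumed to be a round trip. With only three sensors the math gives two mirror-image candidates, so the sketch returns both and leaves picking one to you.

```python
import math

def distance_from_ping(round_trip_s, speed_m_s):
    """One-way distance, assuming the ping time covers the trip there and back."""
    return speed_m_s * round_trip_s / 2.0

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _scale(a, k):
    return tuple(x * k for x in a)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def trilaterate(p1, p2, p3, r1, r2, r3):
    """Return the two candidate points at distances r1, r2, r3 from p1, p2, p3."""
    # Build a local frame: p1 at the origin, p2 on the x axis, p3 in the xy plane.
    d = math.sqrt(_dot(_sub(p2, p1), _sub(p2, p1)))
    if d == 0:
        raise ValueError("sensors 1 and 2 are at the same position")
    ex = _scale(_sub(p2, p1), 1.0 / d)
    i = _dot(ex, _sub(p3, p1))
    ey_raw = _sub(_sub(p3, p1), _scale(ex, i))
    j = math.sqrt(_dot(ey_raw, ey_raw))
    if j == 0:
        raise ValueError("sensors are collinear; no unique fix")
    ey = _scale(ey_raw, 1.0 / j)
    ez = _cross(ex, ey)

    # Subtracting the sphere equations pairwise gives x and y directly.
    x = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)
    y = (r1 ** 2 - r3 ** 2 + i ** 2 + j ** 2) / (2 * j) - (i / j) * x
    # With noisy readings the spheres may not quite meet; clamp to the closest approach.
    z = math.sqrt(max(r1 ** 2 - x ** 2 - y ** 2, 0.0))

    base = tuple(p + x * a + y * b for p, a, b in zip(p1, ex, ey))
    return (tuple(q + z * c for q, c in zip(base, ez)),
            tuple(q - z * c for q, c in zip(base, ez)))

# Hypothetical layout: three sensors in one plane, object at (1, 1, 2).
sensors = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
target = (1.0, 1.0, 2.0)
ranges = [math.dist(s, target) for s in sensors]
cand_a, cand_b = trilaterate(*sensors, *ranges)
print(cand_a, cand_b)
```

One of the two printed candidates is the real position and the other is its reflection through the sensor plane. If the sensors are all mounted on one surface and the object can only be on one side of it, that is enough to throw the wrong one away; otherwise a fourth sensor out of that plane would settle it.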
Have you considered GPS? Outdoors it'll typically give you the x/y of an object to within a few meters (through the same kind of distance measurement, as it turns out), but I don't think you'll get a great z, and reception indoors is usually poor.