RPE = "render and processing engine". That is all we know for sure...
However, I doubt it is specifically to do with tessellation, as tessellation is a geometry-based operation (i.e. the sub-division of polygons) and has little to do directly with rendering. The name above suggests a broader application. My guess is that it is similar to what Baboonanza suggests: an indivisible bank of shaders. Increased modularity is broadly required to maintain / improve scalability to larger numbers of SPs and greater data throughput (something NVIDIA moved towards with Fermi). Whether a tessellation unit is associated with each RPE, or whether a single global tessellation unit exists, will depend on the nature of the design (I suspect that a tessellation unit for each RPE would be logical, given the modular design paradigm, but we will have to wait and see).
Also, if you take a look at this picture (posted in another thread):
It SEEMS, from the shape of the blurs, that the Cayman has 3 RPEs and 480(x4) stream processors. It's hard to be sure, but the number of texture units is definitely a two-digit number (i.e. not 128), so this adds some weight to the assumption. Also, the blur where the number of RPEs sits looks to be the wrong shape for a "4".
It will be interesting to see how independent these RPEs are, how many transistors (and so how much die area) are given over to improved command / control logic, and how they interface with the memory. IF they interface with the memory independently and in parallel, we could be looking at a 384-bit bus on Cayman (though this is not something we've heard much about, so perhaps not).
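For what it's worth, here is the back-of-envelope arithmetic behind those numbers as a quick Python sketch. Everything in it is a guess: the 3 RPEs and 480(x4) stream processors are my reading of the blurs, and the 128-bit memory channel per RPE is purely my own assumption to show how 3 independent RPEs could add up to 384 bits. The function and parameter names are made up for illustration.

```python
# Back-of-envelope numbers only; every input is a guess, not a confirmed spec.
def cayman_guess(rpes=3, sp_groups=480, sps_per_group=4, bus_bits_per_rpe=128):
    """Return (total SPs, SPs per RPE, total memory bus width in bits).

    bus_bits_per_rpe is an assumption: it supposes each RPE gets its own
    independent memory channel of that width.
    """
    total_sps = sp_groups * sps_per_group   # 480 (x4) stream processors
    sps_per_rpe = total_sps // rpes         # even split across the RPEs
    bus_width = rpes * bus_bits_per_rpe     # channels working in parallel
    return total_sps, sps_per_rpe, bus_width


total_sps, sps_per_rpe, bus_width = cayman_guess()
print(total_sps, sps_per_rpe, bus_width)
```

If the blur really is a "3", that gives a 384-bit bus under the 128-bit-per-RPE assumption; change `rpes` or `bus_bits_per_rpe` and the conclusion changes with it.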