Sorry that was a bit of a stream of consciousness. After explaining the problem to the Super Shadowy Sponsor guy I decided that even still the notion of introducers serving up new forward paths is still better overall.
The reason being that at any one time, all of the rendezvous points chosen by the hidden service could be very busy, not only with the hidden service traffic but also the general network relaying traffic.
And the sum bandwidth capacity of the chosen rendezvous points may not be nearly as high as the exit node can handle. Relays can advertise claimed capacities to the network, but this can be lies, whereas averaging it out over the whole network puts the inbound capacity at the average of the network as a whole, which is likely higher than a worst case rendezvous point relays sum capacity at a given time, by definition. Latency is a major enemy in my design thinking. I want to create a protocol that can become as ubiquitous as TCP/IP itself, which is something of an unfair competition, but it helps me evaluate design options under one razor that can rule out bad options quickly due to their latency cost.
The network would not be able to quickly adapt to this, as introducer points can only have their updates at the speed of the p2p DHT network's propagation rate. So until the hidden service changes its rendezvous points, traffic to it could be limited by an unfortunate choice of rendezvous points.
In contrast, the 8 hops, 3 forward provided by the hidden service, two first hops back by the hidden service, and 3 more including the last layer for returning to the client, can be chosen by both sides out of less busy relays based on both the p2p DHT network propagation of congestion information, and the direct data gathered via onion circuit paths about the level of utilisation of each of the relays. In fact, this information would be a logical component of the chatter between relays, reporting to each other their utilisation level.
The gathering of such data on an anonymity network is full of gotchas related to leaking contemporary data about the exits, especially, that a client is connecting to. As such, only exit relays used by a client report their current utilisation value, but because the client is constantly chosing new exit points for services that have many relays providing, this traffic can update the information of the client about the recent congestion state of the relays and feed into their probabalistic selection of nodes for any given message.
So, overall, I'm leaning very heavily towards the 2 hop first layer to be provided by the hidden service, and the 3 hops provided in the outbound message for delivering the return, because it fits in better with the congestion mitigation strategies that provide a stronger guarantee of low latency for users.
I originally wasn't going to focus much on this facility in the protocol, but when the "oh but they could just deliver short time valid outbound paths" thought occurred, I got all excited because it seemed at first like a better option than static and slowly changing rendezvous points forwarding traffic between the endpoints. But I think that in terms of reliability and latency minimisation, the 8 hop idea works out better.