Transistors have everything to do with it: the fewer you have, the smaller the die. If so many were not dedicated to RTX and DLSS, the dies would be smaller and cheaper.
As to efficiency, I am talking about raw performance as measured in game or synthetic benchmarks. For that there is very little to separate Pascal, Turing, or Volta.
Nvidia have screwed up badly with Turing, offering features that are extremely poor value for money.
The transistors themselves have no cost, only the die area they occupy. And as I said, DLSS and RTX have added about 8% to the die area, and the die itself is only about 35-40% of the GPU manufacturing cost (which in turn is about a third of the selling price at most). So adding RTX and DLSS has increased production costs by well under 5% in terms of die area. The R&D cost of RTX is another matter and a different discussion.
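As a rough sketch of that arithmetic (the 8% die-area figure and the cost splits are the estimates above, not exact numbers, and it assumes die cost scales roughly with area):

```python
# Back-of-envelope: how much does the RTX/DLSS die area add to the price?
# All inputs are the rough estimates from the post above.
extra_die_area = 0.08          # RTX + DLSS add ~8% die area
die_share_of_mfg_cost = 0.40   # die is ~35-40% of manufacturing cost
mfg_share_of_price = 1 / 3     # manufacturing is ~1/3 of the selling price at most

extra_mfg_cost = extra_die_area * die_share_of_mfg_cost    # ~3.2% of manufacturing cost
extra_price_share = extra_mfg_cost * mfg_share_of_price    # ~1.1% of the selling price

print(f"Extra manufacturing cost:     {extra_mfg_cost:.1%}")
print(f"Extra share of selling price: {extra_price_share:.1%}")
```

So even with generous rounding, the silicon spent on RTX/DLSS works out to around 3% of manufacturing cost and roughly 1% of the selling price.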
Those extra transistors are not only used for RTX and DLSS; they are used for a lot of other things, as I mentioned. The extra cache sizes alone probably account for an extra billion transistors. Turing is a very different architecture to Pascal, even if the raw performance numbers don't always bear that out, because there are always other bottlenecks. Under certain scenarios Turing shows impressive IPC gains over Pascal. What we don't see in Turing is the additional ~50% gain afforded by simply moving to a new process; instead we get the ~30% gain from a new architecture in isolation.
https://www.techpowerup.com/reviews/Zotac/GeForce_RTX_2080_AMP_Extreme/31.html
The 2080 is roughly 30% faster than the 1080 at 4K (66% vs 94% in the relative performance chart).
That 30% comes from architectural changes and the increased CUDA core count, which require a significant chunk of those new transistors and the increased die area.
Not least, the 2080 has 15% more CUDA cores than the 1080, and the CUDA cores take up most of the transistor budget. At the same number of CUDA cores, the Turing cores are more complex and use more transistors. The 2080 is faster than the 1080 Ti because of the core improvements and architecture changes.
L1 cache capacity increased from 24KB to 96KB per SM. L2 cache increased from 3MB to 6MB. There are new features like concurrent floating-point and integer math. All of this takes a lot of additional transistors.
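A rough way to sanity-check the "extra billion transistors for cache" figure: SRAM needs about 6 transistors per bit, so taking the per-SM and L2 figures above at face value and assuming TU102's 72 SMs:

```python
# Rough SRAM transistor estimate for Turing's larger caches (TU102 vs GP102).
# Cache figures from the post above; 72 SMs is the full TU102 SM count.
SRAM_TRANSISTORS_PER_BIT = 6              # standard 6T SRAM cell

extra_l1_bytes = (96 - 24) * 1024 * 72    # +72KB L1/shared per SM, 72 SMs
extra_l2_bytes = (6 - 3) * 1024 * 1024    # +3MB of L2

extra_bits = (extra_l1_bytes + extra_l2_bytes) * 8
cell_transistors = extra_bits * SRAM_TRANSISTORS_PER_BIT

print(f"Extra cache:       {(extra_l1_bytes + extra_l2_bytes) / 2**20:.1f} MB")
print(f"SRAM cells alone: ~{cell_transistors / 1e6:.0f} million transistors")
```

That comes out around 400 million transistors for the cell arrays alone; tag arrays, ECC and control logic add a lot on top, so a total on the order of high hundreds of millions to a billion is plausible.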
Then you have design decisions as well. Pascal didn't offer double-rate FP16 like Vega does, which is useful going forward. With the RTX Turing cards, Nvidia uses the Tensor cores to do fast FP16 at rates much higher than double-rate FP16, so the existence of Tensor cores reduces the transistor budget spent in the CUDA cores on FP16 support. Interestingly, with the GTX Turing parts that lack Tensor cores, Nvidia had to add dedicated FP16 support instead. I don't have time, but it would be interesting to compare the 1660 Ti's transistor count and architecture changes.
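To put some ballpark numbers on that (a sketch for the RTX 2080 using its published core counts and an assumed ~1.7GHz boost clock; real clocks vary by card):

```python
# Rough theoretical throughput for an RTX 2080 (assumed ~1.7GHz boost clock).
clock_hz = 1.7e9
cuda_cores = 2944
tensor_cores = 368                        # 46 SMs x 8 Tensor cores

fp32_tflops = 2 * cuda_cores * clock_hz / 1e12            # 1 FMA = 2 FLOPs
fp16_double_rate = 2 * fp32_tflops                        # Vega-style 2x FP16 packing
tensor_fp16 = tensor_cores * 64 * 2 * clock_hz / 1e12     # 64 FP16 FMAs/clock per Tensor core

print(f"FP32:             ~{fp32_tflops:.0f} TFLOPS")
print(f"Double-rate FP16: ~{fp16_double_rate:.0f} TFLOPS")
print(f"Tensor-core FP16: ~{tensor_fp16:.0f} TFLOPS")
```

That works out to roughly 10 TFLOPS FP32, 20 TFLOPS for double-rate FP16, and on the order of 80 TFLOPS of FP16 through the Tensor cores, which is why Nvidia didn't need to spend CUDA core transistors on fast FP16 in the RTX parts.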
Bottom line is, DLSS and RTX add well under 5% to Turing's production costs. Turing is expensive partly because production costs are increasing anyway (e.g. VRAM is more expensive), much more because of exponentially increasing R&D costs (a 7nm GPU will cost roughly 3x a 16nm GPU to design and bring to production), and simply because Nvidia is increasing its profit margins.
RTX has undoubtedly made Turing more expensive, but that is largely down to marketing, not production.