As far as privacy concerns go, I'm not sure you can get a good image by computationally removing the light from the pixels that sit directly over the camera (after all, what the screen is displaying is known to the system). I'd assume it's unlikely with the screen near or at maximum brightness, since the wells in each pixel have quite low capacity due to their tiny size compared to larger-format sensors, which can handle far more light before clipping. You'd need some extremely high readout speeds (which right now means pretty high read noise) plus some computational imaging to get a usable photo while the screen is on. As it stands, there's a good chance the screen above the sensor would need to be off to get a good photo. In that sense this is actually good on the privacy front, because it would be quite obvious when something is using your camera, just like with pop-up cameras. (I don't see pop-up cameras going anywhere, given the number of moving parts they need and the chunk of internal space those parts take up.)
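A rough sketch of why subtracting the known screen light doesn't fully fix things: shot noise scales with the total light collected, so even perfect subtraction of the screen's average contribution leaves its noise behind (and eats most of the well). All the numbers below are made-up illustrative values, not real sensor specs:

```python
import numpy as np

rng = np.random.default_rng(0)

scene = 200.0        # mean scene signal, electrons per pixel (assumed)
screen = 3000.0      # mean screen leakage, electrons (assumed: bright display)
full_well = 5000.0   # tiny-pixel full-well capacity, electrons (assumed)

n = 100_000
# Photon arrival is Poisson-distributed; the pixel clips at full well.
captured = np.minimum(rng.poisson(scene + screen, n), full_well).astype(float)
# Subtract the known mean screen contribution (the "computational removal").
estimate = captured - screen

# Reference: the same scene captured with the screen off.
screen_off = rng.poisson(scene, n).astype(float)

snr_on = scene / estimate.std()    # noise from scene + screen light remains
snr_off = scene / screen_off.std() # noise from scene light only
print(f"SNR with screen on:  {snr_on:.1f}")
print(f"SNR with screen off: {snr_off:.1f}")
```

Even in this idealized toy model (no read noise, perfectly known screen output), the screen-on SNR comes out several times worse than screen-off, which is why I'd expect the display to go dark over the sensor.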
This wouldn't matter when the screen is off because then it would be just like any normal camera sitting in a bezel unobstructed by anything.
It will probably take another decade or two before we get high-sensitivity, high-speed-readout image sensors (Quanta Image Sensors and their "jots" are already being developed by Eric Fossum, the inventor of the modern CMOS image sensor), at which point this would become a bigger privacy concern.
On the technology itself, I'm looking forward to it. I'd like to see us reach the Svper Phone concept that helped push forward this "bezel-less" trend, though there should still be some bezel, like on the iPhone X/XS, to help mitigate accidental presses.
As for how soon we'll see it: since it's already at the prototype stage, probably within the next 5 years? Likely in the iPhone 12 or 13, since those are the major redesigns and the S models are just refinements. I'd assume they've solved the distortion from light passing through the subpixel array (in principle this would work just like software lens-distortion correction) and have made the display glass "fast" enough that there isn't much of a problem with placing a sensor underneath.
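The correction would presumably work like any software lens correction: characterize the distortion once during calibration, then remap each output pixel from its corresponding (distorted) source location. A minimal sketch using a simple one-coefficient radial model (the model and the `k1` value are assumptions for illustration, not anyone's actual pipeline):

```python
import numpy as np

def radial_undistort(img, k1=-0.1):
    """Nearest-neighbor remap using a one-coefficient radial model.

    k1 is an assumed calibration constant; a real pipeline would fit a
    richer model (likely per-channel, to handle diffraction through the
    subpixel array) from test charts.
    """
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    y, x = np.mgrid[0:h, 0:w].astype(float)
    # Normalized coordinates relative to the image center.
    xn, yn = (x - cx) / cx, (y - cy) / cy
    r2 = xn ** 2 + yn ** 2
    # For each output pixel, sample from the distorted source location.
    factor = 1 + k1 * r2
    xs = np.clip(xn * factor * cx + cx, 0, w - 1).round().astype(int)
    ys = np.clip(yn * factor * cy + cy, 0, h - 1).round().astype(int)
    return img[ys, xs]

# Usage: remap a synthetic gradient image.
img = np.linspace(0, 255, 64 * 64).reshape(64, 64)
corrected = radial_undistort(img)
```

The center pixel stays put (r² = 0 there) while pixels near the edges get pulled in or pushed out, which is the same shape of operation barrel/pincushion lens corrections already do in phone ISPs today.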
The under-screen speaker technology is interesting. I'd venture it's quite close to what Sony is doing with their flagship Bravia televisions (Acoustic Surface). I'm worried about how the sound might distort when users press on the screen, though. If it does distort on contact, it would be quite jarring if you're listening to music through the speakers while using another app, or, in the case of video, using picture-in-picture or split screen.