To better train and evaluate 3D reconstruction methods (e.g., NeRF, Gaussian Splatting) and 3D generative models, for both static (3D) and dynamic (4D) scenes, we will develop a new full-reference quality metric and a no-reference loss function. Both will be trained and validated on a new 4D quality dataset, in which subjective quality is measured under stereoscopic presentation (e.g., on a VR headset). The developed techniques will improve the 3D and temporal consistency of rendered views, reducing temporal artefacts. They will also enable automatic hyper-parameter tuning and more reliable evaluation and comparison of 3D rendering methods.
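
To illustrate how a learned no-reference loss could enter the training of a differentiable renderer, the sketch below combines a standard photometric loss with a frozen quality predictor as an auxiliary term. It is a minimal illustration, not the proposed method: the `QualityNet` architecture is a hypothetical stand-in (in the project it would be trained on the 4D subjective-quality dataset), and a learnable image tensor plays the role of the rendered view in place of an actual NeRF or Gaussian Splatting model.

```python
# Minimal sketch (PyTorch): a learned no-reference quality model used as an
# auxiliary training loss for a differentiable renderer. All names here
# (QualityNet, the weighting 0.01) are illustrative assumptions.
import torch
import torch.nn as nn

class QualityNet(nn.Module):
    """Hypothetical no-reference quality predictor: image -> scalar score.
    Untrained stand-in; serves only to show where such a model plugs in."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, img):
        return self.head(self.features(img).flatten(1))

# Stand-in for a differentiable renderer (e.g., NeRF or Gaussian Splatting):
# a learnable image tensor playing the role of the rendered view.
rendered = nn.Parameter(torch.rand(1, 3, 64, 64))
reference = torch.rand(1, 3, 64, 64)      # ground-truth view, when available

quality_model = QualityNet()
for p in quality_model.parameters():      # quality net is frozen at this stage
    p.requires_grad_(False)

optimizer = torch.optim.Adam([rendered], lr=1e-2)
photometric = nn.MSELoss()

for step in range(100):
    optimizer.zero_grad()
    # Photometric loss against the reference view, plus the learned
    # no-reference quality term (negated: higher predicted quality is better).
    loss = photometric(rendered, reference) - 0.01 * quality_model(rendered).mean()
    loss.backward()
    optimizer.step()
```

Because the quality term needs no reference image, the same construction would also apply to novel views for which no ground truth exists, which is where a no-reference loss adds value over purely photometric objectives.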