Before this change, a skeleton that was not updated every frame would
result in a difference of 2+ between last_change and frame index every
frame, which would disable the buffer rotation and set motion vectors to
zero. This results in significant visual artifacts for FSR2 that are
especially prominent on the characters that move together with the view
such as the main character in third person mode.
This is a significant problem for high refresh rate displays: at 120 Hz,
we are effectively guaranteed to skip skeleton updates every other frame
with skeleton update happening during physics processing, and the lack
of physics interpolation for skeletons. This happens by default in TPS
demo when FSR2 is enabled.
In other places where motion vectors are disabled, such as multi-mesh
and mesh rendering (where previous transform is updated), the logic
effectively allows for a single-frame gap in updates, because it
compares the frame where the update happened (which is the current frame
if updates are consistent) with the current frame, so the latency of 0
means "update just happened", but both multi-mesh and mesh transform
updates permit a latency of 1 as well.
Here, however, last_change is updated *after* the frame processing has
concluded, so a zero-latency update has a distance of 1. Allowing a
distance of 2 (latency 1) reduces the severity of the problem and aligns
the logic with transform updates.
Note that the problem will still happen when refresh rate is noticeably
higher than physics rate times 2. For example, it still happens at 240
Hz. However, a longer latency allowance is inconsistent with other
transforms and could lead to issues, so ideally long term physical
interpolation of skeleton transforms would completely solve this.
* Servers now use WorkerThreadPool for background computation.
* This helps keep the number of threads used fixed at all times.
* It also ensures everything works on HTML5 with threads.
* And makes it easier to support disabling threads for also HTML5.
CommandQueueMT now syncs with the servers via the WorkerThreadPool
yielding mechanism, which makes its classic main sync semaphore
superfluous.
Also, some warnings about calls that kill performance when using
threaded rendering are removed because there's a mechanism that
warns about that in a more general fashion.
Co-authored-by: Pedro J. Estébanez <pedrojrulez@gmail.com>
Selecting "Generate AABB" on a 3D particle node in the editor would not work
and printed an error about incorrect buffer size if the particle shader used
one or more of the USERDATA build-ins.
- Supporting custom AABB on the MultiMesh resource itself allows us to prevent costly runtime AABB recalculations.
- Should also help improve CPU Particle performance.