Applying overlay materials into multi-surface meshes currently
requires adding a next pass material to all the surfaces, which
might be cumbersome when the material is to be applied to a range
of different geometries. This also makes it not trivial to use
AnimationPlayer to control the material in case of visual effects.
The material_override property is not an option as it works
replacing the active material for the surfaces, not adding a new pass.
This commit adds the material_overlay property to GeometryInstance
(and therefore MeshInstance), having the same reach as
material_override (that is, all surfaces) but adding a new material
pass on top of the active materials, instead of replacing them.
Implemented in rasterizer of both GLES2 and GLES3.
Async. compilation via ubershader is currently available in the scene and particles shaders only.
Bonus:
- Use `#if defined()` syntax for not true conditionals, so they don't unnecessarily take a bit in the version flagset.
- Remove unused `ENABLE_CLIP_ALPHA` from scene shader.
- Remove unused `PARTICLES_COPY` from the particles shader.
- Remove unused uniform related code.
- Shader language/compiler: use ordered hash maps for deterministic code generation (needed for caching).
For octahedral compressed normals/tangents, we use vec4 in the shader
regardless of whether a normal/tangent does/doesn't exist
For the case where we only have a normal vector, we need to specify that
there are only two components being used when calling glVertexAttrib
Before we would always specify that there were 4 components, and used
offsets to determine where in the vertex buffer to read data from but
this doesn't work on all platforms
Sets `AlignOperands` to `DontAlign`.
`clang-format` developers seem to mostly care about space-based indentation and
every other version of clang-format breaks the bad mismatch of tabs and spaces
that it seems to use for operand alignment. So it's better without, so that it
respects our two-tabs `ContinuationIndentWidth`.
Immediate meshes do not have geometry of type Surface so we check
to see the mesh isn't immediate before trying to cast to surface
to check for octahedral compression
This provides more realistic lighting with a very small performance cost.
The option is available in both GLES3 and GLES2, and can be enabled in
the Project Settings. This goes well with the ACES Fitted tonemapping mode
that was recently added.
When enabled, this also makes upgrading Godot 3.x projects to Godot 4.0 easier,
since lighting in 3.x will better match how it'll look in Godot 4.0.
Change the existing DEV_ASSERT function to be switched on and off by the DEV_ENABLED define. DEV_ASSERT breaks into the debugger as soon as hit.
Add error macros DEV_CHECK and DEV_CHECK_ONCE to add an alternative check that ERR_PRINT when a condition fails, again only enabled in DEV_ENABLED builds.
Update mesh_surface_get_format_stride and
mesh_surface_make_offsets_from_format to return an array of offsets and
an array of strides in order to support vertex stream splitting
Update _get_array_from_surface to also support vertex stream splitting
Add a condition on split stream usage to ensure it does not get used on
dynamic meshes
Handle case when Tangent is compressed but Normal is not compressed
Make stream splitting option require a restart in the settings
Update SoftBody and Sprite3D to support and use strides and offsets
returned by updated visual_server functions
Update Sprite3D to use the dynamic mesh flag
This is only available on the GLES3 backend.
This can be useful for advanced shaders, but it should generally
not be enabled otherwise as full precision has a performance cost.
For general-purpose rendering, the built-in debanding filter should
be used to reduce banding instead.
This backports the high quality glow mode from the `master` branch.
Previously, during downsample, every second row was ignored.
Now, when high-quality is used, we sample two rows at once to ensure
that no pixel is missed. It is slower, but looks much better and has
a much high stability while moving.
High quality also takes an additional horizontal sample the width of the
horizontal blur matches the height of the vertical blur.
With the octahedral compression, we had attributes of a size of 2 bytes
which potentially caused performance regressions on iOS/Mac
Now add padding to the normal/tangent buffer
For octahedral, normal will always be oct32 encoded
UNLESS tangent exists and is also compressed
then both will be oct16 encoded and packed into a vec4<GL_BYTE>
attribute
Initial octahedral compression incorrectly wrote tangent to the buffer
using an offset of 3 rather than 4, losing the sign of the tangent
vector needed for things like tangent space for texturing mapping
GLES3 renderer used remove_custom_define rather than set_conditional to
change back to the default conditional state the scene shader should be
in
Implement Octahedral Compression for normal/tangent vectors
*Oct32 for uncompressed vectors
*Oct16 for compressed vectors
Reduces vertex size for each attribute by
*Uncompressed: 12 bytes, vec4<float32> -> vec2<unorm16>
*Compressed: 2 bytes, vec4<unorm8> -> vec2<unorm8>
Binormal sign is encoded in the y coordinate of the encoded tangent
Added conversion functions to go from octahedral mapping to cartesian
for normal and tangent vectors
sprite_3d and soft_body meshes write to their vertex buffer memory
directly and need to convert their normals and tangents to the new oct
format before writing
Created a new mesh flag to specify whether a mesh is using octahedral
compression or not
Updated documentation to discuss new flag/defaults
Created shader flags to specify whether octahedral or cartesian vectors
are being used
Updated importers to use octahedral representation as the default format
for importing meshes
Updated ShaderGLES2 to support 64 bit version codes as we hit the limit
of the 32-bit integer that was previously used as a bitset to store
enabled/disabled flags
Implemented splitting of vertex positions and attributes in the vertex
buffer
Positions are sequential at the start of the buffer, followed by the
additional attributes which are interleaved
Made a project setting which enables/disabled the buffer formatting
throughout the project
Implemented in both GLES2 and GLES3
This improves performance particularly on tile-based GPUs as well as
cache performance for something like shadow mapping which only needs
position data
Updated Docs and Project Setting
This is an older, easier to implement variant of CAS as a pure
fragment shader. It doesn't support upscaling, but we won't make
use of it (at least for now).
The sharpening intensity can be adjusted on a per-Viewport basis.
For the root viewport, it can be adjusted in the Project Settings.
Since `textureLodOffset()` isn't available in GLES2, there is no
way to support contrast-adaptive sharpening in GLES2.
All my earlier test cases for software skinning had the polys parent transform to be identity. This works fine until you had cases where the user had moved the transform of the parent nodes of skinned polys.
This PR fixes this situation by taking into account the final (concatenated) transform of the polys RELATIVE to the skeleton base transform. It does this by applying the inverse skeleton base transform to the poly final transform.
We've been using standard C library functions `memcpy`/`memset` for these since
2016 with 67f65f6639.
There was still the possibility for third-party platform ports to override the
definitions with a custom header, but this doesn't seem useful anymore.
Backport of #48239.
The final_modulate was incorrectly being set in the uniform on light passes in GLES3 in situations where color was baked in the vertices. This was already correct in GLES2. This PR makes prevents setting final_modulate in this situation.
When users create an invalid shader, the shader->valid flag is set to false. Batching previously assumes that shaders are valid, and this can result in primitives with invalid shader being joined, causing visual errors.
This PR prevents joining items that have invalid shaders.
Allows users to override default API usage, in order to get best performance on different platforms.
Also changes the default legacy flags to use STREAM rather than DYNAMIC.
In rare cases default batches could occur which were containing commands that were not owned by the first item referenced by the joined item. This had assumed to be the case, and would read the wrong command, or crash.
Instead for safety in this PR we now store a pointer to the parent item in default batches, and use this to determine the correct command list instead of assuming.
Trying to use the old `hardware_transform` flag to combine the new large_fvf has lead to several bugs. So here the logic is broken out into 2 separate components, single item and large_fvf.
The old `hardware_transform` name also no longer makes sense, as there are now 3 transform paths:
Software (CPU)
Hardware (uniform)
Hardware (attribute)
- Fix objects with no material being considered as fully transparent by the lightmapper.
- Added "environment_min_light" property: gives artistic control over the shadow color.
- Fixed "Custom Color" environment mode, it was ignored before.
- Added "interior" property to BakedLightmapData: controls whether dynamic capture objects receive environment light or not.
- Automatically update dynamic capture objects when the capture data changes (also works for "energy" which used to require object movement to trigger the update).
- Added "use_in_baked_light" property to GridMap: controls whether the GridMap will be included in BakedLightmap bakes.
- Set "flush zero" and "denormal zero" mode for SSE2 instructions in the Embree raycaster. According to Embree docs it should give a performance improvement.
As part of the improvements to batch more cases, batching can store final_modulate as an attribute in the vertex format rather than sending as a uniform. This allows draw calls with different final_modulate to be batched together.
However custom shader code was reading from only the final_modulate uniform, and not the attribute when it was in use. This was leading to visual errors.
This is tricky to solve, because we cannot use the same name for the attribute in the vertex and fragment shaders, because one is an attribute and one a varying, whereas a uniform is accessible anywhere. To get around this, a macro is used which can translate to the most appropriate variable depending on whether uniform or attribute or varying is required.
Although batching supported both ninepatch modes (fixed and scaling) when using ninepatch stretch mode, the ninepatch tiling modes (in GLES3) could only run through the shader.
The shader only supported one of the ninepatch modes. This PR uses the hack method of #if defined in the shader to prevent the use of a conditional. The define is set at startup according to the project setting.
GLES3 changes:
This commit makes it possible to disable 3D directional lights by using
the light's cull mask. It also automatically disables directionals when
the object has baked lighting and the light is set to "bake all".
GLES2 changes:
Added a check for the light cull mask, since it was previously ignored.
The rendering/quality/2d section of project settings is becoming considerably expanded in 3.2.4, and arguably was not the correct place for settings that were not really to do with quality.
3.2.4 is the last sensible opportunity we will have to move these settings, as the only existing one likely to break compatibility in a small way is `pixel_snap`, and given that the whole snapping area is being overhauled we can draw attention to the fact it has changed in the release notes.
Class reference is also updated and slightly improved.
`pixel_snap` is renamed to `gpu_pixel_snap` in the project settings and code to help differentiate from CPU side transform snapping.
Move definition of rendering/quality/filters/anisotropic_filter_level to
servers/visual_server.cpp, since both GLES2 and GLES3 now use it
rasterizer_storage_gles3.cpp: Remove a spurious variable write (the
value gets overwritten soon after)
- Fix Embree runtime when using MinGW (patch by @RandomShaper).
- Fix baking of lightmaps on GridMaps.
- Fix some GLSL errors.
- Fix overflow in the number of shader variants (GLES2).
Completely re-write the lightmap generation code:
- Follow the general lightmapper code structure from 4.0.
- Use proper path tracing to compute the global illumination.
- Use atlassing to merge all lightmaps into a single texture (done by @RandomShaper)
- Use OpenImageDenoiser to improve the generated lightmaps.
- Take into account alpha transparency in material textures.
- Allow baking environment lighting.
- Add bicubic lightmap filtering.
There is some minor compatibility breakage in some properties and methods
in BakedLightmap, but lightmaps generated in previous engine versions
should work fine out of the box.
The scene importer has been changed to generate `.unwrap_cache` files
next to the imported scene files. These files *SHOULD* be added to any
version control system as they guarantee there won't be differences when
re-importing the scene from other OSes or engine versions.
This work started as a Google Summer of Code project; Was later funded by IMVU for a good amount of progress;
Was then finished and polished by me on my free time.
Co-authored-by: Pedro J. Estébanez <pedrojrulez@gmail.com>
Happy new year to the wonderful Godot community!
2020 has been a tough year for most of us personally, but a good year for
Godot development nonetheless with a huge amount of work done towards Godot
4.0 and great improvements backported to the long-lived 3.2 branch.
We've had close to 400 contributors to engine code this year, authoring near
7,000 commit! (And that's only for the `master` branch and for the engine code,
there's a lot more when counting docs, demos and other first-party repos.)
Here's to a great year 2021 for all Godot users 🎆
(cherry picked from commit b5334d14f7)
These were only put in for the betas, in order to test hypotheses for stalling on Macs. It seems that most of the problems in the Mac editor have been solved by fixing the excessive redraw_requests.
As a result no one has reported any results from these options, but in future we will be able to refer users to try the beta versions, so there is no need to include them in the stable release. Indeed they are only likely to cause confusion.
For fixing a previous issue state.canvas_texscreen_used was reset to false at the start of each render_joined_item. This was causing a later shader that used SCREEN_TEXTURE to force recapturing the back buffer immediately prior to use, which we don't want.
This PR preserves the state across joined items, and also prevents joining of items that copy the back buffer as this may be problematic.
It turns out that the original issue that needed the line is now fixed, and the later issue is also fixed by removing it.
While adding more debug checks to legacy renderer, I closed 2 types of vulnerabilities:
* TYPE_PRIMITIVE would previously read from uninitialized data if only specifying a single color
* Other legacy draw operations would fail in debug AFTER accessing out of bounds memory rather than before
Many calls to glBufferSubData are wrapped in a safe version which checks for out of bounds and exits the draw function if this is detected.
Large FVF allows batching of many custom shaders, but should not join items which have shaders that utilize BUILTINs which would change for each item, because these will not be sent individually, and all joined items would wrongly use the values from the first joined item.
This adds support for custom shaders for polys, and properly handles modulate in the case of large FVF and modulate FVF.
It also fixes poly vertex colors not being sent to OpenGL.
As a result of the GLES specifications being vague about best practice for how buffers should be used dynamically, different GPUs / platforms appear to have different preferences.
Mac in particular seems to have a number of problems in this area, and none of the rendering team uses Macs. So far we have relied on guesswork to choose the best usage, but in an attempt to pin this down, this PR begins to introduce manual selection of options for users to test their configurations.
In small batches using hardware transform, vertices would be drawn in incorrect positions due to the item transform being applied twice - once in the transform uniform, and once from the transform passed as a vertex attribute.
This PR alters the shader to ignore uniform transforms when using large FVF.
Due to my less than eagle-like view over these functions I had assumed they were passing in a single buffer input for the changes to make buffer uploading more efficient. They aren't, which is less than ideal.
So these particular changes should be reverted. When I have some more time I'll see whether the API for these calls can be changed, because as is the multiple glSubBufferData calls could be causing stalls on some hardware.
It can be enabled in the Project Settings
(`rendering/quality/filters/use_debanding`). It's disabled
by default as it has a small performance impact and can make
PNG screenshots much larger (due to how dithering works).
As a result, it should be enabled only when banding is noticeable enough.
Since debanding requires a HDR viewport to work, it's only supported
in the GLES3 backend.