Adds a FramebufferCache singletion that operates the same way as UniformSetCache.
Allows creating framebuffers on the fly (and keep them cached if re-requested) such as:
```C++
RID fb = FramebufferCache::get_singleton()->get_cache(texture1,texture2);
```
This is not enabled by default in the core version for performance reasons,
as Vector/CowData are used in critical code paths where not zero'ing memory
which is going to be set later on can be important.
But for bindings / the scripting API, we make zero the new items by default
(which already happened for built types like Vector3, etc., but not for
trivial types like int, float).
Fixes#43033.
Co-authored-by: David Hoppenbrouwers <david@salt-inc.org>
Implement built-in classes Vector4, Vector4i and Projection.
* Two versions of Vector4 (float and integer).
* A Projection class, which is a 4x4 matrix specialized in projection types.
These types have been requested for a long time, but given they were very corner case they were not added before.
Because in Godot 4, reimplementing parts of the rendering engine is now possible, access to these types (heavily used by the rendering code) becomes a necessity.
**Q**: Why Projection and not Matrix4?
**A**: Godot does not use Matrix2, Matrix3, Matrix4x3, etc. naming convention because, within the engine, these types always have a *purpose*. As such, Godot names them: Transform2D, Transform3D or Basis. In this case, this 4x4 matrix is _always_ used as a _Projection_, hence the naming.
This PR implements a worked thread pool. It uses a fixed amount of threads in a pool and allows scheduling tasks
that can be run on threads (and then waited for). It satisfies the following use cases:
* HTML5 thread count is fixed (and similar restrictions are known in consoles) so we need to reuse threads.
* Thread spawning is slow in general, so reusing threads is faster anyway.
* This implementation supports recursive waiting for tasks, making it less prone to deadlocks if threads from the pool also run tasks.
After this is approved and merged, subsequent PRs will be needed to replace the ThreadWorkPool usage by this class.
Clean up and do fixes to hash functions and newly introduced murmur3 hashes in #61934
* Clean up usage of murmur3
* Fixed usages of binary murmur3 on floats (this is invalid)
* Changed DJB2 to use xor (which seems to be better)
* Map is unnecessary and inefficient in almost every case.
* Replaced by the new HashMap.
* Renamed Map to RBMap and Set to RBSet for cases that still make sense
(order matters) but use is discouraged.
There were very few cases where replacing by HashMap was undesired because
keeping the key order was intended.
I tried to keep those (as RBMap) as much as possible, but might have missed
some. Review appreciated!
Adds a new, cleaned up, HashMap implementation.
* Uses Robin Hood Hashing (https://en.wikipedia.org/wiki/Hash_table#Robin_Hood_hashing).
* Keeps elements in a double linked list for simpler, ordered, iteration.
* Allows keeping iterators for later use in removal (Unlike Map<>, it does not do much
for performance vs keeping the key, but helps replace old code).
* Uses a more modern C++ iterator API, deprecates the old one.
* Supports custom allocator (in case there is a wish to use a paged one).
This class aims to unify all the associative template usage and replace it by this one:
* Map<> (whereas key order does not matter, which is 99% of cases)
* HashMap<>
* OrderedHashMap<>
* OAHashMap<>
* Changed syntax usage for RD::Uniform to create faster with a single RID
* Converted render pass setup to use this in clustered renderer to test.
This is the first step into creating a proper uniform set cache system to simplify large parts of the codebase.
The same is done for `Vector` (and thus `Packed*Array`).
`begin` and `end` can now take any value and will be clamped to
`[-size(), size()]`. Negative values are a shorthand for indexing the array
from the last element upward.
`end` is given a default `INT_MAX` value (which will be clamped to `size()`)
so that the `end` parameter can be omitted to go from `begin` to the max size
of the array.
This makes `slice` works similarly to numpy's and JavaScript's.
Each file in Godot has had multiple contributors who co-authored it over the
years, and the information of who was the original person to create that file
is not very relevant, especially when used so inconsistently.
`git blame` is a much better way to know who initially authored or later
modified a given chunk of code, and most IDEs now have good integration to
show this information.
We prefer to prevent using chained assignment (`T a = b = c = T();`) as this
can lead to confusing code and subtle bugs.
According to https://en.wikipedia.org/wiki/Assignment_operator_(C%2B%2B), C++
allows any arbitrary return type, so this is standard compliant.
This could be re-assessed if/when we have an actual need for a behavior more
akin to that of the C++ STL, for now this PR simply changes a handful of
cases which were inconsistent with the rest of the codebase (`void` return
type was already the most common case prior to this commit).
Some platforms (*cough* web *cough*) have hard limits on the number of
threads that can be spawned.
Currently, ThreadPoolWork (mostly used in rendering/physics servers)
will spawn as many threads as CPUs available causing exception on
machines with high CPU count.
This commit adds a new overridable method to OS that returns the default
thread pool size (still the CPU count by default), and overrides it for
the JavaScript platform so it always allocate only one thread.
We can likely improve the whole ThreadPoolWork in the future to always
allocate X amount of threads, and assign jobs to them on the fly, but
that will require some more architectural changes.
`#pragma once` was used in a few files, yet we settled on using
traditional include guards instead.
The PooledList template comment was also moved to allow editors
such as Visual Studio Code to display the comment when hovering
PooledList.
`app.h` was renamed to `app_uwp.h` to be less generic for the
include guard.
On latest (11.1 as of this commit) GCC, the following warning is
continuously issued during build:
warning: placement new constructing an object of type
'SafeNumeric<unsigned int>' and size '4' in a region of type
'uint32_t*' {aka 'unsigned int*'} and size '0' [-Wplacement-new=]
This happens because on 98ceb60eb4 the new operator override used
was dropped and replaced with standard placement new. GCC sees the
subtraction from the pointer and complains as it thinks that the
SafeNumeric is placed outside an allocation, not knowing that the
address requested is already inside one.
After suggestions, the false positive is silenced, with no other
changes.
The constructor was expecting a mutable pointer, while the passed pointer was immutable ( returns a const pointer), so the compilation was failing when iterating a constant .
It uses the (`const T &&... p_args`) forward reference, to avoid copying the
memory in case it's an rvalue, or pass a reference in case it's an lvalue.
This is an example:
```c++
PagedAllocator<btShapeBox> box_allocator;
btShapeBox* box = box_allocator.alloc( btVector3(1.0, 1.0, 1.0) );
```
With this commit the macro `memnew_placement` uses the standard memory
placement syntax: `new (mem) TheClass()`, and removes the outdated and
not used syntax:
```
_ALWAYS_INLINE_ void *operator new(size_t p_size, void *p_pointer, size_t check, const char *p_description) {
```
Thanks to this change, the function `memnew_placement` call is compatible with
any class, and can also initialize classes with non-empty constructor:
```
// This is valid, like before.
memnew_placement(mem, Variant);
// This works too:
memnew_placement(mem, Variant(123));
```
Found via `codespell -q 3 -S ./thirdparty,*.po,./DONORS.md -L ackward,ang,ans,ba,beng,cas,childs,childrens,dof,doubleclick,fave,findn,hist,inout,leapyear,lod,nd,numer,ois,ony,paket,seeked,sinc,switchs,te,uint`
The `Math_INF` and `Math_NAN` defines were just aliases for those
constants, so we might as well use them directly.
Some portions of the code were already using `INFINITY` directly.
This PR implements range iterators in the base containers (Vector, Map, List, Pair Set).
Given several of these data structures will be replaced by more efficient versions, having a common iterator API will make this simpler.
Iterating can be done as follows (examples):
```C++
//Vector<String>
for(const String& I: vector) {
}
//List<String>
for(const String& I: list) {
}
//Map<String,int>
for(const KeyValue<String,int>&I : map) {
print_line("key: "+I.key+" value: "+itos(I.value));
}
//if intending to write the elements, reference can be used
//Map<String,int>
for(KeyValue<String,int>& I: map) {
I.value = 25;
//this will fail because key is always const
//I.key = "hello"
}
```
The containers are (for now) not STL compatible, since this would mean changing how they work internally (STL uses a special head/tail allocation for end(), while Godot Map/Set/List do not).
The idea is to change the Godot versions to be more compatible with STL, but this will happen after conversion to new iterators have taken place.
Vector handles this silently by returning -1, and we should do the same here.
Otherwise we get errors when calling `find()` on e.g. a LocalVector of size 0,
while `find()` is expected to always work (if the parameters are invalid then
it doesn't find anything, so -1).
Fixup to #49925.
* Ability to allocate empty objects in RID_Owner, so RID_PtrOwner is not needed in most cases.
* Improves cache usage, as objects are now allocated together
* Should improve performance in 2D rendering
This commit adds the following properties to GeometryInstance3D: `visibility_range_begin`,
`visibility_range_begin_margin`, `visibility_range_end`, `visibility_range_end_margin`.
Together they define a range in which the GeometryInstance3D will be visible from the camera,
taking hysteresis into account for state changes. A begin or end value of 0 will be ignored,
so the visibility range can be open-ended in both directions.
This commit also adds the `visibility_parent` property to 'Node3D'.
Which defines the visibility parents of the node and its subtree (until
another parent is defined).
Visual instances with a visibility parent will only be visible when the parent, and all of its
ancestors recursively, are hidden because they are closer to the camera than their respective
`visibility_range_begin` thresholds.
Combining visibility ranges and visibility parents users can set-up a quick HLOD system
that shows high detail meshes when close (i.e buildings, trees) and merged low detail meshes
for far away groups (i.e. cities, woods).
* RingBuffer had no reason to be in this context
* A single buffer is used that can grow as much as the game needs.
This should make thread loading entirely reliable.
The code is based on the current version of thirdparty/vhacd and modified to use Godot's types and code style.
Additional changes:
- extended PagedAllocator to allow leaked objects
- applied patch from https://github.com/bulletphysics/bullet3/pull/3037
Port lawnjelly's dynamic BVH implementation from 3.x to be used in
both 2D and 3D broadphases.
Removed alternative broadphase implementations which are not meant to be
used anymore since they are much slower.
Includes changes in Rect2, Vector2, Vector3 that help with the template
implementation of the dynamic BVH by uniformizing the interface between
2D and 3D math.
Co-authored-by: lawnjelly <lawnjelly@gmail.com>
We've been using standard C library functions `memcpy`/`memset` for these since
2016 with 67f65f6639.
There was still the possibility for third-party platform ports to override the
definitions with a custom header, but this doesn't seem useful anymore.
- For now everything imports multithreaded by default (should work I guess, let's test).
- Controllable per importer
Early test benchmark. 64 large textures (importing as lossless, _not_ as vram) on a mobile i7, 12 threads:
Importing goes down from 46 to 7 seconds.
For VRAM I will change the logic to use a compressing thread in a subsequent PR, as well as implementing Betsy.