Adds support for LTO on macOS and Android. We don't have much experience
with LTO on these platforms so for now we keep it disabled by default
even when `production=yes` is set.
Similarly for iOS where we ship object files for the user to link in
Xcode so LTO makes builds extremely slow to link.
`production=yes` defaults to full LTO.
ThinLTO is much faster for LLVM-based compilers but seems to produce
bigger binaries (at least for the Web platform).
This will allow adding developer checks which will be fully compiled out in
user builds, unlike `DEBUG_ENABLED` which is included in debug tempates and
the editor builds.
This define is not used yet, but we'll soon add code that uses it, and change
some existing `DEBUG_ENABLED` checks to be performed only in dev builds.
Related to godotengine/godot-proposals#3371.
All Android devices that support Vulkan support 64-bit ARM.
This also removes NEON opt-out code for ARMv7 as pretty much all
ARMv7 devices also support NEON.
We found that this flag causes this error on PR #48812 which does not add any
fancy inline assembly:
```
/tmp/tile_set-ce236a.s: Assembler messages:
/tmp/tile_set-ce236a.s:34676: Error: selected processor does not support `bfc x0,#32,#32'
clang++: error: assembler command failed with exit code 1 (use -v to see invocation)
```
That flag is mentioned in various errors related to assembler failures on
arm64v8 with Clang from the Android NDK.
It was added in Godot in #6958 when migrating from GCC to Clang, and is indeed
referenced in the NDK's Clang migration guide:
https://android.googlesource.com/platform/ndk/+/master/docs/ClangMigration.md
> Especially for ARM and ARM64, Clang is much stricter about assembler rules
> than GCC/GAS. Use `-fno-integrated-as` if Clang reports errors in inline
> assembly or assembly files that you don't wish to modernize.
We don't get those errors nowadays so it seems the flag is no longer needed.
This changes the types of a big number of variables.
General rules:
- Using `uint64_t` in general. We also considered `int64_t` but eventually
settled on keeping it unsigned, which is also closer to what one would expect
with `size_t`/`off_t`.
- We only keep `int64_t` for `seek_end` (takes a negative offset from the end)
and for the `Variant` bindings, since `Variant::INT` is `int64_t`. This means
we only need to guard against passing negative values in `core_bind.cpp`.
- Using `uint32_t` integers for concepts not needing such a huge range, like
pages, blocks, etc.
In addition:
- Improve usage of integer types in some related places; namely, `DirAccess`,
core binds.
Note:
- On Windows, `_ftelli64` reports invalid values when using 32-bit MinGW with
version < 8.0. This was an upstream bug fixed in 8.0. It breaks support for
big files on 32-bit Windows builds made with that toolchain. We might add a
workaround.
Fixes#44363.
Fixesgodotengine/godot-proposals#400.
Co-authored-by: Rémi Verschelde <rverschelde@gmail.com>
This is what GitHub Actions now provide and they removed the previous 21.3.6528147.
A bit annoying to have our hand forced this way but it's still 21.x so should be good
to upgrade.
Until https://github.com/psf/black/pull/1328 makes it in a stable release,
we have to use the latest from Git.
Apply new style fixes done by latest black.
Its last use was removed in Godot 3.0, so it no longer makes sense to define.
Also removed `D3D_DEBUG_INFO` for Windows as it's likely a left over from a
long time ago pre-opensourcing when Godot had some form of Direct3D 9 support?
Configured for a max line length of 120 characters.
psf/black is very opinionated and purposely doesn't leave much room for
configuration. The output is mostly OK so that should be fine for us,
but some things worth noting:
- Manually wrapped strings will be reflowed, so by using a line length
of 120 for the sake of preserving readability for our long command
calls, it also means that some manually wrapped strings are back on
the same line and should be manually merged again.
- Code generators using string concatenation extensively look awful,
since black puts each operand on a single line. We need to refactor
these generators to use more pythonic string formatting, for which
many options are available (`%`, `format` or f-strings).
- CI checks and a pre-commit hook will be added to ensure that future
buildsystem changes are well-formatted.
As of 3.1 and later, we have too many thirdparty C++ dependencies
and some internal uses of `new` and `delete` too for it to make
sense to build without the STL on Android.
The option has been broken since 3.0, and the "System STL" that we
relied on for basic support of `new` and `delete` is likely to be
dropped from the NDK:
https://android.googlesource.com/platform/ndk/+/ndk-release-r20/docs/BuildSystemMaintainers.md#System-STL
It's the recommended way to set those, and is more portable
(automatically prepends -D for GCC/Clang and /D for MSVC).
We still use CPPFLAGS for some pre-processor flags which are not
defines.
Those were disable to keep size small, and on Android avoid the dependency on the STL,
but for tools build (editor) this is not really a concern.
Note: as of today it's not possible to build tools=yes for those platforms, but this
change is one of the necessary steps to enable it.
Fixes#25262.
Include paths are processed from left to right, so we use Prepend to
ensure that paths to bundled thirdparty files will have precedence over
system paths (e.g. `/usr/include` should have lowest priority).
Many contributors (me included) did not fully understand what CCFLAGS,
CXXFLAGS and CPPFLAGS refer to exactly, and were thus not using them
in the way they are intended to be.
As per the SCons manual: https://www.scons.org/doc/HTML/scons-user/apa.html
- CCFLAGS: General options that are passed to the C and C++ compilers.
- CFLAGS: General options that are passed to the C compiler (C only;
not C++).
- CXXFLAGS: General options that are passed to the C++ compiler. By
default, this includes the value of $CCFLAGS, so that setting
$CCFLAGS affects both C and C++ compilation.
- CPPFLAGS: User-specified C preprocessor options. These will be
included in any command that uses the C preprocessor, including not
just compilation of C and C++ source files [...], but also [...]
Fortran [...] and [...] assembly language source file[s].
TL;DR: Compiler options go to CCFLAGS, unless they must be restricted
to either C (CFLAGS) or C++ (CXXFLAGS). Preprocessor defines go to
CPPFLAGS.
Godot supports many different compilers and for production releases we
have to support 3 currently: GCC8, Clang6, and MSVC2017. These compilers
all do slightly different things with -ffast-math and it is causing
issues now. See #24841, #24540, #10758, #10070. And probably other
complaints about physics differences between release and release_debug
builds.
I've done some performance comparisons on Linux x86_64. All tests are
ran 20 times.
Bunnymark: (higher is better)
(bunnies) min max stdev average
fast-math 7332 7597 71 7432
this pr 7379 7779 108 7621 (102%)
FPBench (gdscript port http://fpbench.org/) (lower is better)
(ms)
fast-math 15441 16127 192 15764
this pr 15671 16855 326 16001 (99%)
Float_add (adding floats in a tight loop) (lower is better)
(sec)
fast-math 5.49 5.78 0.07 5.65
this pr 5.65 5.90 0.06 5.76 (98%)
Float_div (dividing floats in a tight loop) (lower is better)
(sec)
fast-math 11.70 12.36 0.18 11.99
this pr 11.92 12.32 0.12 12.12 (99%)
Float_mul (multiplying floats in a tight loop) (lower is better)
(sec)
fast-math 11.72 12.17 0.12 11.93
this pr 12.01 12.62 0.17 12.26 (97%)
I have also looked at FPS numbers for tps-demo, 3d platformer, 2d
platformer, and sponza and could not find any measurable difference.
I believe that given the issues and oft-reported (physics) glitches on
release builds I believe that the couple of percent of tight-loop
floating point performance regression is well worth it.
This fixes#24540 and fixes#24841