Commit message | Author | Age | Files | Lines
* Bump version to 14.0.3 (tag: llvmorg-14.0.3) | Tom Stellard | 2022-04-28 | 8 | -9/+9
|
* workflows: Add a test to ensure that the LLVM version is correct | Tom Stellard | 2022-04-28 | 2 | -0/+58
| | | | | | | | This should prevent mistakes like Issue#55137. Reviewed By: thieta Differential Revision: https://reviews.llvm.org/D124539
* [RISCV] Fix crash for section alignment with .option norvc (tag: llvmorg-14.0.2) | Luís Marques | 2022-04-25 | 3 | -3/+21
| | | | | | | | | | | The existing code wasn't getting the subtarget info from the fragment, so the current status of RVC would be ignored. This would cause a crash for the new test case when the target then reported it couldn't write the requested number of code alignment bytes. Differential Revision: https://reviews.llvm.org/D122236 (cherry picked from commit d09d297c5d28bd0af4dd8bf3ca66d8dbbd196d9d)
* [asan] Always skip first object from dl_iterate_phdr | Michael Forney | 2022-04-25 | 1 | -18/+12
| | | | | | | | | | | | | | | | | | | | | | | | | All platforms return the main executable as the first dl_phdr_info. FreeBSD, NetBSD, Solaris, and Linux-musl place the executable name in the dlpi_name field of this entry. It appears that only Linux-glibc uses the empty string. To make this work generically on all platforms, unconditionally skip the first object (like is currently done for FreeBSD and NetBSD). This fixes first DSO detection on Linux-musl. It also would likely fix detection on Solaris/Illumos if it were to gain PIE support (since dlpi_addr would not be NULL). Additionally, only skip the Linux VDSO on linux. Finally, use the empty string as the "seen first dl_phdr_info" marker rather than (char *)-1. If there was no other object, we would try to dereference it for a string comparison. Reviewed By: MaskRay, vitalybuka Differential Revision: https://reviews.llvm.org/D119515 (cherry picked from commit 795b07f5498c7e5783237418f34d7ea69e801f87)
* [RISCV] Don't emit fractional VIDs with negative steps | Fraser Cormack | 2022-04-25 | 2 | -8/+9
| | | | | | | | | | | | | | | | We can't shift-right negative numbers to divide them, so avoid emitting such sequences. Use negative numerators as a proxy for this situation, since the indices are always non-negative. An alternative strategy could be to add a compiler flag to emit division instructions, which would at least allow us to test the VID sequence matching itself. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123796 (cherry picked from commit 3e678cb77264907fbc2899c291ce23af308073ff)
* [RISCV] Add another test showing incorrect BUILD_VECTOR lowering | Fraser Cormack | 2022-04-25 | 1 | -0/+15
| | | | | | | | | | | | | | This test shows a (contrived) BUILD_VECTOR which is correctly identified as a sequence of ((vid * -3) / 8) + 5. However, the issue is that using shift-right for the divide is invalid as the step values are negative. This patch just adds the test: the fix is added in D123796. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123989 (cherry picked from commit 627e21048a2c040d3e353cc4f0eb8f207b6ea61c)
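To illustrate why a shift-right cannot stand in for the divide once the numerator is negative, here is a minimal Python sketch that evaluates ((vid * -3) / 8) + 5 both ways. It assumes the intended divide rounds toward zero, as a signed integer division would; the helper names are made up for this example and do not come from the patch.

```python
def trunc_div(n, d):
    """Signed division rounding toward zero (assumed semantics of the divide)."""
    q = abs(n) // abs(d)
    return q if (n >= 0) == (d >= 0) else -q


def by_divide(vid):
    # The sequence as written: ((vid * -3) / 8) + 5
    return trunc_div(vid * -3, 8) + 5


def by_shift(vid):
    # Python's >> on negative ints is an arithmetic shift, which rounds
    # toward negative infinity rather than toward zero.
    return ((vid * -3) >> 3) + 5


divided = [by_divide(i) for i in range(4)]
shifted = [by_shift(i) for i in range(4)]
print(divided)  # [5, 5, 5, 4]
print(shifted)  # [5, 4, 4, 3]
```

The two lists already disagree at index 1, so a shift-based lowering would produce a different vector than the one requested.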
* [RISCV] Fix lowering of BUILD_VECTORs as VID sequences | Fraser Cormack | 2022-04-25 | 2 | -35/+44
| | | | | | | | | | | | | | | | | | | | | This patch fixes a bug when lowering BUILD_VECTOR via VID sequences. After adding support for fractional steps in D106533, elements with zero steps may be skipped if no step has yet been computed. This allowed certain sequences to slip through the cracks, being identified as VID sequences when in fact they are not. The fix for this is to perform a second loop over the BUILD_VECTOR to validate the entire sequence once the step has been computed. This isn't the most efficient, but on balance the code is more readable and maintainable than doing back-validation during the first loop. Fixes the tests introduced in D123785. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123786 (cherry picked from commit c5cac48549ed254cd5ad5eef770ebfb22ccd9f64)
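The two-loop approach described above can be sketched in Python as follows. This is a simplified illustration, not the LLVM code: it only handles integer steps (no fractional steps), and `match_vid_sequence` is a hypothetical name.

```python
def match_vid_sequence(elts):
    """Return (step, addend) if elts[i] == addend + i * step for every i,
    otherwise None. Integer steps only; a simplification for illustration."""
    step = None
    addend = None
    # First loop: derive the step from the first adjacent pair whose values
    # differ. Leading elements with a zero step are skipped, not checked.
    for i in range(1, len(elts)):
        diff = elts[i] - elts[i - 1]
        if diff != 0:
            step = diff
            addend = elts[i] - i * step
            break
    if step is None:
        # No non-zero step found: treat the vector as a constant sequence.
        step = 0
        addend = elts[0] if elts else 0
    # Second loop: validate the whole sequence, including skipped elements.
    for i, value in enumerate(elts):
        if value != addend + i * step:
            return None
    return step, addend


print(match_vid_sequence([0, 1, 2, 3]))  # (1, 0)
print(match_vid_sequence([0, 0, 1, 2]))  # None
```

Without the second loop, `[0, 0, 1, 2]` would be accepted with step 1 even though its leading elements do not fit that step.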
* [RISCV] Add tests showing incorrect BUILD_VECTOR lowering | Fraser Cormack | 2022-04-25 | 1 | -0/+21
| | | | | | | | | | | | | | | | These tests both use vector constants misidentified as VID sequences. Because the initial run of elements has a zero step, the elements are skipped until such a step can be identified. The bug is that the skipped elements are never validated, even though the computed step is incompatible across the entire sequence. A fix will follow in a subsequent patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123785 (cherry picked from commit 00537946aa29928894ba140687de1b6f9494e44d)
* [llvm-mt] Add support /notify_update | Alex Brachet | 2022-04-25 | 3 | -2/+41
| | | | | | | | | | | | | | `/notify_update` is an undocumented feature used by CMake. From their usage, it looks like this feature just changes `mt`'s exit code if the output file was changed. See https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/cmcmd.cxx#L2300. This is also consistent with some testing I have done of the mt.exe shipped with Visual Studio. See also the comment at https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/cmcmd.cxx#L2440. There might be a more performant way to implement this by first calling `llvm::sys::fs::file_size()` and, if it is the same as the new output's size, using `llvm::WritableMemoryBuffer`, falling back to `llvm::FileOutputBuffer` otherwise, but these don't inherit from a common ancestor so any implementation doing this would be really ugly. Fixes https://github.com/llvm/llvm-project/issues/54329 Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D121438 (cherry picked from commit e970d2823cf2a666cb597bf06ff8e0d0b880d361)
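As a rough illustration of the behaviour described above (the exit code depends on whether the output file changed), here is a minimal Python sketch. The function name, the `UPDATED_EXIT_CODE` value, and the choice to count a newly created file as "changed" are all assumptions made for this example; they are not taken from llvm-mt or mt.exe.

```python
import os
import tempfile

# Hypothetical value; the real tool's "updated" exit code is not given here.
UPDATED_EXIT_CODE = 1


def write_manifest(path, new_bytes, notify_update=False):
    """Write new_bytes to path and return a process-style exit code."""
    changed = True  # assumption: a file that did not exist counts as changed
    if os.path.exists(path):
        with open(path, "rb") as f:
            changed = f.read() != new_bytes
    with open(path, "wb") as f:
        f.write(new_bytes)
    # Only signal the change when the caller asked for it.
    return UPDATED_EXIT_CODE if (notify_update and changed) else 0
```

A build system could use the distinct exit code to decide whether dependent steps need to rerun.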
* [RISCV] Only try LUI+SH*ADD+ADDI for int materialization if LUI+ADDI+SH*ADD ↵ | Craig Topper | 2022-04-25 | 2 | -26/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | failed. There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the lower 12 bits aren't zero since that case should have been handled as LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from running after the earlier code handled it. The sequence would be the same length or longer so it wouldn't replace the earlier sequence, but the assert happened before that was checked. The vector holding the sequence also wasn't reset before the second check so that guaranteed the sequence would never be found to be shorter. This patch fixes this by only trying the second expansion when the earlier fails. Fixes PR54812. Reviewed By: benshi001 Differential Revision: https://reviews.llvm.org/D123406 (cherry picked from commit 70046438d02ba1ec6bc2e2fc496b610cc1068b0f)
* [ELF] --emit-relocs: fix missing STT_SECTION when the first input section is ↵ | Fangrui Song | 2022-04-25 | 2 | -17/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | synthetic addSectionSymbols suppresses the STT_SECTION symbol if the first input section is non-SHF_MERGE synthetic. This is incorrect when the first input section is synthetic while a non-synthetic input section exists: * `.bss : { *(COMMON) *(.bss) }` (abc388ed3cf0ef7e617ebe243d3b0b32d29e69a5 regressed the case because COMMON symbols precede .bss in the absence of a linker script) * Place a synthetic section in another section: `.data : { *(.got) *(.data) }` For `%t/a1` in the new test emit-relocs-synthetic.s, ld.lld produces incorrect relocations with symbol index 0. ``` 0000000000000000 <_start>: 0: 8b 05 33 00 00 00 movl 51(%rip), %eax # 0x39 <bss> 0000000000000002: R_X86_64_PC32 *ABS*+0xd 6: 8b 05 1c 00 00 00 movl 28(%rip), %eax # 0x28 <common> 0000000000000008: R_X86_64_PC32 common-0x4 c: 8b 05 06 00 00 00 movl 6(%rip), %eax # 0x18 000000000000000e: R_X86_64_GOTPCRELX *ABS*+0x4 ``` Fix the issue by checking every input section. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D122463 (cherry picked from commit 7370a489b1005e424b23bd0009af2365aef4db53)
* [libcxx] Add some missing xlocale wrapper functions for OpenBSD | Brad Smith | 2022-04-25 | 1 | -0/+20
| | | | | | | | Reviewed By: Mordante Differential Revision: https://reviews.llvm.org/D122861 (cherry picked from commit a0d40a579a6f27a1b1cdb7d68b2145e332c02c4e)
* [LV] Remove stray debug dump added in 0d2efbb8b82c. | Florian Hahn | 2022-04-25 | 1 | -1/+0
|
* [LV] Always use add to add scalar iv and (startidx + step) for ints. | Florian Hahn | 2022-04-25 | 2 | -10/+10
| | | | | | | | | | In the integer case, step will be negative and InductionOpCode will be Sub for inductions counting down. By using the InductionOpCode for integers, we would incorrectly subtract a negative value, when it should be added instead. This fixes #54427 on the 14.x branch.
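The sign error described above can be shown with a few lines of Python. This is only an arithmetic sketch: `lane_value` is a hypothetical helper, and it assumes the per-lane offset is `start_idx * step`, which is an assumption of this example rather than something stated in the commit.

```python
import operator


def lane_value(scalar_iv, start_idx, step, op):
    # Combine the scalar induction value with the lane offset using `op`.
    return op(scalar_iv, start_idx * step)


# Induction counting down from 10 by 1: the step is already negative.
correct = lane_value(10, 2, -1, operator.add)  # 10 + (2 * -1)
buggy = lane_value(10, 2, -1, operator.sub)    # 10 - (2 * -1)
print(correct, buggy)  # 8 12
```

Because the step already carries the sign, using a subtract for a counting-down integer induction moves the value in the wrong direction.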
* [LV] Add test case for PR54427. | Florian Hahn | 2022-04-25 | 1 | -0/+46
| | | | Reduced test for #54427.
* [InstCombine] canonicalize select with signbit test | Sanjay Patel | 2022-04-21 | 4 | -44/+61
| | | | | | | | This is part of solving issue #54750 - in that example we have both forms of the compare and do not recognize the equivalence. (cherry picked from commit 2c2568f39ec641aa8f1dcc011f2ce642c2d3423f)
* [x86] Fix infinite loop inside DAG combiner with lzcnt feature. | Pierre Gousseau | 2022-04-21 | 2 | -13/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | The issue affects targets supporting fast-lzcnt such as btver2. This removes extraneous zext/trunc node insertions to fix the infinite loop. This fixes Issue https://github.com/llvm/llvm-project/issues/54694 Differential Revision: https://reviews.llvm.org/D122900 Reviewed By: RKSimon, spatel, lebedev.ri (cherry picked from commit a3d5f1cf5d88dfbbed931951e07f328d5ceba510) Signed-off-by: Warren Ristow <warren.ristow@sony.com> In https://reviews.llvm.org/D122900 a new function (to exercise the infinite-loop bug) was added to llvm/test/CodeGen/X86/lzcnt-zext-cmp.ll. In applying the fix in the main branch, two previously existing functions in that test also changed behavior slightly, and in the review it was noted: The instructions generated end up being reordered in some cases but I think it is equivalent. That reordering did not happen in those pre-existing functions when applying the fix to the slightly older code-base of the llvm14 branch, and so they are suppressed here. So the updated version of the test in this commit has the additional function added to it, but it is otherwise identical to the previous llvm14 version of the test.
* [Clang][Fortify] drop inline decls when redeclared | serge-sans-paille | 2022-04-18 | 2 | -2/+38
| | | | | | | | | | | | When an inline builtin declaration is shadowed by an actual declaration, we must reference the actual declaration, even if it's not the last, following GCC behavior. This fixes #54715 Differential Revision: https://reviews.llvm.org/D123308 (cherry picked from commit 301e0d91354b853addb63a35e72e552e8059413e)
* Reland "[llvm][AArch64] Insert "bti j" after call to setjmp" | David Spickett | 2022-04-18 | 15 | -6/+284
| | | | | | | | | | | | | | | | Cherry-picked from c3b98194df5572bc9b33024b48457538a7213b4c which was originally reviewed as https://reviews.llvm.org/D121707. This reverts commit edb7ba714acba1d18a20d9f4986d2e38aee1d109. This changes BLR_BTI to take variable_ops meaning that we can accept a register or a label. The pattern still expects one argument so we'll never get more than one. Then later we can check the type of the operand to choose BL or BLR to emit. (this is what BLR_RVMARKER does but I missed this detail of it first time around) Also require NoSLSBLRMitigation which I missed in the first version.
* [DebugInfo][InstrRef] Avoid a crash from mixed variable location modes | Jeremy Morse | 2022-04-18 | 11 | -11/+158
| | | | | | | | | | | | | | | | | Variable locations now come in two modes, instruction referencing and DBG_VALUE. At -O0 we pick DBG_VALUE to allow fast construction of variable information. Unfortunately, SelectionDAG edits the optimisation level in the presence of opt-bisect-limit, meaning different passes have different views of what variable location mode we should use. That causes assertions when they're mixed. This patch plumbs through a boolean in SelectionDAG from start to instruction emission, so that we don't rely on the current optimisation level for correctness. Differential Revision: https://reviews.llvm.org/D123033 (cherry picked from commit fb6596f1ecab652b5b90cf2e395d64112504c1f8)
* Force GHashCell to be 8-byte-aligned. | Eli Friedman | 2022-04-18 | 1 | -1/+5
| | | | | | | | | | | | | | | Otherwise, with recent versions of libstdc++, clang can't tell that the atomic operations are properly aligned, and generates calls to libatomic. (Actually, because of the use of reinterpret_cast, it wasn't guaranteed to be aligned, but I think it ended up being aligned in practice.) Fixes https://github.com/llvm/llvm-project/issues/54790 , the part where LLVM failed to build. Differential Revision: https://reviews.llvm.org/D123872 (cherry picked from commit 13fc1781735a327699d9522e8e44899acf92a61a)
* [compiler-rt] Implement __clear_cache on FreeBSD/powerpc | Carlo Marcelo Arenas Belón | 2022-04-18 | 1 | -1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | dd9173420f06 (Add clear_cache implementation for ppc64. Fix buffer to meet ppc64 alignment., 2017-07-28), adds an implementation for __builtin___clear_cache on powerpc64, which was promptly amended to also be used with big endian mode in f67036b62c0c (This ppc64 implementation of clear_cache works for both big and little endian., 2017-08-02). clang will use this implementation for its builtin on FreeBSD and result in an abort() in the cases where 32-bit generation was requested (e.g. in macppc or when the big endian powerpc64 build was done with "-m32") and as reported[1] recently with pcre2, but there is no reason why the same code couldn't be used in those cases, so use instead the more generic identifier for the PowerPC architecture. While at it, update the comment to reflect that POWER8/9 have a 128 byte wide cache line and so the code could instead use 64 byte windows, but that possible optimization has been punted for now. [1] https://github.com/PhilipHazel/pcre2/issues/92 Reviewed By: jhibbits, #powerpc, MaskRay Differential Revision: https://reviews.llvm.org/D122640 (cherry picked from commit 81f5c6270cdfcdf80e6296df216b696a7a37c8b5)
* [PowerPC] Allow absolute expressions in relocations | Nemanja Ivanovic | 2022-04-18 | 8 | -45/+69
| | | | | | | | | | | | The Linux kernel build uses absolute expressions suffixed with @lo/@ha relocations. This currently doesn't work for DS/DQ form instructions and there is no reason for it not to. It also works with GAS. This patch allows this as long as the value is a multiple of 4/16 for DS/DQ form. Differential revision: https://reviews.llvm.org/D115419 (cherry picked from commit 2aaba44b5c2265f90ac9f0ae188417ef79201c82)
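The multiple-of-4/16 rule above can be sketched in Python together with the usual @lo/@ha split of an absolute value. The helper names and the `REQUIRED_MULTIPLE` table are made up for this example; the sketch assumes the check is applied to the absolute value itself, which is equivalent to checking its low 16 bits because 0x10000 is a multiple of both 4 and 16.

```python
def lo(value):
    """Low 16 bits of the value (@lo)."""
    return value & 0xFFFF


def ha(value):
    """High 16 bits, adjusted for the sign of the low half (@ha)."""
    return ((value + 0x8000) >> 16) & 0xFFFF


def sign_extend16(x):
    return x - 0x10000 if x & 0x8000 else x


# Hypothetical table: displacement granularity per instruction form.
REQUIRED_MULTIPLE = {"D": 1, "DS": 4, "DQ": 16}


def displacement_ok(value, form):
    # DS-form needs a multiple of 4 and DQ-form a multiple of 16.
    return lo(value) % REQUIRED_MULTIPLE[form] == 0


value = 0x12348000
# The two halves recombine to the original value.
assert (ha(value) << 16) + sign_extend16(lo(value)) == value
print(displacement_ok(0x12345678, "DS"))  # True
print(displacement_ok(0x12345678, "DQ"))  # False
```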
* [CMake] Update cache file for Win to ARM Linux cross toolchain builders. NFC. | Vladimir Vereschaka | 2022-04-14 | 1 | -2/+3
| | | | | | | * fixed remote test script arguments for libc++/compiler-rt libraries. * disabled shared libc++abi libraries (to let remote tests get passed). (cherry picked from commit 41f74bc7ae33d9cd9a1eaacfc29ba53a933c042f)
* [CMake] Replace `TARGET_TRIPLE` with `TOOLCHAIN_TARGET_TRIPLE` for ↵ | Vladimir Vereschaka | 2022-04-14 | 1 | -65/+65
| | | | | | | | | | | | | Win-to-Arm cross toolchain cache file. NFC. Avoid using TARGET_TRIPLE argument for the cross toolchain cmake cache file and replace it with TOOLCHAIN_TARGET_TRIPLE. Reference: https://reviews.llvm.org/D119918 Differential Revision: https://reviews.llvm.org/D121029 (cherry picked from commit d860ac5da6d71dd759d347a3c7d5e63705443694)
* [CMake] Update cache file for Win to ARM cross toolchain. NFC. | Vladimir Vereschaka | 2022-04-14 | 1 | -5/+0
| | | | | | Removed passing CMAKE_AR from the library configurations. (cherry picked from commit 19c1b084a7da9087fcdc16071b461ea33b6a68b4)
* [CMake] Use CMAKE_SYSROOT to build libs for Win to ARM cross toolchain. NFC. | Vladimir Vereschaka | 2022-04-14 | 1 | -67/+84
| | | | | | | | | | | | | | | | Provide CMAKE_SYSROOT for the libc++/libc++abi/libunwind libraries instead of specific <foo>_SYSROOT for each of them. Fixed passing some CMake arguments for the runtimes. Referenced Differentials: * https://reviews.llvm.org/D119836 * https://reviews.llvm.org/D112155 * https://reviews.llvm.org/D111672 Differential Revision: https://reviews.llvm.org/D120383 (cherry picked from commit 18fa0b15ccf610f34af1231440f89d20cb99e7a0)
* [LLD][COFF] Fix TypeServerSource matcher with more than one collision | Tobias Hieta | 2022-04-14 | 3 | -13/+15
| | | | | | | | | | | | | | | Follow-up from 98bc304e9faded44f1d8988ffa4c5d8b50c759ec - while that commit fixed when you had two PDBs colliding on the same Guid it didn't fix the case where you had more than two PDBs using the same Guid. This commit fixes that and also tests much more carefully that all the types are correct no matter the order. Reviewed By: aganea, saudi Differential Revision: https://reviews.llvm.org/D123185 (cherry picked from commit 0dfa8a019d9a64d7706eb82bdb083fd9b815e088)
* [lld][COFF] Fix TypeServerSource lookup on GUID collisions | Tobias Hieta | 2022-04-14 | 7 | -2/+2426
| | | | | | | | | | | | | | | | | | | | | | | | | | | Microsoft shipped a bunch of PDB files with broken/invalid GUIDs which led lld to use 0xFF as the key for these files in an internal cache. When multiple files have this key it will lead to collisions and confused symbol lookup. Several approaches to fixing this were considered, including making the key the path to the PDB file, but this requires some filesystem operations in order to normalize the file path. Since this only happens with malformatted PDB files and we hadn't seen this before the malformatted files were shipped with Visual Studio, we probably shouldn't optimize for this use-case. Instead we now just don't insert files with Guid == 0xFF into the cache map and warn if we get collisions so similar problems can be found in the future instead of being silent. Discussion about the root issue and the approach to this fix can be found on GitHub: https://github.com/llvm/llvm-project/issues/54487 Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D122372 (cherry picked from commit 98bc304e9faded44f1d8988ffa4c5d8b50c759ec)
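The caching policy described above (skip entries with the invalid GUID, warn on collisions) could look roughly like the following Python sketch. `TypeServerCache`, the all-0xFF byte representation of the invalid GUID, and keeping the first entry on a collision are assumptions made for illustration, not details taken from lld.

```python
import warnings

# Assumption: the broken GUID is modelled as sixteen 0xFF bytes.
INVALID_GUID = b"\xff" * 16


class TypeServerCache:
    def __init__(self):
        self.by_guid = {}

    def insert(self, guid, pdb_path):
        """Return True if pdb_path was cached under guid."""
        if guid == INVALID_GUID:
            # Never cache files with the invalid GUID: they would all collide.
            return False
        existing = self.by_guid.get(guid)
        if existing is not None and existing != pdb_path:
            # Report the collision instead of silently overwriting.
            warnings.warn(f"GUID collision between {existing} and {pdb_path}")
            return False  # assumption: the first entry wins
        self.by_guid[guid] = pdb_path
        return True
```

Files that are not cached would then have to be resolved some other way, which this sketch does not model.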
* [AArch64][LOH] Don't ignore regmasks in bundles by iterating over instrs. | Ahmed Bougacha | 2022-04-14 | 2 | -1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | The LOH pass iterates over instructions to build its custom register state machine, but it uses the top-level bundle iterator. This should be okay, because when the wrapper BUNDLE MI is built, it aggregates the register defs/uses in its instructions into MOs. However, that doesn't apply to regmasks, and accumulating regmasks across multiple instructions would be messy business. There are a couple AnalyzePhysRegInBundle (/Virt) helpers that do look at regmasks, but those don't fit in very well here. AArch64 has started to use a few bundle instructions, specifically as glorified pseudos for variant call instructions, which have regmasks. So the LOH pass ends up ignoring regmasks. Concretely, this has been wrong for a while, but, on aarch64, the most common bundle (rv_marker call) was always followed by the attached call instruction, a plain BL with a regmask. Which was properly detected by the pass. However, we recently started keeping the attached call in the bundle, so the regmask is now ignored. And the pass happily combines ADRPs, of say, x8, across the bundle, resulting in corrupt pointers later. (cherry picked from commit cfa4fe7c51870fe6b480d541938f556cf0736fa2)
* [InstCombine] try to fold low-mask of ashr to lshr | Sanjay Patel | 2022-04-14 | 2 | -4/+16
| | | | | | | | With one-use, we handle this via demanded-bits. But we need to handle extra uses to improve issue #54750. https://alive2.llvm.org/ce/z/aDYkPv (cherry picked from commit 7783db55afefd3b0d83f4d1b727b6aaa2c2286d6)
* [InstCombine] add tests for low-mask of ashr; NFC | Sanjay Patel | 2022-04-14 | 1 | -8/+72
| | | | (cherry picked from commit 141892d481fcf446c9bf1ae5fb600d797295debc)
* [LV] Handle zero cost loops in selectInterleaveCount. | Florian Hahn | 2022-04-14 | 2 | -10/+62
| | | | | | | | | | | | | | | | | | In some cases, like in the added test case, we can reach selectInterleaveCount with loops that actually have a cost of 0. Unfortunately a loop cost of 0 is also used to communicate that the cost has not been computed yet. To resolve the crash, bail out if the cost remains zero after computing it. This seems like the best option, as there are multiple code paths that return a cost of 0 to force a computation in selectInterleaveCount. Computing the cost at multiple places up front there would unnecessarily complicate the logic. Fixes #54413. (cherry picked from commit ecb4171dcbf1b433c9963fd605a74898303e850d)
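The bail-out described above can be sketched in Python. The function below is a hypothetical stand-in, not the LoopVectorize code: the `budget // loop_cost` formula, the default parameters, and returning 1 when bailing out are assumptions chosen only to show why a cost that is still zero after computation has to be handled before it is used.

```python
def select_interleave_count(loop_cost, compute_cost, budget=16, max_count=8):
    """Pick an interleave count; 0 doubles as 'cost not computed yet'."""
    if loop_cost == 0:
        # Cost was not computed up front, so compute it now.
        loop_cost = compute_cost()
        if loop_cost == 0:
            # The loop really costs nothing: bail out instead of continuing.
            return 1  # assumption: 1 means "do not interleave"
    # Hypothetical formula: cheaper loops get a higher interleave count.
    return max(1, min(max_count, budget // loop_cost))


print(select_interleave_count(0, lambda: 0))  # 1
print(select_interleave_count(0, lambda: 4))  # 4
```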
* [RISCV][NFC] Add missing lit.local.cfg in test/CodeGen/MIR/RISCV/ | Kito Cheng | 2022-04-14 | 1 | -0/+2
|
* [RISCV] Fixing stack offset for RVV object with vararg in stack. | Kito Cheng | 2022-04-14 | 2 | -6/+21
| | | | | | | | | | | | | | | We found that LLVM generates a wrong stack offset for RVV objects when the stack has variable arguments, because we didn't count the vararg part when calculating RVV stack objects. Also update the stack layout diagram to include the vararg part. Stack layout ref: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941 Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123180
* [RISCV] Pre-commit for fixing stack offset for RVV object | Kito Cheng | 2022-04-14 | 1 | -0/+220
| | | | | | Reviewed By: rogfer01, frasercrmck Differential Revision: https://reviews.llvm.org/D123179
* [RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file | Kito Cheng | 2022-04-14 | 6 | -0/+225
| | | | | | | | | | | | | RISCVMachineFunctionInfo has some fields, like VarArgsFrameIndex and VarArgsSaveSize, that are calculated at the ISel lowering stage. That information is not contained in MIR files, so test cases that rely on those fields can't be reproduced correctly from MIR dump files. This patch adds MIR read/write support for those fields. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D123178
* [libc++] Make __dir_stream visibility declaration consistent | Dimitry Andric | 2022-04-14 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | The class `__dir_stream` is currently declared in two places: as a top-level forward declaration in `directory_iterator.h`, and as a friend declaration in class `directory_entry`, in `directory_entry.h`. The former has a `_LIBCPP_HIDDEN` attribute, but the latter does not, causing the Firefox build to complain about the visibility not matching the previous declaration. This is because Firefox plays games with pushing and popping visibility. Work around this by making both `__dir_stream` declarations consistently use `_LIBCPP_HIDDEN`. Reviewed By: ldionne, philnik, #libc Differential Revision: https://reviews.llvm.org/D121639 (cherry picked from commit 7ab1ab0db40158e6f0794637054c98376e236a6d)
* [AArch64] Fix the upper limit for folded address offsets for COFF (tag: llvmorg-14.0.1) | Martin Storsjö | 2022-04-11 | 3 | -14/+16
| | | | | | | | | | | | | In COFF, the immediates in IMAGE_REL_ARM64_PAGEBASE_REL21 relocations are limited to 21 bit signed, i.e. the offset has to be less than (1 << 20). The previous limit did intend to cover for this case, but had missed that the 21 bit field was signed. This fixes issue https://github.com/llvm/llvm-project/issues/54753. Differential Revision: https://reviews.llvm.org/D123160 (cherry picked from commit 8d7a17b7c8b7151b8453903db96fc7f45d9b1bae)
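The range rule above follows directly from two's-complement arithmetic; a small Python check makes the limit concrete. `fits_signed` is a made-up helper for this example.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement integer
    with the given number of bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# A 21-bit signed field accepts offsets strictly below (1 << 20).
print(fits_signed((1 << 20) - 1, 21))  # True
print(fits_signed(1 << 20, 21))        # False
```

Reading the same field as unsigned would wrongly accept offsets up to `(1 << 21) - 1`, which is the kind of mistake the commit describes.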
* [compiler-rt] [scudo] Use -mcrc32 on x86 when available | Michał Górny | 2022-04-11 | 9 | -18/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the hardware CRC32 logic in scudo to support using `-mcrc32` instead of `-msse4.2`. The CRC32 intrinsics use the former flag in the newer compiler versions, e.g. in clang since 12fa608af44a. With these versions of clang, passing `-msse4.2` is insufficient to enable the instructions and causes build failures when `-march` does not enable CRC32 implicitly: /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.cpp:20:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'computeHardwareCRC32' that is compiled without support for 'crc32' return CRC32_INTRINSIC(Crc, Data); ^ /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.h:27:27: note: expanded from macro 'CRC32_INTRINSIC' # define CRC32_INTRINSIC FIRST_32_SECOND_64(_mm_crc32_u32, _mm_crc32_u64) ^ /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/../sanitizer_common/sanitizer_platform.h:132:36: note: expanded from macro 'FIRST_32_SECOND_64' # define FIRST_32_SECOND_64(a, b) (a) ^ 1 error generated. For backwards compatibility, use `-mcrc32` when available and fall back to `-msse4.2`. The `<smmintrin.h>` header remains in use as it still works and is compatible with GCC, while clang's `<crc32intrin.h>` is not. Use __builtin_ia32*() rather than _mm_crc32*() when using `-mcrc32` to preserve compatibility with GCC. _mm_crc32*() are aliases to __builtin_ia32*() in both compilers but GCC requires `-msse4.2` for the former, while both use `-mcrc32` for the latter. Originally reported in https://bugs.gentoo.org/835870. Differential Revision: https://reviews.llvm.org/D122789 (cherry picked from commit fd1da784ac644492f8ca40064baf3ef360352f55)
* [AARCH64] ssbs should be enabled by default for cortex-x1, cortex-x1c, ↵ | Ties Stuij | 2022-04-11 | 4 | -4/+11
| | | | | | | | cortex-a77 Reviewed By: amilendra Differential Revision: https://reviews.llvm.org/D121206
* [libc++] Define `namespace views` in its own detail header. | Arthur O'Dwyer | 2022-04-11 | 5 | -10/+53
| | | | | | | | | | | Discovered in the comments on D118748: we would like this namespace to exist anytime Ranges exists, regardless of whether concepts syntax is supported. Also, we'd like to fully granularize the <ranges> header, which means not putting any loose declarations at the top level. Differential Revision: https://reviews.llvm.org/D118809 (cherry picked from commit 44cdca37c01a58da94087be8ebd0ee2bd2ba724e)
* [X86] lowerV8I16Shuffle - use explicit SmallVector<SDValue, 4> width to ↵ | Simon Pilgrim | 2022-04-06 | 1 | -1/+2
| | | | | | | | avoid MSVC AVX alignment bug As discussed on Issue #54645 - building llc with /AVX can result in incorrectly aligned structs (cherry picked from commit cb5c4a5917889bd12c5662c8b550cde11924d570)
* [clang-repl] Add an accessor to our underlying execution engine | Vassil Vassilev | 2022-04-06 | 3 | -0/+9
| | | | | | | | This patch will allow better incremental adoption of these changes in downstream cling and other users which want to experiment by customizing the execution engine. (cherry picked from commit 788e0f7f3e96a9d61c2412e452c4589e2693b79a)
* [AArch64] Use correct calling convention for each vararg | Philippe Valembois | 2022-04-06 | 3 | -76/+113
| | | | | | | | | | | | | | While checking if tail call optimization is possible, the calling convention applied to fixed arguments is not the correct one. This implies for DarwinPCS that all arguments of a vararg function will go to the stack although fixed ones can go in registers. This prevents non-virtual thunks from being tail optimized although they are marked as musttail. Differential Revision: https://reviews.llvm.org/D120622 (cherry picked from commit 26cd258420c774254cc48330b1f4d23d353baf05)
* [SelectionDAG] Don't create illegally-typed nodes while constant folding | Fraser Cormack | 2022-04-06 | 2 | -1/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a (seemingly very rare) crash during vector constant folding introduced in D113300. Normally, during legalization, if we create an illegally-typed node during a failed attempt at constant folding it's cleaned up before being visited, due to it having no uses. If, however, an illegally-typed node is created during one round of legalization and isn't cleaned up, it's possible for a second round of legalization to create new illegally-typed nodes which add extra uses to the old illegal nodes. This means that we can end up visiting the old nodes before they're known to be dead, at which point we crash. I'm not happy about this fix. Creating illegal types at all seems like a bad idea, but we all-too-often rely on illegal constants being successfully folded and being fixed up afterwards. However, we can't rely on constant folding actually happening, and we don't have a foolproof way of peering into the future. Perhaps the correct fix is to revisit the node-iteration order during legalization, ensuring we visit all uses of nodes before the nodes themselves. Or alternatively we could try and clean up dead nodes immediately after failing constant folding. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122382 (cherry picked from commit 43a91a8474f55241404199f6b8798ac6467c2687)
* [AArch64] Allow .variant_pcs before the symbol is registered | Fangrui Song | 2022-04-06 | 4 | -23/+39
| | | | | | | | | | | | | | | | | | | | | | glibc sysdeps/aarch64/tst-vpcs-mod.S has something like: ``` .variant_pcs vpcs_call .global vpcs_call ``` This is supported by GNU as but leads to an error in MC. Use getOrCreateSymbol to support a not-yet-registered symbol: call `registerSymbol` to ensure the symbol exists even if there is no binding directive/label, to match GNU as. While here, improve tests to check (1) a local symbol can get STO_AARCH64_VARIANT_PCS (2) undefined .variant_pcs (3) an alias does not inherit STO_AARCH64_VARIANT_PCS. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122507 (cherry picked from commit cfbd5c8e4aa1ba3fc11fb408eeedbb05bd235956)
* [VectorCombine] Insert addrspacecast when crossing address space boundaries | Fraser Cormack | 2022-04-06 | 4 | -12/+21
| | | | | | | | | | | | | | | | We can not bitcast pointers across different address spaces. This was previously fixed in D89577 but then in D93229 an enhancement was added which peeks further through the pointer operand, opening up the possibility that address-space violations could be introduced. Instead of bailing as the previous fix did, simply insert an addrspacecast instruction. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D121787 (cherry picked from commit 2e44b7872bc638ed884ae4aa86e38b3b47e0b65a)
* [ELF] Fix llvm_unreachable failure when COMMON is placed in SHT_PROGBITS ↵ | Fangrui Song | 2022-04-05 | 2 | -42/+37
| | | | | | | | | output section Fix a regression in aa27bab5a1a17e9c4168a741a6298ecaa92c1ecb: COMMON in an SHT_PROGBITS output section caused llvm_unreachable failure. (cherry picked from commit 1db59dc8e28819b1960dae8e7fe6d79ad4b03340)
* [Object][test] Fix invalid.test | Fangrui Song | 2022-04-05 | 1 | -2/+3
| | | | (cherry picked from commit f7086401b7c03179b755768845956bc8e84ab266)