On 05/10/2019 22:49, speck for Andy Lutomirski wrote:
> On 10/4/19 11:26 PM, speck for Pawan Gupta wrote:
>> Transactional Synchronization Extensions (TSX) may be used on certain
>> processors as part of a speculative side channel attack. A microcode
>> update for existing processors that are vulnerable to this attack will
>> add a new MSR, IA32_TSX_CTRL to allow the system administrator the
>> option to disable TSX as one of the possible mitigations. [Note that
>> future processors that are not vulnerable will also support the
>> IA32_TSX_CTRL MSR]. Add defines for the new IA32_TSX_CTRL MSR and its
>> bits.
>>
>> TSX has two sub-features:
>>
>> 1. Restricted Transactional Memory (RTM) is an explicitly-used feature
>>    where new instructions begin and end TSX transactions.
>> 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
>>    "old" style locks are used by software.
>>
>> Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
>> IA32_TSX_CTRL MSR.
>>
>> There are two control bits in IA32_TSX_CTRL MSR:
>>
>>   Bit 0: When set it disables the Restricted Transactional Memory (RTM)
>>          sub-feature of TSX (will force all transactions to abort on the
>>          XBEGIN instruction).
>>
>>   Bit 1: When set it disables the enumeration of the RTM feature (i.e.
>>          it will make CPUID(EAX=7).EBX{bit11} read as 0).
>>
>> The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
>> disabled but still enumerated as present by CPUID(EAX=7).EBX{bit4}.

After experimenting with the production CLX ucode:

(XEN) microcode: CPU0 updated from revision 0x5000024 to 0x500002b, date = 2019-08-12
...
(XEN) CPUID.7[0].ebx before 0xd39ffffb, after 0xd39ff7eb, xor 0x00000810

Bit 1 flips both the RTM and HLE bits (EBX bits 11 and 4, hence the xor of
0x810), which is also my understanding from the discussion on the calls.

> Does the kernel need to intercept CPUID to clear the HLE bit to avoid
> potential performance loss?  I can imagine some programs using alternate
> code paths that are better in the non-HLE case.

HLE is a hint-only feature saying "things may get faster if you're on
TSX-capable hardware".  That said, it is not possible to implement fair
spinlocks with it (due to the requirement of the lock state on exit being
identical to entry), and it appears to have no production users.

Furthermore, it is already disabled across the board (with the feature bit
still present, for migration compatibility) for previous TSX errata.  Any
perf hit is already being taken.

~Andrew
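
For anyone wanting to double-check the arithmetic on the EBX values above, here is a minimal sketch in C. It only decodes the two CPUID.7[0].ebx values quoted from the Xen log; it does not touch any MSR or execute CPUID. The macro and function names are illustrative assumptions chosen for this sketch, not names taken from the patch or from Xen. The bit positions are the ones given in the quoted commit message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions as described in the quoted patch text.  The macro names
 * are illustrative choices for this sketch, not taken from the thread. */
#define ARCH_CAP_TSX_CTRL_MSR  (1u << 7)   /* IA32_ARCH_CAPABILITIES: IA32_TSX_CTRL present */
#define TSX_CTRL_RTM_DISABLE   (1u << 0)   /* IA32_TSX_CTRL: force XBEGIN to abort */
#define TSX_CTRL_CPUID_CLEAR   (1u << 1)   /* IA32_TSX_CTRL: hide the CPUID enumeration */
#define CPUID7_EBX_HLE         (1u << 4)   /* CPUID(EAX=7).EBX: HLE */
#define CPUID7_EBX_RTM         (1u << 11)  /* CPUID(EAX=7).EBX: RTM */

/* Hypothetical helper: mask of EBX bits that differ between two reads. */
static inline uint32_t cpuid7_ebx_changed(uint32_t before, uint32_t after)
{
    return before ^ after;
}

/* Hypothetical helper: print which of the two TSX feature bits changed. */
static inline void tsx_report_cpuid_change(uint32_t before, uint32_t after)
{
    uint32_t changed = cpuid7_ebx_changed(before, after);

    printf("CPUID.7[0].ebx before 0x%08" PRIx32 ", after 0x%08" PRIx32
           ", xor 0x%08" PRIx32 "\n", before, after, changed);
    printf("HLE (bit 4):  %s\n",
           (changed & CPUID7_EBX_HLE) ? "changed" : "unchanged");
    printf("RTM (bit 11): %s\n",
           (changed & CPUID7_EBX_RTM) ? "changed" : "unchanged");
}

/* Sanity check against the values from the log: the set of changed bits
 * should be exactly HLE | RTM, and both should be clear afterwards. */
static inline void tsx_check_thread_values(void)
{
    const uint32_t before = 0xd39ffffbu, after = 0xd39ff7ebu;

    assert(cpuid7_ebx_changed(before, after) ==
           (CPUID7_EBX_HLE | CPUID7_EBX_RTM));
    assert((after & (CPUID7_EBX_HLE | CPUID7_EBX_RTM)) == 0);
}
```

Calling tsx_report_cpuid_change(0xd39ffffb, 0xd39ff7eb) from a small test program should report both bits as changed, consistent with the observation that both enumeration bits go away together.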