On Mon, 8 Apr 2024 at 11:32, Linus Torvalds wrote:
>
> It's been reported long ago, it seems to be hard to fix.
>
> I suspect the issue is that the inline asm format is fairly closely
> related to the gcc machine descriptions (look at the machine
> descriptor files in gcc, and if you can ignore the horrid LISP-style
> syntax you see how close they are).

Actually, one of the github issues pages has more of an explanation
(and yes, it's tied to impedance issues between the inline asm syntax
and how clang works):

      https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442

so I wrote more of a commit log and did that "ASM_SOURCE_G" thing
(except I decided to call it "input" instead of "source", since that's
the standard inline asm language).

This version also has that output size fixed, and the commit message
talks about it.

This does *not* fix other inline asms to use "ASM_INPUT_G/RM". I think
it's mainly some of the bitop code that people have noticed before -
fls and variable_ffs() and friends.

I suspect clang is more common in the arm64 world than it is for
x86-64 kernel developers, and arm64 inline asm basically never uses
"rm" or "g" since arm64 doesn't have instructions that take either a
register or a memory operand.

Anyway, with gcc this generates

        cmp (%rdx),%ebx; sbb %rax,%rax  # _7->max_fds, fd, __mask

IOW, it uses the memory location for "max_fds". It couldn't do that
before, because it used to think that it always had to do the compare
in 64 bits, and the memory location is only 32-bit.

With clang, this generates

        movl    (%rcx), %eax
        cmpl    %eax, %edi
        sbbq    %rdi, %rdi

which has that extra register use, but is at least much better than
what it used to generate with crazy "load into register, spill to
stack, then compare against stack contents".

             Linus
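
For illustration only, here is a minimal user-space sketch of the kind of
cmp/sbb mask helper that the generated code above corresponds to. It is not
the patch itself. The names index_mask() and clamp_index() are hypothetical,
and the ASM_INPUT_G definition ("ir" for clang, "g" otherwise) is an
assumption about how such a macro could be spelled. Both operands are kept
32-bit so the compiler is free to compare directly against a 32-bit memory
location, and non-x86-64 builds fall back to plain C.

```c
/* Assumed definition, for illustration: clang gets register-or-immediate,
 * everything else gets the general "g" constraint. */
#ifdef __clang__
# define ASM_INPUT_G "ir"
#else
# define ASM_INPUT_G "g"
#endif

/* Returns ~0UL when idx < sz (unsigned compare), 0 otherwise. */
static inline unsigned long index_mask(unsigned int idx, unsigned int sz)
{
#if defined(__x86_64__)
	unsigned long mask;

	/* AT&T syntax: "cmp sz,idx" computes idx - sz and sets CF when
	 * idx < sz; "sbb mask,mask" then yields -CF, i.e. all ones or zero. */
	__asm__ __volatile__("cmp %1,%2; sbb %0,%0"
			     : "=r" (mask)
			     : ASM_INPUT_G (sz), "r" (idx)
			     : "cc");
	return mask;
#else
	/* Portable fallback so the sketch builds on other architectures. */
	return idx < sz ? ~0UL : 0UL;
#endif
}

/* Hypothetical usage: force an out-of-range index to zero. */
static inline unsigned int clamp_index(unsigned int idx, unsigned int sz)
{
	return idx & (unsigned int)index_mask(idx, sz);
}
```

With gcc the "g" constraint lets the size operand stay in memory, which is
what allows the single "cmp (%rdx),%ebx" form. With the "ir" spelling clang
has to load it into a register first, matching the extra movl shown above.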