Re: [PATCH 3/5] KVM: x86: Use __try_cmpxchg_user() to update guest PTE A/D bits

From: Nick Desaulniers <ndesaulniers@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kernel test robot <lkp@intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 Nathan Chancellor <nathan@kernel.org>,
	llvm@lists.linux.dev, kbuild-all@lists.01.org,
	 Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	 Jim Mattson <jmattson@google.com>,
	Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org
Subject: Re: [PATCH 3/5] KVM: x86: Use __try_cmpxchg_user() to update guest PTE A/D bits
Date: Tue, 1 Feb 2022 11:53:28 -0800	[thread overview]
Message-ID: <CAKwvOdkieUPcXgu4_GNmD_wpH5z2mzuDL+FTxRAQ4EqeF-WbvA@mail.gmail.com> (raw)
In-Reply-To: <YfmNr8OjOWvsQBKx@google.com>

On Tue, Feb 1, 2022 at 11:44 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Feb 01, 2022, kernel test robot wrote:
> >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> >
> > All errors (new ones prefixed by >>):
> >
> >    In file included from arch/x86/kvm/mmu/mmu.c:4246:
> > >> arch/x86/kvm/mmu/paging_tmpl.h:244:9: error: invalid output size for constraint '+a'
> >                    ret = __try_cmpxchg_user(ptep_user, &orig_pte, pte, fault);
> >                          ^
> >    arch/x86/include/asm/uaccess.h:629:11: note: expanded from macro '__try_cmpxchg_user'
> >            __ret = !unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label);   \
> >                     ^
> >    arch/x86/include/asm/uaccess.h:606:18: note: expanded from macro 'unsafe_try_cmpxchg_user'
> >            case 1: __ret = __try_cmpxchg_user_asm("b", "q",                \
> >                            ^
> >    arch/x86/include/asm/uaccess.h:467:22: note: expanded from macro '__try_cmpxchg_user_asm'
> >                           [old] "+a" (__old)                               \
>
> #$*&(#$ clang.
>
> clang isn't smart enough to avoid compiling the impossible conditions it will
> throw away in the end, i.e. it compiles all cases given:
>
>   switch (8) {
>   case 1:
>   case 2:
>   case 4:
>   case 8:
>   }

This is because Clang generally handles diagnostics, while LLVM
handles optimizations, in a pipeline, in that order. In order to know
not to emit a diagnostic, optimizations need to be run.  Do you prefer
that diagnostics only get emitted for code not optimized out? In this
case, yes, but in others folks could reasonably desire the others.
Sorry, but this is a BIG architectural difference between compilers.

(FWIW; I've personally implemented warnings in clang that don't work
that way.  The tradeoff is excessive memory usage and work during
compile time to basically pessimistically emit the equivalent of debug
info, and do all the work that entails keeping it up to date as you
transform the code, even if it may never trigger a warning.  Changing
all diagnostics to work that way would be tantamount to rewriting most
of the frontend and would be slower and use more memory at runtime).

>
> I can fudge around that by casting the pointer, which I don't think can go sideways
> if the pointer value is a signed type?
>
> @@ -604,15 +602,15 @@ extern void __try_cmpxchg_user_wrong_size(void);
>         bool __ret;                                                     \
>         switch (sizeof(*(_ptr))) {                                      \
>         case 1: __ret = __try_cmpxchg_user_asm("b", "q",                \
> -                                              (_ptr), (_oldp),         \
> +                                              (u8 *)(_ptr), (_oldp),   \
>                                                (_nval), _label);        \
>                 break;                                                  \
>         case 2: __ret = __try_cmpxchg_user_asm("w", "r",                \
> -                                              (_ptr), (_oldp),         \
> +                                              (u16 *)(_ptr), (_oldp),  \
>                                                (_nval), _label);        \
>                 break;                                                  \
>         case 4: __ret = __try_cmpxchg_user_asm("l", "r",                \
> -                                              (_ptr), (_oldp),         \
> +                                              (u32 *)(_ptr), (_oldp),  \
>                                                (_nval), _label);        \
>                 break;                                                  \
>         case 8: __ret = __try_cmpxchg64_user_asm((_ptr), (_oldp),       \
>
>
> clang also lacks the intelligence to realize that it can/should use a single
> register for encoding the memory operand and consumes both ESI and EDI, leaving
> no register for the __err "+r" param in __try_cmpxchg64_user_asm().  That can be
> avoided by open coding CC_SET and using a single output register for both the
> result and the -EFAULT error path.

-- 
Thanks,
~Nick Desaulniers