All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Wheeler <kvm@lists.ewheeler.net>
To: Sean Christopherson <seanjc@google.com>
Cc: Amaan Cheval <amaan.cheval@gmail.com>,
	brak@gameservers.com, kvm@vger.kernel.org
Subject: Re: Deadlock due to EPT_VIOLATION
Date: Mon, 21 Aug 2023 13:27:25 -0700 (PDT)	[thread overview]
Message-ID: <5fc6cea-9f51-582c-8bb3-21e0b4bf397@ewheeler.net> (raw)
In-Reply-To: <418345e5-a3e5-6e8d-395a-f5551ea13e2@ewheeler.net>

[-- Attachment #1: Type: text/plain, Size: 13628 bytes --]

On Fri, 18 Aug 2023, Eric Wheeler wrote:
> On Fri, 18 Aug 2023, Sean Christopherson wrote:
> > On Thu, Aug 17, 2023, Eric Wheeler wrote:
> > > On Thu, 17 Aug 2023, Sean Christopherson wrote:
> > > > > > kprobe:handle_ept_violation
> > > > > > {
> > > > > > 	printf("vcpu = %lx pid = %u MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
> > > > > > 	       arg0, ((struct kvm_vcpu *)arg0)->pid->numbers[0].nr,
> > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_seq,
> > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_in_progress,
> > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_start,
> > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_end);
> > > > > > }
> > > > > > 
> > > > > > If you don't have BTF info, we can still use a bpf program, but to get at the
> > > > > > fields of interested, I think we'd have to resort to pointer arithmetic with struct
> > > > > > offsets grab from your build.
> > > > > 
> > > > > We have BTF, so hurray for not needing struct offsets!

We found a new sample in 6.1.38, right after a lockup, where _all_ log 
entries show inprog=1, in case that is interesting. Here is a sample, 
there are 500,000+ entries so let me know if you want the whole log.

To me, these are opaque numbers.  What do they represent?  What are you looking for in them?

      1 ept[0] vcpu=ffff9964cdc48000 seq=80854227 inprog=1 start=7fa3183a3000 end=7fa3183a4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854228 inprog=1 start=7fa3183a3000 end=7fa3183a4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854229 inprog=1 start=7fa3183a4000 end=7fa3183a5000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422a inprog=1 start=7fa3183a4000 end=7fa3183a5000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422b inprog=1 start=7fa3183a8000 end=7fa3183a9000
      2 ept[0] vcpu=ffff9964cdc48000 seq=8085422d inprog=1 start=7fa3183a9000 end=7fa3183aa000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422e inprog=1 start=7fa3183a9000 end=7fa3183aa000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854232 inprog=1 start=7fa3183ac000 end=7fa3183ad000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854233 inprog=1 start=7fa3183ad000 end=7fa3183ae000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854235 inprog=1 start=7fa3183ae000 end=7fa3183af000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854236 inprog=1 start=7fa3183ae000 end=7fa3183af000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854237 inprog=1 start=7fa3183b1000 end=7fa3183b2000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854239 inprog=1 start=7fa3183b3000 end=7fa3183b4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085423a inprog=1 start=7fa3183b3000 end=7fa3183b4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085423d inprog=1 start=7fa3183b7000 end=7fa3183b8000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085423e inprog=1 start=7fa3183b7000 end=7fa3183b8000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085423f inprog=1 start=7fa3183b8000 end=7fa3183b9000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854240 inprog=1 start=7fa3183b8000 end=7fa3183b9000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854241 inprog=1 start=7fa3183b9000 end=7fa3183ba000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854242 inprog=1 start=7fa3183b9000 end=7fa3183ba000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854243 inprog=1 start=7fa3183ba000 end=7fa3183bb000
      2 ept[0] vcpu=ffff9964cdc48000 seq=80854244 inprog=1 start=7fa3183ba000 end=7fa3183bb000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854246 inprog=1 start=7fa3183bb000 end=7fa3183bc000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854247 inprog=1 start=7fa3183bc000 end=7fa3183bd000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854248 inprog=1 start=7fa3183bc000 end=7fa3183bd000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854249 inprog=1 start=7fa3183bd000 end=7fa3183be000
      2 ept[0] vcpu=ffff9964cdc48000 seq=8085424b inprog=1 start=7fa3183be000 end=7fa3183bf000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085424c inprog=1 start=7fa3183be000 end=7fa3183bf000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085424e inprog=1 start=7fa3183bf000 end=7fa3183c0000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854250 inprog=1 start=7fa3183c0000 end=7fa3183c1000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854251 inprog=1 start=7fa3183c1000 end=7fa3183c2000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854255 inprog=1 start=7fa3183c5000 end=7fa3183c6000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854256 inprog=1 start=7fa3183c5000 end=7fa3183c6000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854257 inprog=1 start=7fa3183c6000 end=7fa3183c7000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854259 inprog=1 start=7fa3183c7000 end=7fa3183c8000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085425a inprog=1 start=7fa3183c7000 end=7fa3183c8000
      2 ept[0] vcpu=ffff9964cdc48000 seq=8085425b inprog=1 start=7fa3183c8000 end=7fa3183c9000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085425c inprog=1 start=7fa3183c8000 end=7fa3183c9000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085425e inprog=1 start=7fa3183c9000 end=7fa3183ca000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854260 inprog=1 start=7fa3183ca000 end=7fa3183cb000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854261 inprog=1 start=7fa3183cb000 end=7fa3183cc000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854262 inprog=1 start=7fa3183cb000 end=7fa3183cc000
      2 ept[0] vcpu=ffff9964cdc48000 seq=80854263 inprog=1 start=7fa3183ce000 end=7fa3183cf000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854264 inprog=1 start=7fa3183ce000 end=7fa3183cf000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854265 inprog=1 start=7fa3183cf000 end=7fa3183d0000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854266 inprog=1 start=7fa3183cf000 end=7fa3183d0000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854268 inprog=1 start=7fa3183d4000 end=7fa3183d5000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085426b inprog=1 start=7fa3183d6000 end=7fa3183d7000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085426c inprog=1 start=7fa3183d6000 end=7fa3183d7000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085426d inprog=1 start=7fa3183d7000 end=7fa3183d8000

Thanks,

-Eric


> 
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32524620 inprog=1 start=7f32477d7000 end=7f32477d8000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32524aee inprog=1 start=7f3252209000 end=7f325220a000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32527895 inprog=1 start=7f329504d000 end=7f329504e000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=325279eb inprog=1 start=7f3296f00000 end=7f3296f01000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=325279f5 inprog=1 start=7f3296fae000 end=7f3296faf000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32527b4d inprog=1 start=7f329937e000 end=7f329937f000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32525ef6 inprog=1 start=7f3272503000 end=7f3272504000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32526517 inprog=1 start=7f327a568000 end=7f327a569000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=325268e8 inprog=1 start=7f327e4a4000 end=7f327e4a5000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=32527543 inprog=1 start=7f328f8ca000 end=7f328f8cb000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c861ab6 inprog=1 start=7fb4c67de000 end=7fb4c67df000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c862600 inprog=1 start=7fb48c132000 end=7fb48c133000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c862a8b inprog=1 start=7fb4f06b8000 end=7fb4f06b9000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c862b9f inprog=1 start=7fb4f1861000 end=7fb4f1862000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c862d33 inprog=1 start=7fb4e72f5000 end=7fb4e72f6000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c86415c inprog=1 start=7fb49fb5a000 end=7fb49fb5b000
>       1 ept[0] vcpu=ffff9c43d5618000 seq=1c864162 inprog=1 start=7fb49fb59000 end=7fb49fb5a000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862ba1 inprog=1 start=7fb4f0e24000 end=7fb4f0e25000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862bab inprog=1 start=7fb4f0e26000 end=7fb4f0e27000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862bb1 inprog=1 start=7fb4f0e27000 end=7fb4f0e28000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862cbd inprog=1 start=7fb4efffd000 end=7fb4efffe000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862cc4 inprog=1 start=7fb4f0692000 end=7fb4f0693000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862d32 inprog=1 start=7fb4dd282000 end=7fb4dd283000
>       1 ept[0] vcpu=ffff9c533e1dc680 seq=1c862d36 inprog=1 start=7fb4e8e97000 end=7fb4e8e98000
>       1 ept[0] vcpu=ffff9c436d26c680 seq=3252adeb inprog=1 start=7f326209b000 end=7f326209c000
> 
> The entire dump is 22,687 lines if you want to see it, here (expires in 1 week):
> 
> 	https://privatebin.net/?9a3bff6b6fd2566f#BHjrt4NGpoXL12NWiUDpThifi9E46LNXCy7eWzGXgqYx
> 
> > > 
> > > What is involved in doing this with struct offsets for Linux v6.1.x?
> > 
> > Unless you are up for a challenge, I'd drop the PID entirely, getting that will
> > be ugly.
> > 
> > For the KVM info, you need the offset of "kvm" within struct kvm_vcpu (more than
> > likely it's '0'), and then the offset of each of the mmu_invaliate_* fields within
> > struct kvm.  These need to come from the exact kernel you're running, though unless
> > a field is added/removed to/from struct kvm between kernel versions, the offsets
> > should be stable.
> > 
> > A cheesy/easy way to get the offsets is to feed offsetof() into __aligned and
> > then compile.  So long as the offset doesn't happen to be a power-of-2, the
> > compiler will yell.  E.g. with this
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 92c50dc159e8..04ec37f7374a 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -543,7 +543,13 @@ struct kvm_hva_range {
> >   */
> >  static void kvm_null_fn(void)
> >  {
> > +       int v __aligned(offsetof(struct kvm_vcpu, kvm));
> > +       int w __aligned(offsetof(struct kvm, mmu_invalidate_seq));
> > +       int x __aligned(offsetof(struct kvm, mmu_invalidate_in_progress));
> > +       int y __aligned(offsetof(struct kvm, mmu_invalidate_range_start));
> > +       int z __aligned(offsetof(struct kvm, mmu_invalidate_range_end));
> >  
> > +       v = w = x = y = z = 0;
> >  }
> >  #define IS_KVM_NULL_FN(fn) ((fn) == (void *)kvm_null_fn)
> > 
> > I get yelled at with (trimmed):
> > 
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:546:34: error: requested alignment ‘0’ is not a positive power of 2 [-Werror=attributes]
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:547:20: error: requested alignment ‘36960’ is not a positive power of 2
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:549:20: error: requested alignment ‘36968’ is not a positive power of 2
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:551:20: error: requested alignment ‘36976’ is not a positive power of 2
> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:553:20: error: requested alignment ‘36984’ is not a positive power of 2
> 
> Neat trick.
> 
> So here are my numbers:
> 
> # make modules  KDIR=virt 2>&1 | grep -A1 alignment |grep -v ^-
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:568:40: error: requested alignment ‘0’ is not a positive power of 2 [-Werror=attributes]
>   568 |        int v __aligned(offsetof(struct kvm_vcpu, kvm));
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:569:40: error: requested alignment ‘39552’ is not a positive power of 2
>   569 |        int w __aligned(offsetof(struct kvm, mmu_invalidate_seq));
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:570:40: error: requested alignment ‘39560’ is not a positive power of 2
>   570 |        int x __aligned(offsetof(struct kvm, mmu_invalidate_in_progress));
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:571:40: error: requested alignment ‘39568’ is not a positive power of 2
>   571 |        int y __aligned(offsetof(struct kvm, mmu_invalidate_range_start));
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:572:40: error: requested alignment ‘39576’ is not a positive power of 2
>   572 |        int z __aligned(offsetof(struct kvm, mmu_invalidate_range_end));
> 
> and the resulting script:
> 	kprobe:handle_ept_violation
> 	{
> 		$kvm = *((uint64 *)((uint64)arg0 + 0));
> 
> 		printf("vcpu=%08lx seq=%08lx inprog=%lx start=%08lx end=%08lx\n",
> 			arg0, 
> 		       *((uint64 *)($kvm + 39552)),
> 		       *((uint64 *)($kvm + 39560)),
> 		       *((uint64 *)($kvm + 39568)),
> 		       *((uint64 *)($kvm + 39576))
> 		       );
> 	}
> 
> ... but the output shows all 0's except vcpu:
> 
> 	# bpftrace ./handle_ept_violation.bt |grep ^vcpu | uniq -c
> 	     11 vcpu=ffff9d518541c680 seq=00000000 inprog=0 start=00000000 end=00000000
> 	     29 vcpu=ffff9d80cc120000 seq=00000000 inprog=0 start=00000000 end=00000000
> 	    331 vcpu=ffff9d5f1d1a2340 seq=00000000 inprog=0 start=00000000 end=00000000
> 	    858 vcpu=ffff9d80c7b98000 seq=00000000 inprog=0 start=00000000 end=00000000
> 	   2183 vcpu=ffff9d6033fb2340 seq=00000000 inprog=0 start=00000000 end=00000000
> 
> Did I do something wrong here?
> 
> -Eric
> 
> > 
> > Then take those offsets and do math.  For me, this provides the same output as
> > the above pretty version.  Just use common sense and verify you're getting sane
> > data.
> > 
> > kprobe:handle_ept_violation
> > {
> > 	$kvm = *((uint64 *)((uint64)arg0 + 0));
> > 
> > 	printf("vcpu = %lx MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
> > 	       arg0,
> >                *((uint64 *)($kvm + 36960)),
> >                *((uint64 *)($kvm + 36968)),
> >                *((uint64 *)($kvm + 36976)),
> >                *((uint64 *)($kvm + 36984)));
> > }
> > 
> > 
> 
> 
> 
> 
> 
> 
> --
> Eric Wheeler
> 
> 

  reply	other threads:[~2023-08-21 20:27 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-23 14:02 Deadlock due to EPT_VIOLATION Brian Rak
2023-05-23 16:22 ` Sean Christopherson
2023-05-24 13:39   ` Brian Rak
2023-05-26 16:59     ` Brian Rak
2023-05-26 21:02       ` Sean Christopherson
2023-05-30 17:35         ` Brian Rak
2023-05-30 18:36           ` Sean Christopherson
2023-05-31 17:40             ` Brian Rak
2023-07-21 14:34             ` Amaan Cheval
2023-07-21 17:37               ` Sean Christopherson
2023-07-24 12:08                 ` Amaan Cheval
2023-07-25 17:30                   ` Sean Christopherson
2023-08-02 14:21                     ` Amaan Cheval
2023-08-02 15:34                       ` Sean Christopherson
2023-08-02 16:45                         ` Amaan Cheval
2023-08-02 17:52                           ` Sean Christopherson
2023-08-08 15:34                             ` Amaan Cheval
2023-08-08 17:07                               ` Sean Christopherson
2023-08-10  0:48                                 ` Eric Wheeler
2023-08-10  1:27                                   ` Eric Wheeler
2023-08-10 23:58                                     ` Sean Christopherson
2023-08-11 12:37                                       ` Amaan Cheval
2023-08-11 18:02                                         ` Sean Christopherson
2023-08-12  0:50                                           ` Eric Wheeler
2023-08-14 17:29                                             ` Sean Christopherson
2023-08-15  0:30                                 ` Eric Wheeler
2023-08-15 16:10                                   ` Sean Christopherson
2023-08-16 23:54                                     ` Eric Wheeler
2023-08-17 18:21                                       ` Sean Christopherson
2023-08-18  0:55                                         ` Eric Wheeler
2023-08-18 14:33                                           ` Sean Christopherson
2023-08-18 23:06                                             ` Eric Wheeler
2023-08-21 20:27                                               ` Eric Wheeler [this message]
2023-08-21 23:51                                                 ` Sean Christopherson
2023-08-22  0:11                                                   ` Sean Christopherson
2023-08-22  1:10                                                   ` Eric Wheeler
2023-08-22 15:11                                                     ` Sean Christopherson
2023-08-22 21:23                                                       ` Eric Wheeler
2023-08-22 21:32                                                         ` Sean Christopherson
2023-08-23  0:39                                                       ` Eric Wheeler
2023-08-23 17:54                                                         ` Sean Christopherson
2023-08-23 19:44                                                           ` Eric Wheeler
2023-08-23 22:12                                                           ` Eric Wheeler
2023-08-23 22:32                                                             ` Eric Wheeler
2023-08-23 23:21                                                               ` Sean Christopherson
2023-08-24  0:30                                                                 ` Eric Wheeler
2023-08-24  0:52                                                                   ` Sean Christopherson
2023-08-24 23:51                                                                     ` Eric Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5fc6cea-9f51-582c-8bb3-21e0b4bf397@ewheeler.net \
    --to=kvm@lists.ewheeler.net \
    --cc=amaan.cheval@gmail.com \
    --cc=brak@gameservers.com \
    --cc=kvm@vger.kernel.org \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.