All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Wheeler <kvm@lists.ewheeler.net>
To: Sean Christopherson <seanjc@google.com>
Cc: Amaan Cheval <amaan.cheval@gmail.com>,
	brak@gameservers.com, kvm@vger.kernel.org
Subject: Re: Deadlock due to EPT_VIOLATION
Date: Mon, 21 Aug 2023 18:10:40 -0700 (PDT)	[thread overview]
Message-ID: <468b1298-e43e-2397-5f3-4b6af6e2f461@ewheeler.net> (raw)
In-Reply-To: <ZOP4lwiMU2Uf89eQ@google.com>

On Mon, 21 Aug 2023, Sean Christopherson wrote:
> On Mon, Aug 21, 2023, Eric Wheeler wrote:
> > On Fri, 18 Aug 2023, Eric Wheeler wrote:
> > > On Fri, 18 Aug 2023, Sean Christopherson wrote:
> > > > On Thu, Aug 17, 2023, Eric Wheeler wrote:
> > > > > On Thu, 17 Aug 2023, Sean Christopherson wrote:
> > > > > > > > kprobe:handle_ept_violation
> > > > > > > > {
> > > > > > > > 	printf("vcpu = %lx pid = %u MMU seq = %lx, in-prog = %lx, start = %lx, end = %lx\n",
> > > > > > > > 	       arg0, ((struct kvm_vcpu *)arg0)->pid->numbers[0].nr,
> > > > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_seq,
> > > > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_in_progress,
> > > > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_start,
> > > > > > > > 	       ((struct kvm_vcpu *)arg0)->kvm->mmu_invalidate_range_end);
> > > > > > > > }
> > > > > > > > 
> > > > > > > > If you don't have BTF info, we can still use a bpf program, but to get at the
> > > > > > > > fields of interested, I think we'd have to resort to pointer arithmetic with struct
> > > > > > > > offsets grab from your build.
> > > > > > > 
> > > > > > > We have BTF, so hurray for not needing struct offsets!
> > 
> > We found a new sample in 6.1.38, right after a lockup, where _all_ log 
> > entries show inprog=1, in case that is interesting. Here is a sample, 
> > there are 500,000+ entries so let me know if you want the whole log.
> > 
> > To me, these are opaque numbers.  What do they represent?  What are you looking for in them?
> 
> inprog is '1' if there is an in-progress mmu_notifier invalidation at the time
> of the EPT violation.  start/end are the range that is being invalidated _if_
> there is an in-progress invalidation.  If a vCPU were stuck with inprog=1, then
> the most likely scenario is that there's an unpaired invalidation, i.e. something
> started an invalidation but never completed it.
> 
> seq is a sequence count that is incremented when an invalidation completes, e.g.
> if a vCPU was stuck and seq were constantly changing, then it would mean that
> the primary MMU is invalidating the same range over and over so quickly that the
> vCPU can't make forward progress.

Here is another one, I think you described exactly this: the vcpu is 
always the same, and the sequence increments, forever:

      1 ept[0] vcpu=ffff9964cdc48000 seq=80854227 inprog=1 start=7fa3183a3000 end=7fa3183a4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854228 inprog=1 start=7fa3183a3000 end=7fa3183a4000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854229 inprog=1 start=7fa3183a4000 end=7fa3183a5000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422a inprog=1 start=7fa3183a4000 end=7fa3183a5000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422b inprog=1 start=7fa3183a8000 end=7fa3183a9000
      2 ept[0] vcpu=ffff9964cdc48000 seq=8085422d inprog=1 start=7fa3183a9000 end=7fa3183aa000
      1 ept[0] vcpu=ffff9964cdc48000 seq=8085422e inprog=1 start=7fa3183a9000 end=7fa3183aa000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854232 inprog=1 start=7fa3183ac000 end=7fa3183ad000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854233 inprog=1 start=7fa3183ad000 end=7fa3183ae000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854235 inprog=1 start=7fa3183ae000 end=7fa3183af000
      1 ept[0] vcpu=ffff9964cdc48000 seq=80854236 inprog=1 start=7fa3183ae000 end=7fa3183af000

Here is the whole log with 500,000+ lines over 5 minutes of recording, it 
was first stuck on one vcpu for most of the time, and toward the end it 
was stuck on a different VCPU:

The file starts with 555,596 occurances of vcpu=ffff9964cdc48000 and is 
then followed by 31,784 occurances of vcpu=ffff9934ed50c680.  As you can 
see in the file, they are not interleaved:

	https://www.linuxglobal.com/out/handle_ept_violation.log2

  # awk '{print $3}' handle_ept_violation.log2 |uniq -c
   555596 vcpu=ffff9964cdc48000
    31784 vcpu=ffff9934ed50c680

> Below is another bpftrace program that will hopefully shrink the 
> haystack to the point where we can find something via code inspection.

Ok thanks, we'll give it a try.

> Just a heads up, if we don't find a smoking gun, I likely won't be able 
> to help beyond high level guidance as we're approaching what I can do 
> with bpftrace without a significant time investment (which I can't 
> make).

I understand, I'm grateful for your help, whatever you can offer.

-Eric

> 
> ---
> struct kvm_page_fault {
> 	const gpa_t addr;
> 	const u32 error_code;
> 	const bool prefetch;
> 
> 	const bool exec;
> 	const bool write;
> 	const bool present;
> 	const bool rsvd;
> 	const bool user;
> 
> 	const bool is_tdp;
> 	const bool nx_huge_page_workaround_enabled;
> 
> 	bool huge_page_disallowed;
> 	u8 max_level;
> 
> 	u8 req_level;
> 
> 	u8 goal_level;
> 
> 	gfn_t gfn;
> 
> 	struct kvm_memory_slot *slot;
> 
> 	kvm_pfn_t pfn;
> 	unsigned long hva;
> 	bool map_writable;
> };
> 
> kprobe:handle_ept_violation
> {
> 	$vcpu = (struct kvm_vcpu *)arg0;
> 	$nr_handled = $vcpu->stat.pf_emulate + $vcpu->stat.pf_fixed + $vcpu->stat.pf_spurious;
> 
> 	if (($vcpu->stat.pf_taken / 5) > $nr_handled) {
> 		printf("PID = %u stuck, taken = %lu, emulated = %lu, fixed = %lu, spurious = %lu\n",
> 		       pid, $vcpu->stat.pf_taken, $vcpu->stat.pf_emulate, $vcpu->stat.pf_fixed, $vcpu->stat.pf_spurious);
> 	}
> }
> 
> kprobe:kvm_faultin_pfn
> {
> 	$vcpu = (struct kvm_vcpu *)arg0;
> 	$nr_handled = $vcpu->stat.pf_emulate + $vcpu->stat.pf_fixed + $vcpu->stat.pf_spurious;
> 
> 	if (($vcpu->stat.pf_taken / 5) > $nr_handled) {
> 		$fault = (struct kvm_page_fault *)arg1;
> 		$flags = 0;
> 
> 		if ($fault->slot != 0) {
> 			$flags = $fault->slot->flags;
> 		}
> 
> 		printf("PID = %u stuck, reached kvm_faultin_pfn(), slot = %lx, flags = %lx\n",
> 		       pid, arg1, $flags);
> 	}
> }
> 
> kprobe:make_mmio_spte
> {
> 	$vcpu = (struct kvm_vcpu *)arg0;
> 	$nr_handled = $vcpu->stat.pf_emulate + $vcpu->stat.pf_fixed + $vcpu->stat.pf_spurious;
> 
> 	if (($vcpu->stat.pf_taken / 5) > $nr_handled) {
> 		printf("PID %u stuck, reached make_mmio_spte()", pid);
> 	}
> }
> 
> kprobe:make_spte
> {
> 	$vcpu = (struct kvm_vcpu *)arg0;
> 	$nr_handled = $vcpu->stat.pf_emulate + $vcpu->stat.pf_fixed + $vcpu->stat.pf_spurious;
> 
> 	if (($vcpu->stat.pf_taken / 5) > $nr_handled) {
> 		printf("PID %u stuck, reached make_spte()", pid);
> 	}
> }
> 
> 

  parent reply	other threads:[~2023-08-22  1:10 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-23 14:02 Deadlock due to EPT_VIOLATION Brian Rak
2023-05-23 16:22 ` Sean Christopherson
2023-05-24 13:39   ` Brian Rak
2023-05-26 16:59     ` Brian Rak
2023-05-26 21:02       ` Sean Christopherson
2023-05-30 17:35         ` Brian Rak
2023-05-30 18:36           ` Sean Christopherson
2023-05-31 17:40             ` Brian Rak
2023-07-21 14:34             ` Amaan Cheval
2023-07-21 17:37               ` Sean Christopherson
2023-07-24 12:08                 ` Amaan Cheval
2023-07-25 17:30                   ` Sean Christopherson
2023-08-02 14:21                     ` Amaan Cheval
2023-08-02 15:34                       ` Sean Christopherson
2023-08-02 16:45                         ` Amaan Cheval
2023-08-02 17:52                           ` Sean Christopherson
2023-08-08 15:34                             ` Amaan Cheval
2023-08-08 17:07                               ` Sean Christopherson
2023-08-10  0:48                                 ` Eric Wheeler
2023-08-10  1:27                                   ` Eric Wheeler
2023-08-10 23:58                                     ` Sean Christopherson
2023-08-11 12:37                                       ` Amaan Cheval
2023-08-11 18:02                                         ` Sean Christopherson
2023-08-12  0:50                                           ` Eric Wheeler
2023-08-14 17:29                                             ` Sean Christopherson
2023-08-15  0:30                                 ` Eric Wheeler
2023-08-15 16:10                                   ` Sean Christopherson
2023-08-16 23:54                                     ` Eric Wheeler
2023-08-17 18:21                                       ` Sean Christopherson
2023-08-18  0:55                                         ` Eric Wheeler
2023-08-18 14:33                                           ` Sean Christopherson
2023-08-18 23:06                                             ` Eric Wheeler
2023-08-21 20:27                                               ` Eric Wheeler
2023-08-21 23:51                                                 ` Sean Christopherson
2023-08-22  0:11                                                   ` Sean Christopherson
2023-08-22  1:10                                                   ` Eric Wheeler [this message]
2023-08-22 15:11                                                     ` Sean Christopherson
2023-08-22 21:23                                                       ` Eric Wheeler
2023-08-22 21:32                                                         ` Sean Christopherson
2023-08-23  0:39                                                       ` Eric Wheeler
2023-08-23 17:54                                                         ` Sean Christopherson
2023-08-23 19:44                                                           ` Eric Wheeler
2023-08-23 22:12                                                           ` Eric Wheeler
2023-08-23 22:32                                                             ` Eric Wheeler
2023-08-23 23:21                                                               ` Sean Christopherson
2023-08-24  0:30                                                                 ` Eric Wheeler
2023-08-24  0:52                                                                   ` Sean Christopherson
2023-08-24 23:51                                                                     ` Eric Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=468b1298-e43e-2397-5f3-4b6af6e2f461@ewheeler.net \
    --to=kvm@lists.ewheeler.net \
    --cc=amaan.cheval@gmail.com \
    --cc=brak@gameservers.com \
    --cc=kvm@vger.kernel.org \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.