All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brad Campbell <lists2009@fnarfbargle.com>,
	Borislav Petkov <bp@alien8.de>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>
Subject: Re: KVM induced panic on 2.6.38[2367] & 2.6.39
Date: Tue, 31 May 2011 21:52:56 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.00.1105312120530.22808@sister.anvils> (raw)
In-Reply-To: <20110601011527.GN19505@random.random>

On Wed, 1 Jun 2011, Andrea Arcangeli wrote:
> On Wed, Jun 01, 2011 at 08:37:25AM +0800, Brad Campbell wrote:
> > On 01/06/11 06:31, Hugh Dickins wrote:
> > > Brad, my suspicion is that in each case the top 16 bits of RDX have been
> > > mysteriously corrupted from ffff to 0000, causing the general protection
> > > faults.  I don't understand what that has to do with KSM.
> > >
> > > But it's only a suspicion, because I can't make sense of the "Code:"
> > > lines in your traces, they have more than the expected 64 bytes, and
> > > only one of them has a ">" (with no"<") to mark faulting instruction.
> > >
> > > I did try compiling the 2.6.39 kernel from your config, but of course
> > > we have different compilers, so although I got close, it wasn't exact.
> > >
> > > Would you mind mailing me privately (it's about 73MB) the "objdump -trd"
> > > output for your original vmlinux (with KSM on)?  (Those -trd options are
> > > the ones I'm used to typing, I bet not they're not all relevant.)
> > >
> > > Of course, it's only a tiny fraction of that output that I need,
> > > might be better to cut it down to remove_rmap_item_from_tree and
> > > dup_fd and ksm_scan_thread, if you have the time to do so.
> > 
> > Would you believe about 20 seconds after I pressed send the kernel oopsed.
> > 
> > http://www.fnarfbargle.com/private/003_kernel_oops/
> > 
> > oops reproduced here, but an un-munged version is in that directory 
> > alongside the kernel.
> > 
> > [36542.880228] general protection fault: 0000 [#1] SMP
> 
> Reminds me of another oops that was reported on the kvm list for
> 2.6.38.1 with message id 4D8C6110.6090204. There the top 16 bits of
> rsi were flipped and it was a general protection too because of
> hitting on the not mappable virtual range.
> 
> http://www.virtall.com/files/temp/kvm.txt
> http://www.virtall.com/files/temp/config-2.6.38.1
> http://virtall.com/files/temp/mmu-objdump.txt
> 
> That oops happened in kvm_unmap_rmapp though, but it looked memory
> corruption (Avi suggested use after free) but it was a production
> system so we couldn't debug it further.
> 
> I recommend next thing to reproduce again with 2.6.39 or
> 3.0.0-rc1. Let's fix your scsi trouble if needed but it's better you
> test with 2.6.39.

Brad, thanks for this and the other further crash, with vmlinux etc:
very helpful info.

Andrea, I'm pretty sure you're right to connect Brad's report with
the one above.

In four out of five of Brad's reports (cannot tell in the fifth),
the bad pointer (with top 16 bits 0000 instead of ffff) had been
loaded from SLUB memory at an address offset 0x7f8 (1 case) or
0xff8 (3 cases) i.e. it's the short at 0x7fe or 0xffe that has
been zeroed.

No reason to suspect KSM's rmap_item code, or file table handling:
they just seem to be the victims of corruption from elsewhere.

I notice %rax and %rsi, the corrupted pointer in your kvm.txt
case, is itself a ...7f8 address; and %r13 an ...ff8 address.
I've not even glanced at the code, but I wonder if that implies
that KVM is close to the origin of the corruption.

I doubt I'll be able to spend more time on this, hope you can
take over.

I guess Brad could try SLUB debugging, boot with slub_debug=P
for poisoning perhaps; though it might upset alignments and
drive the problem underground.  Or see if the same happens
with SLAB instead of SLUB.

But I rather hope that you or someone will understand the 7fe clue.

Thanks,
Hugh

WARNING: multiple messages have this Message-ID (diff)
From: Hugh Dickins <hughd@google.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brad Campbell <lists2009@fnarfbargle.com>,
	Borislav Petkov <bp@alien8.de>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>
Subject: Re: KVM induced panic on 2.6.38[2367] & 2.6.39
Date: Tue, 31 May 2011 21:52:56 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.00.1105312120530.22808@sister.anvils> (raw)
In-Reply-To: <20110601011527.GN19505@random.random>

On Wed, 1 Jun 2011, Andrea Arcangeli wrote:
> On Wed, Jun 01, 2011 at 08:37:25AM +0800, Brad Campbell wrote:
> > On 01/06/11 06:31, Hugh Dickins wrote:
> > > Brad, my suspicion is that in each case the top 16 bits of RDX have been
> > > mysteriously corrupted from ffff to 0000, causing the general protection
> > > faults.  I don't understand what that has to do with KSM.
> > >
> > > But it's only a suspicion, because I can't make sense of the "Code:"
> > > lines in your traces, they have more than the expected 64 bytes, and
> > > only one of them has a ">" (with no"<") to mark faulting instruction.
> > >
> > > I did try compiling the 2.6.39 kernel from your config, but of course
> > > we have different compilers, so although I got close, it wasn't exact.
> > >
> > > Would you mind mailing me privately (it's about 73MB) the "objdump -trd"
> > > output for your original vmlinux (with KSM on)?  (Those -trd options are
> > > the ones I'm used to typing, I bet not they're not all relevant.)
> > >
> > > Of course, it's only a tiny fraction of that output that I need,
> > > might be better to cut it down to remove_rmap_item_from_tree and
> > > dup_fd and ksm_scan_thread, if you have the time to do so.
> > 
> > Would you believe about 20 seconds after I pressed send the kernel oopsed.
> > 
> > http://www.fnarfbargle.com/private/003_kernel_oops/
> > 
> > oops reproduced here, but an un-munged version is in that directory 
> > alongside the kernel.
> > 
> > [36542.880228] general protection fault: 0000 [#1] SMP
> 
> Reminds me of another oops that was reported on the kvm list for
> 2.6.38.1 with message id 4D8C6110.6090204. There the top 16 bits of
> rsi were flipped and it was a general protection too because of
> hitting on the not mappable virtual range.
> 
> http://www.virtall.com/files/temp/kvm.txt
> http://www.virtall.com/files/temp/config-2.6.38.1
> http://virtall.com/files/temp/mmu-objdump.txt
> 
> That oops happened in kvm_unmap_rmapp though, but it looked memory
> corruption (Avi suggested use after free) but it was a production
> system so we couldn't debug it further.
> 
> I recommend next thing to reproduce again with 2.6.39 or
> 3.0.0-rc1. Let's fix your scsi trouble if needed but it's better you
> test with 2.6.39.

Brad, thanks for this and the other further crash, with vmlinux etc:
very helpful info.

Andrea, I'm pretty sure you're right to connect Brad's report with
the one above.

In four out of five of Brad's reports (cannot tell in the fifth),
the bad pointer (with top 16 bits 0000 instead of ffff) had been
loaded from SLUB memory at an address offset 0x7f8 (1 case) or
0xff8 (3 cases) i.e. it's the short at 0x7fe or 0xffe that has
been zeroed.

No reason to suspect KSM's rmap_item code, or file table handling:
they just seem to be the victims of corruption from elsewhere.

I notice %rax and %rsi, the corrupted pointer in your kvm.txt
case, is itself a ...7f8 address; and %r13 an ...ff8 address.
I've not even glanced at the code, but I wonder if that implies
that KVM is close to the origin of the corruption.

I doubt I'll be able to spend more time on this, hope you can
take over.

I guess Brad could try SLUB debugging, boot with slub_debug=P
for poisoning perhaps; though it might upset alignments and
drive the problem underground.  Or see if the same happens
with SLAB instead of SLUB.

But I rather hope that you or someone will understand the 7fe clue.

Thanks,
Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2011-06-01  4:53 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-31  1:24 KVM induced panic on 2.6.38[2367] & 2.6.39 Brad Campbell
2011-05-31  5:47 ` Borislav Petkov
2011-05-31  5:47   ` Borislav Petkov
2011-05-31  9:26   ` Brad Campbell
2011-05-31  9:26     ` Brad Campbell
2011-05-31 10:38     ` Borislav Petkov
2011-05-31 10:38       ` Borislav Petkov
2011-05-31 14:24       ` Brad Campbell
2011-05-31 14:24         ` Brad Campbell
2011-05-31 14:24         ` Brad Campbell
2011-05-31 22:31         ` Hugh Dickins
2011-05-31 22:31           ` Hugh Dickins
2011-06-01  0:18           ` Brad Campbell
2011-06-01  0:18             ` Brad Campbell
2011-06-01  0:37           ` Brad Campbell
2011-06-01  0:37             ` Brad Campbell
2011-06-01  1:15             ` Andrea Arcangeli
2011-06-01  1:15               ` Andrea Arcangeli
2011-06-01  2:03               ` Brad Campbell
2011-06-01  2:03                 ` Brad Campbell
2011-06-01  4:52               ` Hugh Dickins [this message]
2011-06-01  4:52                 ` Hugh Dickins
2011-06-01  6:31                 ` Brad Campbell
2011-06-01  6:31                   ` Brad Campbell
2011-06-01  6:56                   ` Avi Kivity
2011-06-01  6:56                     ` Avi Kivity
2011-06-01  9:29                     ` Brad Campbell
2011-06-01  9:29                       ` Brad Campbell
2011-06-01  9:29                       ` Brad Campbell
2011-06-01  9:40                       ` Avi Kivity
2011-06-01  9:40                         ` Avi Kivity
2011-06-01  9:41                         ` Avi Kivity
2011-06-01  9:41                           ` Avi Kivity
2011-06-01 10:53                           ` Brad Campbell
2011-06-01 10:53                             ` Brad Campbell
2011-06-01 11:09                             ` Avi Kivity
2011-06-01 11:09                               ` Avi Kivity
2011-06-01 11:18                             ` CaT
2011-06-01 11:18                               ` CaT
2011-06-01 11:52                               ` Brad Campbell
2011-06-01 11:52                                 ` Brad Campbell
2011-06-01 23:03                                 ` CaT
2011-06-01 23:03                                   ` CaT
2011-06-03 13:38                                   ` Brad Campbell
2011-06-03 13:38                                     ` Brad Campbell
2011-06-03 15:50                                     ` Bernhard Held
2011-06-03 15:50                                       ` Bernhard Held
2011-06-03 15:50                                       ` Bernhard Held
2011-06-03 16:07                                       ` Brad Campbell
2011-06-03 16:07                                         ` Brad Campbell
2011-06-06 20:10                                         ` Bart De Schuymer
2011-06-06 20:10                                           ` Bart De Schuymer
2011-06-06 20:23                                           ` Eric Dumazet
2011-06-06 20:23                                             ` Eric Dumazet
2011-06-06 20:23                                             ` Eric Dumazet
2011-06-07  3:33                                           ` Brad Campbell
2011-06-07  3:33                                             ` Brad Campbell
2011-06-07 13:30                                             ` Patrick McHardy
2011-06-07 13:30                                               ` Patrick McHardy
2011-06-07 14:40                                               ` Brad Campbell
2011-06-07 14:40                                                 ` Brad Campbell
2011-06-07 15:35                                                 ` Patrick McHardy
2011-06-07 15:35                                                   ` Patrick McHardy
2011-06-07 18:31                                                   ` Eric Dumazet
2011-06-07 18:31                                                     ` Eric Dumazet
2011-06-07 18:31                                                     ` Eric Dumazet
2011-06-07 22:57                                                     ` Patrick McHardy
2011-06-07 22:57                                                       ` Patrick McHardy
2011-06-07 22:57                                                       ` Patrick McHardy
2011-06-08  0:18                                                       ` Brad Campbell
2011-06-08  0:18                                                         ` Brad Campbell
2011-06-08  0:18                                                         ` Brad Campbell
2011-06-08  3:59                                                         ` Eric Dumazet
2011-06-08  3:59                                                           ` Eric Dumazet
2011-06-08  3:59                                                           ` Eric Dumazet
2011-06-08 17:02                                                           ` Brad Campbell
2011-06-08 17:02                                                             ` Brad Campbell
2011-06-08 17:02                                                             ` Brad Campbell
2011-06-08 21:22                                                             ` Eric Dumazet
2011-06-08 21:22                                                               ` Eric Dumazet
2011-06-08 21:22                                                               ` Eric Dumazet
2011-06-10  2:52                                                             ` Simon Horman
2011-06-10  2:52                                                               ` Simon Horman
2011-06-10 12:37                                                               ` Mark Lord
2011-06-10 12:37                                                                 ` Mark Lord
2011-06-10 16:43                                                                 ` Henrique de Moraes Holschuh
2011-06-10 16:43                                                                   ` Henrique de Moraes Holschuh
2011-06-12 15:38                                                               ` Avi Kivity
2011-06-12 15:38                                                                 ` Avi Kivity
2011-06-07 23:43                                                   ` Brad Campbell
2011-06-07 23:43                                                     ` Brad Campbell
2011-06-07 18:04                                                 ` Bart De Schuymer
2011-06-07 18:04                                                   ` Bart De Schuymer
2011-06-08  0:15                                                   ` Brad Campbell
2011-06-08  0:15                                                     ` Brad Campbell
2011-06-05  8:14                                     ` Avi Kivity
2011-06-05  8:14                                       ` Avi Kivity
2011-06-05 13:45                                       ` Brad Campbell
2011-06-05 13:45                                         ` Brad Campbell
2011-06-05 13:58                                         ` Avi Kivity
2011-06-05 13:58                                           ` Avi Kivity
2011-06-06 20:22                                         ` Eric Dumazet
2011-06-06 20:22                                           ` Eric Dumazet
2011-06-06 20:22                                           ` Eric Dumazet
2011-06-07 13:27                                           ` Brad Campbell
2011-06-07 13:37                                             ` Eric Dumazet
2011-06-07 15:15                                               ` Brad Campbell
2011-08-20 13:16                                               ` Brad Campbell
2011-08-22  6:36                                                 ` Avi Kivity
2011-08-22  6:45                                                   ` Eric Dumazet
2011-08-22 11:45                                                     ` Brad Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.00.1105312120530.22808@sister.anvils \
    --to=hughd@google.com \
    --cc=aarcange@redhat.com \
    --cc=bp@alien8.de \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lists2009@fnarfbargle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.