From: Tim Deegan <Tim.Deegan@citrix.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
	<xen-devel@lists.xensource.com>, Keir Fraser <keir@xen.org>,
	"containers@lists.linux-foundation.org" 
	<containers@lists.linux-foundation.org>,
	Li Zefan <lizf@cn.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.cz>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Keir Fraser <keir.xen@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>,
	Paul Menage <menage@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>
Subject: Re: [Xen-devel] Possible shadow bug (was: Re: [PATCH] memcg: do not expose uninitialized mem_cgroup_per_node to world)
Date: Thu, 9 Jun 2011 16:01:33 +0100
Message-ID: <20110609150133.GF5098@whitby.uk.xensource.com>
In-Reply-To: <alpine.DEB.2.00.1106091311530.12963@kaball-desktop>

At 13:40 +0100 on 09 Jun (1307626812), Stefano Stabellini wrote:
> CC'ing xen-devel and Tim.
> 
> This is a comment from a previous email in the thread:
> 
> > It is most easily reproduced on a Xen HVM 32-bit guest under heavy vcpu
> > contention for real cpu resources (i.e. I had to overcommit cpus and
> > run several cpu hog tasks on the host to make the guest crash on a
> > reboot cycle). And from the latest experiments, the crash happens only
> > on hosts that don't have the hap feature or if hap is disabled in the
> > hypervisor.
> 
> it makes me think that it is a shadow pagetables bug; see details below.
> You can find more details on it following this thread on the lkml.

Oh dear.  I'm having a look at the Linux code now to try and understand
the behaviour.  In the meantime, what version of Xen was this on?  If
you're willing to try recompiling Xen with some small patches that
disable the "cleverer" parts of the shadow pagetable code, that might
indicate something.  (Of course, it might just change the timing enough
to obscure a real Linux bug too.)

The only time I've seen a corruption like this, with a mapping
transiently going to the wrong frame, it turned out to be caused by
32-bit pagetable-handling code writing a PAE PTE with a single 64-bit
write (which is not atomic on x86-32), and the TLB happening to see the
intermediate, half-written entry.  I doubt that there's any bug like
that in Linux, though, or we'd surely have seen it before now.
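
To make that failure mode concrete, here is a minimal sketch in C.  It
is a toy model, not Xen or Linux code: the pae_pte_t type, the helper
names and the frame numbers are all made up for illustration, and it
assumes only that a PAE PTE is 64 bits wide with the present bit in
bit 0 and the frame number starting at bit 12.  It shows the value a
page walk could pick up if a "64-bit" update lands as two 32-bit
stores, and one conservative store order that never exposes a present
entry pointing at the wrong frame.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a 64-bit PAE PTE held as two 32-bit halves.
 * Bit 0 is the present bit; the frame number starts at bit 12. */
#define PTE_PRESENT   0x1ULL
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

typedef struct {
    uint32_t low;   /* present bit, flags, low frame bits */
    uint32_t high;  /* high frame bits */
} pae_pte_t;

/* Build a present PTE for the given (hypothetical) frame number. */
static pae_pte_t make_pte(uint64_t frame)
{
    uint64_t v = (frame << 12) | PTE_PRESENT;
    pae_pte_t p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static int pte_present(pae_pte_t p)
{
    return (p.low & PTE_PRESENT) != 0;
}

static uint64_t pte_frame(pae_pte_t p)
{
    uint64_t v = ((uint64_t)p.high << 32) | p.low;
    return (v & PTE_ADDR_MASK) >> 12;
}

/* The entry a concurrent page walk could observe when only the first
 * of the two 32-bit stores (here, the low half) has landed. */
static pae_pte_t torn_after_low_store(pae_pte_t old, pae_pte_t next)
{
    old.low = next.low;
    return old;
}

/* One conservative ordering: clear the low half (dropping the present
 * bit), then write the high half, then write the new low half.  The
 * three visible states are recorded in states[].  Real code would also
 * need memory barriers and a TLB flush, which this model omits. */
static void careful_update_states(pae_pte_t old, pae_pte_t next,
                                  pae_pte_t states[3])
{
    old.low = 0;          states[0] = old;
    old.high = next.high; states[1] = old;
    old.low = next.low;   states[2] = old;
}
```

With an old frame of 0x123456 and a new frame of 0x2abcde, the torn
entry is present and names frame 0x1abcde, which is neither the old nor
the new mapping.  With the careful ordering, the two intermediate
states are not present, so a walk either faults or sees the final
entry.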

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
