Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
From: "Luck, Tony" <tony.luck@intel.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	linux-edac@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Jane Chu <jane.chu@oracle.com>
Subject: Re: [RFC] Make the memory failure blast radius more precise
Date: Tue, 23 Jun 2020 15:26:58 -0700
Message-ID: <20200623222658.GA21817@agluck-desk2.amr.corp.intel.com> (raw)
In-Reply-To: <20200623221741.GH21350@casper.infradead.org>

On Tue, Jun 23, 2020 at 11:17:41PM +0100, Matthew Wilcox wrote:
> It might also be nice to have an madvise() MADV_ZERO option so the
> application doesn't have to look up the fd associated with that memory
> range, but we haven't floated that idea with the customer yet; I just
> thought of it now.

So the conversation between OS and kernel goes like this?

1) machine check
2) Kernel unmaps the 4K page surroundinng the poison and sends
   SIGBUS to the application to say that one cache line is gone
3) App says madvise(MADV_ZERO, that cache line)
4) Kernel says ... "oh, you know how to deal with this" and allocates
   a new page, copying the 63 good cache lines from the old page and
   zeroing the missing one. New page is mapped to user.

Do you have folks lined up to use that?  I don't know that many
folks are even catching the SIGBUS :-(

-Tony

  reply index

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-23 20:17 Matthew Wilcox
2020-06-23 21:48 ` Dan Williams
2020-06-23 22:04 ` Luck, Tony
2020-06-23 22:17   ` Matthew Wilcox
2020-06-23 22:26     ` Luck, Tony [this message]
2020-06-23 22:40       ` Matthew Wilcox
2020-06-24  0:01         ` Darrick J. Wong
2020-06-24 12:10           ` Matthew Wilcox
2020-06-24 23:21             ` Dan Williams
2020-06-25  0:17               ` Matthew Wilcox
2020-06-25  1:18                 ` Dan Williams
2020-06-24 21:22         ` Jane Chu
2020-06-25  0:13           ` Luck, Tony
2020-06-25 16:23             ` Jane Chu
2020-06-24  4:32   ` David Rientjes
2020-06-24 20:57     ` Jane Chu
2020-06-24 22:01       ` David Rientjes
2020-06-25  2:16     ` HORIGUCHI NAOYA(堀口 直也)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200623222658.GA21817@agluck-desk2.amr.corp.intel.com \
    --to=tony.luck@intel.com \
    --cc=bp@alien8.de \
    --cc=darrick.wong@oracle.com \
    --cc=jane.chu@oracle.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org
	public-inbox-index linux-edac

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git