linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] HWPOISON: improve logging
@ 2012-11-02 16:33 Naoya Horiguchi
  2012-11-02 16:33 ` [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean Naoya Horiguchi
  2012-11-02 16:33 ` [PATCH 2/2 v2] mm: print out information of file affected by memory error Naoya Horiguchi
  0 siblings, 2 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-02 16:33 UTC (permalink / raw)
  To: Tony Luck, Andi Kleen
  Cc: Wu Fengguang, Andrew Morton, Ingo Molnar, Jun'ichi Nomura,
	Naoya Horiguchi, linux-kernel, linux-mm

Hello,

These 2 patches fix or add the kernel messages which help users to
know what kind of pages are hit by errors and/or how the impact is.

Originally these were posted as part of patchsets which are pending
due to unsolved issues, but these are simple enough and related only
to memory error handling (no change on IO error handling,) so I think
these 2 can be separated from whole things and go into mainline first.

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean
  2012-11-02 16:33 [PATCH 0/2] HWPOISON: improve logging Naoya Horiguchi
@ 2012-11-02 16:33 ` Naoya Horiguchi
  2012-11-05 21:56   ` Andrew Morton
  2012-11-02 16:33 ` [PATCH 2/2 v2] mm: print out information of file affected by memory error Naoya Horiguchi
  1 sibling, 1 reply; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-02 16:33 UTC (permalink / raw)
  To: Tony Luck, Andi Kleen
  Cc: Wu Fengguang, Andrew Morton, Ingo Molnar, Jun'ichi Nomura,
	Naoya Horiguchi, linux-kernel, linux-mm

action_result() fails to print out "dirty" even if an error occurred on a
dirty pagecache, because when we check PageDirty in action_result() it was
cleared after page isolation even if it's dirty before error handling. This
can break some applications that monitor this message, so should be fixed.

There are several callers of action_result() except page_action(), but
either of them are not for LRU pages but for free pages or kernel pages,
so we don't have to consider dirty or not for them.

Note that PG_dirty can be set outside page locks as described in commit
554940dc8c1e, so this patch does not completely closes the race window,
but just narrows it.

Changelog v2:
  - Add comment about setting PG_dirty outside page lock

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
---
 mm/memory-failure.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git v3.7-rc3.orig/mm/memory-failure.c v3.7-rc3/mm/memory-failure.c
index 6c5899b..4377de9 100644
--- v3.7-rc3.orig/mm/memory-failure.c
+++ v3.7-rc3/mm/memory-failure.c
@@ -781,16 +781,16 @@ static struct page_state {
 	{ compound,	compound,	"huge",		me_huge_page },
 #endif
 
-	{ sc|dirty,	sc|dirty,	"swapcache",	me_swapcache_dirty },
-	{ sc|dirty,	sc,		"swapcache",	me_swapcache_clean },
+	{ sc|dirty,	sc|dirty,	"dirty swapcache",	me_swapcache_dirty },
+	{ sc|dirty,	sc,		"clean swapcache",	me_swapcache_clean },
 
-	{ unevict|dirty, unevict|dirty,	"unevictable LRU", me_pagecache_dirty},
-	{ unevict,	unevict,	"unevictable LRU", me_pagecache_clean},
+	{ unevict|dirty, unevict|dirty,	"dirty unevictable LRU", me_pagecache_dirty },
+	{ unevict,	unevict,	"clean unevictable LRU", me_pagecache_clean },
 
-	{ mlock|dirty,	mlock|dirty,	"mlocked LRU",	me_pagecache_dirty },
-	{ mlock,	mlock,		"mlocked LRU",	me_pagecache_clean },
+	{ mlock|dirty,	mlock|dirty,	"dirty mlocked LRU",	me_pagecache_dirty },
+	{ mlock,	mlock,		"clean mlocked LRU",	me_pagecache_clean },
 
-	{ lru|dirty,	lru|dirty,	"LRU",		me_pagecache_dirty },
+	{ lru|dirty,	lru|dirty,	"dirty LRU",	me_pagecache_dirty },
 	{ lru|dirty,	lru,		"clean LRU",	me_pagecache_clean },
 
 	/*
@@ -812,14 +812,14 @@ static struct page_state {
 #undef slab
 #undef reserved
 
+/*
+ * "Dirty/Clean" indication is not 100% accurate due to the possibility of
+ * setting PG_dirty outside page lock. See also comment above set_page_dirty().
+ */
 static void action_result(unsigned long pfn, char *msg, int result)
 {
-	struct page *page = pfn_to_page(pfn);
-
-	printk(KERN_ERR "MCE %#lx: %s%s page recovery: %s\n",
-		pfn,
-		PageDirty(page) ? "dirty " : "",
-		msg, action_name[result]);
+	pr_err("MCE %#lx: %s page recovery: %s\n",
+		pfn, msg, action_name[result]);
 }
 
 static int page_action(struct page_state *ps, struct page *p,
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-02 16:33 [PATCH 0/2] HWPOISON: improve logging Naoya Horiguchi
  2012-11-02 16:33 ` [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean Naoya Horiguchi
@ 2012-11-02 16:33 ` Naoya Horiguchi
  2012-11-05 22:01   ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-02 16:33 UTC (permalink / raw)
  To: Tony Luck, Andi Kleen
  Cc: Wu Fengguang, Andrew Morton, Ingo Molnar, Jun'ichi Nomura,
	Naoya Horiguchi, linux-kernel, linux-mm, Jan Kara

Printing out the information about which file can be affected by a
memory error in generic_error_remove_page() is helpful for user to
estimate the impact of the error.

Changelog v2:
  - dereference mapping->host after if (!mapping) check for robustness

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Jan Kara <jack@suse.cz>
---
 mm/truncate.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git v3.7-rc3.orig/mm/truncate.c v3.7-rc3/mm/truncate.c
index d51ce92..db1b216 100644
--- v3.7-rc3.orig/mm/truncate.c
+++ v3.7-rc3/mm/truncate.c
@@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
  */
 int generic_error_remove_page(struct address_space *mapping, struct page *page)
 {
+	struct inode *inode;
+
 	if (!mapping)
 		return -EINVAL;
+	inode = mapping->host;
 	/*
 	 * Only punch for normal data pages for now.
 	 * Handling other types like directories would need more auditing.
 	 */
-	if (!S_ISREG(mapping->host->i_mode))
+	if (!S_ISREG(inode->i_mode))
 		return -EIO;
+	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
+		page_to_pfn(page), page_index(page),
+		inode->i_ino, inode->i_sb->s_id);
 	return truncate_inode_page(mapping, page);
 }
 EXPORT_SYMBOL(generic_error_remove_page);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean
  2012-11-02 16:33 ` [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean Naoya Horiguchi
@ 2012-11-05 21:56   ` Andrew Morton
  2012-11-05 22:40     ` [PATCH v3] HWPOISON: fix action_result() to print out dirty/clean (Re: [PATCH 1/2 v2] HWPOISON: fix action_result() to print out) dirty/clean Naoya Horiguchi
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2012-11-05 21:56 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, Ingo Molnar,
	Jun'ichi Nomura, linux-kernel, linux-mm

On Fri,  2 Nov 2012 12:33:12 -0400
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> action_result() fails to print out "dirty" even if an error occurred on a
> dirty pagecache, because when we check PageDirty in action_result() it was
> cleared after page isolation even if it's dirty before error handling. This
> can break some applications that monitor this message, so should be fixed.
> 
> There are several callers of action_result() except page_action(), but
> either of them are not for LRU pages but for free pages or kernel pages,
> so we don't have to consider dirty or not for them.
> 
> Note that PG_dirty can be set outside page locks as described in commit
> 554940dc8c1e, so this patch does not completely closes the race window,
> but just narrows it.

I can find no commit 554940dc8c1e.  What commit are you referring to here?

This is one of the reasons why we ask people to refer to commits by
both hash and by name, using the form

078de5f706ece3 ("userns: Store uid and gid values in struct cred with
kuid_t and kgid_t types")

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-02 16:33 ` [PATCH 2/2 v2] mm: print out information of file affected by memory error Naoya Horiguchi
@ 2012-11-05 22:01   ` Andrew Morton
  2012-11-06  5:07     ` Naoya Horiguchi
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2012-11-05 22:01 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, Ingo Molnar,
	Jun'ichi Nomura, linux-kernel, linux-mm, Jan Kara

On Fri,  2 Nov 2012 12:33:13 -0400
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Printing out the information about which file can be affected by a
> memory error in generic_error_remove_page() is helpful for user to
> estimate the impact of the error.
> 
> Changelog v2:
>   - dereference mapping->host after if (!mapping) check for robustness
> 
> ...
>
> --- v3.7-rc3.orig/mm/truncate.c
> +++ v3.7-rc3/mm/truncate.c
> @@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
>   */
>  int generic_error_remove_page(struct address_space *mapping, struct page *page)
>  {
> +	struct inode *inode;
> +
>  	if (!mapping)
>  		return -EINVAL;
> +	inode = mapping->host;
>  	/*
>  	 * Only punch for normal data pages for now.
>  	 * Handling other types like directories would need more auditing.
>  	 */
> -	if (!S_ISREG(mapping->host->i_mode))
> +	if (!S_ISREG(inode->i_mode))
>  		return -EIO;
> +	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
> +		page_to_pfn(page), page_index(page),
> +		inode->i_ino, inode->i_sb->s_id);
>  	return truncate_inode_page(mapping, page);
>  }
>  EXPORT_SYMBOL(generic_error_remove_page);

A couple of things.

- I worry that if a hardware error occurs, it might affect a large
  amount of memory all at the same time.  For example, if a 4G memory
  block goes bad, this message will be printed a million times?

- hard-wiring "MCE" in here seems a bit of a layering violation? 
  What right does the generic, core .error_remove_page() implementation
  have to assume that it was called because of an MCE?  Many CPU types
  don't eveh have such a thing?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3] HWPOISON: fix action_result() to print out dirty/clean (Re: [PATCH 1/2 v2] HWPOISON: fix action_result() to print out) dirty/clean
  2012-11-05 21:56   ` Andrew Morton
@ 2012-11-05 22:40     ` Naoya Horiguchi
  0 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-05 22:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Tony Luck, Andi Kleen, Wu Fengguang,
	Ingo Molnar, Jun'ichi Nomura, linux-kernel, linux-mm

On Mon, Nov 05, 2012 at 01:56:28PM -0800, Andrew Morton wrote:
> On Fri,  2 Nov 2012 12:33:12 -0400
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
>
> > action_result() fails to print out "dirty" even if an error occurred on a
> > dirty pagecache, because when we check PageDirty in action_result() it was
> > cleared after page isolation even if it's dirty before error handling. This
> > can break some applications that monitor this message, so should be fixed.
> >
> > There are several callers of action_result() except page_action(), but
> > either of them are not for LRU pages but for free pages or kernel pages,
> > so we don't have to consider dirty or not for them.
> >
> > Note that PG_dirty can be set outside page locks as described in commit
> > 554940dc8c1e, so this patch does not completely closes the race window,
> > but just narrows it.
>
> I can find no commit 554940dc8c1e.  What commit are you referring to here?

Sorry, I pointed to a wrong ID somehow. Here is one I intended:

  commit 6746aff74da293b5fd24e5c68b870b721e86cd5f
  Author: Wu Fengguang <fengguang.wu@intel.com>
  Date:   Wed Sep 16 11:50:14 2009 +0200

      HWPOISON: shmem: call set_page_dirty() with locked page

Could you replace the previous one with an attached one?

> This is one of the reasons why we ask people to refer to commits by
> both hash and by name, using the form
>
> 078de5f706ece3 ("userns: Store uid and gid values in struct cred with
> kuid_t and kgid_t types")

OK, I'll keep this in my mind.

Naoya

---
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Fri, 2 Nov 2012 13:44:41 -0400
Subject: [PATCH v3] HWPOISON: fix action_result() to print out dirty/clean

action_result() fails to print out "dirty" even if an error occurred on a
dirty pagecache, because when we check PageDirty in action_result() it was
cleared after page isolation even if it's dirty before error handling. This
can break some applications that monitor this message, so should be fixed.

There are several callers of action_result() except page_action(), but
either of them are not for LRU pages but for free pages or kernel pages,
so we don't have to consider dirty or not for them.

Note that PG_dirty can be set outside page locks as described in commit
6746aff74da29 ("HWPOISON: shmem: call set_page_dirty() with locked page"),
so this patch does not completely closes the race window, but just narrows it.

Changelog v3:
  - fix commit ID in description

Changelog v2:
  - Add comment about setting PG_dirty outside page lock

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
---
 mm/memory-failure.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1abffee..01509aa 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -781,16 +781,16 @@ static struct page_state {
 	{ compound,	compound,	"huge",		me_huge_page },
 #endif
 
-	{ sc|dirty,	sc|dirty,	"swapcache",	me_swapcache_dirty },
-	{ sc|dirty,	sc,		"swapcache",	me_swapcache_clean },
+	{ sc|dirty,	sc|dirty,	"dirty swapcache",	me_swapcache_dirty },
+	{ sc|dirty,	sc,		"clean swapcache",	me_swapcache_clean },
 
-	{ unevict|dirty, unevict|dirty,	"unevictable LRU", me_pagecache_dirty},
-	{ unevict,	unevict,	"unevictable LRU", me_pagecache_clean},
+	{ unevict|dirty, unevict|dirty,	"dirty unevictable LRU", me_pagecache_dirty },
+	{ unevict,	unevict,	"clean unevictable LRU", me_pagecache_clean },
 
-	{ mlock|dirty,	mlock|dirty,	"mlocked LRU",	me_pagecache_dirty },
-	{ mlock,	mlock,		"mlocked LRU",	me_pagecache_clean },
+	{ mlock|dirty,	mlock|dirty,	"dirty mlocked LRU",	me_pagecache_dirty },
+	{ mlock,	mlock,		"clean mlocked LRU",	me_pagecache_clean },
 
-	{ lru|dirty,	lru|dirty,	"LRU",		me_pagecache_dirty },
+	{ lru|dirty,	lru|dirty,	"dirty LRU",	me_pagecache_dirty },
 	{ lru|dirty,	lru,		"clean LRU",	me_pagecache_clean },
 
 	/*
@@ -812,14 +812,14 @@ static struct page_state {
 #undef slab
 #undef reserved
 
+/*
+ * "Dirty/Clean" indication is not 100% accurate due to the possibility of
+ * setting PG_dirty outside page lock. See also comment above set_page_dirty().
+ */
 static void action_result(unsigned long pfn, char *msg, int result)
 {
-	struct page *page = pfn_to_page(pfn);
-
-	printk(KERN_ERR "MCE %#lx: %s%s page recovery: %s\n",
-		pfn,
-		PageDirty(page) ? "dirty " : "",
-		msg, action_name[result]);
+	pr_err("MCE %#lx: %s page recovery: %s\n",
+		pfn, msg, action_name[result]);
 }
 
 static int page_action(struct page_state *ps, struct page *p,
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-05 22:01   ` Andrew Morton
@ 2012-11-06  5:07     ` Naoya Horiguchi
  2012-11-06 20:12       ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-06  5:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Tony Luck, Andi Kleen, Wu Fengguang,
	Ingo Molnar, Jun'ichi Nomura, linux-kernel, linux-mm,
	Jan Kara

On Mon, Nov 05, 2012 at 02:01:54PM -0800, Andrew Morton wrote:
> On Fri,  2 Nov 2012 12:33:13 -0400
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > Printing out the information about which file can be affected by a
> > memory error in generic_error_remove_page() is helpful for user to
> > estimate the impact of the error.
> > 
> > Changelog v2:
> >   - dereference mapping->host after if (!mapping) check for robustness
> > 
> > ...
> >
> > --- v3.7-rc3.orig/mm/truncate.c
> > +++ v3.7-rc3/mm/truncate.c
> > @@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
> >   */
> >  int generic_error_remove_page(struct address_space *mapping, struct page *page)
> >  {
> > +	struct inode *inode;
> > +
> >  	if (!mapping)
> >  		return -EINVAL;
> > +	inode = mapping->host;
> >  	/*
> >  	 * Only punch for normal data pages for now.
> >  	 * Handling other types like directories would need more auditing.
> >  	 */
> > -	if (!S_ISREG(mapping->host->i_mode))
> > +	if (!S_ISREG(inode->i_mode))
> >  		return -EIO;
> > +	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
> > +		page_to_pfn(page), page_index(page),
> > +		inode->i_ino, inode->i_sb->s_id);
> >  	return truncate_inode_page(mapping, page);
> >  }
> >  EXPORT_SYMBOL(generic_error_remove_page);
> 
> A couple of things.
> 
> - I worry that if a hardware error occurs, it might affect a large
>   amount of memory all at the same time.  For example, if a 4G memory
>   block goes bad, this message will be printed a million times?

If the error on 4G memory block triggered by SRAO MCE and these 1M pages
are all pagecache pages, the answer is yes.
But I think that if it's a whole DIMM error, it should be reported by
another type of MCE than SRAO, so printing a million times seems to be
unlikely to happen.

> - hard-wiring "MCE" in here seems a bit of a layering violation? 
>   What right does the generic, core .error_remove_page() implementation
>   have to assume that it was called because of an MCE?

OK, we need not assume that. I change "MCE " prefix to more specific
one like "Memory error ".

> Many CPU types don't eveh have such a thing?

No. At least currently, only SRAO MCE triggers memory_failure() and
it's defined only on some newest highend models of Intel CPUs.

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-06  5:07     ` Naoya Horiguchi
@ 2012-11-06 20:12       ` Andrew Morton
  2012-11-06 22:45         ` Naoya Horiguchi
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2012-11-06 20:12 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, Ingo Molnar,
	Jun'ichi Nomura, linux-kernel, linux-mm, Jan Kara

On Tue,  6 Nov 2012 00:07:53 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> On Mon, Nov 05, 2012 at 02:01:54PM -0800, Andrew Morton wrote:
> > On Fri,  2 Nov 2012 12:33:13 -0400
> > Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> > 
> > > Printing out the information about which file can be affected by a
> > > memory error in generic_error_remove_page() is helpful for user to
> > > estimate the impact of the error.
> > > 
> > > Changelog v2:
> > >   - dereference mapping->host after if (!mapping) check for robustness
> > > 
> > > ...
> > >
> > > --- v3.7-rc3.orig/mm/truncate.c
> > > +++ v3.7-rc3/mm/truncate.c
> > > @@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
> > >   */
> > >  int generic_error_remove_page(struct address_space *mapping, struct page *page)
> > >  {
> > > +	struct inode *inode;
> > > +
> > >  	if (!mapping)
> > >  		return -EINVAL;
> > > +	inode = mapping->host;
> > >  	/*
> > >  	 * Only punch for normal data pages for now.
> > >  	 * Handling other types like directories would need more auditing.
> > >  	 */
> > > -	if (!S_ISREG(mapping->host->i_mode))
> > > +	if (!S_ISREG(inode->i_mode))
> > >  		return -EIO;
> > > +	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
> > > +		page_to_pfn(page), page_index(page),
> > > +		inode->i_ino, inode->i_sb->s_id);
> > >  	return truncate_inode_page(mapping, page);
> > >  }
> > >  EXPORT_SYMBOL(generic_error_remove_page);
> > 
> > A couple of things.
> > 
> > - I worry that if a hardware error occurs, it might affect a large
> >   amount of memory all at the same time.  For example, if a 4G memory
> >   block goes bad, this message will be printed a million times?
> 
> If the error on 4G memory block triggered by SRAO MCE and these 1M pages
> are all pagecache pages, the answer is yes.

Well that's bad.

> But I think that if it's a whole DIMM error, it should be reported by
> another type of MCE than SRAO, so printing a million times seems to be
> unlikely to happen.

"should be" and "unlikely" aren't very reassuring things to hear! 
Emitting a million lines into syslog is pretty poor behaviour and
should be reliably avoided.

> > - hard-wiring "MCE" in here seems a bit of a layering violation? 
> >   What right does the generic, core .error_remove_page() implementation
> >   have to assume that it was called because of an MCE?
> 
> OK, we need not assume that. I change "MCE " prefix to more specific
> one like "Memory error ".
> 
> > Many CPU types don't eveh have such a thing?
> 
> No. At least currently, only SRAO MCE triggers memory_failure() and
> it's defined only on some newest highend models of Intel CPUs.

Again, your reply is full of assumptions about one particualar
implementation on one particular CPU.  But this is generic,
cross-architecture code!

Now, it's pretty harmless to make these assumptions at this time.  But
this new code will need to redone if/when other CPU types come along,
and because there's a printk in there, that rework will cause
user-visible changes in kernel behaviour.  It would be best if we can
just avoid the problem on day one.

Maybe move the printk into x86-specific code?  And just one printk
please - not a million!


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-06 20:12       ` Andrew Morton
@ 2012-11-06 22:45         ` Naoya Horiguchi
  2012-11-06 22:52           ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Naoya Horiguchi @ 2012-11-06 22:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Tony Luck, Andi Kleen, Wu Fengguang,
	Ingo Molnar, Jun'ichi Nomura, linux-kernel, linux-mm,
	Jan Kara

On Tue, Nov 06, 2012 at 12:12:20PM -0800, Andrew Morton wrote:
> On Tue,  6 Nov 2012 00:07:53 -0500
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > On Mon, Nov 05, 2012 at 02:01:54PM -0800, Andrew Morton wrote:
> > > On Fri,  2 Nov 2012 12:33:13 -0400
> > > Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> > > 
> > > > Printing out the information about which file can be affected by a
> > > > memory error in generic_error_remove_page() is helpful for user to
> > > > estimate the impact of the error.
> > > > 
> > > > Changelog v2:
> > > >   - dereference mapping->host after if (!mapping) check for robustness
> > > > 
> > > > ...
> > > >
> > > > --- v3.7-rc3.orig/mm/truncate.c
> > > > +++ v3.7-rc3/mm/truncate.c
> > > > @@ -151,14 +151,20 @@ int truncate_inode_page(struct address_space *mapping, struct page *page)
> > > >   */
> > > >  int generic_error_remove_page(struct address_space *mapping, struct page *page)
> > > >  {
> > > > +	struct inode *inode;
> > > > +
> > > >  	if (!mapping)
> > > >  		return -EINVAL;
> > > > +	inode = mapping->host;
> > > >  	/*
> > > >  	 * Only punch for normal data pages for now.
> > > >  	 * Handling other types like directories would need more auditing.
> > > >  	 */
> > > > -	if (!S_ISREG(mapping->host->i_mode))
> > > > +	if (!S_ISREG(inode->i_mode))
> > > >  		return -EIO;
> > > > +	pr_info("MCE %#lx: file info pgoff:%lu, inode:%lu, dev:%s\n",
> > > > +		page_to_pfn(page), page_index(page),
> > > > +		inode->i_ino, inode->i_sb->s_id);
> > > >  	return truncate_inode_page(mapping, page);
> > > >  }
> > > >  EXPORT_SYMBOL(generic_error_remove_page);
> > > 
> > > A couple of things.
> > > 
> > > - I worry that if a hardware error occurs, it might affect a large
> > >   amount of memory all at the same time.  For example, if a 4G memory
> > >   block goes bad, this message will be printed a million times?
> > 
> > If the error on 4G memory block triggered by SRAO MCE and these 1M pages
> > are all pagecache pages, the answer is yes.
> 
> Well that's bad.
> 
> > But I think that if it's a whole DIMM error, it should be reported by
> > another type of MCE than SRAO, so printing a million times seems to be
> > unlikely to happen.
> 
> "should be" and "unlikely" aren't very reassuring things to hear! 
> Emitting a million lines into syslog is pretty poor behaviour and
> should be reliably avoided.

So capping maximum lines of messages per some duration (a hour or a day)
is a possible option. BTW, even if we don't apply this patch, the kernel
can emit million lines of messages in the above-mentioned situation because
each memory error event emits a message like "MCE 0x3f57f4: dirty LRU page
recovery: Ignored" on syslog. If it's also bad, we need to do capping
also over existing printk()s, right?

> > > - hard-wiring "MCE" in here seems a bit of a layering violation? 
> > >   What right does the generic, core .error_remove_page() implementation
> > >   have to assume that it was called because of an MCE?
> > 
> > OK, we need not assume that. I change "MCE " prefix to more specific
> > one like "Memory error ".
> > 
> > > Many CPU types don't eveh have such a thing?
> > 
> > No. At least currently, only SRAO MCE triggers memory_failure() and
> > it's defined only on some newest highend models of Intel CPUs.
> 
> Again, your reply is full of assumptions about one particualar
> implementation on one particular CPU.  But this is generic,
> cross-architecture code!
> 
> Now, it's pretty harmless to make these assumptions at this time.  But
> this new code will need to redone if/when other CPU types come along,
> and because there's a printk in there, that rework will cause
> user-visible changes in kernel behaviour.  It would be best if we can
> just avoid the problem on day one.
> 
> Maybe move the printk into x86-specific code?  And just one printk
> please - not a million!

OK, we need some cleanup for this at first.

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2 v2] mm: print out information of file affected by memory error
  2012-11-06 22:45         ` Naoya Horiguchi
@ 2012-11-06 22:52           ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2012-11-06 22:52 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, Ingo Molnar,
	Jun'ichi Nomura, linux-kernel, linux-mm, Jan Kara

On Tue,  6 Nov 2012 17:45:05 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> > "should be" and "unlikely" aren't very reassuring things to hear! 
> > Emitting a million lines into syslog is pretty poor behaviour and
> > should be reliably avoided.
> 
> So capping maximum lines of messages per some duration (a hour or a day)
> is a possible option. BTW, even if we don't apply this patch, the kernel
> can emit million lines of messages in the above-mentioned situation because
> each memory error event emits a message like "MCE 0x3f57f4: dirty LRU page
> recovery: Ignored" on syslog. If it's also bad, we need to do capping
> also over existing printk()s, right?

Yes, that sounds like a bug report waiting to happen.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-11-06 22:52 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-02 16:33 [PATCH 0/2] HWPOISON: improve logging Naoya Horiguchi
2012-11-02 16:33 ` [PATCH 1/2 v2] HWPOISON: fix action_result() to print out dirty/clean Naoya Horiguchi
2012-11-05 21:56   ` Andrew Morton
2012-11-05 22:40     ` [PATCH v3] HWPOISON: fix action_result() to print out dirty/clean (Re: [PATCH 1/2 v2] HWPOISON: fix action_result() to print out) dirty/clean Naoya Horiguchi
2012-11-02 16:33 ` [PATCH 2/2 v2] mm: print out information of file affected by memory error Naoya Horiguchi
2012-11-05 22:01   ` Andrew Morton
2012-11-06  5:07     ` Naoya Horiguchi
2012-11-06 20:12       ` Andrew Morton
2012-11-06 22:45         ` Naoya Horiguchi
2012-11-06 22:52           ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).