From: Andrew Morton <akpm@linux-foundation.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: James Morse <james.morse@arm.com>,
	linux-mm@kvack.org, linux-acpi@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Len Brown <lenb@kernel.org>, Tony Luck <tony.luck@intel.com>,
	Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Tyler Baicar <tyler@amperecomputing.com>,
	Xie XiuQi <xiexiuqi@huawei.com>
Subject: Re: [PATCH v2 1/3] mm/memory-failure: Add memory_failure_queue_kick()
Date: Mon, 18 May 2020 12:58:28 -0700
Message-ID: <20200518125828.e4e3973c743556e976c5ee65@linux-foundation.org>
In-Reply-To: <49686237.p6yG9EJavU@kreacher>

On Mon, 18 May 2020 14:45:05 +0200 "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, May 1, 2020 6:45:41 PM CEST James Morse wrote:
> > The GHES code calls memory_failure_queue() from IRQ context to schedule
> > work on the current CPU so that memory_failure() can sleep.
> > 
> > For synchronous memory errors the arch code needs to know any signals
> > that memory_failure() will trigger are pending before it returns to
> > user-space, possibly when exiting from the IRQ.
> > 
> > Add a helper to kick the memory failure queue, to ensure the scheduled
> > work has happened. This has to be called from process context, so may
> > have been migrated from the original cpu. Pass the cpu the work was
> > queued on.
> > 
> > Change memory_failure_work_func() to permit being called on the 'wrong'
> > cpu.
> > 
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3012,6 +3012,7 @@ enum mf_flags {
> >  };
> >  extern int memory_failure(unsigned long pfn, int flags);
> >  extern void memory_failure_queue(unsigned long pfn, int flags);
> > +extern void memory_failure_queue_kick(int cpu);
> >  extern int unpoison_memory(unsigned long pfn);
> >  extern int get_hwpoison_page(struct page *page);
> >  #define put_hwpoison_page(page)	put_page(page)
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index a96364be8ab4..c4afb407bf0f 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1493,7 +1493,7 @@ static void memory_failure_work_func(struct work_struct *work)
> >  	unsigned long proc_flags;
> >  	int gotten;
> >  
> > -	mf_cpu = this_cpu_ptr(&memory_failure_cpu);
> > +	mf_cpu = container_of(work, struct memory_failure_cpu, work);
> >  	for (;;) {
> >  		spin_lock_irqsave(&mf_cpu->lock, proc_flags);
> >  		gotten = kfifo_get(&mf_cpu->fifo, &entry);
> > @@ -1507,6 +1507,19 @@ static void memory_failure_work_func(struct work_struct *work)
> >  	}
> >  }
> >  
> > +/*
> > + * Process memory_failure work queued on the specified CPU.
> > + * Used to avoid return-to-userspace racing with the memory_failure workqueue.
> > + */
> > +void memory_failure_queue_kick(int cpu)
> > +{
> > +	struct memory_failure_cpu *mf_cpu;
> > +
> > +	mf_cpu = &per_cpu(memory_failure_cpu, cpu);
> > +	cancel_work_sync(&mf_cpu->work);
> > +	memory_failure_work_func(&mf_cpu->work);
> > +}
> > +
> >  static int __init memory_failure_init(void)
> >  {
> >  	struct memory_failure_cpu *mf_cpu;
> > 
> 
> I could apply this provided an ACK from the mm people.
> 

Naoya Horiguchi is the memory-failure.c person.  A review would be
appreciated please?

I'm struggling with it a bit.  memory_failure_queue_kick() should be
called on the cpu which is identified by arg `cpu', yes? 
memory_failure_work_func() appears to assume this.

If that's right then a) why bother passing in the `cpu' arg?  and b)
what keeps this thread pinned to that CPU?  cancel_work_sync() can
schedule.
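
For reference on the question above: the quoted hunk replaces
this_cpu_ptr(&memory_failure_cpu) with container_of(work, ...), so
the work function locates its queue from the work pointer it is handed
rather than from the CPU it happens to run on. Below is a minimal
userspace sketch of that lookup, not the kernel code. The names
mf_cpus, work_func(), queue_kick(), the `queued` counter, and the
NR_CPUS value are all hypothetical stand-ins for the per-CPU variable,
memory_failure_work_func(), memory_failure_queue_kick(), the kfifo, and
the real CPU count.

```c
#include <assert.h>
#include <stddef.h>	/* offsetof */

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Simplified stand-in for the kernel's struct work_struct. */
struct work_struct {
	int pending;
};

/* Simplified stand-in for struct memory_failure_cpu. */
struct memory_failure_cpu {
	int queued;			/* stands in for the kfifo of entries */
	struct work_struct work;
};

/* Plain array standing in for the per-CPU variable (assumption). */
static struct memory_failure_cpu mf_cpus[NR_CPUS];

/*
 * Mirrors the patched lookup: the queue is derived from the work
 * pointer, so the result does not depend on which CPU runs this.
 * Returns the number of entries drained.
 */
static int work_func(struct work_struct *work)
{
	struct memory_failure_cpu *mf_cpu =
		container_of(work, struct memory_failure_cpu, work);
	int drained = mf_cpu->queued;

	mf_cpu->queued = 0;
	return drained;
}

/* Drains the queue belonging to 'cpu', wherever the caller is running. */
static int queue_kick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return work_func(&mf_cpus[cpu].work);
}
```

Under these assumptions, calling queue_kick(2) drains only the entry
for CPU 2 and leaves the other entries untouched, which is why the
`cpu' argument is still needed once the caller may have migrated. The
sketch omits the locking and the cancel_work_sync() call, so it says
nothing about the pinning question.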

