From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Michal Hocko <mhocko@suse.com>
Cc: Petr Mladek <pmladek@suse.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Boqun Feng <boqun.feng@gmail.com>, Ingo Molnar <mingo@redhat.com>,
John Ogness <john.ogness@linutronix.de>,
Mel Gorman <mgorman@techsingularity.net>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Waiman Long <longman@redhat.com>, Will Deacon <will@kernel.org>
Subject: Re: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()
Date: Sat, 29 Jul 2023 14:31:01 +0900
Message-ID: <b6ba16ce-4849-d32c-68fe-07a15aaf9d9c@I-love.SAKURA.ne.jp>
In-Reply-To: <20230727151029.e_M9bi8N@linutronix.de>
On 2023/07/28 0:10, Sebastian Andrzej Siewior wrote:
> On 2023-06-28 21:14:16 [+0900], Tetsuo Handa wrote:
>>> Anyway, please do not do this change only because of printk().
>>> IMHO, the current ordering is more logical and the printk() problem
>>> should be solved another way.
>>
>> Then, since [PATCH 1/2] cannot be applied, [PATCH 2/2] is automatically
>> rejected.
>
> My understanding is that this patch gets applied and your objection will
> be noted.

My preference is that !__GFP_DIRECT_RECLAIM allocations do not check
zonelist_update_seq, which is low-hanging fruit towards the GFP_LOCKLESS
mentioned at
https://lkml.kernel.org/r/ZG3+l4qcCWTPtSMD@dhcp22.suse.cz and
https://lkml.kernel.org/r/ZJWWpGZMJIADQvRS@dhcp22.suse.cz .

Maybe we can defer checking zonelist_update_seq until the retry check, like
below, because a change of zonelist_update_seq is really an infrequent event.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d3460c7a480..2f7b82af2590 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3642,22 +3642,27 @@ EXPORT_SYMBOL_GPL(fs_reclaim_release);
  * retries the allocation if zonelist changes. Writer side is protected by the
  * embedded spin_lock.
  */
-static DEFINE_SEQLOCK(zonelist_update_seq);
+static unsigned int zonelist_update_seq;
 
 static unsigned int zonelist_iter_begin(void)
 {
 	if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
-		return read_seqbegin(&zonelist_update_seq);
+		return data_race(READ_ONCE(zonelist_update_seq));
 
 	return 0;
 }
 
-static unsigned int check_retry_zonelist(unsigned int seq)
+static unsigned int check_retry_zonelist(gfp_t gfp, unsigned int seq)
 {
-	if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
-		return read_seqretry(&zonelist_update_seq, seq);
+	if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE) && (gfp & __GFP_DIRECT_RECLAIM)) {
+		unsigned int seq2;
+
+		smp_rmb();
+		seq2 = data_race(READ_ONCE(zonelist_update_seq));
+		return unlikely(seq != seq2 || (seq2 & 1));
+	}
 
-	return seq;
+	return 0;
 }
 
 /* Perform direct synchronous page reclaim */
@@ -4146,7 +4151,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * a unnecessary OOM kill.
 	 */
 	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
-	    check_retry_zonelist(zonelist_iter_cookie))
+	    check_retry_zonelist(gfp_mask, zonelist_iter_cookie))
 		goto restart;
 
 	/* Reclaim has failed us, start killing things */
@@ -4172,7 +4177,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * a unnecessary OOM kill.
 	 */
 	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
-	    check_retry_zonelist(zonelist_iter_cookie))
+	    check_retry_zonelist(gfp_mask, zonelist_iter_cookie))
 		goto restart;
 
 	/*
@@ -5136,22 +5141,12 @@ static void __build_all_zonelists(void *data)
 	int nid;
 	int __maybe_unused cpu;
 	pg_data_t *self = data;
+	static DEFINE_SPINLOCK(lock);
 	unsigned long flags;
 
-	/*
-	 * Explicitly disable this CPU's interrupts before taking seqlock
-	 * to prevent any IRQ handler from calling into the page allocator
-	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
-	 */
-	local_irq_save(flags);
-	/*
-	 * Explicitly disable this CPU's synchronous printk() before taking
-	 * seqlock to prevent any printk() from trying to hold port->lock, for
-	 * tty_insert_flip_string_and_push_buffer() on other CPU might be
-	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
-	 */
-	printk_deferred_enter();
-	write_seqlock(&zonelist_update_seq);
+	spin_lock_irqsave(&lock, flags);
+	data_race(zonelist_update_seq++);
+	smp_wmb();
 
 #ifdef CONFIG_NUMA
 	memset(node_load, 0, sizeof(node_load));
@@ -5188,9 +5183,9 @@ static void __build_all_zonelists(void *data)
 #endif
 	}
 
-	write_sequnlock(&zonelist_update_seq);
-	printk_deferred_exit();
-	local_irq_restore(flags);
+	smp_wmb();
+	data_race(zonelist_update_seq++);
+	spin_unlock_irqrestore(&lock, flags);
 }
 
 static noinline void __init
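
To make the intended protocol easier to review outside the kernel tree, here is
a minimal userspace sketch of the open-coded sequence counter used in the diff
above. It is only a model under stated assumptions: every model_* name and
MODEL_DIRECT_RECLAIM are hypothetical, C11 fences are rough stand-ins for
smp_rmb()/smp_wmb(), and a single updater is assumed (in the diff, the spinlock
serializes updaters).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_DIRECT_RECLAIM; not the kernel's value. */
#define MODEL_DIRECT_RECLAIM 0x1u

/* Even: no update in flight. Odd: an update is in progress. */
static atomic_uint model_seq;

/* Models zonelist_iter_begin(): sample the counter, never spin or wait. */
static unsigned int model_iter_begin(void)
{
	return atomic_load_explicit(&model_seq, memory_order_relaxed);
}

/*
 * Models check_retry_zonelist(): only callers that may sleep look at the
 * counter again; all other callers never ask for a retry.
 */
static bool model_check_retry(unsigned int flags, unsigned int seq)
{
	unsigned int seq2;

	if (!(flags & MODEL_DIRECT_RECLAIM))
		return false;

	/* Rough stand-in for smp_rmb(): order earlier reads before the re-read. */
	atomic_thread_fence(memory_order_acquire);
	seq2 = atomic_load_explicit(&model_seq, memory_order_relaxed);

	/* Retry if the counter moved, or if an update is still in flight. */
	return seq != seq2 || (seq2 & 1);
}

/* Models the start of an update: make the counter odd, then publish. */
static void model_update_begin(void)
{
	atomic_fetch_add_explicit(&model_seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* rough stand-in for smp_wmb() */
}

/* Models the end of an update: finish the writes, then make the counter even. */
static void model_update_end(void)
{
	atomic_thread_fence(memory_order_release); /* rough stand-in for smp_wmb() */
	atomic_fetch_add_explicit(&model_seq, 1, memory_order_relaxed);
}

/* Single-threaded walk through the cases the retry check distinguishes. */
void model_selftest(void)
{
	unsigned int cookie = model_iter_begin();

	/* No update happened: nobody retries. */
	assert(!model_check_retry(MODEL_DIRECT_RECLAIM, cookie));

	model_update_begin();
	/* Update in flight: sleepable callers retry, others do not. */
	assert(model_check_retry(MODEL_DIRECT_RECLAIM, cookie));
	assert(!model_check_retry(0, cookie));

	model_update_end();
	/* Update finished, but the old cookie is stale: still retry. */
	assert(model_check_retry(MODEL_DIRECT_RECLAIM, cookie));
}
```

The point of the model is that the begin side never waits on an odd counter, so
a caller that cannot sleep has nothing to spin on; an odd or changed value is
only noticed at the retry check, and only by callers that pass the
direct-reclaim flag.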