linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: linux-mm@kvack.org,
	"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Petr Mladek <pmladek@suse.com>
Subject: Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
Date: Wed, 21 Jun 2023 16:34:21 +0200	[thread overview]
Message-ID: <20230621143421.BgHjJklo@linutronix.de> (raw)
In-Reply-To: <dd7e0df5-a6c1-adef-af16-c0cf3a8aee65@I-love.SAKURA.ne.jp>

On 2023-06-21 22:32:40 [+0900], Tetsuo Handa wrote:
> include/linux/seqlock.h says
> Is above understanding correct?

That is correct.

> And you are trying to replace
> 
>   local_irq_save(flags);
>   printk_deferred_enter();
>   write_seqlock(&zonelist_update_seq);
> 
> with
> 
>   write_seqlock_irqsave(&zonelist_update_seq, flags);
>   printk_deferred_enter();
> 
> , aren't you?

correct.

> But include/linux/printk.h says
> 
>   /*
>    * The printk_deferred_enter/exit macros are available only as a hack for
>    * some code paths that need to defer all printk console printing. Interrupts
>    * must be disabled for the deferred duration.
>    */
>   #define printk_deferred_enter __printk_safe_enter
>   #define printk_deferred_exit __printk_safe_exit
> 
> which means that local_irq_save() is _required_ before printk_deferred_enter().

It says that, yes, but that requirement is described as too heavy. The
requirement is that printk_deferred_enter() happens on the same CPU as
printk_deferred_exit(). This can be achieved by an explicit
local_irq_save(), yes, but also by something like spin_lock_irq() which
_ensures_ that the task does not migrate to another CPU while the lock
is acquired. This is the requirement by the current implementation.

> If local_irq_save() is hidden by your patch, what guarantees that
> printk_deferred_enter() and printk_deferred_exit() run on the same CPU?

Because spin_lock_irqsave() on CONFIG_PREEMPT_RT uses migrate_disable().
The function ensures that the scheduler does not migrate the task to
another CPU. The CPU is even block from going down (as in CPU-hotplug)
until the matching migrate_enable() occurs.

> Also, if local_irq_save() is hidden due to RT, what guarantees that
> 
>   write_seqlock_irqsave(&zonelist_update_seq, flags);
>   <<IRQ>>
>     some_timer_function() {
>       printk();
>     }
>   <<IRQ>>
>   printk_deferred_enter();
>
> does not happen because write_seqlock_irqsave() does not disable IRQ?

I don't see how zonelist_update_seq and printk here are connected
without the port lock/ or memory allocation. But there are two things
that are different on RT which probably answer your question:

- If the reader observes an odd sequence number then it acquires the
  lock of the sequence counter (it is held by the writer) which
  forces the writer to complete the write critical section and then the
  reader can continue. There are _no_ memory allocation within a
  hard IRQ context (as in the actual interrupt). The timer (hrtimer or
  timer_list timer) are served in task context and we have
  forced-threaded interrupts. Clearly this means that the seqlock_t (as
  used here) can only be used task context and not in hard IRQ context.

- The printk implementation is slightly different and it is being worked
  to merge it upstream. The two important differences here:
  - Printing happens by default in a dedicated printing thread.
  - In emergency cases like panic(), printing happens directly within
    the invocation of printk(). This requires a so called atomic console
    which does not use the tty_port::lock.

> Disabling IRQ before incrementing zonelist_update_seq is _required_ for both
> 
>   making printk_deferred_enter() safe
> 
> and
> 
>   making sure that printk_deferred_enter() takes effect
> 
> .
Did I explain why it is sufficient to do
	write_seqlock_irqsave()
	printk_deferred_enter()

assuming we have

| static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
| {
|         seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
|         do_raw_write_seqcount_begin(s);
| }

?

Sebastian


  reply	other threads:[~2023-06-21 14:34 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-21 10:40 [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save() Sebastian Andrzej Siewior
2023-06-21 10:59 ` Michal Hocko
2023-06-21 11:16   ` Sebastian Andrzej Siewior
2023-06-21 11:49     ` Michal Hocko
2023-06-21 13:11       ` Sebastian Andrzej Siewior
2023-06-21 13:22         ` Michal Hocko
2023-06-21 13:25           ` Sebastian Andrzej Siewior
2023-06-21 11:14 ` David Hildenbrand
2023-06-21 11:33 ` Tetsuo Handa
2023-06-21 12:40   ` Petr Mladek
2023-06-21 13:08     ` Sebastian Andrzej Siewior
2023-06-21 13:06   ` Sebastian Andrzej Siewior
2023-06-21 13:32     ` Tetsuo Handa
2023-06-21 14:34       ` Sebastian Andrzej Siewior [this message]
2023-06-21 14:50         ` Tetsuo Handa
2023-06-21 23:24           ` Tetsuo Handa
2023-06-22  7:18             ` Michal Hocko
2023-06-22 10:58               ` Tetsuo Handa
2023-06-22 12:09                 ` Michal Hocko
2023-06-22 13:36             ` Tetsuo Handa
2023-06-22 14:11               ` Petr Mladek
2023-06-22 14:28                 ` Tetsuo Handa
2023-06-23  9:35                   ` Sebastian Andrzej Siewior
2023-06-22 15:04                 ` Petr Mladek
2023-06-22 15:43                   ` Tetsuo Handa
2023-06-23  9:45                     ` Sebastian Andrzej Siewior
2023-06-23  9:51                       ` Tetsuo Handa
2023-06-23 10:11                         ` Sebastian Andrzej Siewior
2023-06-23 10:36                           ` Tetsuo Handa
2023-06-23 12:44                             ` Sebastian Andrzej Siewior
2023-06-23 12:57                               ` Michal Hocko
2023-06-23 10:53                           ` Petr Mladek
2023-06-23 11:16                             ` Tetsuo Handa
2023-06-23 13:31                             ` Sebastian Andrzej Siewior
2023-06-23 15:38                               ` Petr Mladek
2023-06-23 16:04                                 ` Sebastian Andrzej Siewior
2023-06-23  9:31               ` Sebastian Andrzej Siewior
2023-06-23  7:27           ` Sebastian Andrzej Siewior
2023-06-21 15:38         ` Petr Mladek
2023-06-23  8:12           ` Sebastian Andrzej Siewior
2023-06-23  9:21             ` Michal Hocko
2023-06-23  9:58               ` Sebastian Andrzej Siewior
2023-06-23 10:43                 ` Michal Hocko
2023-06-23 10:45                 ` Sebastian Andrzej Siewior
2023-06-23 10:50                   ` Sebastian Andrzej Siewior
2023-06-23 11:32                   ` Michal Hocko
2023-06-23 10:40             ` Petr Mladek
2023-06-23 13:24               ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230621143421.BgHjJklo@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=lgoncalv@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=pmladek@suse.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).