From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Juri Lelli <jlelli@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>, He Zhe <zhe.he@windriver.com>
Subject: Re: 5.13-rt1 + KVM = WARNING: at fs/eventfd.c:74 eventfd_signal()
Date: Wed, 14 Jul 2021 06:41:57 -0400
Message-ID: <20210714063814-mutt-send-email-mst@kernel.org>
In-Reply-To: <475f84e2-78ee-1a24-ef57-b16c1f2651ed@redhat.com>

On Wed, Jul 14, 2021 at 12:35:27PM +0200, Paolo Bonzini wrote:
> On 14/07/21 11:23, Jason Wang wrote:
> > > This was added in 2020, so it's unlikely to be the direct cause of the
> > > change.  What is a known-good version for the host?
> > > 
> > > Since it is not KVM stuff, I'm CCing Michael and Jason.
> > 
> > I think this can be probably fixed here:
> > 
> > https://lore.kernel.org/lkml/20210618084412.18257-1-zhe.he@windriver.com/
> 
> That seems wrong; in particular it wouldn't protect against AB/BA deadlocks.
> In fact, the bug is with the locking; the code assumes that
> spin_lock_irqsave/spin_unlock_irqrestore is non-preemptable and therefore
> increments and decrements the percpu variable inside the critical section.
> 
> This obviously does not fly with PREEMPT_RT; the right fix should be
> using a local_lock.  Something like this (untested!!):
> 
> --------------- 8< ---------------
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH] eventfd: protect eventfd_wake_count with a local_lock
> 
> eventfd_signal assumes that spin_lock_irqsave/spin_unlock_irqrestore is
> non-preemptable and therefore increments and decrements the percpu
> variable inside the critical section.
> 
> This obviously does not fly with PREEMPT_RT.  If eventfd_signal is
> preempted and an unrelated thread calls eventfd_signal, the result is
> a spurious WARN.  To avoid this, protect the percpu variable with a
> local_lock.
> 
> Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Fixes: b5e683d5cab8 ("eventfd: track eventfd_signal() recursion depth")
> Cc: stable@vger.kernel.org
> Cc: He Zhe <zhe.he@windriver.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


Makes sense ... 

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Want to send this to the Wind River guys so they can test it?
Here's the recipient list from that thread:

To: xieyongji@bytedance.com, mst@redhat.com, jasowang@redhat.com,
	stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com,
	hch@infradead.org, christian.brauner@canonical.com,
	rdunlap@infradead.org, willy@infradead.org,
	viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org,
	corbet@lwn.net, mika.penttila@nextfour.com,
	dan.carpenter@oracle.com, gregkh@linuxfoundation.org,
	songmuchun@bytedance.com,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, qiang.zhang@windriver.com,
	zhe.he@windriver.com


> 
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index e265b6dd4f34..7d27b6e080ea 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -12,6 +12,7 @@
>  #include <linux/fs.h>
>  #include <linux/sched/signal.h>
>  #include <linux/kernel.h>
> +#include <linux/local_lock.h>
>  #include <linux/slab.h>
>  #include <linux/list.h>
>  #include <linux/spinlock.h>
> @@ -25,6 +26,7 @@
>  #include <linux/idr.h>
>  #include <linux/uio.h>
>
> +static local_lock_t eventfd_wake_lock = INIT_LOCAL_LOCK(eventfd_wake_lock);
>  DEFINE_PER_CPU(int, eventfd_wake_count);
>
>  static DEFINE_IDA(eventfd_ida);
> @@ -71,8 +73,11 @@ __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
>  	 * it returns true, the eventfd_signal() call should be deferred to a
>  	 * safe context.
>  	 */
> -	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count)))
> +	local_lock(&eventfd_wake_lock);
> +	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count))) {
> +		local_unlock(&eventfd_wake_lock);
>  		return 0;
> +	}
>
>  	spin_lock_irqsave(&ctx->wqh.lock, flags);
>  	this_cpu_inc(eventfd_wake_count);
> @@ -83,6 +88,7 @@ __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
>  		wake_up_locked_poll(&ctx->wqh, EPOLLIN);
>  	this_cpu_dec(eventfd_wake_count);
>  	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>
> +	local_unlock(&eventfd_wake_lock);
>  	return n;
>  }
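
For anyone following along, the race described in the commit message can be
modelled in userspace roughly as below. This is only a toy sketch, not kernel
code: it assumes that on PREEMPT_RT a task can be preempted while holding a
spinlock_t but stays on the same CPU, and it models the proposed local_lock
as a simple per-CPU flag where "would block" is reported instead of sleeping.
All names here (wake_count, wake_lock_held, signal_enter, signal_exit,
wake_trylock, wake_unlock, run_preempted_scenario) are hypothetical and do
not exist in fs/eventfd.c.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for the per-CPU state discussed in the patch. */
static int wake_count[NR_CPUS];       /* models eventfd_wake_count */
static bool wake_lock_held[NR_CPUS];  /* models the proposed local_lock */

/*
 * Entry half of a modelled eventfd_signal() call.
 * Returns true if the recursion check would have warned and bailed out.
 */
bool signal_enter(int cpu)
{
	if (wake_count[cpu])
		return true;
	wake_count[cpu]++;
	return false;
}

/* Exit half: drop the per-CPU count taken in signal_enter(). */
void signal_exit(int cpu)
{
	assert(wake_count[cpu] > 0);
	wake_count[cpu]--;
}

/* Try-lock model of local_lock(): false means the caller would block. */
bool wake_trylock(int cpu)
{
	if (wake_lock_held[cpu])
		return false;
	wake_lock_held[cpu] = true;
	return true;
}

void wake_unlock(int cpu)
{
	wake_lock_held[cpu] = false;
}

/*
 * Task A enters the critical section on CPU 0 and is preempted there;
 * unrelated task B then calls the modelled eventfd_signal() on the same CPU.
 * Returns how many spurious warnings task B triggered.
 */
int run_preempted_scenario(bool use_local_lock)
{
	const int cpu = 0;
	int spurious = 0;
	bool b_blocked = false;

	wake_count[cpu] = 0;
	wake_lock_held[cpu] = false;

	/* Task A: take the lock (if modelled) and bump the per-CPU count. */
	if (use_local_lock)
		wake_trylock(cpu);
	signal_enter(cpu);

	/* Task A is preempted here; task B runs on the same CPU. */
	if (use_local_lock && !wake_trylock(cpu))
		b_blocked = true;       /* B waits until A unlocks */
	else if (signal_enter(cpu))
		spurious++;             /* B sees A's count: false recursion */

	/* Task A resumes and leaves the critical section. */
	signal_exit(cpu);
	if (use_local_lock)
		wake_unlock(cpu);

	/* Task B, if it had to wait, now runs its call to completion. */
	if (b_blocked) {
		wake_trylock(cpu);
		if (signal_enter(cpu))
			spurious++;
		else
			signal_exit(cpu);
		wake_unlock(cpu);
	}
	return spurious;
}
```

In this model run_preempted_scenario(false) reports one spurious warning and
run_preempted_scenario(true) reports none, which is the behaviour the commit
message argues for. It says nothing about whether the real patch is correct.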

