All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
@ 2023-07-09  6:54 wenyang.linux
       [not found] ` <add93bba-5c63-7d7f-2034-2d25b7b44ada@web.de>
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: wenyang.linux @ 2023-07-09  6:54 UTC (permalink / raw)
  To: Alexander Viro, Jens Axboe, Christian Brauner
  Cc: Wen Yang, Christoph Hellwig, Dylan Yudaken, David Woodhouse,
	Matthew Wilcox, linux-fsdevel, linux-kernel

From: Wen Yang <wenyang.linux@foxmail.com>

For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.

Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dylan Yudaken <dylany@fb.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 fs/eventfd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 8aa36cd37351..10a101df19cd 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
 {
 	lockdep_assert_held(&ctx->wqh.lock);
 
-	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
+	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
 	ctx->count -= *cnt;
 }
 EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
@@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
 		return -EFAULT;
 	if (ucnt == ULLONG_MAX)
 		return -EINVAL;
+	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
+		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
 	res = -EAGAIN;
 	if (ULLONG_MAX - ctx->count > ucnt)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
       [not found] ` <add93bba-5c63-7d7f-2034-2d25b7b44ada@web.de>
@ 2023-07-09 20:05   ` Matthew Wilcox
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2023-07-09 20:05 UTC (permalink / raw)
  To: Markus Elfring
  Cc: Wen Yang, linux-fsdevel, kernel-janitors, Alexander Viro,
	Christian Brauner, Jens Axboe, LKML, Christoph Hellwig,
	David Woodhouse, Dylan Yudaken

On Sun, Jul 09, 2023 at 09:13:28PM +0200, Markus Elfring wrote:
> > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> Please choose an imperative change suggestion.

Markus, stop this nitpicking.  It puts off contributors.  The changelog
is perfectly clear.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-09  6:54 [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0 wenyang.linux
       [not found] ` <add93bba-5c63-7d7f-2034-2d25b7b44ada@web.de>
@ 2023-07-10 14:12 ` Christian Brauner
  2023-07-10 15:02   ` Wen Yang
  2023-07-11  9:46 ` Christian Brauner
  2 siblings, 1 reply; 8+ messages in thread
From: Christian Brauner @ 2023-07-10 14:12 UTC (permalink / raw)
  To: wenyang.linux
  Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel

On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> From: Wen Yang <wenyang.linux@foxmail.com>
> 
> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dylan Yudaken <dylany@fb.com>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---

So this looks ok but I would like to see an analysis how the overflow
can happen. I'm looking at the callers and it seems that once ctx->count
hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
there a caller that can call directly or indirectly
eventfd_ctx_do_read() on a ctx->count == 0?

I'm just slightly skeptical about patches that fix issues without an
analysis how this can happen.

>  fs/eventfd.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index 8aa36cd37351..10a101df19cd 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>  {
>  	lockdep_assert_held(&ctx->wqh.lock);
>  
> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>  	ctx->count -= *cnt;
>  }
>  EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>  		return -EFAULT;
>  	if (ucnt == ULLONG_MAX)
>  		return -EINVAL;
> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
> +		return -EINVAL;
>  	spin_lock_irq(&ctx->wqh.lock);
>  	res = -EAGAIN;
>  	if (ULLONG_MAX - ctx->count > ucnt)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-10 14:12 ` Christian Brauner
@ 2023-07-10 15:02   ` Wen Yang
  2023-07-11  9:07     ` Christian Brauner
  2023-07-11  9:39     ` Christian Brauner
  0 siblings, 2 replies; 8+ messages in thread
From: Wen Yang @ 2023-07-10 15:02 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel


On 2023/7/10 22:12, Christian Brauner wrote:
> On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
>> From: Wen Yang <wenyang.linux@foxmail.com>
>>
>> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
>> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
>>
>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Christian Brauner <brauner@kernel.org>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dylan Yudaken <dylany@fb.com>
>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: linux-fsdevel@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> ---
> So this looks ok but I would like to see an analysis how the overflow
> can happen. I'm looking at the callers and it seems that once ctx->count
> hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> there a caller that can call directly or indirectly
> eventfd_ctx_do_read() on a ctx->count == 0?
eventfd_read() ensures that ctx->count is not 0 before calling 
eventfd_ctx_do_read() and it is correct.

But it is not appropriate for eventfd_ctx_remove_wait_queue() to call 
eventfd_ctx_do_read() unconditionally,

as it may not only causes ctx->count to overflow, but also unnecessarily 
calls wake_up_locked_poll().


I am sorry for just adding the following string in the patch:
Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")


Looking forward to your suggestions.

--

Best wishes,

Wen


> I'm just slightly skeptical about patches that fix issues without an
> analysis how this can happen.
>
>>   fs/eventfd.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>> index 8aa36cd37351..10a101df19cd 100644
>> --- a/fs/eventfd.c
>> +++ b/fs/eventfd.c
>> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>>   {
>>   	lockdep_assert_held(&ctx->wqh.lock);
>>   
>> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>>   	ctx->count -= *cnt;
>>   }
>>   EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
>> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>>   		return -EFAULT;
>>   	if (ucnt == ULLONG_MAX)
>>   		return -EINVAL;
>> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
>> +		return -EINVAL;
>>   	spin_lock_irq(&ctx->wqh.lock);
>>   	res = -EAGAIN;
>>   	if (ULLONG_MAX - ctx->count > ucnt)
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-10 15:02   ` Wen Yang
@ 2023-07-11  9:07     ` Christian Brauner
  2023-07-11  9:39     ` Christian Brauner
  1 sibling, 0 replies; 8+ messages in thread
From: Christian Brauner @ 2023-07-11  9:07 UTC (permalink / raw)
  To: Wen Yang
  Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel

On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
> 
> On 2023/7/10 22:12, Christian Brauner wrote:
> > On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> > > From: Wen Yang <wenyang.linux@foxmail.com>
> > > 
> > > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> > > 
> > > Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> > > Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Jens Axboe <axboe@kernel.dk>
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Dylan Yudaken <dylany@fb.com>
> > > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > So this looks ok but I would like to see an analysis how the overflow
> > can happen. I'm looking at the callers and it seems that once ctx->count
> > hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> > there a caller that can call directly or indirectly
> > eventfd_ctx_do_read() on a ctx->count == 0?
> eventfd_read() ensures that ctx->count is not 0 before calling
> eventfd_ctx_do_read() and it is correct.
> 
> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
> eventfd_ctx_do_read() unconditionally,
> 
> as it may not only causes ctx->count to overflow, but also unnecessarily
> calls wake_up_locked_poll().

Hm, so I think you're right and an underflow can be triggered for at
least three subsystems:

(1) virt/kvm/eventfd.c
(2) drivers/vfio/virqfd.c
(3) drivers/virt/acrn/irqfd.c

where (2) and (3) are just modeled after (1). The eventfd must've been
set to EFD_SEMAPHORE and ctx->count must been or decremented zero. The
only way I can see the _underflow_ happening is if the irqfd is shutdown
through an ioctl() like KVM_IRQFD with KVM_IRQFD_FLAG_DEASSIGN raised
while ctx->count is zero:

kvm_vm_ioctl()
-> kvm_irqfd()
   -> kvm_irqfd_deassign()
      -> irqfd_deactivate()
         -> irqfd_shutdown()
            -> eventfd_ctx_remove_wait_queue(&cnt)

which would underflow @cnt and cause a spurious wakeup. Userspace would
still read one because of EFD_SEMAPHORE semantics and wouldn't notice
the underflow.

I think it's probably not that bad because afaict, this really can only
happen when (1)-(3) are shutdown. But we should still fix it ofc.

> 
> 
> I am sorry for just adding the following string in the patch:
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> 
> 
> Looking forward to your suggestions.

What I usually look for is some callchain/analysis that explain under
what circumstance what this is fixing can happen. That makes life for
reviewers a lot easier because they don't have to dig out that work
themselves which takes time.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-10 15:02   ` Wen Yang
  2023-07-11  9:07     ` Christian Brauner
@ 2023-07-11  9:39     ` Christian Brauner
  2023-07-11 18:14       ` Wen Yang
  1 sibling, 1 reply; 8+ messages in thread
From: Christian Brauner @ 2023-07-11  9:39 UTC (permalink / raw)
  To: Wen Yang
  Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel

On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
> 
> On 2023/7/10 22:12, Christian Brauner wrote:
> > On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
> > > From: Wen Yang <wenyang.linux@foxmail.com>
> > > 
> > > For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> > > eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> > > 
> > > Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> > > Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Jens Axboe <axboe@kernel.dk>
> > > Cc: Christian Brauner <brauner@kernel.org>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Dylan Yudaken <dylany@fb.com>
> > > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > So this looks ok but I would like to see an analysis how the overflow
> > can happen. I'm looking at the callers and it seems that once ctx->count
> > hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
> > there a caller that can call directly or indirectly
> > eventfd_ctx_do_read() on a ctx->count == 0?
> eventfd_read() ensures that ctx->count is not 0 before calling
> eventfd_ctx_do_read() and it is correct.
> 
> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
> eventfd_ctx_do_read() unconditionally,
> 
> as it may not only causes ctx->count to overflow, but also unnecessarily
> calls wake_up_locked_poll().
> 
> 
> I am sorry for just adding the following string in the patch:
> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
> 
> 
> Looking forward to your suggestions.
> 
> --
> 
> Best wishes,
> 
> Wen
> 
> 
> > I'm just slightly skeptical about patches that fix issues without an
> > analysis how this can happen.
> > 
> > >   fs/eventfd.c | 4 +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/eventfd.c b/fs/eventfd.c
> > > index 8aa36cd37351..10a101df19cd 100644
> > > --- a/fs/eventfd.c
> > > +++ b/fs/eventfd.c
> > > @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
> > >   {
> > >   	lockdep_assert_held(&ctx->wqh.lock);
> > > -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
> > > +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
> > >   	ctx->count -= *cnt;
> > >   }
> > >   EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
> > > @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
> > >   		return -EFAULT;
> > >   	if (ucnt == ULLONG_MAX)
> > >   		return -EINVAL;
> > > +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
> > > +		return -EINVAL;

Hm, why is bit necessary though? What's wrong with specifying ucnt == 0
with EFD_SEMAPHORE? This also looks like a (very low potential) uapi
break.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-09  6:54 [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0 wenyang.linux
       [not found] ` <add93bba-5c63-7d7f-2034-2d25b7b44ada@web.de>
  2023-07-10 14:12 ` Christian Brauner
@ 2023-07-11  9:46 ` Christian Brauner
  2 siblings, 0 replies; 8+ messages in thread
From: Christian Brauner @ 2023-07-11  9:46 UTC (permalink / raw)
  To: wenyang.linux
  Cc: Christian Brauner, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel,
	Alexander Viro, Jens Axboe

On Sun, 09 Jul 2023 14:54:51 +0800, wenyang.linux@foxmail.com wrote:
> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
> 
> 

I've tweaked the commit message to explain how an underflow can happen.
And for now I dropped that bit:

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 10a101df19cd..33a918f9566c 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -269,8 +269,6 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
                return -EFAULT;
        if (ucnt == ULLONG_MAX)
                return -EINVAL;
-       if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
-               return -EINVAL;
        spin_lock_irq(&ctx->wqh.lock);
        res = -EAGAIN;
        if (ULLONG_MAX - ctx->count > ucnt)

because I don't yet understand why that should be forbidden. Please
explain the reason for wanting that check in there and I can add it back.
I might just be missing the obvious.

---

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] eventfd: prevent underflow for eventfd semaphores 
      https://git.kernel.org/vfs/vfs/c/7b2edd278691

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0
  2023-07-11  9:39     ` Christian Brauner
@ 2023-07-11 18:14       ` Wen Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Wen Yang @ 2023-07-11 18:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel, linux-kernel


On 2023/7/11 17:39, Christian Brauner wrote:
> On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
>> On 2023/7/10 22:12, Christian Brauner wrote:
>>> On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@foxmail.com wrote:
>>>> From: Wen Yang <wenyang.linux@foxmail.com>
>>>>
>>>> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
>>>> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
>>>>
>>>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>>> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
>>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>>> Cc: Jens Axboe <axboe@kernel.dk>
>>>> Cc: Christian Brauner <brauner@kernel.org>
>>>> Cc: Christoph Hellwig <hch@lst.de>
>>>> Cc: Dylan Yudaken <dylany@fb.com>
>>>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>>>> Cc: Matthew Wilcox <willy@infradead.org>
>>>> Cc: linux-fsdevel@vger.kernel.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> ---
>>> So this looks ok but I would like to see an analysis how the overflow
>>> can happen. I'm looking at the callers and it seems that once ctx->count
>>> hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
>>> there a caller that can call directly or indirectly
>>> eventfd_ctx_do_read() on a ctx->count == 0?
>> eventfd_read() ensures that ctx->count is not 0 before calling
>> eventfd_ctx_do_read() and it is correct.
>>
>> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
>> eventfd_ctx_do_read() unconditionally,
>>
>> as it may not only causes ctx->count to overflow, but also unnecessarily
>> calls wake_up_locked_poll().
>>
>>
>> I am sorry for just adding the following string in the patch:
>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>
>>
>> Looking forward to your suggestions.
>>
>> --
>>
>> Best wishes,
>>
>> Wen
>>
>>
>>> I'm just slightly skeptical about patches that fix issues without an
>>> analysis how this can happen.
>>>
>>>>    fs/eventfd.c | 4 +++-
>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>>>> index 8aa36cd37351..10a101df19cd 100644
>>>> --- a/fs/eventfd.c
>>>> +++ b/fs/eventfd.c
>>>> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>>>>    {
>>>>    	lockdep_assert_held(&ctx->wqh.lock);
>>>> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>>>> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>>>>    	ctx->count -= *cnt;
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
>>>> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>>>>    		return -EFAULT;
>>>>    	if (ucnt == ULLONG_MAX)
>>>>    		return -EINVAL;
>>>> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
>>>> +		return -EINVAL;
> Hm, why is bit necessary though? What's wrong with specifying ucnt == 0
> with EFD_SEMAPHORE? This also looks like a (very low potential) uapi
> break.


Thank you for your careful review.

As you pointed out, this may break the uapi.
Although manaul has the following description (man 2 eventfd):
*  The file descriptor is readable (the select(2) readfds argument; the 
poll(2) POLLIN flag) if the counter has a value greater than 0.
*  The file descriptor is writable (the select(2) writefds argument; the 
poll(2) POLLOUT flag) if it is possible to write a value of at least "1" 
without blocking.

But it does not specify that the ucnt cannot be zero, so we can only 
delete the two lines of code above

Could we propose another patch specifically to address the issue you 
have identified?

Since there are indeed some corner scenes when ucnt is 0 and ctx->count 
is also 0:


static ssize_t eventfd_write(struct file *file, const char __user *buf, 
size_t count,
                              loff_t *ppos)
{
...
         if (ULLONG_MAX - ctx->count > ucnt)
                 res = sizeof(ucnt);                 ---> always > 0
...
         if (likely(res > 0)) {
                 ctx->count += ucnt;                 ---> unnecessary 
addition of 0
                 current->in_eventfd = 1;            ---> May affect 
eventfd_signal()
                 if (waitqueue_active(&ctx->wqh))
                         wake_up_locked_poll(&ctx->wqh, EPOLLIN);  ---> 
heavyweight wakeup
                 current->in_eventfd = 0;
         }
...
}


static __poll_t eventfd_poll(struct file *file, poll_table *wait)
{
...
         count = READ_ONCE(ctx->count);

         if (count > 0)
                 events |= EPOLLIN;    ---> If count is 0, all previous 
operations are wasted

         if (count == ULLONG_MAX)
                 events |= EPOLLERR;
         if (ULLONG_MAX - 1 > count)
                 events |= EPOLLOUT;

         return events;
}

Could we optimize it like this?

static ssize_t eventfd_write(struct file *file, const char __user *buf, 
size_t count,
                              loff_t *ppos)
{
...
         if (likely(res > 0)) {
                 ctx->count += ucnt;
                 if (ctx->count) {                       ---> avoiding 
unnecessary heavyweight operations
                         current->in_eventfd = 1;
                         if (waitqueue_active(&ctx->wqh))
wake_up_locked_poll(&ctx->wqh, EPOLLIN);
                         current->in_eventfd = 0;
                 }
         }
...
}


--

Best wishes,

Wen




^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-07-11 18:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-09  6:54 [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is 0 wenyang.linux
     [not found] ` <add93bba-5c63-7d7f-2034-2d25b7b44ada@web.de>
2023-07-09 20:05   ` Matthew Wilcox
2023-07-10 14:12 ` Christian Brauner
2023-07-10 15:02   ` Wen Yang
2023-07-11  9:07     ` Christian Brauner
2023-07-11  9:39     ` Christian Brauner
2023-07-11 18:14       ` Wen Yang
2023-07-11  9:46 ` Christian Brauner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.