From: "André Almeida" <andrealmeid@collabora.com>
To: Hillf Danton <hdanton@sina.com>
Cc: linux-kernel@vger.kernel.org,
"kernel@collabora.com" <kernel@collabora.com>
Subject: Re: [PATCH v3 0/4] Implement FUTEX_WAIT_MULTIPLE operation
Date: Thu, 5 Mar 2020 11:48:07 -0300 [thread overview]
Message-ID: <fdcd5f32-803f-7665-22a2-d674840a3e54@collabora.com> (raw)
In-Reply-To: <20200229105130.15436-1-hdanton@sina.com>
On 2/29/20 7:51 AM, Hillf Danton wrote:
>
> On Thu, 13 Feb 2020 18:45:21 -0300 Andre Almeida wrote:
>>
>> When a write or a read operation in an eventfd file succeeds, it will try
>> to wake up all threads that are waiting to perform some operation to
>> the file. The lock (ctx->wqh.lock) that hold the access to the file value
>> (ctx->count) is the same lock used to control access the waitqueue. When
>> all those those thread woke, they will compete to get this lock. Along
>> with that, the poll() also manipulates the waitqueue and need to hold
>> this same lock. This lock is specially hard to acquire when poll() calls
>> poll_freewait(), where it tries to free all waitqueues associated with
>> this poll. While doing that, it will compete with a lot of read and
>> write operations that have been waken.
>
> Want to know if a rwsem is likely to help mitigate the tension between the
> readers and writers in your workloads.
>
Thanks for the suggestion Hillf. However, keep in mind that the lock
that is causing the tension (ctx->wqh.lock) is shared between eventfd()
and poll() syscall, so a solution to help mitigate the tension would
need to be shared between both codes. And since wqh.lock isn't a
read/write lock, even if a lot of readers manage to get this lock in
parallel, they will compete wqh.lock
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -40,6 +40,7 @@ struct eventfd_ctx {
> __u64 count;
> unsigned int flags;
> int id;
> + struct rw_semaphore rwsem;
> };
>
> /**
> @@ -212,6 +213,8 @@ static ssize_t eventfd_read(struct file
> if (count < sizeof(ucnt))
> return -EINVAL;
>
> + /* take sem in write mode for event read */
> + down_write(&ctx->rwsem);
> spin_lock_irq(&ctx->wqh.lock);
> res = -EAGAIN;
> if (ctx->count > 0)
> @@ -229,7 +232,9 @@ static ssize_t eventfd_read(struct file
> break;
> }
> spin_unlock_irq(&ctx->wqh.lock);
> + up_write(&ctx->rwsem);
> schedule();
> + down_write(&ctx->rwsem);
> spin_lock_irq(&ctx->wqh.lock);
> }
> __remove_wait_queue(&ctx->wqh, &wait);
> @@ -241,6 +246,7 @@ static ssize_t eventfd_read(struct file
> wake_up_locked_poll(&ctx->wqh, EPOLLOUT);
> }
> spin_unlock_irq(&ctx->wqh.lock);
> + up_write(&ctx->rwsem);
>
> if (res > 0 && put_user(ucnt, (__u64 __user *)buf))
> return -EFAULT;
> @@ -262,6 +268,8 @@ static ssize_t eventfd_write(struct file
> return -EFAULT;
> if (ucnt == ULLONG_MAX)
> return -EINVAL;
> + /* take sem in read mode for event write */
> + down_read(&ctx->rwsem);
> spin_lock_irq(&ctx->wqh.lock);
> res = -EAGAIN;
> if (ULLONG_MAX - ctx->count > ucnt)
> @@ -279,7 +287,9 @@ static ssize_t eventfd_write(struct file
> break;
> }
> spin_unlock_irq(&ctx->wqh.lock);
> + up_read(&ctx->rwsem);
> schedule();
> + down_read(&ctx->rwsem);
> spin_lock_irq(&ctx->wqh.lock);
> }
> __remove_wait_queue(&ctx->wqh, &wait);
> @@ -291,6 +301,7 @@ static ssize_t eventfd_write(struct file
> wake_up_locked_poll(&ctx->wqh, EPOLLIN);
> }
> spin_unlock_irq(&ctx->wqh.lock);
> + up_read(&ctx->rwsem);
>
> return res;
> }
> @@ -408,6 +419,7 @@ static int do_eventfd(unsigned int count
> init_waitqueue_head(&ctx->wqh);
> ctx->count = count;
> ctx->flags = flags;
> + init_rwsem(&ctx->rwsem);
> ctx->id = ida_simple_get(&eventfd_ida, 0, 0, GFP_KERNEL);
>
> fd = anon_inode_getfd("[eventfd]", &eventfd_fops, ctx,
>
next parent reply other threads:[~2020-03-05 14:50 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200229105130.15436-1-hdanton@sina.com>
2020-03-05 14:48 ` André Almeida [this message]
2020-02-13 21:45 [PATCH v3 0/4] Implement FUTEX_WAIT_MULTIPLE operation André Almeida
2020-02-19 16:27 ` shuah
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fdcd5f32-803f-7665-22a2-d674840a3e54@collabora.com \
--to=andrealmeid@collabora.com \
--cc=hdanton@sina.com \
--cc=kernel@collabora.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).