From: Dmitry Sychov <dmitry.sychov@gmail.com>
To: Mark Papadakis <markuspapadakis@icloud.com>
Cc: "H. de Vries" <hdevries@fastmail.com>,
	io-uring <io-uring@vger.kernel.org>
Subject: Re: Any performance gains from using per thread(thread local) urings?
Date: Wed, 13 May 2020 16:15:51 +0300
Message-ID: <CADPKF+eZCE4A2yXnQaZvq1uk3b-zR+-rwQhzA2z=v7+VsTndkQ@mail.gmail.com>
In-Reply-To: <7692E70C-A0EA-423B-883F-6BF91B0DB359@icloud.com>

Hey Mark,

Or we could share one SQ and one CQ between multiple threads (bounded
by the max number of CPU cores), with direct read/write access
synchronized by a very light mutex.
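
Roughly what I have in mind, sketched against liburing (untested; the
names, queue depth and the separate SQ/CQ locks are just placeholders
of mine, not a finished design):

    /* One ring shared by all worker threads; direct access guarded by
     * light mutexes (sketch only, most error handling omitted). */
    #include <liburing.h>
    #include <pthread.h>
    #include <sys/types.h>
    #include <errno.h>

    static struct io_uring shared_ring;
    static pthread_mutex_t sq_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t cq_lock = PTHREAD_MUTEX_INITIALIZER;

    void ring_setup(void)
    {
        /* one SQ + one CQ in total, shared by up to nr_cpus worker threads */
        io_uring_queue_init(256, &shared_ring, 0);
    }

    /* Any thread submits: hold the SQ lock only around get_sqe + submit. */
    int submit_read(int fd, void *buf, unsigned len, off_t off)
    {
        pthread_mutex_lock(&sq_lock);
        struct io_uring_sqe *sqe = io_uring_get_sqe(&shared_ring);
        if (!sqe) {                          /* SQ momentarily full, caller retries */
            pthread_mutex_unlock(&sq_lock);
            return -EBUSY;
        }
        io_uring_prep_read(sqe, fd, buf, len, off);
        io_uring_sqe_set_data(sqe, buf);     /* tag it so *any* thread can finish it */
        int ret = io_uring_submit(&shared_ring);
        pthread_mutex_unlock(&sq_lock);
        return ret;
    }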

This also solves the thread starvation issue: thread A submits a job
into the shared SQ while thread B both collects and _processes_ the
result from the shared CQ, instead of waiting on its own CQ for the
next completion event.
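
And the completion side, continuing the sketch above: whichever thread
gets there first drains a batch from the shared CQ and processes it
right there, so no thread is stuck waiting on a private CQ.
io_uring_peek_batch_cqe() is what I meant earlier by popping several
completion events at once (again untested, batch size is arbitrary):

    /* Any thread reaps: copy a batch out under the CQ lock, process outside it. */
    void reap_batch(void)
    {
        struct io_uring_cqe *cqes[32];
        struct { int res; void *data; } done[32];
        unsigned n, i;

        pthread_mutex_lock(&cq_lock);
        n = io_uring_peek_batch_cqe(&shared_ring, cqes, 32);
        for (i = 0; i < n; i++) {
            done[i].res  = cqes[i]->res;
            done[i].data = io_uring_cqe_get_data(cqes[i]);
        }
        io_uring_cq_advance(&shared_ring, n);   /* hand the CQ entries back */
        pthread_mutex_unlock(&cq_lock);

        /* the heavy part runs without any lock held, on whichever thread reaped */
        for (i = 0; i < n; i++) {
            /* ... handle done[i].res / done[i].data ... */
        }
    }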

On Wed, May 13, 2020 at 2:56 PM Mark Papadakis
<markuspapadakis@icloud.com> wrote:
>
> For what it’s worth, I am (also) using multiple “reactor” (i.e. event-driven) cores, each associated with one OS thread, and each reactor core manages its own io_uring context/queues.
>
> Even if scheduling all SQEs through a single io_uring SQ (by, e.g., collecting all such SQEs in every OS thread and then somehow “moving” them to the one OS thread that manages the SQ so that it can enqueue them all) is very cheap, you’d still need to drain the CQ from that thread and presumably process those CQEs in a single OS thread, which will definitely be more work than having each reactor/OS thread dequeue CQEs for the SQEs it submitted itself.
> You could have a single OS thread just for I/O and all other threads could do something else, but you’d presumably need to serialize access/share state between them and the one I/O thread, which may be a scalability bottleneck.
>
> ( if you are curious, you can read about it here https://medium.com/@markpapadakis/building-high-performance-services-in-2020-e2dea272f6f6 )
>
> If you experiment with the various possible designs though, I’d love it if you were to share your findings.
>
> —
> @markpapadakis
>
>
> > On 13 May 2020, at 2:01 PM, Dmitry Sychov <dmitry.sychov@gmail.com> wrote:
> >
> > Hi Hielke,
> >
> >> If you want max performance, what you generally will see in non-blocking servers is one event loop per core/thread.
> >> This means one ring per core/thread. Of course there is no simple answer to this.
> >> See how thread-based servers work vs non-blocking servers. E.g. Apache vs Nginx or Tomcat vs Netty.
> >
> > I think a lot depends on the internal uring implementation: to what
> > degree the kernel is able to handle multiple urings independently,
> > without many congestion points (like updates of the same memory
> > locations from multiple threads), and thus take advantage of one ring
> > per CPU core.
> >
> > For example, if the tasks from multiple rings are later combined into
> > a single input kernel queue (effectively forming a congestion point), I
> > see no reason to use an exclusive ring per core in user space.
> >
> > [BTW, on Windows an IOCP is always one input+output queue shared by all (active) threads.]
> >
> > Also, we could pop multiple completion events from a single CQ at
> > once and spread the handling across core-bound threads.
> >
> > I thought about one uring per core at first, but now I'm not sure -
> > maybe the kernel devs have something to add to the discussion?
> >
> > P.S. uring is the main reason I'm switching from Windows to Linux
> > development for a client-server app, so I want to extract the maximum
> > performance possible out of this new exciting uring stuff. :)
> >
> > Thanks, Dmitry
>
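
P.S. For contrast, a bare-bones sketch of the one-ring-per-reactor-thread
layout you and Hielke describe (again just liburing, untested, only so
we are sure we're comparing the same two designs):

    #include <liburing.h>
    #include <pthread.h>

    /* Each reactor thread owns its own ring: no locks needed, but it
     * only ever sees completions for the SQEs it submitted itself. */
    static void *reactor_loop(void *arg)
    {
        (void)arg;
        struct io_uring ring;                /* thread-local ring */
        io_uring_queue_init(256, &ring, 0);

        for (;;) {
            /* ... queue whatever SQEs this reactor owns, then ... */
            io_uring_submit(&ring);

            struct io_uring_cqe *cqe;
            if (io_uring_wait_cqe(&ring, &cqe) == 0) {
                /* ... handle cqe->res / io_uring_cqe_get_data(cqe) ... */
                io_uring_cqe_seen(&ring, cqe);
            }
        }

        io_uring_queue_exit(&ring);          /* unreachable in this sketch */
        return NULL;
    }

    /* main() would pthread_create() one reactor per CPU core. */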


Thread overview: 14+ messages
2020-05-12 20:20 Any performance gains from using per thread(thread local) urings? Dmitry Sychov
2020-05-13  6:07 ` H. de Vries
2020-05-13 11:01   ` Dmitry Sychov
2020-05-13 11:56     ` Mark Papadakis
2020-05-13 13:15       ` Dmitry Sychov [this message]
2020-05-13 13:27         ` Mark Papadakis
2020-05-13 13:48           ` Dmitry Sychov
2020-05-13 14:12           ` Sergiy Yevtushenko
     [not found]           ` <CAO5MNut+nD-OqsKgae=eibWYuPim1f8-NuwqVpD87eZQnrwscA@mail.gmail.com>
2020-05-13 14:22             ` Dmitry Sychov
2020-05-13 14:31               ` Dmitry Sychov
2020-05-13 16:02               ` Pavel Begunkov
2020-05-13 19:23                 ` Dmitry Sychov
2020-05-14 10:06                   ` Pavel Begunkov
2020-05-14 11:35                     ` Dmitry Sychov
