* [RFC v2 0/9] fixed worker
@ 2022-04-20 10:39 Hao Xu
From: Hao Xu @ 2022-04-20 10:39 UTC
  To: io-uring; +Cc: Jens Axboe, Pavel Begunkov

This is the second version of the fixed worker implementation.
I wrote a nop test program to compare 3 fixed workers against 3 normal workers.
normal workers:
./run_nop_wqe.sh nop_wqe_normal 200000 100 3 1-3
        time spent: 10464397 usecs      IOPS: 1911242
        time spent: 9610976 usecs       IOPS: 2080954
        time spent: 9807361 usecs       IOPS: 2039284

fixed workers:
./run_nop_wqe.sh nop_wqe_fixed 200000 100 3 1-3
        time spent: 17314274 usecs      IOPS: 1155116
        time spent: 17016942 usecs      IOPS: 1175299
        time spent: 17908684 usecs      IOPS: 1116776

About 2x improvement. The perf results show almost no acct->lock contention.
Test program: https://github.com/HowHsu/liburing/tree/fixed_worker
liburing/test/nop_wqe.c
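
The IOPS column can be re-derived from the "time spent" column. The sketch below assumes that the first two arguments to run_nop_wqe.sh (200000 and 100) are the loop count and the queue depth; the script itself is not shown here, so that reading is an assumption. The helper name iops_from_usecs() is made up for this illustration.

```c
/*
 * Illustrative helper, not part of the patchset.
 * Assumption: each run submits loops * depth nops in total, and IOPS is
 * that total scaled to one second, truncated to an integer.
 */
static long long iops_from_usecs(long long loops, long long depth,
                                 long long usecs)
{
    return loops * depth * 1000000LL / usecs;
}
```

Under that assumption, iops_from_usecs(200000, 100, 10464397) yields 1911242, which matches the first "normal workers" line.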

Hao Xu (9):
  io-wq: add a worker flag for individual exit
  io-wq: change argument of create_io_worker() for convienence
  io-wq: add infra data structure for fixed workers
  io-wq: tweak io_get_acct()
  io-wq: fixed worker initialization
  io-wq: fixed worker exit
  io-wq: implement fixed worker logic
  io-wq: batch the handling of fixed worker private works
  io_uring: add register fixed worker interface

 fs/io-wq.c                    | 457 ++++++++++++++++++++++++++++++----
 fs/io-wq.h                    |   8 +
 fs/io_uring.c                 |  71 ++++++
 include/uapi/linux/io_uring.h |  11 +
 4 files changed, 498 insertions(+), 49 deletions(-)

-- 
2.36.0


* [RFC v3 0/9] fixed worker
@ 2022-04-29 10:18 Hao Xu
From: Hao Xu @ 2022-04-29 10:18 UTC
  To: io-uring; +Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel, linux-kernel

This is the third version of the fixed worker implementation.
I wrote a nop test program to compare 3 fixed workers against 3 normal workers.
normal workers:
./run_nop_wqe.sh nop_wqe_normal 200000 100 3 1-3
        time spent: 10464397 usecs      IOPS: 1911242
        time spent: 9610976 usecs       IOPS: 2080954
        time spent: 9807361 usecs       IOPS: 2039284

fixed workers:
./run_nop_wqe.sh nop_wqe_fixed 200000 100 3 1-3
        time spent: 17314274 usecs      IOPS: 1155116
        time spent: 17016942 usecs      IOPS: 1175299
        time spent: 17908684 usecs      IOPS: 1116776

About 2x improvement. The perf results show almost no acct->lock contention.
Test program: https://github.com/HowHsu/liburing/tree/fixed_worker
liburing/test/nop_wqe.c

v2->v3:
 - change dispatch work strategy from random to round-robin
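
As a rough user-space illustration of round-robin dispatch (not the kernel code from this series), the sketch below hands out worker indices in a fixed rotating order instead of picking one at random. The names rr_dispatcher and rr_pick() are hypothetical, and the sketch ignores concurrency: a real dispatcher called from several contexts would need an atomic or locked counter.

```c
#include <stddef.h>

/* Hypothetical dispatcher state: which worker gets the next work item. */
struct rr_dispatcher {
    size_t nr_workers;  /* number of fixed workers; must be non-zero */
    size_t next;        /* index of the worker to use next */
};

/*
 * Return the index of the worker that should receive the next work item,
 * then advance so that consecutive items go to consecutive workers.
 */
static size_t rr_pick(struct rr_dispatcher *d)
{
    size_t target = d->next;

    d->next = (d->next + 1) % d->nr_workers;
    return target;
}
```

With three workers, successive calls return 0, 1, 2, 0, and so on, so each private queue receives the same share of work regardless of luck.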

Things to be done:
 - work cancellation still needs more thought
 - not sure yet whether IO_WORKER_F_EXIT is safe enough with respect to
   synchronization
 - the io-wq hash handling is not compatible with fixed workers for now

Any comments are welcome. Thanks in advance.

Hao Xu (9):
  io-wq: add a worker flag for individual exit
  io-wq: change argument of create_io_worker() for convienence
  io-wq: add infra data structure for fixed workers
  io-wq: tweak io_get_acct()
  io-wq: fixed worker initialization
  io-wq: fixed worker exit
  io-wq: implement fixed worker logic
  io-wq: batch the handling of fixed worker private works
  io_uring: add register fixed worker interface

 fs/io-wq.c                    | 460 ++++++++++++++++++++++++++++++----
 fs/io-wq.h                    |   8 +
 fs/io_uring.c                 |  71 ++++++
 include/uapi/linux/io_uring.h |  11 +
 4 files changed, 501 insertions(+), 49 deletions(-)

-- 
2.36.0


* [RFC 0/9] fixed worker: a new way to handle io works
@ 2021-11-24  4:46 Hao Xu
From: Hao Xu @ 2021-11-24  4:46 UTC
  To: Jens Axboe; +Cc: io-uring, Pavel Begunkov, Joseph Qi

There is heavy lock contention in the current io-wq implementation. This
series introduces a new type of io-worker, called a fixed worker, to solve
this problem. It is also a new way to handle works: in this new system,
works are dispatched to different private queues rather than to one long
shared queue.
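
To make the private-queue idea concrete, here is a minimal user-space sketch. It is not the kernel implementation: the struct and function names, the fixed capacity, the use of plain ints as work items and the toy spinlock are all assumptions made for illustration. The point it shows is that pushing to one worker's queue takes only that queue's lock, so submitters targeting different workers do not contend on a single shared lock.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_WORKERS 3   /* hypothetical worker count */
#define QUEUE_CAP  8   /* hypothetical capacity, small for illustration */

/* A private FIFO owned by one worker, guarded by its own lock. */
struct private_queue {
    atomic_flag lock;      /* toy spinlock standing in for a kernel lock */
    int items[QUEUE_CAP];  /* work items are plain ints in this sketch */
    size_t head, len;
};

static struct private_queue queues[NR_WORKERS];

static void queues_init(void)
{
    for (size_t i = 0; i < NR_WORKERS; i++) {
        atomic_flag_clear(&queues[i].lock);
        queues[i].head = queues[i].len = 0;
    }
}

static void queue_lock(struct private_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void queue_unlock(struct private_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Enqueue onto one worker's queue; returns 0, or -1 if the queue is full. */
static int push_work(size_t worker, int item)
{
    struct private_queue *q = &queues[worker];
    int ret = -1;

    queue_lock(q);
    if (q->len < QUEUE_CAP) {
        q->items[(q->head + q->len) % QUEUE_CAP] = item;
        q->len++;
        ret = 0;
    }
    queue_unlock(q);
    return ret;
}

/* Dequeue the oldest item from a worker's queue; returns -1 if empty. */
static int pop_work(size_t worker, int *item)
{
    struct private_queue *q = &queues[worker];
    int ret = -1;

    queue_lock(q);
    if (q->len > 0) {
        *item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->len--;
        ret = 0;
    }
    queue_unlock(q);
    return ret;
}
```

After queues_init(), each queue keeps FIFO order for its own items, but nothing orders items across queues, which is the trade-off a single shared list does not have.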

A detailed introduction and data are in patch 7/9.

To be done: 1) the hash optimization isn't applied yet
            2) user interface
            3) linear ordering cannot be guaranteed for writes to the
               same regular file, since we now have multiple work lists
            4) code cleanup

Sent this for suggestions.

The test program used in this patchset:
// nop_test.c
// Error handling is kept minimal. Build with: gcc nop_test.c -luring
#include <stdio.h>
#include <sys/time.h>
#include <liburing.h>

typedef long long ll;

/* Convert a timeval to microseconds. */
static ll usecs(struct timeval tv)
{
    return tv.tv_sec * (ll)1000 * 1000 + tv.tv_usec;
}

/* Queue `depth` nops, force them to io-wq with IOSQE_ASYNC, then reap them. */
static int test_single_nop(struct io_uring *ring, int depth)
{
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    int i, ret;

    for (i = 0; i < depth; i++) {
        sqe = io_uring_get_sqe(ring);
        if (!sqe)
            return 1;
        io_uring_prep_nop(sqe);
        sqe->flags |= IOSQE_ASYNC;
    }
    ret = io_uring_submit(ring);
    if (ret < 0)
        return 1;
    for (i = 0; i < depth; i++) {
        ret = io_uring_wait_cqe(ring, &cqe);
        if (ret < 0)
            return 1;
        io_uring_cqe_seen(ring, cqe);
    }
    return 0;
}

int main(void)
{
    ll delta;
    struct io_uring ring;
    int ret, l, loop = 4000000, depth = 10;
    struct timeval tv_begin, tv_end;

    ret = io_uring_queue_init(10010, &ring, 0);
    if (ret) {
        fprintf(stderr, "ring setup failed: %d\n", ret);
        return 1;
    }
    l = loop;
    gettimeofday(&tv_begin, NULL);
    while (loop--) {
        if (test_single_nop(&ring, depth)) {
            fprintf(stderr, "nop batch failed\n");
            io_uring_queue_exit(&ring);
            return 1;
        }
    }
    gettimeofday(&tv_end, NULL);
    delta = usecs(tv_end) - usecs(tv_begin);
    printf("time spent: %lld usecs\n", delta);
    /* Total nops completed, scaled to one second. */
    printf("IOPS: %lld\n", (ll)l * depth * 1000000 / delta);

    io_uring_queue_exit(&ring);
    return 0;
}


Hao Xu (9):
  io-wq: decouple work_list protection from the big wqe->lock
  io-wq: reduce acct->lock crossing functions lock/unlock
  io-wq: update check condition for lock
  io-wq: use IO_WQ_ACCT_NR rather than hardcoded number
  io-wq: move hash wait entry to io_wqe_acct
  io-wq: add infra data structure for fix workers
  io-wq: implement fixed worker logic
  io-wq: batch the handling of fixed worker private works
  io-wq: small optimization for __io_worker_busy()

 fs/io-wq.c | 415 ++++++++++++++++++++++++++++++++++++++---------------
 fs/io-wq.h |   5 +
 2 files changed, 308 insertions(+), 112 deletions(-)

-- 
2.24.4


