linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, nbd@other.debian.org
Subject: Re: ublk-nbd: ublk-nbd is avaialbe
Date: Thu, 26 Jan 2023 13:54:33 +0100	[thread overview]
Message-ID: <20230126125433.GA4368@1wt.eu> (raw)
In-Reply-To: <Y9JnBDrm0V1ZdWK6@T590>

On Thu, Jan 26, 2023 at 07:41:56PM +0800, Ming Lei wrote:
> On Thu, Jan 26, 2023 at 05:08:22AM +0100, Willy Tarreau wrote:
> > Hi,
> > 
> > On Thu, Jan 26, 2023 at 11:08:26AM +0800, Ming Lei wrote:
> > > Hi Jens,
> > > 
> > > On Thu, Jan 19, 2023 at 11:49:04AM -0700, Jens Axboe wrote:
> > > > On 1/19/23 7:23 AM, Ming Lei wrote:
> > > > > Hi,
> > > > > 
> > > > > ublk-nbd[1] is available now.
> > > > > 
> > > > > Basically it is one nbd client, but totally implemented in userspace,
> > > > > and wrt. current nbd-client in [2], the transmission phase is done
> > > > > by linux block nbd driver.
> > > > > 
> > > > > The handshake implementation is borrowed from nbd project[2], so
> > > > > basically ublk-nbd just adds new code for implementing transmission
> > > > > phase, and it can be thought as moving linux block nbd driver into
> > > > > userspace.
> > > > > 
> > > > > The added new code is basically in nbd/tgt_nbd.cpp, and io handling
> > > > > is based on liburing[3], and implemented by c++20 coroutine, so
> > > > > everything is done in single pthread totally lockless, meantime turns
> > > > > out it is pretty easy to design & implement, attributed to ublk framework,
> > > > > c++20 coroutine and liburing.
> > > > > 
> > > > > ublk-nbd supports both tcp and unix socket, and allows to enable io_uring
> > > > > send zero copy via command line '--send_zc', see details in README[4].
> > > > > 
> > > > > No regression is found in xfstests by using ublk-nbd as both test device
> > > > > and scratch device, and builtin test(make test T=nbd) runs well.
> > > > > 
> > > > > Fio test("make test T=nbd") shows that ublk-nbd performance is
> > > > > basically same with nbd-client/nbd driver when running fio on real
> > > > > ethernet link(1g, 10+g), but ublk-nbd IOPS is higher by ~40% than
> > > > > nbd-client(nbd driver) with 512K BS, which is because linux nbd
> > > > > driver sets max_sectors_kb as 64KB at default.
> > > > > 
> > > > > But when running fio over local tcp socket, it is observed in my test
> > > > > machine that ublk-nbd performs better than nbd-client/nbd driver,
> > > > > especially with 2 queue/2 jobs, and the gap could be 10% ~ 30%
> > > > > according to different block size.
> > > > 
> > > > This is pretty nice! Just curious, have you tried setting up your
> > > > ring with
> > > > 
> > > > p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
> > > > 
> > > > and see if that yields any extra performance improvements for you?
> > > > Depending on how you do processing, you should not need to do any
> > > > further changes there.
> > > > 
> > > > A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.
> > > 
> > > IORING_SETUP_COOP_TASKRUN is enabled in current ublksrv.
> > > 
> > > After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
> > > not see obvious improvement, meantime regression is observed on 64k
> > > rw.
> > 
> > Does it handle network errors better than the default nbd client, i.e.
> > is it able to seamlessly reconnect after while keeping the same device
> > or do you end up with multiple devices ? That's one big trouble I faced
> > with the original nbd client, forcing you to unmount and remount
> > everything after a network outage for example.
> 
> All kinds of ublk disk supports such seamlessly recovery which is
> provided by UBLK_CMD_START_USER_RECOVERY/UBLK_CMD_END_USER_RECOVERY.
> During user recovery, the bdev and gendisk instance won't be gone,
> and will become fully functional after the recovery(such as reconnect)
> is successful.
> 
> So yes for this seamlessly reconnect error handling.

Nice, it's tempting to give it a try then ;-)

Willy

  reply	other threads:[~2023-01-26 12:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-19 14:23 ublk-nbd: ublk-nbd is avaialbe Ming Lei
2023-01-19 18:49 ` Jens Axboe
2023-01-26  3:08   ` Ming Lei
2023-01-26  4:08     ` Willy Tarreau
2023-01-26 11:41       ` Ming Lei
2023-01-26 12:54         ` Willy Tarreau [this message]
2023-02-28 10:04 ` Pavel Machek
2023-03-02  3:11   ` Ming Lei
2023-03-11 13:21 ` Wouter Verhelst
2023-03-12  8:30   ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230126125433.GA4368@1wt.eu \
    --to=w@1wt.eu \
    --cc=axboe@kernel.dk \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=nbd@other.debian.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).