From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, mreitz@redhat.com,
	kwolf@redhat.com, pbonzini@redhat.com
Subject: Re: [PATCH v4 20/32] nbd/client-connection: implement connection retry
Date: Mon, 22 Nov 2021 20:17:34 +0300
Message-ID: <fca77dff-caba-907b-6ab2-91ed9987760f@virtuozzo.com>
In-Reply-To: <20211122163001.ahvcby7rrg4hc23n@redhat.com>

22.11.2021 19:30, Eric Blake wrote:
> Reviving this thread, as good a place as any for my question:
> 
> On Thu, Jun 10, 2021 at 01:07:50PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Add an option for a thread to retry connection until succeeds. We'll
>> use nbd/client-connection both for reconnect and for initial connection
>> in nbd_open(), so we need a possibility to use same NBDClientConnection
>> instance to connect once in nbd_open() and then use retry semantics for
>> reconnect.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/nbd.h     |  2 ++
>>   nbd/client-connection.c | 56 +++++++++++++++++++++++++++++++----------
>>   2 files changed, 45 insertions(+), 13 deletions(-)
> 
>>   NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
>>                                                  bool do_negotiation,
>>                                                  const char *export_name,
>> @@ -154,23 +164,43 @@ static void *connect_thread_func(void *opaque)
>>       NBDClientConnection *conn = opaque;
>>       int ret;
>>       bool do_free;
>> +    uint64_t timeout = 1;
>> +    uint64_t max_timeout = 16;
>>   
>> -    conn->sioc = qio_channel_socket_new();
>> +    while (true) {
>> +        conn->sioc = qio_channel_socket_new();
>>   
>> -    error_free(conn->err);
>> -    conn->err = NULL;
>> -    conn->updated_info = conn->initial_info;
>> +        error_free(conn->err);
>> +        conn->err = NULL;
>> +        conn->updated_info = conn->initial_info;
>>   
>> -    ret = nbd_connect(conn->sioc, conn->saddr,
>> -                      conn->do_negotiation ? &conn->updated_info : NULL,
>> -                      conn->tlscreds, &conn->ioc, &conn->err);
> 
> This says that on each retry attempt, we reset whether to ask the
> server for structured replies back to our original initial_info
> values.
> 
> But when dealing with NBD retries in general, I suspect we have a bug.
> Consider what happens if our first connection requests structured
> replies and base:allocation block status, and we are successful.  But
> later, the server disconnects, triggering a retry.  Suppose that on
> our retry, we encounter a different server that no longer supports
> structured replies.  We would no longer be justified in sending
> NBD_CMD_BLOCK_STATUS requests to the reconnected server.  But I can't
> find anywhere in the code base that ensures that on a reconnect, the
> new server supplies at least as many extensions as the original
> server, nor anywhere that we would be able to gracefully handle an
> in-flight block status command that can no longer be successfully
> continued because the reconnect landed on a downgraded server.
> 
> In general, you don't expect a server to downgrade its capabilities
> across restarts, so assuming that a retried connection will hit a
> server at least as capable as the original server is typical, even if
> unsafe.  But it is easy enough to use nbdkit to write a server that
> purposefully downgrades its abilities after the first client
> connection, for testing how qemu falls apart if it continues making
> assumptions about the current server based solely on what it learned
> prior to retrying from the first server.
> 
> Is this something we need to address quickly for inclusion in 6.2?
> Maybe by having a retry connect fail if the new server does not have
> the same capabilities as the old?  Do we also need to care about a
> server reporting a different size export than the old server?
> 

Yes, that's a problem. We previously noted it here: https://lists.gnu.org/archive/html/qemu-block/2021-06/msg00458.html

Honestly, I haven't started on a fix for that :( I agree, it would be good to fix it somehow in 6.2. I'll try to put together something simple this week. Or have you already started working on a fix?
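
To make the idea concrete, here is a minimal sketch of the kind of check suggested above: after a reconnect, compare what the new server negotiated with what the first one did, and fail the attempt on a downgrade. This is only an illustration, not QEMU code. SketchExportInfo and sketch_reconnect_compatible() are hypothetical names, the struct is a trimmed-down stand-in loosely modelled on NBDExportInfo, and treating a changed export size as a mismatch is an assumption (whether we need to care about size is still an open question in this thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the negotiated connection state.
 * Not the real NBDExportInfo; only the fields discussed in this thread.
 */
typedef struct SketchExportInfo {
    bool structured_reply;   /* server agreed to structured replies */
    bool base_allocation;    /* base:allocation meta context negotiated */
    uint64_t size;           /* export size in bytes */
} SketchExportInfo;

/*
 * Return true if the reconnected server is at least as capable as the
 * original one. On mismatch, return false and store a short explanation
 * in @reason. On success, @reason is set to the empty string.
 */
static bool sketch_reconnect_compatible(const SketchExportInfo *old_info,
                                        const SketchExportInfo *new_info,
                                        char *reason, size_t reason_len)
{
    assert(old_info && new_info && reason && reason_len > 0);

    /* Losing structured replies means block status can no longer be sent. */
    if (old_info->structured_reply && !new_info->structured_reply) {
        snprintf(reason, reason_len, "server dropped structured replies");
        return false;
    }

    /* Same for the base:allocation context itself. */
    if (old_info->base_allocation && !new_info->base_allocation) {
        snprintf(reason, reason_len, "server dropped base:allocation");
        return false;
    }

    /* Assumption: any size change is treated as incompatible. */
    if (old_info->size != new_info->size) {
        snprintf(reason, reason_len, "export size changed");
        return false;
    }

    reason[0] = '\0';
    return true;
}
```

In a real fix the comparison would presumably run on conn->updated_info right after a successful nbd_connect() in the retry path, with a failure turned into an error on the reconnect attempt rather than a silently degraded connection. Where exactly it belongs is not decided here.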


-- 
Best regards,
Vladimir

