From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Eric Blake <eblake@redhat.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>,
	Denis Lunev <den@virtuozzo.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"stefanha@redhat.com" <stefanha@redhat.com>,
	"mreitz@redhat.com" <mreitz@redhat.com>
Subject: Re: [PATCH v9 2/3] block/nbd: nbd reconnect
Date: Tue, 24 Sep 2019 08:05:43 +0000
Message-ID: <e65d1cd0-9ed4-698e-16e5-a6e6487d098d@virtuozzo.com>
In-Reply-To: <b4532b07-71a9-1890-949d-a017ede348ee@redhat.com>

23.09.2019 22:23, Eric Blake wrote:
> On 9/17/19 12:13 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Implement reconnect. To achieve this:
>>
>> 1. Add new modes:
>>     connecting-wait: reconnecting is in progress, and there have been
>>       only a few reconnect attempts so far, so all requests wait for
>>       the connection.
>>     connecting-nowait: reconnecting is in progress, there have already
>>       been many reconnect attempts, so all requests return errors.
>>
>>     The two old modes are used as well:
>>     connected: normal state
>>     quit: exiting after a fatal error or on close
>>
>> Possible transitions are:
>>
>>     * -> quit
>>     connecting-* -> connected
>>     connecting-wait -> connecting-nowait (transition is done after
>>                        reconnect-delay seconds in connecting-wait mode)
>>     connected -> connecting-wait
>>
>> 2. Implement reconnect in connection_co: in a connecting-* mode,
>>      connection_co tries to reconnect an unlimited number of times.
>>
>> 3. Retry NBD queries on channel error if we are in the connecting-wait
>>      state.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/nbd.c | 332 ++++++++++++++++++++++++++++++++++++++++++----------
>>   1 file changed, 269 insertions(+), 63 deletions(-)
>>
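For context on the modes above: in the patch they are backed by an
NBDClientState enum, and nbd_client_connecting(), used throughout the
hunks below, is true in either connecting-* state. Roughly (a sketch,
not a verbatim quote of block/nbd.c):

    typedef enum NBDClientState {
        NBD_CLIENT_CONNECTING_WAIT,
        NBD_CLIENT_CONNECTING_NOWAIT,
        NBD_CLIENT_CONNECTED,
        NBD_CLIENT_QUIT
    } NBDClientState;

    static bool nbd_client_connecting(BDRVNBDState *s)
    {
        return s->state == NBD_CLIENT_CONNECTING_WAIT ||
               s->state == NBD_CLIENT_CONNECTING_NOWAIT;
    }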
> 
>> +
>> +static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>> +{
>> +    Error *local_err = NULL;
>> +
>> +    if (!nbd_client_connecting(s)) {
>> +        return;
>> +    }
>> +
>> +    /* Wait for completion of all in-flight requests */
>> +
>> +    qemu_co_mutex_lock(&s->send_mutex);
>> +
>> +    while (s->in_flight > 0) {
>> +        qemu_co_mutex_unlock(&s->send_mutex);
>> +        nbd_recv_coroutines_wake_all(s);
>> +        s->wait_in_flight = true;
>> +        qemu_coroutine_yield();
>> +        s->wait_in_flight = false;
>> +        qemu_co_mutex_lock(&s->send_mutex);
>> +    }
>> +
>> +    qemu_co_mutex_unlock(&s->send_mutex);
>> +
>> +    if (!nbd_client_connecting(s)) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * Now we are sure that nobody is accessing the channel, and no one will
>> +     * try until we set the state to CONNECTED.
>> +     */
>> +
>> +    /* Finalize previous connection if any */
>> +    if (s->ioc) {
>> +        nbd_client_detach_aio_context(s->bs);
>> +        object_unref(OBJECT(s->sioc));
>> +        s->sioc = NULL;
>> +        object_unref(OBJECT(s->ioc));
>> +        s->ioc = NULL;
>> +    }
>> +
>> +    s->connect_status = nbd_client_connect(s->bs, &local_err);
>> +    error_free(s->connect_err);
>> +    s->connect_err = NULL;
>> +    error_propagate(&s->connect_err, local_err);
>> +    local_err = NULL;
>> +
> 
> Looks like a dead assignment to local_err. 

Hmm, agreed, it's dead.
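Dropping it, the tail of nbd_reconnect_attempt() would simply be
(untested sketch, just with the trailing reset removed):

    s->connect_status = nbd_client_connect(s->bs, &local_err);
    error_free(s->connect_err);
    s->connect_err = NULL;
    error_propagate(&s->connect_err, local_err);

    if (s->connect_status < 0) {
        /* failed attempt */
        return;
    }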

> But I see elsewhere you add
> it, because you convert straight-line code into loops where it matters.
> 
>> +    if (s->connect_status < 0) {
>> +        /* failed attempt */
>> +        return;
>> +    }
>> +
>> +    /* successfully connected */
>> +    s->state = NBD_CLIENT_CONNECTED;
>> +    qemu_co_queue_restart_all(&s->free_sema);
>> +}
>> +
>> +static coroutine_fn void nbd_reconnect_loop(BDRVNBDState *s)
> 
> Coroutine functions should generally have '_co_' in their name.  I'd
> prefer nbd_co_reconnect_loop.

OK, agreed.

> 
>> +{
>> +    uint64_t start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>> +    uint64_t delay_ns = s->reconnect_delay * NANOSECONDS_PER_SECOND;
>> +    uint64_t timeout = 1 * NANOSECONDS_PER_SECOND;
>> +    uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND;
>> +
>> +    nbd_reconnect_attempt(s);
>> +
>> +    while (nbd_client_connecting(s)) {
>> +        if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
>> +            qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns)
>> +        {
>> +            s->state = NBD_CLIENT_CONNECTING_NOWAIT;
>> +            qemu_co_queue_restart_all(&s->free_sema);
>> +        }
>> +
>> +        qemu_co_sleep_ns_wakeable(QEMU_CLOCK_REALTIME, timeout,
>> +                                  &s->connection_co_sleep_ns_state);
>> +        if (s->drained) {
>> +            bdrv_dec_in_flight(s->bs);
>> +            s->wait_drained_end = true;
>> +            while (s->drained) {
>> +                /*
>> +                 * We may be entered once from nbd_client_attach_aio_context_bh
>> +                 * and then from nbd_client_co_drain_end. So here is a loop.
>> +                 */
>> +                qemu_coroutine_yield();
>> +            }
>> +            bdrv_inc_in_flight(s->bs);
>> +        }
>> +        if (timeout < max_timeout) {
>> +            timeout *= 2;
>> +        }
> 
> Exponential backoff, ok.  If I read the loop correctly, you've hardcoded
> the max_timeout at 16s, which means the overall timeout is about 30s
> when adding in the time of the earlier iterations.  Does that need to be
> tunable?  But for now I can live with it.

I think we can add an option later if needed.
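(For reference on the ~30s figure: with the sleep starting at 1s and
doubling up to the 16s cap, the back-off reaches its steady state after
about 1 + 2 + 4 + 8 + 16 = 31 seconds of sleeping; from then on there is
one reconnect attempt every 16 seconds, without limit.)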

> 
>> +
>> +        nbd_reconnect_attempt(s);
>> +    }
>>   }
>>   
>>   static coroutine_fn void nbd_connection_entry(void *opaque)
>> @@ -177,16 +330,26 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
>>            * Therefore we keep an additional in_flight reference all the time and
>>            * only drop it temporarily here.
>>            */
>> +
>> +        if (nbd_client_connecting(s)) {
>> +            nbd_reconnect_loop(s);
>> +        }
>> +
>> +        if (s->state != NBD_CLIENT_CONNECTED) {
>> +            continue;
>> +        }
> 
> Is 'continue' right, even if s->state == QUIT?

It doesn't matter: the 'continue' just jumps back to the loop condition
"while (s->state != NBD_CLIENT_QUIT) {", so if the state is QUIT we leave
the loop anyway.
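I.e. the surrounding loop in nbd_connection_entry() is roughly (a sketch,
details omitted):

    while (s->state != NBD_CLIENT_QUIT) {
        /* ... */
        if (nbd_client_connecting(s)) {
            nbd_reconnect_loop(s);
        }

        if (s->state != NBD_CLIENT_CONNECTED) {
            continue; /* on QUIT the while condition ends the loop */
        }

        /* ... receive and dispatch one reply ... */
    }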

> 
>> +
>>           assert(s->reply.handle == 0);
>>           ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, &local_err);
>>   
>>           if (local_err) {
>>               trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err));
>>               error_free(local_err);
>> +            local_err = NULL;
> 
> Could be fun in concert with your proposal to get rid of local_err ;)
> But here, we aren't using error_propagate().

And we don't have an Error **errp parameter here either.

> 
>>           }
>>           if (ret <= 0) {
>>               nbd_channel_error(s, ret ? ret : -EIO);
>> -            break;
>> +            continue;
>>           }
>>   
>>           /*
> 
> We're getting really close.  If you can answer my questions above, and
> the only thing left is adding _co_ in the function name, I could tweak
> that locally to spare you a v10.  At any rate, I'm tentatively queuing
> this on my NBD tree; I'll probably do a pull request today without it,
> and save it for next week's PR after I've had a week to hammer on it in
> local tests.
> 

Thank you! That's great!

-- 
Best regards,
Vladimir
