On 9/17/19 12:13 PM, Vladimir Sementsov-Ogievskiy wrote: > Implement reconnect. To achieve this: > > 1. add new modes: > connecting-wait: means, that reconnecting is in progress, and there > were small number of reconnect attempts, so all requests are > waiting for the connection. > connecting-nowait: reconnecting is in progress, there were a lot of > attempts of reconnect, all requests will return errors. > > two old modes are used too: > connected: normal state > quit: exiting after fatal error or on close > > Possible transitions are: > > * -> quit > connecting-* -> connected > connecting-wait -> connecting-nowait (transition is done after > reconnect-delay seconds in connecting-wait mode) > connected -> connecting-wait > > 2. Implement reconnect in connection_co. So, in connecting-* mode, > connection_co, tries to reconnect unlimited times. > > 3. Retry nbd queries on channel error, if we are in connecting-wait > state. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 332 ++++++++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 269 insertions(+), 63 deletions(-) > > + > +static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s) > +{ > + Error *local_err = NULL; > + > + if (!nbd_client_connecting(s)) { > + return; > + } > + > + /* Wait for completion of all in-flight requests */ > + > + qemu_co_mutex_lock(&s->send_mutex); > + > + while (s->in_flight > 0) { > + qemu_co_mutex_unlock(&s->send_mutex); > + nbd_recv_coroutines_wake_all(s); > + s->wait_in_flight = true; > + qemu_coroutine_yield(); > + s->wait_in_flight = false; > + qemu_co_mutex_lock(&s->send_mutex); > + } > + > + qemu_co_mutex_unlock(&s->send_mutex); > + > + if (!nbd_client_connecting(s)) { > + return; > + } > + > + /* > + * Now we are sure that nobody is accessing the channel, and no one will > + * try until we set the state to CONNECTED. > + */ > + > + /* Finalize previous connection if any */ > + if (s->ioc) { > + nbd_client_detach_aio_context(s->bs); > + object_unref(OBJECT(s->sioc)); > + s->sioc = NULL; > + object_unref(OBJECT(s->ioc)); > + s->ioc = NULL; > + } > + > + s->connect_status = nbd_client_connect(s->bs, &local_err); > + error_free(s->connect_err); > + s->connect_err = NULL; > + error_propagate(&s->connect_err, local_err); > + local_err = NULL; > + Looks like a dead assignment to local_err. But I see elsewhere you add it, because you convert straight-line code into loops where it matters. > + if (s->connect_status < 0) { > + /* failed attempt */ > + return; > + } > + > + /* successfully connected */ > + s->state = NBD_CLIENT_CONNECTED; > + qemu_co_queue_restart_all(&s->free_sema); > +} > + > +static coroutine_fn void nbd_reconnect_loop(BDRVNBDState *s) Coroutine functions should generally have '_co_' in their name. I'd prefer nbd_co_reconnect_loop. > +{ > + uint64_t start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); > + uint64_t delay_ns = s->reconnect_delay * NANOSECONDS_PER_SECOND; > + uint64_t timeout = 1 * NANOSECONDS_PER_SECOND; > + uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND; > + > + nbd_reconnect_attempt(s); > + > + while (nbd_client_connecting(s)) { > + if (s->state == NBD_CLIENT_CONNECTING_WAIT && > + qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns) > + { > + s->state = NBD_CLIENT_CONNECTING_NOWAIT; > + qemu_co_queue_restart_all(&s->free_sema); > + } > + > + qemu_co_sleep_ns_wakeable(QEMU_CLOCK_REALTIME, timeout, > + &s->connection_co_sleep_ns_state); > + if (s->drained) { > + bdrv_dec_in_flight(s->bs); > + s->wait_drained_end = true; > + while (s->drained) { > + /* > + * We may be entered once from nbd_client_attach_aio_context_bh > + * and then from nbd_client_co_drain_end. So here is a loop. > + */ > + qemu_coroutine_yield(); > + } > + bdrv_inc_in_flight(s->bs); > + } > + if (timeout < max_timeout) { > + timeout *= 2; > + } Exponential backup, ok. If I read the loop correctly, you've hardcoded the max_timeout at 16s, which means the overall timeout is about 30s when adding in the time of the earlier iterations. Does that need to be tunable? But for now I can live with it. > + > + nbd_reconnect_attempt(s); > + } > } > > static coroutine_fn void nbd_connection_entry(void *opaque) > @@ -177,16 +330,26 @@ static coroutine_fn void nbd_connection_entry(void *opaque) > * Therefore we keep an additional in_flight reference all the time and > * only drop it temporarily here. > */ > + > + if (nbd_client_connecting(s)) { > + nbd_reconnect_loop(s); > + } > + > + if (s->state != NBD_CLIENT_CONNECTED) { > + continue; > + } Is 'continue' right, even if s->state == QUIT? > + > assert(s->reply.handle == 0); > ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, &local_err); > > if (local_err) { > trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err)); > error_free(local_err); > + local_err = NULL; Could be fun in concert with your proposal to get rid of local_err ;) But here, we aren't using error_propagate(). > } > if (ret <= 0) { > nbd_channel_error(s, ret ? ret : -EIO); > - break; > + continue; > } > > /* We're getting really close. If you can answer my questions above, and the only thing left is adding _co_ in the function name, I could tweak that locally to spare you a v10. At any rate, I'm tentatively queuing this on my NBD tree; I'll probably do a pull request today without it, and save it for next week's PR after I've had a week to hammer on it in local tests. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org