From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Zhimin Feng <fengzhimin1@huawei.com>
Cc: zhang.zhanghailiang@huawei.com, quintela@redhat.com,
	qemu-devel@nongnu.org, armbru@redhat.com, jemmy858585@gmail.com
Subject: Re: [PATCH RFC 12/12] migration/rdma: only register the virt-ram block for MultiRDMA
Date: Fri, 17 Jan 2020 18:52:27 +0000	[thread overview]
Message-ID: <20200117185227.GS3209@work-vm> (raw)
In-Reply-To: <20200109045922.904-13-fengzhimin1@huawei.com>

* Zhimin Feng (fengzhimin1@huawei.com) wrote:
> From: fengzhimin <fengzhimin1@huawei.com>
> 
> The virt-ram block is sent over the MultiRDMA channels, so we register
> it only for the MultiRDMA channels; the main channel does not register
> the virt-ram block.
> 
> Signed-off-by: fengzhimin <fengzhimin1@huawei.com>

You can't specialise on the name of the RAMBlock like that.
'mach-virt.ram' is the name of the main RAM only on the aarch64
'virt' machine type; for example, the name on x86 is completely
different, and if you use NUMA or hotplug etc. it would also be
different on aarch64.

Is there a downside to also registering the mach-virt.ram on the main
channel?

Dave

> ---
>  migration/rdma.c | 140 +++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 112 insertions(+), 28 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 0a150099e2..1477fd509b 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -618,7 +618,9 @@ const char *print_wrid(int wrid);
>  static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
>                                     uint8_t *data, RDMAControlHeader *resp,
>                                     int *resp_idx,
> -                                   int (*callback)(RDMAContext *rdma));
> +                                   int (*callback)(RDMAContext *rdma,
> +                                   uint8_t id),
> +                                   uint8_t id);
>  
>  static inline uint64_t ram_chunk_index(const uint8_t *start,
>                                         const uint8_t *host)
> @@ -1198,24 +1200,81 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
>      return 0;
>  }
>  
> -static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
> +/*
> + * Parameters:
> + *    @id == UNUSED_ID :
> + *    Register memory for the main RDMA channel; when the multiRDMA
> + *    method is used to migrate, the main channel does not register
> + *    the mach-virt.ram block.
> + *
> + *    @id == 0, 1, ... :
> + *    Register memory for a multiRDMA channel; the multiRDMA channels
> + *    register memory only for the mach-virt.ram block.
> + */
> +static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma, uint8_t id)
>  {
>      int i;
>      RDMALocalBlocks *local = &rdma->local_ram_blocks;
>  
> -    for (i = 0; i < local->nb_blocks; i++) {
> -        local->block[i].mr =
> -            ibv_reg_mr(rdma->pd,
> -                    local->block[i].local_host_addr,
> -                    local->block[i].length,
> -                    IBV_ACCESS_LOCAL_WRITE |
> -                    IBV_ACCESS_REMOTE_WRITE
> -                    );
> -        if (!local->block[i].mr) {
> -            perror("Failed to register local dest ram block!\n");
> -            break;
> +    if (migrate_use_multiRDMA()) {
> +        if (id == UNUSED_ID) {
> +            for (i = 0; i < local->nb_blocks; i++) {
> +                /* main RDMA channel doesn't register the mach-virt.ram block */
> +                if (strcmp(local->block[i].block_name, "mach-virt.ram") == 0) {
> +                    continue;
> +                }
> +
> +                local->block[i].mr =
> +                    ibv_reg_mr(rdma->pd,
> +                            local->block[i].local_host_addr,
> +                            local->block[i].length,
> +                            IBV_ACCESS_LOCAL_WRITE |
> +                            IBV_ACCESS_REMOTE_WRITE
> +                            );
> +                if (!local->block[i].mr) {
> +                    perror("Failed to register local dest ram block!\n");
> +                    break;
> +                }
> +                rdma->total_registrations++;
> +            }
> +        } else {
> +            for (i = 0; i < local->nb_blocks; i++) {
> +                /*
> +                 * The multiRDMA channels only register
> +                 * the mach-virt.ram block
> +                 */
> +                if (strcmp(local->block[i].block_name, "mach-virt.ram") == 0) {
> +                    local->block[i].mr =
> +                        ibv_reg_mr(rdma->pd,
> +                                local->block[i].local_host_addr,
> +                                local->block[i].length,
> +                                IBV_ACCESS_LOCAL_WRITE |
> +                                IBV_ACCESS_REMOTE_WRITE
> +                                );
> +                    if (!local->block[i].mr) {
> +                        perror("Failed to register local dest ram block!\n");
> +                        break;
> +                    }
> +                    rdma->total_registrations++;
> +                }
> +            }
> +        }
> +    } else {
> +        for (i = 0; i < local->nb_blocks; i++) {
> +            local->block[i].mr =
> +                ibv_reg_mr(rdma->pd,
> +                        local->block[i].local_host_addr,
> +                        local->block[i].length,
> +                        IBV_ACCESS_LOCAL_WRITE |
> +                        IBV_ACCESS_REMOTE_WRITE
> +                        );
> +            if (!local->block[i].mr) {
> +                perror("Failed to register local dest ram block!\n");
> +                break;
> +            }
> +            rdma->total_registrations++;
>          }
> -        rdma->total_registrations++;
>      }
>  
>      if (i >= local->nb_blocks) {
> @@ -1223,8 +1282,10 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
>      }
>  
>      for (i--; i >= 0; i--) {
> -        ibv_dereg_mr(local->block[i].mr);
> -        rdma->total_registrations--;
> +        if (local->block[i].mr) {
> +            ibv_dereg_mr(local->block[i].mr);
> +            rdma->total_registrations--;
> +        }
>      }
>  
>      return -1;
> @@ -1446,7 +1507,7 @@ static int qemu_rdma_unregister_waiting(RDMAContext *rdma)
>          reg.key.chunk = chunk;
>          register_to_network(rdma, &reg);
>          ret = qemu_rdma_exchange_send(rdma, &head, (uint8_t *) &reg,
> -                                &resp, NULL, NULL);
> +                                      &resp, NULL, NULL, UNUSED_ID);
>          if (ret < 0) {
>              return ret;
>          }
> @@ -1915,11 +1976,17 @@ static void qemu_rdma_move_header(RDMAContext *rdma, int idx,
>   * The extra (optional) response is used during registration to save us from
>   * having to perform an *additional* exchange of messages just to provide a
>   * response, by instead piggy-backing on the acknowledgement.
> + *
> + * Parameters:
> + *    @id : the callback function takes two parameters; @id is the second.
> + *
>   */
>  static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
>                                     uint8_t *data, RDMAControlHeader *resp,
>                                     int *resp_idx,
> -                                   int (*callback)(RDMAContext *rdma))
> +                                   int (*callback)(RDMAContext *rdma,
> +                                   uint8_t id),
> +                                   uint8_t id)
>  {
>      int ret = 0;
>  
> @@ -1973,7 +2040,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
>      if (resp) {
>          if (callback) {
>              trace_qemu_rdma_exchange_send_issue_callback();
> -            ret = callback(rdma);
> +            ret = callback(rdma, id);
>              if (ret < 0) {
>                  return ret;
>              }
> @@ -2168,7 +2235,7 @@ retry:
>  
>                  compress_to_network(rdma, &comp);
>                  ret = qemu_rdma_exchange_send(rdma, &head,
> -                                (uint8_t *) &comp, NULL, NULL, NULL);
> +                                (uint8_t *) &comp, NULL, NULL, NULL, UNUSED_ID);
>  
>                  if (ret < 0) {
>                      return -EIO;
> @@ -2195,7 +2262,7 @@ retry:
>  
>              register_to_network(rdma, &reg);
>              ret = qemu_rdma_exchange_send(rdma, &head, (uint8_t *) &reg,
> -                                    &resp, &reg_result_idx, NULL);
> +                                    &resp, &reg_result_idx, NULL, UNUSED_ID);
>              if (ret < 0) {
>                  return ret;
>              }
> @@ -2828,7 +2895,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>              head.len = len;
>              head.type = RDMA_CONTROL_QEMU_FILE;
>  
> -            ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
> +            ret = qemu_rdma_exchange_send(rdma, &head, data, NULL,
> +                                          NULL, NULL, UNUSED_ID);
>  
>              if (ret < 0) {
>                  rdma->error_state = ret;
> @@ -3660,7 +3728,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
>              }
>  
>              if (rdma->pin_all) {
> -                ret = qemu_rdma_reg_whole_ram_blocks(rdma);
> +                ret = qemu_rdma_reg_whole_ram_blocks(rdma, UNUSED_ID);
>                  if (ret) {
>                      error_report("rdma migration: error dest "
>                                      "registering ram blocks");
> @@ -3675,6 +3743,15 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
>               * their "local" descriptions with what was sent.
>               */
>              for (i = 0; i < local->nb_blocks; i++) {
> +                /*
> +                 * Use the main RDMA channel to deliver the device blocks,
> +                 * and the multiRDMA channels to deliver the mach-virt.ram
> +                 * RAMBlock.
> +                 */
> +                if (migrate_use_multiRDMA() &&
> +                    strcmp(local->block[i].block_name, "mach-virt.ram") == 0) {
> +                        continue;
> +                }
> +
>                  rdma->dest_blocks[i].remote_host_addr =
>                      (uintptr_t)(local->block[i].local_host_addr);
>  
> @@ -3992,7 +4069,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
>           */
>          ret = qemu_rdma_exchange_send(rdma, &head, NULL, &resp,
>                      &reg_result_idx, rdma->pin_all ?
> -                    qemu_rdma_reg_whole_ram_blocks : NULL);
> +                    qemu_rdma_reg_whole_ram_blocks : NULL, UNUSED_ID);
>          if (ret < 0) {
>              ERROR(errp, "receiving remote info!");
>              return ret;
> @@ -4025,6 +4102,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
>          memcpy(rdma->dest_blocks,
>              rdma->wr_data[reg_result_idx].control_curr, resp.len);
>          for (i = 0; i < nb_dest_blocks; i++) {
> +            if (migrate_use_multiRDMA() &&
> +                strcmp(local->block[i].block_name, "mach-virt.ram") == 0) {
> +                continue;
> +            }
> +
>              network_to_dest_block(&rdma->dest_blocks[i]);
>  
>              /* We require that the blocks are in the same order */
> @@ -4050,7 +4132,8 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
>      trace_qemu_rdma_registration_stop(flags);
>  
>      head.type = RDMA_CONTROL_REGISTER_FINISHED;
> -    ret = qemu_rdma_exchange_send(rdma, &head, NULL, NULL, NULL, NULL);
> +    ret = qemu_rdma_exchange_send(rdma, &head, NULL, NULL,
> +                                  NULL, NULL, UNUSED_ID);
>  
>      if (migrate_use_multiRDMA()) {
>          /*
> @@ -4298,7 +4381,7 @@ static int qemu_multiRDMA_registration_handle(void *opaque)
>                    sizeof(RDMALocalBlock), dest_ram_sort_func);
>  
>              if (rdma->pin_all) {
> -                ret = qemu_rdma_reg_whole_ram_blocks(rdma);
> +                ret = qemu_rdma_reg_whole_ram_blocks(rdma, p->id);
>                  if (ret) {
>                      error_report("rdma migration: error dest "
>                                   "registering ram blocks");
> @@ -4680,7 +4763,7 @@ static void *multiRDMA_send_thread(void *opaque)
>  
>      ret = qemu_rdma_exchange_send(rdma, &head, NULL, &resp,
>              &reg_result_idx, rdma->pin_all ?
> -            qemu_rdma_reg_whole_ram_blocks : NULL);
> +            qemu_rdma_reg_whole_ram_blocks : NULL, p->id);
>      if (ret < 0) {
>          return NULL;
>      }
> @@ -4749,7 +4832,8 @@ static void *multiRDMA_send_thread(void *opaque)
>  
>          /* Send FINISHED to the destination */
>          head.type = RDMA_CONTROL_REGISTER_FINISHED;
> -        ret = qemu_rdma_exchange_send(rdma, &head, NULL, NULL, NULL, NULL);
> +        ret = qemu_rdma_exchange_send(rdma, &head, NULL, NULL,
> +                                      NULL, NULL, p->id);
>          if (ret < 0) {
>              return NULL;
>          }
> -- 
> 2.19.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



Thread overview: 37+ messages
2020-01-09  4:59 [PATCH RFC 00/12] *** mulitple RDMA channels for migration *** Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 01/12] migration: Add multiRDMA capability support Zhimin Feng
2020-01-13 15:30   ` Markus Armbruster
2020-01-15  1:55     ` fengzhimin
2020-01-13 16:26   ` Eric Blake
2020-01-15  2:04     ` fengzhimin
2020-01-15 18:09   ` Dr. David Alan Gilbert
2020-01-16 13:18     ` Juan Quintela
2020-01-17  1:30       ` fengzhimin
2020-01-09  4:59 ` [PATCH RFC 02/12] migration: Export the 'migration_incoming_setup' function and add the 'migrate_use_rdma_pin_all' function Zhimin Feng
2020-01-15 18:57   ` Dr. David Alan Gilbert
2020-01-16 13:19   ` Juan Quintela
2020-01-09  4:59 ` [PATCH RFC 03/12] migration: Create the multi-rdma-channels parameter Zhimin Feng
2020-01-13 15:34   ` Markus Armbruster
2020-01-15  1:57     ` fengzhimin
2020-01-16 13:20   ` Juan Quintela
2020-01-09  4:59 ` [PATCH RFC 04/12] migration/rdma: Create multiRDMA migration threads Zhimin Feng
2020-01-16 13:25   ` Juan Quintela
2020-01-17  1:32     ` fengzhimin
2020-01-09  4:59 ` [PATCH RFC 05/12] migration/rdma: Create the multiRDMA channels Zhimin Feng
2020-01-15 19:54   ` Dr. David Alan Gilbert
2020-01-16 13:30   ` Juan Quintela
2020-01-09  4:59 ` [PATCH RFC 06/12] migration/rdma: Transmit initial package Zhimin Feng
2020-01-15 18:36   ` Dr. David Alan Gilbert
2020-01-09  4:59 ` [PATCH RFC 07/12] migration/rdma: Be sure all channels are created Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 08/12] migration/rdma: register memory for multiRDMA channels Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 09/12] migration/rdma: Wait for all multiRDMA to complete registration Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 10/12] migration/rdma: use multiRDMA to send RAM block for rdma-pin-all mode Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 11/12] migration/rdma: use multiRDMA to send RAM block for NOT " Zhimin Feng
2020-01-09  4:59 ` [PATCH RFC 12/12] migration/rdma: only register the virt-ram block for MultiRDMA Zhimin Feng
2020-01-17 18:52   ` Dr. David Alan Gilbert [this message]
2020-01-19  1:44     ` fengzhimin
2020-01-20  9:05       ` Dr. David Alan Gilbert
2020-01-21  1:30         ` fengzhimin
2020-01-09 10:38 ` [PATCH RFC 00/12] *** mulitple RDMA channels for migration *** no-reply
2020-01-15 19:57 ` Dr. David Alan Gilbert
2020-01-16  1:37   ` fengzhimin
