qemu-devel.nongnu.org archive mirror
From: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
To: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
Cc: qemu devel list <qemu-devel@nongnu.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH 1/2] migration/rdma: Try to register On-Demand Paging memory region
Date: Mon, 23 Aug 2021 10:03:31 +0300
Message-ID: <CAC_L=vXqtWQoDD-gb36gzcQAWe=9-sAEpUgEJgSN=DTPghtyLw@mail.gmail.com>
In-Reply-To: <7ee884ab-3927-79c2-dd5b-4f805442d118@fujitsu.com>

Hi Zhijian,

On Mon, Aug 23, 2021 at 4:41 AM lizhijian@fujitsu.com
<lizhijian@fujitsu.com> wrote:
>
>
>
> On 22/08/2021 16:53, Marcel Apfelbaum wrote:
> > Hi
> >
> > On Sat, Jul 31, 2021 at 5:00 PM Li Zhijian <lizhijian@cn.fujitsu.com> wrote:
> >> Previously, registering an fsdax mem-backend-file failed with
> >> "Operation not supported". In this case, we can try to register it with
> >> On-Demand Paging[1], as rpma_mr_reg() does in rpma[2].
> >>
> >> [1]: https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
> >> [2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3
> >> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> >> ---
> >>   migration/rdma.c       | 27 ++++++++++++++++++---------
> >>   migration/trace-events |  1 +
> >>   2 files changed, 19 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/migration/rdma.c b/migration/rdma.c
> >> index 5c2d113aa94..8784b5f22a6 100644
> >> --- a/migration/rdma.c
> >> +++ b/migration/rdma.c
> >> @@ -1123,15 +1123,21 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
> >>       RDMALocalBlocks *local = &rdma->local_ram_blocks;
> >>
> >>       for (i = 0; i < local->nb_blocks; i++) {
> >> +        int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE;
> >> +
> >> +on_demand:
> >>           local->block[i].mr =
> >>               ibv_reg_mr(rdma->pd,
> >>                       local->block[i].local_host_addr,
> >> -                    local->block[i].length,
> >> -                    IBV_ACCESS_LOCAL_WRITE |
> >> -                    IBV_ACCESS_REMOTE_WRITE
> >> +                    local->block[i].length, access
> >>                       );
> >>           if (!local->block[i].mr) {
> >> -            perror("Failed to register local dest ram block!");
> >> +            if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
> >> +                access |= IBV_ACCESS_ON_DEMAND;
> >> +                trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
> >> +                goto on_demand;
> > Wouldn't it be better to check first if the device supports ODP ?
> > Something like:
> >      ret = ibv_exp_query_device(context, &dattr);
> >      if (dattr.exp_device_cap_flags & IBV_EXP_DEVICE_ODP)...
>
> Good idea !
>
>
>
> >
> > Also, I am not (personally) too fond of the "on_demand" label usage here,
> > however I will let the maintainer/others decide.
> Indeed, how about just repeating the ibv_reg_mr() call instead of using a 'goto'?
>

Any "boring"/standard approach would do.
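
To make "boring" concrete, here is a minimal sketch of the retry without a
label: try the plain registration once, and only if it fails with ENOTSUP and
the device is known to support ODP, repeat the call with the on-demand flag
set. Everything named below is hypothetical and not taken from the patch: the
SKETCH_ACCESS_* values are placeholders for the IBV_ACCESS_* flags, reg_fn
stands in for ibv_reg_mr() so the sketch builds without libibverbs, and
odp_supported is assumed to come from a device capability query done
beforehand (with upstream rdma-core that would presumably be
ibv_query_device_ex() and the odp_caps field rather than the ibv_exp_* call
quoted above).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder flags for this sketch only; real code would use the
 * IBV_ACCESS_* values from <infiniband/verbs.h>. */
#define SKETCH_ACCESS_LOCAL_WRITE  (1 << 0)
#define SKETCH_ACCESS_REMOTE_WRITE (1 << 1)
#define SKETCH_ACCESS_ON_DEMAND    (1 << 6)

/* Hypothetical stand-in for ibv_reg_mr(): returns NULL and sets errno
 * on failure, a non-NULL handle on success. */
typedef void *(*reg_fn)(void *pd, void *addr, size_t len, int access);

/*
 * Register a region, repeating the call once with the on-demand flag if
 * the first attempt fails with ENOTSUP and the device supports ODP.
 * *used_odp is set to 1 only when the on-demand retry succeeded.
 */
static void *reg_mr_with_odp_fallback(reg_fn reg, void *pd, void *addr,
                                      size_t len, int access,
                                      int odp_supported, int *used_odp)
{
    void *mr;

    assert(reg != NULL);
    assert(used_odp != NULL);
    *used_odp = 0;

    mr = reg(pd, addr, len, access);
    if (mr || errno != ENOTSUP || !odp_supported) {
        return mr; /* success, or a failure that ODP cannot help with */
    }

    /* Second and last attempt, this time as an ODP region. */
    mr = reg(pd, addr, len, access | SKETCH_ACCESS_ON_DEMAND);
    if (mr) {
        *used_odp = 1;
    }
    return mr;
}

/* Fake registrar for illustration: behaves like a region that can only
 * be registered on demand (for example, an fsdax-backed block). */
static void *fake_reg(void *pd, void *addr, size_t len, int access)
{
    static int dummy_mr;

    (void)pd;
    (void)addr;
    (void)len;
    if (!(access & SKETCH_ACCESS_ON_DEMAND)) {
        errno = ENOTSUP;
        return NULL;
    }
    return &dummy_mr;
}
```

Both call sites in the patch could share one helper of this shape, so the
errno check and the trace point would live in a single place.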

Thanks,
Marcel

> Thanks
> Zhijian
>
>
>
> >
> > Thanks,
> > Marcel
> >
> >> +            }
> >> +            perror("Failed to register local dest ram block!\n");
> >>               break;
> >>           }
> >>           rdma->total_registrations++;
> >> @@ -1215,15 +1221,18 @@ static int qemu_rdma_register_and_get_keys(RDMAContext *rdma,
> >>        */
> >>       if (!block->pmr[chunk]) {
> >>           uint64_t len = chunk_end - chunk_start;
> >> +        int access = rkey ? IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE : 0;
> >>
> >>           trace_qemu_rdma_register_and_get_keys(len, chunk_start);
> >>
> >> -        block->pmr[chunk] = ibv_reg_mr(rdma->pd,
> >> -                chunk_start, len,
> >> -                (rkey ? (IBV_ACCESS_LOCAL_WRITE |
> >> -                        IBV_ACCESS_REMOTE_WRITE) : 0));
> >> -
> >> +on_demand:
> >> +        block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
> >>           if (!block->pmr[chunk]) {
> >> +            if (!(access & IBV_ACCESS_ON_DEMAND) && errno == ENOTSUP) {
> >> +                access |= IBV_ACCESS_ON_DEMAND;
> >> +                trace_qemu_rdma_register_odp_mr(block->block_name);
> >> +                goto on_demand;
> >> +            }
> >>               perror("Failed to register chunk!");
> >>               fprintf(stderr, "Chunk details: block: %d chunk index %d"
> >>                               " start %" PRIuPTR " end %" PRIuPTR
> >> diff --git a/migration/trace-events b/migration/trace-events
> >> index a1c0f034ab8..5f6aa580def 100644
> >> --- a/migration/trace-events
> >> +++ b/migration/trace-events
> >> @@ -212,6 +212,7 @@ qemu_rdma_poll_write(const char *compstr, int64_t comp, int left, uint64_t block
> >>   qemu_rdma_poll_other(const char *compstr, int64_t comp, int left) "other completion %s (%" PRId64 ") received left %d"
> >>   qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
> >>   qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" PRIu64 " bytes @ %p"
> >> +qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand Paging memory region: %s"
> >>   qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
> >>   qemu_rdma_registration_handle_finished(void) ""
> >>   qemu_rdma_registration_handle_ram_blocks(void) ""
> >> --
> >> 2.31.1
> >>
> >>
> >>
> >>
> >


Thread overview: 9+ messages
2021-07-31 14:03 [PATCH 0/2] enable fsdax rdma migration Li Zhijian
2021-07-31 14:03 ` [PATCH 1/2] migration/rdma: Try to register On-Demand Paging memory region Li Zhijian
2021-08-22  8:53   ` Marcel Apfelbaum
2021-08-23  1:41     ` lizhijian
2021-08-23  7:03       ` Marcel Apfelbaum [this message]
2021-07-31 14:03 ` [PATCH 2/2] migration/rdma: advise prefetch write for ODP region Li Zhijian
2021-08-22  8:39   ` Marcel Apfelbaum
2021-08-23  1:21     ` lizhijian
2021-08-16  2:10 ` [PATCH 0/2] enable fsdax rdma migration lizhijian
