All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jinpu Wang <jinpu.wang@cloud.ionos.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Danil Kipnis <danil.kipnis@cloud.ionos.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jack Wang <jinpuwang@gmail.com>,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Leon Romanovsky <leon@kernel.org>,
	Doug Ledford <dledford@redhat.com>,
	Roman Penyaev <rpenyaev@suse.de>,
	Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Subject: Re: [PATCH v9 06/25] RDMA/rtrs: client: main functionality
Date: Wed, 4 Mar 2020 17:43:24 +0100	[thread overview]
Message-ID: <CAMGffEmbV1YL4O860EswBKm2UHBYP_cgqMFYFVc2AdHnAFeu+g@mail.gmail.com> (raw)
In-Reply-To: <597777e7-ac5a-5fc4-c1f7-3ffa5876a6f2@acm.org>

On Tue, Mar 3, 2020 at 5:04 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 3/2/20 5:20 AM, Danil Kipnis wrote:
> > On Sun, Mar 1, 2020 at 2:33 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >> On 2020-02-21 02:47, Jack Wang wrote:
> >>> +static struct rtrs_permit *
> >>> +__rtrs_get_permit(struct rtrs_clt *clt, enum rtrs_clt_con_type con_type)
> >>> +{
> >>> +     size_t max_depth = clt->queue_depth;
> >>> +     struct rtrs_permit *permit;
> >>> +     int cpu, bit;
> >>> +
> >>> +     /* Combined with cq_vector, we pin the IO to the the cpu it comes */
> >>
> >> This comment is confusing. Please clarify this comment. All I see below
> >> is that preemption is disabled. I don't see pinning of I/O to the CPU of
> >> the caller.
> > The comment is addressing a use-case of the driver: The user can
> > assign (under /proc/irq/) the irqs of the HCA cq_vectors "one-to-one"
> > to each cpu. This will "force" the driver to process io response on
> > the same cpu the io has been submitted on.
> > In the code below only preemption is disabled. This can lead to the
> > situation that callers from different cpus will grab the same bit,
> > since find_first_zero_bit is not atomic. But then the
> > test_and_set_bit_lock will fail for all the callers but one, so that
> > they will loop again. This way an explicit spinlock is not required.
> > Will extend the comment.
>
> If the purpose of get_cpu() and put_cpu() calls is to serialize code
> against other threads, please use locking instead of disabling
> preemption. This will help tools that verify locking like lockdep and
> the kernel thread sanitizer (https://github.com/google/ktsan/wiki).
We can look into it, but I'm afraid converting to spinlock might have
a performance impact.

>
> >>> +static int rtrs_post_send_rdma(struct rtrs_clt_con *con,
> >>> +                             struct rtrs_clt_io_req *req,
> >>> +                             struct rtrs_rbuf *rbuf, u32 off,
> >>> +                             u32 imm, struct ib_send_wr *wr)
> >>> +{
> >>> +     struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
> >>> +     enum ib_send_flags flags;
> >>> +     struct ib_sge sge;
> >>> +
> >>> +     if (unlikely(!req->sg_size)) {
> >>> +             rtrs_wrn(con->c.sess,
> >>> +                      "Doing RDMA Write failed, no data supplied\n");
> >>> +             return -EINVAL;
> >>> +     }
> >>> +
> >>> +     /* user data and user message in the first list element */
> >>> +     sge.addr   = req->iu->dma_addr;
> >>> +     sge.length = req->sg_size;
> >>> +     sge.lkey   = sess->s.dev->ib_pd->local_dma_lkey;
> >>> +
> >>> +     /*
> >>> +      * From time to time we have to post signalled sends,
> >>> +      * or send queue will fill up and only QP reset can help.
> >>> +      */
> >>> +     flags = atomic_inc_return(&con->io_cnt) % sess->queue_depth ?
> >>> +                     0 : IB_SEND_SIGNALED;
> >>> +
> >>> +     ib_dma_sync_single_for_device(sess->s.dev->ib_dev, req->iu->dma_addr,
> >>> +                                   req->sg_size, DMA_TO_DEVICE);
> >>> +
> >>> +     return rtrs_iu_post_rdma_write_imm(&con->c, req->iu, &sge, 1,
> >>> +                                         rbuf->rkey, rbuf->addr + off,
> >>> +                                         imm, flags, wr);
> >>> +}
> >>
> >> I don't think that posting a signalled send from time to time is
> >> sufficient to prevent send queue overflow. Please address Jason's
> >> comment from January 7th: "Not quite. If the SQ depth is 16 and you post
> >> 16 things and then signal the last one, you *cannot* post new work until
> >> you see the completion. More SQ space *ONLY* becomes available upon
> >> receipt of a completion. This is why you can't have an unsignaled SQ."
> >
> >> See also https://lore.kernel.org/linux-rdma/20200107182528.GB26174@ziepe.ca/
> > In our case we set the send queue of each QP belonging to one
> > "session" to the one supported by the hardware (max_qp_wr) which is
> > around 5K on our hardware. The queue depth of our "session" is 512.
> > Those 512 are "shared" by all the QPs (number of CPUs on client side)
> > belonging to that session. So we have at most 512 and 512/num_cpus on
> > average inflights on each QP. We never experienced send queue full
> > event in any of our performance tests or production usage. The
> > alternative would be to count submitted requests and completed
> > requests, check the difference before submission and wait if the
> > difference multiplied by the queue depth of "session" exceeds the max
> > supported by the hardware. The check will require quite some code and
> > will most probably affect performance. I do not think it is worth it
> > to introduce a code path which is triggered only on a condition which
> > is known to never become true.
> > Jason, do you think it's necessary to implement such tracking?
>
> Please either make sure that send queues do not overflow by providing
> enough space for 512 in-flight requests fit or implement tracking for
> the number of in-flight requests.
We do have enough space for send queue.
>
> Thanks,
>
> Bart.
>
Thanks Bart!

  reply	other threads:[~2020-03-04 16:43 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-21 10:46 [PATCH v9 00/25] RTRS (former IBTRS) RDMA Transport Library and RNBD (former IBNBD) RDMA Network Block Device Jack Wang
2020-02-21 10:46 ` [PATCH v9 01/25] sysfs: export sysfs_remove_file_self() Jack Wang
2020-03-01  0:24   ` Bart Van Assche
2020-03-02 14:37     ` Jinpu Wang
2020-02-21 10:46 ` [PATCH v9 02/25] RDMA/rtrs: public interface header to establish RDMA connections Jack Wang
2020-03-01  0:31   ` Bart Van Assche
2020-03-02  8:39     ` Jinpu Wang
2020-03-03  9:40   ` Leon Romanovsky
2020-03-03 14:05     ` Jinpu Wang
2020-03-03 14:16       ` Leon Romanovsky
2020-03-03 14:23         ` Jinpu Wang
2020-02-21 10:46 ` [PATCH v9 03/25] RDMA/rtrs: private headers with rtrs protocol structs and helpers Jack Wang
2020-03-01  0:37   ` Bart Van Assche
2020-03-02  9:21     ` Jinpu Wang
2020-03-03  9:45   ` Leon Romanovsky
2020-03-03 13:52     ` Jinpu Wang
2020-03-03 14:05       ` Jason Gunthorpe
2020-03-03 16:13         ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 04/25] RDMA/rtrs: core: lib functions shared between client and server modules Jack Wang
2020-03-01  0:47   ` Bart Van Assche
2020-03-02  8:40     ` Jinpu Wang
2020-03-03  9:57   ` Leon Romanovsky
2020-03-04 11:21     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 05/25] RDMA/rtrs: client: private header with client structs and functions Jack Wang
2020-03-01  0:51   ` Bart Van Assche
2020-03-02 13:49     ` Jinpu Wang
2020-03-02 16:13       ` Bart Van Assche
2020-03-02 16:18         ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 06/25] RDMA/rtrs: client: main functionality Jack Wang
2020-03-01  1:33   ` Bart Van Assche
2020-03-02 13:20     ` Danil Kipnis
2020-03-03 16:04       ` Bart Van Assche
2020-03-04 16:43         ` Jinpu Wang [this message]
2020-03-04 16:49           ` Jason Gunthorpe
2020-03-05 11:26             ` Jinpu Wang
2020-03-05 13:27               ` Jason Gunthorpe
2020-03-05 13:37                 ` Jinpu Wang
2020-03-05 13:54                   ` Jason Gunthorpe
2020-03-06 13:33                     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 07/25] RDMA/rtrs: client: statistics functions Jack Wang
2020-03-03 11:28   ` Leon Romanovsky
2020-03-03 11:46     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 08/25] RDMA/rtrs: client: sysfs interface functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 09/25] RDMA/rtrs: server: private header with server structs and functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 10/25] RDMA/rtrs: server: main functionality Jack Wang
2020-03-01  1:42   ` Bart Van Assche
2020-03-02 14:39     ` Jinpu Wang
2020-03-03 11:37   ` Leon Romanovsky
2020-03-03 16:41     ` Jinpu Wang
2020-03-03 16:59       ` Leon Romanovsky
2020-03-04 11:03         ` Jinpu Wang
2020-03-05  8:00           ` Leon Romanovsky
     [not found]             ` <CAHg0Huyc=pn1=WSKGLjm+c8AcchyQ8q7JS-0ToQyiBRgpGG=jA@mail.gmail.com>
2020-03-05 12:16               ` Leon Romanovsky
2020-03-05 12:28                 ` Jinpu Wang
2020-03-05 12:35                   ` Leon Romanovsky
2020-03-05 13:02                     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 11/25] RDMA/rtrs: server: statistics functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 12/25] RDMA/rtrs: server: sysfs interface functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 13/25] RDMA/rtrs: include client and server modules into kernel compilation Jack Wang
2020-02-21 10:47 ` [PATCH v9 14/25] RDMA/rtrs: a bit of documentation Jack Wang
2020-02-21 10:47 ` [PATCH v9 15/25] block/rnbd: private headers with rnbd protocol structs and helpers Jack Wang
2020-03-01  2:12   ` Bart Van Assche
2020-03-02 16:37     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 16/25] block/rnbd: client: private header with client structs and functions Jack Wang
2020-03-01  2:26   ` Bart Van Assche
2020-03-02 14:59     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 17/25] block/rnbd: client: main functionality Jack Wang
2020-03-01  2:46   ` Bart Van Assche
2020-03-02 14:58     ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 18/25] block/rnbd: client: sysfs interface functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 19/25] block/rnbd: server: private header with server structs and functions Jack Wang
2020-03-01  2:47   ` Bart Van Assche
2020-03-02 10:07     ` Danil Kipnis
2020-02-21 10:47 ` [PATCH v9 20/25] block/rnbd: server: main functionality Jack Wang
2020-03-01  2:58   ` Bart Van Assche
2020-03-02  9:58     ` Danil Kipnis
2020-03-03  5:57       ` Bart Van Assche
2020-02-21 10:47 ` [PATCH v9 21/25] block/rnbd: server: functionality for IO submission to file or block dev Jack Wang
2020-03-01  3:09   ` Bart Van Assche
2020-03-02 10:06     ` Danil Kipnis
2020-03-03 16:20       ` Jinpu Wang
2020-03-03 16:28         ` Bart Van Assche
2020-03-03 16:43           ` Jinpu Wang
2020-02-21 10:47 ` [PATCH v9 22/25] block/rnbd: server: sysfs interface functions Jack Wang
2020-02-21 10:47 ` [PATCH v9 23/25] block/rnbd: include client and server modules into kernel compilation Jack Wang
2020-02-21 10:47 ` [PATCH v9 24/25] block/rnbd: a bit of documentation Jack Wang
2020-02-21 10:47 ` [PATCH v9 25/25] MAINTAINERS: Add maintainers for RNBD/RTRS modules Jack Wang
2020-03-03  9:28 ` [PATCH v9 00/25] RTRS (former IBTRS) RDMA Transport Library and RNBD (former IBNBD) RDMA Network Block Device Leon Romanovsky
2020-03-04 14:06   ` Jinpu Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMGffEmbV1YL4O860EswBKm2UHBYP_cgqMFYFVc2AdHnAFeu+g@mail.gmail.com \
    --to=jinpu.wang@cloud.ionos.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=danil.kipnis@cloud.ionos.com \
    --cc=dledford@redhat.com \
    --cc=hch@infradead.org \
    --cc=jgg@ziepe.ca \
    --cc=jinpuwang@gmail.com \
    --cc=leon@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=pankaj.gupta@cloud.ionos.com \
    --cc=rpenyaev@suse.de \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.