Linux-Block Archive on lore.kernel.org
From: Jinpu Wang <jinpu.wang@cloud.ionos.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Jack Wang <jinpuwang@gmail.com>,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	Bart Van Assche <bvanassche@acm.org>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Danil Kipnis <danil.kipnis@cloud.ionos.com>,
	Roman Penyaev <rpenyaev@suse.de>
Subject: Re: [PATCH v7 06/25] RDMA/rtrs: client: main functionality
Date: Thu, 16 Jan 2020 16:43:41 +0100
Message-ID: <CAMGffE=pym8iz4OVxx7s6i37AU+KPFN3AeVrCTOpLx+N8A9dEQ@mail.gmail.com>
In-Reply-To: <20200116145300.GC12433@unreal>

On Thu, Jan 16, 2020 at 3:53 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Jan 16, 2020 at 01:58:56PM +0100, Jack Wang wrote:
> > From: Jack Wang <jinpu.wang@cloud.ionos.com>
> >
> > This is main functionality of rtrs-client module, which manages
> > set of RDMA connections for each rtrs session, does multipathing,
> > load balancing and failover of RDMA requests.
> >
> > Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > ---
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2967 ++++++++++++++++++++++++
> >  1 file changed, 2967 insertions(+)
> >  create mode 100644 drivers/infiniband/ulp/rtrs/rtrs-clt.c
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > new file mode 100644
> > index 000000000000..717d19d4d930
> > --- /dev/null
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > @@ -0,0 +1,2967 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * RDMA Transport Layer
> > + *
> > + * Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
> > + *
> > + * Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
> > + *
> > + * Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
>
> Please no extra lines between Copyright lines.
I checked the kernel tree, and most copyright notices indeed have no
extra lines in between.

>
> > + */
> > +
> > +#undef pr_fmt
> > +#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
>
> I never understood this pr_fmt() thing, do we really need it?
It lets you customize the print format. In this case it adds the module
name and line number to every message, which is quite useful for debugging.
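
For illustration, here is a minimal userspace sketch of what the pr_fmt()
definition above produces. KBUILD_MODNAME is normally supplied by kbuild, so
the value below is an assumption, and the demo_* names are hypothetical
helpers that are not part of the patch. The variadic macro relies on the GNU
`##__VA_ARGS__` extension, as the kernel's own print macros do.

```c
#include <stdio.h>

/* Userspace stand-ins for the kernel's __stringify() helper. */
#define __stringify_1(x) #x
#define __stringify(x) __stringify_1(x)

/* Assumed value; in-tree this is defined by kbuild, not by the source file. */
#define KBUILD_MODNAME "rtrs_client"

/* Same definition as in the patch: "<module> L<line>: " is prepended. */
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt

/* Hypothetical helper: format into a buffer so the prefix can be inspected. */
#define demo_pr_snprintf(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* Writes "rtrs_client L<line>: <paths> paths connected\n" into buf. */
static int demo_format(char *buf, size_t len, int paths)
{
	return demo_pr_snprintf(buf, len, "%d paths connected\n", paths);
}
```

The line number in the prefix is that of the call site, because __LINE__ is
expanded where the macro is used, not where pr_fmt() is defined.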
>
> > +
> > +#include <linux/module.h>
> > +#include <linux/rculist.h>
> > +#include <linux/blkdev.h> /* for BLK_MAX_SEGMENT_SIZE */
> > +
> > +#include "rtrs-clt.h"
> > +#include "rtrs-log.h"
> > +
> > +#define RTRS_CONNECT_TIMEOUT_MS 30000
> > +
> > +MODULE_DESCRIPTION("RDMA Transport Client");
> > +MODULE_LICENSE("GPL");
> > +
> > +static ushort nr_cons_per_session;
> > +module_param(nr_cons_per_session, ushort, 0444);
> > +MODULE_PARM_DESC(nr_cons_per_session,
> > +              "Number of connections per session. (default: nr_cpu_ids)");
> > +
> > +static int retry_cnt = 7;
> > +module_param_named(retry_cnt, retry_cnt, int, 0644);
> > +MODULE_PARM_DESC(retry_cnt,
> > +              "Number of times to send the message if the remote side didn't respond with Ack or Nack (default: 7, min: "
> > +              __stringify(MIN_RTR_CNT) ", max: "
> > +              __stringify(MAX_RTR_CNT) ")");
> > +
> > +static int __read_mostly noreg_cnt;
> > +module_param_named(noreg_cnt, noreg_cnt, int, 0444);
> > +MODULE_PARM_DESC(noreg_cnt,
> > +              "Max number of SG entries when MR registration does not happen (default: 0)");
>
> We don't like modules in new code.
Could you elaborate a bit? No module parameters? Which ones, all of them?

>
> > +
> > +static const struct rtrs_rdma_dev_pd_ops dev_pd_ops;
> > +static struct rtrs_rdma_dev_pd dev_pd = {
> > +     .ops = &dev_pd_ops
> > +};
> > +
> > +static struct workqueue_struct *rtrs_wq;
> > +static struct class *rtrs_clt_dev_class;
> > +
> > +static inline bool rtrs_clt_is_connected(const struct rtrs_clt *clt)
> > +{
> > +     struct rtrs_clt_sess *sess;
> > +     bool connected = false;
> > +
> > +     rcu_read_lock();
> > +     list_for_each_entry_rcu(sess, &clt->paths_list, s.entry)
> > +             connected |= (READ_ONCE(sess->state) == RTRS_CLT_CONNECTED);
> > +     rcu_read_unlock();
> > +
> > +     return connected;
> > +}
> > +
> > +static inline struct rtrs_permit *
> > +__rtrs_get_permit(struct rtrs_clt *clt, enum rtrs_clt_con_type con_type)
> > +{
> > +     size_t max_depth = clt->queue_depth;
> > +     struct rtrs_permit *permit;
> > +     int cpu, bit;
> > +
> > +     /* Combined with cq_vector, we pin the IO to the cpu it comes from */
> > +     cpu = get_cpu();
> > +     do {
> > +             bit = find_first_zero_bit(clt->permits_map, max_depth);
> > +             if (unlikely(bit >= max_depth)) {
> > +                     put_cpu();
> > +                     return NULL;
> > +             }
> > +
> > +     } while (unlikely(test_and_set_bit_lock(bit, clt->permits_map)));
> > +     put_cpu();
> > +
> > +     permit = GET_PERMIT(clt, bit);
> > +     WARN_ON(permit->mem_id != bit);
> > +     permit->cpu_id = cpu;
> > +     permit->con_type = con_type;
> > +
> > +     return permit;
> > +}
> > +
> > +static inline void __rtrs_put_permit(struct rtrs_clt *clt,
> > +                                   struct rtrs_permit *permit)
> > +{
> > +     clear_bit_unlock(permit->mem_id, clt->permits_map);
> > +}
> > +
> > +struct rtrs_permit *rtrs_clt_get_permit(struct rtrs_clt *clt,
> > +                                       enum rtrs_clt_con_type con_type,
> > +                                       int can_wait)
> > +{
> > +     struct rtrs_permit *permit;
> > +     DEFINE_WAIT(wait);
> > +
> > +     permit = __rtrs_get_permit(clt, con_type);
> > +     if (likely(permit) || !can_wait)
> > +             return permit;
> > +
> > +     do {
> > +             prepare_to_wait(&clt->permits_wait, &wait,
> > +                             TASK_UNINTERRUPTIBLE);
> > +             permit = __rtrs_get_permit(clt, con_type);
> > +             if (likely(permit))
> > +                     break;
> > +
> > +             io_schedule();
> > +     } while (1);
> > +
> > +     finish_wait(&clt->permits_wait, &wait);
> > +
> > +     return permit;
> > +}
> > +EXPORT_SYMBOL(rtrs_clt_get_permit);
> > +
> > +void rtrs_clt_put_permit(struct rtrs_clt *clt, struct rtrs_permit *permit)
> > +{
> > +     if (WARN_ON(!test_bit(permit->mem_id, clt->permits_map)))
> > +             return;
> > +
> > +     __rtrs_put_permit(clt, permit);
> > +
> > +     /*
> > +      * Putting a permit is a barrier, so we will observe
> > +      * new entry in the wait list, no worries.
> > +      */
> > +     if (waitqueue_active(&clt->permits_wait))
>
> Where do you put permit? Does it include barrier?
__rtrs_put_permit() calls clear_bit_unlock(), which includes a barrier.
rnbd-clt calls rtrs_clt_put_permit() before completing the request to the block layer.
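
To make the get/put pairing concrete, here is a rough userspace sketch of the
same bitmap scheme using C11 atomics. It is only an analogy under stated
assumptions: the demo_* names and DEMO_QUEUE_DEPTH are hypothetical, a single
word stands in for clt->permits_map, and fetch-or with acquire ordering and
fetch-and with release ordering stand in for test_and_set_bit_lock() and
clear_bit_unlock().

```c
#include <stdatomic.h>

#define DEMO_QUEUE_DEPTH 8 /* hypothetical; the patch uses clt->queue_depth */

/* One bit per permit; a set bit means the permit is taken. */
static atomic_ulong demo_permits_map;

/* Returns the index of the permit taken, or -1 if none is free. */
static int demo_get_permit(void)
{
	for (;;) {
		unsigned long map = atomic_load_explicit(&demo_permits_map,
							 memory_order_relaxed);
		unsigned long old;
		int bit = -1;

		/* Analogue of find_first_zero_bit(). */
		for (int i = 0; i < DEMO_QUEUE_DEPTH; i++) {
			if (!(map & (1UL << i))) {
				bit = i;
				break;
			}
		}
		if (bit < 0)
			return -1;

		/* Analogue of test_and_set_bit_lock(): acquire ordering. */
		old = atomic_fetch_or_explicit(&demo_permits_map, 1UL << bit,
					       memory_order_acquire);
		if (!(old & (1UL << bit)))
			return bit;
		/* Another thread took the bit first; search again. */
	}
}

static void demo_put_permit(int bit)
{
	/* Analogue of clear_bit_unlock(): release ordering. */
	atomic_fetch_and_explicit(&demo_permits_map, ~(1UL << bit),
				  memory_order_release);
}
```

In this sketch, release ordering only guarantees that accesses made before
the clear are visible before the bit is seen as free. It does not by itself
order the clear against a later load, so the sketch does not settle the
question about the waitqueue_active() check.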
>
> > +             wake_up(&clt->permits_wait);
> > +}
> > +EXPORT_SYMBOL(rtrs_clt_put_permit);
> > +
> > +struct rtrs_permit *rtrs_permit_from_pdu(void *pdu)
> > +{
> > +     return pdu - sizeof(struct rtrs_permit);
>
> C standard doesn't allow pointer arithmetic on void*.
gcc never complains, so I searched around:
https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c

You're right, will fix.
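
One possible shape for the fix, as a sketch only: gcc accepts arithmetic on
void * as an extension (it treats sizeof(void) as 1), so casting to char *
first gives the same byte offset in standard C. struct demo_permit and the
demo_* functions are hypothetical stand-ins, not the real rtrs definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rtrs_permit. */
struct demo_permit {
	unsigned int mem_id;
	unsigned int cpu_id;
};

/* The PDU is placed directly after the permit, as in rtrs_permit_to_pdu(). */
static void *demo_permit_to_pdu(struct demo_permit *permit)
{
	return permit + 1;
}

static struct demo_permit *demo_permit_from_pdu(void *pdu)
{
	/* Cast to char * so the subtraction is in bytes and standard C. */
	return (struct demo_permit *)((char *)pdu - sizeof(struct demo_permit));
}
```

An equivalent alternative is `(struct demo_permit *)pdu - 1`, which mirrors
the `permit + 1` in the opposite direction.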

>
> > +}
> > +EXPORT_SYMBOL(rtrs_permit_from_pdu);
> > +
> > +void *rtrs_permit_to_pdu(struct rtrs_permit *permit)
> > +{
> > +     return permit + 1;
> > +}
> > +EXPORT_SYMBOL(rtrs_permit_to_pdu);
> > +
> > +/**
> > + * rtrs_permit_to_clt_con() - returns RDMA connection pointer by the permit
> > + * @sess: client session pointer
> > + * @permit: permit for the allocation of the RDMA buffer
> > + * Note:
> > + *     IO connection starts from 1.
> > + *     0 connection is for user messages.
> > + */
> > +static
> > +struct rtrs_clt_con *rtrs_permit_to_clt_con(struct rtrs_clt_sess *sess,
> > +                                         struct rtrs_permit *permit)
> > +{
> > +     int id = 0;
> > +
> > +     if (likely(permit->con_type == RTRS_IO_CON))
> > +             id = (permit->cpu_id % (sess->s.con_num - 1)) + 1;
> > +
> > +     return to_clt_con(sess->s.con[id]);
> > +}
> > +
> > +static bool __rtrs_clt_change_state(struct rtrs_clt_sess *sess,
> > +                                  enum rtrs_clt_state new_state)
> > +{
> > +     enum rtrs_clt_state old_state;
> > +     bool changed = false;
> > +
> > +     lockdep_assert_held(&sess->state_wq.lock);
> > +
> > +     old_state = sess->state;
> > +     switch (new_state) {
> > +     case RTRS_CLT_CONNECTING:
> > +             switch (old_state) {
>
> Double switch is better to be avoided.
What would be a better way to do it?
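
One alternative I can think of is a lookup table indexed by the new state,
sketched below in plain C. This is an assumption about what could replace the
nested switch, not what the patch does. Only the RECONNECTING -> CONNECTING
transition is taken from the code above; the other table entries, the demo_*
names, and the assignment of the new state on success are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical subset of enum rtrs_clt_state. */
enum demo_state {
	DEMO_CONNECTING,
	DEMO_RECONNECTING,
	DEMO_CONNECTED,
	DEMO_STATE_COUNT
};

#define DEMO_BIT(s) (1u << (s))

/* allowed_from[new] is a bitmask of old states that may move to new. */
static const unsigned int allowed_from[DEMO_STATE_COUNT] = {
	[DEMO_CONNECTING]   = DEMO_BIT(DEMO_RECONNECTING), /* from the patch */
	[DEMO_RECONNECTING] = DEMO_BIT(DEMO_CONNECTED),    /* hypothetical */
	[DEMO_CONNECTED]    = DEMO_BIT(DEMO_CONNECTING),   /* hypothetical */
};

/* Returns true and updates *state only if the transition is allowed. */
static bool demo_change_state(enum demo_state *state, enum demo_state new_state)
{
	if (!(allowed_from[new_state] & DEMO_BIT(*state)))
		return false;

	*state = new_state;
	return true;
}
```

The table keeps all legal transitions in one place, at the cost of losing the
compiler's warning about unhandled enum values in a switch.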
>
> > +             case RTRS_CLT_RECONNECTING:
> > +                     changed = true;
> > +                     /* FALLTHRU */
> > +             default:
> > +                     break;
> > +             }
> > +             break;
>
>
> ....
>
> Thanks
Thanks


Thread overview: 38+ messages
2020-01-16 12:58 [PATCH v7 00/25] RTRS (former IBTRS) RDMA Transport Library and RNBD (former IBNBD) RDMA Network Block Device Jack Wang
2020-01-16 12:58 ` [PATCH v7 01/25] sysfs: export sysfs_remove_file_self() Jack Wang
2020-01-16 12:58 ` [PATCH v7 02/25] RDMA/rtrs: public interface header to establish RDMA connections Jack Wang
2020-01-16 12:58 ` [PATCH v7 03/25] RDMA/rtrs: private headers with rtrs protocol structs and helpers Jack Wang
2020-01-16 12:58 ` [PATCH v7 04/25] RDMA/rtrs: core: lib functions shared between client and server modules Jack Wang
2020-01-19 14:48   ` Leon Romanovsky
2020-01-16 12:58 ` [PATCH v7 05/25] RDMA/rtrs: client: private header with client structs and functions Jack Wang
2020-01-16 12:58 ` [PATCH v7 06/25] RDMA/rtrs: client: main functionality Jack Wang
2020-01-16 14:53   ` Leon Romanovsky
2020-01-16 15:43     ` Jinpu Wang [this message]
2020-01-16 15:53       ` Jason Gunthorpe
2020-01-16 16:48         ` Jinpu Wang
2020-01-16 15:58       ` Leon Romanovsky
2020-01-16 16:24         ` Jinpu Wang
2020-01-18 10:12           ` Leon Romanovsky
2020-01-16 12:58 ` [PATCH v7 07/25] RDMA/rtrs: client: statistics functions Jack Wang
2020-01-16 12:58 ` [PATCH v7 08/25] RDMA/rtrs: client: sysfs interface functions Jack Wang
2020-01-16 12:58 ` [PATCH v7 09/25] RDMA/rtrs: server: private header with server structs and functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 10/25] RDMA/rtrs: server: main functionality Jack Wang
2020-01-16 12:59 ` [PATCH v7 11/25] RDMA/rtrs: server: statistics functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 12/25] RDMA/rtrs: server: sysfs interface functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 13/25] RDMA/rtrs: include client and server modules into kernel compilation Jack Wang
2020-01-16 12:59 ` [PATCH v7 14/25] RDMA/rtrs: a bit of documentation Jack Wang
2020-01-16 12:59 ` [PATCH v7 15/25] block/rnbd: private headers with rnbd protocol structs and helpers Jack Wang
2020-01-16 12:59 ` [PATCH v7 16/25] block/rnbd: client: private header with client structs and functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 17/25] block/rnbd: client: main functionality Jack Wang
2020-01-16 12:59 ` [PATCH v7 18/25] block/rnbd: client: sysfs interface functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 19/25] block/rnbd: server: private header with server structs and functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 20/25] block/rnbd: server: main functionality Jack Wang
2020-01-16 12:59 ` [PATCH v7 21/25] block/rnbd: server: functionality for IO submission to file or block dev Jack Wang
2020-01-16 12:59 ` [PATCH v7 22/25] block/rnbd: server: sysfs interface functions Jack Wang
2020-01-16 12:59 ` [PATCH v7 23/25] block/rnbd: include client and server modules into kernel compilation Jack Wang
2020-01-16 14:40   ` Leon Romanovsky
2020-01-16 14:54     ` Jinpu Wang
2020-01-16 15:59       ` Leon Romanovsky
2020-01-16 16:53         ` Jinpu Wang
2020-01-16 12:59 ` [PATCH v7 24/25] block/rnbd: a bit of documentation Jack Wang
2020-01-16 12:59 ` [PATCH v7 25/25] MAINTAINERS: Add maintainers for RNBD/RTRS modules Jack Wang
