From: Leon Romanovsky <leon@kernel.org>
To: "Saleem, Shiraz" <shiraz.saleem@intel.com>
Cc: "Kirsher, Jeffrey T" <jeffrey.t.kirsher@intel.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"jgg@ziepe.ca" <jgg@ziepe.ca>,
	"Ismail, Mustafa" <mustafa.ismail@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"nhorman@redhat.com" <nhorman@redhat.com>,
	"sassmann@redhat.com" <sassmann@redhat.com>
Subject: Re: [RFC PATCH v5 07/16] RDMA/irdma: Add connection manager
Date: Tue, 21 Apr 2020 10:33:43 +0300
Message-ID: <20200421073343.GJ121146@unreal>
In-Reply-To: <9DD61F30A802C4429A01CA4200E302A7DCD4858B@fmsmsx124.amr.corp.intel.com>

On Tue, Apr 21, 2020 at 12:26:14AM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [RFC PATCH v5 07/16] RDMA/irdma: Add connection manager
> >
> > On Fri, Apr 17, 2020 at 10:12:42AM -0700, Jeff Kirsher wrote:
> > > From: Mustafa Ismail <mustafa.ismail@intel.com>
> > >
> > > Add connection management (CM) implementation for iWARP including
> > > accept, reject, connect, create_listen, destroy_listen and CM utility
> > > functions.
> > >
> > > Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
> > > Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> > > ---
> > >  drivers/infiniband/hw/irdma/cm.c | 4499 ++++++++++++++++++++++++++++++
> > >  drivers/infiniband/hw/irdma/cm.h |  413 +++
> > >  2 files changed, 4912 insertions(+)
> > >  create mode 100644 drivers/infiniband/hw/irdma/cm.c
> > >  create mode 100644 drivers/infiniband/hw/irdma/cm.h
> > >
> > > diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
> > > new file mode 100644
> > > index 000000000000..87d6300fee35
> > > --- /dev/null
> > > +++ b/drivers/infiniband/hw/irdma/cm.c
> > > @@ -0,0 +1,4499 @@
> > > +// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
> > > +/* Copyright (c) 2015 - 2019 Intel Corporation */
> > > +#include <linux/highmem.h>
> > > +#include <net/addrconf.h>
> > > +#include <net/ip6_route.h>
> > > +#include <net/flow.h>
> > > +#include <net/secure_seq.h>
> > > +
> > > +#include "main.h"
> > > +#include "trace.h"
> > > +
> > > +static void irdma_rem_ref_cm_node(struct irdma_cm_node *);
> > > +static void irdma_cm_post_event(struct irdma_cm_event *event);
> > > +static void irdma_disconnect_worker(struct work_struct *work);
> > > +
> > > +/**
> > > + * irdma_free_sqbuf - put back puda buffer if refcount is 0
> > > + * @vsi: The VSI structure of the device
> > > + * @bufp: puda buffer to free
> > > + */
> > > +void irdma_free_sqbuf(struct irdma_sc_vsi *vsi, void *bufp)
> > > +{
> > > +	struct irdma_puda_buf *buf = bufp;
> > > +	struct irdma_puda_rsrc *ilq = vsi->ilq;
> > > +
> > > +	if (refcount_dec_and_test(&buf->refcount))
> > > +		irdma_puda_ret_bufpool(ilq, buf);
> > > +}
> > > +
> > > +/**
> > > + * irdma_derive_hw_ird_setting - Calculate IRD
> > > + * @cm_ird: IRD of connection's node
> > > + *
> > > + * The ird from the connection is rounded to a supported HW
> > > + * setting (2,8,32,64,128) and then encoded for ird_size field
> > > + * of qp_ctx
> > > + */
> > > +u8 irdma_derive_hw_ird_setting(u16 cm_ird)
> > > +{
> > > +	/* ird_size field is encoded in qp_ctx */
> > > +	switch (cm_ird ? roundup_pow_of_two(cm_ird) : 0) {
> > > +	case IRDMA_HW_IRD_SETTING_128:
> > > +		return 4;
> > > +	case IRDMA_HW_IRD_SETTING_64:
> > > +		return 3;
> > > +	case IRDMA_HW_IRD_SETTING_32:
> > > +	case IRDMA_HW_IRD_SETTING_16:
> > > +		return 2;
> > > +	case IRDMA_HW_IRD_SETTING_8:
> > > +	case IRDMA_HW_IRD_SETTING_4:
> > > +		return 1;
> > > +	case IRDMA_HW_IRD_SETTING_2:
> > > +	default:
> > > +		break;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
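
A quick worked example of the encoding above: a requested IRD of 20 rounds
up to 32, which the function encodes as 2 for the qp_ctx ird_size field.
The standalone userspace sketch below mirrors that logic; it assumes the
IRDMA_HW_IRD_SETTING_* constants are simply 2/4/8/16/32/64/128, as the
kernel-doc suggests:

	/* Standalone sketch (userspace, not driver code). */
	#include <stdio.h>
	#include <stdint.h>

	static uint32_t roundup_pow2(uint32_t n)
	{
		uint32_t p = 1;

		while (p < n)
			p <<= 1;
		return p;
	}

	static uint8_t derive_hw_ird_setting(uint16_t cm_ird)
	{
		switch (cm_ird ? roundup_pow2(cm_ird) : 0) {
		case 128: return 4;
		case 64:  return 3;
		case 32:
		case 16:  return 2;
		case 8:
		case 4:   return 1;
		default:  return 0;	/* 0, 1 and 2 all encode as 0 */
		}
	}

	int main(void)
	{
		uint16_t samples[] = { 0, 1, 3, 20, 64, 100, 128 };

		for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
			printf("cm_ird=%3u -> enc=%u\n", (unsigned)samples[i],
			       (unsigned)derive_hw_ird_setting(samples[i]));
		return 0;
	}

which prints encodings 0, 0, 1, 2, 3, 4 and 4 for those inputs.
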
> > > +
> > > +/**
> > > + * irdma_record_ird_ord - Record IRD/ORD passed in
> > > + * @cm_node: connection's node
> > > + * @conn_ird: connection IRD
> > > + * @conn_ord: connection ORD
> > > + */
> > > +static void irdma_record_ird_ord(struct irdma_cm_node *cm_node, u32 conn_ird,
> > > +				 u32 conn_ord)
> > > +{
> > > +	if (conn_ird > cm_node->dev->hw_attrs.max_hw_ird)
> > > +		conn_ird = cm_node->dev->hw_attrs.max_hw_ird;
> > > +
> > > +	if (conn_ord > cm_node->dev->hw_attrs.max_hw_ord)
> > > +		conn_ord = cm_node->dev->hw_attrs.max_hw_ord;
> > > +	else if (!conn_ord && cm_node->send_rdma0_op == SEND_RDMA_READ_ZERO)
> > > +		conn_ord = 1;
> > > +	cm_node->ird_size = conn_ird;
> > > +	cm_node->ord_size = conn_ord;
> > > +}
> > > +
> > > +/**
> > > + * irdma_copy_ip_ntohl - copy IP address from network to host
> > > + * @dst: IP address in host order
> > > + * @src: IP address in network order (big endian)
> > > + */
> > > +void irdma_copy_ip_ntohl(u32 *dst, __be32 *src)
> > > +{
> > > +	*dst++ = ntohl(*src++);
> > > +	*dst++ = ntohl(*src++);
> > > +	*dst++ = ntohl(*src++);
> > > +	*dst = ntohl(*src);
> > > +}
> > > +
> > > +/**
> > > + * irdma_copy_ip_htonl - copy IP address from host to network order
> > > + * @dst: IP address in network order (big endian)
> > > + * @src: IP address in host order
> > > + */
> > > +void irdma_copy_ip_htonl(__be32 *dst, u32 *src)
> > > +{
> > > +	*dst++ = htonl(*src++);
> > > +	*dst++ = htonl(*src++);
> > > +	*dst++ = htonl(*src++);
> > > +	*dst = htonl(*src);
> > > +}
> >
> > It is strange that the kernel doesn't already have some generic function to do this.
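
For illustration, such a helper could be as small as the sketch below
(hypothetical; as far as I know nothing generic like this is exported
today):

	/* Hypothetical generic helpers, not an existing kernel API.
	 * They swap an IPv6 address stored as four 32-bit words between
	 * network and host byte order; u32/__be32 come from
	 * <linux/types.h> and ntohl()/htonl() from the usual byteorder
	 * headers.
	 */
	static inline void ip6_addr_ntohl(u32 dst[4], const __be32 src[4])
	{
		int i;

		for (i = 0; i < 4; i++)
			dst[i] = ntohl(src[i]);
	}

	static inline void ip6_addr_htonl(__be32 dst[4], const u32 src[4])
	{
		int i;

		for (i = 0; i < 4; i++)
			dst[i] = htonl(src[i]);
	}

Drivers such as irdma could then call these instead of open-coding the
unrolled loops.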
> >
> > > +
> > > +/**
> > > + * irdma_get_addr_info
> > > + * @cm_node: contains ip/tcp info
> > > + * @cm_info: to get a copy of the cm_node ip/tcp info
> > > + */
> > > +static void irdma_get_addr_info(struct irdma_cm_node *cm_node,
> > > +				struct irdma_cm_info *cm_info)
> > > +{
> > > +	memset(cm_info, 0, sizeof(*cm_info));
> > > +	cm_info->ipv4 = cm_node->ipv4;
> > > +	cm_info->vlan_id = cm_node->vlan_id;
> > > +	memcpy(cm_info->loc_addr, cm_node->loc_addr, sizeof(cm_info->loc_addr));
> > > +	memcpy(cm_info->rem_addr, cm_node->rem_addr, sizeof(cm_info->rem_addr));
> > > +	cm_info->loc_port = cm_node->loc_port;
> > > +	cm_info->rem_port = cm_node->rem_port;
> > > +}
> > > +
> > > +/**
> > > + * irdma_fill_sockaddr4 - fill in addr info for IPv4 connection
> > > + * @cm_node: connection's node
> > > + * @event: upper layer's cm event
> > > + */
> > > +static inline void irdma_fill_sockaddr4(struct irdma_cm_node *cm_node,
> > > +					struct iw_cm_event *event)
> > > +{
> > > +	struct sockaddr_in *laddr = (struct sockaddr_in *)&event->local_addr;
> > > +	struct sockaddr_in *raddr = (struct sockaddr_in *)&event->remote_addr;
> > > +
> > > +	laddr->sin_family = AF_INET;
> > > +	raddr->sin_family = AF_INET;
> > > +
> > > +	laddr->sin_port = htons(cm_node->loc_port);
> > > +	raddr->sin_port = htons(cm_node->rem_port);
> > > +
> > > +	laddr->sin_addr.s_addr = htonl(cm_node->loc_addr[0]);
> > > +	raddr->sin_addr.s_addr = htonl(cm_node->rem_addr[0]);
> > > +}
> > > +
> > > +/**
> > > + * irdma_fill_sockaddr6 - fill in addr info for IPv6 connection
> > > + * @cm_node: connection's node
> > > + * @event: upper layer's cm event
> > > + */
> > > +static inline void irdma_fill_sockaddr6(struct irdma_cm_node *cm_node,
> > > +					struct iw_cm_event *event)
> > > +{
> > > +	struct sockaddr_in6 *laddr6 = (struct sockaddr_in6 *)&event->local_addr;
> > > +	struct sockaddr_in6 *raddr6 = (struct sockaddr_in6 *)&event->remote_addr;
> > > +
> > > +	laddr6->sin6_family = AF_INET6;
> > > +	raddr6->sin6_family = AF_INET6;
> > > +
> > > +	laddr6->sin6_port = htons(cm_node->loc_port);
> > > +	raddr6->sin6_port = htons(cm_node->rem_port);
> > > +
> > > +	irdma_copy_ip_htonl(laddr6->sin6_addr.in6_u.u6_addr32,
> > > +			    cm_node->loc_addr);
> > > +	irdma_copy_ip_htonl(raddr6->sin6_addr.in6_u.u6_addr32,
> > > +			    cm_node->rem_addr);
> > > +}
> > > +
> > > +/**
> > > + * irdma_get_cmevent_info - for cm event upcall
> > > + * @cm_node: connection's node
> > > + * @cm_id: upper layers cm struct for the event
> > > + * @event: upper layer's cm event
> > > + */
> > > +static inline void irdma_get_cmevent_info(struct irdma_cm_node *cm_node,
> > > +					  struct iw_cm_id *cm_id,
> > > +					  struct iw_cm_event *event)
> > > +{
> > > +	memcpy(&event->local_addr, &cm_id->m_local_addr,
> > > +	       sizeof(event->local_addr));
> > > +	memcpy(&event->remote_addr, &cm_id->m_remote_addr,
> > > +	       sizeof(event->remote_addr));
> > > +	if (cm_node) {
> > > +		event->private_data = cm_node->pdata_buf;
> > > +		event->private_data_len = (u8)cm_node->pdata.size;
> > > +		event->ird = cm_node->ird_size;
> > > +		event->ord = cm_node->ord_size;
> > > +	}
> > > +}
> > > +
> > > +/**
> > > + * irdma_send_cm_event - upcall cm's event handler
> > > + * @cm_node: connection's node
> > > + * @cm_id: upper layer's cm info struct
> > > + * @type: Event type to indicate
> > > + * @status: status for the event type
> > > + */
> > > +static int irdma_send_cm_event(struct irdma_cm_node *cm_node,
> > > +			       struct iw_cm_id *cm_id,
> > > +			       enum iw_cm_event_type type, int status)
> > > +{
> > > +	struct iw_cm_event event = {};
> > > +
> > > +	event.event = type;
> > > +	event.status = status;
> > > +	switch (type) {
> > > +	case IW_CM_EVENT_CONNECT_REQUEST:
> > > +		trace_irdma_send_cm_event(cm_node, cm_id, type, status,
> > > +					  __builtin_return_address(0));
> > > +		if (cm_node->ipv4)
> > > +			irdma_fill_sockaddr4(cm_node, &event);
> > > +		else
> > > +			irdma_fill_sockaddr6(cm_node, &event);
> > > +		event.provider_data = cm_node;
> > > +		event.private_data = cm_node->pdata_buf;
> > > +		event.private_data_len = (u8)cm_node->pdata.size;
> > > +		event.ird = cm_node->ird_size;
> > > +		break;
> > > +	case IW_CM_EVENT_CONNECT_REPLY:
> > > +		trace_irdma_send_cm_event(cm_node, cm_id, type, status,
> > > +					  __builtin_return_address(0));
> > > +		irdma_get_cmevent_info(cm_node, cm_id, &event);
> > > +		break;
> > > +	case IW_CM_EVENT_ESTABLISHED:
> > > +		trace_irdma_send_cm_event(cm_node, cm_id, type, status,
> > > +					  __builtin_return_address(0));
> > > +		event.ird = cm_node->ird_size;
> > > +		event.ord = cm_node->ord_size;
> > > +		break;
> > > +	case IW_CM_EVENT_DISCONNECT:
> > > +		trace_irdma_send_cm_event_no_node(cm_id, type, status,
> > > +						  __builtin_return_address(0));
> > > +		break;
> > > +	case IW_CM_EVENT_CLOSE:
> > > +		trace_irdma_send_cm_event_no_node(cm_id, type, status,
> > > +						  __builtin_return_address(0));
> > > +		break;
> > > +	default:
> > > +		ibdev_dbg(to_ibdev(cm_node->iwdev),
> > > +			  "CM: Unsupported event type received type = %d\n",
> > > +			  type);
> > > +		return -1;
> >
> > How are these trace events different from the existing ones in IB/core,
> > and why should the driver implement a CM event handler? Is it iWARP
> > specific?
> >
> > I'm really worried about seeing CM code in the driver.
> >
>
> Yes, the CM code in the driver is iWARP specific:
> https://elixir.bootlin.com/linux/v5.7-rc2/source/include/rdma/iw_cm.h#L72
> https://elixir.bootlin.com/linux/v5.7-rc2/source/include/rdma/ib_verbs.h#L2566
>
> We have some CM tracing to record driver-specific data and paths in
> connection flows. For example, in this patch we record:
>
> +	    TP_printk("iwdev=%p  caller=%pS  cm_id=%p  node=%p  refcnt=%d  vlan_id=%d  accel=%d  state=%s  event_type=%s  status=%d  loc: %s  rem: %s",
> +		      __entry->iwdev,
> +		      __entry->caller,
> +		      __entry->cm_id,
> +		      __entry->cm_node,
> +		      __entry->refcount,
> +		      __entry->vlan_id,
> +		      __entry->accel,
> +		      parse_cm_state(__entry->state),
> +		      parse_iw_event_type(__entry->type),
> +		      __entry->status,
> +		      __print_ip_addr(__get_dynamic_array(laddr),
> +				      __entry->lport, __entry->ipv4),
> +		      __print_ip_addr(__get_dynamic_array(raddr),
> +				      __entry->rport, __entry->ipv4)
> +		    )
> +);


IMHO, everything here should be in a generic iWARP CM implementation.
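
To make that concrete: the sockaddr fill above is not driver specific at
all. Here is a rough sketch of the kind of helper that could live in the
shared iWARP CM core instead of each driver (hypothetical; no such API
exists in drivers/infiniband/core/iwcm.c today):

	/* Hypothetical shared helper; drivers would pass host-order
	 * port/address values instead of open-coding the sockaddr fill.
	 */
	static void iw_cm_fill_sockaddr4(struct iw_cm_event *event,
					 u32 loc_addr, u16 loc_port,
					 u32 rem_addr, u16 rem_port)
	{
		struct sockaddr_in *laddr = (struct sockaddr_in *)&event->local_addr;
		struct sockaddr_in *raddr = (struct sockaddr_in *)&event->remote_addr;

		laddr->sin_family = AF_INET;
		laddr->sin_port = htons(loc_port);
		laddr->sin_addr.s_addr = htonl(loc_addr);

		raddr->sin_family = AF_INET;
		raddr->sin_port = htons(rem_port);
		raddr->sin_addr.s_addr = htonl(rem_addr);
	}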

Thanks

