Linux-RDMA Archive on lore.kernel.org
From: Mark Zhang <markz@mellanox.com>
To: Alex Rosenbaum <rosenbaumalex@gmail.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
	Eran Ben Elisha <eranbe@mellanox.com>,
	Yishai Hadas <yishaih@mellanox.com>,
	Alex Rosenbaum <alexr@mellanox.com>,
	Maor Gottlieb <maorg@mellanox.com>,
	Leon Romanovsky <leonro@mellanox.com>
Subject: Re: [RFC v2] RoCE v2.0 Entropy - IPv6 Flow Label and UDP Source Port
Date: Wed, 15 Jan 2020 09:48:38 +0000
Message-ID: <ed4c97ae-5bf6-20a9-8292-ead9ee356585@mellanox.com> (raw)
In-Reply-To: <CAFgAxU8ArpoL9fMpJY5v-UZS5AMXom+TJ8HS57XeEOsCFFov8Q@mail.gmail.com>

On 1/8/2020 10:26 PM, Alex Rosenbaum wrote:
> The combination of the flow_label field in the IPv6 header and the UDP source
> port field in RoCE v2.0 is used to identify a group of packets that must be
> delivered in order by the network, end-to-end.
> These fields are used to create entropy for network routers (ECMP), load
> balancers and 802.3ad link aggregation switching that are not aware of RoCE IB
> headers.
> 
> The flow_label field is defined by a 20 bit hash value. CM based connections
> will use a hash function definition based on the service type (QP Type) and
> Service ID (SID). Where CM services are not used, the 20 bit hash will be
> according to the source and destination QPN values.
> Drivers will derive the RoCE v2.0 UDP src_port from the flow_label result.
> 
> UDP source port selection must adhere to IANA port allocation ranges. Thus we
> will use the IANA-recommended ephemeral port range of 49152-65535, or in
> hex: 0xC000-0xFFFF.
> 
> The below calculations take into account the importance of producing a symmetric
> hash result so we can support symmetric hash calculation of network elements.
> 
> Hash Calculation for RDMA IP CM Service
> =======================================
> For RDMA IP CM Services, based on QP1 iMAD usage and connected RC QPs using the
> RDMA IP CM Service ID, the flow label will be calculated according to IBTA CM
> REQ private data info and Service ID.
> 
> The flow label hash function is defined as:
> Extract the following fields from the CM IP REQ:
>    CM_REQ.ServiceID.DstPort [2 Bytes]
>    CM_REQ.PrivateData.SrcPort [2 Bytes]
>    u32 hash = DstPort * SrcPort;
>    hash ^= (hash >> 16);
>    hash ^= (hash >> 8);
>    AH_ATTR.GRH.flow_label = hash AND IB_GRH_FLOWLABEL_MASK;
> 
>    #define IB_GRH_FLOWLABEL_MASK  0x000FFFFF
> 
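
The CM-based flow label pseudocode quoted above can be written as a minimal C
sketch. The function name rdma_cm_flow_label is hypothetical and does not come
from the RFC or from any existing library. The casts to uint32_t are an
assumption added here so that multiplying two 16-bit ports cannot overflow a
signed int.

```c
#include <stdint.h>

#define IB_GRH_FLOWLABEL_MASK 0x000FFFFF

/*
 * Hypothetical helper: 20-bit flow label from the two 16-bit ports
 * carried in the CM REQ (ServiceID.DstPort, PrivateData.SrcPort).
 */
static uint32_t rdma_cm_flow_label(uint16_t dst_port, uint16_t src_port)
{
	/* Cast first: uint16_t operands would otherwise promote to int. */
	uint32_t hash = (uint32_t)dst_port * (uint32_t)src_port;

	/* Fold the high bits down into the low bits. */
	hash ^= hash >> 16;
	hash ^= hash >> 8;

	return hash & IB_GRH_FLOWLABEL_MASK;
}
```

Because multiplication is commutative, swapping the two ports gives the same
label, which is the symmetric property the RFC asks for.
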
> The result of the above hash will be kept in the CM's route path record
> connection context and will be used throughout the connection's lifetime for
> all subsequent CM messages on both ends of the connection (including REP, REJ,
> DREQ, DREP, ..).
> Once connection is established, the corresponding Connected RC QPs, on both
> ends of the connection, will update their context with the calculated RDMA IP
> CM Service based flow_label and UDP src_port values at the Connect phase of
> the active side and Accept phase of the passive side of the connection.
> 
> The CM will provide the calculated 20-bit flow_label hash result in the
> 'uint32_t flow_label' field of 'struct ibv_global_route' in 'struct
> ibv_ah_attr'.
> The 'struct ibv_ah_attr' is passed by the CM to the provider library when
> modifying a connected QP's (e.g.: RC) state by calling 'ibv_modify_qp(qp,
> ah_attr, attr_mask |= IBV_QP_AV)' or when creating an AH for working with
> datagram QPs (e.g.: UD) by calling ibv_create_ah(ah_attr).
> 
> Hash Calculation for non-RDMA CM Service ID
> ===========================================
> For non-CM QPs, the application can define the flow_label value in the
> 'struct ibv_ah_attr' when modifying the connected QPs (e.g.: RC) or creating
> an AH for the datagram QPs (e.g.: UD).
> 

Hi Alex, when creating an AH for the datagram QP, I think we don't have 
the src.QP and dst.QP, so we can't set the flow_label here?


> If the provided flow_label value is zero, i.e. not set by the application
> (e.g.: legacy cases), then verbs providers should use the src.QP[24bit] and
> dst.QP[24bit] as input arguments for the flow_label calculation.
> As QPNs are 3 bytes wide, the multiplication yields a value of up to 6 bytes.
> We'll define the flow_label value as:
>    DstQPn [3 Bytes]
>    SrcQPn [3 Bytes]
>    u64 hash = DstQPn * SrcQPn;
>    hash ^= (hash >> 20);
>    hash ^= (hash >> 40);
>    AH_ATTR.GRH.flow_label = hash AND IB_GRH_FLOWLABEL_MASK;
> 
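
The QPN-based fallback quoted above can be sketched in C in the same way. The
function name qpn_flow_label is hypothetical. Masking the inputs to 24 bits is
an assumption made here so that callers passing a 32-bit value still get the
3-byte QPN the RFC describes.

```c
#include <stdint.h>

#define IB_GRH_FLOWLABEL_MASK 0x000FFFFF
#define QPN_MASK              0x00FFFFFF /* QPNs are 24 bits wide */

/*
 * Hypothetical helper: 20-bit flow label from the source and
 * destination QP numbers, for QPs that do not go through the CM.
 */
static uint32_t qpn_flow_label(uint32_t dst_qpn, uint32_t src_qpn)
{
	/* The product of two 24-bit values needs up to 48 bits. */
	uint64_t hash = (uint64_t)(dst_qpn & QPN_MASK) *
			(uint64_t)(src_qpn & QPN_MASK);

	/* Fold the upper 20-bit groups into the low 20 bits. */
	hash ^= hash >> 20;
	hash ^= hash >> 40;

	return (uint32_t)(hash & IB_GRH_FLOWLABEL_MASK);
}
```

Both QPNs must be known to compute this value, so the sketch only applies at a
point where the source and destination QP numbers are both available.
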
> Hash Calculation for UDP src_port
> =================================
> Providers supporting RoCEv2 will use the 'flow_label' value as input to
> calculate the RoCEv2 UDP src_port, which will be used in the QP context or the
> AH context.
> 
> UDP src_port calculations from flow label:
> [while considering the 14 bits UDP port range according to IANA recommendation]
>    AH_ATTR.GRH.flow_label [20 bits]
>    u32 fl_low  = fl & 0x03FFF;
>    u32 fl_high = fl & 0xFC000;
>    u16 udp_sport = fl_low XOR (fl_high >> 14);
>    RoCE.UDP.src_port = udp_sport OR IB_ROCE_UDP_ENCAP_VALID_PORT
> 
>    #define IB_ROCE_UDP_ENCAP_VALID_PORT 0xC000
> 
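
The UDP src_port derivation quoted above, as a minimal C sketch. The function
name roce_udp_sport is hypothetical. Masking the input to 20 bits is an
assumption added here to guard against callers that pass a wider value.

```c
#include <stdint.h>

#define IB_GRH_FLOWLABEL_MASK        0x000FFFFF
#define IB_ROCE_UDP_ENCAP_VALID_PORT 0xC000

/*
 * Hypothetical helper: fold a 20-bit flow label into 14 bits and
 * place the result in the ephemeral port range 0xC000-0xFFFF.
 */
static uint16_t roce_udp_sport(uint32_t flow_label)
{
	uint32_t fl      = flow_label & IB_GRH_FLOWLABEL_MASK;
	uint32_t fl_low  = fl & 0x03FFF; /* low 14 bits */
	uint32_t fl_high = fl & 0xFC000; /* high 6 bits */
	uint16_t sport   = (uint16_t)(fl_low ^ (fl_high >> 14));

	/* Setting the top two bits forces the port into 49152-65535. */
	return (uint16_t)(sport | IB_ROCE_UDP_ENCAP_VALID_PORT);
}
```

For example, a flow label of 0x10101 has fl_low = 0x0101 and fl_high >> 14 = 4,
giving a source port of 0xC105.
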
> This is a v2 follow-up on "[RFC] RoCE v2.0 UDP Source Port Entropy" [1]
> 
> [1] https://www.spinics.net/lists/linux-rdma/msg73735.html
> 
> Signed-off-by: Alex Rosenbaum <alexr@mellanox.com>
> 


