From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4305C33CB1 for ; Wed, 15 Jan 2020 12:52:18 +0000 (UTC) Received: from dpdk.org (dpdk.org [92.243.14.124]) by mail.kernel.org (Postfix) with ESMTP id 4EB59206D7 for ; Wed, 15 Jan 2020 12:52:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=Mellanox.com header.i=@Mellanox.com header.b="V9TZXTMs" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4EB59206D7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=mellanox.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=dev-bounces@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 2DC4E1BF80; Wed, 15 Jan 2020 13:52:17 +0100 (CET) Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-eopbgr60079.outbound.protection.outlook.com [40.107.6.79]) by dpdk.org (Postfix) with ESMTP id 7FF101BF7E for ; Wed, 15 Jan 2020 13:52:15 +0100 (CET) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=GUAvqmKbgBiOqZ0BUj0k1tq6UymK8KlCU+N7bf2nuUQtPt4eDBU5wKok8a2HPd3a7wnhMJ6ruiaU2khrS3M/sIL50wAbjPdoVsxfcx5YtU6S/TO9/twRtytLVS5j8RtjFmC8F+7PxPShvk046q6WpbattQXqROEYoZLfdLoT462oRgSGSaSnX0v8Ek8X7jOjcIY7flFPyyF/XSBJZs3B9iUvAkfHacNa4q++QBfDAdp/HlDIVzrPCDXokz+agkcL3PcvMTKgWNxy2HpMregMH39oPGAAPDeB0Rlo0FmCyHglByfeulIANMplw9DWnvo5pwe33TOkp/V38imG33AGqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=BOBaiUzkYqdwLbdSuRquvP5xR7b9Von+j1hM09LuPII=; b=UdCwRzs2v6cxIBnDUCUQWurVGsGcK67R/bXyO2Lu3TNqPvKOMLJM77HHhSZRJn7rmdoOUMyvRu2fgzoIkE6ORBKXRSWZsQQ1nlp/jKzRXZMleYmBVQiO6ghqvfP58hbXJ1wKZJD8/ZYUjzFGOVVu1tNXOD/Ec2LpKITFo3hQV08dKy2nLc1ZB6oSO8njahX8PUAI9xEtbCuyIMZGvphNs2cnKHnJw5qbaB4FFs7rE7rvxaBDFDWQHmLZtk/KkoCZfLGIGfzeNLseLm/hECQCu8jWClnCVTCiwGZf9HbgJakaPaEXiMWtae+IyYSGKliphPSrTKDtD+thPQw4qUNNGQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=mellanox.com; dmarc=pass action=none header.from=mellanox.com; dkim=pass header.d=mellanox.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=BOBaiUzkYqdwLbdSuRquvP5xR7b9Von+j1hM09LuPII=; b=V9TZXTMsS5krCc0J4injUWhh7r8eQF4BpNUF6Kn4Et5TvaUHftUvdmKpehhCbQ3O46Vw6cnt5ZCowJ+JBiDs6l8+4UrQleCMpa9vlmbm+tTHiMmLvFtXphujQEIzOVkOaPGe/EH0FikZjH00MNNkOmXadT6yDkkRV5+ZYbzN6Nc= Received: from AM4PR05MB3265.eurprd05.prod.outlook.com (10.171.188.154) by AM4PR05MB3267.eurprd05.prod.outlook.com (10.171.188.28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2644.19; Wed, 15 Jan 2020 12:52:13 +0000 Received: from AM4PR05MB3265.eurprd05.prod.outlook.com ([fe80::68eb:ad79:71f4:110f]) by AM4PR05MB3265.eurprd05.prod.outlook.com ([fe80::68eb:ad79:71f4:110f%3]) with mapi id 15.20.2623.018; Wed, 15 Jan 2020 12:52:13 +0000 From: Slava Ovsiienko To: Olivier Matz CC: "dev@dpdk.org" , Matan Azrad , Raslan Darawsheh , Ori Kam , Shahaf Shuler , "stephen@networkplumber.org" Thread-Topic: [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer Thread-Index: AQHVyu8biJrXrE/1QU6oREG6CoHT6KfrqdGw Date: Wed, 15 Jan 2020 12:52:13 +0000 Message-ID: References: <20191118094938.192850-1-shahafs@mellanox.com> <1578993305-15165-1-git-send-email-viacheslavo@mellanox.com> <1578993305-15165-2-git-send-email-viacheslavo@mellanox.com> <20200114152713.GS22738@platinum> In-Reply-To: <20200114152713.GS22738@platinum> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=viacheslavo@mellanox.com; x-originating-ip: [95.164.10.10] x-ms-publictraffictype: Email x-ms-office365-filtering-ht: Tenant x-ms-office365-filtering-correlation-id: c3416a1c-c3d4-4505-115b-08d799b9be57 x-ms-traffictypediagnostic: AM4PR05MB3267:|AM4PR05MB3267: x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtFwd x-ms-exchange-transport-forked: True x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:10000; x-forefront-prvs: 02830F0362 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(4636009)(39860400002)(366004)(376002)(136003)(396003)(346002)(199004)(189003)(55016002)(6506007)(81156014)(9686003)(8936002)(53546011)(6916009)(66946007)(64756008)(66476007)(7696005)(26005)(66446008)(76116006)(5660300002)(81166006)(478600001)(71200400001)(316002)(2906002)(54906003)(52536014)(186003)(8676002)(4326008)(66556008)(33656002)(86362001); DIR:OUT; SFP:1101; SCL:1; SRVR:AM4PR05MB3267; H:AM4PR05MB3265.eurprd05.prod.outlook.com; FPR:; SPF:None; LANG:en; PTR:InfoNoRecords; A:1; MX:1; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: yz1fhD306do7h5u1uLgvMPn9dtsiQcUXtQ5JVAw5tvPq2/Q4OuN2CqGSpIYosr6s/7VasxXw/h2z8+OJxVroQAjUaRS9Y4OBDrLFdRwmI0uMAvTxRw23H8VKiXILczjW3wxeAWWat2pnyVuYtW/9Aca4Ji0WEx2Gect8vs/p4Ctpy9TiCcNp8kotNr5869/qDJI6HjypUXvhvUYFRiqEtMAqPwUFfrCtjaflFM4fwobSkBBQ8wyuFhyQFJZACeIvVpdWfDWO09UOsLqGeVmjjT4h5ePo9m72htL3Ft0zpPNCkEWDcmbmVmHACBadoInP/YYTOeoDTOgmjX3IEouEa6bUyTuzmNc1jqjlHpI1g71a++qmxfpczsJswuy3AiZHmTOlZPu4Di7l6pcbZKqxakYuEYH2k3i2hPwL3BQhrPgfxupBMzZOeRdgcoUaWY51 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: c3416a1c-c3d4-4505-115b-08d799b9be57 X-MS-Exchange-CrossTenant-originalarrivaltime: 15 Jan 2020 12:52:13.6231 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: vOokD94NSZkhhJ819zM3Sl5Fi7Nm1q6CQsE8P3sSnjBVUiphpbCnrPI9dmjzcR1N42jCR9VEkN+5q49QcBOiXtWrbNL6nDoZSY10zxPjA04= X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM4PR05MB3267 Subject: Re: [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Olivier Matz > Sent: Tuesday, January 14, 2020 17:27 > To: Slava Ovsiienko > Cc: dev@dpdk.org; Matan Azrad ; Raslan Darawsheh > ; Ori Kam ; Shahaf Shuler > ; stephen@networkplumber.org > Subject: Re: [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer >=20 > Hi Viacheslav, >=20 > Please see some comments below. >=20 > On Tue, Jan 14, 2020 at 09:15:02AM +0000, Viacheslav Ovsiienko wrote: > > Update detach routine to check the mbuf pool type. > > > > Signed-off-by: Shahaf Shuler > > Signed-off-by: Viacheslav Ovsiienko > > --- > > lib/librte_mbuf/rte_mbuf.h | 64 > > +++++++++++++++++++++++++++++++++++++++++++--- > > 1 file changed, 60 insertions(+), 4 deletions(-) > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > > index 219b110..8f486af 100644 > > --- a/lib/librte_mbuf/rte_mbuf.h > > +++ b/lib/librte_mbuf/rte_mbuf.h > > @@ -306,6 +306,46 @@ struct rte_pktmbuf_pool_private { > > uint32_t flags; /**< reserved for future use. */ }; > > > > +/** > > + * When set pktmbuf mempool will hold only mbufs with pinned external > > + * buffer. The external buffer will be attached on the mbuf at the > > + * memory pool creation and will never be detached by the mbuf free ca= lls. > > + * mbuf should not contain any room for data after the mbuf structure. > > + */ > > +#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0) >=20 > Out of curiosity, why using a pool flag instead of flagging the mbufs? > The reason I see is that adding a new m->flag would impact places where w= e > copy or reset the flags (it should be excluded). Is there another reason? >=20 Can we introduce the new static flag for mbuf? Yes, there is some problem - there are many places in DPDK where the flags in new allocated mbufs are disregarded (ol_flags field is just set to zero)= . So, any flag set on allocation (even static, dynamic one is not possible to= handle at all) would get lost. We could fix it in new application (this feature is address= ed to the new ones) and PMDs supporting this, the question is whether we are allowed = to define the new mbuf static (in meaning not dynamic) flag. > > +/** > > + * Returns TRUE if given mbuf has an pinned external buffer, or FALSE > > + * otherwise. The pinned external buffer is allocated at pool > > +creation > > + * time and should not be freed. > > + * > > + * External buffer is a user-provided anonymous buffer. > > + */ > > +#ifdef ALLOW_EXPERIMENTAL_API > > +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb) > rte_mbuf_has_pinned_extbuf(mb) > > +#else #define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false #endif >=20 > I suppose you added these lines because the compilation was broken after > introducing the new __rte_experimental API, which is called from detach()= . >=20 > I find a bit strange that we require to do this. I don't see what would b= e > broken without the ifdef: an application compiled for 19.11 cannot use th= e > pinned-ext-buf feature (because it did not exist), so the modification lo= oks > safe to me. Without ifdef compilation fails if there is no experimental API usage confi= gured. >=20 > > + > > +__rte_experimental > > +static inline uint32_t >=20 > I don't think uint32_t is really better than uint64_t. I agree with Steph= en that > bool is the preferred choice, however if it breaks compilation in some ca= ses, I > think int is better. Yes, bool causes compilation issues (just including stdbool.h causes build = failure), and, for example bool is reserved gcc keyword if AltiVec support is enable= d. uint32_t is the type of priv->flags.=20 >=20 > > +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m) { > > + if (RTE_MBUF_HAS_EXTBUF(m)) { > > + /* > > + * The mbuf has the external attached buffer, > > + * we should check the type of the memory pool where > > + * the mbuf was allocated from. > > + */ > > + struct rte_pktmbuf_pool_private *priv =3D > > + (struct rte_pktmbuf_pool_private *) > > + rte_mempool_get_priv(m->pool); > > + > > + return priv->flags & > RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF; >=20 > What about introducing a rte_pktmbuf_priv_flags() function, on the same > model than rte_pktmbuf_priv_size() or rte_pktmbuf_data_room_size()? Nice idea, thanks. I think this routine can be not experimental and we woul= d get rid of ifdef and other stuff. >=20 >=20 > > + } > > + return 0; > > +} > > + > > #ifdef RTE_LIBRTE_MBUF_DEBUG > > > > /** check mbuf type in debug mode */ @@ -571,7 +611,8 @@ static > > inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp) > > static __rte_always_inline void rte_mbuf_raw_free(struct rte_mbuf *m) > > { > > - RTE_ASSERT(RTE_MBUF_DIRECT(m)); > > + RTE_ASSERT(!RTE_MBUF_CLONED(m) && > > + (!RTE_MBUF_HAS_EXTBUF(m) || > RTE_MBUF_HAS_PINNED_EXTBUF(m))); > > RTE_ASSERT(rte_mbuf_refcnt_read(m) =3D=3D 1); > > RTE_ASSERT(m->next =3D=3D NULL); > > RTE_ASSERT(m->nb_segs =3D=3D 1); > > @@ -1141,11 +1182,26 @@ static inline void rte_pktmbuf_detach(struct > rte_mbuf *m) > > uint32_t mbuf_size, buf_len; > > uint16_t priv_size; > > > > - if (RTE_MBUF_HAS_EXTBUF(m)) > > + if (RTE_MBUF_HAS_EXTBUF(m)) { > > + /* > > + * The mbuf has the external attached buffed, > > + * we should check the type of the memory pool where > > + * the mbuf was allocated from to detect the pinned > > + * external buffer. > > + */ > > + struct rte_pktmbuf_pool_private *priv =3D > > + (struct rte_pktmbuf_pool_private *) > > + rte_mempool_get_priv(mp); > > + >=20 > It could be rte_pktmbuf_priv_flags() as said above. >=20 > > + if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) { > > + RTE_ASSERT(m->shinfo =3D=3D NULL); > > + m->ol_flags =3D EXT_ATTACHED_MBUF; > > + return; > > + } >=20 > I think it is not possible to have m->shinfo =3D=3D NULL (this comment is= also > related to next patch, because shinfo init is done there). If you try to = clone a > mbuf that comes from an ext-pinned pool, it will crash. Here is the code = from > attach(): >=20 > static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct > rte_mbuf *m) > { > RTE_ASSERT(RTE_MBUF_DIRECT(mi) && > rte_mbuf_refcnt_read(mi) =3D=3D 1); >=20 > if (RTE_MBUF_HAS_EXTBUF(m)) { > rte_mbuf_ext_refcnt_update(m->shinfo, 1); << HERE > mi->ol_flags =3D m->ol_flags; > mi->shinfo =3D m->shinfo; > ... >=20 > The 2 alternatives I see are: >=20 > - do not allow to clone these mbufs, but today there is no return value > to attach() functions, so there is no way to know if it failed or not >=20 > - manage shinfo to support clones >=20 > I think just ignoring the rte_mbuf_ext_refcnt_update() if shinfo =3D=3D N= ULL is not > an option, because if could result in recycling the extbuf while a clone = still > references it. >=20 >=20 The clone for the mbufs with pinned buffers is not supposed at all, this wa= s chosen as development precondition. We can't touch the buf_adr/iova_addr fi= elds in mbuf, because there is no way to restore these pointers (nomore fixed offset betw= een mbuf and data buffer), so rte_mbuf_detach() would be not operational at all. Also, we can't deduce the mbuf base address from data buffer address.=20 We could add RTE_ASSERT to prevent attaching (to) mbufs with pinned data, w= hat do you think? > > __rte_pktmbuf_free_extbuf(m); > > - else > > + } else { > > __rte_pktmbuf_free_direct(m); > > - > > + } > > priv_size =3D rte_pktmbuf_priv_size(mp); > > mbuf_size =3D (uint32_t)(sizeof(struct rte_mbuf) + priv_size); > > buf_len =3D rte_pktmbuf_data_room_size(mp); > > -- > > 1.8.3.1 > >