From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ej1-f52.google.com (mail-ej1-f52.google.com [209.85.218.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 52FB912E6A for ; Wed, 6 Mar 2024 02:54:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.218.52 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709693685; cv=none; b=O/5u605t46S2vbCTNfhZrFPX5cPhMCN+qKJHqOWaIdLd/p9dMU6CFWVz+p/B9x8uTy40gUZdhjiQ9wg9vh/U6P8082pxYkXWuhbVXyqzdTtCK5aIw1iTIMMy+XNtR7dPq8s7zbhaZhEv0sA6SGzd9wgf5siTJKWwmJ+u4EPkrz0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709693685; c=relaxed/simple; bh=rCsQQP7xgQocqgJZyTCewEFa4pqWP3bm2r+qfnB/Kdw=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=kxE22pNkBvmAazxPhgFZ9GPMuDJFxsGDYjpuFeF91v4z93EkGUxYwC7ZVRMNPUoxvB8QxfwuiIoLpSGs/f2RgJDjXG+p15Mr08X6Cz7TLLAVcPD7+wFa3SfZvQ+FoPNf2asLKLxOQ1NIO8yj2rLNxA4iv1G/19In90q7qIljn4U= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=FfspCDPZ; arc=none smtp.client-ip=209.85.218.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="FfspCDPZ" Received: by mail-ej1-f52.google.com with SMTP id a640c23a62f3a-a3122b70439so180185066b.3 for ; Tue, 05 Mar 2024 18:54:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1709693681; x=1710298481; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=eS/C6JRe/LZxR5BB0/jvRViDTj3F8vBSe/WWRgNeFIA=; b=FfspCDPZNW8S7Qf7wpydUp7nMkxbIvbEb4ID33aWUPwHOYvxyiL5+xjC/RGuYky5nJ UXbu0gXQCFwnw50zF+4eh5urGT9USQAfcXkEasQoB328Tb8CE77MTWKT72cElYJh0wiU E8M6JwS0IqClv/VPW882Nqjm33OxkixBNmv2B1WPXiL5arG8e00lOAa7AfTo3NB4z5xf jcpIGJ9Llm7DNOCjp3Jw3H51L90Ew0/AcaBWuTDJDOQG4bdnCW1kNOzZmO/JeKseNhq8 VZCcpvUcpy7bLKHJluNjLXFu8vbMbxapQ9qoBZgzezbUe/OupdB7FZDjPG1+nCvQ0lcD tgYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709693681; x=1710298481; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eS/C6JRe/LZxR5BB0/jvRViDTj3F8vBSe/WWRgNeFIA=; b=xTtt3jbKpTtAySVWGvQHp39bfDRzREgmW87rEZQH7CEH1IRq24FvevvkgwA3mwmMsg v4KLWFfSXRPwEOH3VcJ+xvjEaq4CldGEd5Lo6S7+LooZnRXiDIK3SbW/k51Q4ixK5ZMJ cJp1UQZ11jxeEXhtCVWryfF4er42ANIqSiv+/AukOiFeVnY8NrS5alWpcakWaQogUBmw iCYTqrcMEsUP5VhjFiGdu7Jy4+jA/xHArJGg7v0qs+aWgJmqBZV7hFf1pUChiKpc9NTV f9XRA18QYbN9d0Wdg/iqkOt/qai71lvUps5Q2bAlGXiPYBb20vwGMzdyLSornQm3KltQ nvcw== X-Gm-Message-State: AOJu0YxbesEQkcM3R1kYvBBC90NwFxBbTfyCuBq5Jo5rZJ73wsRV96se RdR5G9uF/jQ7leMcvS+giZzyieZtnW9KR0p0INFIiG1fZKZmxBH0YQFlYEQ9MQozRTlc72qFJCZ ubkmdEFfDjIUsXZZ5vFF6uhd6Q/n8QciiBM/o X-Google-Smtp-Source: AGHT+IEBCHh5at8F+COR4cGDllSpD2J7JpR+ODcNFG7XI5qKLEW8CmG66IZiYQuqBlnDkixdz9i/+yCx3p+Y4GqdUM0= X-Received: by 2002:a17:906:4551:b0:a45:270e:3617 with SMTP id s17-20020a170906455100b00a45270e3617mr5771839ejq.27.1709693680497; Tue, 05 Mar 2024 18:54:40 -0800 (PST) Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20240305020153.2787423-1-almasrymina@google.com> <20240305020153.2787423-10-almasrymina@google.com> <383c4870-167f-4123-bbf3-928db1463e01@davidwei.uk> <6562b8b0-6cc0-4652-b746-75549801c002@davidwei.uk> In-Reply-To: <6562b8b0-6cc0-4652-b746-75549801c002@davidwei.uk> From: Mina Almasry Date: Tue, 5 Mar 2024 18:54:28 -0800 Message-ID: Subject: Re: [RFC PATCH net-next v6 09/15] memory-provider: dmabuf devmem memory provider To: David Wei Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-arch@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jonathan Corbet , Richard Henderson , Ivan Kokshaysky , Matt Turner , Thomas Bogendoerfer , "James E.J. Bottomley" , Helge Deller , Andreas Larsson , Jesper Dangaard Brouer , Ilias Apalodimas , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Arnd Bergmann , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , David Ahern , Willem de Bruijn , Shuah Khan , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Pavel Begunkov , Jason Gunthorpe , Yunsheng Lin , Shailend Chand , Harshitha Ramamurthy , Jeroen de Borst , Praveen Kaligineedi , Willem de Bruijn , Kaiyuan Zhang Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Mar 5, 2024 at 6:47=E2=80=AFPM David Wei wrote: > > On 2024-03-05 18:42, Mina Almasry wrote: > > On Tue, Mar 5, 2024 at 6:28=E2=80=AFPM David Wei wrote= : > >> > >> On 2024-03-04 18:01, Mina Almasry wrote: > >>> + if (pool->p.queue) > >>> + binding =3D READ_ONCE(pool->p.queue->binding); > >>> + > >>> + if (binding) { > >>> + pool->mp_ops =3D &dmabuf_devmem_ops; > >>> + pool->mp_priv =3D binding; > >>> + } > >> > >> This is specific to TCP devmem. For ZC Rx we will need something more > >> generic to let us pass our own memory provider backend down to the pag= e > >> pool. > >> > >> What about storing ops and priv void ptr in struct netdev_rx_queue > >> instead? Then we can both use it. > > > > Yes, this is dmabuf specific, I was thinking you'd define your own > > member of netdev_rx_queue, and then add something like this to > > page_pool_init: > > > > + if (pool->p.queue) > > + io_uring_metadata =3D READ_ONCE(pool->p.queue->io_uring= _metadata); > > + > > + /* We don't support rx-queues that are configured for both > > io_uring & dmabuf binding */ > > + BUG_ON(io_uring_metadata && binding); > > + > > + if (io_uring_metadata) { > > + pool->mp_ops =3D &io_uring_ops; > > + pool->mp_priv =3D io_uring_metadata; > > + } > > > > I.e., we share the pool->mp_ops and the pool->mp_priv but we don't > > really need to share the same netdev_rx_queue member. For me it's a > > dma-buf specific data structure (netdev_dmabuf_binding) and for you > > it's something else. > > This adds size to struct netdev_rx_queue and requires checks on whether > both are set. There can be thousands of these structs at any one time so > if we don't need to add size unnecessarily then that would be best. > > We can disambiguate by comparing &mp_ops and then cast the void ptr to > our impl specific objects. > > What do you not like about this approach? > I was thinking it leaks page_pool specifics into a generic struct unrelated to the page pool like netdev_rx_queue. My mental model is that the rx-queue just says that it's bound to a dma-buf/io_uring unaware of page_pool internals, and the page pool internals figure out what to do from there. Currently netdev_rx_queue.h doesn't include net/page_pool/types.h for example because there is no dependency between netdev_rx_queue & page_pool, I think this change would add a dependency. But I concede it does not matter much AFAICT, I can certainly change the netdev_rx_queue to hold the mp_priv & mp_ops directly and include net/page_pool/types.h if you prefer that. I'll look into applying this change in the next iteration if there are no objections. > > > > page_pool_init() probably needs to validate that the queue is > > configured for dma-buf or io_uring but not both. If it's configured > > for both then the user is doing something funky we shouldn't support. > > > > Perhaps I can make the intention clearer by renaming 'binding' to > > something more specific to dma-buf like queue->dmabuf_binding, to make > > it clear that this is the dma-buf binding and not some other binding > > like io_uring? > > --=20 Thanks, Mina