ntb.lists.linux.dev archive mirror
* [RFC] PCI EP/RC network transfer by using eDMA
@ 2022-09-28 21:38 Frank Li
  2022-10-11 11:37 ` Kishon Vijay Abraham I
  0 siblings, 1 reply; 3+ messages in thread
From: Frank Li @ 2022-09-28 21:38 UTC (permalink / raw)
  To: fancer.lancer, helgaas, sergey.semin, kw, linux-pci,
	manivannan.sadhasivam, ntb, jdmason, kishon, haotian.wang,
	lznuaa, imx


ALL:

       Recently some important PCI EP function patches were merged,
especially DWC eDMA support.
       The PCIe eDMA has a nice feature: it can read/write all PCI host
memory regardless of the size of the EP side's PCI memory map windows.
       pci-epf-vntb.c was also merged into mainline.
       And part of the vntb MSI patch set has already been merged:
		https://lore.kernel.org/imx/86mtaj7hdw.wl-maz@kernel.org/T/#m35546867af07735c1070f596d653a2666f453c52

       Although MSI can improve transfer latency, the transfer speed is
still quite slow because DMA is not supported yet.

       I plan to continue improving the transfer speed, but I found some
fundamental limitations in the original framework that prevent using eDMA to its full benefit.
       After researching some old threads:
		https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@ti.com/
		https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@ti.com/T/
		some RDMA documentation, and https://github.com/ntrdma/ntrdma-ext

       I think the solution based on Haotian Wang's work will be the best one.

  ┌─────────────────────────────────┐   ┌──────────────┐
  │                                 │   │              │
  │                                 │   │              │
  │   VirtQueue             RX      │   │  VirtQueue   │
  │     TX                 ┌──┐     │   │    TX        │
  │  ┌─────────┐           │  │     │   │ ┌─────────┐  │
  │  │ SRC LEN ├─────┐  ┌──┤  │◄────┼───┼─┤ SRC LEN │  │
  │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
  │  │         │     │  │  │  │     │   │ │         │  │
  │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
  │  │         │     │  │  │  │     │   │ │         │  │
  │  └─────────┘     │  │  └──┘     │   │ └─────────┘  │
  │                  │  │           │   │              │
  │     RX       ┌───┼──┘   TX      │   │    RX        │
  │  ┌─────────┐ │   │     ┌──┐     │   │ ┌─────────┐  │
  │  │         │◄┘   └────►│  ├─────┼───┼─┤         │  │
  │  ├─────────┤           │  │     │   │ ├─────────┤  │
  │  │         │           │  │     │   │ │         │  │
  │  ├─────────┤           │  │     │   │ ├─────────┤  │
  │  │         │           │  │     │   │ │         │  │
  │  └─────────┘           │  │     │   │ └─────────┘  │
  │   virtio_net           └──┘     │   │ virtio_net   │
  │  Virtual PCI BUS   EDMA Queue   │   │              │
  ├─────────────────────────────────┤   │              │
  │  PCI EP Controller with eDMA    │   │  PCI Host    │
  └─────────────────────────────────┘   └──────────────┘


       The basic idea is:
	1.	Both the EP and the host probe the virtio_net driver.
	2.	There are two queues: one on the EP side (EQ), the other on the host side.
	3.	The EP side EPF driver maps the host side's queue into the EP's address space; call it HQ.
	4.	One working thread:
	a.	picks one TX from EQ and one RX from HQ, combines them into an eDMA request, and puts it into the DMA TX queue;
	b.	picks one RX from EQ and one TX from HQ, combines them into an eDMA request, and puts it into the DMA RX queue.
	5.	The eDMA done IRQ marks the related items in EQ and HQ as finished.

The whole transfer is zero-copy and uses the DMA queue.
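
To make step 4a a bit more concrete, here is a rough, untested sketch of how one
EQ TX / HQ RX pair could be turned into a single eDMA transfer. Only the dmaengine
calls are the real kernel API; struct edma_xfer, its fields and the done() hook are
placeholders for whatever the EPF driver would actually keep per descriptor pair.

#include <linux/dmaengine.h>
#include <linux/errno.h>

/* Placeholder descriptor pair: src comes from the EQ TX entry (EP local
 * memory), dst comes from the HQ RX entry (host memory, which the DWC eDMA
 * can reach without an outbound window).
 */
struct edma_xfer {
	dma_addr_t src;
	dma_addr_t dst;
	size_t len;
	void (*done)(struct edma_xfer *x);	/* step 5: mark EQ/HQ items finished */
};

static void edma_xfer_done(void *param)
{
	struct edma_xfer *x = param;

	x->done(x);
}

static int queue_one_tx(struct dma_chan *chan, struct edma_xfer *x)
{
	struct dma_async_tx_descriptor *txd;
	dma_cookie_t cookie;

	/* Build one memcpy-style eDMA request for this TX/RX pair. */
	txd = dmaengine_prep_dma_memcpy(chan, x->dst, x->src, x->len,
					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
	if (!txd)
		return -ENOMEM;

	txd->callback = edma_xfer_done;
	txd->callback_param = x;

	cookie = dmaengine_submit(txd);
	if (dma_submit_error(cookie))
		return -EIO;

	/* Completion is reported asynchronously by the eDMA done IRQ. */
	dma_async_issue_pending(chan);
	return 0;
}

The RX direction (step 4b) would look the same with src and dst swapped, using the
eDMA read channel; the channel request and the address handling for HQ entries are
left out here.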

      RDMA has a similar idea but requires more coding effort.
      I think Kishon Vijay Abraham I prefers using vhost, but I don't know how to build a queue on the host side.
      NTB transfers can only use eDMA in one direction (DMA write), because a read is actually
local memory to local memory.

      Any comments on the overall solution?

best regards
Frank Li


* Re: [RFC] PCI EP/RC network transfer by using eDMA
  2022-09-28 21:38 [RFC] PCI EP/RC network transfer by using eDMA Frank Li
@ 2022-10-11 11:37 ` Kishon Vijay Abraham I
  2022-10-11 15:09   ` [EXT] " Frank Li
  0 siblings, 1 reply; 3+ messages in thread
From: Kishon Vijay Abraham I @ 2022-10-11 11:37 UTC (permalink / raw)
  To: Frank Li, fancer.lancer, helgaas, sergey.semin, kw, linux-pci,
	manivannan.sadhasivam, ntb, jdmason, haotian.wang, lznuaa, imx

Hi Frank,

On 29/09/22 3:08 am, Frank Li wrote:
> 
> ALL:
> 
>         Recently some important PCI EP function patches were merged,
> especially DWC eDMA support.
>         The PCIe eDMA has a nice feature: it can read/write all PCI host
> memory regardless of the size of the EP side's PCI memory map windows.
>         pci-epf-vntb.c was also merged into mainline.
>         And part of the vntb MSI patch set has already been merged:
> 		https://lore.kernel.org/imx/86mtaj7hdw.wl-maz@kernel.org/T/#m35546867af07735c1070f596d653a2666f453c52
> 
>         Although MSI can improve transfer latency, the transfer speed is
> still quite slow because DMA is not supported yet.
> 
>         I plan to continue improving the transfer speed, but I found some
> fundamental limitations in the original framework that prevent using eDMA to its full benefit.

By framework, you mean limitations with pci-epf-vntb right?
>         After researching some old threads:
> 		https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@ti.com/
> 		https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@ti.com/T/
> 		Some RDMA document and https://github.com/ntrdma/ntrdma-ext
> 
>         I think the solution based on Haotian Wang's work will be the best one.

why?
> 
>    ┌─────────────────────────────────┐   ┌──────────────┐
>    │                                 │   │              │
>    │                                 │   │              │
>    │   VirtQueue             RX      │   │  VirtQueue   │
>    │     TX                 ┌──┐     │   │    TX        │
>    │  ┌─────────┐           │  │     │   │ ┌─────────┐  │
>    │  │ SRC LEN ├─────┐  ┌──┤  │◄────┼───┼─┤ SRC LEN │  │
>    │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
>    │  │         │     │  │  │  │     │   │ │         │  │
>    │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
>    │  │         │     │  │  │  │     │   │ │         │  │
>    │  └─────────┘     │  │  └──┘     │   │ └─────────┘  │
>    │                  │  │           │   │              │
>    │     RX       ┌───┼──┘   TX      │   │    RX        │
>    │  ┌─────────┐ │   │     ┌──┐     │   │ ┌─────────┐  │
>    │  │         │◄┘   └────►│  ├─────┼───┼─┤         │  │
>    │  ├─────────┤           │  │     │   │ ├─────────┤  │
>    │  │         │           │  │     │   │ │         │  │
>    │  ├─────────┤           │  │     │   │ ├─────────┤  │
>    │  │         │           │  │     │   │ │         │  │
>    │  └─────────┘           │  │     │   │ └─────────┘  │
>    │   virtio_net           └──┘     │   │ virtio_net   │
>    │  Virtual PCI BUS   EDMA Queue   │   │              │
>    ├─────────────────────────────────┤   │              │
>    │  PCI EP Controller with eDMA    │   │  PCI Host    │
>    └─────────────────────────────────┘   └──────────────┘
> 
> 
>         The basic idea is:
> 	1.	Both the EP and the host probe the virtio_net driver.
> 	2.	There are two queues: one on the EP side (EQ), the other on the host side.
> 	3.	The EP side EPF driver maps the host side's queue into the EP's address space; call it HQ.
> 	4.	One working thread:
> 	a.	picks one TX from EQ and one RX from HQ, combines them into an eDMA request, and puts it into the DMA TX queue;
> 	b.	picks one RX from EQ and one TX from HQ, combines them into an eDMA request, and puts it into the DMA RX queue.
> 	5.	The eDMA done IRQ marks the related items in EQ and HQ as finished.
> 
> The whole transfer is zero-copy and uses the DMA queue.
> 
>        RDMA has a similar idea but requires more coding effort.

My suggestion would be to pick a cleaner solution with the right 
abstractions and not based on coding efforts.

>        I think Kishon Vijay Abraham I prefers using vhost, but I don't know how to build a queue on the host side.

Not sure what you mean by host side here. But the queue would be only on 
virtio frontend (virtio-net running on PCIe RC) and PCIe EP would access 
the front-end's queue.
>        NTB transfers can only use eDMA in one direction (DMA write), because a read is actually
> local memory to local memory.
> 
>        Any comments on the overall solution?

I would suggest you go through the comments received on the Haotian Wang 
patch and describe what changes you are proposing.

Thanks,
Kishon


* RE: [EXT] Re: [RFC] PCI EP/RC network transfer by using eDMA
  2022-10-11 11:37 ` Kishon Vijay Abraham I
@ 2022-10-11 15:09   ` Frank Li
  0 siblings, 0 replies; 3+ messages in thread
From: Frank Li @ 2022-10-11 15:09 UTC (permalink / raw)
  To: Kishon Vijay Abraham I, fancer.lancer, helgaas, sergey.semin, kw,
	linux-pci, manivannan.sadhasivam, ntb, jdmason, haotian.wang,
	lznuaa, imx



> -----Original Message-----
> From: Kishon Vijay Abraham I <kishon@ti.com>
> Sent: Tuesday, October 11, 2022 6:38 AM
> To: Frank Li <frank.li@nxp.com>; fancer.lancer@gmail.com;
> helgaas@kernel.org; sergey.semin@baikalelectronics.ru; kw@linux.com;
> linux-pci@vger.kernel.org; manivannan.sadhasivam@linaro.org;
> ntb@lists.linux.dev; jdmason@kudzu.us; haotian.wang@sifive.com;
> lznuaa@gmail.com; imx@lists.linux.dev
> Subject: [EXT] Re: [RFC] PCI EP/RC network transfer by using eDMA
> 
> Caution: EXT Email
> 
> Hi Frank,
> 
> On 29/09/22 3:08 am, Frank Li wrote:
> >
> > ALL:
> >
> >         Recently some important PCI EP function patches were merged,
> > especially DWC eDMA support.
> >         The PCIe eDMA has a nice feature: it can read/write all PCI host
> > memory regardless of the size of the EP side's PCI memory map windows.
> >         pci-epf-vntb.c was also merged into mainline.
> >         And part of the vntb MSI patch set has already been merged:
> > 		https://lore.kernel.org/imx/86mtaj7hdw.wl-maz@kernel.org/T/#m35546867af07735c1070f596d653a2666f453c52
> >
> >         Although MSI can improve transfer latency, the transfer speed is
> > still quite slow because DMA is not supported yet.
> >
> >         I plan to continue improving the transfer speed, but I found some
> > fundamental limitations in the original framework that prevent using eDMA
> > to its full benefit.
> 
> By framework, you mean limitations with pci-epf-vntb right?

[Frank Li] Not pci-epf-vntb; it is the NTB definition itself.
NTB is defined such that one CPU only maps part of the other CPU's memory,
so at least one memory copy happens:
1. CPU1: copy the user-space buffer into the mapped memory.
2. CPU2: copy from the mapped memory into the user-space buffer.

NTB supports memory-to-memory DMA to do 1 and 2, but an additional
memory copy is still needed, whether it is done by DMA or by the CPU.

We could change ntb_transport.c to use the PCIe EP's eDMA for the write direction.
All reads in NTB are local memory to local memory.


> >         After researching some old threads:
> > 		https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@ti.com/
> > 		https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@ti.com/T/
> > 		some RDMA documentation, and https://github.com/ntrdma/ntrdma-ext
> >
> >         I think the solution based on Haotian Wang's work will be the best one.
> 
> why?
[Frank Li] See below.

> >
> >    ┌─────────────────────────────────┐   ┌──────────────┐
> >    │                                 │   │              │
> >    │                                 │   │              │
> >    │   VirtQueue             RX      │   │  VirtQueue   │
> >    │     TX                 ┌──┐     │   │    TX        │
> >    │  ┌─────────┐           │  │     │   │ ┌─────────┐  │
> >    │  │ SRC LEN ├─────┐  ┌──┤  │◄────┼───┼─┤ SRC LEN │  │
> >    │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
> >    │  │         │     │  │  │  │     │   │ │         │  │
> >    │  ├─────────┤     │  │  │  │     │   │ ├─────────┤  │
> >    │  │         │     │  │  │  │     │   │ │         │  │
> >    │  └─────────┘     │  │  └──┘     │   │ └─────────┘  │
> >    │                  │  │           │   │              │
> >    │     RX       ┌───┼──┘   TX      │   │    RX        │
> >    │  ┌─────────┐ │   │     ┌──┐     │   │ ┌─────────┐  │
> >    │  │         │◄┘   └────►│  ├─────┼───┼─┤         │  │
> >    │  ├─────────┤           │  │     │   │ ├─────────┤  │
> >    │  │         │           │  │     │   │ │         │  │
> >    │  ├─────────┤           │  │     │   │ ├─────────┤  │
> >    │  │         │           │  │     │   │ │         │  │
> >    │  └─────────┘           │  │     │   │ └─────────┘  │
> >    │   virtio_net           └──┘     │   │ virtio_net   │
> >    │  Virtual PCI BUS   EDMA Queue   │   │              │
> >    ├─────────────────────────────────┤   │              │
> >    │  PCI EP Controller with eDMA    │   │  PCI Host    │
> >    └─────────────────────────────────┘   └──────────────┘
> >
> >
> >         The basic idea is:
> >       1.      Both the EP and the host probe the virtio_net driver.
> >       2.      There are two queues: one on the EP side (EQ), the other on the host side.
> >       3.      The EP side EPF driver maps the host side's queue into the EP's
> > address space; call it HQ.
> >       4.      One working thread:
> >       a.      picks one TX from EQ and one RX from HQ, combines them into an
> > eDMA request, and puts it into the DMA TX queue;
> >       b.      picks one RX from EQ and one TX from HQ, combines them into an
> > eDMA request, and puts it into the DMA RX queue.
> >       5.      The eDMA done IRQ marks the related items in EQ and HQ as finished.
> >
> > The whole transfer is zero-copy and uses the DMA queue.
> >
> >        RDMA has a similar idea but requires more coding effort.
> 
> My suggestion would be to pick a cleaner solution with the right
> abstractions and not based on coding efforts.

[Frank Li] My idea is quite similar to RDMA. I am not sure how many people
are using RDMA. I may need to do more research on InfiniBand RDMA.

> 
> >        I think Kishon Vijay Abraham I prefers using vhost, but I don't know
> > how to build a queue on the host side.
> 
> Not sure what you mean by host side here. But the queue would be only on
> virtio frontend (virtio-net running on PCIe RC) and PCIe EP would access
> the front-end's queue.

[Frank Li] We have to use two queues to maximize transfer speed: an EP queue and an RC queue.

If there is just one queue, on the PCIe RC side, then for every buffer the EP has to:
	1. DMA map
	2. Submit one transfer to the DMA queue
	3. Wait for the DMA transfer to finish
	4. DMA unmap

The latency from 1 to 4 is quite high.

If there are two queues, an EP queue and an RC queue:

Kernel thread (EP -> RC example):
	1. Dequeue a TX from the EP queue and an RX from the RC queue.
	2. Put the pair into the eDMA hardware transfer queue.
	3. Go to 1 until either the EP or the RC queue is empty.

IRQ:
	Walk the eDMA hardware transfer queue, mark the TX EP queue item done and the RX RC queue item done.
	Notify both the EP and RC sides.

The whole flow needs no additional memory copy and no synchronous wait; everything is async. A rough sketch of such a kernel thread is below.
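
A very rough, untested sketch of that EP -> RC pump thread could look like the
following. Only kthread_should_stop() and wait_event_interruptible() are the real
kernel API; every other type and helper name here is a hypothetical placeholder
for whatever the EPF driver would actually keep.

#include <linux/kthread.h>
#include <linux/types.h>
#include <linux/wait.h>

/* pump_ctx, tx_entry, rx_entry, eq_pop_tx(), rcq_pop_rx(), edma_submit_copy()
 * and work_available() are placeholders; edma_submit_copy() is assumed to
 * queue one local->host copy on the eDMA (with a completion callback) and
 * return without waiting.
 */
struct tx_entry;
struct rx_entry;

struct pump_ctx {
	wait_queue_head_t wq;	/* woken by virtqueue kick or host doorbell/MSI */
	/* ... EP queue, mapped RC queue, eDMA channel ... */
};

struct tx_entry *eq_pop_tx(struct pump_ctx *ctx);
struct rx_entry *rcq_pop_rx(struct pump_ctx *ctx);
int edma_submit_copy(struct pump_ctx *ctx, struct tx_entry *tx,
		     struct rx_entry *rx);
bool work_available(struct pump_ctx *ctx);

static int ep_to_rc_pump(void *data)
{
	struct pump_ctx *ctx = data;

	while (!kthread_should_stop()) {
		struct tx_entry *tx;
		struct rx_entry *rx;

		/* Steps 1 and 2: pair one EP TX with one RC RX and hand the
		 * copy to the eDMA hardware queue; submission never waits
		 * for completion.  (A real driver would peek before popping
		 * so an unpaired entry is not dropped.)
		 */
		while ((tx = eq_pop_tx(ctx)) && (rx = rcq_pop_rx(ctx))) {
			if (edma_submit_copy(ctx, tx, rx))
				break;
		}

		/* Step 3: one of the queues ran empty; sleep until either
		 * side posts new descriptors.
		 */
		wait_event_interruptible(ctx->wq,
					 work_available(ctx) ||
					 kthread_should_stop());
	}
	return 0;
}

The completion side is the eDMA done interrupt: its callback marks the paired EP
and RC queue items done and notifies both sides, so nothing in the data path blocks.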


> >        NTB transfers can only use eDMA in one direction (DMA write), because
> > a read is actually local memory to local memory.
> >
> >        Any comments on the overall solution?
> 
> I would suggest you go through the comments received on the Haotian Wang
> patch and describe what changes you are proposing.

[Frank Li] Your major concern was about the EP side using vhost.

eDMA changes the situation; see the explanation above.


> 
> Thanks,
> Kishon

