* The method to alloc DMA ring buffer
@ 2022-12-07  4:17 Tony He
  2022-12-07 19:55 ` Tom Mitchell
  0 siblings, 1 reply; 3+ messages in thread
From: Tony He @ 2022-12-07  4:17 UTC
  To: kernelnewbies

Hi all,

This is my first question on the kernelnewbies mailing list. Please
forgive me if I break any list etiquette.

I'm studying NIC drivers with the book <<Understanding Linux Network
Internals>>. The book uses the 3Com 3c59x NIC (3c59x.c) as an
example. The driver is very old and few people discuss it, but it's a
good place to start because it's simpler than many other drivers with
advanced offload features.

In the interrupt handler boomerang_rx(), I see that it pre-allocates one
new skb and DMA-maps it before calling netif_rx().
boomerang_rx()
{
    ......
    /* Pre-allocate the replacement skb.  If it or its
     * mapping fails then recycle the buffer thats already
     * in place
     */
    newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
    if (!newskb) {
        dev->stats.rx_dropped++;
        goto clear_complete;
    }
    newdma = dma_map_single(vp->gendev, newskb->data,
                            PKT_BUF_SZ, DMA_FROM_DEVICE);
    if (dma_mapping_error(vp->gendev, newdma)) {
        dev->stats.rx_dropped++;
        consume_skb(newskb);
        goto clear_complete;
    }

    /* Pass up the skbuff already on the Rx ring. */
    skb = vp->rx_skbuff[entry];
    vp->rx_skbuff[entry] = newskb;
    vp->rx_ring[entry].addr = cpu_to_le32(newdma);
    skb_put(skb, pkt_len);
    dma_unmap_single(vp->gendev, dma, PKT_BUF_SZ, DMA_FROM_DEVICE);
    vp->rx_nocopy++;
    ......
    netif_rx(skb);
    ......
}

However, I see that the Intel e1000 driver optimizes this. It does not
replace buffers one at a time.
e1000_clean_rx_irq()
{
    ......
    /* return some buffers to hardware, one at a time is too slow */
    if (unlikely(cleaned_count >= E1000_RX_BUFFER_WRITE)) {
        adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
        cleaned_count = 0;
    }
    ......
}

I just want to know why this is faster. Is it faster in all scenarios or
only in some? Can someone analyse it rigorously?
My guess is that this method reduces the delay when many packets arrive.
Is that right? Thanks in advance!

Tony

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


* Re: The method to alloc DMA ring buffer
  2022-12-07  4:17 The method to alloc DMA ring buffer Tony He
@ 2022-12-07 19:55 ` Tom Mitchell
  2022-12-08  2:17   ` Tony He
  0 siblings, 1 reply; 3+ messages in thread
From: Tom Mitchell @ 2022-12-07 19:55 UTC
  To: Tony He; +Cc: kernelnewbies

On Tue, Dec 6, 2022 at 8:17 PM Tony He <huangya90@gmail.com> wrote:
...
>
> I'm studying the NIC driver with the book <<Understanding the linux
> network internals>>. This book uses 3com 3c59x NIC(3c59x.c) as an
> example. The driver is very old and few people discuss it, but it's a
> good place to start because it's simpler than many other drivers with
> advanced offload features.
>
> In the interrupt handler boomerang_rx(), I see it pre-allocate one new
...

>     goto clear_complete;
Oh my, a goto.
...
> However, I see intel e1000 driver optimizes this. Intel doesn't
> pre-allocate one at a time.
....
> I just want to know why this is faster. For all scenarios or some
> scenarios? Can someone analyse it rigorously?

The first step is to examine the hardware data sheets.
Most optimizations like this are shaped by the hardware.

That said, buffer allocation in the kernel is slow(ish).
Whenever there is enough RAM to preallocate, do so, in moderation.
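
As a rough illustration of preallocating in moderation, here is a minimal
user-space sketch of a small fixed-depth buffer pool. It is not kernel code:
POOL_SLOTS, BUF_SIZE, and the pool_* helpers are hypothetical names invented
for this example, and malloc() merely stands in for a slow allocator. The
pool is filled outside the hot path, so the hot path only pops a pointer.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_SLOTS 8      /* hypothetical pool depth ("in moderation") */
#define BUF_SIZE   1536   /* hypothetical buffer size in bytes */

struct buf_pool {
    void  *slot[POOL_SLOTS];
    size_t count;          /* buffers currently available */
};

/* Fill the pool up front, outside the hot path.
 * Returns the number of buffers the pool holds afterwards. */
size_t pool_fill(struct buf_pool *p)
{
    while (p->count < POOL_SLOTS) {
        void *b = malloc(BUF_SIZE);
        if (!b)
            break;                  /* keep whatever we managed to get */
        p->slot[p->count++] = b;
    }
    return p->count;
}

/* Hot path: O(1) pop with no allocator call. NULL if the pool is empty. */
void *pool_get(struct buf_pool *p)
{
    return p->count ? p->slot[--p->count] : NULL;
}

/* Give a buffer back; free it if the pool is already full. */
void pool_put(struct buf_pool *p, void *b)
{
    assert(b != NULL);
    if (p->count < POOL_SLOTS)
        p->slot[p->count++] = b;
    else
        free(b);
}

/* Release everything still held by the pool. */
void pool_drain(struct buf_pool *p)
{
    while (p->count)
        free(p->slot[--p->count]);
}
```

The fixed depth is the "moderation" part: memory held in reserve is bounded,
and anything beyond the limit goes back to the allocator.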
...
Interrupt latency:
It helps when the hardware can work through a list of commands in a queue
without software intervention.
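
To make the queue idea concrete for the Rx refill case in the original
question, here is a small user-space model. It is a sketch under stated
assumptions, not driver code: it assumes a NIC with a tail-pointer
(doorbell) register where every refill call ends in one MMIO write, and it
only counts those writes. The names refill_stats and simulate_rx are
hypothetical, and it does not model how 3c59x itself chains descriptors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters for the model. */
struct refill_stats {
    size_t buffers_returned;  /* Rx buffers handed back to hardware */
    size_t doorbell_writes;   /* simulated MMIO tail-pointer writes */
};

/* Process `packets` received frames and hand buffers back to the ring
 * once `batch` of them have been cleaned. batch == 1 models
 * one-at-a-time refill. Any remainder is flushed at the end. */
struct refill_stats simulate_rx(size_t packets, size_t batch)
{
    struct refill_stats st = {0, 0};
    size_t cleaned = 0;

    assert(batch > 0);
    for (size_t i = 0; i < packets; i++) {
        cleaned++;                      /* one descriptor consumed */
        if (cleaned >= batch) {
            st.buffers_returned += cleaned;
            st.doorbell_writes++;       /* one write per batch */
            cleaned = 0;
        }
    }
    if (cleaned) {                      /* flush the remainder */
        st.buffers_returned += cleaned;
        st.doorbell_writes++;
    }
    return st;
}
```

With 64 packets, simulate_rx(64, 1) counts 64 doorbell writes while
simulate_rx(64, 16) counts 4 (the batch of 16 is an assumed value for
illustration). Whether that saving matters in practice depends on the cost
of an MMIO write on the hardware in question, and on the traffic: with
sparse traffic a batch fills slowly, so the gain shows up mainly under load.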
Networking:
A buffer window advertised to the client allows that client to send data
without waiting for a response, and it is unlikely to need to resend unless
the network is lossy.
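
A back-of-the-envelope version of the window argument, as a sketch: assume
a lossless link where the sender transmits up to `window` packets and then
waits one round trip for the acknowledgements. The function name
round_trips is hypothetical, and real protocols slide the window rather
than stopping after each burst, so this is only an upper-level intuition.

```c
#include <assert.h>
#include <stddef.h>

/* Round trips needed to deliver `packets` when at most `window`
 * packets may be unacknowledged at once (idealized, lossless link). */
size_t round_trips(size_t packets, size_t window)
{
    assert(window > 0);
    return (packets + window - 1) / window;   /* ceil(packets / window) */
}
```

For 100 packets, a window of 1 (stop and wait) needs 100 round trips in this
model, while a window of 10 needs 10.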

The networking buffer window strategy is a giant topic.

Interrupt latency: that is another book, and it depends on the hardware.
Again, the more useful work the hardware can do without interrupt
servicing, the better.

Simple serial I/O hardware drivers can be easier to research and analyze.
Again, start with the hardware data sheets for UARTs, and look at hardware
and software flow control over serial data links.

Define: rigorously?  Is this a homework assignment?


* Re: The method to alloc DMA ring buffer
  2022-12-07 19:55 ` Tom Mitchell
@ 2022-12-08  2:17   ` Tony He
  0 siblings, 0 replies; 3+ messages in thread
From: Tony He @ 2022-12-08  2:17 UTC
  To: Tom Mitchell; +Cc: kernelnewbies

Thanks! Gained a lot.

On Thu, Dec 8, 2022 at 03:55, Tom Mitchell <niftylinkern@niftyegg.com> wrote:
> Define: rigorously?  Is this a homework assignment?
No, this is NOT a homework assignment. I have worked for a few years, but
mostly on applications rather than on kernel and driver code.
