All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tudor Cornea <tudor.cornea@gmail.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: linville@tuxdriver.com, Thomas Monjalon <thomas@monjalon.net>,
	 Mihai Pogonaru <pogonarumihai@gmail.com>,
	dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] net/af_packet: fix ignoring full ring on tx
Date: Tue, 5 Oct 2021 18:11:08 +0300	[thread overview]
Message-ID: <CAOuQ8vWHLgt=8aNxoSd4XvWeYQqMuJ1Wt4iKpeVmY8tqwVp-_w@mail.gmail.com> (raw)
In-Reply-To: <CAOuQ8vW2t_HJPc-M89+FwP0Ed9kpUR82X7FPm1KjdVuTO-kpSQ@mail.gmail.com>

Hi Ferruh,

I have attempted to narrow down the issue.
I have the following bash script, which computes packet rates on an
interface.

[root@localhost ~]# cat compute-rates.sh
#!/usr/bin/env bash

if [[ ${#} -ne 2 ]]; then
    echo "Usage: ${0} <iface-name> <sleep-interval-seconds>"
    exit 1
fi

IFACE_NAME="${1}"
SLEEP_INTERVAL_SECONDS="${2}"
TMP_STATS_FILE="/tmp/netstat"

# Clear Previous stats file
echo "0 0 0 0" > "${TMP_STATS_FILE}"

echo "Press CTRL+C to exit..."

while true; do
    export "RxB=0" "RxP=0" "TxB=0" "TxP=0"

    # Extract Rx{Bytes,Packets} and Tx{Bytes,Packets} and
    # format the output. Individual fields will be exported
    export $(\
        ifconfig "${IFACE_NAME}" \
            | grep 'packets' \
            | awk '{print $5, $3}' \
            | xargs echo \
            | sed -E -e \
                "s/([0-9]+) ([0-9]+) ([0-9]+) ([0-9]+)/RxB=\1 RxP=\2
TxB=\3 TxP=\4/")

    # Print Packet and Byte Rates
    # Format: | Rx Bytes | Rx Packets | Tx Bytes | Tx Packets |

    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" $(cat "${TMP_STATS_FILE}") \
        | awk '{print "RxB="$1-$5, "RxP="$2-$6, "TxB="$3-$7, "TxP="$4-$8}'

    # Save the new values
    echo "${RxB}" "${RxP}" "${TxB}" "${TxP}" > "${TMP_STATS_FILE}"

    sleep "${SLEEP_INTERVAL_SECONDS}"

done

On the transmit side, I'm using the engine behind [1] with the af_packet
PMD.

The configuration for the af_packet PMD is the following:
--vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0

I'm configuring a Tx rate of 335 packets / second and a packet size of 300
Bytes.
These seem to be the values using which we seem to have better chances of
seeing the problem. I suspect it might also be linked with the af_packet
configuration.

I'm starting traffic using the specified configuration, and in parallel,
running the script that computes the rates as follows:
./compute-rates.sh eth1 0.1

Initially, the packet rates seem steady

RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10656 TxP=36
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10952 TxP=37
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10952 TxP=37

[...]

After a while, we toggle the interface up / down with a sleep between the
steps. I suspect the length of the sleep might be a variable in the
equation.

ifconfig eth1 down; sleep 7; ifconfig eth1 up


What we see, is that even after the interface is toggled back up, the rates
never seem to recover.

RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=2072 TxP=7
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=10360 TxP=35
RxB=0 RxP=0 TxB=521256 TxP=1761
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0
RxB=0 RxP=0 TxB=0 TxP=0

[...]


I've attempted to mirror the same behavior using dpdk-pktgen [2] on a
different machine (Ubuntu 20.04). This time, af_packet runs on top of
a Linux virtio_net interface.

I seem to be getting a  similar behavior. I have used the following
dpdk-pktgen configuration and run-time settings


pktgen \
    -l 1-4 \
    -n 4 \
    --proc-type=primary \
    --no-pci \
    --no-telemetry \
    --no-huge \
    -m 512 \
    --vdev=net_af_packet0,iface=eth1,blocksz=16384,framesz=8192,framecnt=2048,qpairs=1,qdisc_bypass=0
\
    -- \
    -P \
    -T \
    -m "3.0" \
    -f themes/black-yellow.theme

set 0 size 300
set 0 rate 0.008
set 0 burst 1
start 0


[1] https://github.com/open-traffic-generator/ixia-c
[2] http://code.dpdk.org/pktgen-dpdk/pktgen-20.11.2/source/INSTALL.md

On Wed, 29 Sept 2021 at 13:03, Tudor Cornea <tudor.cornea@gmail.com> wrote:

> Hi Ferruh,
>
> What you described above looks like a ring buffer with single producer and
>> single consumer, and producer overwrites the not consumed items.
>
>
> Indeed. This is also my understanding of the bug.
> I am going to try to isolate the issue, and should probably be able to
> come up with a script in a few days.
>
> Our of curiosity, are you using an modified af_packet implementation in
>> kernel
>> for above described usage?
>
>
> We are currently using an Ubuntu-based distro with a 4.15 Linux kernel.
> We don't have any kernel patches for the af_packet implementation to my
> knowledge (probably excepting patches that are back-ported by Ubuntu
> maintainers from newer releases).
>
>
> On Mon, 20 Sept 2021 at 20:44, Ferruh Yigit <ferruh.yigit@intel.com>
> wrote:
>
>> On 9/13/2021 2:45 PM, Tudor Cornea wrote:
>> > The poll call can return POLLERR which is ignored, or it can return
>> > POLLOUT, even if there are no free frames in the mmap-ed area.
>> >
>> > We can account for both of these cases by re-checking if the next
>> > frame is empty before writing into it.
>> >
>> > Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
>> > Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
>> > ---
>> >  drivers/net/af_packet/rte_eth_af_packet.c | 19 +++++++++++++++++++
>> >  1 file changed, 19 insertions(+)
>> >
>> > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
>> b/drivers/net/af_packet/rte_eth_af_packet.c
>> > index b73b211..087c196 100644
>> > --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> > @@ -216,6 +216,25 @@ eth_af_packet_tx(void *queue, struct rte_mbuf
>> **bufs, uint16_t nb_pkts)
>> >                   (poll(&pfd, 1, -1) < 0))
>> >                       break;
>> >
>> > +             /*
>> > +              * Poll can return POLLERR if the interface is down
>> > +              *
>> > +              * It will almost always return POLLOUT, even if there
>> > +              * are no extra buffers available
>> > +              *
>> > +              * This happens, because packet_poll() calls
>> datagram_poll()
>> > +              * which checks the space left in the socket buffer and,
>> > +              * in the case of packet_mmap, the default socket buffer
>> length
>> > +              * doesn't match the requested size for the tx_ring.
>> > +              * As such, there is almost always space left in socket
>> buffer,
>> > +              * which doesn't seem to be correlated to the requested
>> size
>> > +              * for the tx_ring in packet_mmap.
>> > +              *
>> > +              * This results in poll() returning POLLOUT.
>> > +              */
>> > +             if (ppd->tp_status != TP_STATUS_AVAILABLE)
>> > +                     break;
>> > +
>>
>> If 'POLLOUT' doesn't indicate that there is space in the buffer, what is
>> the
>> point of the 'poll()' at all?
>>
>> What can we test/reproduce the mentioned behavior? Or is there a way to
>> fix the
>> behavior of poll() or use an alternative of it?
>>
>>
>> OK to break on the 'POLLERR', I guess it can be detected in the
>> 'pfd.revent'.
>>
>>
>> >               /* copy the tx frame data */
>> >               pbuf = (uint8_t *) ppd + TPACKET2_HDRLEN -
>> >                       sizeof(struct sockaddr_ll);
>> >
>>
>>

  reply	other threads:[~2021-10-05 15:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-20 13:39 [dpdk-dev] [PATCH] net/af_packet: fix ignoring full ring on tx Tudor Cornea
2021-09-01 16:34 ` Ferruh Yigit
2021-09-06 10:23   ` Tudor Cornea
2021-09-20 17:11     ` Ferruh Yigit
2021-09-13 13:45 ` [dpdk-dev] [PATCH v2] " Tudor Cornea
2021-09-20 17:44   ` Ferruh Yigit
2021-09-29 10:03     ` Tudor Cornea
2021-10-05 15:11       ` Tudor Cornea [this message]
2021-10-26 14:30         ` Ferruh Yigit
2021-11-02 15:24           ` Tudor Cornea
2021-11-02 15:47   ` [dpdk-dev] [PATCH v3] " Tudor Cornea
2021-11-02 16:47     ` Ferruh Yigit
2021-11-03  9:31     ` [dpdk-dev] [PATCH v4] " Tudor Cornea
2021-11-04 12:07       ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOuQ8vWHLgt=8aNxoSd4XvWeYQqMuJ1Wt4iKpeVmY8tqwVp-_w@mail.gmail.com' \
    --to=tudor.cornea@gmail.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=linville@tuxdriver.com \
    --cc=pogonarumihai@gmail.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.