netdev.vger.kernel.org archive mirror
From: Ben Greear <greearb@candelatech.com>
To: Eric Dumazet <eric.dumazet@gmail.com>, netdev <netdev@vger.kernel.org>
Subject: Re: 5.15-rc3+ crash in fq-codel?
Date: Thu, 30 Sep 2021 09:44:34 -0700
Message-ID: <1d5fc498-c783-4857-b8e5-851e00561898@candelatech.com>
In-Reply-To: <7a896ce5-ff52-0c44-752c-f6d238d6d8d9@candelatech.com>

On 9/29/21 6:36 PM, Ben Greear wrote:
> On 9/29/21 5:40 PM, Eric Dumazet wrote:
>>
>>
>> On 9/29/21 5:29 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 9/29/21 5:04 PM, Ben Greear wrote:
>>>> On 9/29/21 4:48 PM, Ben Greear wrote:
>>>>> On 9/29/21 4:42 PM, Eric Dumazet wrote:
>>>>>>
>>>>>>
>>>>>> On 9/29/21 4:28 PM, Eric Dumazet wrote:
>>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Actually the bug seems to be in pktgen, vs NET_XMIT_CN
>>>>>>>
>>>>>>> You probably would hit the same issues with other qdisc also using NET_XMIT_CN
>>>>>>>
>>>>>>
>>>>>> I would try the following patch :
>>>>>>
>>>>>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>>>>>> index a3d74e2704c42e3bec1aa502b911c1b952a56cf1..0a2d9534f8d08d1da5dfc68c631f3a07f95c6f77 100644
>>>>>> --- a/net/core/pktgen.c
>>>>>> +++ b/net/core/pktgen.c
>>>>>> @@ -3567,6 +3567,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>>>>>           case NET_XMIT_DROP:
>>>>>>           case NET_XMIT_CN:
>>>>>>                   /* skb has been consumed */
>>>>>> +               pkt_dev->last_ok = 1;
>>>>>>                   pkt_dev->errors++;
>>>>>>                   break;
>>>>>>           default: /* Drivers are not supposed to return other values! */
>>>>
>>>> While patching my variant of pktgen, I took a look at the 'default' case.  I think
>>>> it should probably go above NET_XMIT_DROP and fallthrough into the consumed pkt path?
>>>>
>>>> Although, probably not a big deal since only bugs elsewhere would hit that path, and
>>>> we don't really know if skb would be consumed in that case or not.
>>>>
>>>
>>> This is probably dead code after commit
>>>
>>> commit f466dba1832f05006cf6caa9be41fb98d11cb848    pktgen: ndo_start_xmit can return NET_XMIT_xxx values
>>>
>>> So this does not really matter anymore.
>>>
>>>
>>
>> Alternative would be the following patch.
>> NET_XMIT_CN means the packet has been queued for transmit,
>> but that we might have dropped prior packets.
>>
>> Probably not a big deal to make the difference in pktgen.
>>
>> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
>> index a3d74e2704c42e3bec1aa502b911c1b952a56cf1..5c612cbc74c790f64aff5ce602843378284c7119 100644
>> --- a/net/core/pktgen.c
>> +++ b/net/core/pktgen.c
>> @@ -3557,6 +3557,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>          switch (ret) {
>>          case NETDEV_TX_OK:
>> +       case NET_XMIT_CN:
>>                  pkt_dev->last_ok = 1;
>>                  pkt_dev->sofar++;
>>                  pkt_dev->seq_num++;
>> @@ -3565,8 +3566,8 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>>                          goto xmit_more;
>>                  break;
>>          case NET_XMIT_DROP:
>> -       case NET_XMIT_CN:
>>                  /* skb has been consumed */
>> +               pkt_dev->last_ok = 1;
>>                  pkt_dev->errors++;
>>                  break;
>>          default: /* Drivers are not supposed to return other values! */
>>
> 
> Yes, I like that XMIT_CN then increments seq_num, though for my own purposes
> I wouldn't want to increment sofar in that case (and maybe not do the other logic either),
> since we know at least something was dropped.
> 
> For fq-codel, seems that XMIT_CN could mean that the attempted packet actually was queued
> for xmit, but at least some other packets were purged.
> 
> Thanks,
> Ben
> 

This does fix the crash for me (my patch in my tree is slightly different, but same idea).

Thanks,
Ben
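
For readers following the two diffs, here is a minimal userspace sketch of the accounting that the second (alternative) patch describes. It is not the kernel code: struct xmit_stats and account_xmit() are hypothetical stand-ins for the few pktgen_dev fields the switch touches, the burst and size accounting in pktgen_xmit() is left out, and the default branch is an assumption that simply models "skb not consumed, try again". The numeric values are meant to mirror the kernel's definitions in include/linux/netdevice.h.

```c
/* Return codes, intended to mirror include/linux/netdevice.h. */
enum {
	NETDEV_TX_OK   = 0x00,
	NET_XMIT_DROP  = 0x01,
	NET_XMIT_CN    = 0x02,
	NETDEV_TX_BUSY = 0x10,
};

/* Hypothetical stand-in for the pktgen_dev counters used in the diff. */
struct xmit_stats {
	int last_ok;            /* 1: skb was consumed, do not resend it */
	unsigned long sofar;    /* packets counted as sent */
	unsigned long seq_num;  /* next sequence number */
	unsigned long errors;   /* packets counted as dropped */
};

/* Model of the switch (ret) after the alternative patch is applied. */
void account_xmit(struct xmit_stats *s, int ret)
{
	switch (ret) {
	case NETDEV_TX_OK:
	case NET_XMIT_CN:
		/* Packet was queued; with CN, earlier packets may have been dropped. */
		s->last_ok = 1;
		s->sofar++;
		s->seq_num++;
		break;
	case NET_XMIT_DROP:
		/* skb has been consumed, but this packet did not go out. */
		s->last_ok = 1;
		s->errors++;
		break;
	default:
		/* Assumption for this sketch: skb not consumed, retry later. */
		s->last_ok = 0;
		break;
	}
}
```

In this model both consumed outcomes set last_ok, which is the change the two patches share. The only difference the alternative makes is that NET_XMIT_CN is counted as a sent packet rather than as an error.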
