* [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() @ 2020-09-21 2:04 Yunsheng Lin 2020-09-21 7:19 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Yunsheng Lin @ 2020-09-21 2:04 UTC (permalink / raw) To: davem, kuba, linmiaohe, martin.varghese, fw, edumazet, dcaratti, steffen.klassert, pabeni, kyk.segfault, saeed Cc: netdev, linux-kernel, linuxarm When napi_consume_skb() is called in the tx desc cleaning process, it is usually in the softirq context(BH disabled, or are processing softirqs), but it may also be in the task context, such as in the netpoll or loopback selftest process. Currently napi_consume_skb() uses non-zero budget to indicate the NAPI context, the driver writer may provide the wrong budget when tx desc cleaning function is reused for both NAPI and non-NAPI context, see [1]. So this patch uses in_softirq() to indicate the NAPI context, which doesn't necessarily mean in NAPI context, but it shouldn't care if NAPI context or not as long as it runs in softirq context or with BH disabled, then _kfree_skb_defer() will push the skb to the particular cpu' napi_alloc_cache atomically. [1] https://lkml.org/lkml/2020/9/15/38 Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> --- note that budget parameter is not removed in this patch because it involves many driver changes, we can remove it in separate patch if this patch is accepted. --- net/core/skbuff.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index e077447..03d0d28 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) void napi_consume_skb(struct sk_buff *skb, int budget) { - /* Zero budget indicate non-NAPI context called us, like netpoll */ - if (unlikely(!budget)) { + /* called by non-softirq context, which usually means non-NAPI + * context, like netpoll. + */ + if (unlikely(!in_softirq())) { dev_consume_skb_any(skb); return; } -- 2.8.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() 2020-09-21 2:04 [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() Yunsheng Lin @ 2020-09-21 7:19 ` Eric Dumazet 2020-09-21 8:09 ` Yunsheng Lin 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2020-09-21 7:19 UTC (permalink / raw) To: Yunsheng Lin Cc: David Miller, Jakub Kicinski, linmiaohe, martin.varghese, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, kyk.segfault, Saeed Mahameed, netdev, LKML, linuxarm On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > When napi_consume_skb() is called in the tx desc cleaning process, > it is usually in the softirq context(BH disabled, or are processing > softirqs), but it may also be in the task context, such as in the > netpoll or loopback selftest process. > > Currently napi_consume_skb() uses non-zero budget to indicate the > NAPI context, the driver writer may provide the wrong budget when > tx desc cleaning function is reused for both NAPI and non-NAPI > context, see [1]. > > So this patch uses in_softirq() to indicate the NAPI context, which > doesn't necessarily mean in NAPI context, but it shouldn't care if > NAPI context or not as long as it runs in softirq context or with BH > disabled, then _kfree_skb_defer() will push the skb to the particular > cpu' napi_alloc_cache atomically. > > [1] https://lkml.org/lkml/2020/9/15/38 > > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > --- > note that budget parameter is not removed in this patch because it > involves many driver changes, we can remove it in separate patch if > this patch is accepted. > --- > net/core/skbuff.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index e077447..03d0d28 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) > > void napi_consume_skb(struct sk_buff *skb, int budget) > { > - /* Zero budget indicate non-NAPI context called us, like netpoll */ > - if (unlikely(!budget)) { > + /* called by non-softirq context, which usually means non-NAPI > + * context, like netpoll. > + */ > + if (unlikely(!in_softirq())) { > dev_consume_skb_any(skb); > return; > } > -- I do not think we should add this kind of fuzzy logic, just because _one_ driver author made a mistake. Add a disable_bh() in the driver slow path, and accept the _existing_ semantic, the one that was understood by dozens. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() 2020-09-21 7:19 ` Eric Dumazet @ 2020-09-21 8:09 ` Yunsheng Lin 2020-09-21 8:17 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Yunsheng Lin @ 2020-09-21 8:09 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, Jakub Kicinski, linmiaohe, martin.varghese, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, kyk.segfault, Saeed Mahameed, netdev, LKML, linuxarm On 2020/9/21 15:19, Eric Dumazet wrote: > On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >> >> When napi_consume_skb() is called in the tx desc cleaning process, >> it is usually in the softirq context(BH disabled, or are processing >> softirqs), but it may also be in the task context, such as in the >> netpoll or loopback selftest process. >> >> Currently napi_consume_skb() uses non-zero budget to indicate the >> NAPI context, the driver writer may provide the wrong budget when >> tx desc cleaning function is reused for both NAPI and non-NAPI >> context, see [1]. >> >> So this patch uses in_softirq() to indicate the NAPI context, which >> doesn't necessarily mean in NAPI context, but it shouldn't care if >> NAPI context or not as long as it runs in softirq context or with BH >> disabled, then _kfree_skb_defer() will push the skb to the particular >> cpu' napi_alloc_cache atomically. >> >> [1] https://lkml.org/lkml/2020/9/15/38 >> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> >> --- >> note that budget parameter is not removed in this patch because it >> involves many driver changes, we can remove it in separate patch if >> this patch is accepted. >> --- >> net/core/skbuff.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> index e077447..03d0d28 100644 >> --- a/net/core/skbuff.c >> +++ b/net/core/skbuff.c >> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) >> >> void napi_consume_skb(struct sk_buff *skb, int budget) >> { >> - /* Zero budget indicate non-NAPI context called us, like netpoll */ >> - if (unlikely(!budget)) { >> + /* called by non-softirq context, which usually means non-NAPI >> + * context, like netpoll. >> + */ >> + if (unlikely(!in_softirq())) { >> dev_consume_skb_any(skb); >> return; >> } >> -- > > > I do not think we should add this kind of fuzzy logic, just because > _one_ driver author made a mistake. > > Add a disable_bh() in the driver slow path, and accept the _existing_ > semantic, the one that was understood by dozens. As my understanding, this patch did not change _existing_ semantic, it still only call _kfree_skb_defer() in softirq context. This patch just remove the requirement that a softirq context hint need to be provided to decide whether calling _kfree_skb_defer(). Yes, we can add DEBUG_NET() clauses to catch this kind of error as you suggested. But why we need such a debug clauses, when we can decide if delaying skb freeing is possible in napi_consume_skb(), why not just use in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any() API use "in_irq() || irqs_disabled()" checking to handle the irq context and non-irq context. > . > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() 2020-09-21 8:09 ` Yunsheng Lin @ 2020-09-21 8:17 ` Eric Dumazet 2020-09-21 8:40 ` Yunsheng Lin 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2020-09-21 8:17 UTC (permalink / raw) To: Yunsheng Lin Cc: David Miller, Jakub Kicinski, linmiaohe, martin.varghese, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, kyk.segfault, Saeed Mahameed, netdev, LKML, linuxarm On Mon, Sep 21, 2020 at 10:10 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > On 2020/9/21 15:19, Eric Dumazet wrote: > > On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: > >> > >> When napi_consume_skb() is called in the tx desc cleaning process, > >> it is usually in the softirq context(BH disabled, or are processing > >> softirqs), but it may also be in the task context, such as in the > >> netpoll or loopback selftest process. > >> > >> Currently napi_consume_skb() uses non-zero budget to indicate the > >> NAPI context, the driver writer may provide the wrong budget when > >> tx desc cleaning function is reused for both NAPI and non-NAPI > >> context, see [1]. > >> > >> So this patch uses in_softirq() to indicate the NAPI context, which > >> doesn't necessarily mean in NAPI context, but it shouldn't care if > >> NAPI context or not as long as it runs in softirq context or with BH > >> disabled, then _kfree_skb_defer() will push the skb to the particular > >> cpu' napi_alloc_cache atomically. > >> > >> [1] https://lkml.org/lkml/2020/9/15/38 > >> > >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > >> --- > >> note that budget parameter is not removed in this patch because it > >> involves many driver changes, we can remove it in separate patch if > >> this patch is accepted. > >> --- > >> net/core/skbuff.c | 6 ++++-- > >> 1 file changed, 4 insertions(+), 2 deletions(-) > >> > >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > >> index e077447..03d0d28 100644 > >> --- a/net/core/skbuff.c > >> +++ b/net/core/skbuff.c > >> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) > >> > >> void napi_consume_skb(struct sk_buff *skb, int budget) > >> { > >> - /* Zero budget indicate non-NAPI context called us, like netpoll */ > >> - if (unlikely(!budget)) { > >> + /* called by non-softirq context, which usually means non-NAPI > >> + * context, like netpoll. > >> + */ > >> + if (unlikely(!in_softirq())) { > >> dev_consume_skb_any(skb); > >> return; > >> } > >> -- > > > > > > I do not think we should add this kind of fuzzy logic, just because > > _one_ driver author made a mistake. > > > > Add a disable_bh() in the driver slow path, and accept the _existing_ > > semantic, the one that was understood by dozens. > > As my understanding, this patch did not change _existing_ semantic, > it still only call _kfree_skb_defer() in softirq context. This patch > just remove the requirement that a softirq context hint need to be > provided to decide whether calling _kfree_skb_defer(). I do not want to remove the requirement. > > Yes, we can add DEBUG_NET() clauses to catch this kind of error as > you suggested. > > But why we need such a debug clauses, when we can decide if delaying > skb freeing is possible in napi_consume_skb(), why not just use > in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any() > API use "in_irq() || irqs_disabled()" checking to handle the irq context > and non-irq context. I just do not like your patch. Copying another piece of fuzzy logic, inherited from legacy code is not an excuse. Add a local_bh_disable() in the driver slow path to meet _existing_ requirement, so that we can keep the hot path fast. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() 2020-09-21 8:17 ` Eric Dumazet @ 2020-09-21 8:40 ` Yunsheng Lin 2020-09-21 9:09 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Yunsheng Lin @ 2020-09-21 8:40 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, Jakub Kicinski, linmiaohe, martin.varghese, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, kyk.segfault, Saeed Mahameed, netdev, LKML, linuxarm On 2020/9/21 16:17, Eric Dumazet wrote: > On Mon, Sep 21, 2020 at 10:10 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >> >> On 2020/9/21 15:19, Eric Dumazet wrote: >>> On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >>>> >>>> When napi_consume_skb() is called in the tx desc cleaning process, >>>> it is usually in the softirq context(BH disabled, or are processing >>>> softirqs), but it may also be in the task context, such as in the >>>> netpoll or loopback selftest process. >>>> >>>> Currently napi_consume_skb() uses non-zero budget to indicate the >>>> NAPI context, the driver writer may provide the wrong budget when >>>> tx desc cleaning function is reused for both NAPI and non-NAPI >>>> context, see [1]. >>>> >>>> So this patch uses in_softirq() to indicate the NAPI context, which >>>> doesn't necessarily mean in NAPI context, but it shouldn't care if >>>> NAPI context or not as long as it runs in softirq context or with BH >>>> disabled, then _kfree_skb_defer() will push the skb to the particular >>>> cpu' napi_alloc_cache atomically. >>>> >>>> [1] https://lkml.org/lkml/2020/9/15/38 >>>> >>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> >>>> --- >>>> note that budget parameter is not removed in this patch because it >>>> involves many driver changes, we can remove it in separate patch if >>>> this patch is accepted. >>>> --- >>>> net/core/skbuff.c | 6 ++++-- >>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index e077447..03d0d28 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) >>>> >>>> void napi_consume_skb(struct sk_buff *skb, int budget) >>>> { >>>> - /* Zero budget indicate non-NAPI context called us, like netpoll */ >>>> - if (unlikely(!budget)) { >>>> + /* called by non-softirq context, which usually means non-NAPI >>>> + * context, like netpoll. >>>> + */ >>>> + if (unlikely(!in_softirq())) { >>>> dev_consume_skb_any(skb); >>>> return; >>>> } >>>> -- >>> >>> >>> I do not think we should add this kind of fuzzy logic, just because >>> _one_ driver author made a mistake. >>> >>> Add a disable_bh() in the driver slow path, and accept the _existing_ >>> semantic, the one that was understood by dozens. >> >> As my understanding, this patch did not change _existing_ semantic, >> it still only call _kfree_skb_defer() in softirq context. This patch >> just remove the requirement that a softirq context hint need to be >> provided to decide whether calling _kfree_skb_defer(). > > I do not want to remove the requirement. > >> >> Yes, we can add DEBUG_NET() clauses to catch this kind of error as >> you suggested. >> >> But why we need such a debug clauses, when we can decide if delaying >> skb freeing is possible in napi_consume_skb(), why not just use >> in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any() >> API use "in_irq() || irqs_disabled()" checking to handle the irq context >> and non-irq context. > > > I just do not like your patch. > > Copying another piece of fuzzy logic, inherited from legacy code is > not an excuse. > > Add a local_bh_disable() in the driver slow path to meet _existing_ > requirement, so that we can keep the hot path fast. "!in_softirq()" checking make the napi_consume_skb() slower than "!budget" checking? do I miss something? As a matter of fact, the hns3 driver has fixed this problem by passing zero-budget to napi_consume_skb() in non-NAPI context, this patch is more about how to avoid or catch this kind of error. So your opinion is still to catch this kind of error using something like DEBUG_NET() clauses? > . > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() 2020-09-21 8:40 ` Yunsheng Lin @ 2020-09-21 9:09 ` Eric Dumazet 0 siblings, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2020-09-21 9:09 UTC (permalink / raw) To: Yunsheng Lin, Eric Dumazet Cc: David Miller, Jakub Kicinski, linmiaohe, martin.varghese, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, kyk.segfault, Saeed Mahameed, netdev, LKML, linuxarm On 9/21/20 10:40 AM, Yunsheng Lin wrote: > On 2020/9/21 16:17, Eric Dumazet wrote: >> On Mon, Sep 21, 2020 at 10:10 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >>> >>> On 2020/9/21 15:19, Eric Dumazet wrote: >>>> On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >>>>> >>>>> When napi_consume_skb() is called in the tx desc cleaning process, >>>>> it is usually in the softirq context(BH disabled, or are processing >>>>> softirqs), but it may also be in the task context, such as in the >>>>> netpoll or loopback selftest process. >>>>> >>>>> Currently napi_consume_skb() uses non-zero budget to indicate the >>>>> NAPI context, the driver writer may provide the wrong budget when >>>>> tx desc cleaning function is reused for both NAPI and non-NAPI >>>>> context, see [1]. >>>>> >>>>> So this patch uses in_softirq() to indicate the NAPI context, which >>>>> doesn't necessarily mean in NAPI context, but it shouldn't care if >>>>> NAPI context or not as long as it runs in softirq context or with BH >>>>> disabled, then _kfree_skb_defer() will push the skb to the particular >>>>> cpu' napi_alloc_cache atomically. >>>>> >>>>> [1] https://lkml.org/lkml/2020/9/15/38 >>>>> >>>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> >>>>> --- >>>>> note that budget parameter is not removed in this patch because it >>>>> involves many driver changes, we can remove it in separate patch if >>>>> this patch is accepted. >>>>> --- >>>>> net/core/skbuff.c | 6 ++++-- >>>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>>> index e077447..03d0d28 100644 >>>>> --- a/net/core/skbuff.c >>>>> +++ b/net/core/skbuff.c >>>>> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb) >>>>> >>>>> void napi_consume_skb(struct sk_buff *skb, int budget) >>>>> { >>>>> - /* Zero budget indicate non-NAPI context called us, like netpoll */ >>>>> - if (unlikely(!budget)) { >>>>> + /* called by non-softirq context, which usually means non-NAPI >>>>> + * context, like netpoll. >>>>> + */ >>>>> + if (unlikely(!in_softirq())) { >>>>> dev_consume_skb_any(skb); >>>>> return; >>>>> } >>>>> -- >>>> >>>> >>>> I do not think we should add this kind of fuzzy logic, just because >>>> _one_ driver author made a mistake. >>>> >>>> Add a disable_bh() in the driver slow path, and accept the _existing_ >>>> semantic, the one that was understood by dozens. >>> >>> As my understanding, this patch did not change _existing_ semantic, >>> it still only call _kfree_skb_defer() in softirq context. This patch >>> just remove the requirement that a softirq context hint need to be >>> provided to decide whether calling _kfree_skb_defer(). >> >> I do not want to remove the requirement. >> >>> >>> Yes, we can add DEBUG_NET() clauses to catch this kind of error as >>> you suggested. >>> >>> But why we need such a debug clauses, when we can decide if delaying >>> skb freeing is possible in napi_consume_skb(), why not just use >>> in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any() >>> API use "in_irq() || irqs_disabled()" checking to handle the irq context >>> and non-irq context. >> >> >> I just do not like your patch. >> >> Copying another piece of fuzzy logic, inherited from legacy code is >> not an excuse. >> >> Add a local_bh_disable() in the driver slow path to meet _existing_ >> requirement, so that we can keep the hot path fast. > > "!in_softirq()" checking make the napi_consume_skb() slower than > "!budget" checking? do I miss something? Yes, you missed that we can _remove_ this condition completely, if drivers make sure to always have BH disabled. > > As a matter of fact, the hns3 driver has fixed this problem by > passing zero-budget to napi_consume_skb() in non-NAPI context, this > patch is more about how to avoid or catch this kind of error. I think I understood this. Having one error in hns3 does not mean we are going to slow down the stack. > > So your opinion is still to catch this kind of error using something > like DEBUG_NET() clauses? > Yes. Again, we want to keep fast path fast, not something that is an swiss army knife. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-09-21 9:09 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-09-21 2:04 [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb() Yunsheng Lin 2020-09-21 7:19 ` Eric Dumazet 2020-09-21 8:09 ` Yunsheng Lin 2020-09-21 8:17 ` Eric Dumazet 2020-09-21 8:40 ` Yunsheng Lin 2020-09-21 9:09 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).