All of lore.kernel.org
* Corrupted SKB
@ 2017-04-18  0:39 Michael Ma
  2017-04-18 23:12 ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ma @ 2017-04-18  0:39 UTC (permalink / raw)
  To: Linux Kernel Network Developers

Hi -

We've implemented a "glue" qdisc similar to mqprio which can associate
one qdisc with multiple txqs as the root qdisc. Reference counts of the
child qdiscs have been adjusted properly in this case so that they
represent the number of txqs each has been attached to. However, when
sending packets we saw the skb from dequeue_skb() corrupted, with the
following call stack:

    [exception RIP: netif_skb_features+51]
    RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246

 #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
#10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
#11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
#12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03

It looks like the skb has already been released since its dev pointer
field is invalid.

Any clue on how this can be investigated further? My current thought
is to add some instrumentation to the place where the skb is released
and analyze whether there is any race condition happening there.
However, by looking through the existing code I think the case where
one root qdisc is associated with multiple txqs already exists (when
mqprio is not used), so I'm not sure why it doesn't work when we group
txqs and assign each group a root qdisc. Any insight on this issue
would be much appreciated!

Thanks,
Michael


* Re: Corrupted SKB
  2017-04-18  0:39 Corrupted SKB Michael Ma
@ 2017-04-18 23:12 ` Cong Wang
  2017-04-19  4:46   ` Michael Ma
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2017-04-18 23:12 UTC (permalink / raw)
  To: Michael Ma; +Cc: Linux Kernel Network Developers

On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@gmail.com> wrote:
> Hi -
>
> We've implemented a "glue" qdisc similar to mqprio which can associate
> one qdisc with multiple txqs as the root qdisc. Reference counts of the
> child qdiscs have been adjusted properly in this case so that they
> represent the number of txqs each has been attached to. However, when
> sending packets we saw the skb from dequeue_skb() corrupted, with the
> following call stack:
>
>     [exception RIP: netif_skb_features+51]
>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>
>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>
> It looks like the skb has already been released since its dev pointer
> field is invalid.
>
> Any clue on how this can be investigated further? My current thought
> is to add some instrumentation to the place where skb is released and
> analyze whether there is any race condition happening there. However

Either dropwatch or perf could do the work to instrument kfree_skb().
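For example, a minimal perf sketch (assumes perf is installed and the
kernel exposes the skb:kfree_skb tracepoint; the wrapper function name is
illustrative):

```shell
# Illustrative wrapper: record kfree_skb() events system-wide with call
# graphs for 10 seconds, then print the captured stacks. Run as root.
trace_kfree_skb() {
    perf record -e skb:kfree_skb -a -g -- sleep 10 && perf script
}

# dropwatch alternative (interactive): run `dropwatch -l kas`, then `start`.
echo "trace_kfree_skb defined; run as root to capture freed-skb stacks"
```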

> by looking through the existing code I think the case where one root
> qdisc is associated with multiple txqs already exists (when mqprio is
> not used), so I'm not sure why it doesn't work when we group txqs and
> assign each group a root qdisc. Any insight on this issue would be
> much appreciated!

How do you implement ->attach()? How does it work with netdev_pick_tx()?


* Re: Corrupted SKB
  2017-04-18 23:12 ` Cong Wang
@ 2017-04-19  4:46   ` Michael Ma
  2017-04-26  4:42     ` Michael Ma
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ma @ 2017-04-19  4:46 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, jin.oyj

2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangcong@gmail.com>:
> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@gmail.com> wrote:
>> Hi -
>>
>> We've implemented a "glue" qdisc similar to mqprio which can associate
>> one qdisc with multiple txqs as the root qdisc. Reference counts of the
>> child qdiscs have been adjusted properly in this case so that they
>> represent the number of txqs each has been attached to. However, when
>> sending packets we saw the skb from dequeue_skb() corrupted, with the
>> following call stack:
>>
>>     [exception RIP: netif_skb_features+51]
>>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>>
>>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>
>> It looks like the skb has already been released since its dev pointer
>> field is invalid.
>>
>> Any clue on how this can be investigated further? My current thought
>> is to add some instrumentation to the place where skb is released and
>> analyze whether there is any race condition happening there. However
>
> Either dropwatch or perf could do the work to instrument kfree_skb().

Thanks - will try it out.
>
>> by looking through the existing code I think the case where one root
>> qdisc is associated with multiple txqs already exists (when mqprio is
>> not used), so I'm not sure why it doesn't work when we group txqs and
>> assign each group a root qdisc. Any insight on this issue would be
>> much appreciated!
>
> How do you implement ->attach()? How does it work with netdev_pick_tx()?

attach() essentially grafts the default qdisc (pfifo) to each "txq
group" represented by a TC class. For netdev_pick_tx() we use the
classid of the socket to select a class based on a "class id base" and
the class-to-txq mapping defined together with this glue qdisc - it's
pretty much the same as mqprio, with the difference that one class maps
to multiple txqs and the txq is selected through a hash.


* Re: Corrupted SKB
  2017-04-19  4:46   ` Michael Ma
@ 2017-04-26  4:42     ` Michael Ma
  2017-04-26 17:38       ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ma @ 2017-04-26  4:42 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, jin.oyj

2017-04-18 21:46 GMT-07:00 Michael Ma <make0818@gmail.com>:
> 2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangcong@gmail.com>:
>> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@gmail.com> wrote:
>>> Hi -
>>>
>>> We've implemented a "glue" qdisc similar to mqprio which can associate
>>> one qdisc with multiple txqs as the root qdisc. Reference counts of the
>>> child qdiscs have been adjusted properly in this case so that they
>>> represent the number of txqs each has been attached to. However, when
>>> sending packets we saw the skb from dequeue_skb() corrupted, with the
>>> following call stack:
>>>
>>>     [exception RIP: netif_skb_features+51]
>>>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>>>
>>>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>>
>>> It looks like the skb has already been released since its dev pointer
>>> field is invalid.
>>>
>>> Any clue on how this can be investigated further? My current thought
>>> is to add some instrumentation to the place where skb is released and
>>> analyze whether there is any race condition happening there. However
>>
>> Either dropwatch or perf could do the work to instrument kfree_skb().
>
> Thanks - will try it out.

I'm using perf to collect the call stack for kfree_skb and trying to
correlate that with the corrupted SKB address. However, when the
system crashes the perf.data file is also corrupted - how can I view
this file when the system crashes before perf exits?
>>
>>> by looking through the existing code I think the case where one root
>>> qdisc is associated with multiple txqs already exists (when mqprio is
>>> not used), so I'm not sure why it doesn't work when we group txqs and
>>> assign each group a root qdisc. Any insight on this issue would be
>>> much appreciated!
>>
>> How do you implement ->attach()? How does it work with netdev_pick_tx()?
>
> attach() essentially grafts the default qdisc (pfifo) to each "txq
> group" represented by a TC class. For netdev_pick_tx() we use the
> classid of the socket to select a class based on a "class id base" and
> the class-to-txq mapping defined together with this glue qdisc - it's
> pretty much the same as mqprio, with the difference that one class maps
> to multiple txqs and the txq is selected through a hash.


* Re: Corrupted SKB
  2017-04-26  4:42     ` Michael Ma
@ 2017-04-26 17:38       ` Cong Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Cong Wang @ 2017-04-26 17:38 UTC (permalink / raw)
  To: Michael Ma; +Cc: Linux Kernel Network Developers, jin.oyj

On Tue, Apr 25, 2017 at 9:42 PM, Michael Ma <make0818@gmail.com> wrote:
> 2017-04-18 21:46 GMT-07:00 Michael Ma <make0818@gmail.com>:
>> 2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangcong@gmail.com>:
>>> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@gmail.com> wrote:
>>>> Hi -
>>>>
>>>> We've implemented a "glue" qdisc similar to mqprio which can associate
>>>> one qdisc with multiple txqs as the root qdisc. Reference counts of the
>>>> child qdiscs have been adjusted properly in this case so that they
>>>> represent the number of txqs each has been attached to. However, when
>>>> sending packets we saw the skb from dequeue_skb() corrupted, with the
>>>> following call stack:
>>>>
>>>>     [exception RIP: netif_skb_features+51]
>>>>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>>>>
>>>>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>>>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>>>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>>>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>>>
>>>> It looks like the skb has already been released since its dev pointer
>>>> field is invalid.
>>>>
>>>> Any clue on how this can be investigated further? My current thought
>>>> is to add some instrumentation to the place where skb is released and
>>>> analyze whether there is any race condition happening there. However
>>>
>>> Either dropwatch or perf could do the work to instrument kfree_skb().
>>
>> Thanks - will try it out.
>
> I'm using perf to collect the call stack for kfree_skb and trying to
> correlate that with the corrupted SKB address. However, when the
> system crashes the perf.data file is also corrupted - how can I view
> this file when the system crashes before perf exits?

Hmm, KASAN is pretty good at detecting use-after-free; its report
nicely shows where the object was allocated and freed, as well as
where the use-after-free happened.

https://01.org/linuxgraphics/gfx-docs/drm/dev-tools/kasan.html
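KASAN is a compile-time feature, so it means rebuilding the kernel; a
minimal config fragment per the docs above (inline mode needs a newer
GCC, outline mode works with older toolchains):

```
# Rebuild the kernel with KASAN compiled in.
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
```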


Thread overview: 5+ messages
2017-04-18  0:39 Corrupted SKB Michael Ma
2017-04-18 23:12 ` Cong Wang
2017-04-19  4:46   ` Michael Ma
2017-04-26  4:42     ` Michael Ma
2017-04-26 17:38       ` Cong Wang
