All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] tipc: correctly unlink packets from deferred queue
@ 2013-12-16  9:46 erik.hugne
  2013-12-16 15:30 ` Paul Gortmaker
  0 siblings, 1 reply; 4+ messages in thread
From: erik.hugne @ 2013-12-16  9:46 UTC (permalink / raw)
  To: netdev, jon.maloy; +Cc: ying.xue, paul.gortmaker, tipc-discussion, Erik Hugne

From: Erik Hugne <erik.hugne@ericsson.com>

When we pull a packet from the deferred queue, the next
pointer for the current packet being processed might still
refer to deferred packets. This is incorrect, and will
lead to an oops if the last fragment have once been put on
the deferred queue, and at least one packet have been
deferred after this fragment. The result of this is that
the fragment chain linked together with the defer-queue.

We fix this by clearing the next pointer for the current
packet being processed.

[...] general protection fault: 0000
[...]
[...] ? trace_hardirqs_on+0xd/0x10
[...] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
[...] tipc_recv_msg+0x4e4/0x920 [tipc]
[...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
[...] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
[...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
[...] __netif_receive_skb_core+0x80b/0xd00
[...] ? __netif_receive_skb_core+0x144/0xd00
[...] __netif_receive_skb+0x26/0x70
[...] netif_receive_skb+0x2d/0x200

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Ying Xue <ying.xue@windriver.com>
---
 net/tipc/link.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 3d73144..447e2c4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1444,6 +1444,7 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
 		int type;
 
 		head = head->next;
+		buf->next = NULL;
 
 		/* Ensure bearer is still enabled */
 		if (unlikely(!b_ptr->active))
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next] tipc: correctly unlink packets from deferred queue
  2013-12-16  9:46 [PATCH net-next] tipc: correctly unlink packets from deferred queue erik.hugne
@ 2013-12-16 15:30 ` Paul Gortmaker
  2013-12-16 16:35   ` Erik Hugne
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Gortmaker @ 2013-12-16 15:30 UTC (permalink / raw)
  To: erik.hugne, netdev, jon.maloy; +Cc: tipc-discussion

On 13-12-16 04:46 AM, erik.hugne@ericsson.com wrote:
> From: Erik Hugne <erik.hugne@ericsson.com>
> 
> When we pull a packet from the deferred queue, the next
> pointer for the current packet being processed might still
> refer to deferred packets. This is incorrect, and will
> lead to an oops if the last fragment have once been put on
> the deferred queue, and at least one packet have been

Once again, I have to ask when this behaviour was introduced.
This should always be a question that you ask yourself, and
that you consider putting in the commit log.  Please add it
to your self-check list.

So, is this a fail we introduce with the pending two series,
or with the series already taken by DaveM?

Otherwise, if it is an older problem than that, then why
is this tagged net-next?  It looks like a genuine bug fix
for an oops, if the existing mainline code has this bug.

> deferred after this fragment. The result of this is that
> the fragment chain linked together with the defer-queue.

"...chain is linked ..."   ?

> 
> We fix this by clearing the next pointer for the current
> packet being processed.
> 
> [...] general protection fault: 0000

Was this all that was in the header?  Seems overly edited, and
missing content (registers, EIP, etc.)

> [...]
> [...] ? trace_hardirqs_on+0xd/0x10
> [...] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
> [...] tipc_recv_msg+0x4e4/0x920 [tipc]
> [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
> [...] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
> [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
> [...] __netif_receive_skb_core+0x80b/0xd00
> [...] ? __netif_receive_skb_core+0x144/0xd00
> [...] __netif_receive_skb+0x26/0x70
> [...] netif_receive_skb+0x2d/0x200

Same here, why have you bothered to clobber the addresses?
Deleting the printk time prefix from non-time critical bugs is
fine, but don't delete the addresses, since they convey some
relative information about functions nearby etc.

Paul.
--

> 
> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
> Reported-by: Ying Xue <ying.xue@windriver.com>
> ---
>  net/tipc/link.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 3d73144..447e2c4 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1444,6 +1444,7 @@ void tipc_recv_msg(struct sk_buff *head, struct tipc_bearer *b_ptr)
>  		int type;
>  
>  		head = head->next;
> +		buf->next = NULL;
>  
>  		/* Ensure bearer is still enabled */
>  		if (unlikely(!b_ptr->active))
> 

------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT 
organizations don't have a clear picture of how application performance 
affects their revenue. With AppDynamics, you get 100% visibility into your 
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next] tipc: correctly unlink packets from deferred queue
  2013-12-16 15:30 ` Paul Gortmaker
@ 2013-12-16 16:35   ` Erik Hugne
  2013-12-16 18:11     ` Paul Gortmaker
  0 siblings, 1 reply; 4+ messages in thread
From: Erik Hugne @ 2013-12-16 16:35 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: netdev, jon.maloy, ying.xue, tipc-discussion

On Mon, Dec 16, 2013 at 10:30:42AM -0500, Paul Gortmaker wrote:
> On 13-12-16 04:46 AM, erik.hugne@ericsson.com wrote:
> > From: Erik Hugne <erik.hugne@ericsson.com>
> > 
> > When we pull a packet from the deferred queue, the next
> > pointer for the current packet being processed might still
> > refer to deferred packets. This is incorrect, and will
> > lead to an oops if the last fragment have once been put on
> > the deferred queue, and at least one packet have been
> 
> Once again, I have to ask when this behaviour was introduced.
> This should always be a question that you ask yourself, and
> that you consider putting in the commit log.  Please add it
> to your self-check list.
> 
> So, is this a fail we introduce with the pending two series,
> or with the series already taken by DaveM?

The problem have always been there, but the window for when
it may occur increased after commit 40ba3cdf5
tipc: message reassembly using fragment chain

> 
> Otherwise, if it is an older problem than that, then why
> is this tagged net-next?  It looks like a genuine bug fix
> for an oops, if the existing mainline code has this bug.
> 
> > deferred after this fragment. The result of this is that
> > the fragment chain linked together with the defer-queue.
> 
> "...chain is linked ..."   ?
What we have seen is that after successful delivery of a 
fragmented message, the last packet in the fragment chain
will point into the deferred queue. When we later free the
chain, kfree_skb_list will also free packets from the defer-queue.

In theory, the same thing can occur for non-fragmented traffic
aswell.

> 
> > 
> > We fix this by clearing the next pointer for the current
> > packet being processed.
> > 
> > [...] general protection fault: 0000
> 
> Was this all that was in the header?  Seems overly edited, and
> missing content (registers, EIP, etc.)
> 
> > [...]
> > [...] ? trace_hardirqs_on+0xd/0x10
> > [...] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
> > [...] tipc_recv_msg+0x4e4/0x920 [tipc]
> > [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
> > [...] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
> > [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
> > [...] __netif_receive_skb_core+0x80b/0xd00
> > [...] ? __netif_receive_skb_core+0x144/0xd00
> > [...] __netif_receive_skb+0x26/0x70
> > [...] netif_receive_skb+0x2d/0x200
> 
> Same here, why have you bothered to clobber the addresses?
> Deleting the printk time prefix from non-time critical bugs is
> fine, but don't delete the addresses, since they convey some
> relative information about functions nearby etc.

Just trying to avoid an unnecessarily verbose commit message.
As the oops was from Ying's test system with non-upstream tipc
code i didn't think the addresses added any value

Should i do an edit/resend anyway?

//E

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next] tipc: correctly unlink packets from deferred queue
  2013-12-16 16:35   ` Erik Hugne
@ 2013-12-16 18:11     ` Paul Gortmaker
  0 siblings, 0 replies; 4+ messages in thread
From: Paul Gortmaker @ 2013-12-16 18:11 UTC (permalink / raw)
  To: Erik Hugne; +Cc: netdev, jon.maloy, ying.xue, tipc-discussion

On 13-12-16 11:35 AM, Erik Hugne wrote:
> On Mon, Dec 16, 2013 at 10:30:42AM -0500, Paul Gortmaker wrote:
>> On 13-12-16 04:46 AM, erik.hugne@ericsson.com wrote:
>>> From: Erik Hugne <erik.hugne@ericsson.com>
>>>
>>> When we pull a packet from the deferred queue, the next
>>> pointer for the current packet being processed might still
>>> refer to deferred packets. This is incorrect, and will
>>> lead to an oops if the last fragment have once been put on
>>> the deferred queue, and at least one packet have been
>>
>> Once again, I have to ask when this behaviour was introduced.
>> This should always be a question that you ask yourself, and
>> that you consider putting in the commit log.  Please add it
>> to your self-check list.
>>
>> So, is this a fail we introduce with the pending two series,
>> or with the series already taken by DaveM?
> 
> The problem have always been there, but the window for when
> it may occur increased after commit 40ba3cdf5
> tipc: message reassembly using fragment chain

OK, so put that in the commit log:

"It is our understanding that this problem has always existed,
however, with the recent change of commit 40ba3cdf5
("tipc: message reassembly using fragment chain"), the window
for it possibly happening has increased."

In this case, your choice of net-next is probably OK, given
the above new information, but  we need to put the info in the
commit log, so it makes it more clear for net vs. net-next.

In the end, Dave makes the final decision of net vs net-next,
based on the information we provide him, along with information
you aren't aware of, like how deep we are into the current rcN,
whether Linus is in a good mood and so on.  If we aren't sure,
we can even specify that, after the three dashes in the commit,
by listing the pros/cons of each.

> 
>>
>> Otherwise, if it is an older problem than that, then why
>> is this tagged net-next?  It looks like a genuine bug fix
>> for an oops, if the existing mainline code has this bug.
>>
>>> deferred after this fragment. The result of this is that
>>> the fragment chain linked together with the defer-queue.
>>
>> "...chain is linked ..."   ?
>
> What we have seen is that after successful delivery of a 
> fragmented message, the last packet in the fragment chain
> will point into the deferred queue. When we later free the
> chain, kfree_skb_list will also free packets from the defer-queue.

My comment was wrt. the missing "if" in the sentence, but I
like the more detailed paragraph you have given above better.

> 
> In theory, the same thing can occur for non-fragmented traffic
> aswell.
> 
>>
>>>
>>> We fix this by clearing the next pointer for the current
>>> packet being processed.
>>>
>>> [...] general protection fault: 0000
>>
>> Was this all that was in the header?  Seems overly edited, and
>> missing content (registers, EIP, etc.)
>>
>>> [...]
>>> [...] ? trace_hardirqs_on+0xd/0x10
>>> [...] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
>>> [...] tipc_recv_msg+0x4e4/0x920 [tipc]
>>> [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
>>> [...] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
>>> [...] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
>>> [...] __netif_receive_skb_core+0x80b/0xd00
>>> [...] ? __netif_receive_skb_core+0x144/0xd00
>>> [...] __netif_receive_skb+0x26/0x70
>>> [...] netif_receive_skb+0x2d/0x200
>>
>> Same here, why have you bothered to clobber the addresses?
>> Deleting the printk time prefix from non-time critical bugs is
>> fine, but don't delete the addresses, since they convey some
>> relative information about functions nearby etc.
> 
> Just trying to avoid an unnecessarily verbose commit message.
> As the oops was from Ying's test system with non-upstream tipc
> code i didn't think the addresses added any value

You don't need to mangle the log, but it is always good to
specify up front if non-merged code is in use.  We have had
people come to netdev in the past, saying there is a bug in
core code like net/core/dev.c -- and only after digging in
do we find they are using some broken (un-reviewed) patches
from some vendor SDK in their kernel, which actually caused
the issue in that particular case.

So, something like "Note that the above backtrace was observed
on a tree with some additional pending TIPC changes in place,
but nothing in those changes, or in the backtrace above appears
to play a role in causing this issue."  Or, even better, simply
prove to yourself that you _can_ reproduce it on what is already
accepted into the net-next code base, if the WIP in Ying's test
system isn't at all related to this problem.

> 
> Should i do an edit/resend anyway?

Given all of the above, yes please.

Paul.
--

> 
> //E
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-12-16 18:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-16  9:46 [PATCH net-next] tipc: correctly unlink packets from deferred queue erik.hugne
2013-12-16 15:30 ` Paul Gortmaker
2013-12-16 16:35   ` Erik Hugne
2013-12-16 18:11     ` Paul Gortmaker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.