linux-kernel.vger.kernel.org archive mirror
* igb and bnx2: "NETDEV WATCHDOG: transmit queue timed out" when skb has huge linear buffer
@ 2014-01-30 19:08 Zoltan Kiss
  2014-01-30 20:34 ` Michael Chan
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Zoltan Kiss @ 2014-01-30 19:08 UTC
  To: Jeff Kirsher, Jesse Brandeburg, Bruce Allan, Carolyn Wyborny,
	Don Skidmore, Greg Rose, Peter P Waskiewicz Jr, Alex Duyck,
	John Ronciak, Tushar Dave, Akeem G Abodunrin, David S. Miller,
	e1000-devel, netdev, linux-kernel, Michael Chan, xen-devel

Hi,

I've been seeing the queue timeouts mentioned in the subject with igb and
bnx2 cards. I haven't seen them on other cards so far. I'm using XenServer
with a 3.10 Dom0 kernel (although igb has already been updated to the latest
version), and there are Windows guests sending data through these cards. I
noticed these problems in XenRT test runs. I know they usually indicate a
lost interrupt or some other hardware error, but in my case they started to
appear more often, and they are likely connected to my netback grant mapping
patches. These patches cause skbs with huge (~64 KB) linear buffers to
appear more often.
The reason for that is an old problem in the ring protocol: originally the
maximum number of slots was tied to MAX_SKB_FRAGS, as every slot ended up
as a frag of the skb. When this value was changed, netback had to cope with
the situation by coalescing the packets into fewer frags.
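
To make the slot/frag mismatch concrete, here is a minimal sketch in plain
C. The numbers 18 and 17 used in the usage note are illustrative
assumptions, not values stated in this message, and leftover_slots() is a
hypothetical helper, not a netback function.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many ring slots cannot become frags of a
 * single skb when a packet arrives in nr_slots slots and one skb can
 * hold at most max_frags frags.
 */
unsigned int leftover_slots(unsigned int nr_slots, unsigned int max_frags)
{
	/* A frag limit of zero would make the question meaningless. */
	assert(max_frags > 0);

	return nr_slots > max_frags ? nr_slots - max_frags : 0;
}
```

For example, assuming a guest may send up to 18 slots while an skb holds at
most 17 frags, leftover_slots(18, 17) leaves one slot that has to be either
coalesced or placed somewhere else.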
My patch series takes a different approach: the leftover slots (pages) are
assigned to a new skb's frags, and that skb is stashed on the frag_list of
the first one. Then, before sending it off to the stack, netback calls
skb = skb_copy_expand(skb, 0, 0, GFP_ATOMIC | __GFP_NOWARN), which basically
creates a new skb and copies all the data into it. As far as I understand,
it puts everything into the linear buffer, which can amount to 64 KB at
most. The original skb is then freed, and the new one is sent to the stack.
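
The effect of that copy can be sketched with a userspace model. This is not
kernel code: struct toy_skb, toy_skb_len() and toy_skb_copy_linear() are
hypothetical stand-ins invented for illustration, and the sketch only
assumes what is described above, namely that the copy ends up with all data
in one linear buffer and no frags or frag_list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_FRAGS 32	/* arbitrary size for this model only */

/* Hypothetical stand-in for an skb; NOT the kernel's struct sk_buff. */
struct toy_skb {
	unsigned char *linear;			/* linear buffer */
	size_t linear_len;
	const unsigned char *frags[TOY_MAX_FRAGS];	/* page fragments */
	size_t frag_len[TOY_MAX_FRAGS];
	unsigned int nr_frags;
	struct toy_skb *frag_list;		/* chained skb, or NULL */
};

/* Total payload: linear part plus frags, over the whole frag_list chain. */
size_t toy_skb_len(const struct toy_skb *skb)
{
	size_t len = 0;
	unsigned int i;

	for (; skb; skb = skb->frag_list) {
		len += skb->linear_len;
		for (i = 0; i < skb->nr_frags; i++)
			len += skb->frag_len[i];
	}
	return len;
}

/*
 * Build a new toy_skb whose linear buffer holds all the data of the
 * original, in order. Returns NULL if allocation fails. The caller owns
 * the result and frees copy->linear and copy.
 */
struct toy_skb *toy_skb_copy_linear(const struct toy_skb *skb)
{
	size_t total = toy_skb_len(skb), off = 0;
	struct toy_skb *copy = calloc(1, sizeof(*copy));
	unsigned int i;

	if (!copy)
		return NULL;
	copy->linear = malloc(total ? total : 1);
	if (!copy->linear) {
		free(copy);
		return NULL;
	}
	for (; skb; skb = skb->frag_list) {
		if (skb->linear_len) {
			memcpy(copy->linear + off, skb->linear, skb->linear_len);
			off += skb->linear_len;
		}
		for (i = 0; i < skb->nr_frags; i++) {
			memcpy(copy->linear + off, skb->frags[i], skb->frag_len[i]);
			off += skb->frag_len[i];
		}
	}
	assert(off == total);	/* every byte landed in the linear buffer */
	copy->linear_len = total;
	return copy;
}
```

In this model, a head skb with a full set of page-sized frags plus a chained
skb holding the leftover pages collapses into one object whose linear_len is
the whole packet, which is the kind of large linear skb the drivers then
receive.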
I suspect this is the problem, as it only happens when guests send too many
slots. Has anyone familiar with these drivers seen such an issue before
(i.e. skbs of this kind getting stuck in the queue)?

Regards,

Zoltan Kiss


Thread overview: 11+ messages
2014-01-30 19:08 igb and bnx2: "NETDEV WATCHDOG: transmit queue timed out" when skb has huge linear buffer Zoltan Kiss
2014-01-30 20:34 ` Michael Chan
2014-01-31 13:29   ` Zoltan Kiss
2014-02-04 19:47     ` Michael Chan
2014-02-05 20:23       ` Zoltan Kiss
2014-02-05 20:27         ` Zoltan Kiss
2014-02-05 20:43         ` Andrew Cooper
2014-02-06  9:58           ` Zoltan Kiss
2014-01-31 18:56 ` Wei Liu
2014-02-04 21:32   ` Zoltan Kiss
2014-02-12 17:13 ` Zoltan Kiss
