* sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL
@ 2015-10-19 15:01 Yasushi SHOJI
  2015-10-20 20:48 ` Sergei Shtylyov
From: Yasushi SHOJI @ 2015-10-19 15:01 UTC (permalink / raw)
  To: netdev; +Cc: yashi

Hi,

In a low-memory situation where netdev_alloc_skb() fails,
mdp->rx_skbuff[entry] can be left NULL; however, sh_eth_rx() seems to
access it without checking for NULL in the following code:

			skb = mdp->rx_skbuff[entry];
			mdp->rx_skbuff[entry] = NULL;
			if (mdp->cd->rpadir)
				skb_reserve(skb, NET_IP_ALIGN);
			dma_unmap_single(&ndev->dev, rxdesc->addr,
					 ALIGN(mdp->rx_buf_sz, 16),
					 DMA_FROM_DEVICE);

I've put a BUG_ON() there to test skb and got the following backtrace.  I
can also enable SLUB poisoning and see some bad accesses.
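
Roughly, the check I added for testing looks like this (a sketch from
memory, not the exact diff; the placement inside sh_eth_rx() may differ):

			skb = mdp->rx_skbuff[entry];
			/* catch the refill failure described above: the
			 * entry was left NULL, so skb_reserve()/skb_put()
			 * below would oops on it */
			BUG_ON(skb == NULL);
			mdp->rx_skbuff[entry] = NULL;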

I'm not that familiar with this code base, so I'm not including any
patch yet.  I'd appreciate it if someone with insight into this code could
give it a quick look and tell me whether it's a real issue or not.  If this
is a real case, I can take a deeper look.

BTW, the backtrace is from an old 3.4.74+ kernel, but the current tip is
very close.

Thanks,


Unable to handle kernel NULL pointer dereference at virtual address 00000050
pgd = 8490c000
[00000050] *pgd=4651e831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in:
CPU: 0    Not tainted  (3.4-at16 #9)
PC is at skb_put+0x10/0x98
LR is at sh_eth_poll+0x2c8/0xa10
pc : [<8035f780>]    lr : [<8028bf50>]    psr: 60000113
sp : 84eb1a90  ip : 84eb1ac8  fp : 84eb1ac4
r10: 0000003f  r9 : 000005ea  r8 : 00000000
r7 : 00000000  r6 : 940453b0  r5 : 00030000  r4 : 9381b180
r3 : 00000000  r2 : 00000000  r1 : 000005ea  r0 : 00000000
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c53c7d  Table: 4248c059  DAC: 00000015
Process klogd (pid: 2046, stack limit = 0x84eb02e8)
Stack: (0x84eb1a90 to 0x84eb2000)
1a80:                                     84eb1ab4 84eb1aa0 a0000113 9381b180
1aa0: 00030000 00000001 0000012c 9381b180 00030000 940453b0 84eb1b1c 84eb1ac8
1ac0: 8028bf50 8035f77c 84eb1b54 84eb1ad8 8000d7c0 80008494 00000000 00000000
1ae0: 98848000 80584444 00000020 9381b594 00000000 9381b594 80594c00 00000001
1b00: 0000012c 00000020 00000003 80594c08 84eb1b54 84eb1b20 80368ff0 8028bc94
1b20: 80010dc4 ffff8f3f 84eb1b28 84eb0000 00000001 805a6acc 40000102 00000009
1b40: 00000003 00000000 84eb1b8c 84eb1b58 800230d4 80368f90 938106c0 00000000
1b60: 84eb1b84 0000008e 00000000 84eb1bd8 84eb1c0c 86530540 86530c40 803f9d18
1b80: 84eb1b9c 84eb1b90 8002359c 80023050 84eb1bb4 84eb1ba0 8000e4b0 80023558
1ba0: 98802000 80562074 84eb1bd4 84eb1bb8 800084c4 8000e450 8003cd18 80070aa0
1bc0: a0000013 ffffffff 84eb1c2c 84eb1bd8 8000d7c0 80008494 80594f2c 00000002
1be0: 84eb1c20 86770440 ffffffff 82766000 00000002 80594f2c 86530540 86530c40
1c00: 803f9d18 84eb1c2c 84eb1c30 84eb1c20 8003cd18 80070aa0 a0000013 ffffffff
1c20: 84eb1c54 84eb1c30 8003cd18 80070a90 00000000 80576260 8276601c 86770440
1c40: 9396d100 84eb0000 84eb1c6c 84eb1c58 8003cd68 8003cd00 00000000 80576218
1c60: 84eb1cf4 84eb1c70 8000dbd4 8003cd54 00000002 9396d100 84eb1ca4 84eb1c88
1c80: 8003ff50 803f9ce4 82767d6c 00000001 865221f4 00000001 84eb1cb4 84eb1ca8
1ca0: 84eb1ccc 84eb1cb0 80010dc4 8000fbf8 8000fc1c 84eb0000 00000001 00000001
1cc0: 84eb1cf4 84eb1cd0 8000fc58 84eb0000 00000001 00000001 86522200 000000c3
1ce0: 84eb0000 84eb1e58 84eb1d0c 84eb1cf8 803f9d18 803f9744 00000002 a0000013
1d00: 84eb1d3c 84eb1d10 8003fd9c 803f9ce4 000000c3 80010dac 84eb1dd4 86558200
1d20: 00000001 7fffffff 86ae5400 00000069 84eb1d54 84eb1d40 8035a380 8003fd30
1d40: 8035a338 86558200 84eb1dd4 84eb1d58 803c9950 8035a344 84eb1d8c 84eb1d68
1d60: 803f6270 805908fc 00000000 00000000 93713380 00000069 00000000 84eb1e58
1d80: 84eb1df4 84eb1d90 00000000 00000000 00000000 00000000 00000000 00000000
1da0: 00000000 00000000 84eb1dd4 93713380 84eb1f3c 00000069 00000000 00000000
1dc0: 84eb0000 00000000 84eb1eb4 84eb1dd8 80357428 803c94fc 8035f560 00000689
1de0: 00000000 00000001 ffffffff 00000000 00000000 00000000 00000000 00000000
1e00: 86770440 8035f560 00000000 00000000 8035f0f8 800aefd0 84eb1e58 00040000
1e20: 940453b0 0000003b a0000113 017fffff 84eb1e54 84eb1e40 80019124 8003f264
1e40: 800190e4 9381b180 84eb1eac 84eb1e58 8028c63c 800190f0 84eb1e7c 84eb1e68
1e60: 8002363c 00000069 93713380 00000000 84eb1d88 84eb1f3c 00000020 9381b594
1e80: 84eb1f60 84eb1f64 76fe94d0 00004000 84eb1eb4 84eb1ea0 000b9188 00000069
1ea0: 93713380 00004000 84eb1f8c 84eb1eb8 80358c74 80357388 80010dc4 8000fbf8
1ec0: 8000fc1c 84eb0000 00000001 84eb1f68 84eb1f04 84eb1ee0 8000fc58 80010dac
1ee0: 84eb1f2c 84eb1f20 8003f314 800235e8 00000000 8003f314 84eb1f1c 84eb1f08
1f00: 8003f314 8000fc28 0000008e 00000000 84eb1f2c 84eb1f20 800235e8 8003f264
1f20: 84eb1f44 84eb1f30 84eb1f64 84eb1f38 80048254 8031d998 80000000 00000000
1f40: 00000000 84eb1f58 00000001 00000000 00000000 00004000 000b91f1 00000000
1f60: fffffff7 00000000 00000014 000b9188 76fe94d0 00000121 8000dd84 00000000
1f80: 84eb1fa4 84eb1f90 80358cb8 80358bc8 00000000 00000000 00000000 84eb1fa8
1fa0: 8000dc00 80358ca4 00000014 000b9188 00000003 000b9188 00000069 00004000
1fc0: 00000014 000b9188 76fe94d0 00000121 00000000 0009f588 7ecfacf8 00001ffc
1fe0: ffffffff 7ecfac40 76f1800b 76f1b88c 60000010 00000003 0002c122 0002c0c2
Backtrace:
[<8035f770>] (skb_put+0x0/0x98) from [<8028bf50>] (sh_eth_poll+0x2c8/0xa10)
 r6:940453b0 r5:00030000 r4:9381b180
[<8028bc88>] (sh_eth_poll+0x0/0xa10) from [<80368ff0>] (net_rx_action+0x6c/0x150)
[<80368f84>] (net_rx_action+0x0/0x150) from [<800230d4>] (__do_softirq+0x90/0x128)
[<80023044>] (__do_softirq+0x0/0x128) from [<8002359c>] (irq_exit+0x50/0xa4)
[<8002354c>] (irq_exit+0x0/0xa4) from [<8000e4b0>] (handle_IRQ+0x6c/0x8c)
[<8000e444>] (handle_IRQ+0x0/0x8c) from [<800084c4>] (gic_handle_irq+0x3c/0x54)
 r5:80562074 r4:98802000
[<80008488>] (gic_handle_irq+0x0/0x54) from [<8000d7c0>] (__irq_svc+0x40/0x70)
Exception stack(0x84eb1bd8 to 0x84eb1c20)
1bc0:                                                       80594f2c 00000002
1be0: 84eb1c20 86770440 ffffffff 82766000 00000002 80594f2c 86530540 86530c40
1c00: 803f9d18 84eb1c2c 84eb1c30 84eb1c20 8003cd18 80070aa0 a0000013 ffffffff
 r6:ffffffff r5:a0000013 r4:80070aa0 r3:8003cd18
[<80070a84>] (__rcu_read_lock+0x0/0x2c) from [<8003cd18>] (__atomic_notifier_call_chain+0x24/0x54)
[<8003ccf4>] (__atomic_notifier_call_chain+0x0/0x54) from [<8003cd68>] (atomic_notifier_call_chain+0x20/0x28)
 r7:84eb0000 r6:9396d100 r5:86770440 r4:8276601c
[<8003cd48>] (atomic_notifier_call_chain+0x0/0x28) from [<8000dbd4>] (__switch_to+0x2c/0x58)
[<803f9738>] (__schedule+0x0/0x4fc) from [<803f9d18>] (preempt_schedule+0x40/0x5c)
[<803f9cd8>] (preempt_schedule+0x0/0x5c) from [<8003fd9c>] (__wake_up_sync_key+0x78/0x80)
 r4:a0000013 r3:00000002
[<8003fd24>] (__wake_up_sync_key+0x0/0x80) from [<8035a380>] (sock_def_readable+0x48/0x70)
 r8:00000069 r7:86ae5400 r6:7fffffff r5:00000001 r4:86558200
[<8035a338>] (sock_def_readable+0x0/0x70) from [<803c9950>] (unix_dgram_sendmsg+0x460/0x5e0)
 r4:86558200 r3:8035a338
[<803c94f0>] (unix_dgram_sendmsg+0x0/0x5e0) from [<80357428>] (sock_sendmsg+0xac/0xc8)
[<8035737c>] (sock_sendmsg+0x0/0xc8) from [<80358c74>] (sys_sendto+0xb8/0xdc)
 r7:00004000 r6:93713380 r5:00000069 r4:000b9188
[<80358bbc>] (sys_sendto+0x0/0xdc) from [<80358cb8>] (sys_send+0x20/0x28)
[<80358c98>] (sys_send+0x0/0x28) from [<8000dc00>] (ret_fast_syscall+0x0/0x30)
Code: e1a0c00d e92dd870 e24cb004 e24dd01c (e5902050)
---[ end trace 5d198d04c638f976 ]---
-- 
              yashi


* Re: sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL
  2015-10-19 15:01 sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL Yasushi SHOJI
@ 2015-10-20 20:48 ` Sergei Shtylyov
  2015-10-21  7:26   ` Yasushi SHOJI
From: Sergei Shtylyov @ 2015-10-20 20:48 UTC (permalink / raw)
  To: Yasushi SHOJI, netdev

Hello.

On 10/19/2015 06:01 PM, Yasushi SHOJI wrote:

> In a low-memory situation where netdev_alloc_skb() fails,
> mdp->rx_skbuff[entry] can be left NULL; however, sh_eth_rx() seems to
> access it without checking for NULL in the following code:
>
> 			skb = mdp->rx_skbuff[entry];
> 			mdp->rx_skbuff[entry] = NULL;
> 			if (mdp->cd->rpadir)
> 				skb_reserve(skb, NET_IP_ALIGN);
> 			dma_unmap_single(&ndev->dev, rxdesc->addr,
> 					 ALIGN(mdp->rx_buf_sz, 16),
> 					 DMA_FROM_DEVICE);
>
> I've put a BUG_ON() there to test skb and got the following backtrace.  I
> can also enable SLUB poisoning and see some bad accesses.
>
> I'm not that familiar with this code base, so I'm not including any
> patch yet.  I'd appreciate it if someone with insight into this code could
> give it a quick look and tell me whether it's a real issue or not.  If this
> is a real case, I can take a deeper look.

    If you got the oops, it's real.  Thanks for reporting it.  I guess I should 
check the new ravb driver as well...
    Do you want to try fixing the bug yourself?

> BTW, the backtrace is from an old 3.4.74+ kernel, but the current tip is
> very close.

     Yeah, this part hasn't changed in a long time...

> Thanks,

MBR, Sergei


* Re: sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL
  2015-10-20 20:48 ` Sergei Shtylyov
@ 2015-10-21  7:26   ` Yasushi SHOJI
  2015-10-22 21:17     ` Sergei Shtylyov
From: Yasushi SHOJI @ 2015-10-21  7:26 UTC (permalink / raw)
  To: sergei.shtylyov; +Cc: netdev

Hi Sergei,

Thank you for your reply.

On Wed, 21 Oct 2015 05:48:01 +0900,
Sergei Shtylyov wrote:
> 
> On 10/19/2015 06:01 PM, Yasushi SHOJI wrote:
> 
> > I'm not that familiar with this code base, so I'm not including any
> > patch yet.  I'd appreciate it if someone with insight into this code could
> > give it a quick look and tell me whether it's a real issue or not.  If this
> > is a real case, I can take a deeper look.
> 
>    If you got the oops, it's real. Thanks for the reporting. I guess I
> should check the new ravb driver as well...
>    Do you want to try fixing the bug yourself?

Sure.  I can dive into this.  I'd appreciate it if someone who has worked
on sh_eth.c could give me some design advice or tell me the initial design
thoughts, i.e. what the intention was for when the allocation fails.

My idea right now is to simply invalidate the descriptor when
netdev_alloc_skb() fails, roughly as sketched below.  When the next
packet arrives, in the near future, the driver can try again to allocate
the buffer and update the corresponding descriptor if it succeeds.  If
memory is not yet available when the controller is trying to use the
invalid descriptor, the controller will see it and DMA will stop.
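
Something like this in the RX refill loop (just a sketch based on my
reading of the current code; flag and field names are approximations):

		if (mdp->rx_skbuff[entry] == NULL) {
			skb = netdev_alloc_skb(ndev, mdp->rx_buf_sz);
			if (skb == NULL) {
				/* leave the descriptor invalid (RACT
				 * stays 0) and retry on the next poll */
				break;
			}
			/* ... dma_map_single() the buffer and set
			 * rxdesc->addr here ... */
			mdp->rx_skbuff[entry] = skb;
		}
		/* mark the entry valid only once it has a buffer */
		rxdesc->status |= cpu_to_edmac(mdp, RD_RACT | RD_RFP);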

Is that an acceptable path to take?


Here is how I understand this driver:

sh_eth.c uses NAPI to handle incoming packets.  It calls
__napi_schedule() in its interrupt handler.  sh_eth_poll() is the poll
method of the driver.  sh_eth_poll() calls sh_eth_rx() to handle incoming
packets, and this is the function I need to fix.
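
In other words, roughly (a heavily simplified sketch of the flow, not the
actual driver code; interrupt masking and error handling omitted):

	/* interrupt handler: ack the event and defer RX work to NAPI */
	static irqreturn_t sh_eth_interrupt(int irq, void *netdev)
	{
		/* ... read and ack EESR ... */
		if (napi_schedule_prep(&mdp->napi)) {
			/* mask RX interrupts here */
			__napi_schedule(&mdp->napi);
		}
		return IRQ_HANDLED;
	}

	/* NAPI poll method: sh_eth_rx() runs here, in softirq context */
	static int sh_eth_poll(struct napi_struct *napi, int budget)
	{
		int quota = budget;

		sh_eth_rx(ndev, intr_status, &quota);

		if (quota)
			napi_complete(napi);	/* re-enable RX interrupts */
		return budget - quota;
	}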

The driver uses arrays of sk_buffs for TX and RX.  For RX, the driver
has an array of pointers to sk_buffs, rx_skbuff[].  This rx_skbuff[] is
filled with sk_buffs in sh_eth_ring_format(), which is called when the
driver is open()ed.

The controller this driver targets is GETHER.  This controller is
capable of DMA-transferring received packets to system memory.  The
link between them is described by a "receive descriptor".

A receive descriptor corresponds to one sk_buff.  The controller
expects an array of descriptors in system memory and treats it as a
ring, meaning that the controller processes each descriptor one by one.
Once the controller has finished the last descriptor, it goes back to
the first one.
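
So the index into the ring is just the running counter modulo the ring
size, roughly (field names approximate):

	entry = mdp->cur_rx % mdp->num_rx_ring;	/* wraps back to entry 0 */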

To achieve zero copy, the driver pushes the sk_buffs filled with
received packets to the netdev core with netif_receive_skb(), then
netdev_alloc_skb()s new sk_buffs in sh_eth_rx(), the poll method of the
driver, and updates the corresponding descriptors.

If the allocation fails, it just leaves the function, leaving the old
pointer in the descriptor as is.  At some point in the future, the
controller will access the descriptor and write to the old memory
address.  (I haven't checked the state of the bits in the descriptor yet.)



BTW, if anyone has a bit of time, I have questions regarding the
atomic allocation:

  Q1) is it OK to _always_ use napi_schedule()?
  Q2) is it OK to netdev_alloc_skb() in the NAPI poll method (sh_eth_rx)?
      this, combined with Q1, leads to allocating sk_buffs with GFP_ATOMIC
      all the time.
  Q3) if not, is there an alternative way to allocate sk_buffs?
      e.g.
        a) allocate sk_buffs into a pool outside of interrupt context, and
           get sk_buffs from that pool in softirq context.
        b) use dev_alloc_pages() instead.  The slab allocator seems to
           fail when slabs are full even though order-0 pages are still
           available.
-- 
             yashi


* Re: sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL
  2015-10-21  7:26   ` Yasushi SHOJI
@ 2015-10-22 21:17     ` Sergei Shtylyov
  2015-10-23 11:05       ` Sergei Shtylyov
From: Sergei Shtylyov @ 2015-10-22 21:17 UTC (permalink / raw)
  To: Yasushi SHOJI; +Cc: netdev

Hello.

On 10/21/2015 10:26 AM, Yasushi SHOJI wrote:

> Thank you for your reply.

    Not at all.  I'm virtually a maintainer for that driver now, so I try to 
filter out the related mails even if I don't have time to read through all 
the netdev mail.

>> On 10/19/2015 06:01 PM, Yasushi SHOJI wrote:
>>
>>> I'm not that familiar with this code base, so I'm not including any
>>> patch yet.  I'd appreciate it if someone with insight into this code could
>>> give it a quick look and tell me whether it's a real issue or not.  If this
>>> is a real case, I can take a deeper look.
>>
>>     If you got the oops, it's real. Thanks for the reporting. I guess I
>> should check the new ravb driver as well...
>>     Do you want to try fixing the bug yourself?

> Sure.  I can dive into this.  I'd appreciate it if someone who has worked
> on sh_eth.c could give me some design advice or tell me the initial design
> thoughts, i.e. what the intention was for when the allocation fails.

    Hm, well, I seem to have some time to spend on fixing the issues in this 
driver (I noticed a couple while doing the AVB driver), so spending time on 
your "education" would seem somewhat inefficient... :-)

> My idea right now is to simply invalidate the descriptor when
> netdev_alloc_skb() fails.

    Well, it depends.  If you're talking about the second loop in sh_eth_rx(), 
that seems like a good idea (and it's what I've done for the dma_mapping_error() 
case in the ravb driver -- I just set the descriptor's data size field to 0). 
The OOM case seems to have been unaddressed in both drivers so far...  If we 
take sh_eth_ring_format(), I believe the best course of action is to just fail 
on OOM, since the driver doesn't correctly handle that case anyway, AFAIR; and 
that was implemented in the ravb driver.
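
    For reference, the ravb check is roughly this (quoting from memory, so
the exact field names may differ slightly):

	rx_desc->ds_cc = cpu_to_le16(PKT_BUF_SZ);
	dma_addr = dma_map_single(&ndev->dev, skb->data, PKT_BUF_SZ,
				  DMA_FROM_DEVICE);
	/* a data size of 0 should prevent DMA into this entry */
	if (dma_mapping_error(&ndev->dev, dma_addr))
		rx_desc->ds_cc = cpu_to_le16(0);
	rx_desc->dptr = cpu_to_le32(dma_addr);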

> When the next packet arrives, in the near future,
> the driver can try again to allocate the buffer and update the
> corresponding descriptor if it succeeds.

    It would be too late, unless you still mean the RX refilling loop in this 
function.

> If memory is not yet available
> when the controller is trying to use the invalid descriptor, the
> controller will see it and DMA will stop.

    That means leaving RACT=0, and that's what the driver is already doing...
    Hm, then I don't understand how the error you've described can occur, 
unless we encounter OOM during sh_eth_ring_format()...

> Is that an acceptable path to take?

    I'm not seeing a bug in this function, perhaps I'm missing something?

> Here is how I understand this driver:

[...]

> The driver uses arrays of sk_buffs for TX and RX.  For RX, the driver
> has an array of pointers to sk_buffs, rx_skbuff[].  This rx_skbuff[] is
> filled with sk_buffs in sh_eth_ring_format(), which is called when the
> driver is open()ed.
>
> The controller this driver targets is GETHER.

    Well, it depends on your SoC; it may be a 100 Mbps Ether.

> A receive descriptor corresponds to one sk_buff.  The controller
> expects an array of descriptors in system memory and treats it as a
> ring, meaning that the controller processes each descriptor one by one.
> Once the controller has finished the last descriptor, it goes back to
> the first one.

    Yes, that seems like a correct description.

> To achieve zero copy, the driver pushes the sk_buffs filled with
> received packets to the netdev core with netif_receive_skb(), then
> netdev_alloc_skb()s new sk_buffs in sh_eth_rx(), the poll method of the
> driver, and updates the corresponding descriptors.

> If the allocation fails, it just leaves the function, leaving the old
> pointer in the descriptor as is.

    Yes, but note that it also leaves RACT=0, which basically means an invalid 
descriptor; on encountering it, reception should just stop.

> At some point in the future, the controller will
> access the descriptor and write to the old memory address.  (I haven't
> checked the state of the bits in the descriptor yet.)

   Check it.

> BTW, if anyone has a bit of time, I have questions regarding the
> atomic allocation:

    Sorry, I'm constantly short of time. Someone else will have to answer 
that. :-)

MBR, Sergei


* Re: sh_eth.c::sh_eth_rx(): mdp->rx_skbuff[entry] can be NULL
  2015-10-22 21:17     ` Sergei Shtylyov
@ 2015-10-23 11:05       ` Sergei Shtylyov
From: Sergei Shtylyov @ 2015-10-23 11:05 UTC (permalink / raw)
  To: Yasushi SHOJI; +Cc: netdev

On 10/23/2015 12:17 AM, Sergei Shtylyov wrote:

[...]

>> If memory is not yet available
>> when the controller is trying to use the invalid descriptor, the
>> controller will see it and DMA will stop.
>
>     That means leaving RACT=0, and that's what the driver is already doing...
>     Hm, then I don't understand how the error you've described can occur,
> unless we encounter OOM during sh_eth_ring_format()...
>
>> Is that an acceptable path to take?
>
>     I'm not seeing a bug in this function, perhaps I'm missing something?

    Never mind, I see the bug now -- it occurred to me before I went to bed 
yesterday.

>> To achieve zero copy, the driver pushes the sk_buffs filled with
>> received packets to the netdev core with netif_receive_skb(), then
>> netdev_alloc_skb()s new sk_buffs in sh_eth_rx(), the poll method of the
>> driver, and updates the corresponding descriptors.
>
>> If the allocation fails, it just leaves the function, leaving the old
>> pointer in the descriptor as is.
>
>     Yes, but note that it also leaves RACT=0, which basically means an invalid
> descriptor; on encountering it, reception should just stop.

    The problem is that the first loop has no way of identifying the bad 
descriptors.  Looks like we can only fix that by checking rx_skbuff[entry] for 
NULL.
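
    Something along these lines, i.e. only consume the entry when the skb is
actually there (untested sketch just to show the shape of the check; variable
names as in the current code, give or take):

			skb = mdp->rx_skbuff[entry];
			if (skb) {	/* refill may have failed earlier */
				mdp->rx_skbuff[entry] = NULL;
				if (mdp->cd->rpadir)
					skb_reserve(skb, NET_IP_ALIGN);
				dma_unmap_single(&ndev->dev, rxdesc->addr,
						 ALIGN(mdp->rx_buf_sz, 16),
						 DMA_FROM_DEVICE);
				skb_put(skb, pkt_len);
				skb->protocol = eth_type_trans(skb, ndev);
				netif_receive_skb(skb);
			}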

MBR, Sergei

