[PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
@ 2012-08-07  8:55 Mel Gorman
  2012-08-08 19:14 ` Rik van Riel
  2012-08-08 22:50 ` David Miller
  0 siblings, 2 replies; 13+ messages in thread
From: Mel Gorman @ 2012-08-07  8:55 UTC (permalink / raw)
  To: David Miller
  Cc: Linux-MM, LKML, Linux-Netdev, Xen-devel, Konrad Rzeszutek Wilk,
	Ian Campbell, Andrew Morton

Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
for the following bug triggered by a xen network driver

[    1.908592] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[    1.908643] IP: [<ffffffffa0037750>] xennet_poll+0x980/0xec0 [xen_netfront]
[    1.908703] PGD ea1df067 PUD e8ada067 PMD 0
[    1.908774] Oops: 0000 [#1] SMP
[    1.908797] Modules linked in: fbcon tileblit font radeon bitblit softcursor ttm drm_kms_helper crc32c_intel xen_blkfront xen_netfront xen_fbfront fb_sys_fops sysimgblt sysfillrect syscopyarea +xen_kbdfront xenfs xen_privcmd
[    1.908938] CPU 0
[    1.908950] Pid: 2165, comm: ip Not tainted 3.5.0upstream-08854-g444fa66 #1
[    1.908983] RIP: e030:[<ffffffffa0037750>]  [<ffffffffa0037750>] xennet_poll+0x980/0xec0 [xen_netfront]
[    1.909029] RSP: e02b:ffff8800ffc03db8  EFLAGS: 00010282
[    1.909055] RAX: ffff8800ea010140 RBX: ffff8800f00e86c0 RCX: 000000000000009a
[    1.909055] RDX: 0000000000000040 RSI: 000000000000005a RDI: ffff8800fa7dee80
[    1.909055] RBP: ffff8800ffc03ee8 R08: ffff8800f00e86d8 R09: ffff8800ea010000
[    1.909055] R10: dead000000200200 R11: dead000000100100 R12: ffff8800fa7dee80
[    1.909055] R13: 000000000000005a R14: ffff8800fa7dee80 R15: 0000000000000200
[    1.909055] FS:  00007fbafc188700(0000) GS:ffff8800ffc00000(0000) knlGS:0000000000000000
[    1.909055] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.909055] CR2: 0000000000000010 CR3: 00000000ea108000 CR4: 0000000000002660
[    1.909055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.909055] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    1.909055] Process ip (pid: 2165, threadinfo ffff8800ea0f2000, task ffff8800fa783040)
[    1.909055] Stack:
[    1.909055]  ffff8800e27e5040 ffff8800ffc03e88 ffff8800ffc03e68 ffff8800ffc03e48
[    1.909055]  7fffffffffffffff ffff8800ffc03e00 ffff8800e27e5040 ffff8800f00e86d8
[    1.909055]  ffff8800ffc03eb0 00000040ffffffff ffff8800f00e8000 00000000ffc03e30
[    1.909055] Call Trace:
[    1.909055]  <IRQ>
[    1.909055]  [<ffffffff81066028>] ?  pvclock_clocksource_read+0x58/0xd0
[    1.909055]  [<ffffffff81486352>] net_rx_action+0x112/0x240
[    1.909055]  [<ffffffff8107f319>] __do_softirq+0xb9/0x190
[    1.909055]  [<ffffffff815d8d7c>] call_softirq+0x1c/0x30

The problem is that the xenfront driver is passing a NULL page to
__skb_fill_page_desc() which was unexpected. This patch checks that
there is a page before dereferencing.

Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/skbuff.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7632c87..8857669 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1256,7 +1256,7 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
 	 * do not lose pfmemalloc information as the pages would not be
 	 * allocated using __GFP_MEMALLOC.
 	 */
-	if (page->pfmemalloc && !page->mapping)
+	if (page && page->pfmemalloc && !page->mapping)
 		skb->pfmemalloc	= true;
 	frag->page.p		  = page;
 	frag->page_offset	  = off;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-07  8:55 [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag Mel Gorman
@ 2012-08-08 19:14 ` Rik van Riel
  2012-08-08 22:50 ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: Rik van Riel @ 2012-08-08 19:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: David Miller, Linux-MM, LKML, Linux-Netdev, Xen-devel,
	Konrad Rzeszutek Wilk, Ian Campbell, Andrew Morton

On 08/07/2012 04:55 AM, Mel Gorman wrote:
> Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> for the following bug triggered by a xen network driver
>
> [    1.908592] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [    1.908643] IP: [<ffffffffa0037750>] xennet_poll+0x980/0xec0 [xen_netfront]
> [    1.908703] PGD ea1df067 PUD e8ada067 PMD 0
> [    1.908774] Oops: 0000 [#1] SMP
> [    1.908797] Modules linked in: fbcon tileblit font radeon bitblit softcursor ttm drm_kms_helper crc32c_intel xen_blkfront xen_netfront xen_fbfront fb_sys_fops sysimgblt sysfillrect syscopyarea +xen_kbdfront xenfs xen_privcmd
> [    1.908938] CPU 0
> [    1.908950] Pid: 2165, comm: ip Not tainted 3.5.0upstream-08854-g444fa66 #1
> [    1.908983] RIP: e030:[<ffffffffa0037750>]  [<ffffffffa0037750>] xennet_poll+0x980/0xec0 [xen_netfront]
> [    1.909029] RSP: e02b:ffff8800ffc03db8  EFLAGS: 00010282
> [    1.909055] RAX: ffff8800ea010140 RBX: ffff8800f00e86c0 RCX: 000000000000009a
> [    1.909055] RDX: 0000000000000040 RSI: 000000000000005a RDI: ffff8800fa7dee80
> [    1.909055] RBP: ffff8800ffc03ee8 R08: ffff8800f00e86d8 R09: ffff8800ea010000
> [    1.909055] R10: dead000000200200 R11: dead000000100100 R12: ffff8800fa7dee80
> [    1.909055] R13: 000000000000005a R14: ffff8800fa7dee80 R15: 0000000000000200
> [    1.909055] FS:  00007fbafc188700(0000) GS:ffff8800ffc00000(0000) knlGS:0000000000000000
> [    1.909055] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    1.909055] CR2: 0000000000000010 CR3: 00000000ea108000 CR4: 0000000000002660
> [    1.909055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    1.909055] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    1.909055] Process ip (pid: 2165, threadinfo ffff8800ea0f2000, task ffff8800fa783040)
> [    1.909055] Stack:
> [    1.909055]  ffff8800e27e5040 ffff8800ffc03e88 ffff8800ffc03e68 ffff8800ffc03e48
> [    1.909055]  7fffffffffffffff ffff8800ffc03e00 ffff8800e27e5040 ffff8800f00e86d8
> [    1.909055]  ffff8800ffc03eb0 00000040ffffffff ffff8800f00e8000 00000000ffc03e30
> [    1.909055] Call Trace:
> [    1.909055]  <IRQ>
> [    1.909055]  [<ffffffff81066028>] ?  pvclock_clocksource_read+0x58/0xd0
> [    1.909055]  [<ffffffff81486352>] net_rx_action+0x112/0x240
> [    1.909055]  [<ffffffff8107f319>] __do_softirq+0xb9/0x190
> [    1.909055]  [<ffffffff815d8d7c>] call_softirq+0x1c/0x30
>
> The problem is that the xenfront driver is passing a NULL page to
> __skb_fill_page_desc() which was unexpected. This patch checks that
> there is a page before dereferencing.
>
> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Rik van Riel <riel@redhat.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-07  8:55 [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag Mel Gorman
  2012-08-08 19:14 ` Rik van Riel
@ 2012-08-08 22:50 ` David Miller
  2012-08-13 10:26   ` Mel Gorman
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: David Miller @ 2012-08-08 22:50 UTC (permalink / raw)
  To: mgorman
  Cc: linux-mm, linux-kernel, netdev, xen-devel, konrad, Ian.Campbell, akpm

From: Mel Gorman <mgorman@suse.de>
Date: Tue, 7 Aug 2012 09:55:55 +0100

> Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> for the following bug triggered by a xen network driver
 ...
> The problem is that the xenfront driver is passing a NULL page to
> __skb_fill_page_desc() which was unexpected. This patch checks that
> there is a page before dereferencing.
> 
> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>

That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
It's the only driver passing NULL here.

That whole song and dance figuring out what to do with the head
fragment page, depending upon whether the length is greater than the
RX_COPY_THRESHOLD, is completely unnecessary.

Just use something like a call to __pskb_pull_tail(skb, len) and all
that other crap around that area can simply be deleted.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-08 22:50 ` David Miller
@ 2012-08-13 10:26   ` Mel Gorman
  2012-08-13 10:47     ` Mel Gorman
  2012-08-13 15:41   ` [Xen-devel] " Konrad Rzeszutek Wilk
  2012-08-22 10:26   ` Ian Campbell
  2 siblings, 1 reply; 13+ messages in thread
From: Mel Gorman @ 2012-08-13 10:26 UTC (permalink / raw)
  To: David Miller
  Cc: linux-mm, linux-kernel, netdev, xen-devel, konrad, Ian.Campbell,
	Jeremy Fitzhardinge, akpm

On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> From: Mel Gorman <mgorman@suse.de>
> Date: Tue, 7 Aug 2012 09:55:55 +0100
> 
> > Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> > for the following bug triggered by a xen network driver
>  ...
> > The problem is that the xenfront driver is passing a NULL page to
> > __skb_fill_page_desc() which was unexpected. This patch checks that
> > there is a page before dereferencing.
> > 
> > Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> 
> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> It's the only driver passing NULL here.
> 
> That whole song and dance figuring out what to do with the head
> fragment page, depending upon whether the length is greater than the
> RX_COPY_THRESHOLD, is completely unnecessary.
> 
> Just use something like a call to __pskb_pull_tail(skb, len) and all
> that other crap around that area can simply be deleted.

I looked at this for a while but I did not see how __pskb_pull_tail()
could be used sensibly but I'm simily not familiar with writing network
device drivers or Xen.

This messing with RX_COPY_THRESHOLD seems to be related to how the frontend
and backend communicate (maybe some fixed limitation of the xenbus). The
existing code looks like it is trying to take the fragments received and
pass them straight to the backend without copying by passing the fragments
to the backend without copying. I worry that if I try converting this to
__pskb_pull_tail() that it would either hit the limitation of xenbus or
introduce copying where it is not wanted.

I'm going to have to punt this to Jeremy and the other Xen folk as I'm not
sure what the original intention was and I don't have a Xen setup anywhere
to test any patch. Jeremy, xen folk? 

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-13 10:26   ` Mel Gorman
@ 2012-08-13 10:47     ` Mel Gorman
  2012-08-13 18:56       ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Gorman @ 2012-08-13 10:47 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: linux-mm, linux-kernel, netdev, xen-devel, konrad, Ian.Campbell,
	David Miller, akpm

Resending to correct Jeremy's address.

On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> From: Mel Gorman <mgorman@suse.de>
> Date: Tue, 7 Aug 2012 09:55:55 +0100
> 
> > Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> > for the following bug triggered by a xen network driver
>  ...
> > The problem is that the xenfront driver is passing a NULL page to
> > __skb_fill_page_desc() which was unexpected. This patch checks that
> > there is a page before dereferencing.
> > 
> > Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> 
> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> It's the only driver passing NULL here.
> 
> That whole song and dance figuring out what to do with the head
> fragment page, depending upon whether the length is greater than the
> RX_COPY_THRESHOLD, is completely unnecessary.
> 
> Just use something like a call to __pskb_pull_tail(skb, len) and all
> that other crap around that area can simply be deleted.

I looked at this for a while but I did not see how __pskb_pull_tail()
could be used sensibly but I'm simily not familiar with writing network
device drivers or Xen.

This messing with RX_COPY_THRESHOLD seems to be related to how the frontend
and backend communicate (maybe some fixed limitation of the xenbus). The
existing code looks like it is trying to take the fragments received and
pass them straight to the backend without copying by passing the fragments
to the backend without copying. I worry that if I try converting this to
__pskb_pull_tail() that it would either hit the limitation of xenbus or
introduce copying where it is not wanted.

I'm going to have to punt this to Jeremy and the other Xen folk as I'm not
sure what the original intention was and I don't have a Xen setup anywhere
to test any patch. Jeremy, xen folk? 

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xen-devel] [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-08 22:50 ` David Miller
  2012-08-13 10:26   ` Mel Gorman
@ 2012-08-13 15:41   ` Konrad Rzeszutek Wilk
  2012-08-14 10:05     ` Mel Gorman
  2012-08-22 10:26   ` Ian Campbell
  2 siblings, 1 reply; 13+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-13 15:41 UTC (permalink / raw)
  To: David Miller, Ian Campbell
  Cc: mgorman, xen-devel, netdev, linux-kernel, Ian.Campbell, linux-mm,
	konrad, akpm

On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> From: Mel Gorman <mgorman@suse.de>
> Date: Tue, 7 Aug 2012 09:55:55 +0100
> 
> > Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> > for the following bug triggered by a xen network driver
>  ...
> > The problem is that the xenfront driver is passing a NULL page to
> > __skb_fill_page_desc() which was unexpected. This patch checks that
> > there is a page before dereferencing.
> > 
> > Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> 
> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> It's the only driver passing NULL here.

It looks to be passing a valid page pointer (at least by looking
at the code) so I am not sure how it got turned in a NULL.

But let me double-check by instrumenting the driver..
> 
> That whole song and dance figuring out what to do with the head
> fragment page, depending upon whether the length is greater than the
> RX_COPY_THRESHOLD, is completely unnecessary.
> 
> Just use something like a call to __pskb_pull_tail(skb, len) and all
> that other crap around that area can simply be deleted.

It looks like an overkill - it does a lot more than just allocate an SKB
and a page.

Deleting of extra code would be nice - however I am not going to be able
to do that for the next two weeks sadly - as my plate if full of debugging
some other stuff.

Lets see if Ian has some time.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-13 10:47     ` Mel Gorman
@ 2012-08-13 18:56       ` Jeremy Fitzhardinge
  2012-08-14 10:18         ` Mel Gorman
  0 siblings, 1 reply; 13+ messages in thread
From: Jeremy Fitzhardinge @ 2012-08-13 18:56 UTC (permalink / raw)
  To: Mel Gorman
  Cc: linux-mm, linux-kernel, netdev, xen-devel, konrad, Ian.Campbell,
	David Miller, akpm

On 08/13/2012 03:47 AM, Mel Gorman wrote:
> Resending to correct Jeremy's address.
>
> On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
>> From: Mel Gorman <mgorman@suse.de>
>> Date: Tue, 7 Aug 2012 09:55:55 +0100
>>
>>> Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
>>> for the following bug triggered by a xen network driver
>>  ...
>>> The problem is that the xenfront driver is passing a NULL page to
>>> __skb_fill_page_desc() which was unexpected. This patch checks that
>>> there is a page before dereferencing.
>>>
>>> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> Signed-off-by: Mel Gorman <mgorman@suse.de>
>> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
>> It's the only driver passing NULL here.
>>
>> That whole song and dance figuring out what to do with the head
>> fragment page, depending upon whether the length is greater than the
>> RX_COPY_THRESHOLD, is completely unnecessary.
>>
>> Just use something like a call to __pskb_pull_tail(skb, len) and all
>> that other crap around that area can simply be deleted.
> I looked at this for a while but I did not see how __pskb_pull_tail()
> could be used sensibly but I'm simily not familiar with writing network
> device drivers or Xen.
>
> This messing with RX_COPY_THRESHOLD seems to be related to how the frontend
> and backend communicate (maybe some fixed limitation of the xenbus). The
> existing code looks like it is trying to take the fragments received and
> pass them straight to the backend without copying by passing the fragments
> to the backend without copying. I worry that if I try converting this to
> __pskb_pull_tail() that it would either hit the limitation of xenbus or
> introduce copying where it is not wanted.
>
> I'm going to have to punt this to Jeremy and the other Xen folk as I'm not
> sure what the original intention was and I don't have a Xen setup anywhere
> to test any patch. Jeremy, xen folk? 

It's been a while since I've looked at that stuff, but as I remember,
the issue is that since the packet ring memory is shared with another
domain which may be untrustworthy, we want to make copies of the headers
before making any decisions based on them so that the other domain can't
change them after header processing but before they're actually sent. 
(The packet payload is considered less important, but of course the same
issue applies if you're using some kind of content-aware packet filter.)

So that's the rationale for always copying RX_COPY_THRESHOLD, even if
the packet is larger than that amount.  As far as I know, changing this
behaviour wouldn't break the ring protocol, but it does introduce a
potential security issue.

    J


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xen-devel] [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-13 15:41   ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2012-08-14 10:05     ` Mel Gorman
  2012-08-14 13:28       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Gorman @ 2012-08-14 10:05 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Miller, Ian Campbell, xen-devel, netdev, linux-kernel,
	linux-mm, konrad, akpm

On Mon, Aug 13, 2012 at 11:41:44AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> > From: Mel Gorman <mgorman@suse.de>
> > Date: Tue, 7 Aug 2012 09:55:55 +0100
> > 
> > > Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> > > for the following bug triggered by a xen network driver
> >  ...
> > > The problem is that the xenfront driver is passing a NULL page to
> > > __skb_fill_page_desc() which was unexpected. This patch checks that
> > > there is a page before dereferencing.
> > > 
> > > Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > 
> > That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> > It's the only driver passing NULL here.
> 
> It looks to be passing a valid page pointer (at least by looking
> at the code) so I am not sure how it got turned in a NULL.
> 

Are we looking at different code bases? I see this and I was assuming it
was the source of the bug.

	__skb_fill_page_desc(skb, 0, NULL, 0, 0);

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-13 18:56       ` Jeremy Fitzhardinge
@ 2012-08-14 10:18         ` Mel Gorman
  0 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2012-08-14 10:18 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: linux-mm, linux-kernel, netdev, xen-devel, konrad, Ian.Campbell,
	David Miller, akpm

On Mon, Aug 13, 2012 at 11:56:48AM -0700, Jeremy Fitzhardinge wrote:
> On 08/13/2012 03:47 AM, Mel Gorman wrote:
> > Resending to correct Jeremy's address.
> >
> > On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> >> From: Mel Gorman <mgorman@suse.de>
> >> Date: Tue, 7 Aug 2012 09:55:55 +0100
> >>
> >>> Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> >>> for the following bug triggered by a xen network driver
> >>  ...
> >>> The problem is that the xenfront driver is passing a NULL page to
> >>> __skb_fill_page_desc() which was unexpected. This patch checks that
> >>> there is a page before dereferencing.
> >>>
> >>> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >>> Signed-off-by: Mel Gorman <mgorman@suse.de>
> >> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> >> It's the only driver passing NULL here.
> >>
> >> That whole song and dance figuring out what to do with the head
> >> fragment page, depending upon whether the length is greater than the
> >> RX_COPY_THRESHOLD, is completely unnecessary.
> >>
> >> Just use something like a call to __pskb_pull_tail(skb, len) and all
> >> that other crap around that area can simply be deleted.
> > I looked at this for a while but I did not see how __pskb_pull_tail()
> > could be used sensibly but I'm simily not familiar with writing network
> > device drivers or Xen.
> >
> > This messing with RX_COPY_THRESHOLD seems to be related to how the frontend
> > and backend communicate (maybe some fixed limitation of the xenbus). The
> > existing code looks like it is trying to take the fragments received and
> > pass them straight to the backend without copying by passing the fragments
> > to the backend without copying. I worry that if I try converting this to
> > __pskb_pull_tail() that it would either hit the limitation of xenbus or
> > introduce copying where it is not wanted.
> >
> > I'm going to have to punt this to Jeremy and the other Xen folk as I'm not
> > sure what the original intention was and I don't have a Xen setup anywhere
> > to test any patch. Jeremy, xen folk? 
> 
> It's been a while since I've looked at that stuff, but as I remember,
> the issue is that since the packet ring memory is shared with another
> domain which may be untrustworthy, we want to make copies of the headers
> before making any decisions based on them so that the other domain can't
> change them after header processing but before they're actually sent. 
> (The packet payload is considered less important, but of course the same
> issue applies if you're using some kind of content-aware packet filter.)
> 
> So that's the rationale for always copying RX_COPY_THRESHOLD, even if
> the packet is larger than that amount.  As far as I know, changing this
> behaviour wouldn't break the ring protocol, but it does introduce a
> potential security issue.
> 

David,

This leaves us somewhat in a pickle. If I'm reading this right (which I may
not be) it means that using __pskb_pull_tail() will work in the ideal case
but potentially introduces a subtle issue in the future. This bug could be
"fixed" in the driver by partially reverting [01c68026: xen: netfront:
convert to SKB paged frag API.]. That could be viewed as sweeping the
problem under the carpet but it does contain the problem to the xen-netfront
driver. A new helper could be created like __skb_clear_page_desc but that
is overkill for one driver and feels as ugly.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xen-devel] [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-14 10:05     ` Mel Gorman
@ 2012-08-14 13:28       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 13+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-14 13:28 UTC (permalink / raw)
  To: Mel Gorman
  Cc: David Miller, Ian Campbell, xen-devel, netdev, linux-kernel,
	linux-mm, konrad, akpm

On Tue, Aug 14, 2012 at 11:05:22AM +0100, Mel Gorman wrote:
> On Mon, Aug 13, 2012 at 11:41:44AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> > > From: Mel Gorman <mgorman@suse.de>
> > > Date: Tue, 7 Aug 2012 09:55:55 +0100
> > > 
> > > > Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> > > > for the following bug triggered by a xen network driver
> > >  ...
> > > > The problem is that the xenfront driver is passing a NULL page to
> > > > __skb_fill_page_desc() which was unexpected. This patch checks that
> > > > there is a page before dereferencing.
> > > > 
> > > > Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > 
> > > That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> > > It's the only driver passing NULL here.
> > 
> > It looks to be passing a valid page pointer (at least by looking
> > at the code) so I am not sure how it got turned in a NULL.
> > 
> 
> Are we looking at different code bases? I see this and I was assuming it
> was the source of the bug.
> 
> 	__skb_fill_page_desc(skb, 0, NULL, 0, 0);

Yes! Well, that is embarrassing. I was looking at the first invocation of 
__skb_fill_page_desc (which is in xennet_alloc_rx_buffers) <sigh>

> 
> -- 
> Mel Gorman
> SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-08 22:50 ` David Miller
  2012-08-13 10:26   ` Mel Gorman
  2012-08-13 15:41   ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2012-08-22 10:26   ` Ian Campbell
  2012-08-23 14:17     ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 13+ messages in thread
From: Ian Campbell @ 2012-08-22 10:26 UTC (permalink / raw)
  To: David Miller
  Cc: mgorman, linux-mm, linux-kernel, netdev, xen-devel, konrad, akpm

On Wed, 2012-08-08 at 23:50 +0100, David Miller wrote:
> Just use something like a call to __pskb_pull_tail(skb, len) and all
> that other crap around that area can simply be deleted.

I think you mean something like this, which works for me, although I've
only lightly tested it.

Ian.

8<----------------------------------------

>From 9e47e3e87a33b45974448649a97859a479183041 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@citrix.com>
Date: Wed, 22 Aug 2012 10:15:29 +0100
Subject: [PATCH] xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX

I'm slightly concerned by the "only in exceptional circumstances"
comment on __pskb_pull_tail but the structure of an skb just created
by netfront shouldn't hit any of the especially slow cases.

This approach still does slightly more work than the old way, since if
we pull up the entire first frag we now have to shuffle everything
down where before we just received into the right place in the first
place.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/xen-netfront.c |   39 ++++++++++-----------------------------
 1 files changed, 10 insertions(+), 29 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 3089990..650f79a 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -57,8 +57,7 @@
 static const struct ethtool_ops xennet_ethtool_ops;
 
 struct netfront_cb {
-	struct page *page;
-	unsigned offset;
+	int pull_to;
 };
 
 #define NETFRONT_SKB_CB(skb)	((struct netfront_cb *)((skb)->cb))
@@ -867,15 +866,9 @@ static int handle_incoming_queue(struct net_device *dev,
 	struct sk_buff *skb;
 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
-		struct page *page = NETFRONT_SKB_CB(skb)->page;
-		void *vaddr = page_address(page);
-		unsigned offset = NETFRONT_SKB_CB(skb)->offset;
-
-		memcpy(skb->data, vaddr + offset,
-		       skb_headlen(skb));
+		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		if (page != skb_frag_page(&skb_shinfo(skb)->frags[0]))
-			__free_page(page);
+		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
@@ -913,7 +906,6 @@ static int xennet_poll(struct napi_struct *napi, int budget)
 	struct sk_buff_head errq;
 	struct sk_buff_head tmpq;
 	unsigned long flags;
-	unsigned int len;
 	int err;
 
 	spin_lock(&np->rx_lock);
@@ -955,24 +947,13 @@ err:
 			}
 		}
 
-		NETFRONT_SKB_CB(skb)->page =
-			skb_frag_page(&skb_shinfo(skb)->frags[0]);
-		NETFRONT_SKB_CB(skb)->offset = rx->offset;
-
-		len = rx->status;
-		if (len > RX_COPY_THRESHOLD)
-			len = RX_COPY_THRESHOLD;
-		skb_put(skb, len);
+		NETFRONT_SKB_CB(skb)->pull_to = rx->status;
+		if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
+			NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;
 
-		if (rx->status > len) {
-			skb_shinfo(skb)->frags[0].page_offset =
-				rx->offset + len;
-			skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status - len);
-			skb->data_len = rx->status - len;
-		} else {
-			__skb_fill_page_desc(skb, 0, NULL, 0, 0);
-			skb_shinfo(skb)->nr_frags = 0;
-		}
+		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
+		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
+		skb->data_len = rx->status;
 
 		i = xennet_fill_frags(np, skb, &tmpq);
 
@@ -999,7 +980,7 @@ err:
 		 * receive throughout using the standard receive
 		 * buffer size was cut by 25%(!!!).
 		 */
-		skb->truesize += skb->data_len - (RX_COPY_THRESHOLD - len);
+		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
 		skb->len += skb->data_len;
 
 		if (rx->flags & XEN_NETRXF_csum_blank)
-- 
1.7.2.5




^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-22 10:26   ` Ian Campbell
@ 2012-08-23 14:17     ` Konrad Rzeszutek Wilk
  2012-08-30 16:24       ` David Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-23 14:17 UTC (permalink / raw)
  To: Ian Campbell
  Cc: David Miller, mgorman, linux-mm, linux-kernel, netdev, xen-devel,
	konrad, akpm

On Wed, Aug 22, 2012 at 11:26:47AM +0100, Ian Campbell wrote:
> On Wed, 2012-08-08 at 23:50 +0100, David Miller wrote:
> > Just use something like a call to __pskb_pull_tail(skb, len) and all
> > that other crap around that area can simply be deleted.
> 
> I think you mean something like this, which works for me, although I've
> only lightly tested it.
> 

I've tested it heavily and works great.

Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
and I took a look at it too and:

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Ian.
> 
> 8<----------------------------------------
> 
> >From 9e47e3e87a33b45974448649a97859a479183041 Mon Sep 17 00:00:00 2001
> From: Ian Campbell <ian.campbell@citrix.com>
> Date: Wed, 22 Aug 2012 10:15:29 +0100
> Subject: [PATCH] xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
> 
> I'm slightly concerned by the "only in exceptional circumstances"
> comment on __pskb_pull_tail but the structure of an skb just created
> by netfront shouldn't hit any of the especially slow cases.
> 
> This approach still does slightly more work than the old way, since if
> we pull up the entire first frag we now have to shuffle everything
> down where before we just received into the right place in the first
> place.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jeremy Fitzhardinge <jeremy@goop.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: xen-devel@lists.xensource.com
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/xen-netfront.c |   39 ++++++++++-----------------------------
>  1 files changed, 10 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 3089990..650f79a 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -57,8 +57,7 @@
>  static const struct ethtool_ops xennet_ethtool_ops;
>  
>  struct netfront_cb {
> -	struct page *page;
> -	unsigned offset;
> +	int pull_to;
>  };
>  
>  #define NETFRONT_SKB_CB(skb)	((struct netfront_cb *)((skb)->cb))
> @@ -867,15 +866,9 @@ static int handle_incoming_queue(struct net_device *dev,
>  	struct sk_buff *skb;
>  
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
> -		struct page *page = NETFRONT_SKB_CB(skb)->page;
> -		void *vaddr = page_address(page);
> -		unsigned offset = NETFRONT_SKB_CB(skb)->offset;
> -
> -		memcpy(skb->data, vaddr + offset,
> -		       skb_headlen(skb));
> +		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		if (page != skb_frag_page(&skb_shinfo(skb)->frags[0]))
> -			__free_page(page);
> +		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -913,7 +906,6 @@ static int xennet_poll(struct napi_struct *napi, int budget)
>  	struct sk_buff_head errq;
>  	struct sk_buff_head tmpq;
>  	unsigned long flags;
> -	unsigned int len;
>  	int err;
>  
>  	spin_lock(&np->rx_lock);
> @@ -955,24 +947,13 @@ err:
>  			}
>  		}
>  
> -		NETFRONT_SKB_CB(skb)->page =
> -			skb_frag_page(&skb_shinfo(skb)->frags[0]);
> -		NETFRONT_SKB_CB(skb)->offset = rx->offset;
> -
> -		len = rx->status;
> -		if (len > RX_COPY_THRESHOLD)
> -			len = RX_COPY_THRESHOLD;
> -		skb_put(skb, len);
> +		NETFRONT_SKB_CB(skb)->pull_to = rx->status;
> +		if (NETFRONT_SKB_CB(skb)->pull_to > RX_COPY_THRESHOLD)
> +			NETFRONT_SKB_CB(skb)->pull_to = RX_COPY_THRESHOLD;
>  
> -		if (rx->status > len) {
> -			skb_shinfo(skb)->frags[0].page_offset =
> -				rx->offset + len;
> -			skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status - len);
> -			skb->data_len = rx->status - len;
> -		} else {
> -			__skb_fill_page_desc(skb, 0, NULL, 0, 0);
> -			skb_shinfo(skb)->nr_frags = 0;
> -		}
> +		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
> +		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> +		skb->data_len = rx->status;
>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> @@ -999,7 +980,7 @@ err:
>  		 * receive throughout using the standard receive
>  		 * buffer size was cut by 25%(!!!).
>  		 */
> -		skb->truesize += skb->data_len - (RX_COPY_THRESHOLD - len);
> +		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
>  		skb->len += skb->data_len;
>  
>  		if (rx->flags & XEN_NETRXF_csum_blank)
> -- 
> 1.7.2.5
> 
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag
  2012-08-23 14:17     ` Konrad Rzeszutek Wilk
@ 2012-08-30 16:24       ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2012-08-30 16:24 UTC (permalink / raw)
  To: konrad.wilk
  Cc: Ian.Campbell, mgorman, linux-mm, linux-kernel, netdev, xen-devel,
	konrad, akpm

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Thu, 23 Aug 2012 10:17:40 -0400

> On Wed, Aug 22, 2012 at 11:26:47AM +0100, Ian Campbell wrote:
>> On Wed, 2012-08-08 at 23:50 +0100, David Miller wrote:
>> > Just use something like a call to __pskb_pull_tail(skb, len) and all
>> > that other crap around that area can simply be deleted.
>> 
>> I think you mean something like this, which works for me, although I've
>> only lightly tested it.
>> 
> 
> I've tested it heavily and works great.
> 
> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> and I took a look at it too and:
> 
> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-08-30 16:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-07  8:55 [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag Mel Gorman
2012-08-08 19:14 ` Rik van Riel
2012-08-08 22:50 ` David Miller
2012-08-13 10:26   ` Mel Gorman
2012-08-13 10:47     ` Mel Gorman
2012-08-13 18:56       ` Jeremy Fitzhardinge
2012-08-14 10:18         ` Mel Gorman
2012-08-13 15:41   ` [Xen-devel] " Konrad Rzeszutek Wilk
2012-08-14 10:05     ` Mel Gorman
2012-08-14 13:28       ` Konrad Rzeszutek Wilk
2012-08-22 10:26   ` Ian Campbell
2012-08-23 14:17     ` Konrad Rzeszutek Wilk
2012-08-30 16:24       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).