All of lore.kernel.org
 help / color / mirror / Atom feed
* net: GPF in __netlink_ns_capable
@ 2016-01-15 22:31 Dmitry Vyukov
  2016-01-16  0:08 ` Richard Weinberger
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2016-01-15 22:31 UTC (permalink / raw)
  To: David S. Miller, Herbert Xu, Thomas Graf, Daniel Borkmann,
	Ken-ichirou MATSUZAWA, Nicolas Dichtel, Florian Westphal, netdev,
	LKML
  Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet

Hello,

The following program causes GPF in __netlink_ns_capable:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main()
{
  syscall(SYS_mmap, 0x20000000ul, 0xd000ul, 0x3ul, 0x32ul,
          0xfffffffffffffffful, 0x0ul);
  int fd = syscall(SYS_socket, 0x10ul, 0x3ul, 0x14ul, 0, 0, 0);
  *(uint32_t*)0x200067bb = (uint32_t)0x12;
  *(uint32_t*)0x200067bf = (uint32_t)0xffffffffffff1000;
  *(uint64_t*)0x200067c3 = (uint64_t)0x0;
  *(uint16_t*)0x200067cb = (uint16_t)0x4;
  syscall(SYS_write, fd, 0x200067bbul, 0x12ul, 0, 0, 0);
  return 0;
}

general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 3 PID: 7448 Comm: syz-executor Not tainted 4.4.0+ #255
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88006a534740 ti: ffff880063240000 task.ti: ffff880063240000
RIP: 0010:[<ffffffff8529bfbb>]  [<ffffffff8529bfbb>]
__netlink_ns_capable+0x8b/0x120
RSP: 0018:ffff880063247578  EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000079 RSI: ffffffff87597ba0 RDI: 00000000000003c8
RBP: ffff880063247590 R08: ffffed000c4c8f3d R09: ffff8800626479d0
R10: ffffed000c4c8f3e R11: 1ffff1000c4c8f3a R12: ffffffff87597ba0
R13: 000000000000000c R14: ffff880065895400 R15: ffff880063e4d338
FS:  0000000002638880(0063) GS:ffff88006d700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000200067bb CR3: 0000000063149000 CR4: 00000000000006e0
Stack:
 ffff880063752100 000000000000000c ffff880065895381 ffff8800632475b0
 ffffffff8529c0a5 00000000ffffffff dffffc0000000000 ffff880063247700
 ffffffff84986fef 1ffff1000c648ebf ffff880062646b10 0000000000000000
Call Trace:
 [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
 [<ffffffff8529c0a5>] netlink_capable+0x25/0x30 net/netlink/af_netlink.c:1432
 [<ffffffff84986fef>] ib_nl_handle_resolve_resp+0xbf/0x910
drivers/infiniband/core/sa_query.c:792
 [<ffffffff852a34bd>] netlink_dump+0x38d/0xb20 net/netlink/af_netlink.c:2837
 [<ffffffff852a5844>] __netlink_dump_start+0x554/0x7e0
net/netlink/af_netlink.c:2934
 [<     inline     >] netlink_dump_start include/linux/netlink.h:175
 [<ffffffff8495ce63>] ibnl_rcv_msg+0x3c3/0x4b0
drivers/infiniband/core/netlink.c:184
 [<ffffffff852aded7>] netlink_rcv_skb+0x297/0x390 net/netlink/af_netlink.c:3016
 [<ffffffff8495d1ab>] ibnl_rcv+0x25b/0x300 drivers/infiniband/core/netlink.c:226
 [<     inline     >] netlink_unicast_kernel net/netlink/af_netlink.c:1834
 [<ffffffff852abd3a>] netlink_unicast+0x47a/0x700 net/netlink/af_netlink.c:1860
 [<ffffffff852ad046>] netlink_sendmsg+0x1086/0x1760
net/netlink/af_netlink.c:2511
 [<     inline     >] sock_sendmsg_nosec net/socket.c:611
 [<ffffffff85103bba>] sock_sendmsg+0xca/0x110 net/socket.c:621
 [<ffffffff85103e16>] sock_write_iter+0x216/0x3a0 net/socket.c:820
 [<     inline     >] new_sync_write fs/read_write.c:517
 [<ffffffff8178d122>] __vfs_write+0x302/0x480 fs/read_write.c:530
 [<ffffffff8178e9c7>] vfs_write+0x167/0x4a0 fs/read_write.c:577
 [<     inline     >] SYSC_write fs/read_write.c:624
 [<ffffffff81791cb1>] SyS_write+0x111/0x220 fs/read_write.c:616
 [<ffffffff8626be76>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 9f 00 00 00 48 8b 5b 18 48 b8
00 00 00 00 00 fc ff df 48 8d bb c8 03 00 00 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 76 48 8b 9b c8 03 00 00 48 b8 00 00 00 00 00 fc
RIP  [<ffffffff8529bfbb>] __netlink_ns_capable+0x8b/0x120
net/netlink/af_netlink.c:1399
 RSP <ffff880063247578>
---[ end trace 53f9276d885fafc4 ]---

On commit 67990608c8b95d2b8ccc29932376ae73d5818727.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: net: GPF in __netlink_ns_capable
  2016-01-15 22:31 net: GPF in __netlink_ns_capable Dmitry Vyukov
@ 2016-01-16  0:08 ` Richard Weinberger
  2016-01-18  9:20   ` Herbert Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Weinberger @ 2016-01-16  0:08 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: David S. Miller, Herbert Xu, Thomas Graf, Daniel Borkmann,
	Ken-ichirou MATSUZAWA, Nicolas Dichtel, Florian Westphal, netdev,
	LKML, syzkaller, Kostya Serebryany, Alexander Potapenko,
	Sasha Levin, Eric Dumazet

On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Call Trace:
>  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
>  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30 net/netlink/af_netlink.c:1432

Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
netlink_dump() creates a new skb without a netlink control block,
but infiniband's dump functions use netlink_capable() which needs a valid
NETLINK_CB(skb).sk.

What about something like that?

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 81dc1bb..bb40ec5 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -919,6 +919,7 @@ static void netlink_skb_set_owner_r(struct sk_buff
*skb, struct sock *sk)
 {
        WARN_ON(skb->sk != NULL);
        skb->sk = sk;
+       NETLINK_CB(skb).sk = sk;
        skb->destructor = netlink_skb_destructor;
        atomic_add(skb->truesize, &sk->sk_rmem_alloc);
        sk_mem_charge(sk, skb->truesize);
-- 
Thanks,
//richard

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: net: GPF in __netlink_ns_capable
  2016-01-16  0:08 ` Richard Weinberger
@ 2016-01-18  9:20   ` Herbert Xu
  2016-01-18 20:21     ` Eric W. Biederman
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2016-01-18  9:20 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: David S. Miller, Thomas Graf, Daniel Borkmann,
	Ken-ichirou MATSUZAWA, Nicolas Dichtel, Florian Westphal, netdev,
	LKML, syzkaller, Kostya Serebryany, Alexander Potapenko,
	Sasha Levin, Eric Dumazet, Dmitry Vyukov, Eric W. Biederman

On Sat, Jan 16, 2016 at 01:08:33AM +0100, Richard Weinberger wrote:
> On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Call Trace:
> >  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
> >  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30 net/netlink/af_netlink.c:1432
> 
> Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
> netlink_dump() creates a new skb without a netlink control block,
> but infiniband's dump functions use netlink_capable() which needs a valid
> NETLINK_CB(skb).sk.
> 
> What about something like that?

No you can't do it here as netlink_unicast also calls this and for
that case you'd be overwriting the original sending user-space
socket with the kernel socket.

I'm adding Eric Bierderman as he wrote some of the code in question.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: net: GPF in __netlink_ns_capable
  2016-01-18  9:20   ` Herbert Xu
@ 2016-01-18 20:21     ` Eric W. Biederman
  2016-01-18 20:26       ` Eric W. Biederman
  0 siblings, 1 reply; 9+ messages in thread
From: Eric W. Biederman @ 2016-01-18 20:21 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Richard Weinberger, David S. Miller, Thomas Graf,
	Daniel Borkmann, Ken-ichirou MATSUZAWA, Nicolas Dichtel,
	Florian Westphal, netdev, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet, Dmitry Vyukov

Herbert Xu <herbert@gondor.apana.org.au> writes:

> On Sat, Jan 16, 2016 at 01:08:33AM +0100, Richard Weinberger wrote:
>> On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > Call Trace:
>> >  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
>> >  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30 net/netlink/af_netlink.c:1432
>> 
>> Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
>> netlink_dump() creates a new skb without a netlink control block,
>> but infiniband's dump functions use netlink_capable() which needs a valid
>> NETLINK_CB(skb).sk.
>> 
>> What about something like that?
>
> No you can't do it here as netlink_unicast also calls this and for
> that case you'd be overwriting the original sending user-space
> socket with the kernel socket.
>
> I'm adding Eric Bierderman as he wrote some of the code in question.

*Scratches my head*

I think I am just going to recommend removing that bit of code from
the infiniband stack.

There are several things very wrong here.

First netlinnk_capable and it's ilk are for the very specific purpose of
handling backwards compatibility as a truly clean solution of checking
at open or connect time would break existing applications.
ib_nl_handle_resolv_resp is new code.  So it can almost certainly do
something cleaner.

netlink_capable is very much not for filtering netlink dumps, but for
filtering the queries themselves.  So it appears the capability check is
very much in the wrong place.

All of this is newish code and apparently never even tested as this code
should have failed this way for everyone.  So since the code does not
work not apparently has never worked, it is probably easiest just to
remove the problematic code (AKA revert), and start fresh and not
something that requires backwards compatibility hacks from day one.

By new I mean code that came in through the commit below.

commit 2ca546b92a024d07adedd15b4c262b1c2c0786ec
Author: Kaike Wan <kaike.wan@intel.com>
Date:   Fri Aug 14 08:52:09 2015 -0400

    IB/sa: Route SA pathrecord query through netlink
    
    This patch routes a SA pathrecord query to netlink first and processes the
    response appropriately. If a failure is returned, the request will be sent
    through IB. The decision whether to route the request to netlink first is
    determined by the presence of a listener for the local service netlink
    multicast group. If the user-space local service netlink multicast group
    listener is not present, the request will be sent through IB, just like
    what is currently being done.
    
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    Signed-off-by: John Fleck <john.fleck@intel.com>
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>

Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: net: GPF in __netlink_ns_capable
  2016-01-18 20:21     ` Eric W. Biederman
@ 2016-01-18 20:26       ` Eric W. Biederman
  2016-01-19 20:47         ` Wan, Kaike
  2016-01-20 14:35         ` Wan, Kaike
  0 siblings, 2 replies; 9+ messages in thread
From: Eric W. Biederman @ 2016-01-18 20:26 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Richard Weinberger, David S. Miller, Thomas Graf,
	Daniel Borkmann, Ken-ichirou MATSUZAWA, Nicolas Dichtel,
	Florian Westphal, netdev, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet, Dmitry Vyukov,
	Kaike Wan, John Fleck, Ira Weiny, Doug Ledford


Apparently we have missed entirely the folks who added this chunk
of this code to the kernel on this thread, so adding them now.

ebiederm@xmission.com (Eric W. Biederman) writes:

> Herbert Xu <herbert@gondor.apana.org.au> writes:
>
>> On Sat, Jan 16, 2016 at 01:08:33AM +0100, Richard Weinberger wrote:
>>> On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> > Call Trace:
>>> >  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
>>> >  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30 net/netlink/af_netlink.c:1432
>>> 
>>> Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
>>> netlink_dump() creates a new skb without a netlink control block,
>>> but infiniband's dump functions use netlink_capable() which needs a valid
>>> NETLINK_CB(skb).sk.
>>> 
>>> What about something like that?
>>
>> No you can't do it here as netlink_unicast also calls this and for
>> that case you'd be overwriting the original sending user-space
>> socket with the kernel socket.
>>
>> I'm adding Eric Bierderman as he wrote some of the code in question.
>
> *Scratches my head*
>
> I think I am just going to recommend removing that bit of code from
> the infiniband stack.
>
> There are several things very wrong here.
>
> First netlinnk_capable and it's ilk are for the very specific purpose of
> handling backwards compatibility as a truly clean solution of checking
> at open or connect time would break existing applications.
> ib_nl_handle_resolv_resp is new code.  So it can almost certainly do
> something cleaner.
>
> netlink_capable is very much not for filtering netlink dumps, but for
> filtering the queries themselves.  So it appears the capability check is
> very much in the wrong place.
>
> All of this is newish code and apparently never even tested as this code
> should have failed this way for everyone.  So since the code does not
> work not apparently has never worked, it is probably easiest just to
> remove the problematic code (AKA revert), and start fresh and not
> something that requires backwards compatibility hacks from day one.
>
> By new I mean code that came in through the commit below.
>
> commit 2ca546b92a024d07adedd15b4c262b1c2c0786ec
> Author: Kaike Wan <kaike.wan@intel.com>
> Date:   Fri Aug 14 08:52:09 2015 -0400
>
>     IB/sa: Route SA pathrecord query through netlink
>     
>     This patch routes a SA pathrecord query to netlink first and processes the
>     response appropriately. If a failure is returned, the request will be sent
>     through IB. The decision whether to route the request to netlink first is
>     determined by the presence of a listener for the local service netlink
>     multicast group. If the user-space local service netlink multicast group
>     listener is not present, the request will be sent through IB, just like
>     what is currently being done.
>     
>     Signed-off-by: Kaike Wan <kaike.wan@intel.com>
>     Signed-off-by: John Fleck <john.fleck@intel.com>
>     Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>     Signed-off-by: Doug Ledford <dledford@redhat.com>

What was this code trying to do with netlink_capable besides oops the
kernel?

Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: net: GPF in __netlink_ns_capable
  2016-01-18 20:26       ` Eric W. Biederman
@ 2016-01-19 20:47         ` Wan, Kaike
  2016-01-20 14:35         ` Wan, Kaike
  1 sibling, 0 replies; 9+ messages in thread
From: Wan, Kaike @ 2016-01-19 20:47 UTC (permalink / raw)
  To: Eric W. Biederman, Herbert Xu
  Cc: Richard Weinberger, David S. Miller, Thomas Graf,
	Daniel Borkmann, Ken-ichirou MATSUZAWA, Nicolas Dichtel,
	Florian Westphal, netdev, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet, Dmitry Vyukov,
	Fleck, John, Weiny, Ira, Doug Ledford

I need to do some more investigation before getting back.  The original patches were tested in Kernel 4.3 and apparently no crash was observed at the time.  Adding netlink_capable() in the patch was meant to make sure that only admin has access to the IB netlink service.

Kaike

> -----Original Message-----
> From: Eric W. Biederman [mailto:ebiederm@xmission.com]
> Sent: Monday, January 18, 2016 3:27 PM
> To: Herbert Xu
> Cc: Richard Weinberger; David S. Miller; Thomas Graf; Daniel Borkmann;
> Ken-ichirou MATSUZAWA; Nicolas Dichtel; Florian Westphal; netdev; LKML;
> syzkaller; Kostya Serebryany; Alexander Potapenko; Sasha Levin; Eric
> Dumazet; Dmitry Vyukov; Wan, Kaike; Fleck, John; Weiny, Ira; Doug Ledford
> Subject: Re: net: GPF in __netlink_ns_capable
> 
> 
> Apparently we have missed entirely the folks who added this chunk of this
> code to the kernel on this thread, so adding them now.
> 
> ebiederm@xmission.com (Eric W. Biederman) writes:
> 
> > Herbert Xu <herbert@gondor.apana.org.au> writes:
> >
> >> On Sat, Jan 16, 2016 at 01:08:33AM +0100, Richard Weinberger wrote:
> >>> On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com>
> wrote:
> >>> > Call Trace:
> >>> >  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
> >>> >  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30
> >>> > net/netlink/af_netlink.c:1432
> >>>
> >>> Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
> >>> netlink_dump() creates a new skb without a netlink control block,
> >>> but infiniband's dump functions use netlink_capable() which needs a
> >>> valid NETLINK_CB(skb).sk.
> >>>
> >>> What about something like that?
> >>
> >> No you can't do it here as netlink_unicast also calls this and for
> >> that case you'd be overwriting the original sending user-space socket
> >> with the kernel socket.
> >>
> >> I'm adding Eric Bierderman as he wrote some of the code in question.
> >
> > *Scratches my head*
> >
> > I think I am just going to recommend removing that bit of code from
> > the infiniband stack.
> >
> > There are several things very wrong here.
> >
> > First netlinnk_capable and it's ilk are for the very specific purpose
> > of handling backwards compatibility as a truly clean solution of
> > checking at open or connect time would break existing applications.
> > ib_nl_handle_resolv_resp is new code.  So it can almost certainly do
> > something cleaner.
> >
> > netlink_capable is very much not for filtering netlink dumps, but for
> > filtering the queries themselves.  So it appears the capability check
> > is very much in the wrong place.
> >
> > All of this is newish code and apparently never even tested as this
> > code should have failed this way for everyone.  So since the code does
> > not work not apparently has never worked, it is probably easiest just
> > to remove the problematic code (AKA revert), and start fresh and not
> > something that requires backwards compatibility hacks from day one.
> >
> > By new I mean code that came in through the commit below.
> >
> > commit 2ca546b92a024d07adedd15b4c262b1c2c0786ec
> > Author: Kaike Wan <kaike.wan@intel.com>
> > Date:   Fri Aug 14 08:52:09 2015 -0400
> >
> >     IB/sa: Route SA pathrecord query through netlink
> >
> >     This patch routes a SA pathrecord query to netlink first and processes the
> >     response appropriately. If a failure is returned, the request will be sent
> >     through IB. The decision whether to route the request to netlink first is
> >     determined by the presence of a listener for the local service netlink
> >     multicast group. If the user-space local service netlink multicast group
> >     listener is not present, the request will be sent through IB, just like
> >     what is currently being done.
> >
> >     Signed-off-by: Kaike Wan <kaike.wan@intel.com>
> >     Signed-off-by: John Fleck <john.fleck@intel.com>
> >     Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >     Signed-off-by: Doug Ledford <dledford@redhat.com>
> 
> What was this code trying to do with netlink_capable besides oops the kernel?
> 
> Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: net: GPF in __netlink_ns_capable
  2016-01-18 20:26       ` Eric W. Biederman
  2016-01-19 20:47         ` Wan, Kaike
@ 2016-01-20 14:35         ` Wan, Kaike
  2016-01-20 15:00           ` Herbert Xu
  1 sibling, 1 reply; 9+ messages in thread
From: Wan, Kaike @ 2016-01-20 14:35 UTC (permalink / raw)
  To: Eric W. Biederman, Herbert Xu
  Cc: Richard Weinberger, David S. Miller, Thomas Graf,
	Daniel Borkmann, Ken-ichirou MATSUZAWA, Nicolas Dichtel,
	Florian Westphal, netdev, LKML, syzkaller, Kostya Serebryany,
	Alexander Potapenko, Sasha Levin, Eric Dumazet, Dmitry Vyukov,
	Fleck, John, Weiny, Ira, Doug Ledford, Jason Gunthorpe

>From the code (netlink_dump() in net/netlink/af_netlink.c), it shows that a skb is allocated without initializing the skb->cb[] field, which will cause oops if netlink_capable() is called with the duplicate skb. This will happen if the netlink_dump_start() path is followed (in ibnl_rcv_msg() in drivers/infiniband/core/netlink.c). However, for the IB netlink local service, we handle only request RDMA_NL_LS_OP_SET_TIMEOUT and response to RDMA_NL_LS_OP_RESOLVE, which directly call the registered dump function (ib_nl_handle_resolve_resp() and ib_nl_handle_resolve_resp()). See the following snippet:

/*
			 * For response or local service set_timeout request,
			 * there is no need to use netlink_dump_start.
			 */
			if (!(nlh->nlmsg_flags & NLM_F_REQUEST) ||
			    (index == RDMA_NL_LS &&
			     op == RDMA_NL_LS_OP_SET_TIMEOUT)) {
				struct netlink_callback cb = {
					.skb = skb,
					.nlh = nlh,
					.dump = client->cb_table[op].dump,
					.module = client->cb_table[op].module,
				};

				return cb.dump(skb, &cb);
			}

I have just tested the code with rping and ibacm with Doug's k.o/for-4.3 and k.o/for-4.5 branches:

git://github.com/dledford/linux.git

root@phifs011 wfr-ibacm]# rping -c -v -d -C 1 -a 10.228.216.26
count 1
created cm_id 0x24065d0
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x24065d0 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x24065d0 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x240b640
created channel 0x2406c30
created cq 0x2406290
created qp 0x2406c50
rping_setup_buffers called on cb 0x2403010
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x24065d0 (parent)
ESTABLISHED
rmda_connect successful
RDMA addr 2406da0 rkey 252600 len 64
send completion
recv completion
RDMA addr 2406d10 rkey 242500 len 64
send completion
recv completion
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x24065d0 (parent)
client DISCONNECT EVENT...
rping_free_buffers called on cb 0x2403010
destroy cm_id 0x24065d0
[root@phifs011 wfr-ibacm]#
[root@phifs011 wfr-ibacm]#
[root@phifs011 wfr-ibacm]# uname -a
Linux phifs011 4.3.0-rc3+ #2 SMP Thu Oct 29 09:40:20 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux


[root@phifs011 ~]# rping -c -v -d -C 1 -a 10.228.216.26
count 1
created cm_id 0xTCP: request_sock_TCP: Possible SYN flooding on port 6125. Sending cookies.  Check SNMP counters.
15fb5d0
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x15fb5d0 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x15fb5d0 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x1600640
created channel 0x15fbc30
created cq 0x15fb290
created qp 0x15fbc50
rping_setup_buffers called on cb 0x15f8010
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x15fb5d0 (parent)
ESTABLISHED
rmda_connect successful
RDMA addr 15fbda0 rkey 50600 len 64
send completion
recv completion
RDMA addr 15fbd10 rkey 40500 len 64
send completion
recv completion
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x15fb5d0 (parent)
client DISCONNECT EVENT...
rping_free_buffers called on cb 0x15f8010
destroy cm_id 0x15fb5d0
[root@phifs011 ~]# uname -a
Linux phifs011 4.4.0-rc6+ #1 SMP Wed Jan 20 10:03:27 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@phifs011 ~]# cat /var/log/ibacm.log |grep _nl_
1453303250.873: acm_nl_receive: nlmsg: len 152 type 0x1000 flags 0x1 seq 1 pid 0
1453303250.873: acm_nl_process_resolve: path use 0x2
1453303250.873: acm_nl_parse_path_attr: service_id 0x1061c06
1453303250.873: acm_nl_parse_path_attr: path dgid fe80::11:7500:78:ad92
1453303250.873: acm_nl_parse_path_attr: path sgid fe80::11:7500:78:a424
1453303250.873: acm_nl_parse_path_attr: pkey 0xffff
1453303250.873: acm_nl_parse_path_attr: qos_class 0x0
1453303250.873: acm_nl_send: acm status success
1453303388.175: acm_nl_receive: nlmsg: len 152 type 0x1000 flags 0x1 seq 2 pid 0
1453303388.175: acm_nl_process_resolve: path use 0x2
1453303388.175: acm_nl_parse_path_attr: service_id 0x1061c06
1453303388.175: acm_nl_parse_path_attr: path dgid fe80::11:7500:78:ad92
1453303388.175: acm_nl_parse_path_attr: path sgid fe80::11:7500:78:a424
1453303388.175: acm_nl_parse_path_attr: pkey 0xffff
1453303388.175: acm_nl_parse_path_attr: qos_class 0x0
1453303388.175: acm_nl_send: acm status success

How could this cause crash?

Kaike

> -----Original Message-----
> From: Eric W. Biederman [mailto:ebiederm@xmission.com]
> Sent: Monday, January 18, 2016 3:27 PM
> To: Herbert Xu
> Cc: Richard Weinberger; David S. Miller; Thomas Graf; Daniel Borkmann;
> Ken-ichirou MATSUZAWA; Nicolas Dichtel; Florian Westphal; netdev; LKML;
> syzkaller; Kostya Serebryany; Alexander Potapenko; Sasha Levin; Eric
> Dumazet; Dmitry Vyukov; Wan, Kaike; Fleck, John; Weiny, Ira; Doug Ledford
> Subject: Re: net: GPF in __netlink_ns_capable
> 
> 
> Apparently we have missed entirely the folks who added this chunk of this
> code to the kernel on this thread, so adding them now.
> 
> ebiederm@xmission.com (Eric W. Biederman) writes:
> 
> > Herbert Xu <herbert@gondor.apana.org.au> writes:
> >
> >> On Sat, Jan 16, 2016 at 01:08:33AM +0100, Richard Weinberger wrote:
> >>> On Fri, Jan 15, 2016 at 11:31 PM, Dmitry Vyukov <dvyukov@google.com>
> wrote:
> >>> > Call Trace:
> >>> >  [<     inline     >] netlink_ns_capable net/netlink/af_netlink.c:1417
> >>> >  [<ffffffff8529c0a5>] netlink_capable+0x25/0x30
> >>> > net/netlink/af_netlink.c:1432
> >>>
> >>> Hmm, we're crashing because NETLINK_CB(skb).sk is NULL.
> >>> netlink_dump() creates a new skb without a netlink control block,
> >>> but infiniband's dump functions use netlink_capable() which needs a
> >>> valid NETLINK_CB(skb).sk.
> >>>
> >>> What about something like that?
> >>
> >> No you can't do it here as netlink_unicast also calls this and for
> >> that case you'd be overwriting the original sending user-space socket
> >> with the kernel socket.
> >>
> >> I'm adding Eric Bierderman as he wrote some of the code in question.
> >
> > *Scratches my head*
> >
> > I think I am just going to recommend removing that bit of code from
> > the infiniband stack.
> >
> > There are several things very wrong here.
> >
> > First netlinnk_capable and it's ilk are for the very specific purpose
> > of handling backwards compatibility as a truly clean solution of
> > checking at open or connect time would break existing applications.
> > ib_nl_handle_resolv_resp is new code.  So it can almost certainly do
> > something cleaner.
> >
> > netlink_capable is very much not for filtering netlink dumps, but for
> > filtering the queries themselves.  So it appears the capability check
> > is very much in the wrong place.
> >
> > All of this is newish code and apparently never even tested as this
> > code should have failed this way for everyone.  So since the code does
> > not work not apparently has never worked, it is probably easiest just
> > to remove the problematic code (AKA revert), and start fresh and not
> > something that requires backwards compatibility hacks from day one.
> >
> > By new I mean code that came in through the commit below.
> >
> > commit 2ca546b92a024d07adedd15b4c262b1c2c0786ec
> > Author: Kaike Wan <kaike.wan@intel.com>
> > Date:   Fri Aug 14 08:52:09 2015 -0400
> >
> >     IB/sa: Route SA pathrecord query through netlink
> >
> >     This patch routes a SA pathrecord query to netlink first and processes the
> >     response appropriately. If a failure is returned, the request will be sent
> >     through IB. The decision whether to route the request to netlink first is
> >     determined by the presence of a listener for the local service netlink
> >     multicast group. If the user-space local service netlink multicast group
> >     listener is not present, the request will be sent through IB, just like
> >     what is currently being done.
> >
> >     Signed-off-by: Kaike Wan <kaike.wan@intel.com>
> >     Signed-off-by: John Fleck <john.fleck@intel.com>
> >     Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >     Signed-off-by: Doug Ledford <dledford@redhat.com>
> 
> What was this code trying to do with netlink_capable besides oops the kernel?
> 
> Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: net: GPF in __netlink_ns_capable
  2016-01-20 14:35         ` Wan, Kaike
@ 2016-01-20 15:00           ` Herbert Xu
  2016-01-20 18:14             ` Wan, Kaike
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2016-01-20 15:00 UTC (permalink / raw)
  To: Wan, Kaike
  Cc: Eric W. Biederman, Richard Weinberger, David S. Miller,
	Thomas Graf, Daniel Borkmann, Ken-ichirou MATSUZAWA,
	Nicolas Dichtel, Florian Westphal, netdev, LKML, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet, Dmitry Vyukov, Fleck, John, Weiny, Ira,
	Doug Ledford, Jason Gunthorpe

On Wed, Jan 20, 2016 at 02:35:59PM +0000, Wan, Kaike wrote:
> >From the code (netlink_dump() in net/netlink/af_netlink.c), it shows that a skb is allocated without initializing the skb->cb[] field, which will cause oops if netlink_capable() is called with the duplicate skb. This will happen if the netlink_dump_start() path is followed (in ibnl_rcv_msg() in drivers/infiniband/core/netlink.c). However, for the IB netlink local service, we handle only request RDMA_NL_LS_OP_SET_TIMEOUT and response to RDMA_NL_LS_OP_RESOLVE, which directly call the registered dump function (ib_nl_handle_resolve_resp() and ib_nl_handle_resolve_resp()). See the following snippet:

You'll find a reproducer in the original email:

http://lkml.iu.edu/hypermail/linux/kernel/1601.1/06505.html

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: net: GPF in __netlink_ns_capable
  2016-01-20 15:00           ` Herbert Xu
@ 2016-01-20 18:14             ` Wan, Kaike
  0 siblings, 0 replies; 9+ messages in thread
From: Wan, Kaike @ 2016-01-20 18:14 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric W. Biederman, Richard Weinberger, David S. Miller,
	Thomas Graf, Daniel Borkmann, Ken-ichirou MATSUZAWA,
	Nicolas Dichtel, Florian Westphal, netdev, LKML, syzkaller,
	Kostya Serebryany, Alexander Potapenko, Sasha Levin,
	Eric Dumazet, Dmitry Vyukov, Fleck, John, Weiny, Ira,
	Doug Ledford, Jason Gunthorpe

The problem was caused by the RDMA_NL_LS_OP_RESOLVE request (not response) packet sent by the user application, which falls through the netlink_dump path and eventually calls ib_nl_handle_resp() with a new skb with uninitialized control block. Checking the NETLINK_CB(skb).sk before calling netlink_capable() will fix the problem.

I will submit a patch soon.

Kaike

> -----Original Message-----
> From: Herbert Xu [mailto:herbert@gondor.apana.org.au]
> Sent: Wednesday, January 20, 2016 10:00 AM
> To: Wan, Kaike
> Cc: Eric W. Biederman; Richard Weinberger; David S. Miller; Thomas Graf;
> Daniel Borkmann; Ken-ichirou MATSUZAWA; Nicolas Dichtel; Florian
> Westphal; netdev; LKML; syzkaller; Kostya Serebryany; Alexander Potapenko;
> Sasha Levin; Eric Dumazet; Dmitry Vyukov; Fleck, John; Weiny, Ira; Doug
> Ledford; Jason Gunthorpe
> Subject: Re: net: GPF in __netlink_ns_capable
> 
> On Wed, Jan 20, 2016 at 02:35:59PM +0000, Wan, Kaike wrote:
> > >From the code (netlink_dump() in net/netlink/af_netlink.c), it shows that a
> skb is allocated without initializing the skb->cb[] field, which will cause oops
> if netlink_capable() is called with the duplicate skb. This will happen if the
> netlink_dump_start() path is followed (in ibnl_rcv_msg() in
> drivers/infiniband/core/netlink.c). However, for the IB netlink local service,
> we handle only request RDMA_NL_LS_OP_SET_TIMEOUT and response to
> RDMA_NL_LS_OP_RESOLVE, which directly call the registered dump function
> (ib_nl_handle_resolve_resp() and ib_nl_handle_resolve_resp()). See the
> following snippet:
> 
> You'll find a reproducer in the original email:
> 
> http://lkml.iu.edu/hypermail/linux/kernel/1601.1/06505.html
> 
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page:
> http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-01-20 18:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-15 22:31 net: GPF in __netlink_ns_capable Dmitry Vyukov
2016-01-16  0:08 ` Richard Weinberger
2016-01-18  9:20   ` Herbert Xu
2016-01-18 20:21     ` Eric W. Biederman
2016-01-18 20:26       ` Eric W. Biederman
2016-01-19 20:47         ` Wan, Kaike
2016-01-20 14:35         ` Wan, Kaike
2016-01-20 15:00           ` Herbert Xu
2016-01-20 18:14             ` Wan, Kaike

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.