From: Daniel Borkmann <daniel@iogearbox.net>
To: Ken-ichirou MATSUZAWA <chamaken@gmail.com>,
David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, fw@strlen.de
Subject: Re: [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues
Date: Fri, 14 Aug 2015 12:38:21 +0200
Message-ID: <55CDC51D.1060204@iogearbox.net>
In-Reply-To: <55CDBC84.8020605@iogearbox.net>
On 08/14/2015 12:01 PM, Daniel Borkmann wrote:
> On 08/14/2015 10:58 AM, Ken-ichirou MATSUZAWA wrote:
>> Hi,
>>
>> Thank you for taking the time.
>> Let me explain the issues with code samples on gist.
>> Sorry that I cannot describe and organize them better.
>>
>> normal socket nflog sample:
>> https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/nflog.c
>>
>> set iptables
>>
>> iptables -A INPUT -p icmp --icmp-type echo-request \
>> -j NFLOG --nflog-group 2 --nflog-threshold 4
>>
>> Monitor nlmon (e.g. with netsniff-ng), run this sample, and run
>> ping -i 0.2 -c 10 from another host. The sample only prints the
>> received size and nlmsg_type. The same can be done with an rx mmaped
>> socket.
>>
>> rx only mmaped nflog sample:
>> https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/rxring-nflog.c
>>
>> This sample triggers a kernel panic while nlmon is being monitored.
>>
>> panic message:
>> https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/mmaped_netlink_panic
>>
>> I think this is because skb_shared_info is accessed when the skb is
>> released, although an mmaped netlink skb does not have one. Patches 1
>> and 2 try to fix this by introducing helper functions that do not
>> access skb_shared_info.
>>
>> I also think nm_status should be set to UNUSED when the skb is
>> released, which patch 3 tries to fix.
>
> Ok, I'm trying to understand the issue: you are saying that whenever
> there's an skb_clone() on a netlink mmaped skb, we have the situation
> that skb->data, skb->head etc. point to the mmaped user space buffer
> slot, and thus the skb _must_ have no shared info.
>
> Currently, what happens is that the shared info accesses whatever
> memory is there in the mmaped region. So as soon as you do an
> skb_clone(), you should get into trouble right there, e.g. when
> we test for orphaning frags etc. (if at the right offset in the mmap
> buffer, the tx_flags member would contain a SKBTX_DEV_ZEROCOPY bit).
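
To make the concern above concrete, here is a minimal user-space sketch,
not kernel code. It assumes a simplified model in which the shared info is
read from wherever the end pointer of the skb lands; all type and function
names (shinfo_skb, model_shinfo, ...) and the flag value are hypothetical
stand-ins chosen for illustration. If the buffer is a user-writable ring
slot, the "shared info" is simply whatever bytes user space left there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value, used only inside this model. */
#define MODEL_SKBTX_DEV_ZEROCOPY (1u << 3)

/* Simplified stand-in for the shared info placed after the data area. */
struct model_shinfo {
	uint8_t nr_frags;
	uint8_t tx_flags;
};

/* Simplified stand-in for an skb: only the two pointers we need. */
struct shinfo_skb {
	unsigned char *head;	/* start of the buffer */
	unsigned char *end;	/* end of the data area */
};

/* Build a model skb directly on top of a (user-writable) ring slot. */
static struct shinfo_skb shinfo_skb_over_slot(unsigned char *slot, size_t data_len)
{
	assert(slot != NULL);
	struct shinfo_skb skb = { .head = slot, .end = slot + data_len };
	return skb;
}

/* Read the "shared info" from wherever the end pointer lands. */
static struct model_shinfo model_read_shinfo(const struct shinfo_skb *skb)
{
	struct model_shinfo info;

	memcpy(&info, skb->end, sizeof(info));
	return info;
}

/* Would a clone path decide that frags need to be orphaned? */
static bool model_wants_frag_orphan(const struct shinfo_skb *skb)
{
	struct model_shinfo info = model_read_shinfo(skb);

	return (info.tx_flags & MODEL_SKBTX_DEV_ZEROCOPY) != 0;
}
```

In this model, the answer from model_wants_frag_orphan() flips as soon as
the byte at the tx_flags offset behind the data area has the bit set,
without anyone ever having initialized a shared info structure.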
Ken-ichirou, have you observed this issue only in relation to nlmon?
I haven't checked yet whether any upper layer netlink consumers would
also call into skb_clone() for some reason. If taps are indeed the only
ones affected, it is probably not worth adding that much complexity for
the fix itself; better to keep it simple, along the lines of the patch
below. I don't know if there are any real users of netlink mmap, but if
tap support is really needed, we could think about it on a net-next basis.

It seems you have some other, separate fixes in your series, so you might
want to submit them separately against the net tree instead.
include/linux/netlink.h | 4 ++++
net/netlink/af_netlink.c | 12 +++++++-----
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 9120edb..42cdcd8 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -35,6 +35,10 @@ struct netlink_skb_parms {
#define NETLINK_CB(skb) (*(struct netlink_skb_parms*)&((skb)->cb))
#define NETLINK_CREDS(skb) (&NETLINK_CB((skb)).creds)
+static inline bool netlink_skb_is_mmaped(const struct sk_buff *skb)
+{
+ return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
+}
extern void netlink_table_grab(void);
extern void netlink_table_ungrab(void);
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 67d2104..4307446 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -238,6 +238,13 @@ static void __netlink_deliver_tap(struct sk_buff *skb)
static void netlink_deliver_tap(struct sk_buff *skb)
{
+ /* Netlink mmaped skbs must not access shared info, and thus
+ * are not allowed to be cloned. For now, just don't allow
+ * them to get inspected by taps.
+ */
+ if (netlink_skb_is_mmaped(skb))
+ return;
+
rcu_read_lock();
if (unlikely(!list_empty(&netlink_tap_all)))
@@ -278,11 +285,6 @@ static void netlink_rcv_wake(struct sock *sk)
}
#ifdef CONFIG_NETLINK_MMAP
-static bool netlink_skb_is_mmaped(const struct sk_buff *skb)
-{
- return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
-}
-
static bool netlink_rx_is_mmaped(struct sock *sk)
{
return nlk_sk(sk)->rx_ring.pg_vec != NULL;
--
1.9.3
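
The effect of the guard in the patch above can be sketched in user space
as follows. This is only an illustrative model: tap_skb, model_tap and
the flag value are hypothetical names, and the real tap delivery does
considerably more than bump a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value, used only inside this model. */
#define MODEL_NETLINK_SKB_MMAPED 0x1u

/* Simplified stand-in for an skb carrying netlink control-block flags. */
struct tap_skb {
	unsigned int flags;
};

/* Simplified stand-in for a tap device that counts what it has seen. */
struct model_tap {
	unsigned int delivered;
};

static bool model_skb_is_mmaped(const struct tap_skb *skb)
{
	return (skb->flags & MODEL_NETLINK_SKB_MMAPED) != 0;
}

/*
 * Deliver an skb to the tap unless it is mmaped. Returns true if the tap
 * saw the skb. Skipping mmaped skbs means no clone is ever attempted on
 * them from this path.
 */
static bool model_deliver_tap(struct model_tap *tap, const struct tap_skb *skb)
{
	assert(tap != NULL && skb != NULL);

	if (model_skb_is_mmaped(skb))
		return false;

	tap->delivered++;
	return true;
}
```

The trade-off is visible in the model: mmaped messages are invisible to
the tap, which keeps the fix small at the cost of incomplete monitoring.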
>> ----
>>
>> With both tx/rx mmaped,
>>
>> both tx/rx mmaped nflog sample:
>> https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/ring-nflog.c
>>
>> This sample does not work, because msg->msg_iter.type in
>> netlink_sendmsg() is set to 1 (WRITE) when the sample calls
>> sendto(). Patch 4 fixes this by accepting that value.
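
A small user-space sketch of why a type value of 1 (WRITE) can trip a
check. It assumes, and this is an assumption rather than something stated
in this thread, that the iterator kind and the direction bit share one
integer and that the failing check compared that integer for strict
equality with the iovec kind. All constants and names below are
hypothetical model values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model values: iterator kind bits and a direction bit. */
enum { MODEL_ITER_IOVEC = 0, MODEL_ITER_KVEC = 2, MODEL_ITER_BVEC = 4 };
enum { MODEL_READ = 0, MODEL_WRITE = 1 };

/* Combine an iterator kind with a direction into one type value. */
static int model_iter_type(int kind, int dir)
{
	assert(dir == MODEL_READ || dir == MODEL_WRITE);
	return kind | dir;
}

/* Strict comparison: fails as soon as the direction bit is set. */
static bool model_strict_is_iovec(int type)
{
	return type == MODEL_ITER_IOVEC;
}

/* Masked comparison: ignores the direction bit, looks only at the kind. */
static bool model_masked_is_iovec(int type)
{
	return !(type & (MODEL_ITER_KVEC | MODEL_ITER_BVEC));
}
```

Under these assumptions, an iovec iterator set up for writing has type 1,
which the strict check rejects and the masked check accepts.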
>>
>> ----
>>
>> After applying patches 1 and 2, the rx-only sample works, but it
>> behaves differently from the normal one. Patch 5 may fix this.
>>
>> It also works well with other code of mine, which sets the frame
>> nm_status to SKIP and passes the frame to worker threads; the worker
>> threads then set the status to UNUSED, even when the ring becomes full.
>>
>> That code may set the UNUSED status in arbitrary order, not
>> sequentially, so it seems I need to check the whole ring.
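
The "check the whole ring" idea can be sketched like this. It is a
user-space model only: the status enum, the flat array of statuses and
model_find_frame() are hypothetical, and a real ring would read nm_status
from each frame header instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-frame status values for this model. */
enum model_status {
	MODEL_UNUSED,
	MODEL_RESERVED,
	MODEL_VALID,
	MODEL_COPY,
	MODEL_SKIP
};

/*
 * Scan the whole ring once, starting at 'head' and wrapping around, and
 * return the index of the first frame with the wanted status, or -1 if
 * no frame has it. Looking only at 'head' would miss frames that were
 * released out of order.
 */
static long model_find_frame(const enum model_status *ring, size_t nframes,
			     size_t head, enum model_status wanted)
{
	assert(ring != NULL || nframes == 0);

	for (size_t i = 0; i < nframes; i++) {
		size_t pos = (head + i) % nframes;

		if (ring[pos] == wanted)
			return (long)pos;
	}
	return -1;
}
```

The full scan costs O(n) per lookup, which is the price for allowing
worker threads to release frames in any order.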
>>
>> Thanks,
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
Thread overview: 35+ messages
2015-07-22 13:17 [RFC PATCH 0/5] netlink: mmap kernel panic and some issues Ken-ichirou MATSUZAWA
2015-08-12 8:28 ` [PATCHv1 net-next 0/5] netlink: mmap: " Ken-ichirou MATSUZAWA
2015-08-12 8:31 ` [PATCHv1 net-next 1/5] netlink: mmap: introduce mmaped skb helper functions Ken-ichirou MATSUZAWA
2015-08-12 8:32 ` [PATCHv1 net-next 2/5] netlink: mmap: apply " Ken-ichirou MATSUZAWA
2015-08-12 8:34 ` [PATCHv1 net-next 3/5] netlink: mmap: fix status for not delivered skb Ken-ichirou MATSUZAWA
2015-08-12 8:35 ` [PATCHv1 net-next 4/5] netlink: mmap: update tx type check Ken-ichirou MATSUZAWA
2015-08-12 8:38 ` [PATCHv1 net-next 5/5] netlink: mmap: notify only when NL_MMAP_STATUS_VALID frame exists Ken-ichirou MATSUZAWA
2015-08-12 23:38 ` [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues David Miller
2015-08-14 8:58 ` Ken-ichirou MATSUZAWA
2015-08-14 10:01 ` Daniel Borkmann
2015-08-14 10:38 ` Daniel Borkmann [this message]
2015-08-15 2:25 ` Ken-ichirou MATSUZAWA
2015-08-17 21:02 ` David Miller
2015-08-19 14:29 ` Daniel Borkmann
2015-09-02 0:04 ` Ken-ichirou MATSUZAWA
2015-09-02 9:47 ` Daniel Borkmann
2015-09-02 11:35 ` Ken-ichirou MATSUZAWA
2015-09-02 15:56 ` Daniel Borkmann
2015-09-02 22:27 ` Ken-ichirou MATSUZAWA
2015-09-07 14:54 ` Daniel Borkmann
2015-09-09 5:59 ` David Miller
2015-09-09 8:53 ` Thomas Graf
2015-09-09 9:22 ` Daniel Borkmann
2015-08-20 3:43 ` [PATCH net] netlink: mmap: fix tx type check Ken-ichirou MATSUZAWA
2015-08-23 23:06 ` David Miller
2015-08-20 5:54 ` [PATCH net] netlink: rx mmap: fix POLLIN condition Ken-ichirou MATSUZAWA
2015-08-26 3:17 ` David Miller
2015-08-28 7:00 ` Ken-ichirou MATSUZAWA
2015-08-28 7:05 ` [PATCH net] netlink: mmap: fix lookup frame position Ken-ichirou MATSUZAWA
2015-08-29 5:26 ` David Miller
2015-08-30 22:54 ` [PATCH net] netlink: rx mmap: fix POLLIN condition Ken-ichirou MATSUZAWA
2015-08-31 4:56 ` David Miller
2015-08-20 7:07 ` [PATCH net] netlink: mmap: fix status setting in skb destructor Ken-ichirou MATSUZAWA
2015-08-26 3:22 ` David Miller
2015-08-28 7:37 ` Ken-ichirou MATSUZAWA