From: Yonghong Song <yhs@fb.com>
To: Carlos Antonio Neira Bustos <cneirabustos@gmail.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Eric Biederman <ebiederm@xmission.com>,
"brouer@redhat.com" <brouer@redhat.com>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next V9 1/3] bpf: new helper to obtain namespace data from current task
Date: Wed, 28 Aug 2019 20:53:25 +0000
Message-ID: <4faeb577-387a-7186-e060-f0ca76395823@fb.com>
In-Reply-To: <20190828203951.qo4kaloahcnvp7nw@ebpf-metal>
On 8/28/19 1:39 PM, Carlos Antonio Neira Bustos wrote:
> Yonghong,
>
> Thanks for the pointer. I fixed this bug, but I found another one that is now
> triggered by the test program I included in tools/testing/selftests/bpf/test_pidns.
> It seems that fname was not correctly set up when passing it to filename_lookup.
> This is fixed now and I'm doing some more testing.
> I think I'll remove the tests in samples/bpf, as they now mostly end in -EPERM,
> as the fix intended.
> Is it OK to remove them and just focus on finishing the selftests code?
Yes, the samples/bpf test case can be removed.
Could you create a selftest with the tracepoint net/netif_receive_skb
that also uses the proposed helper? net/netif_receive_skb fires in
interrupt context, so it should catch the issue as well if
filename_lookup still gets called in interrupt context.
>
> Bests
>
> On Wed, Aug 14, 2019 at 01:25:06AM -0400, carlos antonio neira bustos wrote:
>> Thank you very much!
>>
>> Bests
>>
>> On Wed, Aug 14, 2019 at 00:50, Yonghong Song <yhs@fb.com> wrote:
>>
>>>
>>>
>>> On 8/13/19 5:56 PM, Carlos Antonio Neira Bustos wrote:
>>>> On Tue, Aug 13, 2019 at 11:11:14PM +0000, Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 8/13/19 11:47 AM, Carlos Neira wrote:
>>>>>> From: Carlos <cneirabustos@gmail.com>
>>>>>>
>>>>>> New bpf helper bpf_get_current_pidns_info.
>>>>>> This helper obtains the active namespace from current and returns
>>>>>> pid, tgid, device and namespace id as seen from that namespace,
>>>>>> allowing to instrument a process inside a container.
>>>>>>
>>>>>> Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
>>>>>> ---
>>>>>> fs/internal.h | 2 --
>>>>>> fs/namei.c | 1 -
>>>>>> include/linux/bpf.h | 1 +
>>>>>> include/linux/namei.h | 4 +++
>>>>>> include/uapi/linux/bpf.h | 31 ++++++++++++++++++++++-
>>>>>> kernel/bpf/core.c | 1 +
>>>>>> kernel/bpf/helpers.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> kernel/trace/bpf_trace.c | 2 ++
>>>>>> 8 files changed, 102 insertions(+), 4 deletions(-)
>>>>>>
>>> [...]
>>>>>>
>>>>>> +BPF_CALL_2(bpf_get_current_pidns_info, struct bpf_pidns_info *, pidns_info, u32,
>>>>>> + size)
>>>>>> +{
>>>>>> + const char *pidns_path = "/proc/self/ns/pid";
>>>>>> + struct pid_namespace *pidns = NULL;
>>>>>> + struct filename *tmp = NULL;
>>>>>> + struct inode *inode;
>>>>>> + struct path kp;
>>>>>> + pid_t tgid = 0;
>>>>>> + pid_t pid = 0;
>>>>>> + int ret;
>>>>>> + int len;
>>>>>
>>>>
>>>> Thank you very much for catching this!
>>>> Could you share how to reproduce this bug?
>>>
>>> The config is attached. Just run trace_ns_info and you
>>> can reproduce the issue.
>>>
>>>>
>>>>> I am running your sample program and get the following kernel bug:
>>>>>
>>>>> ...
>>>>> [ 26.414825] BUG: sleeping function called from invalid context at /data/users/yhs/work/net-next/fs/dcache.c:843
>>>>> [ 26.416314] in_atomic(): 1, irqs_disabled(): 0, pid: 1911, name: ping
>>>>> [ 26.417189] CPU: 0 PID: 1911 Comm: ping Tainted: G W 5.3.0-rc1+ #280
>>>>> [ 26.418182] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.el7.centos 04/01/2014
>>>>> [ 26.419393] Call Trace:
>>>>> [ 26.419697] <IRQ>
>>>>> [ 26.419960] dump_stack+0x46/0x5b
>>>>> [ 26.420434] ___might_sleep+0xe4/0x110
>>>>> [ 26.420894] dput+0x2a/0x200
>>>>> [ 26.421265] walk_component+0x10c/0x280
>>>>> [ 26.421773] link_path_walk+0x327/0x560
>>>>> [ 26.422280] ? proc_ns_dir_readdir+0x1a0/0x1a0
>>>>> [ 26.422848] ? path_init+0x232/0x330
>>>>> [ 26.423364] path_lookupat+0x88/0x200
>>>>> [ 26.423808] ? selinux_parse_skb.constprop.69+0x124/0x430
>>>>> [ 26.424521] filename_lookup+0xaf/0x190
>>>>> [ 26.425031] ? simple_attr_release+0x20/0x20
>>>>> [ 26.425560] bpf_get_current_pidns_info+0xfa/0x190
>>>>> [ 26.426168] bpf_prog_83627154cefed596+0xe66/0x1000
>>>>> [ 26.426779] trace_call_bpf+0xb5/0x160
>>>>> [ 26.427317] ? __netif_receive_skb_core+0x1/0xbb0
>>>>> [ 26.427929] ? __netif_receive_skb_core+0x1/0xbb0
>>>>> [ 26.428496] kprobe_perf_func+0x4d/0x280
>>>>> [ 26.428986] ? tracing_record_taskinfo_skip+0x1a/0x30
>>>>> [ 26.429584] ? tracing_record_taskinfo+0xe/0x80
>>>>> [ 26.430152] ? ttwu_do_wakeup.isra.114+0xcf/0xf0
>>>>> [ 26.430737] ? __netif_receive_skb_core+0x1/0xbb0
>>>>> [ 26.431334] ? __netif_receive_skb_core+0x5/0xbb0
>>>>> [ 26.431930] kprobe_ftrace_handler+0x90/0xf0
>>>>> [ 26.432495] ftrace_ops_assist_func+0x63/0x100
>>>>> [ 26.433060] 0xffffffffc03180bf
>>>>> [ 26.433471] ? __netif_receive_skb_core+0x1/0xbb0
>>>>> ...
>>>>>
>>>>> To avoid running in an arbitrary task context (e.g., the idle task),
>>>>> which may introduce sleeping issues, the following check is
>>>>> probably appropriate:
>>>>>
>>>>> if (in_nmi() || in_softirq())
>>>>>         return -EPERM;
>>>>>
>>>>> In any case, in NMI or softirq context the namespace and pid/tgid
>>>>> we get may be only accidentally associated with the context the bpf
>>>>> program is running in; they may belong to a different context. So
>>>>> such info is not reliable anyway.
>>>>>
>>>>>> +
>>>>>> + if (unlikely(size != sizeof(struct bpf_pidns_info)))
>>>>>> + return -EINVAL;
>>>>>> + pidns = task_active_pid_ns(current);
>>> [...]
>>>
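
The in_nmi() || in_softirq() guard suggested in the quoted review can be
modeled in user space to make its effect concrete. The sketch below is only
an illustration, not kernel code: every model_* name and MODEL_* constant is
hypothetical, and the bit layout is a simplified stand-in for the kernel's
preempt_count word. It shows the early -EPERM exit only, not the namespace
lookup that the proposed helper would perform afterwards.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical, simplified model of the context bits the kernel keeps in
 * preempt_count. The exact layout here is an assumption for illustration.
 */
#define MODEL_SOFTIRQ_OFFSET 0x00000100u /* one softirq nesting level */
#define MODEL_SOFTIRQ_MASK   0x0000ff00u /* all softirq count bits */
#define MODEL_NMI_MASK       0x00100000u /* NMI bit */

/* Stand-in for the per-CPU preempt_count; set by the caller of the model. */
static unsigned int model_preempt_count;

/* Nonzero when the modeled context is softirq (or has softirqs disabled). */
static inline int model_in_softirq(void)
{
	return (model_preempt_count & MODEL_SOFTIRQ_MASK) != 0;
}

/* Nonzero when the modeled context is NMI. */
static inline int model_in_nmi(void)
{
	return (model_preempt_count & MODEL_NMI_MASK) != 0;
}

/*
 * Models only the guard from the review: refuse to run where sleeping is
 * not allowed and where "current" is not a meaningful task to inspect.
 * Returns 0 when the (not modeled) lookup would be allowed to proceed.
 */
static inline int model_pidns_info_guard(void)
{
	if (model_in_nmi() || model_in_softirq())
		return -EPERM;
	return 0;
}

/* Small usage example: process context passes, softirq and NMI do not. */
static inline void model_selfcheck(void)
{
	model_preempt_count = 0;
	assert(model_pidns_info_guard() == 0);

	model_preempt_count = MODEL_SOFTIRQ_OFFSET;
	assert(model_pidns_info_guard() == -EPERM);

	model_preempt_count = MODEL_NMI_MASK;
	assert(model_pidns_info_guard() == -EPERM);
}
```

With the guard first in the helper body, a program attached to a probe that
fires in softirq context (as in the quoted call trace) would get -EPERM
back instead of reaching filename_lookup.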
Thread overview: 16+ messages
2019-08-13 18:47 [PATCH bpf-next V9 0/3] BPF: New helper to obtain namespace data from current task Carlos Neira
2019-08-13 18:47 ` [PATCH bpf-next V9 1/3] bpf: new " Carlos Neira
2019-08-13 22:35 ` Yonghong Song
2019-08-20 15:10 ` Carlos Antonio Neira Bustos
2019-08-20 17:29 ` Yonghong Song
2019-08-13 23:11 ` Yonghong Song
2019-08-13 23:51 ` [Potential Spoof] " Yonghong Song
2019-08-14 0:56 ` Carlos Antonio Neira Bustos
[not found] ` <9a2cacad-b79f-5d39-6d62-bb48cbaaac07@fb.com>
[not found] ` <CACiB22jyN9=0ATWWE+x=BoWD6u+8KO+MvBfsFQmcNfkmANb2_w@mail.gmail.com>
2019-08-28 20:39 ` Carlos Antonio Neira Bustos
2019-08-28 20:53 ` Yonghong Song [this message]
2019-08-28 21:03 ` Carlos Antonio Neira Bustos
2019-09-03 18:45 ` Carlos Antonio Neira Bustos
2019-09-03 20:36 ` Yonghong Song
2019-08-13 18:47 ` [PATCH bpf-next V9 2/3] samples/bpf: added sample code for bpf_get_current_pidns_info Carlos Neira
2019-08-13 18:47 ` [PATCH bpf-next V9 3/3] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
2019-08-13 23:19 ` Yonghong Song