Date: Wed, 20 Oct 2021 19:24:57 +0000
From: Sargun Dhillon
To: Sergey Ryazanov
Cc: LKML, netdev, Christian Brauner
Subject: Re: Retrieving the network namespace of a socket
Message-ID: <20211020192456.GA23489@ircssh-2.c.rugged-nimbus-611.internal>
References: <20211020095707.GA16295@ircssh-2.c.rugged-nimbus-611.internal> <20211020163417.GA21040@ircssh-2.c.rugged-nimbus-611.internal>
In-Reply-To: <20211020163417.GA21040@ircssh-2.c.rugged-nimbus-611.internal>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 20, 2021 at 04:34:18PM +0000, Sargun Dhillon wrote:
> On Wed, Oct 20, 2021 at 05:03:56PM +0300, Sergey Ryazanov wrote:
> > Hello Sargun,
> >
> > On Wed, Oct 20, 2021 at 12:57 PM Sargun Dhillon wrote:
> > > I'm working on a problem where I need to determine which network namespace a
> > > given socket is in. I can currently bruteforce this by using INET_DIAG, and
> > > enumerating namespaces and working backwards.
> >
> > Namespace is not a per-socket, but a per-process attribute.
> > So each socket of a process belongs to the same namespace.
> >
> > Could you elaborate what kind of problem you are trying to solve?
> > Maybe there is a simpler solution for it.
> >
> > --
> > Sergey
>
> That's not entirely true. See the following code:
>
> #define _GNU_SOURCE
> #include <assert.h>
> #include <sched.h>
> #include <sys/socket.h>
>
> int main() {
> 	int fd1, fd2;
> 	fd1 = socket(AF_INET, SOCK_STREAM, 0);
> 	assert(fd1 >= 0);
> 	assert(unshare(CLONE_NEWNET) == 0);
> 	fd2 = socket(AF_INET, SOCK_STREAM, 0);
> 	assert(fd2 >= 0);
> }
>
> fd1 and fd2 have different sock_net.
>
> The context for this is:
> https://linuxplumbersconf.org/event/11/contributions/932/
>
> We need to figure out, for a given socket, if it has reachability to a given IP.

So, I was lazy / misread documentation. It turns out SIOCGSKNS does exactly what
I need. Nonetheless, it's a little weird and awkward that it exists. I was
wondering if this functionality made sense as part of kcmp.

I wrote up a quick patch to see if anyone was interested:

diff --git a/include/uapi/linux/kcmp.h b/include/uapi/linux/kcmp.h
index ef1305010925..d6b9c3923d20 100644
--- a/include/uapi/linux/kcmp.h
+++ b/include/uapi/linux/kcmp.h
@@ -14,6 +14,7 @@ enum kcmp_type {
 	KCMP_IO,
 	KCMP_SYSVSEM,
 	KCMP_EPOLL_TFD,
+	KCMP_NETNS,
 
 	KCMP_TYPES,
 };
diff --git a/kernel/kcmp.c b/kernel/kcmp.c
index 5353edfad8e1..8fadae4b588f 100644
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -18,6 +18,8 @@
 #include
 
 #include
+#include
+#include
 
 /*
  * We don't expose the real in-memory order of objects for security reasons.
@@ -132,6 +134,58 @@ static int kcmp_epoll_target(struct task_struct *task1,
 }
 #endif
 
+#ifdef CONFIG_NET
+static int __kcmp_netns_target(struct task_struct *task1,
+			       struct task_struct *task2,
+			       struct file *filp1,
+			       struct file *filp2)
+{
+	struct socket *sock1, *sock2;
+	struct net *net1, *net2;
+
+	sock1 = sock_from_file(filp1);
+	sock2 = sock_from_file(filp2);
+	if (!sock1 || !sock2)
+		return -ENOTSOCK;
+
+	net1 = sock_net(sock1->sk);
+	net2 = sock_net(sock2->sk);
+
+	return kcmp_ptr(net1, net2, KCMP_NETNS);
+}
+
+static int kcmp_netns_target(struct task_struct *task1,
+			     struct task_struct *task2,
+			     unsigned long idx1,
+			     unsigned long idx2)
+{
+	struct file *filp1, *filp2;
+
+	int ret = -EBADF;
+
+	filp1 = fget_task(task1, idx1);
+	if (filp1) {
+		filp2 = fget_task(task2, idx2);
+		if (filp2) {
+			ret = __kcmp_netns_target(task1, task2, filp1, filp2);
+			fput(filp2);
+		}
+
+		fput(filp1);
+	}
+
+	return ret;
+}
+#else
+static int kcmp_netns_target(struct task_struct *task1,
+			     struct task_struct *task2,
+			     unsigned long idx1,
+			     unsigned long idx2)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
 SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t, pid2, int, type,
 		unsigned long, idx1, unsigned long, idx2)
 {
@@ -206,6 +260,9 @@ SYSCALL_DEFINE5(kcmp, pid_t, pid1, pid_t, pid2, int, type,
 	case KCMP_EPOLL_TFD:
 		ret = kcmp_epoll_target(task1, task2, idx1, (void *)idx2);
 		break;
+	case KCMP_NETNS:
+		ret = kcmp_netns_target(task1, task2, idx1, idx2);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
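
For comparison, here is a rough userspace sketch of the same check on an unpatched kernel, using the SIOCGSKNS ioctl mentioned above. It is not part of the patch. The helper names sockets_share_netns() and close_keep_errno() are made up for illustration, and the fallback value for SIOCGSKNS is only there in case older uapi headers lack the define. The sketch assumes the caller has CAP_NET_ADMIN over the user namespace owning the socket's netns (otherwise the ioctl should fail with EPERM), and it compares the two returned namespace fds by device and inode number, the way ioctl_ns(2) suggests for namespace fds.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/sockios.h>

#ifndef SIOCGSKNS
#define SIOCGSKNS 0x894C	/* fallback for older uapi headers */
#endif

/* Close fd without clobbering errno from an earlier failure. */
static void close_keep_errno(int fd)
{
	int saved = errno;

	close(fd);
	errno = saved;
}

/*
 * Hypothetical helper, not taken from the patch.
 * Returns 1 if both sockets are in the same network namespace,
 * 0 if they are not, and -1 on error with errno set.
 */
static int sockets_share_netns(int fd1, int fd2)
{
	struct stat st1, st2;
	int ns1, ns2, ret = -1;

	/* Each successful call returns a new fd referring to the netns. */
	ns1 = ioctl(fd1, SIOCGSKNS);
	if (ns1 < 0)
		return -1;

	ns2 = ioctl(fd2, SIOCGSKNS);
	if (ns2 < 0) {
		close_keep_errno(ns1);
		return -1;
	}

	/* Two namespace fds name the same namespace iff dev and ino match. */
	if (fstat(ns1, &st1) == 0 && fstat(ns2, &st2) == 0)
		ret = st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino;

	close_keep_errno(ns2);
	close_keep_errno(ns1);
	return ret;
}
```

With the fd1/fd2 pair from the quoted example (one socket created before unshare(CLONE_NEWNET) and one after), this helper would be expected to return 0 when run with sufficient privileges. Unlike the kcmp approach, it only works on file descriptors the caller already holds, and it costs two extra fds per comparison.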