From: Sasha Levin <sasha.levin@oracle.com>
To: stable@vger.kernel.org, stable-commits@vger.kernel.org
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
	Vlad Yasevich <vyasevich@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sasha.levin@oracle.com>
Subject: [added to the 3.18 stable tree] sctp: fix race on protocol/netns initialization
Date: Wed, 28 Oct 2015 01:22:21 -0400
Message-ID: <1446009925-26739-45-git-send-email-sasha.levin@oracle.com>
In-Reply-To: <1446009925-26739-1-git-send-email-sasha.levin@oracle.com>

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

This patch has been added to the 3.18 stable tree. If you have any
objections, please let us know.

===============

[ Upstream commit 8e2d61e0aed2b7c4ecb35844fe07e0b2b762dee4 ]

Consider the case where the sctp module is unloaded and is being
requested because a user is creating an SCTP socket.

During initialization, sctp will add the new protocol type and then
initialize pernet subsys:

        status = sctp_v4_protosw_init();
        if (status)
                goto err_protosw_init;

        status = sctp_v6_protosw_init();
        if (status)
                goto err_v6_protosw_init;

        status = register_pernet_subsys(&sctp_net_ops);

The problem is that after those calls to sctp_v{4,6}_protosw_init(), it
is possible for userspace to create SCTP sockets as if the module were
already fully loaded. If that happens, one possible effect is that
net->sctp.local_addr_list gets readers earlier than expected.
sctp_net_init() takes no precautions while setting up that list, which
can lead to a panic. The damage is not limited to that, as
sctp_sock_init() will also copy a bunch of blank or partially
initialized values from net->sctp.
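
To illustrate why an early reader of the address list is dangerous,
here is a minimal userspace sketch. It is not kernel code: struct
fake_netns_sctp, init_list_head(), list_is_walkable() and
count_entries() are hypothetical stand-ins, and it assumes the
per-netns structure starts out zero-filled before its init routine
runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same effect as INIT_LIST_HEAD(): an empty list points at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical stand-in for the per-netns SCTP state (net->sctp). */
struct fake_netns_sctp {
	struct list_head local_addr_list;
};

/* A head that was never initialized (assumed zero-filled) cannot be walked. */
static int list_is_walkable(const struct list_head *head)
{
	return head->next != NULL && head->prev != NULL;
}

/* Count entries the way a list_for_each() style loop would. */
static size_t count_entries(const struct list_head *head)
{
	size_t n = 0;
	const struct list_head *pos;

	/* Walking a zeroed head would dereference NULL on the first step. */
	assert(list_is_walkable(head));
	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

A reader that reaches count_entries() before init_list_head() has run
trips the assertion here. In the kernel, the equivalent traversal would
follow a NULL pointer instead.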

The race happens like this:

     CPU 0                           |  CPU 1
  socket()                           |
   __sock_create                     | socket()
    inet_create                      |  __sock_create
     list_for_each_entry_rcu(        |
        answer, &inetsw[sock->type], |
        list) {                      |   inet_create
      /* no hits */                  |
     if (unlikely(err)) {            |
      ...                            |
      request_module()               |
      /* socket creation blocks      |
       * until the module is fully   |
       * loaded                      |
       */                            |
       sctp_init                     |
        sctp_v4_protosw_init         |
         inet_register_protosw       |
          list_add_rcu(&p->list,     |
                       last_perm);   |
                                     |  list_for_each_entry_rcu(
                                     |     answer, &inetsw[sock->type],
        sctp_v6_protosw_init         |     list) {
                                     |     /* hit, so assumes protocol
                                     |      * is already loaded
                                     |      */
                                     |  /* socket creation continues
                                     |   * before netns is initialized
                                     |   */
        register_pernet_subsys       |
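
The window shown in the diagram can be modelled in plain userspace C.
This is only a sketch: struct init_state, enum step and
has_race_window() are hypothetical names, and the two flags are a
simplification of what a second socket() caller can observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state visible to a concurrent socket(). */
struct init_state {
	bool protosw_visible;	/* protocol is found by the inetsw lookup */
	bool defaults_ready;	/* per-netns defaults have been initialized */
};

enum step { STEP_PROTOSW, STEP_NETNS_DEFAULTS };

static void apply_step(struct init_state *st, enum step s)
{
	if (s == STEP_PROTOSW)
		st->protosw_visible = true;
	else
		st->defaults_ready = true;
}

/*
 * Returns true if, after any step in the given order, a socket could be
 * created while the per-netns defaults are still uninitialized.
 */
static bool has_race_window(const enum step *order, size_t n)
{
	struct init_state st = { false, false };
	size_t i;

	assert(order != NULL);
	for (i = 0; i < n; i++) {
		apply_step(&st, order[i]);
		if (st.protosw_visible && !st.defaults_ready)
			return true;
	}
	return false;
}
```

With the order from the diagram (STEP_PROTOSW first) the function
reports a window. With the per-netns defaults placed first it reports
none.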

Simply inverting the initialization order between
register_pernet_subsys() and sctp_v4_protosw_init() is not possible
because register_pernet_subsys() will create a control sctp socket, so
the protocol must already be visible by then. Deferring the socket
creation to a workqueue is not a good option, especially because we
would lose the ability to handle its errors.

So, as suggested by Vlad, the fix is to split netns initialization into
two stages: defaults and control socket. The defaults are already
loaded by the time we register the protocol, while control socket
initialization stays at the same point where it is today.
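
The resulting order (defaults, then protocol registration, then control
socket) and its reverse-order unwinding on failure can be sketched in
userspace C. This is an illustration under assumptions, not the kernel
code: enum stage, struct trace and staged_init() are hypothetical, and
each stage is reduced to a log entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage names mirroring the order described above. */
enum stage { DEFAULTS, PROTOSW, CTRLSOCK, NSTAGES };

struct trace {
	int log[2 * NSTAGES];	/* stage+1 on setup, -(stage+1) on teardown */
	size_t len;
};

static void record(struct trace *t, int v)
{
	assert(t->len < (size_t)(2 * NSTAGES));
	t->log[t->len++] = v;
}

/*
 * Run the stages in order. fail_at == NSTAGES means no failure.
 * Returns 0 on success, or -1 after tearing down the completed
 * stages in reverse order.
 */
static int staged_init(struct trace *t, enum stage fail_at)
{
	int s;

	for (s = 0; s < NSTAGES; s++) {
		if (s == (int)fail_at)
			goto unwind;
		record(t, s + 1);
	}
	return 0;

unwind:
	while (--s >= 0)
		record(t, -(s + 1));
	return -1;
}
```

If the control socket stage fails, the trace shows the protocol
registration being undone before the defaults, so no socket can be
created against defaults that have already been torn down.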

Fixes: 4db67e808640 ("sctp: Make the address lists per network namespace")
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
---
 net/sctp/protocol.c | 64 ++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 8f34b27..143c4eb 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1166,7 +1166,7 @@ static void sctp_v4_del_protocol(void)
 	unregister_inetaddr_notifier(&sctp_inetaddr_notifier);
 }
 
-static int __net_init sctp_net_init(struct net *net)
+static int __net_init sctp_defaults_init(struct net *net)
 {
 	int status;
 
@@ -1259,12 +1259,6 @@ static int __net_init sctp_net_init(struct net *net)
 
 	sctp_dbg_objcnt_init(net);
 
-	/* Initialize the control inode/socket for handling OOTB packets.  */
-	if ((status = sctp_ctl_sock_init(net))) {
-		pr_err("Failed to initialize the SCTP control sock\n");
-		goto err_ctl_sock_init;
-	}
-
 	/* Initialize the local address list. */
 	INIT_LIST_HEAD(&net->sctp.local_addr_list);
 	spin_lock_init(&net->sctp.local_addr_lock);
@@ -1280,9 +1274,6 @@ static int __net_init sctp_net_init(struct net *net)
 
 	return 0;
 
-err_ctl_sock_init:
-	sctp_dbg_objcnt_exit(net);
-	sctp_proc_exit(net);
 err_init_proc:
 	cleanup_sctp_mibs(net);
 err_init_mibs:
@@ -1291,15 +1282,12 @@ err_sysctl_register:
 	return status;
 }
 
-static void __net_exit sctp_net_exit(struct net *net)
+static void __net_exit sctp_defaults_exit(struct net *net)
 {
 	/* Free the local address list */
 	sctp_free_addr_wq(net);
 	sctp_free_local_addr_list(net);
 
-	/* Free the control endpoint.  */
-	inet_ctl_sock_destroy(net->sctp.ctl_sock);
-
 	sctp_dbg_objcnt_exit(net);
 
 	sctp_proc_exit(net);
@@ -1307,9 +1295,32 @@ static void __net_exit sctp_net_exit(struct net *net)
 	sctp_sysctl_net_unregister(net);
 }
 
-static struct pernet_operations sctp_net_ops = {
-	.init = sctp_net_init,
-	.exit = sctp_net_exit,
+static struct pernet_operations sctp_defaults_ops = {
+	.init = sctp_defaults_init,
+	.exit = sctp_defaults_exit,
+};
+
+static int __net_init sctp_ctrlsock_init(struct net *net)
+{
+	int status;
+
+	/* Initialize the control inode/socket for handling OOTB packets.  */
+	status = sctp_ctl_sock_init(net);
+	if (status)
+		pr_err("Failed to initialize the SCTP control sock\n");
+
+	return status;
+}
+
+static void __net_exit sctp_ctrlsock_exit(struct net *net)
+{
+	/* Free the control endpoint.  */
+	inet_ctl_sock_destroy(net->sctp.ctl_sock);
+}
+
+static struct pernet_operations sctp_ctrlsock_ops = {
+	.init = sctp_ctrlsock_init,
+	.exit = sctp_ctrlsock_exit,
 };
 
 /* Initialize the universe into something sensible.  */
@@ -1443,8 +1454,11 @@ static __init int sctp_init(void)
 	sctp_v4_pf_init();
 	sctp_v6_pf_init();
 
-	status = sctp_v4_protosw_init();
+	status = register_pernet_subsys(&sctp_defaults_ops);
+	if (status)
+		goto err_register_defaults;
 
+	status = sctp_v4_protosw_init();
 	if (status)
 		goto err_protosw_init;
 
@@ -1452,9 +1466,9 @@ static __init int sctp_init(void)
 	if (status)
 		goto err_v6_protosw_init;
 
-	status = register_pernet_subsys(&sctp_net_ops);
+	status = register_pernet_subsys(&sctp_ctrlsock_ops);
 	if (status)
-		goto err_register_pernet_subsys;
+		goto err_register_ctrlsock;
 
 	status = sctp_v4_add_protocol();
 	if (status)
@@ -1470,12 +1484,14 @@ out:
 err_v6_add_protocol:
 	sctp_v4_del_protocol();
 err_add_protocol:
-	unregister_pernet_subsys(&sctp_net_ops);
-err_register_pernet_subsys:
+	unregister_pernet_subsys(&sctp_ctrlsock_ops);
+err_register_ctrlsock:
 	sctp_v6_protosw_exit();
 err_v6_protosw_init:
 	sctp_v4_protosw_exit();
 err_protosw_init:
+	unregister_pernet_subsys(&sctp_defaults_ops);
+err_register_defaults:
 	sctp_v4_pf_exit();
 	sctp_v6_pf_exit();
 	sctp_sysctl_unregister();
@@ -1508,12 +1524,14 @@ static __exit void sctp_exit(void)
 	sctp_v6_del_protocol();
 	sctp_v4_del_protocol();
 
-	unregister_pernet_subsys(&sctp_net_ops);
+	unregister_pernet_subsys(&sctp_ctrlsock_ops);
 
 	/* Free protosw registrations */
 	sctp_v6_protosw_exit();
 	sctp_v4_protosw_exit();
 
+	unregister_pernet_subsys(&sctp_defaults_ops);
+
 	/* Unregister with socket layer. */
 	sctp_v6_pf_exit();
 	sctp_v4_pf_exit();
-- 
2.1.4

