linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: akpm@linux-foundation.org, Denis Kirjanov <kda@linux-powerpc.org>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Will Deacon" <will@kernel.org>, "Petr Mladek" <pmladek@suse.com>,
	"Jann Horn" <jannh@google.com>, "Ingo Molnar" <mingo@kernel.org>,
	"Sergey Senozhatsky" <sergey.senozhatsky@gmail.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: [PATCH 3.16 10/83] sched/fair: Don't free p->numa_faults with concurrent readers
Date: Wed, 20 Nov 2019 15:37:20 +0000	[thread overview]
Message-ID: <lsq.1574264230.654885483@decadent.org.uk> (raw)
In-Reply-To: <lsq.1574264230.280218497@decadent.org.uk>

3.16.78-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jann Horn <jannh@google.com>

commit 16d51a590a8ce3befb1308e0e7ab77f3b661af33 upstream.

When going through execve(), zero out the NUMA fault statistics instead of
freeing them.

During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.

Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 82727018b0d3 ("sched/numa: Call task_numa_free() from do_execve()")
Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1589,7 +1589,7 @@ static int do_execve_common(struct filen
 	current->fs->in_exec = 0;
 	current->in_execve = 0;
 	acct_update_integrals(current);
-	task_numa_free(current);
+	task_numa_free(current, false);
 	free_bprm(bprm);
 	putname(filename);
 	if (displaced)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1671,7 +1671,7 @@ struct task_struct {
 extern void task_numa_fault(int last_node, int node, int pages, int flags);
 extern pid_t task_numa_group_id(struct task_struct *p);
 extern void set_numabalancing_state(bool enabled);
-extern void task_numa_free(struct task_struct *p);
+extern void task_numa_free(struct task_struct *p, bool final);
 extern bool should_numa_migrate_memory(struct task_struct *p, struct page *page,
 					int src_nid, int dst_cpu);
 #else
@@ -1686,7 +1686,7 @@ static inline pid_t task_numa_group_id(s
 static inline void set_numabalancing_state(bool enabled)
 {
 }
-static inline void task_numa_free(struct task_struct *p)
+static inline void task_numa_free(struct task_struct *p, bool final)
 {
 }
 static inline bool should_numa_migrate_memory(struct task_struct *p,
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -242,7 +242,7 @@ void __put_task_struct(struct task_struc
 	WARN_ON(atomic_read(&tsk->usage));
 	WARN_ON(tsk == current);
 
-	task_numa_free(tsk);
+	task_numa_free(tsk, true);
 	security_task_free(tsk);
 	exit_creds(tsk);
 	delayacct_tsk_free(tsk);
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1747,13 +1747,23 @@ no_join:
 	return;
 }
 
-void task_numa_free(struct task_struct *p)
+/*
+ * Get rid of NUMA staticstics associated with a task (either current or dead).
+ * If @final is set, the task is dead and has reached refcount zero, so we can
+ * safely free all relevant data structures. Otherwise, there might be
+ * concurrent reads from places like load balancing and procfs, and we should
+ * reset the data back to default state without freeing ->numa_faults.
+ */
+void task_numa_free(struct task_struct *p, bool final)
 {
 	struct numa_group *grp = p->numa_group;
-	void *numa_faults = p->numa_faults_memory;
+	unsigned long *numa_faults = p->numa_faults_memory;
 	unsigned long flags;
 	int i;
 
+	if (!numa_faults)
+		return;
+
 	if (grp) {
 		spin_lock_irqsave(&grp->lock, flags);
 		for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++)
@@ -1767,11 +1777,17 @@ void task_numa_free(struct task_struct *
 		put_numa_group(grp);
 	}
 
-	p->numa_faults_memory = NULL;
-	p->numa_faults_buffer_memory = NULL;
-	p->numa_faults_cpu= NULL;
-	p->numa_faults_buffer_cpu = NULL;
-	kfree(numa_faults);
+	if (final) {
+		p->numa_faults_memory = NULL;
+		p->numa_faults_buffer_memory = NULL;
+		p->numa_faults_cpu = NULL;
+		p->numa_faults_buffer_cpu = NULL;
+		kfree(numa_faults);
+	} else {
+		p->total_numa_faults = 0;
+		for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++)
+			numa_faults[i] = 0;
+	}
 }
 
 /*


  parent reply	other threads:[~2019-11-20 15:40 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20 15:37 [PATCH 3.16 00/83] 3.16.78-rc1 review Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 01/83] hwmon: (nct6775) Fix register address and added missed tolerance for nct6106 Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 02/83] x86/sysfb_efi: Add quirks for some devices with swapped width and height Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 03/83] mmc: mmc_spi: Enable stable writes Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 04/83] ALSA: compress: Fix regression on compressed capture streams Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 05/83] can: peak_usb: fix potential double kfree_skb() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 06/83] usb: pci-quirks: Correct AMD PLL quirk detection Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 07/83] usb: wusbcore: fix unbalanced get/put cluster_id Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 08/83] x86/speculation/mds: Apply more accurate check on hypervisor platform Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 09/83] hpet: Fix division by zero in hpet_time_div() Ben Hutchings
2019-11-20 15:37 ` Ben Hutchings [this message]
2019-11-20 15:37 ` [PATCH 3.16 11/83] tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 12/83] bnx2x: Disable multi-cos feature Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 13/83] arm64: compat: Allow single-byte watchpoints on all addresses Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 14/83] net: sched: Fix a possible null-pointer dereference in dequeue_func() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 15/83] net: fix ifindex collision during namespace removal Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 16/83] libata: zpodd: Fix small read overflow in zpodd_get_mech_type() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 17/83] selinux: fix memory leak in policydb_init() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 18/83] net: bridge: mcast: don't delete permanent entries when fast leave is enabled Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 19/83] xen/swiotlb: fix condition for calling xen_destroy_contiguous_region() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 20/83] s390/dasd: fix endless loop after read unit address configuration Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 21/83] can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 22/83] asm-generic: fix -Wtype-limits compiler warnings Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 23/83] NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 24/83] USB: serial: option: Add support for ZTE MF871A Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 25/83] usb: yurex: Fix use-after-free in yurex_delete Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 26/83] SMB3: Fix deadlock in validate negotiate hits reconnect Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 27/83] smb3: send CAP_DFS capability during session setup Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 28/83] sound: fix a memory leak bug Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 29/83] ALSA: firewire: " Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 30/83] ALSA: hda - Fix " Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 31/83] sh: kernel: hw_breakpoint: Fix missing break in switch statement Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 32/83] net: tc35815: Explicitly check NET_IP_ALIGN is not zero in tc35815_rx Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 33/83] staging: comedi: dt3000: Fix signed integer overflow 'divider * base' Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 34/83] staging: comedi: dt3000: Fix rounding up of timer divisor Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 35/83] USB: core: Fix races in character device registration and deregistraion Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 36/83] netfilter: conntrack: Use consistent ct id hash calculation Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 37/83] sctp: fix the transport error_count check Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 38/83] USB: serial: option: Add Motorola modem UARTs Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 39/83] usb: cdc-acm: make sure a refcount is taken early enough Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 40/83] net/packet: fix race in tpacket_snd() Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 41/83] Revert "cfg80211: fix processing world regdomain when non modular" Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 42/83] usb-storage: Add new JMS567 revision to unusual_devs Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 43/83] dm btree: fix order of block initialization in btree_split_beneath Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 44/83] dm space map metadata: fix missing store of apply_bops() return value Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 45/83] dm table: fix invalid memory accesses with too high sector number Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 46/83] x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386 Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 47/83] batman-adv: Only read OGM tvlv_len after buffer len check Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 48/83] ALSA: seq: Fix potential concurrent access to the deleted pool Ben Hutchings
2019-11-20 15:37 ` [PATCH 3.16 49/83] ptrace,x86: Make user_64bit_mode() available to 32-bit builds Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 50/83] uprobes/x86: Fix detection of 32-bit user mode Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 51/83] x86/apic: Do not initialize LDR and DFR for bigsmp Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 52/83] x86/apic: Drop logical_smp_processor_id() inline Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 53/83] x86/apic/32: Avoid bogus LDR warnings Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 54/83] usb: host: ohci: fix a race condition between shutdown and irq Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 55/83] USB: storage: ums-realtek: Update module parameter description for auto_delink_en Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 56/83] USB: storage: ums-realtek: Whitelist auto-delink support Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 57/83] USB: cdc-wdm: fix race between write and disconnect due to flag abuse Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 58/83] VMCI: Release resource if the work is already queued Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 59/83] mld: fix memory leak in mld_del_delrec() Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 60/83] ALSA: hda - Fix potential endless loop at applying quirks Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 61/83] mmc: core: Fix init of SD cards reporting an invalid VDD range Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 62/83] net: seeq: Fix the function used to release some memory in an error handling path Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 63/83] sched/fair: Don't assign runtime for throttled cfs_rq Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 64/83] vhost/test: fix build for vhost test Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 65/83] sctp: use transport pf_retrans in sctp_do_8_2_transport_strike Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 66/83] genirq: Prevent NULL pointer dereference in resend_irqs() Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 67/83] keys: Fix missing null pointer check in request_key_auth_describe() Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 68/83] sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 69/83] tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 70/83] tun: fix use-after-free when register netdev failed Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 71/83] ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()' Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 72/83] net: Fix null de-reference of device refcount Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 73/83] sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()' Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 74/83] KVM: nVMX: handle page fault in vmread Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 75/83] KVM: x86: work around leak of uninitialized stack contents Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 76/83] PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30 Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 77/83] alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 78/83] cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 79/83] CIFS: Fix use after free of file info structures Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 80/83] md/raid: raid5 preserve the writeback action after the parity check Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 81/83] btrfs: partially apply b8b93addde Ben Hutchings
2019-11-20 15:56   ` Hans van Kranenburg
2019-11-21 21:42     ` Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 82/83] btrfs: volumes: Cleanup stripe size calculation Ben Hutchings
2019-11-20 15:38 ` [PATCH 3.16 83/83] btrfs: alloc_chunk: fix more DUP stripe size handling Ben Hutchings
2019-11-20 15:46 ` [PATCH 3.16 00/83] 3.16.78-rc1 review Guenter Roeck
2019-11-20 15:50   ` Ben Hutchings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=lsq.1574264230.654885483@decadent.org.uk \
    --to=ben@decadent.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=jannh@google.com \
    --cc=kda@linux-powerpc.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).