linux-mips.vger.kernel.org archive mirror
* MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
@ 2016-07-10 13:04 yhb
  2016-07-10 13:04 ` yhb
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: yhb @ 2016-07-10 13:04 UTC (permalink / raw)
  To: ralf; +Cc: linux-mips

From cd1eb951d4a7f01aaa24d2fb902f06b73ef4f608 Mon Sep 17 00:00:00 2001
From: yhb <yhb@ruijie.com.cn>
Date: Sun, 10 Jul 2016 20:43:05 +0800
Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
 when asid_cache(cpu) wraps to 0.

Suppose that asid_cache(cpu) wraps to 0 every n days.
case 1:
(1) Process 1 got ASID 0x101.
(2) Process 1 slept for n days.
(3) asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
(4) Process 1 is woken, and its ASID is the same as that of process 2.

case 2:
(1) Process 1 got ASID 0x101 on CPU 1.
(2) Process 1 migrated to CPU 2.
(3) Process 1 migrated back to CPU 1 after n days.
(4) asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
(5) Process 1 is scheduled, and its ASID is the same as that of process 2.

So we need to clear the MMU contexts of all other processes when
asid_cache(cpu) wraps to 0.
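
To make the collision concrete, here is a rough, self-contained
illustration of the arithmetic (not part of the patch; it assumes 8-bit
hardware ASIDs, so ASID_MASK is 0xff and everything above it is the
version field, and it stands in plain assignments for the real allocation
loop and the version check used when a task is switched in):

#include <stdio.h>

#define ASID_MASK		0xffUL
#define ASID_VERSION_MASK	(~ASID_MASK)

int main(void)
{
	unsigned long asid_cache = 0x101;	/* process 1 gets ASID 0x01, version 1 */
	unsigned long ctx_p1 = asid_cache;	/* remembered in process 1's mm */

	/*
	 * Process 1 sleeps.  The allocator keeps handing out ASIDs, the
	 * counter eventually overflows back to 0 and climbs again, and
	 * process 2 ends up with the very same value.
	 */
	asid_cache = 0x101;			/* process 2 also gets 0x101 */

	/* The stale-context check done when process 1 runs again: */
	if ((ctx_p1 ^ asid_cache) & ASID_VERSION_MASK)
		printf("process 1 would get a fresh ASID - no collision\n");
	else
		printf("process 1 keeps ASID 0x%02lx, colliding with process 2\n",
		       ctx_p1 & ASID_MASK);
	return 0;
}

With the patch applied, the wrap-to-0 path also clears cpu_context() of
every other mm, so a stale value such as ctx_p1 above can never match the
restarted counter.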

Signed-off-by: yhb <yhb@ruijie.com.cn>
---
 arch/blackfin/kernel/trace.c               |  7 ++--
 arch/frv/mm/mmu-context.c                  |  6 ++--
 arch/mips/include/asm/mmu_context.h        | 53 ++++++++++++++++++++++++++++--
 arch/um/kernel/reboot.c                    |  5 +--
 block/blk-cgroup.c                         |  6 ++--
 block/blk-ioc.c                            | 17 ++++++----
 drivers/staging/android/ion/ion.c          |  5 +--
 drivers/staging/android/lowmemorykiller.c  | 15 +++++----
 drivers/staging/lustre/lustre/ptlrpc/sec.c |  5 +--
 drivers/tty/tty_io.c                       |  6 ++--
 fs/coredump.c                              |  5 +--
 fs/exec.c                                  | 17 ++++++----
 fs/file.c                                  | 16 +++++----
 fs/fs_struct.c                             | 16 +++++----
 fs/hugetlbfs/inode.c                       |  6 ++--
 fs/namespace.c                             |  5 +--
 fs/proc/array.c                            |  5 +--
 fs/proc/base.c                             | 40 +++++++++++++---------
 fs/proc/internal.h                         |  5 +--
 fs/proc/proc_net.c                         |  6 ++--
 fs/proc/task_mmu.c                         |  5 +--
 fs/proc_namespace.c                        |  9 ++---
 include/linux/cpuset.h                     |  8 ++---
 include/linux/nsproxy.h                    |  6 ++--
 include/linux/oom.h                        |  3 +-
 include/linux/sched.h                      |  8 ++---
 ipc/namespace.c                            |  5 +--
 kernel/cgroup.c                            |  5 +--
 kernel/cpu.c                               |  5 +--
 kernel/cpuset.c                            |  7 ++--
 kernel/exit.c                              | 19 +++++++----
 kernel/fork.c                              | 32 +++++++++++-------
 kernel/kcmp.c                              |  5 +--
 kernel/nsproxy.c                           |  5 +--
 kernel/ptrace.c                            | 11 ++++---
 kernel/sched/debug.c                       |  5 +--
 kernel/sys.c                               | 16 +++++----
 kernel/utsname.c                           |  5 +--
 mm/memcontrol.c                            |  5 +--
 mm/mempolicy.c                             | 46 ++++++++++++++++----------
 mm/mmu_context.c                           | 10 +++---
 mm/oom_kill.c                              | 37 ++++++++++++---------
 net/core/net_namespace.c                   | 11 ++++---
 net/core/netclassid_cgroup.c               |  6 ++--
 net/core/netprio_cgroup.c                  |  5 +--
 45 files changed, 337 insertions(+), 188 deletions(-)

diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c
index 719dd79..a74843a 100644
--- a/arch/blackfin/kernel/trace.c
+++ b/arch/blackfin/kernel/trace.c
@@ -116,8 +116,9 @@ void decode_address(char *buf, unsigned long address)
 	read_lock(&tasklist_lock);
 	for_each_process(p) {
 		struct task_struct *t;
+		unsigned long irqflags;
 
-		t = find_lock_task_mm(p);
+		t = find_lock_task_mm(p, &irqflags);
 		if (!t)
 			continue;
 
@@ -165,7 +166,7 @@ void decode_address(char *buf, unsigned long address)
 						name, vma->vm_start, vma->vm_end);
 
 				up_read(&mm->mmap_sem);
-				task_unlock(t);
+				task_unlock(t, &irqflags);
 
 				if (buf[0] == '\0')
 					sprintf(buf, "[ %s ] dynamic memory", name);
@@ -176,7 +177,7 @@ void decode_address(char *buf, unsigned long address)
 
 		up_read(&mm->mmap_sem);
 __continue:
-		task_unlock(t);
+		task_unlock(t, &irqflags);
 	}
 
 	/*
diff --git a/arch/frv/mm/mmu-context.c b/arch/frv/mm/mmu-context.c
index 81757d5..dc525bd 100644
--- a/arch/frv/mm/mmu-context.c
+++ b/arch/frv/mm/mmu-context.c
@@ -183,15 +183,17 @@ int cxn_pin_by_pid(pid_t pid)
 	read_lock(&tasklist_lock);
 	tsk = find_task_by_vpid(pid);
 	if (tsk) {
+		unsigned long irqflags;
+
 		ret = -EINVAL;
 
-		task_lock(tsk);
+		task_lock(tsk, &irqflags);
 		if (tsk->mm) {
 			mm = tsk->mm;
 			atomic_inc(&mm->mm_users);
 			ret = 0;
 		}
-		task_unlock(tsk);
+		task_unlock(tsk, &irqflags);
 	}
 	read_unlock(&tasklist_lock);
 
diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h
index 45914b5..68966b5 100644
--- a/arch/mips/include/asm/mmu_context.h
+++ b/arch/mips/include/asm/mmu_context.h
@@ -12,6 +12,7 @@
 #define _ASM_MMU_CONTEXT_H
 
 #include <linux/errno.h>
+#include <linux/oom.h>	/* for find_lock_task_mm() */
 #include <linux/sched.h>
 #include <linux/smp.h>
 #include <linux/slab.h>
@@ -97,6 +98,52 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 #define ASID_VERSION_MASK  ((unsigned long)~(ASID_MASK|(ASID_MASK-1)))
 #define ASID_FIRST_VERSION ((unsigned long)(~ASID_VERSION_MASK) + 1)
 
+/*
+ * Yu Huabing
+ * Suppose that asid_cache(cpu) wraps to 0 every n days.
+ * case 1:
+ * (1) Process 1 got ASID 0x101.
+ * (2) Process 1 slept for n days.
+ * (3) asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
+ * (4) Process 1 is woken, and its ASID is the same as that of process 2.
+ *
+ * case 2:
+ * (1) Process 1 got ASID 0x101 on CPU 1.
+ * (2) Process 1 migrated to CPU 2.
+ * (3) Process 1 migrated back to CPU 1 after n days.
+ * (4) asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
+ * (5) Process 1 is scheduled, and its ASID is the same as that of process 2.
+ *
+ * So we need to clear the MMU contexts of all other processes when
+ * asid_cache(cpu) wraps to 0.
+ *
+ * This function might be called from hardirq context or process context.
+ */
+static inline void clear_other_mmu_contexts(struct mm_struct *mm,
+						unsigned long cpu)
+{
+	struct task_struct *p;
+	unsigned long irqflags;
+
+	read_lock(&tasklist_lock);
+	for_each_process(p) {
+		struct task_struct *t;
+
+		/*
+		 * Main thread might exit, but other threads may still have
+		 * a valid mm. Find one.
+		 */
+		t = find_lock_task_mm(p, &irqflags);
+		if (!t)
+			continue;
+
+		if (t->mm != mm)
+			cpu_context(cpu, t->mm) = 0;
+		task_unlock(t, &irqflags);
+	}
+	read_unlock(&tasklist_lock);
+}
+
 /* Normal, classic MIPS get_new_mmu_context */
 static inline void
 get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
@@ -112,8 +159,10 @@ get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
 #else
 		local_flush_tlb_all();	/* start new asid cycle */
 #endif
-		if (!asid)		/* fix version if needed */
-			asid = ASID_FIRST_VERSION;
+		if (!asid) {
+			asid = ASID_FIRST_VERSION; /* fix version if needed */
+			clear_other_mmu_contexts(mm, cpu);
+		}
 	}
 
 	cpu_context(cpu, mm) = asid_cache(cpu) = asid;
diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index b60a9f8..452bd01 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -22,12 +22,13 @@ static void kill_off_processes(void)
 	read_lock(&tasklist_lock);
 	for_each_process(p) {
 		struct task_struct *t;
+		unsigned long irqflags;
 
-		t = find_lock_task_mm(p);
+		t = find_lock_task_mm(p, &irqflags);
 		if (!t)
 			continue;
 		pid = t->mm->context.id.u.pid;
-		task_unlock(t);
+		task_unlock(t, &irqflags);
 		os_kill_ptraced_process(pid, 1);
 	}
 	read_unlock(&tasklist_lock);
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 66e6f1a..3ffeb70 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1145,11 +1145,13 @@ static int blkcg_can_attach(struct cgroup_taskset *tset)
 
 	/* task_lock() is needed to avoid races with exit_io_context() */
 	cgroup_taskset_for_each(task, dst_css, tset) {
-		task_lock(task);
+		unsigned long irqflags;
+
+		task_lock(task, &irqflags);
 		ioc = task->io_context;
 		if (ioc && atomic_read(&ioc->nr_tasks) > 1)
 			ret = -EINVAL;
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 		if (ret)
 			break;
 	}
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 381cb50..4add102 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -200,11 +200,12 @@ retry:
 void exit_io_context(struct task_struct *task)
 {
 	struct io_context *ioc;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	ioc = task->io_context;
 	task->io_context = NULL;
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	atomic_dec(&ioc->nr_tasks);
 	put_io_context_active(ioc);
@@ -235,6 +236,7 @@ int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
 {
 	struct io_context *ioc;
 	int ret;
+	unsigned long irqflags;
 
 	ioc = kmem_cache_alloc_node(iocontext_cachep, gfp_flags | __GFP_ZERO,
 				    node);
@@ -257,7 +259,7 @@ int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
 	 * path may issue IOs from e.g. exit_files().  The exit path is
 	 * responsible for not issuing IO after exit_io_context().
 	 */
-	task_lock(task);
+	task_lock(task, &irqflags);
 	if (!task->io_context &&
 	    (task == current || !(task->flags & PF_EXITING)))
 		task->io_context = ioc;
@@ -266,7 +268,7 @@ int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
 
 	ret = task->io_context ? 0 : -EBUSY;
 
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return ret;
 }
@@ -288,18 +290,19 @@ struct io_context *get_task_io_context(struct task_struct *task,
 				       gfp_t gfp_flags, int node)
 {
 	struct io_context *ioc;
+	unsigned long irqflags;
 
 	might_sleep_if(gfpflags_allow_blocking(gfp_flags));
 
 	do {
-		task_lock(task);
+		task_lock(task, &irqflags);
 		ioc = task->io_context;
 		if (likely(ioc)) {
 			get_io_context(ioc);
-			task_unlock(task);
+			task_unlock(task, &irqflags);
 			return ioc;
 		}
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 	} while (!create_task_io_context(task, gfp_flags, node));
 
 	return NULL;
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 8536567..7560f2f 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -806,6 +806,7 @@ struct ion_client *ion_client_create(struct ion_device *dev,
 	struct rb_node *parent = NULL;
 	struct ion_client *entry;
 	pid_t pid;
+	unsigned long irqflags;
 
 	if (!name) {
 		pr_err("%s: Name cannot be null\n", __func__);
@@ -813,7 +814,7 @@ struct ion_client *ion_client_create(struct ion_device *dev,
 	}
 
 	get_task_struct(current->group_leader);
-	task_lock(current->group_leader);
+	task_lock(current->group_leader, &irqflags);
 	pid = task_pid_nr(current->group_leader);
 	/*
 	 * don't bother to store task struct for kernel threads,
@@ -825,7 +826,7 @@ struct ion_client *ion_client_create(struct ion_device *dev,
 	} else {
 		task = current->group_leader;
 	}
-	task_unlock(current->group_leader);
+	task_unlock(current->group_leader, &irqflags);
 
 	client = kzalloc(sizeof(struct ion_client), GFP_KERNEL);
 	if (!client)
diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 2509e5d..963aab9 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -123,27 +123,28 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
 	for_each_process(tsk) {
 		struct task_struct *p;
 		short oom_score_adj;
+		unsigned long irqflags;
 
 		if (tsk->flags & PF_KTHREAD)
 			continue;
 
-		p = find_lock_task_mm(tsk);
+		p = find_lock_task_mm(tsk, &irqflags);
 		if (!p)
 			continue;
 
 		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
 		    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
-			task_unlock(p);
+			task_unlock(p, &irqflags);
 			rcu_read_unlock();
 			return 0;
 		}
 		oom_score_adj = p->signal->oom_score_adj;
 		if (oom_score_adj < min_score_adj) {
-			task_unlock(p);
+			task_unlock(p, &irqflags);
 			continue;
 		}
 		tasksize = get_mm_rss(p->mm);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 		if (tasksize <= 0)
 			continue;
 		if (selected) {
@@ -160,7 +161,9 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
 			     p->comm, p->pid, oom_score_adj, tasksize);
 	}
 	if (selected) {
-		task_lock(selected);
+		unsigned long irqflags;
+
+		task_lock(selected, &irqflags);
 		send_sig(SIGKILL, selected, 0);
 		/*
 		 * FIXME: lowmemorykiller shouldn't abuse global OOM killer
@@ -169,7 +172,7 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
 		 */
 		if (selected->mm)
 			mark_oom_victim(selected);
-		task_unlock(selected);
+		task_unlock(selected, &irqflags);
 		lowmem_print(1, "Killing '%s' (%d), adj %hd,\n"
 				 "   to free %ldkB on behalf of '%s' (%d) because\n"
 				 "   cache %ldkB is below limit %ldkB for oom_score_adj %hd\n"
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 187fd1d..74549d3 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -2193,6 +2193,7 @@ EXPORT_SYMBOL(sptlrpc_current_user_desc_size);
 int sptlrpc_pack_user_desc(struct lustre_msg *msg, int offset)
 {
 	struct ptlrpc_user_desc *pud;
+	unsigned long irqflags;
 
 	pud = lustre_msg_buf(msg, offset, 0);
 
@@ -2203,12 +2204,12 @@ int sptlrpc_pack_user_desc(struct lustre_msg *msg, int offset)
 	pud->pud_cap = cfs_curproc_cap_pack();
 	pud->pud_ngroups = (msg->lm_buflens[offset] - sizeof(*pud)) / 4;
 
-	task_lock(current);
+	task_lock(current, &irqflags);
 	if (pud->pud_ngroups > current_ngroups)
 		pud->pud_ngroups = current_ngroups;
 	memcpy(pud->pud_groups, current_cred()->group_info->blocks[0],
 	       pud->pud_ngroups * sizeof(__u32));
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 
 	return 0;
 }
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 24d5491..d10475b 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3085,20 +3085,22 @@ void __do_SAK(struct tty_struct *tty)
 
 	/* Now kill any processes that happen to have the tty open */
 	do_each_thread(g, p) {
+		unsigned long irqflags;
+
 		if (p->signal->tty == tty) {
 			tty_notice(tty, "SAK: killed process %d (%s): by controlling tty\n",
 				   task_pid_nr(p), p->comm);
 			send_sig(SIGKILL, p, 1);
 			continue;
 		}
-		task_lock(p);
+		task_lock(p, &irqflags);
 		i = iterate_fd(p->files, 0, this_tty, tty);
 		if (i != 0) {
 			tty_notice(tty, "SAK: killed process %d (%s): by fd#%d\n",
 				   task_pid_nr(p), p->comm, i - 1);
 			force_sig(SIGKILL, p);
 		}
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	} while_each_thread(g, p);
 	read_unlock(&tasklist_lock);
 #endif
diff --git a/fs/coredump.c b/fs/coredump.c
index 47c32c3..5122e6f 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -703,10 +703,11 @@ void do_coredump(const siginfo_t *siginfo)
 			 * root directory of init_task.
 			 */
 			struct path root;
+			unsigned long irqflags;
 
-			task_lock(&init_task);
+			task_lock(&init_task, &irqflags);
 			get_fs_root(init_task.fs, &root);
-			task_unlock(&init_task);
+			task_unlock(&init_task, &irqflags);
 			cprm.file = file_open_root(root.dentry, root.mnt,
 				cn.corename, open_flags, 0600);
 			path_put(&root);
diff --git a/fs/exec.c b/fs/exec.c
index c4010b8..7e95215 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -940,6 +940,7 @@ static int exec_mmap(struct mm_struct *mm)
 {
 	struct task_struct *tsk;
 	struct mm_struct *old_mm, *active_mm;
+	unsigned long irqflags;
 
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
@@ -960,14 +961,14 @@ static int exec_mmap(struct mm_struct *mm)
 			return -EINTR;
 		}
 	}
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	active_mm = tsk->active_mm;
 	tsk->mm = mm;
 	tsk->active_mm = mm;
 	activate_mm(active_mm, mm);
 	tsk->mm->vmacache_seqnum = 0;
 	vmacache_flush(tsk);
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	if (old_mm) {
 		up_read(&old_mm->mmap_sem);
 		BUG_ON(active_mm != old_mm);
@@ -1153,10 +1154,12 @@ killed:
 
 char *get_task_comm(char *buf, struct task_struct *tsk)
 {
+	unsigned long irqflags;
+
 	/* buf must be at least sizeof(tsk->comm) in size */
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	strncpy(buf, tsk->comm, sizeof(tsk->comm));
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	return buf;
 }
 EXPORT_SYMBOL_GPL(get_task_comm);
@@ -1168,10 +1171,12 @@ EXPORT_SYMBOL_GPL(get_task_comm);
 
 void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec)
 {
-	task_lock(tsk);
+	unsigned long irqflags;
+
+	task_lock(tsk, &irqflags);
 	trace_task_rename(tsk, buf);
 	strlcpy(tsk->comm, buf, sizeof(tsk->comm));
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	perf_event_comm(tsk, exec);
 }
 
diff --git a/fs/file.c b/fs/file.c
index 1fbc5c0..19cbf9b 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -418,12 +418,13 @@ static struct fdtable *close_files(struct files_struct * files)
 struct files_struct *get_files_struct(struct task_struct *task)
 {
 	struct files_struct *files;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	files = task->files;
 	if (files)
 		atomic_inc(&files->count);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return files;
 }
@@ -444,11 +445,12 @@ void reset_files_struct(struct files_struct *files)
 {
 	struct task_struct *tsk = current;
 	struct files_struct *old;
+	unsigned long irqflags;
 
 	old = tsk->files;
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	tsk->files = files;
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	put_files_struct(old);
 }
 
@@ -457,9 +459,11 @@ void exit_files(struct task_struct *tsk)
 	struct files_struct * files = tsk->files;
 
 	if (files) {
-		task_lock(tsk);
+		unsigned long irqflags;
+
+		task_lock(tsk, &irqflags);
 		tsk->files = NULL;
-		task_unlock(tsk);
+		task_unlock(tsk, &irqflags);
 		put_files_struct(files);
 	}
 }
diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 7dca743..426dab4 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -58,10 +58,11 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root)
 	struct task_struct *g, *p;
 	struct fs_struct *fs;
 	int count = 0;
+	unsigned long irqflags;
 
 	read_lock(&tasklist_lock);
 	do_each_thread(g, p) {
-		task_lock(p);
+		task_lock(p, &irqflags);
 		fs = p->fs;
 		if (fs) {
 			int hits = 0;
@@ -76,7 +77,7 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root)
 			}
 			spin_unlock(&fs->lock);
 		}
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	} while_each_thread(g, p);
 	read_unlock(&tasklist_lock);
 	while (count--)
@@ -95,13 +96,15 @@ void exit_fs(struct task_struct *tsk)
 	struct fs_struct *fs = tsk->fs;
 
 	if (fs) {
+		unsigned long irqflags;
 		int kill;
-		task_lock(tsk);
+
+		task_lock(tsk, &irqflags);
 		spin_lock(&fs->lock);
 		tsk->fs = NULL;
 		kill = !--fs->users;
 		spin_unlock(&fs->lock);
-		task_unlock(tsk);
+		task_unlock(tsk, &irqflags);
 		if (kill)
 			free_fs_struct(fs);
 	}
@@ -133,16 +136,17 @@ int unshare_fs_struct(void)
 	struct fs_struct *fs = current->fs;
 	struct fs_struct *new_fs = copy_fs_struct(fs);
 	int kill;
+	unsigned long irqflags;
 
 	if (!new_fs)
 		return -ENOMEM;
 
-	task_lock(current);
+	task_lock(current, &irqflags);
 	spin_lock(&fs->lock);
 	kill = !--fs->users;
 	current->fs = new_fs;
 	spin_unlock(&fs->lock);
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 
 	if (kill)
 		free_fs_struct(fs);
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 4ea71eb..73e7591 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1260,10 +1260,12 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
 	if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
 		*user = current_user();
 		if (user_shm_lock(size, *user)) {
-			task_lock(current);
+			unsigned long irqflags;
+
+			task_lock(current, &irqflags);
 			pr_warn_once("%s (%d): Using mlock ulimits for SHM_HUGETLB is deprecated\n",
 				current->comm, current->pid);
-			task_unlock(current);
+			task_unlock(current, &irqflags);
 		} else {
 			*user = NULL;
 			return ERR_PTR(-EPERM);
diff --git a/fs/namespace.c b/fs/namespace.c
index 4fb1691..504408d 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3294,16 +3294,17 @@ found:
 
 static struct ns_common *mntns_get(struct task_struct *task)
 {
+	unsigned long irqflags;
 	struct ns_common *ns = NULL;
 	struct nsproxy *nsproxy;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsproxy = task->nsproxy;
 	if (nsproxy) {
 		ns = &nsproxy->mnt_ns->ns;
 		get_mnt_ns(to_mnt_ns(ns));
 	}
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return ns;
 }
diff --git a/fs/proc/array.c b/fs/proc/array.c
index b6c00ce..07907e7 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -149,6 +149,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	const struct cred *cred;
 	pid_t ppid, tpid = 0, tgid, ngid;
 	unsigned int max_fds = 0;
+	unsigned long irqflags;
 
 	rcu_read_lock();
 	ppid = pid_alive(p) ?
@@ -162,10 +163,10 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 	ngid = task_numa_group_id(p);
 	cred = get_task_cred(p);
 
-	task_lock(p);
+	task_lock(p, &irqflags);
 	if (p->files)
 		max_fds = files_fdtable(p->files)->max_fds;
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 	rcu_read_unlock();
 
 	seq_printf(m,
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 0d163a8..eef7d4d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -157,13 +157,14 @@ static unsigned int pid_entry_count_dirs(const struct pid_entry *entries,
 static int get_task_root(struct task_struct *task, struct path *root)
 {
 	int result = -ENOENT;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	if (task->fs) {
 		get_fs_root(task->fs, root);
 		result = 0;
 	}
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	return result;
 }
 
@@ -173,12 +174,14 @@ static int proc_cwd_link(struct dentry *dentry, struct path *path)
 	int result = -ENOENT;
 
 	if (task) {
-		task_lock(task);
+		unsigned long irqflags;
+
+		task_lock(task, &irqflags);
 		if (task->fs) {
 			get_fs_pwd(task->fs, path);
 			result = 0;
 		}
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 		put_task_struct(task);
 	}
 	return result;
@@ -1057,7 +1060,7 @@ static ssize_t oom_adj_write(struct file *file, const char __user *buf,
 	struct task_struct *task;
 	char buffer[PROC_NUMBUF];
 	int oom_adj;
-	unsigned long flags;
+	unsigned long flags, irqflags;
 	int err;
 
 	memset(buffer, 0, sizeof(buffer));
@@ -1083,7 +1086,7 @@ static ssize_t oom_adj_write(struct file *file, const char __user *buf,
 		goto out;
 	}
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	if (!task->mm) {
 		err = -EINVAL;
 		goto err_task_lock;
@@ -1122,7 +1125,7 @@ static ssize_t oom_adj_write(struct file *file, const char __user *buf,
 err_sighand:
 	unlock_task_sighand(task, &flags);
 err_task_lock:
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	put_task_struct(task);
 out:
 	return err < 0 ? err : count;
@@ -1159,7 +1162,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 {
 	struct task_struct *task;
 	char buffer[PROC_NUMBUF];
-	unsigned long flags;
+	unsigned long flags, irqflags;
 	int oom_score_adj;
 	int err;
 
@@ -1186,7 +1189,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 		goto out;
 	}
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	if (!task->mm) {
 		err = -EINVAL;
 		goto err_task_lock;
@@ -1211,7 +1214,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 err_sighand:
 	unlock_task_sighand(task, &flags);
 err_task_lock:
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	put_task_struct(task);
 out:
 	return err < 0 ? err : count;
@@ -1522,14 +1525,15 @@ static int comm_show(struct seq_file *m, void *v)
 {
 	struct inode *inode = m->private;
 	struct task_struct *p;
+	unsigned long irqflags;
 
 	p = get_proc_task(inode);
 	if (!p)
 		return -ESRCH;
 
-	task_lock(p);
+	task_lock(p, &irqflags);
 	seq_printf(m, "%s\n", p->comm);
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	put_task_struct(p);
 
@@ -2277,12 +2281,14 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
 		return -ESRCH;
 
 	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
-		task_lock(p);
+		unsigned long irqflags;
+
+		task_lock(p, &irqflags);
 		if (slack_ns == 0)
 			p->timer_slack_ns = p->default_timer_slack_ns;
 		else
 			p->timer_slack_ns = slack_ns;
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	} else
 		count = -EPERM;
 
@@ -2302,9 +2308,11 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
 		return -ESRCH;
 
 	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
-		task_lock(p);
+		unsigned long irqflags;
+
+		task_lock(p, &irqflags);
 		seq_printf(m, "%llu\n", p->timer_slack_ns);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	} else
 		err = -EPERM;
 
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..33da171 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -99,14 +99,15 @@ static inline struct task_struct *get_proc_task(struct inode *inode)
 
 static inline int task_dumpable(struct task_struct *task)
 {
+	unsigned long irqflags;
 	int dumpable = 0;
 	struct mm_struct *mm;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	mm = task->mm;
 	if (mm)
 		dumpable = get_dumpable(mm);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	if (dumpable == SUID_DUMP_USER)
 		return 1;
 	return 0;
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 350984a..ffb8c8f 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -113,11 +113,13 @@ static struct net *get_proc_task_net(struct inode *dir)
 	rcu_read_lock();
 	task = pid_task(proc_pid(dir), PIDTYPE_PID);
 	if (task != NULL) {
-		task_lock(task);
+		unsigned long irqflags;
+
+		task_lock(task, &irqflags);
 		ns = task->nsproxy;
 		if (ns != NULL)
 			net = get_net(ns->net_ns);
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 	}
 	rcu_read_unlock();
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5415835..f86f3bb 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -107,12 +107,13 @@ unsigned long task_statm(struct mm_struct *mm,
  */
 static void hold_task_mempolicy(struct proc_maps_private *priv)
 {
+	unsigned long irqflags;
 	struct task_struct *task = priv->task;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	priv->task_mempolicy = get_task_policy(task);
 	mpol_get(priv->task_mempolicy);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 }
 static void release_task_mempolicy(struct proc_maps_private *priv)
 {
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 3f1190d..a3cc5cb 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -242,27 +242,28 @@ static int mounts_open_common(struct inode *inode, struct file *file,
 	struct proc_mounts *p;
 	struct seq_file *m;
 	int ret = -EINVAL;
+	unsigned long irqflags;
 
 	if (!task)
 		goto err;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsp = task->nsproxy;
 	if (!nsp || !nsp->mnt_ns) {
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 		put_task_struct(task);
 		goto err;
 	}
 	ns = nsp->mnt_ns;
 	get_mnt_ns(ns);
 	if (!task->fs) {
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 		put_task_struct(task);
 		ret = -ENOENT;
 		goto err_put_ns;
 	}
 	get_fs_root(task->fs, &root);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	put_task_struct(task);
 
 	ret = seq_open_private(file, &mounts_op, sizeof(struct proc_mounts));
diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 85a868c..ed091ba 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -126,15 +126,13 @@ static inline bool read_mems_allowed_retry(unsigned int seq)
 
 static inline void set_mems_allowed(nodemask_t nodemask)
 {
-	unsigned long flags;
+	unsigned long irqflags;
 
-	task_lock(current);
-	local_irq_save(flags);
+	task_lock(current, &irqflags);
 	write_seqcount_begin(&current->mems_allowed_seq);
 	current->mems_allowed = nodemask;
 	write_seqcount_end(&current->mems_allowed_seq);
-	local_irq_restore(flags);
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 }
 
 #else /* !CONFIG_CPUSETS */
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index ac0d65b..32b80dc 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -49,7 +49,9 @@ extern struct nsproxy init_nsproxy;
  *     precautions should be taken - just dereference the pointers
  *
  *  3. the access to other task namespaces is performed like this
- *     task_lock(task);
+ *     unsigned long irqflags;
+ *
+ *     task_lock(task, &irqflags);
  *     nsproxy = task->nsproxy;
  *     if (nsproxy != NULL) {
  *             / *
@@ -60,7 +62,7 @@ extern struct nsproxy init_nsproxy;
  *         * NULL task->nsproxy means that this task is
  *         * almost dead (zombie)
  *         * /
- *     task_unlock(task);
+ *     task_unlock(task, &irqflags);
  *
  */
 
diff --git a/include/linux/oom.h b/include/linux/oom.h
index 628a432..80251a8 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -98,7 +98,8 @@ extern bool oom_killer_disabled;
 extern bool oom_killer_disable(void);
 extern void oom_killer_enable(void);
 
-extern struct task_struct *find_lock_task_mm(struct task_struct *p);
+extern struct task_struct *find_lock_task_mm(struct task_struct *p,
+								unsigned long *irqflags);
 
 static inline bool task_will_free_mem(struct task_struct *task)
 {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 52c4847..9e643fd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2769,14 +2769,14 @@ static inline int thread_group_empty(struct task_struct *p)
  * It must not be nested with write_lock_irq(&tasklist_lock),
  * neither inside nor outside.
  */
-static inline void task_lock(struct task_struct *p)
+static inline void task_lock(struct task_struct *p, unsigned long *irqflags)
 {
-	spin_lock(&p->alloc_lock);
+	spin_lock_irqsave(&p->alloc_lock, *irqflags);
 }
 
-static inline void task_unlock(struct task_struct *p)
+static inline void task_unlock(struct task_struct *p, unsigned long *irqflags)
 {
-	spin_unlock(&p->alloc_lock);
+	spin_unlock_irqrestore(&p->alloc_lock, *irqflags);
 }
 
 extern struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
diff --git a/ipc/namespace.c b/ipc/namespace.c
index 068caf1..4994299 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -135,14 +135,15 @@ static inline struct ipc_namespace *to_ipc_ns(struct ns_common *ns)
 
 static struct ns_common *ipcns_get(struct task_struct *task)
 {
+	unsigned long irqflags;
 	struct ipc_namespace *ns = NULL;
 	struct nsproxy *nsproxy;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsproxy = task->nsproxy;
 	if (nsproxy)
 		ns = get_ipc_ns(nsproxy->ipc_ns);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return ns ? &ns->ns : NULL;
 }
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 86cb5c6..693b474 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6354,14 +6354,15 @@ static struct ns_common *cgroupns_get(struct task_struct *task)
 {
 	struct cgroup_namespace *ns = NULL;
 	struct nsproxy *nsproxy;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsproxy = task->nsproxy;
 	if (nsproxy) {
 		ns = nsproxy->cgroup_ns;
 		get_cgroup_ns(ns);
 	}
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return ns ? &ns->ns : NULL;
 }
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 3e3f6e4..0109299 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -615,16 +615,17 @@ void clear_tasks_mm_cpumask(int cpu)
 	rcu_read_lock();
 	for_each_process(p) {
 		struct task_struct *t;
+		unsigned long irqflags;
 
 		/*
 		 * Main thread might exit, but other threads may still have
 		 * a valid mm. Find one.
 		 */
-		t = find_lock_task_mm(p);
+		t = find_lock_task_mm(p, &irqflags);
 		if (!t)
 			continue;
 		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
-		task_unlock(t);
+		task_unlock(t, &irqflags);
 	}
 	rcu_read_unlock();
 }
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 1902956b..8bd56cc 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1033,6 +1033,7 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
 					nodemask_t *newmems)
 {
 	bool need_loop;
+	unsigned long irqflags;
 
 	/*
 	 * Allow tasks that have access to memory reserves because they have
@@ -1043,7 +1044,7 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
 	if (current->flags & PF_EXITING) /* Let dying task have memory */
 		return;
 
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	/*
 	 * Determine if a loop is necessary if another thread is doing
 	 * read_mems_allowed_begin().  If at least one node remains unchanged and
@@ -1054,7 +1055,6 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
 			!nodes_intersects(*newmems, tsk->mems_allowed);
 
 	if (need_loop) {
-		local_irq_disable();
 		write_seqcount_begin(&tsk->mems_allowed_seq);
 	}
 
@@ -1066,10 +1066,9 @@ static void cpuset_change_task_nodemask(struct task_struct *tsk,
 
 	if (need_loop) {
 		write_seqcount_end(&tsk->mems_allowed_seq);
-		local_irq_enable();
 	}
 
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 }
 
 static void *cpuset_being_rebound;
diff --git a/kernel/exit.c b/kernel/exit.c
index 79c7e38..a4de5ed 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -298,6 +298,7 @@ kill_orphaned_pgrp(struct task_struct *tsk, struct task_struct *parent)
 void mm_update_next_owner(struct mm_struct *mm)
 {
 	struct task_struct *c, *g, *p = current;
+	unsigned long irqflags;
 
 retry:
 	/*
@@ -362,19 +363,19 @@ assign_new_owner:
 	 * The task_lock protects c->mm from changing.
 	 * We always want mm->owner->mm == mm
 	 */
-	task_lock(c);
+	task_lock(c, &irqflags);
 	/*
 	 * Delay read_unlock() till we have the task_lock()
 	 * to ensure that c does not slip away underneath us
 	 */
 	read_unlock(&tasklist_lock);
 	if (c->mm != mm) {
-		task_unlock(c);
+		task_unlock(c, &irqflags);
 		put_task_struct(c);
 		goto retry;
 	}
 	mm->owner = c;
-	task_unlock(c);
+	task_unlock(c, &irqflags);
 	put_task_struct(c);
 }
 #endif /* CONFIG_MEMCG */
@@ -387,6 +388,7 @@ static void exit_mm(struct task_struct *tsk)
 {
 	struct mm_struct *mm = tsk->mm;
 	struct core_state *core_state;
+	unsigned long irqflags;
 
 	mm_release(tsk, mm);
 	if (!mm)
@@ -427,11 +429,11 @@ static void exit_mm(struct task_struct *tsk)
 	atomic_inc(&mm->mm_count);
 	BUG_ON(mm != tsk->active_mm);
 	/* more a memory barrier than a real lock */
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	tsk->mm = NULL;
 	up_read(&mm->mmap_sem);
 	enter_lazy_tlb(mm, current);
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	mm_update_next_owner(mm);
 	mmput(mm);
 	if (test_thread_flag(TIF_MEMDIE))
@@ -654,6 +656,9 @@ void do_exit(long code)
 	struct task_struct *tsk = current;
 	int group_dead;
 	TASKS_RCU(int tasks_rcu_i);
+#ifdef CONFIG_NUMA
+	unsigned long irqflags;
+#endif
 
 	profile_task_exit(tsk);
 	kcov_task_exit(tsk);
@@ -769,10 +774,10 @@ void do_exit(long code)
 	exit_notify(tsk, group_dead);
 	proc_exit_connector(tsk);
 #ifdef CONFIG_NUMA
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	mpol_put(tsk->mempolicy);
 	tsk->mempolicy = NULL;
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 #endif
 #ifdef CONFIG_FUTEX
 	if (unlikely(current->pi_state_cache))
diff --git a/kernel/fork.c b/kernel/fork.c
index d277e83..388ba3f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -785,8 +785,9 @@ EXPORT_SYMBOL(get_mm_exe_file);
 struct mm_struct *get_task_mm(struct task_struct *task)
 {
 	struct mm_struct *mm;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	mm = task->mm;
 	if (mm) {
 		if (task->flags & PF_KTHREAD)
@@ -794,7 +795,7 @@ struct mm_struct *get_task_mm(struct task_struct *task)
 		else
 			atomic_inc(&mm->mm_users);
 	}
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	return mm;
 }
 EXPORT_SYMBOL_GPL(get_task_mm);
@@ -822,14 +823,15 @@ struct mm_struct *mm_access(struct task_struct *task, unsigned int mode)
 static void complete_vfork_done(struct task_struct *tsk)
 {
 	struct completion *vfork;
+	unsigned long irqflags;
 
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	vfork = tsk->vfork_done;
 	if (likely(vfork)) {
 		tsk->vfork_done = NULL;
 		complete(vfork);
 	}
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 }
 
 static int wait_for_vfork_done(struct task_struct *child,
@@ -842,9 +844,11 @@ static int wait_for_vfork_done(struct task_struct *child,
 	freezer_count();
 
 	if (killed) {
-		task_lock(child);
+		unsigned long irqflags;
+
+		task_lock(child, &irqflags);
 		child->vfork_done = NULL;
-		task_unlock(child);
+		task_unlock(child, &irqflags);
 	}
 
 	put_task_struct(child);
@@ -1126,6 +1130,7 @@ static void posix_cpu_timers_init_group(struct signal_struct *sig)
 static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 {
 	struct signal_struct *sig;
+	unsigned long irqflags;
 
 	if (clone_flags & CLONE_THREAD)
 		return 0;
@@ -1153,9 +1158,9 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	hrtimer_init(&sig->real_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	sig->real_timer.function = it_real_fn;
 
-	task_lock(current->group_leader);
+	task_lock(current->group_leader, &irqflags);
 	memcpy(sig->rlim, current->signal->rlim, sizeof sig->rlim);
-	task_unlock(current->group_leader);
+	task_unlock(current->group_leader, &irqflags);
 
 	posix_cpu_timers_init_group(sig);
 
@@ -2022,6 +2027,8 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 		goto bad_unshare_cleanup_cred;
 
 	if (new_fs || new_fd || do_sysvsem || new_cred || new_nsproxy) {
+		unsigned long irqflags;
+
 		if (do_sysvsem) {
 			/*
 			 * CLONE_SYSVSEM is equivalent to sys_exit().
@@ -2037,7 +2044,7 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 		if (new_nsproxy)
 			switch_task_namespaces(current, new_nsproxy);
 
-		task_lock(current);
+		task_lock(current, &irqflags);
 
 		if (new_fs) {
 			fs = current->fs;
@@ -2056,7 +2063,7 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
 			new_fd = fd;
 		}
 
-		task_unlock(current);
+		task_unlock(current, &irqflags);
 
 		if (new_cred) {
 			/* Install the new user namespace */
@@ -2091,6 +2098,7 @@ int unshare_files(struct files_struct **displaced)
 	struct task_struct *task = current;
 	struct files_struct *copy = NULL;
 	int error;
+	unsigned long irqflags;
 
 	error = unshare_fd(CLONE_FILES, &copy);
 	if (error || !copy) {
@@ -2098,9 +2106,9 @@ int unshare_files(struct files_struct **displaced)
 		return error;
 	}
 	*displaced = task->files;
-	task_lock(task);
+	task_lock(task, &irqflags);
 	task->files = copy;
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	return 0;
 }
 
diff --git a/kernel/kcmp.c b/kernel/kcmp.c
index 3a47fa9..e5b5412 100644
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -57,15 +57,16 @@ static struct file *
 get_file_raw_ptr(struct task_struct *task, unsigned int idx)
 {
 	struct file *file = NULL;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	rcu_read_lock();
 
 	if (task->files)
 		file = fcheck_files(task->files, idx);
 
 	rcu_read_unlock();
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return file;
 }
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 782102e..65f407d 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -216,13 +216,14 @@ out:
 void switch_task_namespaces(struct task_struct *p, struct nsproxy *new)
 {
 	struct nsproxy *ns;
+	unsigned long irqflags;
 
 	might_sleep();
 
-	task_lock(p);
+	task_lock(p, &irqflags);
 	ns = p->nsproxy;
 	p->nsproxy = new;
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	if (ns && atomic_dec_and_test(&ns->count))
 		free_nsproxy(ns);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d49bfa1..2aba142 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -286,9 +286,11 @@ ok:
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;
-	task_lock(task);
+	unsigned long irqflags;
+
+	task_lock(task, &irqflags);
 	err = __ptrace_may_access(task, mode);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	return !err;
 }
 
@@ -298,6 +300,7 @@ static int ptrace_attach(struct task_struct *task, long request,
 {
 	bool seize = (request == PTRACE_SEIZE);
 	int retval;
+	unsigned long irqflags;
 
 	retval = -EIO;
 	if (seize) {
@@ -327,9 +330,9 @@ static int ptrace_attach(struct task_struct *task, long request,
 	if (mutex_lock_interruptible(&task->signal->cred_guard_mutex))
 		goto out;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 	if (retval)
 		goto unlock_creds;
 
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 4fbc3bd..a95f445 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -841,16 +841,17 @@ static void sched_show_numa(struct task_struct *p, struct seq_file *m)
 {
 #ifdef CONFIG_NUMA_BALANCING
 	struct mempolicy *pol;
+	unsigned long irqflags;
 
 	if (p->mm)
 		P(mm->numa_scan_seq);
 
-	task_lock(p);
+	task_lock(p, &irqflags);
 	pol = p->mempolicy;
 	if (pol && !(pol->flags & MPOL_F_MORON))
 		pol = NULL;
 	mpol_get(pol);
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	P(numa_pages_migrated);
 	P(numa_preferred_nid);
diff --git a/kernel/sys.c b/kernel/sys.c
index cf8ba54..fb2e6a8 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1308,12 +1308,14 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned int, resource,
 		struct rlimit __user *, rlim)
 {
 	struct rlimit x;
+	unsigned long irqflags;
+
 	if (resource >= RLIM_NLIMITS)
 		return -EINVAL;
 
-	task_lock(current->group_leader);
+	task_lock(current->group_leader, &irqflags);
 	x = current->signal->rlim[resource];
-	task_unlock(current->group_leader);
+	task_unlock(current->group_leader, &irqflags);
 	if (x.rlim_cur > 0x7FFFFFFF)
 		x.rlim_cur = 0x7FFFFFFF;
 	if (x.rlim_max > 0x7FFFFFFF)
@@ -1362,6 +1364,7 @@ int do_prlimit(struct task_struct *tsk, unsigned int resource,
 {
 	struct rlimit *rlim;
 	int retval = 0;
+	unsigned long irqflags;
 
 	if (resource >= RLIM_NLIMITS)
 		return -EINVAL;
@@ -1381,7 +1384,7 @@ int do_prlimit(struct task_struct *tsk, unsigned int resource,
 	}
 
 	rlim = tsk->signal->rlim + resource;
-	task_lock(tsk->group_leader);
+	task_lock(tsk->group_leader, &irqflags);
 	if (new_rlim) {
 		/* Keep the capable check against init_user_ns until
 		   cgroups can contain all limits */
@@ -1407,7 +1410,7 @@ int do_prlimit(struct task_struct *tsk, unsigned int resource,
 		if (new_rlim)
 			*rlim = *new_rlim;
 	}
-	task_unlock(tsk->group_leader);
+	task_unlock(tsk->group_leader, &irqflags);
 
 	/*
 	 * RLIMIT_CPU handling.   Note that the kernel fails to return an error
@@ -1911,6 +1914,7 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr,
 	 * tools which use this vector might be unhappy.
 	 */
 	unsigned long user_auxv[AT_VECTOR_SIZE];
+	unsigned long irqflags;
 
 	if (len > sizeof(user_auxv))
 		return -EINVAL;
@@ -1924,9 +1928,9 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr,
 
 	BUILD_BUG_ON(sizeof(user_auxv) != sizeof(mm->saved_auxv));
 
-	task_lock(current);
+	task_lock(current, &irqflags);
 	memcpy(mm->saved_auxv, user_auxv, len);
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 
 	return 0;
 }
diff --git a/kernel/utsname.c b/kernel/utsname.c
index 831ea71..5732717 100644
--- a/kernel/utsname.c
+++ b/kernel/utsname.c
@@ -99,14 +99,15 @@ static struct ns_common *utsns_get(struct task_struct *task)
 {
 	struct uts_namespace *ns = NULL;
 	struct nsproxy *nsproxy;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsproxy = task->nsproxy;
 	if (nsproxy) {
 		ns = nsproxy->uts_ns;
 		get_uts_ns(ns);
 	}
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return ns ? &ns->ns : NULL;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2e79b8..21d50f0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1046,11 +1046,12 @@ bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg)
 	struct mem_cgroup *task_memcg;
 	struct task_struct *p;
 	bool ret;
+	unsigned long irqflags;
 
-	p = find_lock_task_mm(task);
+	p = find_lock_task_mm(task, &irqflags);
 	if (p) {
 		task_memcg = get_mem_cgroup_from_mm(p->mm);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	} else {
 		/*
 		 * All threads may have already detached their mm's, but the oom
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 36cc01b..05abb22 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -789,6 +789,7 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 	struct mempolicy *new, *old;
 	NODEMASK_SCRATCH(scratch);
 	int ret;
+	unsigned long irqflags;
 
 	if (!scratch)
 		return -ENOMEM;
@@ -799,10 +800,10 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 		goto out;
 	}
 
-	task_lock(current);
+	task_lock(current, &irqflags);
 	ret = mpol_set_nodemask(new, nodes, scratch);
 	if (ret) {
-		task_unlock(current);
+		task_unlock(current, &irqflags);
 		mpol_put(new);
 		goto out;
 	}
@@ -811,7 +812,7 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 	if (new && new->mode == MPOL_INTERLEAVE &&
 	    nodes_weight(new->v.nodes))
 		current->il_next = first_node(new->v.nodes);
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 	mpol_put(old);
 	ret = 0;
 out:
@@ -873,12 +874,14 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
 		return -EINVAL;
 
 	if (flags & MPOL_F_MEMS_ALLOWED) {
+		unsigned long irqflags;
+
 		if (flags & (MPOL_F_NODE|MPOL_F_ADDR))
 			return -EINVAL;
 		*policy = 0;	/* just so it's initialized */
-		task_lock(current);
+		task_lock(current, &irqflags);
 		*nmask  = cpuset_current_mems_allowed;
-		task_unlock(current);
+		task_unlock(current, &irqflags);
 		return 0;
 	}
 
@@ -937,9 +940,11 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
 		if (mpol_store_user_nodemask(pol)) {
 			*nmask = pol->w.user_nodemask;
 		} else {
-			task_lock(current);
+			unsigned long irqflags;
+
+			task_lock(current, &irqflags);
 			get_policy_nodemask(pol, nmask);
-			task_unlock(current);
+			task_unlock(current, &irqflags);
 		}
 	}
 
@@ -1221,10 +1226,12 @@ static long do_mbind(unsigned long start, unsigned long len,
 	{
 		NODEMASK_SCRATCH(scratch);
 		if (scratch) {
+			unsigned long irqflags;
+
 			down_write(&mm->mmap_sem);
-			task_lock(current);
+			task_lock(current, &irqflags);
 			err = mpol_set_nodemask(new, nmask, scratch);
-			task_unlock(current);
+			task_unlock(current, &irqflags);
 			if (err)
 				up_write(&mm->mmap_sem);
 		} else
@@ -1876,11 +1883,12 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
 {
 	struct mempolicy *mempolicy;
 	int nid;
+	unsigned long irqflags;
 
 	if (!(mask && current->mempolicy))
 		return false;
 
-	task_lock(current);
+	task_lock(current, &irqflags);
 	mempolicy = current->mempolicy;
 	switch (mempolicy->mode) {
 	case MPOL_PREFERRED:
@@ -1900,7 +1908,7 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
 	default:
 		BUG();
 	}
-	task_unlock(current);
+	task_unlock(current, &irqflags);
 
 	return true;
 }
@@ -1921,10 +1929,11 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
 {
 	struct mempolicy *mempolicy;
 	bool ret = true;
+	unsigned long irqflags;
 
 	if (!mask)
 		return ret;
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	mempolicy = tsk->mempolicy;
 	if (!mempolicy)
 		goto out;
@@ -1946,7 +1955,7 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
 		BUG();
 	}
 out:
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 	return ret;
 }
 
@@ -2127,9 +2136,11 @@ struct mempolicy *__mpol_dup(struct mempolicy *old)
 
 	/* task's mempolicy is protected by alloc_lock */
 	if (old == current->mempolicy) {
-		task_lock(current);
+		unsigned long irqflags;
+
+		task_lock(current, &irqflags);
 		*new = *old;
-		task_unlock(current);
+		task_unlock(current, &irqflags);
 	} else
 		*new = *old;
 
@@ -2474,6 +2485,7 @@ void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol)
 		struct vm_area_struct pvma;
 		struct mempolicy *new;
 		NODEMASK_SCRATCH(scratch);
+		unsigned long irqflags;
 
 		if (!scratch)
 			goto put_mpol;
@@ -2482,9 +2494,9 @@ void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol)
 		if (IS_ERR(new))
 			goto free_scratch; /* no valid nodemask intersection */
 
-		task_lock(current);
+		task_lock(current, &irqflags);
 		ret = mpol_set_nodemask(new, &mpol->w.user_nodemask, scratch);
-		task_unlock(current);
+		task_unlock(current, &irqflags);
 		if (ret)
 			goto put_new;
 
diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index f802c2d..0651d21 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -21,8 +21,9 @@ void use_mm(struct mm_struct *mm)
 {
 	struct mm_struct *active_mm;
 	struct task_struct *tsk = current;
+	unsigned long irqflags;
 
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
 		atomic_inc(&mm->mm_count);
@@ -30,7 +31,7 @@ void use_mm(struct mm_struct *mm)
 	}
 	tsk->mm = mm;
 	switch_mm(active_mm, mm, tsk);
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 #ifdef finish_arch_post_lock_switch
 	finish_arch_post_lock_switch();
 #endif
@@ -51,12 +52,13 @@ EXPORT_SYMBOL_GPL(use_mm);
 void unuse_mm(struct mm_struct *mm)
 {
 	struct task_struct *tsk = current;
+	unsigned long irqflags;
 
-	task_lock(tsk);
+	task_lock(tsk, &irqflags);
 	sync_mm_rss(mm);
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
-	task_unlock(tsk);
+	task_unlock(tsk, &irqflags);
 }
 EXPORT_SYMBOL_GPL(unuse_mm);
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8634958..e8039784 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -104,17 +104,18 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
  * pointer.  Return p, or any of its subthreads with a valid ->mm, with
  * task_lock() held.
  */
-struct task_struct *find_lock_task_mm(struct task_struct *p)
+struct task_struct *find_lock_task_mm(struct task_struct *p,
+						unsigned long *irqflags)
 {
 	struct task_struct *t;
 
 	rcu_read_lock();
 
 	for_each_thread(p, t) {
-		task_lock(t);
+		task_lock(t, irqflags);
 		if (likely(t->mm))
 			goto found;
-		task_unlock(t);
+		task_unlock(t, irqflags);
 	}
 	t = NULL;
 found:
@@ -166,17 +167,18 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 {
 	long points;
 	long adj;
+	unsigned long irqflags;
 
 	if (oom_unkillable_task(p, memcg, nodemask))
 		return 0;
 
-	p = find_lock_task_mm(p);
+	p = find_lock_task_mm(p, &irqflags);
 	if (!p)
 		return 0;
 
 	adj = (long)p->signal->oom_score_adj;
 	if (adj == OOM_SCORE_ADJ_MIN) {
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 		return 0;
 	}
 
@@ -186,7 +188,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 */
 	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
 		atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	/*
 	 * Root processes get 3% bonus, just like the __vm_enough_memory()
@@ -356,6 +358,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 {
 	struct task_struct *p;
 	struct task_struct *task;
+	unsigned long irqflags;
 
 	pr_info("[ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name\n");
 	rcu_read_lock();
@@ -363,7 +366,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 		if (oom_unkillable_task(p, memcg, nodemask))
 			continue;
 
-		task = find_lock_task_mm(p);
+		task = find_lock_task_mm(p, &irqflags);
 		if (!task) {
 			/*
 			 * This is a kthread or all of p's threads have already
@@ -380,7 +383,7 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)
 			mm_nr_pmds(task->mm),
 			get_mm_counter(task->mm, MM_SWAPENTS),
 			task->signal->oom_score_adj, task->comm);
-		task_unlock(task);
+		task_unlock(task, &irqflags);
 	}
 	rcu_read_unlock();
 }
@@ -432,6 +435,7 @@ static bool __oom_reap_task(struct task_struct *tsk)
 	struct zap_details details = {.check_swap_entries = true,
 				      .ignore_dirty = true};
 	bool ret = true;
+	unsigned long irqflags;
 
 	/*
 	 * Make sure we find the associated mm_struct even when the particular
@@ -439,17 +443,17 @@ static bool __oom_reap_task(struct task_struct *tsk)
 	 * We might have race with exit path so consider our work done if there
 	 * is no mm.
 	 */
-	p = find_lock_task_mm(tsk);
+	p = find_lock_task_mm(tsk, &irqflags);
 	if (!p)
 		return true;
 
 	mm = p->mm;
 	if (!atomic_inc_not_zero(&mm->mm_users)) {
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 		return true;
 	}
 
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	if (!down_read_trylock(&mm->mmap_sem)) {
 		ret = false;
@@ -686,19 +690,20 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
 					      DEFAULT_RATELIMIT_BURST);
 	bool can_oom_reap = true;
+	unsigned long irqflags;
 
 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
 	 * its children or threads, just set TIF_MEMDIE so it can die quickly
 	 */
-	task_lock(p);
+	task_lock(p, &irqflags);
 	if (p->mm && task_will_free_mem(p)) {
 		mark_oom_victim(p);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 		put_task_struct(p);
 		return;
 	}
-	task_unlock(p);
+	task_unlock(p, &irqflags);
 
 	if (__ratelimit(&oom_rs))
 		dump_header(oc, p, memcg);
@@ -734,7 +739,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	}
 	read_unlock(&tasklist_lock);
 
-	p = find_lock_task_mm(victim);
+	p = find_lock_task_mm(victim, &irqflags);
 	if (!p) {
 		put_task_struct(victim);
 		return;
@@ -759,7 +764,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 		K(get_mm_counter(victim->mm, MM_ANONPAGES)),
 		K(get_mm_counter(victim->mm, MM_FILEPAGES)),
 		K(get_mm_counter(victim->mm, MM_SHMEMPAGES)));
-	task_unlock(victim);
+	task_unlock(victim, &irqflags);
 
 	/*
 	 * Kill all user processes sharing victim->mm in other thread groups, if
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2c2eb1b..421fccd 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -502,11 +502,13 @@ struct net *get_net_ns_by_pid(pid_t pid)
 	tsk = find_task_by_vpid(pid);
 	if (tsk) {
 		struct nsproxy *nsproxy;
-		task_lock(tsk);
+		unsigned long irqflags;
+
+		task_lock(tsk, &irqflags);
 		nsproxy = tsk->nsproxy;
 		if (nsproxy)
 			net = get_net(nsproxy->net_ns);
-		task_unlock(tsk);
+		task_unlock(tsk, &irqflags);
 	}
 	rcu_read_unlock();
 	return net;
@@ -963,12 +965,13 @@ static struct ns_common *netns_get(struct task_struct *task)
 {
 	struct net *net = NULL;
 	struct nsproxy *nsproxy;
+	unsigned long irqflags;
 
-	task_lock(task);
+	task_lock(task, &irqflags);
 	nsproxy = task->nsproxy;
 	if (nsproxy)
 		net = get_net(nsproxy->net_ns);
-	task_unlock(task);
+	task_unlock(task, &irqflags);
 
 	return net ? &net->ns : NULL;
 }
diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
index 11fce17..32482fd 100644
--- a/net/core/netclassid_cgroup.c
+++ b/net/core/netclassid_cgroup.c
@@ -76,9 +76,11 @@ static void update_classid(struct cgroup_subsys_state *css, void *v)
 
 	css_task_iter_start(css, &it);
 	while ((p = css_task_iter_next(&it))) {
-		task_lock(p);
+		unsigned long irqflags;
+
+		task_lock(p, &irqflags);
 		iterate_fd(p->files, 0, update_classid_sock, v);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	}
 	css_task_iter_end(&it);
 }
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 2ec86fc..e573c46 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -238,11 +238,12 @@ static void net_prio_attach(struct cgroup_taskset *tset)
 	struct cgroup_subsys_state *css;
 
 	cgroup_taskset_for_each(p, css, tset) {
+		unsigned long irqflags;
 		void *v = (void *)(unsigned long)css->cgroup->id;
 
-		task_lock(p);
+		task_lock(p, &irqflags);
 		iterate_fd(p->files, 0, update_netprio, v);
-		task_unlock(p);
+		task_unlock(p, &irqflags);
 	}
 }
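
For reviewers following the interface change, here is a minimal sketch (not part of the patch) of what converted callers look like, assuming the task_lock()/task_unlock() and find_lock_task_mm() signatures introduced in this series; both helper functions below are made up for illustration:

/*
 * Illustrative sketch only -- not part of the patch.  Both helpers are
 * hypothetical; they show the calling convention after the conversion,
 * where task_lock()/task_unlock() take an extra "unsigned long *irqflags"
 * (the lock now saves and restores local interrupt state) and
 * find_lock_task_mm() returns a thread locked the same way.
 */
#include <linux/mm.h>
#include <linux/oom.h>
#include <linux/sched.h>
#include <linux/string.h>

/* Read the RSS of a task via whichever thread still has a valid ->mm. */
static unsigned long example_task_rss(struct task_struct *tsk)
{
	struct task_struct *t;
	unsigned long irqflags;
	unsigned long rss;

	t = find_lock_task_mm(tsk, &irqflags);	/* returns with task_lock() held */
	if (!t)
		return 0;

	rss = get_mm_rss(t->mm);
	task_unlock(t, &irqflags);		/* hand the saved flags back */

	return rss;
}

/* Rename a task under its alloc_lock, mirroring the __set_task_comm() change in this patch. */
static void example_set_comm(struct task_struct *tsk, const char *name)
{
	unsigned long irqflags;

	task_lock(tsk, &irqflags);
	strlcpy(tsk->comm, name, sizeof(tsk->comm));
	task_unlock(tsk, &irqflags);
}

The extra argument mirrors spin_lock_irqsave()/spin_unlock_irqrestore(): a caller keeps the flags value it got from task_lock() (or find_lock_task_mm()) and passes the same pointer back to task_unlock(), which is why each converted site in the diff gains a local "unsigned long irqflags".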
 
-- 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-10 13:04 MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0 yhb
  2016-07-10 13:04 ` yhb
@ 2016-07-11  9:30 ` James Hogan
  2016-07-11  9:30   ` James Hogan
  2016-07-11 18:02 ` Leonid Yegoshin
  2 siblings, 1 reply; 18+ messages in thread
From: James Hogan @ 2016-07-11  9:30 UTC (permalink / raw)
  To: yhb; +Cc: ralf, linux-mips

[-- Attachment #1: Type: text/plain, Size: 8711 bytes --]

Hi,

On Sun, Jul 10, 2016 at 01:04:47PM +0000, yhb@ruijie.com.cn wrote:
> From cd1eb951d4a7f01aaa24d2fb902f06b73ef4f608 Mon Sep 17 00:00:00 2001
> From: yhb <yhb@ruijie.com.cn>
> Date: Sun, 10 Jul 2016 20:43:05 +0800
> Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
>  when asid_cache(cpu) wraps to 0.
> 
> Suppose that asid_cache(cpu) wraps to 0 every n days.
> case 1:
> (1)Process 1 got ASID 0x101.
> (2)Process 1 slept for n days.
> (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
> (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
> 
> case 2:
> (1)Process 1 got ASID 0x101 on CPU 1.
> (2)Process 1 migrated to CPU 2.
> (3)Process 1 migrated to CPU 1 after n days.
> (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
> (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of process 2.
> 
> So we need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
> 
> Signed-off-by: yhb <yhb@ruijie.com.cn>
> ---
>  arch/blackfin/kernel/trace.c               |  7 ++--
>  arch/frv/mm/mmu-context.c                  |  6 ++--
>  arch/mips/include/asm/mmu_context.h        | 53 ++++++++++++++++++++++++++++--
>  arch/um/kernel/reboot.c                    |  5 +--
>  block/blk-cgroup.c                         |  6 ++--
>  block/blk-ioc.c                            | 17 ++++++----
>  drivers/staging/android/ion/ion.c          |  5 +--
>  drivers/staging/android/lowmemorykiller.c  | 15 +++++----
>  drivers/staging/lustre/lustre/ptlrpc/sec.c |  5 +--
>  drivers/tty/tty_io.c                       |  6 ++--
>  fs/coredump.c                              |  5 +--
>  fs/exec.c                                  | 17 ++++++----
>  fs/file.c                                  | 16 +++++----
>  fs/fs_struct.c                             | 16 +++++----
>  fs/hugetlbfs/inode.c                       |  6 ++--
>  fs/namespace.c                             |  5 +--
>  fs/proc/array.c                            |  5 +--
>  fs/proc/base.c                             | 40 +++++++++++++---------
>  fs/proc/internal.h                         |  5 +--
>  fs/proc/proc_net.c                         |  6 ++--
>  fs/proc/task_mmu.c                         |  5 +--
>  fs/proc_namespace.c                        |  9 ++---
>  include/linux/cpuset.h                     |  8 ++---
>  include/linux/nsproxy.h                    |  6 ++--
>  include/linux/oom.h                        |  3 +-
>  include/linux/sched.h                      |  8 ++---
>  ipc/namespace.c                            |  5 +--
>  kernel/cgroup.c                            |  5 +--
>  kernel/cpu.c                               |  5 +--
>  kernel/cpuset.c                            |  7 ++--
>  kernel/exit.c                              | 19 +++++++----
>  kernel/fork.c                              | 32 +++++++++++-------
>  kernel/kcmp.c                              |  5 +--
>  kernel/nsproxy.c                           |  5 +--
>  kernel/ptrace.c                            | 11 ++++---
>  kernel/sched/debug.c                       |  5 +--
>  kernel/sys.c                               | 16 +++++----
>  kernel/utsname.c                           |  5 +--
>  mm/memcontrol.c                            |  5 +--
>  mm/mempolicy.c                             | 46 ++++++++++++++++----------
>  mm/mmu_context.c                           | 10 +++---
>  mm/oom_kill.c                              | 37 ++++++++++++---------
>  net/core/net_namespace.c                   | 11 ++++---
>  net/core/netclassid_cgroup.c               |  6 ++--
>  net/core/netprio_cgroup.c                  |  5 +--
>  45 files changed, 337 insertions(+), 188 deletions(-)

[snip / reorder]

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 52c4847..9e643fd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2769,14 +2769,14 @@ static inline int thread_group_empty(struct task_struct *p)
>   * It must not be nested with write_lock_irq(&tasklist_lock),
>   * neither inside nor outside.
>   */
> -static inline void task_lock(struct task_struct *p)
> +static inline void task_lock(struct task_struct *p, unsigned long *irqflags)
>  {
> -	spin_lock(&p->alloc_lock);
> +	spin_lock_irqsave(&p->alloc_lock, *irqflags);

Since most of the patch is relating to this change, which is only a
means to an end, I suggest some changes if you stick to this approach
(but see my comments below too):

1) Please separate the change to use _irqsave/_irqrestore (and pass in
irqflags parameter) from whatever else this patch contains (presumably
only the arch/mips/include/asm/mmu_context.h change).

2) Please provide some explanation for why the irqsave change is
necessary. Presumably it's due to the desire to lock tasks between irq
context (when context switching, in order to clear all *other* tasks'
ASIDs) and the various other places.

3) This will affect other arches which don't need the irqsave at all,
which will bloat the kernel slightly unnecessarily. We should consider
if it can/should be made MIPS specific, and whether it can be avoided
entirely.
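
(Purely for illustration, one way the irqsave behaviour could be made
arch-conditional while keeping a single call signature - the config
symbol below is invented, not an existing Kconfig option:)

/* Sketch: only architectures that select the (hypothetical)
 * CONFIG_ARCH_TASK_LOCK_IRQSAVE pay for disabling interrupts around
 * task_lock(); everyone else keeps the plain spin_lock() cost. */
#ifdef CONFIG_ARCH_TASK_LOCK_IRQSAVE
static inline void task_lock(struct task_struct *p, unsigned long *irqflags)
{
	spin_lock_irqsave(&p->alloc_lock, *irqflags);
}
#else
static inline void task_lock(struct task_struct *p, unsigned long *irqflags)
{
	*irqflags = 0;	/* unused: no need to disable interrupts here */
	spin_lock(&p->alloc_lock);
}
#endif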

[snip / reorder]

> diff --git a/arch/mips/include/asm/mmu_context.h b/arch/mips/include/asm/mmu_context.h
> index 45914b5..68966b5 100644
> --- a/arch/mips/include/asm/mmu_context.h
> +++ b/arch/mips/include/asm/mmu_context.h
> @@ -12,6 +12,7 @@
>  #define _ASM_MMU_CONTEXT_H
>  
>  #include <linux/errno.h>
> +#include <linux/oom.h>/* find_lock_task_mm */
>  #include <linux/sched.h>
>  #include <linux/smp.h>
>  #include <linux/slab.h>
> @@ -97,6 +98,52 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
>  #define ASID_VERSION_MASK  ((unsigned long)~(ASID_MASK|(ASID_MASK-1)))
>  #define ASID_FIRST_VERSION ((unsigned long)(~ASID_VERSION_MASK) + 1)
>  
> +/*
> + * Yu Huabing
> + * Suppose that asid_cache(cpu) wraps to 0 every n days.
> + * case 1:
> + * (1)Process 1 got ASID 0x101.
> + * (2)Process 1 slept for n days.
> + * (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
> + * (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
> + *
> + * case 2:
> + * (1)Process 1 got ASID 0x101 on CPU 1.
> + * (2)Process 1 migrated to CPU 2.
> + * (3)Process 1 migrated to CPU 1 after n days.
> + * (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
> + * (5)Process 1 is scheduled,and ASID of process 1 is same as ASID of process 2.
> + *
> + * So we need to clear MMU contexts of all other processes when asid_cache(cpu)
> + * wraps to 0.
> + *
> + * This function might be called from hardirq context or process context.
> + */
> +static inline void clear_other_mmu_contexts(struct mm_struct *mm,
> +						unsigned long cpu)
> +{
> +	struct task_struct *p;
> +	unsigned long irqflags;
> +
> +	read_lock(&tasklist_lock);
> +	for_each_process(p) {
> +		struct task_struct *t;
> +
> +		/*
> +		 * Main thread might exit, but other threads may still have
> +		 * a valid mm. Find one.
> +		 */
> +		t = find_lock_task_mm(p, &irqflags);
> +		if (!t)
> +			continue;
> +
> +		if (t->mm != mm)
> +			cpu_context(cpu, t->mm) = 0;
> +		task_unlock(t, &irqflags);
> +	}
> +	read_unlock(&tasklist_lock);
> +}
> +
>  /* Normal, classic MIPS get_new_mmu_context */
>  static inline void
>  get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
> @@ -112,8 +159,10 @@ get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
>  #else
>  		local_flush_tlb_all();	/* start new asid cycle */
>  #endif
> -		if (!asid)		/* fix version if needed */
> -			asid = ASID_FIRST_VERSION;
> +		if (!asid) {
> +			asid = ASID_FIRST_VERSION; /* fix version if needed */
> +			clear_other_mmu_contexts(mm, cpu);
> +		}
>  	}
>  
>  	cpu_context(cpu, mm) = asid_cache(cpu) = asid;

Thank you for pointing out this issue. Clearly it needs to be fixed.

Would it be sufficient/better though to expand the ASID to 64-bit with
e.g. u64 (i.e. even on 32-bit MIPS kernels), and maybe enforce that the
ASID is stored shifted to the least significant bits, rather than in a
form that can be directly written to EntryHi?

Even at 2GHz with an ASID generated every CPU cycle (completely
unrealistic behaviour), it'd still take 292 years before we'd hit this
problem.

That would increase overhead slightly in the common cases around ASID
handling rather than the exceptional overflow case, but would avoid the
additional overhead needed around task locking.
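
(For what it's worth, a very rough sketch of that idea - asid_cache64,
MIPS_ASID_BITS and get_new_asid() are invented names, kernel context is
assumed, and the real counter would of course stay per-CPU:)

/* Keep the generation counter 64 bits wide even on 32-bit kernels and
 * store it shifted down to the least significant bits, so the version
 * part cannot realistically wrap and no clearing of other processes'
 * contexts is ever needed. */
#define MIPS_ASID_BITS	8
#define MIPS_ASID_MASK	((1ULL << MIPS_ASID_BITS) - 1)

static u64 asid_cache64;			/* per-CPU in reality */

static inline unsigned long get_new_asid(u64 *mm_asid)
{
	u64 asid = asid_cache64 + 1;

	if (!(asid & MIPS_ASID_MASK))
		local_flush_tlb_all();		/* low bits wrapped: new cycle */

	asid_cache64 = *mm_asid = asid;
	return asid & MIPS_ASID_MASK;		/* shifted into EntryHi by caller */
}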

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-10 13:04 MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0 yhb
  2016-07-10 13:04 ` yhb
  2016-07-11  9:30 ` James Hogan
@ 2016-07-11 18:02 ` Leonid Yegoshin
  2016-07-11 18:02   ` Leonid Yegoshin
                     ` (2 more replies)
  2 siblings, 3 replies; 18+ messages in thread
From: Leonid Yegoshin @ 2016-07-11 18:02 UTC (permalink / raw)
  To: yhb, ralf; +Cc: linux-mips

On 07/10/2016 06:04 AM, yhb@ruijie.com.cn wrote:
> Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
>   when asid_cache(cpu) wraps to 0.
>
> Suppose that asid_cache(cpu) wraps to 0 every n days.
> case 1:
> (1)Process 1 got ASID 0x101.
> (2)Process 1 slept for n days.
> (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
> (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
>
> case 2:
> (1)Process 1 got ASID 0x101 on CPU 1.
> (2)Process 1 migrated to CPU 2.
> (3)Process 1 migrated to CPU 1 after n days.
> (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
> (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of process 2.
>
> So we need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
>
> Signed-off-by: yhb <yhb@ruijie.com.cn>
>
I think a clearer description should be given here - there is no
indication that the wrap happens over a 32-bit integer.

And taking into account the "n days" frequency - can we just kill all
local ASIDs in all processes (in addition to local_flush_tlb_all) and
enforce reassignment if a wrap happens? It should be a very rare event;
you are the first to hit this.

It could then be localized stuff in get_new_mmu_context() instead of
widespread patching.

- Leonid.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 18:02 ` Leonid Yegoshin
  2016-07-11 18:02   ` Leonid Yegoshin
@ 2016-07-11 18:05   ` Leonid Yegoshin
  2016-07-11 18:05     ` Leonid Yegoshin
  2016-07-11 18:07   ` James Hogan
  2 siblings, 1 reply; 18+ messages in thread
From: Leonid Yegoshin @ 2016-07-11 18:05 UTC (permalink / raw)
  To: yhb, ralf; +Cc: linux-mips

> On 07/10/2016 06:04 AM, yhb@ruijie.com.cn wrote:
>> Subject: [PATCH] MIPS: We need to clear MMU contexts of all other 
>> processes
>>   when asid_cache(cpu) wraps to 0.
>>
>> Suppose that asid_cache(cpu) wraps to 0 every n days.
>> case 1:
>> (1)Process 1 got ASID 0x101.
>> (2)Process 1 slept for n days.
>> (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
>> (4)Process 1 is woken,and ASID of process 1 is same as ASID of 
>> process 2.
>>
>> case 2:
>> (1)Process 1 got ASID 0x101 on CPU 1.
>> (2)Process 1 migrated to CPU 2.
>> (3)Process 1 migrated to CPU 1 after n days.
>> (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
>> (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of 
>> process 2.
>>
>> So we need to clear MMU contexts of all other processes when 
>> asid_cache(cpu) wraps to 0.
>>
>> Signed-off-by: yhb <yhb@ruijie.com.cn>
>>
>
I think a clearer description should be given here - there is no
indication that the wrap happens over a 32-bit integer.

And taking into account the "n days" frequency - can we just kill all
local ASIDs in all processes (in addition to local_flush_tlb_all) and
enforce reassignment if a wrap happens? It should be a very rare event;
you are the first to hit this.

It could then be localized stuff in get_new_mmu_context() instead of
widespread patching.

- Leonid

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 18:02 ` Leonid Yegoshin
  2016-07-11 18:02   ` Leonid Yegoshin
  2016-07-11 18:05   ` [PATCH] " Leonid Yegoshin
@ 2016-07-11 18:07   ` James Hogan
  2016-07-11 18:07     ` James Hogan
  2016-07-11 18:19     ` [PATCH] " Leonid Yegoshin
  2 siblings, 2 replies; 18+ messages in thread
From: James Hogan @ 2016-07-11 18:07 UTC (permalink / raw)
  To: Leonid Yegoshin; +Cc: yhb, ralf, linux-mips

[-- Attachment #1: Type: text/plain, Size: 1835 bytes --]

Hi Leonid,

On Mon, Jul 11, 2016 at 11:02:00AM -0700, Leonid Yegoshin wrote:
> On 07/10/2016 06:04 AM, yhb@ruijie.com.cn wrote:
> > Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
> >   when asid_cache(cpu) wraps to 0.
> >
> > Suppose that asid_cache(cpu) wraps to 0 every n days.
> > case 1:
> > (1)Process 1 got ASID 0x101.
> > (2)Process 1 slept for n days.
> > (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
> > (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
> >
> > case 2:
> > (1)Process 1 got ASID 0x101 on CPU 1.
> > (2)Process 1 migrated to CPU 2.
> > (3)Process 1 migrated to CPU 1 after n days.
> > (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
> > (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of process 2.
> >
> > So we need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
> >
> > Signed-off-by: yhb <yhb@ruijie.com.cn>
> >
> I think a more clear description should be given here - there is no 
> indication that wrap happens over 32bit integer.
> 
> And taking into account "n days" frequency - can we just kill all local 
> ASIDs in all processes (additionally to local_flush_tlb_all) and enforce 
> reassignment if wrap happens? It should be a very rare event, you are 
> first to hit this.
> 
> It seems to be some localized stuff in get_new_mmu_context() instead of 
> widespread patching.

That is what this patch does, but to do so it appears you need to lock
the other tasks one by one, and that must be doable from a context
switch, i.e. hardirq context, and that requires the task lock to be of
the _irqsave variant, hence the widespread changes and the relatively
tiny MIPS change hidden in the middle.

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 18:07   ` James Hogan
  2016-07-11 18:07     ` James Hogan
@ 2016-07-11 18:19     ` Leonid Yegoshin
  2016-07-11 18:19       ` Leonid Yegoshin
  2016-07-11 19:21       ` James Hogan
  1 sibling, 2 replies; 18+ messages in thread
From: Leonid Yegoshin @ 2016-07-11 18:19 UTC (permalink / raw)
  To: James Hogan; +Cc: yhb, ralf, linux-mips

On 07/11/2016 11:07 AM, James Hogan wrote:
> Hi Leonid,
>
> On Mon, Jul 11, 2016 at 11:02:00AM -0700, Leonid Yegoshin wrote:
>> On 07/10/2016 06:04 AM, yhb@ruijie.com.cn wrote:
>>> Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
>>>    when asid_cache(cpu) wraps to 0.
>>>
>>> Suppose that asid_cache(cpu) wraps to 0 every n days.
>>> case 1:
>>> (1)Process 1 got ASID 0x101.
>>> (2)Process 1 slept for n days.
>>> (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
>>> (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
>>>
>>> case 2:
>>> (1)Process 1 got ASID 0x101 on CPU 1.
>>> (2)Process 1 migrated to CPU 2.
>>> (3)Process 1 migrated to CPU 1 after n days.
>>> (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
>>> (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of process 2.
>>>
>>> So we need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
>>>
>>> Signed-off-by: yhb <yhb@ruijie.com.cn>
>>>
>> I think a more clear description should be given here - there is no
>> indication that wrap happens over 32bit integer.
>>
>> And taking into account "n days" frequency - can we just kill all local
>> ASIDs in all processes (additionally to local_flush_tlb_all) and enforce
>> reassignment if wrap happens? It should be a very rare event, you are
>> first to hit this.
>>
>> It seems to be some localized stuff in get_new_mmu_context() instead of
>> widespread patching.
> That is what this patch does, but to do so it appears you need to lock
> the other tasks one by one, and that must be doable from a context
> switch, i.e. hardirq context, and that requires the task lock to be of
> the _irqsave variant, hence the widespread changes and the relatively
> tiny MIPS change hidden in the middle.
>
Not exactly. The change must be done only for the local CPU, the one
executing get_new_mmu_context() at that moment. Just prevent preemption
here and the change of cpu_context(THIS_CPU,...) can be done safely -
other CPUs don't do anything with this variable besides killing it
(writing 0 to it).

You can look into flush_tlb_mm() for an example of how it is cleared
for a single memory map.

We have a macro to safely walk all processes, right? (I don't remember
its name.)
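
(A rough sketch of that localized approach, for illustration only - it
deliberately ignores the question of whether the other mm_structs can
disappear underneath us, which is exactly what is discussed below:)

/* Sketch: called from get_new_mmu_context() with preemption already
 * disabled; only the local CPU's slot of each context is touched, so
 * no task_lock() is taken here. */
static inline void clear_other_local_contexts(struct mm_struct *mm,
					      unsigned long cpu)
{
	struct task_struct *p;

	read_lock(&tasklist_lock);
	for_each_process(p) {
		if (p->mm && p->mm != mm)
			cpu_context(cpu, p->mm) = 0;
	}
	read_unlock(&tasklist_lock);
}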

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 18:19     ` [PATCH] " Leonid Yegoshin
  2016-07-11 18:19       ` Leonid Yegoshin
@ 2016-07-11 19:21       ` James Hogan
  2016-07-11 19:21         ` James Hogan
  2016-07-11 19:39         ` Leonid Yegoshin
  1 sibling, 2 replies; 18+ messages in thread
From: James Hogan @ 2016-07-11 19:21 UTC (permalink / raw)
  To: Leonid Yegoshin; +Cc: yhb, ralf, linux-mips

[-- Attachment #1: Type: text/plain, Size: 3588 bytes --]

On Mon, Jul 11, 2016 at 11:19:30AM -0700, Leonid Yegoshin wrote:
> On 07/11/2016 11:07 AM, James Hogan wrote:
> > Hi Leonid,
> >
> > On Mon, Jul 11, 2016 at 11:02:00AM -0700, Leonid Yegoshin wrote:
> >> On 07/10/2016 06:04 AM, yhb@ruijie.com.cn wrote:
> >>> Subject: [PATCH] MIPS: We need to clear MMU contexts of all other processes
> >>>    when asid_cache(cpu) wraps to 0.
> >>>
> >>> Suppose that asid_cache(cpu) wraps to 0 every n days.
> >>> case 1:
> >>> (1)Process 1 got ASID 0x101.
> >>> (2)Process 1 slept for n days.
> >>> (3)asid_cache(cpu) wrapped to 0x101, and process 2 got ASID 0x101.
> >>> (4)Process 1 is woken,and ASID of process 1 is same as ASID of process 2.
> >>>
> >>> case 2:
> >>> (1)Process 1 got ASID 0x101 on CPU 1.
> >>> (2)Process 1 migrated to CPU 2.
> >>> (3)Process 1 migrated to CPU 1 after n days.
> >>> (4)asid_cache on CPU 1 wrapped to 0x101, and process 2 got ASID 0x101.
> >>> (5)Process 1 is scheduled, and ASID of process 1 is same as ASID of process 2.
> >>>
> >>> So we need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
> >>>
> >>> Signed-off-by: yhb <yhb@ruijie.com.cn>
> >>>
> >> I think a more clear description should be given here - there is no
> >> indication that wrap happens over 32bit integer.
> >>
> >> And taking into account "n days" frequency - can we just kill all local
> >> ASIDs in all processes (additionally to local_flush_tlb_all) and enforce
> >> reassignment if wrap happens? It should be a very rare event, you are
> >> first to hit this.
> >>
> >> It seems to be some localized stuff in get_new_mmu_context() instead of
> >> widespread patching.
> > That is what this patch does, but to do so it appears you need to lock
> > the other tasks one by one, and that must be doable from a context
> > switch, i.e. hardirq context, and that requires the task lock to be of
> > the _irqsave variant, hence the widespread changes and the relatively
> > tiny MIPS change hidden in the middle.
> >
> Not exactly. The change must be done only for local CPU which executes 
> at the moment get_new_mmu_context(). Just prevent preemption here and 
> change of cpu_context(THIS_CPU,...) can be done safely - other CPUs 
> don't do anything with this variable besides killing it (writing 0 to it).

Right, but I was thinking more along the lines of whether you can ensure
the other tasks / mm continues to exist. I think this is partly achieved
by the read_lock'ing of tasklist_lock, but also possibly by the
find_lock_task_mm() call, which has a comment saying:

/*
 * The process p may have detached its own ->mm while exiting or through
 * use_mm(), but one or more of its subthreads may still have a valid
 * pointer.  Return p, or any of its subthreads with a valid ->mm, with
 * task_lock() held.
 */

(but of course I could be mistaken and something else guarantees it
won't go away).

Note also that I have a patch I'm about to submit which changes some of
those assignments of 0 to assign 1 instead (so as not to confuse the
cache management code into thinking the CPU has never run the code when
it has, while still triggering ASID regeneration). That applies here
too, so it should perhaps be doing something like this instead:

if (t->mm != mm && cpu_context(cpu, t->mm))
	cpu_context(cpu, t->mm) = 1;

Cheers
James

> 
> You can look into flush_tlb_mm() for example how it is cleared for 
> single memory map.
> 
> We have a macro to safely walk all processes, right? (don't remember 
> it's name).
> 
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 19:21       ` James Hogan
  2016-07-11 19:21         ` James Hogan
@ 2016-07-11 19:39         ` Leonid Yegoshin
  2016-07-11 19:39           ` Leonid Yegoshin
  2016-07-11 20:18           ` James Hogan
  1 sibling, 2 replies; 18+ messages in thread
From: Leonid Yegoshin @ 2016-07-11 19:39 UTC (permalink / raw)
  To: James Hogan; +Cc: yhb, ralf, linux-mips

On 07/11/2016 12:21 PM, James Hogan wrote:
> On Mon, Jul 11, 2016 at 11:19:30AM -0700, Leonid Yegoshin wrote:
>> On 07/11/2016 11:07 AM, James Hogan wrote:
>>>
>> Not exactly. The change must be done only for local CPU which executes
>> at the moment get_new_mmu_context(). Just prevent preemption here and
>> change of cpu_context(THIS_CPU,...) can be done safely - other CPUs
>> don't do anything with this variable besides killing it (writing 0 to it).
> Right, but I was thinking more along the lines of whether you can ensure
> the other tasks / mm continues to exist. I think this is partly achieved
> by the read_lock'ing of tasklist_lock, but also possibly by the
> find_lock_task_mm() call, which has a comment saying:
>
> /*
>   * The process p may have detached its own ->mm while exiting or through
>   * use_mm(), but one or more of its subthreads may still have a valid
>   * pointer.  Return p, or any of its subthreads with a valid ->mm, with
>   * task_lock() held.
>   */
>
> (but of course I could be mistaken and something else guarantees it
> won't go away).
I haven't looked into the details of that, but a safe way to do it is
to walk through all memory maps and lock each one before changing it.

And to walk through memory maps we could use something like
'find_lock_task_mm', but if there is a concern like you stated above
then we could walk through all subthreads of the task, or just through
all threads in the system - anyway, this event (an ASID wrap) is pretty
rare.

The advantage is in keeping all that stuff local and avoiding patching
other arches and common code.

>
> Note also that I have a patch I'm about to submit which changes some of
> those assignments of 0 to assign 1 instead (so as not to confuse the
> cache management code into thinking the CPU has never run the code when
> it has, while still triggering ASID regeneration). That applies here
> too, so it should perhaps be doing something like this instead:
>
> if (t->mm != mm && cpu_context(cpu, t->mm))
> 	cpu_context(cpu, t->mm) = 1;
Not sure, but did you have a chance to look into having another
variable for cache flush control? It may be that more states will be
needed in future, so - just decouple the two, TLB and cache control.
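
(Just to illustrate what decoupling them might mean - the structure and
field names here are made up, nothing like this exists in the tree:)

/* Hypothetical per-CPU, per-mm state: TLB (ASID) handling and the
 * "has this CPU ever executed from this mm" cache bookkeeping kept in
 * separate fields instead of both being encoded in cpu_context(). */
struct mips_mm_cpu_state {
	unsigned long asid;		/* 0 => regenerate on next activation */
	unsigned long cache_seen;	/* non-zero once cache state is known */
};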

- Leonid.

>
> Cheers
> James
>
>> You can look into flush_tlb_mm() for example how it is cleared for
>> single memory map.
>>
>> We have a macro to safely walk all processes, right? (don't remember
>> it's name).
>>
>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] MIPS: We need to clear MMU contexts of all other processes when asid_cache(cpu) wraps to 0.
  2016-07-11 19:39         ` Leonid Yegoshin
  2016-07-11 19:39           ` Leonid Yegoshin
@ 2016-07-11 20:18           ` James Hogan
  2016-07-11 20:18             ` James Hogan
  1 sibling, 1 reply; 18+ messages in thread
From: James Hogan @ 2016-07-11 20:18 UTC (permalink / raw)
  To: Leonid Yegoshin; +Cc: yhb, ralf, linux-mips

[-- Attachment #1: Type: text/plain, Size: 966 bytes --]

Hi Leonid,

On Mon, Jul 11, 2016 at 12:39:03PM -0700, Leonid Yegoshin wrote:
> On 07/11/2016 12:21 PM, James Hogan wrote:
> > Note also that I have a patch I'm about to submit which changes some of
> > those assignments of 0 to assign 1 instead (so as not to confuse the
> > cache management code into thinking the CPU has never run the code when
> > it has, while still triggering ASID regeneration). That applies here
> > too, so it should perhaps be doing something like this instead:
> >
> > if (t->mm != mm && cpu_context(cpu, t->mm))
> > 	cpu_context(cpu, t->mm) = 1;
> Not sure, but did you have chance to look into having another variable 
> for cache flush control? It can be that some more states may be needed 
> in future, so - just disjoin both, TLB and cache coontrol.

No, I haven't yet. I'll Cc you so we can discuss there instead, and in
the meantime perhaps it's best to ignore what I said above for this
patch.

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread
