Date: Thu, 06 Aug 2020 23:17:16 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, axboe@kernel.dk, hch@lst.de,
 jannh@google.com, keescook@chromium.org, linux-mm@kvack.org,
 luto@amacapital.net, mathieu.desnoyers@efficios.com,
 mm-commits@vger.kernel.org, npiggin@gmail.com, peterz@infradead.org,
 stable@vger.kernel.org, torvalds@linux-foundation.org, will@kernel.org
Subject: [patch 004/163] mm: fix kthread_use_mm() vs TLB invalidate
Message-ID: <20200807061716.0Q5HJvMJ8%akpm@linux-foundation.org>
In-Reply-To: <20200806231643.a2711a608dd0f18bff2caf2b@linux-foundation.org>

From: Peter Zijlstra
Subject: mm: fix kthread_use_mm() vs TLB invalidate

For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable.  This then presents the
following race condition:

  CPU0			CPU1

  flush_tlb_mm(mm)	use_mm(mm)
    <send-IPI>
			  tsk->active_mm = mm;
			  <IPI>
			    if (tsk->active_mm == mm)
				// flush TLBs
			  </IPI>
			  switch_mm(old_mm,mm,tsk);

Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.

Avoid this by disabling IRQs across changing ->active_mm and
switch_mm().

Of the (SMP) architectures that have IPI based TLB invalidate:

  Alpha    - checks active_mm
  ARC      - ASID specific
  IA64     - checks active_mm
  MIPS     - ASID specific flush
  OpenRISC - shoots down world
  PARISC   - shoots down world
  SH       - ASID specific
  SPARC    - ASID specific
  x86      - N/A
  xtensa   - checks active_mm

So at the very least Alpha, IA64 and Xtensa are suspect.

On top of this, for scheduler consistency we need at least preemption
disabled across changing tsk->mm and doing switch_mm(), which is
currently provided by task_lock(), but that's not sufficient for
PREEMPT_RT.

[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass.net
Signed-off-by: Peter Zijlstra (Intel)
Reported-by: Andy Lutomirski
Cc: Nicholas Piggin
Cc: Jens Axboe
Cc: Kees Cook
Cc: Jann Horn
Cc: Will Deacon
Cc: Christoph Hellwig
Cc: Nicholas Piggin
Cc: Mathieu Desnoyers
Cc:
Signed-off-by: Andrew Morton
---

 kernel/kthread.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate
+++ a/kernel/kthread.c
@@ -1241,13 +1241,16 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	/* Hold off tlb flush IPIs while switching mm's */
+	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
 		mmgrab(mm);
 		tsk->active_mm = mm;
 	}
 	tsk->mm = mm;
-	switch_mm(active_mm, mm, tsk);
+	switch_mm_irqs_off(active_mm, mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 #ifdef finish_arch_post_lock_switch
 	finish_arch_post_lock_switch();
@@ -1276,9 +1279,11 @@ void kthread_unuse_mm(struct mm_struct *
 
 	task_lock(tsk);
 	sync_mm_rss(mm);
+	local_irq_disable();
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
_
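
For readers who want to poke at the ordering problem outside the kernel,
here is a rough user-space sketch of the race described in the changelog.
It is a toy model and not kernel code: struct cpu, hw_mm, pending_ipi,
send_flush_ipi() and use_mm_with_ipi() are all made-up names, and the
"IPI" is just a function call placed between the two steps. The handler
follows the "checks active_mm" pattern the changelog attributes to Alpha,
IA64 and xtensa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name below is hypothetical, not a kernel API. */
struct mm { int id; };

struct cpu {
	struct mm *active_mm;    /* software view, like tsk->active_mm */
	struct mm *hw_mm;        /* address space the "hardware" has loaded */
	bool irqs_off;           /* models local_irq_disable() */
	struct mm *pending_ipi;  /* flush request held back while irqs are off */
	struct mm *last_flushed; /* address space the last flush really hit */
};

/* Flush handler that decides based on active_mm. */
static void flush_ipi_handler(struct cpu *c, struct mm *target)
{
	if (c->active_mm == target)
		c->last_flushed = c->hw_mm; /* flushes whatever is loaded */
}

/* Deliver the flush now, or park it if interrupts are disabled. */
static void send_flush_ipi(struct cpu *c, struct mm *target)
{
	if (c->irqs_off)
		c->pending_ipi = target;
	else
		flush_ipi_handler(c, target);
}

/* Re-enable interrupts and run any flush that was held back. */
static void irq_enable(struct cpu *c)
{
	c->irqs_off = false;
	if (c->pending_ipi) {
		struct mm *target = c->pending_ipi;

		c->pending_ipi = NULL;
		flush_ipi_handler(c, target);
	}
}

/*
 * Switch to mm while a remote flush for mm arrives between the two
 * steps. Returns the address space the flush actually hit.
 */
static struct mm *use_mm_with_ipi(struct cpu *c, struct mm *mm,
				  bool disable_irqs)
{
	assert(c && mm);
	c->last_flushed = NULL;
	if (disable_irqs)
		c->irqs_off = true;
	c->active_mm = mm;     /* step 1: tsk->active_mm = mm */
	send_flush_ipi(c, mm); /* remote flush_tlb_mm(mm) lands here */
	c->hw_mm = mm;         /* step 2: switch_mm() */
	if (disable_irqs)
		irq_enable(c);
	return c->last_flushed;
}
```

Starting from a CPU whose active_mm and hw_mm both point at the old
address space, use_mm_with_ipi(&cpu, &new_mm, false) returns the old mm
(the flush hit the wrong address space), while the same call with
disable_irqs set to true returns the new mm, because the flush is held
back until after the switch. That is the ordering the patch enforces
with local_irq_disable()/local_irq_enable().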