Date: Tue, 22 Dec 2020 15:26:03 -0500
From: Andrea Arcangeli
To: Matthew Wilcox
Cc: Andy Lutomirski, Linus Torvalds, Peter Xu, Nadav Amit, Yu Zhao,
    linux-mm, lkml, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable,
    Minchan Kim, Will Deacon, Peter Zijlstra, Kent Overstreet
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
References: <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com>
    <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com>
    <20201221223041.GL6640@xz-x1>
    <20201222201553.GM874@casper.infradead.org>
In-Reply-To: <20201222201553.GM874@casper.infradead.org>

On Tue, Dec 22, 2020 at 08:15:53PM +0000, Matthew Wilcox wrote:
> On Tue, Dec 22, 2020 at 02:31:52PM -0500, Andrea Arcangeli wrote:
> > My previous suggestion to use a mutex to serialize
> > userfaultfd_writeprotect with a mutex will still work, but we can run
> > as many wrprotect and un-wrprotect as we want in parallel, as long as
> > they're not simultaneous, we can do much better than a mutex.
> >
> > Ideally we would need a new two_group_semaphore, where each group can
> > run as many parallel instances as it wants, but no instance of one
> > group can run in parallel with any instance of the other group. AFIK
> > such a kind of lock doesn't exist right now.
>
> Kent and I worked on one for a bit, and we called it a red-black mutex.
> If team red had the lock, more members of team red could join in.
> If team black had the lock, more members of team black could join in.
> I forget what our rule was around fairness (if team red has the lock,
> and somebody from team black is waiting, can another member of team red
> take the lock, or must they block?)

In this case they would need to block and provide full fairness.

Well, maybe just a bit of unfairness (to let a few more through the door
before it shuts) wouldn't be a deal breaker, but it would need to be
bounded or it'd starve the other color/side indefinitely.

Otherwise an ioctl with mode_wp = true would block forever if more
ioctls with mode_wp = false keep coming in on other CPUs (or the other
way around).

The approximation with rwsem and two atomics provides full fairness in
both read and write mode (originally the read would starve the write
IIRC, which was an issue for all mprotect etc., but not anymore,
thankfully).

> It was to solve the direct-IO vs buffered-IO problem (you can have as many
> direct-IO readers/writers at once or you can have as many buffered-IO
> readers/writers at once, but exclude a mix of direct and buffered I/O).
> In the end, we decided it didn't work all that well.

Well, mixing buffered and direct-IO is certainly not a good practice, so
it's reasonable to leave it up to userland to serialize if such a mix is
needed; the kernel behavior is undefined if the mix is concurrent and
out of order.
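
For illustration only, the two-group lock with the strict fairness rule
discussed above could be sketched in userspace C with pthreads. This is
not code from this thread and not a kernel API: the type
two_group_lock, the tg_* functions and the TG_* group names are all
hypothetical, and a real kernel version would not be built on a pthread
mutex and condition variable. The rule it implements is the one stated
above: once a member of the other group is queued, newcomers of the
holding group block, the current holders drain, and the queued batch of
the other group is admitted as a whole.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical group ids, e.g. mode_wp = true vs. mode_wp = false. */
enum { TG_WRPROTECT = 0, TG_UNPROTECT = 1 };

/* Hypothetical userspace sketch of a two-group ("red-black") lock. */
struct two_group_lock {
	pthread_mutex_t mtx;
	pthread_cond_t cond;
	int owner;       /* group holding the lock, meaningful when active > 0 */
	int active;      /* holders, plus admitted waiters not yet resumed */
	int waiting[2];  /* blocked callers per group */
	int admitted[2]; /* wakeup tokens handed out at handoff, per group */
};

static void tg_init(struct two_group_lock *l)
{
	pthread_mutex_init(&l->mtx, NULL);
	pthread_cond_init(&l->cond, NULL);
	l->owner = TG_WRPROTECT;
	l->active = 0;
	l->waiting[0] = l->waiting[1] = 0;
	l->admitted[0] = l->admitted[1] = 0;
}

static void tg_lock(struct two_group_lock *l, int group)
{
	int other = !group;

	pthread_mutex_lock(&l->mtx);
	if (l->active == 0 ||
	    (l->owner == group && l->waiting[other] == 0)) {
		/* Free, or held by our group with nobody from the other queued. */
		l->owner = group;
		l->active++;
	} else {
		/* Other group holds it, or it is queued behind our group: block. */
		l->waiting[group]++;
		while (l->admitted[group] == 0)
			pthread_cond_wait(&l->cond, &l->mtx);
		/* tg_unlock() already counted us in 'active' at handoff. */
		l->admitted[group]--;
	}
	pthread_mutex_unlock(&l->mtx);
}

static void tg_unlock(struct two_group_lock *l)
{
	pthread_mutex_lock(&l->mtx);
	assert(l->active > 0);
	if (--l->active == 0) {
		/* Last holder out: prefer the other group if it is queued. */
		int next = l->waiting[!l->owner] ? !l->owner : l->owner;

		if (l->waiting[next] > 0) {
			/* Admit the whole queued batch in one go. */
			l->owner = next;
			l->active = l->waiting[next];
			l->admitted[next] += l->waiting[next];
			l->waiting[next] = 0;
			pthread_cond_broadcast(&l->cond);
		}
	}
	pthread_mutex_unlock(&l->mtx);
}
```

In this sketch a caller would bracket the page table update with
tg_lock(&lock, TG_WRPROTECT) or tg_lock(&lock, TG_UNPROTECT) and a
matching tg_unlock(&lock). The unfairness is bounded by construction:
after the first waiter of the other group queues up, nobody new joins
the holding group, so neither side can be starved.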