Date: Tue, 11 Feb 2020 16:52:52 -0400
From: Jason Gunthorpe
To: linux-mm@kvack.org
Cc: Michal Hocko, Jérôme Glisse, Christoph Hellwig
Subject: [PATCH v3] mm/mmu_notifier: prevent unpaired invalidate_start and invalidate_end
Message-ID: <20200211205252.GA10003@ziepe.ca>

Many users of the mmu_notifier invalidate_range callbacks maintain
locking/counters/etc on a paired basis and have long expected that
invalidate_range_start/end() are always paired.

For instance kvm_mmu_notifier_invalidate_range_end() undoes
kvm->mmu_notifier_count which was incremented during start().

The recent change to add non-blocking notifiers breaks this assumption
when multiple notifiers are present in the list. When -EAGAIN is returned
from an invalidate_range_start() then no invalidate_range_end() callbacks
are called, even for subscriptions whose start() had already been called.

Unfortunately, due to the RCU list traversal we can't reliably generate a
subset of the linked list representing the notifiers already called to
generate an invalidate_range_end() pairing.

One case works correctly: if only one subscription requires
invalidate_range_end() and it is the last entry in the hlist. In this
case, when invalidate_range_start() returns -EAGAIN there will be nothing
to unwind.

Keep the notifier hlist sorted so that notifiers that require
invalidate_range_end() are always last, and if two are added then disable
non-blocking invalidation for the mm.

A warning is printed for this case. If in future we determine this never
happens then we can simply fail during registration when there are
unsupported combinations of notifiers.

Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
Cc: Michal Hocko
Cc: "Jérôme Glisse"
Cc: Christoph Hellwig
Signed-off-by: Jason Gunthorpe
---
 mm/mmu_notifier.c | 53 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

v1: https://lore.kernel.org/linux-mm/20190724152858.GB28493@ziepe.ca/
v2: https://lore.kernel.org/linux-mm/20190807191627.GA3008@ziepe.ca/
 * Abandon attempting to fix it by calling invalidate_range_end() during an
   EAGAIN start
 * Just trivially ban multiple subscriptions
v3:
 * Be more sophisticated, ban only multiple subscriptions if the result is a
   failure. Allows multiple subscriptions without invalidate_range_end
 * Include a printk when this condition is hit (Michal)

At this point the rework Christoph requested during the first posting
is completed and there are now only 3 drivers using
invalidate_range_end():

drivers/misc/mic/scif/scif_dma.c:       .invalidate_range_end = scif_mmu_notifier_invalidate_range_end};
drivers/misc/sgi-gru/grutlbpurge.c:     .invalidate_range_end   = gru_invalidate_range_end,
virt/kvm/kvm_main.c:    .invalidate_range_end   = kvm_mmu_notifier_invalidate_range_end,

While I think it is unlikely that any of these drivers will be used in
combination with each other, display a printk in hopes to check.

Someday I expect to just fail the registration on this condition.

I think this also addresses Michal's concern about a 'big hammer' as
it probably won't ever trigger now.

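For anyone who wants to poke at the ordering rule outside the kernel, here
is a minimal userspace sketch of it. It assumes a plain singly linked list
in place of the RCU hlist and leaves out the spinlock and the warning.
toy_notifier, toy_subscriptions, toy_add_subscription() and toy_is_sorted()
are hypothetical names made up for illustration, not kernel API, and
has_end stands in for ops->invalidate_range_end being non-NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mmu_notifier; not the kernel type. */
struct toy_notifier {
	const char *name;
	bool has_end;	/* models ops->invalidate_range_end != NULL */
	struct toy_notifier *next;
};

/* Hypothetical stand-in for struct mmu_notifier_subscriptions. */
struct toy_subscriptions {
	struct toy_notifier *head;
	bool no_blocking;
};

/* Same insertion rule as the patch, on a plain singly linked list. */
static void toy_add_subscription(struct toy_subscriptions *subs,
				 struct toy_notifier *n)
{
	struct toy_notifier *last = NULL;
	struct toy_notifier *itr;

	for (itr = subs->head; itr; itr = itr->next)
		last = itr;

	/* A second end() user means a failed start() could not be unwound. */
	if (last && last->has_end && n->has_end)
		subs->no_blocking = true;
	if (!last || !last->has_end)
		subs->no_blocking = false;

	if (last && n->has_end) {
		/* end() users are appended at the tail */
		n->next = NULL;
		last->next = n;
	} else {
		/* everything else is pushed at the head */
		n->next = subs->head;
		subs->head = n;
	}
}

/* Returns true if no notifier without end() follows one with end(). */
static bool toy_is_sorted(const struct toy_subscriptions *subs)
{
	const struct toy_notifier *itr;
	bool seen_end = false;

	for (itr = subs->head; itr; itr = itr->next) {
		if (itr->has_end)
			seen_end = true;
		else if (seen_end)
			return false;
	}
	return true;
}
```

Adding notifiers in any mix of has_end values should leave
toy_is_sorted() true, and no_blocking should only become true once a
second has_end notifier is added behind an existing one.
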
Regards,
Jason

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index ef3973a5d34a94..f3aba7a970f576 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -37,7 +37,8 @@ struct lockdep_map __mmu_notifier_invalidate_range_start_map = {
 struct mmu_notifier_subscriptions {
 	/* all mmu notifiers registered in this mm are queued in this list */
 	struct hlist_head list;
-	bool has_itree;
+	u8 has_itree;
+	u8 no_blocking;
 	/* to serialize the list modifications and hlist_unhashed */
 	spinlock_t lock;
 	unsigned long invalidate_seq;
@@ -475,6 +476,10 @@ static int mn_hlist_invalidate_range_start(
 	int ret = 0;
 	int id;
 
+	if (unlikely(subscriptions->no_blocking &&
+		     !mmu_notifier_range_blockable(range)))
+		return -EAGAIN;
+
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(subscription, &subscriptions->list, hlist) {
 		const struct mmu_notifier_ops *ops = subscription->ops;
@@ -590,6 +595,48 @@ void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 	srcu_read_unlock(&srcu, id);
 }
 
+/*
+ * Add a hlist subscription to the list. The list is kept sorted by the
+ * existence of ops->invalidate_range_end. If there is more than one
+ * invalidate_range_end in the list then this process can no longer support
+ * non-blocking invalidation.
+ *
+ * non-blocking invalidation is problematic as a requirement to block results in
+ * the invalidation being aborted, however due to the use of RCU we have no
+ * reliable way to ensure that every successful invalidate_range_start() results
+ * in a call to invalidate_range_end().
+ *
+ * Thus to support blocking only the last subscription in the list can have
+ * invalidate_range_end() set.
+ */
+static void
+mn_hist_add_subscription(struct mmu_notifier_subscriptions *subscriptions,
+			 struct mmu_notifier *subscription)
+{
+	struct mmu_notifier *last = NULL;
+	struct mmu_notifier *itr;
+
+	hlist_for_each_entry(itr, &subscriptions->list, hlist)
+		last = itr;
+
+	if (last && last->ops->invalidate_range_end &&
+	    subscription->ops->invalidate_range_end) {
+		subscriptions->no_blocking = true;
+		pr_warn_once(
+			"%s (%d) created two mmu_notifier's with invalidate_range_end(): %ps and %ps, non-blocking notifiers disabled\n",
+			current->comm, current->pid,
+			last->ops->invalidate_range_end,
+			subscription->ops->invalidate_range_end);
+	}
+	if (!last || !last->ops->invalidate_range_end)
+		subscriptions->no_blocking = false;
+
+	if (last && subscription->ops->invalidate_range_end)
+		hlist_add_behind_rcu(&subscription->hlist, &last->hlist);
+	else
+		hlist_add_head_rcu(&subscription->hlist, &subscriptions->list);
+}
+
 /*
  * Same as mmu_notifier_register but here the caller must hold the mmap_sem in
  * write mode. A NULL mn signals the notifier is being registered for itree
@@ -660,8 +707,8 @@ int __mmu_notifier_register(struct mmu_notifier *subscription,
 		subscription->users = 1;
 
 		spin_lock(&mm->notifier_subscriptions->lock);
-		hlist_add_head_rcu(&subscription->hlist,
-				   &mm->notifier_subscriptions->list);
+		mn_hist_add_subscription(mm->notifier_subscriptions,
+					 subscription);
 		spin_unlock(&mm->notifier_subscriptions->lock);
 	} else
 		mm->notifier_subscriptions->has_itree = true;
-- 
2.25.0