Date: Fri, 8 Apr 2022 20:54:45 +0800
From: Chao Peng
To: Sean Christopherson
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
	"H . Peter Anvin", Hugh Dickins, Jeff Layton, "J . Bruce Fields",
	Andrew Morton, Mike Rapoport, Steven Price, "Maciej S . Szmigiero",
	Vlastimil Babka, Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov",
	luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
	ak@linux.intel.com, david@redhat.com
Subject: Re: [PATCH v5 02/13] mm: Introduce memfile_notifier
Message-ID: <20220408125445.GA57095@chaop.bj.intel.com>
Reply-To: Chao Peng
References: <20220310140911.50924-1-chao.p.peng@linux.intel.com>
	<20220310140911.50924-3-chao.p.peng@linux.intel.com>

On Tue, Mar 29, 2022 at 06:45:16PM +0000, Sean Christopherson wrote:
> On Thu, Mar 10, 2022, Chao Peng wrote:
> > diff --git a/mm/Makefile b/mm/Makefile
> > index 70d4309c9ce3..f628256dce0d 100644
> > +void memfile_notifier_invalidate(struct memfile_notifier_list *list,
> > +				 pgoff_t start, pgoff_t end)
> > +{
> > +	struct memfile_notifier *notifier;
> > +	int id;
> > +
> > +	id = srcu_read_lock(&srcu);
> > +	list_for_each_entry_srcu(notifier, &list->head, list,
> > +				 srcu_read_lock_held(&srcu)) {
> > +		if (notifier->ops && notifier->ops->invalidate)
>
> Any reason notifier->ops isn't mandatory?

Yes, it's mandatory; I will drop the check here.

>
> > +			notifier->ops->invalidate(notifier, start, end);
> > +	}
> > +	srcu_read_unlock(&srcu, id);
> > +}
> > +
> > +void memfile_notifier_fallocate(struct memfile_notifier_list *list,
> > +				pgoff_t start, pgoff_t end)
> > +{
> > +	struct memfile_notifier *notifier;
> > +	int id;
> > +
> > +	id = srcu_read_lock(&srcu);
> > +	list_for_each_entry_srcu(notifier, &list->head, list,
> > +				 srcu_read_lock_held(&srcu)) {
> > +		if (notifier->ops && notifier->ops->fallocate)
> > +			notifier->ops->fallocate(notifier, start, end);
> > +	}
> > +	srcu_read_unlock(&srcu, id);
> > +}
> > +
> > +void memfile_register_backing_store(struct memfile_backing_store *bs)
> > +{
> > +	BUG_ON(!bs || !bs->get_notifier_list);
> > +
> > +	list_add_tail(&bs->list, &backing_store_list);
> > +}
> > +
> > +void memfile_unregister_backing_store(struct memfile_backing_store *bs)
> > +{
> > +	list_del(&bs->list);
>
> Allowing unregistration of a backing store is broken.  Using the _safe() variant
> is not sufficient to guard against concurrent modification.
> I don't see any reason
> to support this out of the gate, the only reason to support unregistering a backing
> store is if the backing store is implemented as a module, and AFAIK none of the
> backing stores we plan on supporting initially support being built as a module.
> These aren't exported, so it's not like that's even possible.  Registration would
> also be broken if modules are allowed, I'm pretty sure module init doesn't run
> under a global lock.
>
> We can always add this complexity if it's needed in the future, but for now the
> easiest thing would be to tag memfile_register_backing_store() with __init and
> make backing_store_list __ro_after_init.

The only currently supported backing store, shmem, does not need this, so I
can remove it for now.

>
> > +}
> > +
> > +static int memfile_get_notifier_info(struct inode *inode,
> > +				     struct memfile_notifier_list **list,
> > +				     struct memfile_pfn_ops **ops)
> > +{
> > +	struct memfile_backing_store *bs, *iter;
> > +	struct memfile_notifier_list *tmp;
> > +
> > +	list_for_each_entry_safe(bs, iter, &backing_store_list, list) {
> > +		tmp = bs->get_notifier_list(inode);
> > +		if (tmp) {
> > +			*list = tmp;
> > +			if (ops)
> > +				*ops = &bs->pfn_ops;
> > +			return 0;
> > +		}
> > +	}
> > +	return -EOPNOTSUPP;
> > +}
> > +
> > +int memfile_register_notifier(struct inode *inode,
>
> Taking an inode is a bit odd from a user perspective.  Any reason not to take a
> "struct file *" and get the inode here?  That would give callers a hint that they
> need to hold a reference to the file for the lifetime of the registration.

Yes, I can change that.

>
> > +			      struct memfile_notifier *notifier,
> > +			      struct memfile_pfn_ops **pfn_ops)
> > +{
> > +	struct memfile_notifier_list *list;
> > +	int ret;
> > +
> > +	if (!inode || !notifier | !pfn_ops)
>
> Bitwise | instead of logical ||.  But IMO taking in a pfn_ops pointer is silly.
> More below.
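
To illustrate the operator point, here is a standalone userspace sketch (not
kernel code; every function name in it is made up for this example). Since
"!" always yields 0 or 1 and "|" binds tighter than "||", the check as posted
should return the same result as the intended one; the visible difference is
that "|" evaluates both of its operands while "||" stops at the first
non-zero one.

```c
#include <stdbool.h>
#include <stddef.h>

/* The check as written in the patch: "|" between the last two operands. */
static bool args_invalid_bitwise(const void *inode, const void *notifier,
				 const void *pfn_ops)
{
	return !inode || !notifier | !pfn_ops;
}

/* The intended check: logical "||" throughout. */
static bool args_invalid_logical(const void *inode, const void *notifier,
				 const void *pfn_ops)
{
	return !inode || !notifier || !pfn_ops;
}

/* Counts how many operands actually get evaluated. */
static int calls;

static int is_null(const void *p)
{
	calls++;
	return p == NULL;
}

/* "|" has no short-circuit: both operands are always evaluated. */
static int evals_bitwise(const void *a, const void *b)
{
	calls = 0;
	(void)(is_null(a) | is_null(b));
	return calls;
}

/* "||" stops as soon as the left operand is non-zero. */
static int evals_logical(const void *a, const void *b)
{
	calls = 0;
	(void)(is_null(a) || is_null(b));
	return calls;
}
```

So in this sketch the two forms agree for every NULL/non-NULL combination,
and the "||" spelling mainly states the intent and keeps short-circuiting.
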
>
> > +		return -EINVAL;
> > +
> > +	ret = memfile_get_notifier_info(inode, &list, pfn_ops);
> > +	if (ret)
> > +		return ret;
> > +
> > +	spin_lock(&list->lock);
> > +	list_add_rcu(&notifier->list, &list->head);
> > +	spin_unlock(&list->lock);
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(memfile_register_notifier);
> > +
> > +void memfile_unregister_notifier(struct inode *inode,
> > +				 struct memfile_notifier *notifier)
> > +{
> > +	struct memfile_notifier_list *list;
> > +
> > +	if (!inode || !notifier)
> > +		return;
> > +
> > +	BUG_ON(memfile_get_notifier_info(inode, &list, NULL));
>
> Eww.  Rather than force the caller to provide the inode/file and the notifier,
> what about grabbing the backing store itself in the notifier?
>
> 	struct memfile_notifier {
> 		struct list_head list;
> 		struct memfile_notifier_ops *ops;
>
> 		struct memfile_backing_store *bs;
> 	};
>
> That also helps avoid confusion between "ops" and "pfn_ops".  IMO, exposing
> memfile_backing_store to the caller isn't a big deal, and is preferable to having
> to rewalk multiple lists just to delete a notifier.

Agreed, good suggestion.

>
> Then this can become:
>
> 	void memfile_unregister_notifier(struct memfile_notifier *notifier)
> 	{
> 		spin_lock(&notifier->bs->list->lock);
> 		list_del_rcu(&notifier->list);
> 		spin_unlock(&notifier->bs->list->lock);
>
> 		synchronize_srcu(&srcu);
> 	}
>
> and registration can be:
>
> 	int memfile_register_notifier(const struct file *file,
> 				      struct memfile_notifier *notifier)
> 	{
> 		struct memfile_notifier_list *list;
> 		struct memfile_backing_store *bs;
> 		int ret;
>
> 		if (!file || !notifier)
> 			return -EINVAL;
>
> 		list_for_each_entry(bs, &backing_store_list, list) {
> 			list = bs->get_notifier_list(file_inode(file));
> 			if (list) {
> 				notifier->bs = bs;
>
> 				spin_lock(&list->lock);
> 				list_add_rcu(&notifier->list, &list->head);
> 				spin_unlock(&list->lock);
> 				return 0;
> 			}
> 		}
>
> 		return -EOPNOTSUPP;
> 	}
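
A rough userspace sketch of the back-pointer idea quoted above, for
illustration only. It is not kernel code and makes several simplifying
assumptions: locking and SRCU are left out entirely, each backing store owns
a single notifier list (the patch looks up a per-inode list through
get_notifier_list()), and all names (struct list_node, struct backing_store,
register_notifier(), unregister_notifier(), count_notifiers()) are
hypothetical.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list, loosely modeled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_insert_after(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_remove(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Hypothetical stand-in for a backing store with one notifier list. */
struct backing_store {
	const char *name;
	struct list_node notifiers;
};

/* The notifier remembers which backing store it was registered with. */
struct notifier {
	struct list_node link;
	struct backing_store *bs;
};

static int register_notifier(struct backing_store *bs, struct notifier *n)
{
	if (!bs || !n)
		return -1;

	n->bs = bs;	/* back-pointer recorded at registration time */
	list_insert_after(&n->link, &bs->notifiers);
	return 0;
}

/* Only the notifier is needed: no inode/file argument, no list rewalk. */
static void unregister_notifier(struct notifier *n)
{
	/* Real code would lock via n->bs and wait for readers here. */
	list_remove(&n->link);
	n->bs = NULL;
}

static size_t count_notifiers(const struct backing_store *bs)
{
	const struct list_node *pos;
	size_t count = 0;

	for (pos = bs->notifiers.next; pos != &bs->notifiers; pos = pos->next)
		count++;
	return count;
}
```

The point of the sketch is the shape of the interface: once registration
stores the backing store in the notifier, unregistration can take the
notifier alone.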