* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
@ 2021-02-05 16:04 kernel test robot
0 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-02-05 16:04 UTC (permalink / raw)
To: kbuild
CC: kbuild-all@lists.01.org
In-Reply-To: <20210205103259.42866-2-pbonzini@redhat.com>
References: <20210205103259.42866-2-pbonzini@redhat.com>
TO: Paolo Bonzini <pbonzini@redhat.com>
TO: linux-kernel@vger.kernel.org
TO: kvm@vger.kernel.org
CC: jgg@ziepe.ca
CC: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: dan.j.williams@intel.com
Hi Paolo,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v5.11-rc6]
[cannot apply to kvm/linux-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
:::::: branch date: 5 hours ago
:::::: commit date: 5 hours ago
config: i386-randconfig-s001-20210205 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-215-g0fb77bb6-dirty
# https://github.com/0day-ci/linux/commit/ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
git checkout ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
"sparse warnings: (new ones prefixed by >>)"
mm/memory.c:5197:22: sparse: sparse: cast removes address space '__user' of expression
mm/memory.c:943:17: sparse: sparse: context imbalance in 'copy_pte_range' - different lock contexts for basic block
mm/memory.c:1623:16: sparse: sparse: context imbalance in '__get_locked_pte' - different lock contexts for basic block
mm/memory.c:1672:9: sparse: sparse: context imbalance in 'insert_page' - different lock contexts for basic block
mm/memory.c:2174:17: sparse: sparse: context imbalance in 'remap_pte_range' - different lock contexts for basic block
mm/memory.c:2419:17: sparse: sparse: context imbalance in 'apply_to_pte_range' - unexpected unlock
mm/memory.c:2676:9: sparse: sparse: context imbalance in 'wp_page_copy' - different lock contexts for basic block
mm/memory.c:3022:17: sparse: sparse: context imbalance in 'wp_pfn_shared' - unexpected unlock
mm/memory.c:3085:19: sparse: sparse: context imbalance in 'do_wp_page' - different lock contexts for basic block
mm/memory.c:3657:19: sparse: sparse: context imbalance in 'pte_alloc_one_map' - different lock contexts for basic block
mm/memory.c:3884:17: sparse: sparse: context imbalance in 'finish_fault' - unexpected unlock
mm/memory.c:3993:9: sparse: sparse: context imbalance in 'do_fault_around' - unexpected unlock
>> mm/memory.c:4712:5: sparse: sparse: context imbalance in 'follow_invalidate_pte' - wrong count at exit
mm/memory.c:4827:23: sparse: sparse: context imbalance in 'follow_pfn' - unexpected unlock
mm/memory.c:4857:9: sparse: sparse: context imbalance in 'follow_phys' - unexpected unlock
vim +/follow_invalidate_pte +4712 mm/memory.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 4711
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 @4712 int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 4713 struct mmu_notifier_range *range, pte_t **ptepp,
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 4714 pmd_t **pmdpp, spinlock_t **ptlp)
f8ad0f499fad5c Johannes Weiner 2009-06-16 4715 {
f8ad0f499fad5c Johannes Weiner 2009-06-16 4716 pgd_t *pgd;
c2febafc67734a Kirill A. Shutemov 2017-03-09 4717 p4d_t *p4d;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4718 pud_t *pud;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4719 pmd_t *pmd;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4720 pte_t *ptep;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4721
f8ad0f499fad5c Johannes Weiner 2009-06-16 4722 pgd = pgd_offset(mm, address);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4723 if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4724 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4725
c2febafc67734a Kirill A. Shutemov 2017-03-09 4726 p4d = p4d_offset(pgd, address);
c2febafc67734a Kirill A. Shutemov 2017-03-09 4727 if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
c2febafc67734a Kirill A. Shutemov 2017-03-09 4728 goto out;
c2febafc67734a Kirill A. Shutemov 2017-03-09 4729
c2febafc67734a Kirill A. Shutemov 2017-03-09 4730 pud = pud_offset(p4d, address);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4731 if (pud_none(*pud) || unlikely(pud_bad(*pud)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4732 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4733
f8ad0f499fad5c Johannes Weiner 2009-06-16 4734 pmd = pmd_offset(pud, address);
f66055ab6fb973 Andrea Arcangeli 2011-01-13 4735 VM_BUG_ON(pmd_trans_huge(*pmd));
097963959594c5 Ross Zwisler 2017-01-10 4736
097963959594c5 Ross Zwisler 2017-01-10 4737 if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler 2017-01-10 4738 if (!pmdpp)
f8ad0f499fad5c Johannes Weiner 2009-06-16 4739 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4740
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4741 if (range) {
7269f999934b28 Jérôme Glisse 2019-05-13 4742 mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
6f4f13e8d9e27c Jérôme Glisse 2019-05-13 4743 NULL, mm, address & PMD_MASK,
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4744 (address & PMD_MASK) + PMD_SIZE);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4745 mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse 2017-08-31 4746 }
097963959594c5 Ross Zwisler 2017-01-10 4747 *ptlp = pmd_lock(mm, pmd);
097963959594c5 Ross Zwisler 2017-01-10 4748 if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler 2017-01-10 4749 *pmdpp = pmd;
097963959594c5 Ross Zwisler 2017-01-10 4750 return 0;
097963959594c5 Ross Zwisler 2017-01-10 4751 }
097963959594c5 Ross Zwisler 2017-01-10 4752 spin_unlock(*ptlp);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4753 if (range)
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4754 mmu_notifier_invalidate_range_end(range);
097963959594c5 Ross Zwisler 2017-01-10 4755 }
097963959594c5 Ross Zwisler 2017-01-10 4756
097963959594c5 Ross Zwisler 2017-01-10 4757 if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4758 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4759
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4760 if (range) {
7269f999934b28 Jérôme Glisse 2019-05-13 4761 mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
6f4f13e8d9e27c Jérôme Glisse 2019-05-13 4762 address & PAGE_MASK,
1ed7293ac40c5b Matthew Wilcox 2019-01-08 4763 (address & PAGE_MASK) + PAGE_SIZE);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4764 mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse 2017-08-31 4765 }
f8ad0f499fad5c Johannes Weiner 2009-06-16 4766 ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4767 if (!pte_present(*ptep))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4768 goto unlock;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4769 *ptepp = ptep;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4770 return 0;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4771 unlock:
f8ad0f499fad5c Johannes Weiner 2009-06-16 4772 pte_unmap_unlock(ptep, *ptlp);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4773 if (range)
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4774 mmu_notifier_invalidate_range_end(range);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4775 out:
f8ad0f499fad5c Johannes Weiner 2009-06-16 4776 return -EINVAL;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4777 }
f8ad0f499fad5c Johannes Weiner 2009-06-16 4778
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-09 8:14 ` Christoph Hellwig
@ 2021-02-09 9:19 ` Paolo Bonzini
0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-09 9:19 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams,
Daniel Vetter
On 09/02/21 09:14, Christoph Hellwig wrote:
> On Mon, Feb 08, 2021 at 07:18:56PM +0100, Paolo Bonzini wrote:
>> Fair enough. I would expect that pretty much everyone using follow_pfn will
>> at least want to switch to this one (as it's less bad and not impossible to
>> use correctly), but I'll squash this in:
>
>
> Daniel looked into them, so he may correct me, but the other follow_pfn
> users and their destiny are:
>
> - SGX, which is not modular and I think I just saw a patch to kill them
> - v4l videobuf and frame vector: I think those are going away
> entirely as they implement a rather broken pre-dmabuf P2P scheme
> - vfio: should use MMU notifiers eventually
Yes, I'm thinking mostly of vfio, which could use follow_pte as a
short-term fix for just the missing permission check.
There's also s390 PCI, which is also not modular.
Paolo
> Daniel, what happened to your follow_pfn series?
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-08 18:18 ` Paolo Bonzini
@ 2021-02-09 8:14 ` Christoph Hellwig
2021-02-09 9:19 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-02-09 8:14 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Christoph Hellwig, linux-kernel, kvm, jgg, linux-mm,
Andrew Morton, dan.j.williams, Daniel Vetter
On Mon, Feb 08, 2021 at 07:18:56PM +0100, Paolo Bonzini wrote:
> Fair enough. I would expect that pretty much everyone using follow_pfn will
> at least want to switch to this one (as it's less bad and not impossible to
> use correctly), but I'll squash this in:
Daniel looked into them, so he may correct me, but the other follow_pfn
users and their destiny are:
- SGX, which is not modular and I think I just saw a patch to kill them
- v4l videobuf and frame vector: I think those are going away
entirely as they implement a rather broken pre-dmabuf P2P scheme
- vfio: should use MMU notifiers eventually
Daniel, what happened to your follow_pfn series?
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-08 17:39 ` Christoph Hellwig
@ 2021-02-08 18:18 ` Paolo Bonzini
2021-02-09 8:14 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-08 18:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams
On 08/02/21 18:39, Christoph Hellwig wrote:
>> +int follow_pte(struct mm_struct *mm, unsigned long address,
>> + pte_t **ptepp, spinlock_t **ptlp)
>> +{
>> + return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
>> +}
>> +EXPORT_SYMBOL_GPL(follow_pte);
>
> I still don't think this is good as a general API. Please document this
> as KVM only for now, and hopefully next merge window I'll finish an
> export variant restricting us to specific modules.
Fair enough. I would expect that pretty much everyone using follow_pfn
will at least want to switch to this one (as it's less bad and not
impossible to use correctly), but I'll squash this in:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 90b527260edf..24b292fce8e5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1659,8 +1659,8 @@ void free_pgd_range(struct mmu_gather *tlb,
unsigned long addr,
int
copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct
*src_vma);
int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
- struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
- spinlock_t **ptlp);
+ struct mmu_notifier_range *range, pte_t **ptepp,
+ pmd_t **pmdpp, spinlock_t **ptlp);
int follow_pte(struct mm_struct *mm, unsigned long address,
pte_t **ptepp, spinlock_t **ptlp);
int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index 3632f7416248..c8679b15c004 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4792,6 +4792,9 @@ int follow_invalidate_pte(struct mm_struct *mm,
unsigned long address,
* Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
* should be taken for read.
*
+ * KVM uses this function. While it is arguably less bad than
+ * ``follow_pfn``, it is not a good general-purpose API.
+ *
* Return: zero on success, -ve otherwise.
*/
int follow_pte(struct mm_struct *mm, unsigned long address,
@@ -4809,6 +4812,9 @@ EXPORT_SYMBOL_GPL(follow_pte);
*
* Only IO mappings and raw PFN mappings are allowed.
*
+ * This function does not allow the caller to read the permissions
+ * of the PTE. Do not use it.
+ *
* Return: zero and the pfn at @pfn on success, -ve otherwise.
*/
int follow_pfn(struct vm_area_struct *vma, unsigned long address,
(apologies in advance if Thunderbird destroys the patch).
Paolo
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-05 10:32 ` [PATCH 1/2] mm: provide a sane PTE walking API for modules Paolo Bonzini
2021-02-05 13:49 ` Jason Gunthorpe
@ 2021-02-08 17:39 ` Christoph Hellwig
2021-02-08 18:18 ` Paolo Bonzini
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-02-08 17:39 UTC (permalink / raw)
To: Paolo Bonzini
Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams
> +int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
> + struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
> + spinlock_t **ptlp);
This adds a very pointless, overly long line.
> +/**
> + * follow_pte - look up PTE at a user virtual address
> + * @vma: memory mapping
> + * @address: user virtual address
> + * @ptepp: location to store found PTE
> + * @ptlp: location to store the lock for the PTE
> + *
> + * On a successful return, the pointer to the PTE is stored in @ptepp;
> + * the corresponding lock is taken and its location is stored in @ptlp.
> + * The contents of the PTE are only stable until @ptlp is released;
> + * any further use, if any, must be protected against invalidation
> + * with MMU notifiers.
> + *
> + * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
> + * should be taken for read.
> + *
> + * Return: zero on success, -ve otherwise.
> + */
> +int follow_pte(struct mm_struct *mm, unsigned long address,
> + pte_t **ptepp, spinlock_t **ptlp)
> +{
> + return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
> +}
> +EXPORT_SYMBOL_GPL(follow_pte);
I still don't think this is good as a general API. Please document this
as KVM only for now, and hopefully next merge window I'll finish an
export variant restricting us to specific modules.
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-05 10:32 ` [PATCH 1/2] mm: provide a sane PTE walking API for modules Paolo Bonzini
@ 2021-02-05 13:49 ` Jason Gunthorpe
2021-02-08 17:39 ` Christoph Hellwig
1 sibling, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2021-02-05 13:49 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm, linux-mm, Andrew Morton, dan.j.williams
On Fri, Feb 05, 2021 at 05:32:58AM -0500, Paolo Bonzini wrote:
> Currently, the follow_pfn function is exported for modules but
> follow_pte is not. However, follow_pfn is very easy to misuse,
> because it does not provide protections (so most of its callers
> assume the page is writable!) and because it returns after having
> already unlocked the page table lock.
>
> Provide instead a simplified version of follow_pte that does
> not have the pmdpp and range arguments. The older version
> survives as follow_invalidate_pte() for use by fs/dax.c.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/s390/pci/pci_mmio.c | 2 +-
> fs/dax.c | 5 +++--
> include/linux/mm.h | 6 ++++--
> mm/memory.c | 35 ++++++++++++++++++++++++++++++-----
> 4 files changed, 38 insertions(+), 10 deletions(-)
Looks good to me, thanks
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
@ 2021-02-05 13:09 kernel test robot
0 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-02-05 13:09 UTC (permalink / raw)
To: kbuild
CC: kbuild-all@lists.01.org
In-Reply-To: <20210205103259.42866-2-pbonzini@redhat.com>
References: <20210205103259.42866-2-pbonzini@redhat.com>
TO: Paolo Bonzini <pbonzini@redhat.com>
TO: linux-kernel@vger.kernel.org
TO: kvm@vger.kernel.org
CC: jgg@ziepe.ca
CC: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: dan.j.williams@intel.com
Hi Paolo,
I love your patch! Perhaps something to improve:
[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v5.11-rc6]
[cannot apply to kvm/linux-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
:::::: branch date: 3 hours ago
:::::: commit date: 3 hours ago
config: s390-randconfig-s031-20210205 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-215-g0fb77bb6-dirty
# https://github.com/0day-ci/linux/commit/ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
git checkout ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
"sparse warnings: (new ones prefixed by >>)"
mm/memory.c:5197:22: sparse: sparse: cast removes address space '__user' of expression
mm/memory.c:943:17: sparse: sparse: context imbalance in 'copy_pte_range' - different lock contexts for basic block
mm/memory.c:1623:16: sparse: sparse: context imbalance in '__get_locked_pte' - different lock contexts for basic block
mm/memory.c:1672:9: sparse: sparse: context imbalance in 'insert_page' - different lock contexts for basic block
mm/memory.c:2174:17: sparse: sparse: context imbalance in 'remap_pte_range' - different lock contexts for basic block
mm/memory.c:2419:17: sparse: sparse: context imbalance in 'apply_to_pte_range' - unexpected unlock
mm/memory.c:2675:17: sparse: sparse: context imbalance in 'wp_page_copy' - unexpected unlock
mm/memory.c:3022:17: sparse: sparse: context imbalance in 'wp_pfn_shared' - unexpected unlock
mm/memory.c:3085:19: sparse: sparse: context imbalance in 'do_wp_page' - different lock contexts for basic block
mm/memory.c:3657:19: sparse: sparse: context imbalance in 'pte_alloc_one_map' - different lock contexts for basic block
mm/memory.c:3884:17: sparse: sparse: context imbalance in 'finish_fault' - unexpected unlock
mm/memory.c:3993:9: sparse: sparse: context imbalance in 'do_fault_around' - unexpected unlock
>> mm/memory.c:4712:5: sparse: sparse: context imbalance in 'follow_invalidate_pte' - different lock contexts for basic block
mm/memory.c:4827:9: sparse: sparse: context imbalance in 'follow_pfn' - unexpected unlock
vim +/follow_invalidate_pte +4712 mm/memory.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 4711
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 @4712 int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 4713 struct mmu_notifier_range *range, pte_t **ptepp,
ad5c7f2ccaf287 Paolo Bonzini 2021-02-05 4714 pmd_t **pmdpp, spinlock_t **ptlp)
f8ad0f499fad5c Johannes Weiner 2009-06-16 4715 {
f8ad0f499fad5c Johannes Weiner 2009-06-16 4716 pgd_t *pgd;
c2febafc67734a Kirill A. Shutemov 2017-03-09 4717 p4d_t *p4d;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4718 pud_t *pud;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4719 pmd_t *pmd;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4720 pte_t *ptep;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4721
f8ad0f499fad5c Johannes Weiner 2009-06-16 4722 pgd = pgd_offset(mm, address);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4723 if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4724 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4725
c2febafc67734a Kirill A. Shutemov 2017-03-09 4726 p4d = p4d_offset(pgd, address);
c2febafc67734a Kirill A. Shutemov 2017-03-09 4727 if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
c2febafc67734a Kirill A. Shutemov 2017-03-09 4728 goto out;
c2febafc67734a Kirill A. Shutemov 2017-03-09 4729
c2febafc67734a Kirill A. Shutemov 2017-03-09 4730 pud = pud_offset(p4d, address);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4731 if (pud_none(*pud) || unlikely(pud_bad(*pud)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4732 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4733
f8ad0f499fad5c Johannes Weiner 2009-06-16 4734 pmd = pmd_offset(pud, address);
f66055ab6fb973 Andrea Arcangeli 2011-01-13 4735 VM_BUG_ON(pmd_trans_huge(*pmd));
097963959594c5 Ross Zwisler 2017-01-10 4736
097963959594c5 Ross Zwisler 2017-01-10 4737 if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler 2017-01-10 4738 if (!pmdpp)
f8ad0f499fad5c Johannes Weiner 2009-06-16 4739 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4740
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4741 if (range) {
7269f999934b28 Jérôme Glisse 2019-05-13 4742 mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
6f4f13e8d9e27c Jérôme Glisse 2019-05-13 4743 NULL, mm, address & PMD_MASK,
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4744 (address & PMD_MASK) + PMD_SIZE);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4745 mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse 2017-08-31 4746 }
097963959594c5 Ross Zwisler 2017-01-10 4747 *ptlp = pmd_lock(mm, pmd);
097963959594c5 Ross Zwisler 2017-01-10 4748 if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler 2017-01-10 4749 *pmdpp = pmd;
097963959594c5 Ross Zwisler 2017-01-10 4750 return 0;
097963959594c5 Ross Zwisler 2017-01-10 4751 }
097963959594c5 Ross Zwisler 2017-01-10 4752 spin_unlock(*ptlp);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4753 if (range)
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4754 mmu_notifier_invalidate_range_end(range);
097963959594c5 Ross Zwisler 2017-01-10 4755 }
097963959594c5 Ross Zwisler 2017-01-10 4756
097963959594c5 Ross Zwisler 2017-01-10 4757 if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4758 goto out;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4759
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4760 if (range) {
7269f999934b28 Jérôme Glisse 2019-05-13 4761 mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
6f4f13e8d9e27c Jérôme Glisse 2019-05-13 4762 address & PAGE_MASK,
1ed7293ac40c5b Matthew Wilcox 2019-01-08 4763 (address & PAGE_MASK) + PAGE_SIZE);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4764 mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse 2017-08-31 4765 }
f8ad0f499fad5c Johannes Weiner 2009-06-16 4766 ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4767 if (!pte_present(*ptep))
f8ad0f499fad5c Johannes Weiner 2009-06-16 4768 goto unlock;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4769 *ptepp = ptep;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4770 return 0;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4771 unlock:
f8ad0f499fad5c Johannes Weiner 2009-06-16 4772 pte_unmap_unlock(ptep, *ptlp);
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4773 if (range)
ac46d4f3c43241 Jérôme Glisse 2018-12-28 4774 mmu_notifier_invalidate_range_end(range);
f8ad0f499fad5c Johannes Weiner 2009-06-16 4775 out:
f8ad0f499fad5c Johannes Weiner 2009-06-16 4776 return -EINVAL;
f8ad0f499fad5c Johannes Weiner 2009-06-16 4777 }
f8ad0f499fad5c Johannes Weiner 2009-06-16 4778
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
* [PATCH 1/2] mm: provide a sane PTE walking API for modules
2021-02-05 10:32 [PATCH 0/2] KVM: do not assume PTE is writable after follow_pfn Paolo Bonzini
@ 2021-02-05 10:32 ` Paolo Bonzini
2021-02-05 13:49 ` Jason Gunthorpe
2021-02-08 17:39 ` Christoph Hellwig
0 siblings, 2 replies; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-05 10:32 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: jgg, linux-mm, Andrew Morton, dan.j.williams
Currently, the follow_pfn function is exported for modules but
follow_pte is not. However, follow_pfn is very easy to misuse,
because it does not provide protections (so most of its callers
assume the page is writable!) and because it returns after having
already unlocked the page table lock.
Provide instead a simplified version of follow_pte that does
not have the pmdpp and range arguments. The older version
survives as follow_invalidate_pte() for use by fs/dax.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/s390/pci/pci_mmio.c | 2 +-
fs/dax.c | 5 +++--
include/linux/mm.h | 6 ++++--
mm/memory.c | 35 ++++++++++++++++++++++++++++++-----
4 files changed, 38 insertions(+), 10 deletions(-)
diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
index 18f2d10c3176..5922dce94bfd 100644
--- a/arch/s390/pci/pci_mmio.c
+++ b/arch/s390/pci/pci_mmio.c
@@ -170,7 +170,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
if (!(vma->vm_flags & VM_WRITE))
goto out_unlock_mmap;
- ret = follow_pte(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl);
+ ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
if (ret)
goto out_unlock_mmap;
@@ -311,7 +311,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
if (!(vma->vm_flags & VM_WRITE))
goto out_unlock_mmap;
- ret = follow_pte(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl);
+ ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
if (ret)
goto out_unlock_mmap;
diff --git a/fs/dax.c b/fs/dax.c
index 26d5dcd2d69e..b14874c7b983 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -810,11 +810,12 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
address = pgoff_address(index, vma);
/*
- * Note because we provide range to follow_pte it will call
+ * follow_invalidate_pte() will use the range to call
* mmu_notifier_invalidate_range_start() on our behalf before
* taking any lock.
*/
- if (follow_pte(vma->vm_mm, address, &range, &ptep, &pmdp, &ptl))
+ if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
+ &pmdp, &ptl))
continue;
/*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..90b527260edf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1658,9 +1658,11 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
unsigned long end, unsigned long floor, unsigned long ceiling);
int
copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+ struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
+ spinlock_t **ptlp);
int follow_pte(struct mm_struct *mm, unsigned long address,
- struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
- spinlock_t **ptlp);
+ pte_t **ptepp, spinlock_t **ptlp);
int follow_pfn(struct vm_area_struct *vma, unsigned long address,
unsigned long *pfn);
int follow_phys(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..b247d296931c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4709,9 +4709,9 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
}
#endif /* __PAGETABLE_PMD_FOLDED */
-int follow_pte(struct mm_struct *mm, unsigned long address,
- struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
- spinlock_t **ptlp)
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+ struct mmu_notifier_range *range, pte_t **ptepp,
+ pmd_t **pmdpp, spinlock_t **ptlp)
{
pgd_t *pgd;
p4d_t *p4d;
@@ -4776,6 +4776,31 @@ int follow_pte(struct mm_struct *mm, unsigned long address,
return -EINVAL;
}
+/**
+ * follow_pte - look up PTE at a user virtual address
+ * @vma: memory mapping
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
+ * should be taken for read.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+ pte_t **ptepp, spinlock_t **ptlp)
+{
+ return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
+}
+EXPORT_SYMBOL_GPL(follow_pte);
+
/**
* follow_pfn - look up PFN at a user virtual address
* @vma: memory mapping
@@ -4796,7 +4821,7 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
return ret;
- ret = follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl);
+ ret = follow_pte(vma->vm_mm, address, &ptep, &ptl);
if (ret)
return ret;
*pfn = pte_pfn(*ptep);
@@ -4817,7 +4842,7 @@ int follow_phys(struct vm_area_struct *vma,
if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
goto out;
- if (follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl))
+ if (follow_pte(vma->vm_mm, address, &ptep, &ptl))
goto out;
pte = *ptep;
--
2.26.2