* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
@ 2021-02-05 16:04 kernel test robot
  0 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-02-05 16:04 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 8932 bytes --]

CC: kbuild-all@lists.01.org
In-Reply-To: <20210205103259.42866-2-pbonzini@redhat.com>
References: <20210205103259.42866-2-pbonzini@redhat.com>
TO: Paolo Bonzini <pbonzini@redhat.com>
TO: linux-kernel@vger.kernel.org
TO: kvm@vger.kernel.org
CC: jgg@ziepe.ca
CC: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: dan.j.williams@intel.com

Hi Paolo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v5.11-rc6]
[cannot apply to kvm/linux-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
:::::: branch date: 5 hours ago
:::::: commit date: 5 hours ago
config: i386-randconfig-s001-20210205 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-215-g0fb77bb6-dirty
        # https://github.com/0day-ci/linux/commit/ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
        git checkout ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>


"sparse warnings: (new ones prefixed by >>)"
   mm/memory.c:5197:22: sparse: sparse: cast removes address space '__user' of expression
   mm/memory.c:943:17: sparse: sparse: context imbalance in 'copy_pte_range' - different lock contexts for basic block
   mm/memory.c:1623:16: sparse: sparse: context imbalance in '__get_locked_pte' - different lock contexts for basic block
   mm/memory.c:1672:9: sparse: sparse: context imbalance in 'insert_page' - different lock contexts for basic block
   mm/memory.c:2174:17: sparse: sparse: context imbalance in 'remap_pte_range' - different lock contexts for basic block
   mm/memory.c:2419:17: sparse: sparse: context imbalance in 'apply_to_pte_range' - unexpected unlock
   mm/memory.c:2676:9: sparse: sparse: context imbalance in 'wp_page_copy' - different lock contexts for basic block
   mm/memory.c:3022:17: sparse: sparse: context imbalance in 'wp_pfn_shared' - unexpected unlock
   mm/memory.c:3085:19: sparse: sparse: context imbalance in 'do_wp_page' - different lock contexts for basic block
   mm/memory.c:3657:19: sparse: sparse: context imbalance in 'pte_alloc_one_map' - different lock contexts for basic block
   mm/memory.c:3884:17: sparse: sparse: context imbalance in 'finish_fault' - unexpected unlock
   mm/memory.c:3993:9: sparse: sparse: context imbalance in 'do_fault_around' - unexpected unlock
>> mm/memory.c:4712:5: sparse: sparse: context imbalance in 'follow_invalidate_pte' - wrong count at exit
   mm/memory.c:4827:23: sparse: sparse: context imbalance in 'follow_pfn' - unexpected unlock
   mm/memory.c:4857:9: sparse: sparse: context imbalance in 'follow_phys' - unexpected unlock

vim +/follow_invalidate_pte +4712 mm/memory.c

^1da177e4c3f41 Linus Torvalds     2005-04-16  4711  
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05 @4712  int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05  4713  			  struct mmu_notifier_range *range, pte_t **ptepp,
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05  4714  			  pmd_t **pmdpp, spinlock_t **ptlp)
f8ad0f499fad5c Johannes Weiner    2009-06-16  4715  {
f8ad0f499fad5c Johannes Weiner    2009-06-16  4716  	pgd_t *pgd;
c2febafc67734a Kirill A. Shutemov 2017-03-09  4717  	p4d_t *p4d;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4718  	pud_t *pud;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4719  	pmd_t *pmd;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4720  	pte_t *ptep;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4721  
f8ad0f499fad5c Johannes Weiner    2009-06-16  4722  	pgd = pgd_offset(mm, address);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4723  	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4724  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4725  
c2febafc67734a Kirill A. Shutemov 2017-03-09  4726  	p4d = p4d_offset(pgd, address);
c2febafc67734a Kirill A. Shutemov 2017-03-09  4727  	if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
c2febafc67734a Kirill A. Shutemov 2017-03-09  4728  		goto out;
c2febafc67734a Kirill A. Shutemov 2017-03-09  4729  
c2febafc67734a Kirill A. Shutemov 2017-03-09  4730  	pud = pud_offset(p4d, address);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4731  	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4732  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4733  
f8ad0f499fad5c Johannes Weiner    2009-06-16  4734  	pmd = pmd_offset(pud, address);
f66055ab6fb973 Andrea Arcangeli   2011-01-13  4735  	VM_BUG_ON(pmd_trans_huge(*pmd));
097963959594c5 Ross Zwisler       2017-01-10  4736  
097963959594c5 Ross Zwisler       2017-01-10  4737  	if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler       2017-01-10  4738  		if (!pmdpp)
f8ad0f499fad5c Johannes Weiner    2009-06-16  4739  			goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4740  
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4741  		if (range) {
7269f999934b28 Jérôme Glisse      2019-05-13  4742  			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
6f4f13e8d9e27c Jérôme Glisse      2019-05-13  4743  						NULL, mm, address & PMD_MASK,
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4744  						(address & PMD_MASK) + PMD_SIZE);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4745  			mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse      2017-08-31  4746  		}
097963959594c5 Ross Zwisler       2017-01-10  4747  		*ptlp = pmd_lock(mm, pmd);
097963959594c5 Ross Zwisler       2017-01-10  4748  		if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler       2017-01-10  4749  			*pmdpp = pmd;
097963959594c5 Ross Zwisler       2017-01-10  4750  			return 0;
097963959594c5 Ross Zwisler       2017-01-10  4751  		}
097963959594c5 Ross Zwisler       2017-01-10  4752  		spin_unlock(*ptlp);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4753  		if (range)
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4754  			mmu_notifier_invalidate_range_end(range);
097963959594c5 Ross Zwisler       2017-01-10  4755  	}
097963959594c5 Ross Zwisler       2017-01-10  4756  
097963959594c5 Ross Zwisler       2017-01-10  4757  	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4758  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4759  
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4760  	if (range) {
7269f999934b28 Jérôme Glisse      2019-05-13  4761  		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
6f4f13e8d9e27c Jérôme Glisse      2019-05-13  4762  					address & PAGE_MASK,
1ed7293ac40c5b Matthew Wilcox     2019-01-08  4763  					(address & PAGE_MASK) + PAGE_SIZE);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4764  		mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse      2017-08-31  4765  	}
f8ad0f499fad5c Johannes Weiner    2009-06-16  4766  	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4767  	if (!pte_present(*ptep))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4768  		goto unlock;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4769  	*ptepp = ptep;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4770  	return 0;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4771  unlock:
f8ad0f499fad5c Johannes Weiner    2009-06-16  4772  	pte_unmap_unlock(ptep, *ptlp);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4773  	if (range)
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4774  		mmu_notifier_invalidate_range_end(range);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4775  out:
f8ad0f499fad5c Johannes Weiner    2009-06-16  4776  	return -EINVAL;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4777  }
f8ad0f499fad5c Johannes Weiner    2009-06-16  4778  
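
The warning at line 4712 reflects the function's conditional locking:
follow_invalidate_pte() returns with *ptlp held on success but released
on every failure path, an asymmetry that sparse's context tracking cannot
prove safe. A minimal caller sketch -- the helper is hypothetical, but it
mirrors how follow_pfn() itself consumes the new follow_pte() -- shows the
pairing every caller must honor:

static int read_pfn_sketch(struct mm_struct *mm, unsigned long addr,
                           unsigned long *pfn)
{
        pte_t *ptep;
        spinlock_t *ptl;
        int ret;

        ret = follow_pte(mm, addr, &ptep, &ptl);
        if (ret)
                return ret;             /* failure: no lock is held here */

        /* success: *ptl is held, so the PTE is stable until we drop it */
        *pfn = pte_pfn(*ptep);
        pte_unmap_unlock(ptep, ptl);    /* unmap the PTE and release *ptl */
        return 0;
}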

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 28570 bytes --]


* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-09  8:14       ` Christoph Hellwig
@ 2021-02-09  9:19         ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-09  9:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams,
	Daniel Vetter

On 09/02/21 09:14, Christoph Hellwig wrote:
> On Mon, Feb 08, 2021 at 07:18:56PM +0100, Paolo Bonzini wrote:
>> Fair enough.  I would expect that pretty much everyone using follow_pfn will
>> at least want to switch to this one (as it's less bad and not impossible to
>> use correctly), but I'll squash this in:
> 
> 
> Daniel looked into them, so he may correct me, but the other follow_pfn
> users and their destiny are:
> 
>   - SGX, which is not modular and I think I just saw a patch to kill them
>   - v4l videobuf and frame vector: I think those are going away
>     entirely as they implement a rather broken pre-dmabuf P2P scheme
>   - vfio: should use MMU notifiers eventually

Yes, I'm thinking mostly of vfio, which could use follow_pte as a 
short-term fix for just the missing permission check.
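
The missing check is the pte_write() test that follow_pfn() cannot
support, because the PTE lock is already dropped by the time it returns
the pfn. A sketch of such a check built on follow_pte() -- the helper and
its error code are hypothetical, not taken from any vfio patch:

static int get_writable_pfn(struct vm_area_struct *vma, unsigned long addr,
                            unsigned long *pfn)
{
        pte_t *ptep;
        spinlock_t *ptl;
        int ret;

        ret = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
        if (ret)
                return ret;

        /* the check follow_pfn() can never make: is the PTE writable? */
        if (!pte_write(*ptep)) {
                pte_unmap_unlock(ptep, ptl);
                return -EFAULT;
        }
        *pfn = pte_pfn(*ptep);
        pte_unmap_unlock(ptep, ptl);
        return 0;
}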

There's also s390 PCI, which is also not modular.

Paolo

> Daniel, what happened to your follow_pfn series?




* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-08 18:18     ` Paolo Bonzini
@ 2021-02-09  8:14       ` Christoph Hellwig
  2021-02-09  9:19         ` Paolo Bonzini
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-02-09  8:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christoph Hellwig, linux-kernel, kvm, jgg, linux-mm,
	Andrew Morton, dan.j.williams, Daniel Vetter

On Mon, Feb 08, 2021 at 07:18:56PM +0100, Paolo Bonzini wrote:
> Fair enough.  I would expect that pretty much everyone using follow_pfn will
> at least want to switch to this one (as it's less bad and not impossible to
> use correctly), but I'll squash this in:


Daniel looked into them, so he may correct me, but the other follow_pfn
users and their destiny are:

 - SGX, which is not modular and I think I just saw a patch to kill them
 - v4l videobuf and frame vector: I think those are going away
   entirely as they implement a rather broken pre-dmabuf P2P scheme
 - vfio: should use MMU notifiers eventually

Daniel, what happened to your follow_pfn series?


* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-08 17:39   ` Christoph Hellwig
@ 2021-02-08 18:18     ` Paolo Bonzini
  2021-02-09  8:14       ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-08 18:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams

On 08/02/21 18:39, Christoph Hellwig wrote:
>> +int follow_pte(struct mm_struct *mm, unsigned long address,
>> +	       pte_t **ptepp, spinlock_t **ptlp)
>> +{
>> +	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
>> +}
>> +EXPORT_SYMBOL_GPL(follow_pte);
>
> I still don't think this is good as a general API.  Please document this
> as KVM only for now, and hopefully next merge window I'll finish an
> export variant restricting us to specific modules.

Fair enough.  I would expect that pretty much everyone using follow_pfn 
will at least want to switch to this one (as it's less bad and not 
impossible to use correctly), but I'll squash this in:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 90b527260edf..24b292fce8e5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1659,8 +1659,8 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
 int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
-			  spinlock_t **ptlp);
+			  struct mmu_notifier_range *range, pte_t **ptepp,
+			  pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
 	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index 3632f7416248..c8679b15c004 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4792,6 +4792,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
  * Only IO mappings and raw PFN mappings are allowed.  The mmap semaphore
  * should be taken for read.
  *
+ * KVM uses this function.  While it is arguably less bad than
+ * ``follow_pfn``, it is not a good general-purpose API.
+ *
  * Return: zero on success, -ve otherwise.
  */
 int follow_pte(struct mm_struct *mm, unsigned long address,
@@ -4809,6 +4812,9 @@ EXPORT_SYMBOL_GPL(follow_pte);
  *
  * Only IO mappings and raw PFN mappings are allowed.
  *
+ * This function does not allow the caller to read the permissions
+ * of the PTE.  Do not use it.
+ *
  * Return: zero and the pfn at @pfn on success, -ve otherwise.
  */
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,


(apologies in advance if Thunderbird destroys the patch).

Paolo



* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-05 10:32 ` [PATCH 1/2] mm: provide a sane PTE walking API for modules Paolo Bonzini
  2021-02-05 13:49   ` Jason Gunthorpe
@ 2021-02-08 17:39   ` Christoph Hellwig
  2021-02-08 18:18     ` Paolo Bonzini
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-02-08 17:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jgg, linux-mm, Andrew Morton, dan.j.williams

> +int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
> +			  struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
> +			  spinlock_t **ptlp);

This adds a pointless, overly long line.

> +/**
> + * follow_pte - look up PTE at a user virtual address
> + * @mm: the mm_struct of the target address space
> + * @address: user virtual address
> + * @ptepp: location to store found PTE
> + * @ptlp: location to store the lock for the PTE
> + *
> + * On a successful return, the pointer to the PTE is stored in @ptepp;
> + * the corresponding lock is taken and its location is stored in @ptlp.
> + * The contents of the PTE are only stable until @ptlp is released;
> + * any further use, if any, must be protected against invalidation
> + * with MMU notifiers.
> + *
> + * Only IO mappings and raw PFN mappings are allowed.  The mmap semaphore
> + * should be taken for read.
> + *
> + * Return: zero on success, -ve otherwise.
> + */
> +int follow_pte(struct mm_struct *mm, unsigned long address,
> +	       pte_t **ptepp, spinlock_t **ptlp)
> +{
> +	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
> +}
> +EXPORT_SYMBOL_GPL(follow_pte);

I still don't think this is good as a general API.  Please document this
as KVM only for now, and hopefully next merge window I'll finish an
export variant restricting us to specific modules.
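
One existing mechanism along these lines is symbol namespaces; whether
that is the export variant meant here is an assumption. A sketch with a
made-up namespace name:

/* Exporter side (mm/memory.c): only modules that explicitly import
 * the namespace can link against the symbol. */
EXPORT_SYMBOL_NS_GPL(follow_pte, MM_FOLLOW_PTE);

/* Importer side, e.g. in a KVM module source file: */
MODULE_IMPORT_NS(MM_FOLLOW_PTE);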


* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-05 10:32 ` [PATCH 1/2] mm: provide a sane PTE walking API for modules Paolo Bonzini
@ 2021-02-05 13:49   ` Jason Gunthorpe
  2021-02-08 17:39   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2021-02-05 13:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, linux-mm, Andrew Morton, dan.j.williams

On Fri, Feb 05, 2021 at 05:32:58AM -0500, Paolo Bonzini wrote:
> Currently, the follow_pfn function is exported for modules but
> follow_pte is not.  However, follow_pfn is very easy to misuse,
> because it does not provide protections (so most of its callers
> assume the page is writable!) and because it returns after having
> already unlocked the page table lock.
> 
> Provide instead a simplified version of follow_pte that does
> not have the pmdpp and range arguments.  The older version
> survives as follow_invalidate_pte() for use by fs/dax.c.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/s390/pci/pci_mmio.c |  2 +-
>  fs/dax.c                 |  5 +++--
>  include/linux/mm.h       |  6 ++++--
>  mm/memory.c              | 35 ++++++++++++++++++++++++++++++-----
>  4 files changed, 38 insertions(+), 10 deletions(-)

Looks good to me, thanks

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason


* Re: [PATCH 1/2] mm: provide a sane PTE walking API for modules
@ 2021-02-05 13:09 kernel test robot
  0 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2021-02-05 13:09 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 9028 bytes --]

CC: kbuild-all@lists.01.org
In-Reply-To: <20210205103259.42866-2-pbonzini@redhat.com>
References: <20210205103259.42866-2-pbonzini@redhat.com>
TO: Paolo Bonzini <pbonzini@redhat.com>
TO: linux-kernel@vger.kernel.org
TO: kvm@vger.kernel.org
CC: jgg@ziepe.ca
CC: linux-mm@kvack.org
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux Memory Management List <linux-mm@kvack.org>
CC: dan.j.williams@intel.com

Hi Paolo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v5.11-rc6]
[cannot apply to kvm/linux-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
:::::: branch date: 3 hours ago
:::::: commit date: 3 hours ago
config: s390-randconfig-s031-20210205 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-215-g0fb77bb6-dirty
        # https://github.com/0day-ci/linux/commit/ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Paolo-Bonzini/KVM-do-not-assume-PTE-is-writable-after-follow_pfn/20210205-183434
        git checkout ad5c7f2ccaf2872e1f56278a9e1a0c839c5cf518
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>


"sparse warnings: (new ones prefixed by >>)"
   mm/memory.c:5197:22: sparse: sparse: cast removes address space '__user' of expression
   mm/memory.c:943:17: sparse: sparse: context imbalance in 'copy_pte_range' - different lock contexts for basic block
   mm/memory.c:1623:16: sparse: sparse: context imbalance in '__get_locked_pte' - different lock contexts for basic block
   mm/memory.c:1672:9: sparse: sparse: context imbalance in 'insert_page' - different lock contexts for basic block
   mm/memory.c:2174:17: sparse: sparse: context imbalance in 'remap_pte_range' - different lock contexts for basic block
   mm/memory.c:2419:17: sparse: sparse: context imbalance in 'apply_to_pte_range' - unexpected unlock
   mm/memory.c:2675:17: sparse: sparse: context imbalance in 'wp_page_copy' - unexpected unlock
   mm/memory.c:3022:17: sparse: sparse: context imbalance in 'wp_pfn_shared' - unexpected unlock
   mm/memory.c:3085:19: sparse: sparse: context imbalance in 'do_wp_page' - different lock contexts for basic block
   mm/memory.c:3657:19: sparse: sparse: context imbalance in 'pte_alloc_one_map' - different lock contexts for basic block
   mm/memory.c:3884:17: sparse: sparse: context imbalance in 'finish_fault' - unexpected unlock
   mm/memory.c:3993:9: sparse: sparse: context imbalance in 'do_fault_around' - unexpected unlock
>> mm/memory.c:4712:5: sparse: sparse: context imbalance in 'follow_invalidate_pte' - different lock contexts for basic block
   mm/memory.c:4827:9: sparse: sparse: context imbalance in 'follow_pfn' - unexpected unlock

vim +/follow_invalidate_pte +4712 mm/memory.c

^1da177e4c3f41 Linus Torvalds     2005-04-16  4711  
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05 @4712  int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05  4713  			  struct mmu_notifier_range *range, pte_t **ptepp,
ad5c7f2ccaf287 Paolo Bonzini      2021-02-05  4714  			  pmd_t **pmdpp, spinlock_t **ptlp)
f8ad0f499fad5c Johannes Weiner    2009-06-16  4715  {
f8ad0f499fad5c Johannes Weiner    2009-06-16  4716  	pgd_t *pgd;
c2febafc67734a Kirill A. Shutemov 2017-03-09  4717  	p4d_t *p4d;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4718  	pud_t *pud;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4719  	pmd_t *pmd;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4720  	pte_t *ptep;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4721  
f8ad0f499fad5c Johannes Weiner    2009-06-16  4722  	pgd = pgd_offset(mm, address);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4723  	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4724  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4725  
c2febafc67734a Kirill A. Shutemov 2017-03-09  4726  	p4d = p4d_offset(pgd, address);
c2febafc67734a Kirill A. Shutemov 2017-03-09  4727  	if (p4d_none(*p4d) || unlikely(p4d_bad(*p4d)))
c2febafc67734a Kirill A. Shutemov 2017-03-09  4728  		goto out;
c2febafc67734a Kirill A. Shutemov 2017-03-09  4729  
c2febafc67734a Kirill A. Shutemov 2017-03-09  4730  	pud = pud_offset(p4d, address);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4731  	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4732  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4733  
f8ad0f499fad5c Johannes Weiner    2009-06-16  4734  	pmd = pmd_offset(pud, address);
f66055ab6fb973 Andrea Arcangeli   2011-01-13  4735  	VM_BUG_ON(pmd_trans_huge(*pmd));
097963959594c5 Ross Zwisler       2017-01-10  4736  
097963959594c5 Ross Zwisler       2017-01-10  4737  	if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler       2017-01-10  4738  		if (!pmdpp)
f8ad0f499fad5c Johannes Weiner    2009-06-16  4739  			goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4740  
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4741  		if (range) {
7269f999934b28 Jérôme Glisse      2019-05-13  4742  			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
6f4f13e8d9e27c Jérôme Glisse      2019-05-13  4743  						NULL, mm, address & PMD_MASK,
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4744  						(address & PMD_MASK) + PMD_SIZE);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4745  			mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse      2017-08-31  4746  		}
097963959594c5 Ross Zwisler       2017-01-10  4747  		*ptlp = pmd_lock(mm, pmd);
097963959594c5 Ross Zwisler       2017-01-10  4748  		if (pmd_huge(*pmd)) {
097963959594c5 Ross Zwisler       2017-01-10  4749  			*pmdpp = pmd;
097963959594c5 Ross Zwisler       2017-01-10  4750  			return 0;
097963959594c5 Ross Zwisler       2017-01-10  4751  		}
097963959594c5 Ross Zwisler       2017-01-10  4752  		spin_unlock(*ptlp);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4753  		if (range)
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4754  			mmu_notifier_invalidate_range_end(range);
097963959594c5 Ross Zwisler       2017-01-10  4755  	}
097963959594c5 Ross Zwisler       2017-01-10  4756  
097963959594c5 Ross Zwisler       2017-01-10  4757  	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4758  		goto out;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4759  
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4760  	if (range) {
7269f999934b28 Jérôme Glisse      2019-05-13  4761  		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
6f4f13e8d9e27c Jérôme Glisse      2019-05-13  4762  					address & PAGE_MASK,
1ed7293ac40c5b Matthew Wilcox     2019-01-08  4763  					(address & PAGE_MASK) + PAGE_SIZE);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4764  		mmu_notifier_invalidate_range_start(range);
a4d1a885251382 Jérôme Glisse      2017-08-31  4765  	}
f8ad0f499fad5c Johannes Weiner    2009-06-16  4766  	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4767  	if (!pte_present(*ptep))
f8ad0f499fad5c Johannes Weiner    2009-06-16  4768  		goto unlock;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4769  	*ptepp = ptep;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4770  	return 0;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4771  unlock:
f8ad0f499fad5c Johannes Weiner    2009-06-16  4772  	pte_unmap_unlock(ptep, *ptlp);
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4773  	if (range)
ac46d4f3c43241 Jérôme Glisse      2018-12-28  4774  		mmu_notifier_invalidate_range_end(range);
f8ad0f499fad5c Johannes Weiner    2009-06-16  4775  out:
f8ad0f499fad5c Johannes Weiner    2009-06-16  4776  	return -EINVAL;
f8ad0f499fad5c Johannes Weiner    2009-06-16  4777  }
f8ad0f499fad5c Johannes Weiner    2009-06-16  4778  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 12027 bytes --]


* [PATCH 1/2] mm: provide a sane PTE walking API for modules
  2021-02-05 10:32 [PATCH 0/2] KVM: do not assume PTE is writable after follow_pfn Paolo Bonzini
@ 2021-02-05 10:32 ` Paolo Bonzini
  2021-02-05 13:49   ` Jason Gunthorpe
  2021-02-08 17:39   ` Christoph Hellwig
  0 siblings, 2 replies; 8+ messages in thread
From: Paolo Bonzini @ 2021-02-05 10:32 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jgg, linux-mm, Andrew Morton, dan.j.williams

Currently, the follow_pfn function is exported for modules but
follow_pte is not.  However, follow_pfn is very easy to misuse,
because it does not provide protections (so most of its callers
assume the page is writable!) and because it returns after having
already unlocked the page table lock.

Provide instead a simplified version of follow_pte that does
not have the pmdpp and range arguments.  The older version
survives as follow_invalidate_pte() for use by fs/dax.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/s390/pci/pci_mmio.c |  2 +-
 fs/dax.c                 |  5 +++--
 include/linux/mm.h       |  6 ++++--
 mm/memory.c              | 35 ++++++++++++++++++++++++++++++-----
 4 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
index 18f2d10c3176..5922dce94bfd 100644
--- a/arch/s390/pci/pci_mmio.c
+++ b/arch/s390/pci/pci_mmio.c
@@ -170,7 +170,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
 	if (!(vma->vm_flags & VM_WRITE))
 		goto out_unlock_mmap;
 
-	ret = follow_pte(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl);
+	ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
 	if (ret)
 		goto out_unlock_mmap;
 
@@ -311,7 +311,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
 	if (!(vma->vm_flags & VM_WRITE))
 		goto out_unlock_mmap;
 
-	ret = follow_pte(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl);
+	ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
 	if (ret)
 		goto out_unlock_mmap;
 
diff --git a/fs/dax.c b/fs/dax.c
index 26d5dcd2d69e..b14874c7b983 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -810,11 +810,12 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 		address = pgoff_address(index, vma);
 
 		/*
-		 * Note because we provide range to follow_pte it will call
+		 * follow_invalidate_pte() will use the range to call
 		 * mmu_notifier_invalidate_range_start() on our behalf before
 		 * taking any lock.
 		 */
-		if (follow_pte(vma->vm_mm, address, &range, &ptep, &pmdp, &ptl))
+		if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
+					  &pmdp, &ptl))
 			continue;
 
 		/*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8cd6ae..90b527260edf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1658,9 +1658,11 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+			  struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
+			  spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
-		struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
-		spinlock_t **ptlp);
+	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn);
 int follow_phys(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..b247d296931c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4709,9 +4709,9 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-int follow_pte(struct mm_struct *mm, unsigned long address,
-	       struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
-	       spinlock_t **ptlp)
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+			  struct mmu_notifier_range *range, pte_t **ptepp,
+			  pmd_t **pmdpp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4776,6 +4776,31 @@ int follow_pte(struct mm_struct *mm, unsigned long address,
 	return -EINVAL;
 }
 
+/**
+ * follow_pte - look up PTE at a user virtual address
+ * @mm: the mm_struct of the target address space
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed.  The mmap semaphore
+ * should be taken for read.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+	       pte_t **ptepp, spinlock_t **ptlp)
+{
+	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
+}
+EXPORT_SYMBOL_GPL(follow_pte);
+
 /**
  * follow_pfn - look up PFN at a user virtual address
  * @vma: memory mapping
@@ -4796,7 +4821,7 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		return ret;
 
-	ret = follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl);
+	ret = follow_pte(vma->vm_mm, address, &ptep, &ptl);
 	if (ret)
 		return ret;
 	*pfn = pte_pfn(*ptep);
@@ -4817,7 +4842,7 @@ int follow_phys(struct vm_area_struct *vma,
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		goto out;
 
-	if (follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl))
+	if (follow_pte(vma->vm_mm, address, &ptep, &ptl))
 		goto out;
 	pte = *ptep;
 
-- 
2.26.2
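
For fs/dax.c, the one caller that keeps the long-form API, the contract
is: when @range is non-NULL, follow_invalidate_pte() calls
mmu_notifier_invalidate_range_start() before taking the page table lock,
and on success the caller must both drop the lock and call
mmu_notifier_invalidate_range_end() itself. A sketch modeled on the
dax_entry_mkclean() hunk above (the helper name is hypothetical):

static void mkclean_one_sketch(struct vm_area_struct *vma,
                               unsigned long address)
{
        struct mmu_notifier_range range;
        pte_t *ptep = NULL;
        pmd_t *pmdp = NULL;
        spinlock_t *ptl;

        if (follow_invalidate_pte(vma->vm_mm, address, &range,
                                  &ptep, &pmdp, &ptl))
                return;

        if (pmdp) {
                /* huge mapping: the pmd lock is held and *pmdp is valid */
                spin_unlock(ptl);
        } else {
                /* small mapping: the pte lock is held and *ptep is valid */
                pte_unmap_unlock(ptep, ptl);
        }
        mmu_notifier_invalidate_range_end(&range);
}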



