From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Claudio Imbrenda <imbrenda@linux.ibm.com>,
	linux-next@vger.kernel.org, akpm@linux-foundation.org
Cc: david@redhat.com, aarcange@redhat.com, linux-mm@kvack.org,
	frankja@linux.ibm.com, sfr@canb.auug.org.au, jhubbard@nvidia.com,
	linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
	Will Deacon <will@kernel.org>
Subject: Re: [RFC v1 2/2] mm/gup/writeback: add callbacks for inaccessible pages
Date: Fri, 28 Feb 2020 17:08:19 +0100
Message-ID: <2e3bf1a2-b672-68e0-97b6-42f08133e077@de.ibm.com>
In-Reply-To: <20200228154322.329228-4-imbrenda@linux.ibm.com>

Andrew,

While patch 1 is a fixup for the FOLL_PIN work in your patch queue,
I would really love to see this patch in 5.7. The kvm/s390 code that
exploits it is in linux-next and is also scheduled for 5.7.

Christian

On 28.02.20 16:43, Claudio Imbrenda wrote:
> With the introduction of protected KVM guests on s390 there is now a
> concept of inaccessible pages. These pages need to be made accessible
> before the host can access them.
> 
> While CPU accesses will trigger a fault that can be resolved, I/O
> accesses will just fail.  We need to add a callback into architecture
> code for places that will do I/O, namely when writeback is started or
> when a page reference is taken.
> 
> This is not only to enable paging, file backing, etc.; it is also
> necessary to protect the host against a malicious user space. For
> example, a bad QEMU could simply start direct I/O on such protected
> memory.  We do not want userspace to be able to trigger I/O errors, and
> thus the logic is "whenever somebody accesses that page (gup) or
> does I/O, make sure that this page can be accessed". When the guest
> tries to access that page, we will wait in the page fault handler for
> writeback to have finished and for the page_ref to be the expected
> value.
> 
> On s390x the function is not supposed to fail, so it is OK to use a
> WARN_ON on failure. If we ever need more fine-grained handling,
> we can tackle this when we know the details.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Acked-by: Will Deacon <will@kernel.org>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  include/linux/gfp.h |  6 ++++++
>  mm/gup.c            | 19 ++++++++++++++++---
>  mm/page-writeback.c |  5 +++++
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index e5b817cb86e7..be2754841369 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -485,6 +485,12 @@ static inline void arch_free_page(struct page *page, int order) { }
>  #ifndef HAVE_ARCH_ALLOC_PAGE
>  static inline void arch_alloc_page(struct page *page, int order) { }
>  #endif
> +#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
> +static inline int arch_make_page_accessible(struct page *page)
> +{
> +	return 0;
> +}
> +#endif
>  
>  struct page *
>  __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
> diff --git a/mm/gup.c b/mm/gup.c
> index 0b9a806898f3..86fff6e4e4f3 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -391,6 +391,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  	struct page *page;
>  	spinlock_t *ptl;
>  	pte_t *ptep, pte;
> +	int ret;
>  
>  	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
>  	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> @@ -449,8 +450,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  		if (is_zero_pfn(pte_pfn(pte))) {
>  			page = pte_page(pte);
>  		} else {
> -			int ret;
> -
>  			ret = follow_pfn_pte(vma, address, ptep, flags);
>  			page = ERR_PTR(ret);
>  			goto out;
> @@ -458,7 +457,6 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  	}
>  
>  	if (flags & FOLL_SPLIT && PageTransCompound(page)) {
> -		int ret;
>  		get_page(page);
>  		pte_unmap_unlock(ptep, ptl);
>  		lock_page(page);
> @@ -475,6 +473,14 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  		page = ERR_PTR(-ENOMEM);
>  		goto out;
>  	}
> +	if (flags & FOLL_PIN) {
> +		ret = arch_make_page_accessible(page);
> +		if (ret) {
> +			unpin_user_page(page);
> +			page = ERR_PTR(ret);
> +			goto out;
> +		}
> +	}
>  	if (flags & FOLL_TOUCH) {
>  		if ((flags & FOLL_WRITE) &&
>  		    !pte_dirty(pte) && !PageDirty(page))
> @@ -2143,6 +2149,13 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>  
>  		VM_BUG_ON_PAGE(compound_head(page) != head, page);
>  
> +		if (flags & FOLL_PIN) {
> +			ret = arch_make_page_accessible(page);
> +			if (ret) {
> +				unpin_user_page(page);
> +				goto pte_unmap;
> +			}
> +		}
>  		SetPageReferenced(page);
>  		pages[*nr] = page;
>  		(*nr)++;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index ab5a3cee8ad3..8384be5a2758 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2807,6 +2807,11 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
>  		inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
>  	}
>  	unlock_page_memcg(page);
> +	/*
> +	 * If writeback has been triggered on a page that cannot be made
> +	 * accessible, it is too late.
> +	 */
> +	WARN_ON(arch_make_page_accessible(page));
>  	return ret;
>  
>  }
> 
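
For readers following the quoted patch, here is a rough userspace sketch
of the pattern it uses: a generic no-op hook that an architecture can
override, called after the pin is taken, with the pin dropped again if
the hook fails. This is not kernel code. The struct page layout, the
"stuck" test knob, the helper pin_page_for_io() and the -EFAULT error
value are all hypothetical and exist only for illustration; only the
names arch_make_page_accessible and HAVE_ARCH_MAKE_PAGE_ACCESSIBLE come
from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page (not the real layout). */
struct page {
	int pin_count;		/* models the reference taken for FOLL_PIN */
	bool accessible;	/* false while the page is still "protected" */
	bool stuck;		/* illustration only: hook will fail for this page */
};

/*
 * An architecture that has inaccessible pages defines the macro and
 * supplies its own hook. The body below is an assumption made for this
 * sketch; the real s390 implementation is not part of the quoted patch.
 */
#define HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static int arch_make_page_accessible(struct page *page)
{
	if (page->stuck)
		return -EFAULT;	/* assumed error value */
	page->accessible = true;
	return 0;
}

/* Generic fallback with the same shape as in the patch: always succeeds. */
#ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
static inline int arch_make_page_accessible(struct page *page)
{
	return 0;
}
#endif

/*
 * Hypothetical helper modelling the pin path: take the pin, ask the
 * architecture to make the page accessible, and undo the pin on failure
 * so the caller never sees a pinned but inaccessible page.
 */
static struct page *pin_page_for_io(struct page *page, int *err)
{
	int ret;

	page->pin_count++;
	ret = arch_make_page_accessible(page);
	if (ret) {
		page->pin_count--;
		*err = ret;
		return NULL;
	}
	*err = 0;
	return page;
}
```

With the #define removed, the fallback is compiled instead and every
page is treated as already accessible, which matches the behaviour the
patch gives to architectures that do not opt in.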


