From: Alex Williamson <alex.williamson@redhat.com>
To: yongji xie <xyjxie@linux.vnet.ibm.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Cc: aik@ozlabs.ru, benh@kernel.crashing.org, paulus@samba.org,
	mpe@ellerman.id.au, warrier@linux.vnet.ibm.com,
	zhong@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com
Subject: Re: [RFC PATCH 2/3] vfio-pci: Allow to mmap sub-page MMIO BARs if all MMIO BARs are page aligned
Date: Thu, 17 Dec 2015 14:46:44 -0700	[thread overview]
Message-ID: <1450388804.2674.158.camel@redhat.com> (raw)
In-Reply-To: <56728DC8.20803@linux.vnet.ibm.com>

On Thu, 2015-12-17 at 18:26 +0800, yongji xie wrote:
> 
> On 2015/12/17 4:04, Alex Williamson wrote:
> > On Fri, 2015-12-11 at 16:53 +0800, Yongji Xie wrote:
> > > The current vfio-pci implementation disallows mmap of sub-page
> > > (size < PAGE_SIZE) MMIO BARs because such a BAR's MMIO page may
> > > be shared with other BARs.
> > > 
> > > But we should allow mmap of these sub-page MMIO BARs when all
> > > MMIO BARs are page aligned, since a BAR's MMIO page then cannot
> > > be shared with any other BAR.
> > > 
> > > This patch adds support for this case and also adds a
> > > VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED flag to notify userspace that
> > > the platform aligns all MMIO BARs to page boundaries.
> > > 
> > > Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> > > ---
> > >   drivers/vfio/pci/vfio_pci.c         |   10 +++++++++-
> > >   drivers/vfio/pci/vfio_pci_private.h |    5 +++++
> > >   include/uapi/linux/vfio.h           |    2 ++
> > >   3 files changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > > index 32b88bd..dbcad99 100644
> > > --- a/drivers/vfio/pci/vfio_pci.c
> > > +++ b/drivers/vfio/pci/vfio_pci.c
> > > @@ -443,6 +443,9 @@ static long vfio_pci_ioctl(void *device_data,
> > >  		if (vdev->reset_works)
> > >  			info.flags |= VFIO_DEVICE_FLAGS_RESET;
> > >  
> > > +		if (vfio_pci_bar_page_aligned())
> > > +			info.flags |= VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED;
> > > +
> > >  		info.num_regions = VFIO_PCI_NUM_REGIONS;
> > >  		info.num_irqs = VFIO_PCI_NUM_IRQS;
> > >  
> > > @@ -479,7 +482,8 @@ static long vfio_pci_ioctl(void *device_data,
> > >  				     VFIO_REGION_INFO_FLAG_WRITE;
> > >  			if (IS_ENABLED(CONFIG_VFIO_PCI_MMAP) &&
> > >  			    pci_resource_flags(pdev, info.index) &
> > > -			    IORESOURCE_MEM && info.size >= PAGE_SIZE)
> > > +			    IORESOURCE_MEM && (info.size >= PAGE_SIZE ||
> > > +			    vfio_pci_bar_page_aligned()))
> > >  				info.flags |= VFIO_REGION_INFO_FLAG_MMAP;
> > >  			break;
> > >  		case VFIO_PCI_ROM_REGION_INDEX:
> > > @@ -855,6 +859,10 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> > >  		return -EINVAL;
> > >  
> > >  	phys_len = pci_resource_len(pdev, index);
> > > +
> > > +	if (vfio_pci_bar_page_aligned())
> > > +		phys_len = PAGE_ALIGN(phys_len);
> > > +
> > >  	req_len = vma->vm_end - vma->vm_start;
> > >  	pgoff = vma->vm_pgoff &
> > >  		((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
> > > diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> > > index 0e7394f..319352a 100644
> > > --- a/drivers/vfio/pci/vfio_pci_private.h
> > > +++ b/drivers/vfio/pci/vfio_pci_private.h
> > > @@ -69,6 +69,11 @@ struct vfio_pci_device {
> > >  #define is_irq_none(vdev) (!(is_intx(vdev) || is_msi(vdev) || is_msix(vdev)))
> > >  #define irq_is(vdev, type) (vdev->irq_type == type)
> > >  
> > > +static inline bool vfio_pci_bar_page_aligned(void)
> > > +{
> > > +	return IS_ENABLED(CONFIG_PPC64);
> > > +}
> > I really dislike this.  This is a problem for any architecture that
> > runs on larger pages, and even an annoyance on 4k hosts.  Why are we
> > only solving it for PPC64?
> Yes, I know it's a problem for other architectures. But I'm not sure
> whether other archs would prefer to enforce the alignment of all BARs
> to be at least PAGE_SIZE, which would waste some address space.
> 
> So I just propose a prototype and add PPC64 support here. Other archs
> could decide for themselves whether to use it.
> > Can't we do something similar in the core PCI code and detect it?
> So you mean we can do it like this:
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index d390fc1..f46c04d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -320,6 +320,11 @@ static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
>          return resource_alignment(res);
>  }
> 
> +static inline bool pci_bar_page_aligned(void)
> +{
> +       return IS_ENABLED(CONFIG_PPC64);
> +}
> +
>  void pci_enable_acs(struct pci_dev *dev);
> 
>  struct pci_dev_reset_methods {
> 
> or add a config option to indicate that PCI MMIO BARs should be page
> aligned?

Yes, I'm thinking of a boot commandline option, maybe one that PPC64
can default to enabled if it chooses to.  The problem is not unique to
PPC64 and the solution should not be unique either.  I don't want to
need to revisit this for ARM, which we know is going to be similarly
afflicted.
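
To be concrete, here's a rough, untested sketch of the sort of thing I
have in mind -- the parameter name and helper are invented purely for
illustration, and an arch could still pick its own default:

	/* Sketch only: illustrative, not an existing kernel parameter. */
	static bool pci_page_aligned_bars = IS_ENABLED(CONFIG_PPC64);

	static int __init pci_bar_page_align_setup(char *str)
	{
		pci_page_aligned_bars = true;
		return 1;	/* option handled */
	}
	__setup("pci_bar_page_align", pci_bar_page_align_setup);

	/* Runtime replacement for the compile-time test in this patch. */
	bool pci_bar_page_aligned(void)
	{
		return pci_page_aligned_bars;
	}

Something in that direction keeps the policy in core PCI code, lets any
architecture opt in with its own default, and gives users a runtime
knob rather than a compile-time one.  The existing
pci=resource_alignment= option is related precedent, though it takes
per-device alignments rather than setting a global policy.  Thanks,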

Alex
