IOMMU Archive on lore.kernel.org
 help / color / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Lu Baolu <baolu.lu@linux.intel.com>
Cc: kevin.tian@intel.com, Yi Sun <yi.y.sun@linux.intel.com>,
	ashok.raj@intel.com, kvm@vger.kernel.org,
	sanjay.k.kumar@intel.com, iommu@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	yi.y.sun@intel.com
Subject: Re: [RFC PATCH 2/4] iommu/vt-d: Add first level page table interfaces
Date: Sun, 29 Sep 2019 13:25:32 +0800
Message-ID: <20190929052532.GA12953@xz-x1> (raw)
In-Reply-To: <66823e27-aa33-5968-b5fd-e5221fb1fffe@linux.intel.com>

On Sat, Sep 28, 2019 at 04:23:16PM +0800, Lu Baolu wrote:
> Hi Peter,
> 
> On 9/27/19 1:34 PM, Peter Xu wrote:
> > Hi, Baolu,
> > 
> > On Fri, Sep 27, 2019 at 10:27:24AM +0800, Lu Baolu wrote:
> > > > > > > +	spin_lock(&(domain)->page_table_lock);				\
> > > > > > 
> > > > > > Is this intended to lock here instead of taking the lock during the
> > > > > > whole page table walk?  Is it safe?
> > > > > > 
> > > > > > Taking the example where nm==PTE: when we reach here how do we
> > > > > > guarantee that the PMD page that has this PTE is still valid?
> > > > > 
> > > > > We will always keep the non-leaf pages in the table,
> > > > 
> > > > I see.  Though, could I ask why?  It seems to me that the existing 2nd
> > > > level page table does not keep these when unmap, and it's not even use
> > > > locking at all by leveraging cmpxchg()?
> > > 
> > > I still need some time to understand how cmpxchg() solves the race issue
> > > when reclaims pages. For example.
> > > 
> > > Thread A				Thread B
> > > -A1: check all PTE's empty		-B1: up-level PDE valid
> > > -A2: clear the up-level PDE
> > > -A3: reclaim the page			-B2: populate the PTEs
> > > 
> > > Both (A1,A2) and (B1,B2) should be atomic. Otherwise, race could happen.
> > 
> > I'm not sure of this, but IMHO it is similarly because we need to
> > allocate the iova ranges from iova allocator first, so thread A (who's
> > going to unmap pages) and thread B (who's going to map new pages)
> > should never have collapsed regions if happening concurrently.  I'm
> 
> Although they don't collapse, they might share a same pmd entry. If A
> cleared the pmd entry and B goes ahead with populating the pte's. It
> will crash.

My understanding is that if A was not owning all the pages on that PMD
entry then it will never free the page that was backing that PMD
entry.  Please refer to the code in dma_pte_clear_level() where it
has:

        /* If range covers entire pagetable, free it */
        if (start_pfn <= level_pfn &&
                last_pfn >= level_pfn + level_size(level) - 1) {
                ...
        } else {
                ...
        }

Note that when going into the else block, the PMD won't be freed but
only the PTEs that upon the PMD will be cleared.

In the case you mentioned above, IMHO it should go into that else
block.  Say, thread A must not contain the whole range of that PMD
otherwise thread B won't get allocated with pages within that range
covered by the same PMD.

Thanks,

-- 
Peter Xu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply index

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-23 12:24 [RFC PATCH 0/4] Use 1st-level for DMA remapping in guest Lu Baolu
2019-09-23 12:24 ` [RFC PATCH 1/4] iommu/vt-d: Move domain_flush_cache helper into header Lu Baolu
2019-09-23 12:24 ` [RFC PATCH 2/4] iommu/vt-d: Add first level page table interfaces Lu Baolu
2019-09-23 20:31   ` Raj, Ashok
2019-09-24  1:38     ` Lu Baolu
2019-09-25  4:30       ` Peter Xu
2019-09-25  4:38         ` Tian, Kevin
2019-09-25  5:24           ` Peter Xu
2019-09-25  6:52             ` Lu Baolu
2019-09-25  7:32               ` Tian, Kevin
2019-09-25  8:35                 ` Peter Xu
2019-09-26  1:42                 ` Lu Baolu
2019-09-25  5:21   ` Peter Xu
2019-09-26  2:35     ` Lu Baolu
2019-09-26  3:49       ` Peter Xu
2019-09-27  2:27         ` Lu Baolu
2019-09-27  5:34           ` Peter Xu
2019-09-28  8:23             ` Lu Baolu
2019-09-29  5:25               ` Peter Xu [this message]
2019-10-08  2:20                 ` Lu Baolu
2019-09-23 12:24 ` [RFC PATCH 3/4] iommu/vt-d: Map/unmap domain with mmmap/mmunmap Lu Baolu
2019-09-25  5:00   ` Tian, Kevin
2019-09-25  7:06     ` Lu Baolu
2019-09-23 12:24 ` [RFC PATCH 4/4] iommu/vt-d: Identify domains using first level page table Lu Baolu
2019-09-25  6:50   ` Peter Xu
2019-09-25  7:35     ` Tian, Kevin
2019-09-23 19:27 ` [RFC PATCH 0/4] Use 1st-level for DMA remapping in guest Jacob Pan
2019-09-23 20:25   ` Raj, Ashok
2019-09-24  4:40     ` Lu Baolu
2019-09-24  7:00     ` Tian, Kevin
2019-09-25  2:48       ` Lu Baolu
2019-09-25  6:56         ` Peter Xu
2019-09-25  7:21           ` Tian, Kevin
2019-09-25  7:45             ` Peter Xu
2019-09-25  8:02               ` Tian, Kevin
2019-09-25  8:52                 ` Peter Xu
2019-09-26  1:37                   ` Lu Baolu
2019-09-24  4:27   ` Lu Baolu

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190929052532.GA12953@xz-x1 \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sanjay.k.kumar@intel.com \
    --cc=yi.y.sun@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git