On Tue, Sep 15, 2020 at 10:50:40AM -0400, Peter Xu wrote:
> Hi, all,
>
> I prepared another version of the FOLL_PIN enforced cow patch attached, just in
> case it would still be anything close to useful (though now I highly doubt it
> considering below...).  I took care of !USERFAULTFD as suggested by Leon, and
> also the fast gup path.

Now with the patch attached (for real..).

>
> However...
>
> On Mon, Sep 14, 2020 at 08:28:51PM -0300, Jason Gunthorpe wrote:
> > Yes, this stuff does pin_user_pages_fast() and MADV_DONTFORK
> > together. It sets FOLL_FORCE and FOLL_WRITE to get an exclusive copy
> > of the page and MADV_DONTFORK was needed to ensure that a future fork
> > doesn't establish a COW that would break the DMA by moving the
> > physical page over to the fork. DMA should stay with the process that
> > called pin_user_pages_fast() (Is MADV_DONTFORK still needed with
> > recent years work to GUP/etc? It is a pretty terrible ancient thing)
>
> ... Now I'm more confused on what has happened.
>
> If we're with FORCE|WRITE, iiuc it should guarantee that the page will trigger
> COW during gup even if it is shared, so no problem on the gup side.  Then I'm
> quite confused on why the write bit is not set when cow triggered.
>
> E.g., in wp_page_copy(), if I'm not wrong, the write bit is only controlled by
> (besides the fix patch, though I believe the rdma test should have nothing to
> do with uffd-wp after all so it should be the same anyways):
>
>         entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>
> It means, as long as the rdma region has VM_WRITE set (which I think of no
> reason on why it shouldn't...), then it should have the write bit in the COWed
> page entry.  If so, the page should be stable and I don't understand why
> another COW could even trigger and how the code path in the "trial cow" patch
> is triggered.
>
> Or, the VMA is without VM_WRITE due to some reason?  Sorry I probably know
> nothing about RDMA, more information on that side might help too.  E.g., is the
> hardware going to walk the software process page table too when doing RDMA (or
> is IOMMU page table used, or none)?
>
> Thanks,
>
> --
> Peter Xu

--
Peter Xu
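
For reference, the maybe_mkwrite() reasoning quoted above can be sketched as a
small user-space toy model. This is only an illustration, not kernel code: the
toy_* types, helpers and the TOY_VM_WRITE constant are hypothetical stand-ins,
and the model assumes maybe_mkwrite() does nothing more than set the write bit
when the VMA has VM_WRITE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; NOT the real definitions. */
#define TOY_VM_WRITE 0x2UL

struct toy_vma { unsigned long vm_flags; };
struct toy_pte { bool dirty; bool write; };

/* Toy equivalent of pte_mkdirty(): mark the entry dirty. */
static struct toy_pte toy_pte_mkdirty(struct toy_pte pte)
{
	pte.dirty = true;
	return pte;
}

/* Toy equivalent of pte_mkwrite(): mark the entry writable. */
static struct toy_pte toy_pte_mkwrite(struct toy_pte pte)
{
	pte.write = true;
	return pte;
}

/*
 * Assumed shape of maybe_mkwrite(): the entry becomes writable only
 * when the VMA itself carries the write permission.
 */
static struct toy_pte toy_maybe_mkwrite(struct toy_pte pte,
					const struct toy_vma *vma)
{
	if (vma->vm_flags & TOY_VM_WRITE)
		pte = toy_pte_mkwrite(pte);
	return pte;
}

/* Builds the entry the way the quoted line in wp_page_copy() does. */
static struct toy_pte toy_cow_entry(const struct toy_vma *vma)
{
	struct toy_pte entry = { false, false };

	return toy_maybe_mkwrite(toy_pte_mkdirty(entry), vma);
}
```

Under these assumptions, toy_cow_entry() on a VMA with TOY_VM_WRITE gives an
entry that is both dirty and writable, while a VMA without it gives a dirty but
read-only entry, which is the "VMA is without VM_WRITE" case asked about above.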