On 7/13/17 1:16 PM, Jerome Glisse wrote:
> ...
>

Hi Jerome,

I have hit another kind of hang. Briefly, if a not-yet-allocated page
faults on the CPU during migration to device memory, any subsequent
migration of that page will fail. This can happen when a CPU page fault
occurs immediately after migrate_vma() starts unmapping the pages to
migrate.

Please find attached a reproducer based on the sample driver. In the
hmm_test() function, an HMM_DMIRROR_MIGRATE request is issued from a
separate thread for not-yet-allocated pages (coming from malloc). At the
same time, an HMM_DMIRROR_READ request is made for the same pages. This
results in a sporadic app-side hang, because a random number of pages
never migrate to device memory.

Note that if the pages are touched (initialized with data) beforehand,
everything works as expected: all HMM_DMIRROR_READ and
HMM_DMIRROR_MIGRATE requests eventually succeed. See the comments in the
hmm_test() function.

Thanks!

--
Evgeny Baskakov
NVIDIA