On Wed, Apr 17, 2013 at 7:48 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
Hi Jerome,

On 04/17/2013 10:01 PM, Jerome Glisse wrote:
On Tue, Apr 16, 2013 at 7:50 PM, Simon Jeons <simon.jeons@gmail.com> wrote:
On 04/17/2013 12:27 AM, Jerome Glisse wrote:

[snip]



As I said, this is for pre-filling entries that are already present, i.e. PTEs that are present with a valid page (no special bit set). This is an optimization so that the GPU can pre-fill its TLB without having to take mmap_sem. The hope is that in the most common case this will be enough, but in some cases you will have to go through the lengthy non-fast gup.

I know this. My concern is that the PTE you mentioned is the normal CPU one, correct? How can you pre-fill the PTE and TLB of the GPU?

You are getting confused. The idea is to look at the CPU PTE and pre-fill the GPU PTE. I do not pre-fill the CPU PTE; if a CPU PTE is valid, then I use the page it points to in order to pre-fill the GPU PTE.

Yes, confused!



So I don't pre-fill the CPU PTE or the GPU TLB; I pre-fill the GPU PTE from the CPU PTE if the CPU PTE is valid. Other GPU PTEs are marked as invalid and will trigger a fault, which is handled using gup. gup fills the CPU PTE (if the fault happened at a valid address), at which point the GPU PTE is updated; if the fault happened at an invalid address, an error is reported.
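
For illustration, the pre-fill step could look roughly like the userspace sketch below. It is a toy model under stated assumptions, not the actual patch: the `toy_cpu_pte` and `toy_gpu_pte` types and the `toy_gpu_prefill()` helper are hypothetical names, and real page table walking and locking are left out entirely. The only point is that the CPU entries are read, never written, and that anything not usable stays invalid on the GPU side.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these types are hypothetical, not kernel structures. */
struct toy_cpu_pte {
    bool     present;  /* entry currently maps a page */
    bool     special;  /* some special bit is set, so skip it */
    uint64_t pfn;      /* page frame the entry points to */
};

struct toy_gpu_pte {
    bool     valid;    /* false means a GPU access will fault */
    uint64_t pfn;
};

/*
 * Mirror usable CPU entries into the GPU table. The CPU table is only
 * read. Returns how many GPU entries were pre-filled.
 */
static size_t toy_gpu_prefill(const struct toy_cpu_pte *cpu,
                              struct toy_gpu_pte *gpu, size_t n)
{
    size_t filled = 0;

    for (size_t i = 0; i < n; i++) {
        if (cpu[i].present && !cpu[i].special) {
            /* Valid CPU PTE: reuse the page it points to. */
            gpu[i].valid = true;
            gpu[i].pfn = cpu[i].pfn;
            filled++;
        } else {
            /* Leave it invalid; a later GPU fault will sort it out. */
            gpu[i].valid = false;
            gpu[i].pfn = 0;
        }
    }
    return filled;
}
```

Calling `toy_gpu_prefill()` on a table with one present entry, one absent entry and one special entry would pre-fill only the first; the other two stay invalid and are left to the fault path.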

gup is used to fill the CPU PTE. Could you point out which code re-fills the GPU PTE? gup fast?
Is the GPU page table different from the CPU's?


The GPU interrupt handler will schedule a work thread that will call gup and then update the GPU page table.
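
As a rough sketch of that fault path, here is a userspace toy model. Everything in it is an assumption made for illustration: `toy_gpu_irq()` stands in for the interrupt handler, `toy_gpu_fault_work()` for the scheduled work, and `toy_gup()` for gup; none of these are real kernel functions, and the "allocator" is just a counter. The shape is what matters: the interrupt side only queues the fault, and the work side calls the gup stand-in and then updates the GPU entry, or counts an error for an invalid address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPAGES    8
#define TOY_QUEUE_LEN 4

/* Toy model only: all names here are hypothetical, not kernel API. */
struct toy_mm {
    bool     mapped[TOY_NPAGES];      /* address is inside a valid mapping */
    bool     cpu_present[TOY_NPAGES]; /* CPU PTE is present */
    uint64_t cpu_pfn[TOY_NPAGES];
    bool     gpu_valid[TOY_NPAGES];   /* GPU PTE is valid */
    uint64_t gpu_pfn[TOY_NPAGES];
    uint64_t next_free_pfn;           /* stand-in page allocator */
};

struct toy_fault_queue {
    size_t idx[TOY_QUEUE_LEN];
    size_t count;
};

/* "Interrupt handler": only records the fault for later processing. */
static bool toy_gpu_irq(struct toy_fault_queue *q, size_t page_idx)
{
    if (q->count == TOY_QUEUE_LEN)
        return false;
    q->idx[q->count++] = page_idx;
    return true;
}

/* Stand-in for gup: fills the CPU PTE if needed; -1 on an invalid address. */
static int toy_gup(struct toy_mm *mm, size_t page_idx, uint64_t *pfn)
{
    if (page_idx >= TOY_NPAGES || !mm->mapped[page_idx])
        return -1;
    if (!mm->cpu_present[page_idx]) {
        mm->cpu_pfn[page_idx] = mm->next_free_pfn++;
        mm->cpu_present[page_idx] = true;
    }
    *pfn = mm->cpu_pfn[page_idx];
    return 0;
}

/* "Work thread": drains the queue; returns how many faults were errors. */
static size_t toy_gpu_fault_work(struct toy_mm *mm, struct toy_fault_queue *q)
{
    size_t errors = 0;

    for (size_t i = 0; i < q->count; i++) {
        size_t idx = q->idx[i];
        uint64_t pfn;

        if (toy_gup(mm, idx, &pfn) != 0) {
            errors++;               /* fault at an invalid address */
            continue;
        }
        mm->gpu_pfn[idx] = pfn;     /* now update the GPU page table */
        mm->gpu_valid[idx] = true;
    }
    q->count = 0;
    return errors;
}
```

In this model the CPU and GPU tables are separate arrays, which mirrors the point made in the thread: gup only touches the CPU side, and the GPU entry is written afterwards by the work function.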

Cheers,
Jerome