From: Oded Gabbay <oded.gabbay@gmail.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Jerome Glisse <j.glisse@gmail.com>,
	lsf-pc@lists.linux-foundation.org,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Joerg Roedel <joro@8bytes.org>
Subject: Re: [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU
Date: Wed, 3 Feb 2016 13:07:07 +0200	[thread overview]
Message-ID: <CAFCwf12U2iQS2xUoRx4W7cVQJOcso+QK2_PdYYD-k_J1V8KJsQ@mail.gmail.com> (raw)
In-Reply-To: <CAFCwf10tLwQiZ0ROeuf2FHcWa9iTBwJ-0X_WWfU8tjTSvGH_0w@mail.gmail.com>

On Wed, Feb 3, 2016 at 1:01 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> On Wed, Feb 3, 2016 at 12:15 PM, David Woodhouse <dwmw2@infradead.org> wrote:
>> On Wed, 2016-02-03 at 11:21 +0200, Oded Gabbay wrote:
>>
>>> OK, so I think I got a little confused, but looking at your code I
>>> see that you register SVM with the mm notifier (intel_mm_release),
>>> so I guess what you meant to say is that you don't want to call a
>>> device driver callback from your mm notifier callback, correct? (Like
>>> the amd_iommu_v2 driver does when it calls dev_state->inv_ctx_cb
>>> inside its mn_release.)
>>
>> Right.
>>
>>> Because you can't really control what the device driver will do, i.e.
>>> whether it decides to register itself with the mm notifier in its own code.
>>
>> Right. I can't *prevent* them from doing it. But I don't need to
>> encourage or facilitate it :)
>>
>>> And because you don't call the device driver, the driver can/will get
>>> errors for using this PASID (since you unbound it), and the device
>>> driver is supposed to handle them. Did I understand that correctly?
>>
>> In the case of an unclean exit, yes. In an orderly shutdown of the
>> process, one would hope that the device context is relinquished cleanly
>> rather than the process simply exiting.
>>
>> And yes, the device and its driver are expected to handle faults. If
>> they don't do that, they are broken :)
>>
>>> If I understood it correctly, doesn't that conflate errors/faults
>>> with normal unbinding? Wouldn't it be better to actively notify the
>>> drivers and indeed *wait* until the device driver has cleared its H/W
>>> pipeline before "pulling the carpet out from under their feet"?
>>>
>>> In our case (AMD GPUs), such an error could get the GPU stuck. That's
>>> why we even reset the wavefronts inside the GPU if we can't
>>> gracefully remove the work from it (see
>>> kfd_unbind_process_from_device).
>>
>> But a rogue process can easily trigger faults — just request access to
>> an address that doesn't exist. My conversation with the hardware
>> designers was not about the peculiarities of any specific
>> implementation, but just getting them to confirm my assertion that if a
>> device *doesn't* cleanly handle faults on *one* PASID without screwing
>> over all the *other* PASIDs, then it is utterly broken by design and
>> should never get to production.
>
> Yes, that is agreed, address errors should not affect the H/W itself,
> nor other processes.
>
>>
>> I *do* anticipate broken hardware which will crap itself completely
>> when it takes a fault, and have implemented a callback from the fault
>> handler so that the driver gets notified when a fault *happens* (even
>> on a PASID which is still alive), and can prod the broken hardware if
>> it needs to.
>>
>> But I wasn't expecting it to be the norm.
>>
> Yeah, I guess that after a few H/W iterations the "correct"
> implementation will be the norm.
>
>>> In the patch's comment you wrote:
>>> "Hardware designers have confirmed that the resulting 'PASID not present'
>>> faults should be handled just as gracefully as 'page not present' faults"
>>>
>>> Unless *all* the H/W that is going to use SVM is designed by the same
>>> company, I don't think we can say such a thing. And even then, from my
>>> experience, H/W designers can be "creative" sometimes.
>>
>> If we have to turn it into a 'page not present' fault instead of a
>> 'PASID not present' fault, that's easy enough to do by pointing it at a
>> dummy PML4 (the zero page will do).
>>
>> But I stand by my assertion that any hardware which doesn't handle at
>> least a 'page not present' fault in a given PASID without screwing over
>> all the other users of the hardware is BROKEN.
>
> Totally agreed!
>
>>
>> We could *almost* forgive hardware for stalling when it sees a 'PASID
>> not present' fault, since that *does* require OS participation.
>>
>> --
>> David Woodhouse                            Open Source Technology Centre
>> David.Woodhouse@intel.com                              Intel Corporation
>>
>
> Another, perhaps trivial, question.
> When there is an address fault, who handles it? The SVM driver, or
> each device driver?
>
> In other words, is the model the same as the (AMD) IOMMU one, where
> the amd_iommu driver is bound to the IOMMU H/W and that driver
> (amd_iommu_v2) is the only one which handles the PPR events?
>
> If that is the case, then with SVM, how will the device driver be made
> aware of faults if the SVM driver won't notify it about them, because
> it has already severed the connection between the PASID and the
> process?
>
> If the model is that each device driver gets a direct fault
> notification (via interrupt or some other way) then that is a
> different story.
>
> Oded

And another question, if I may: aren't you afraid of "false positive"
prints to dmesg? I mean, I'm pretty sure page-fault / PASID-fault
errors will be logged somewhere, probably to dmesg. Aren't you
concerned about users seeing those errors and thinking they have a
bug, when actually the errors were only caused by process
termination?

Or, in that case, would you say the application is broken, because if
it still had something running on the H/W, it should not have exited?

I can accept that, I just want to know what our answer will be when
people start to complain :)

Thanks,

     Oded

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


Thread overview: 17+ messages
2016-01-28 17:55 [LSF/MM ATTEND] HMM (heterogeneous memory manager) and GPU Jerome Glisse
2016-01-29  9:50 ` Kirill A. Shutemov
2016-01-29 13:35   ` Jerome Glisse
2016-02-01 15:46 ` Aneesh Kumar K.V
2016-02-02 23:03   ` Jerome Glisse
2016-02-03  0:40 ` David Woodhouse
2016-02-03  8:13   ` Oded Gabbay
2016-02-03  8:40     ` David Woodhouse
2016-02-03  9:21       ` Oded Gabbay
2016-02-03 10:15         ` David Woodhouse
2016-02-03 11:01           ` Oded Gabbay
2016-02-03 11:07             ` Oded Gabbay [this message]
2016-02-03 11:35               ` David Woodhouse
2016-02-03 11:41                 ` David Woodhouse
2016-02-03 11:41                 ` Oded Gabbay
2016-02-03 12:22                   ` David Woodhouse
2016-02-25 13:49   ` Joerg Roedel
