From: "Gabbay, Oded" <Oded.Gabbay@amd.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Lewycky, Andrew" <Andrew.Lewycky@amd.com>,
	"Cornwall, Jay" <Jay.Cornwall@amd.com>,
	"Bridgman, John" <John.Bridgman@amd.com>,
	Jerome Glisse <j.glisse@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mgorman@suse.de" <mgorman@suse.de>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"peterz@infraread.org" <peterz@infraread.org>,
	"aarcange@redhat.com" <aarcange@redhat.com>,
	"riel@redhat.com" <riel@redhat.com>,
	"jweiner@redhat.com" <jweiner@redhat.com>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
	Mark Hairgrove <mhairgrove@nvidia.com>,
	Jatin Kumar <jakumar@nvidia.com>,
	Subhash Gutti <sgutti@nvidia.com>,
	Lucien Dunning <ldunning@nvidia.com>,
	Cameron Buschardt <cabuschardt@nvidia.com>,
	Arvind Gopalakrishnan <arvindg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Sherry Cheung <SCheung@nvidia.com>,
	Duncan Poole <dpoole@nvidia.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>
Subject: Re: [PATCH 1/6] mmput: use notifier chain to call subsystem exit handler.
Date: Tue, 1 Jul 2014 09:29:49 +0000
Message-ID: <019CCE693E457142B37B791721487FD91806DD8B@storexdag01.amd.com>
In-Reply-To: <20140701091535.GF26537@8bytes.org>

On Tue, 2014-07-01 at 11:15 +0200, Joerg Roedel wrote:
> On Mon, Jun 30, 2014 at 02:35:57PM -0400, Jerome Glisse wrote:
> > We do intend to tear down all secondary mapping inside the relase
> > callback but still we can not cleanup all the resources associated
> > with it.
> >
> 
> And why can't you cleanup the other resources in the file close path?
> Tearing down the mappings is all you need to do in the release function
> anyway.
> 
> > As said from the release call back you can not call
> > mmu_notifier_unregister and thus you can not fully cleanup things.
> 
> You don't need to call mmu_notifier_unregister when the release function
> is already running from exit_mmap because this is equivalent to calling
> mmu_notifier_unregister.
> 
> > Only way to achieve so is to do it ouside mmu_notifier callback.
> 
> The resources that can't be handled there can be cleaned up in the
> file-close path. No need for a new notifier in mm code.
> 
> In the end all you need to do in the release function is to tear down
> the secondary mapping and make sure the device can no longer access the
> address space when the release function returns. Everything else, like
> freeing any resources can be done later when the file descriptors are
> teared down.

I will answer from the KFD perspective, as I'm AMD's maintainer of this
driver.

A little background: AMD's HSA Linux kernel driver (called radeon_kfd,
or KFD for short) has been developed by AMD over the past year to
support running Linux compute applications on AMD's HSA-enabled APUs,
i.e. Kaveri (A10-7850K/7700K). The driver will be up for kernel
community review in about 2-3 weeks so that we can push it during the
3.17 merge window. We have already discussed this driver with the
gpu/drm subsystem maintainers.

In the KFD, we need to maintain a notion of each compute process.
Therefore, we have an object called "kfd_process" that is created for
each process that uses the KFD. Naturally, we need to be able to track
the process's shutdown in order to perform cleanup of the resources it
uses (compute queues, virtual address space, gpu local memory
allocations, etc.).

To enable this tracking mechanism, we decided to associate the
kfd_process with mm_struct to ensure that a kfd_process object has
exactly the same lifespan as the process it represents. We preferred to
use the mm_struct and not a file descriptor because using a file
descriptor to track “process” shutdown is wrong in two ways:

* Technical: file descriptors can be passed to unrelated processes using
AF_UNIX sockets. This means that a process can exit while the file stays
open. Even if we implement this “correctly”, i.e. by holding the address
space & page tables alive until the file is finally released, it’s
really dodgy.

* Philosophical: our ioctls are actually system calls in disguise. They
operate on the process, not on a device.
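
To illustrate the technical point, here is a minimal userspace sketch,
not KFD code. The helper names (send_fd, recv_fd, fd_outlives_sender)
are made up for this example, and a pipe stands in for the device file.
The child creates the pipe, passes its read end over an AF_UNIX socket
with SCM_RIGHTS and exits; the parent picks up the descriptor only after
the child is gone and can still read from it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: pass fd over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
	char byte = 'x';	/* one byte of ordinary data carries the message */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Returns 1 if the passed descriptor is still usable after the sender exited. */
int fd_outlives_sender(void)
{
	int sv[2], p[2], fd, status;
	char buf[2];
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: the pipe exists only here, never in the parent. */
		close(sv[0]);
		if (pipe(p) < 0 || write(p[1], "ok", 2) != 2 ||
		    send_fd(sv[1], p[0]) < 0)
			_exit(1);
		_exit(0);
	}
	close(sv[1]);
	/* Wait until the sending process is completely gone. */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != 0) {
		close(sv[0]);
		return -1;
	}
	fd = recv_fd(sv[0]);
	close(sv[0]);
	if (fd < 0)
		return -1;
	status = (read(fd, buf, 2) == 2 && memcmp(buf, "ok", 2) == 0);
	close(fd);
	return status;
}
```

Calling fd_outlives_sender() from a test program shows the open file
outliving the process that created it, which is why a file release
callback is a poor proxy for process exit.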

Moreover, because the GPU interacts with the process only through
virtual memory (and not e.g. file descriptors), and because virtual
address space is fundamental to an intuitive notion of what a process
is, the decision to associate the kfd_process with mm_struct seems like
a natural choice.

Then came the issue of how the KFD is notified about an mm_struct
destruction. Because the mmu_notifier release callback is called under
an RCU read lock, it can't destroy the mmu_notifier object, which is the
kfd_process object itself. Therefore, I talked to Jerome and Andrew
Morton about a way to implement this, and after that discussion (which
took place in private emails), Jerome was kind enough to write a patch,
which is the patch we are now discussing.
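
The constraint can be modelled in plain userspace C. This is only a
sketch under stated assumptions: it uses no kernel API, and every name
in it (proc_obj, model_address_space_exit, model_exit_handler, and so
on) is hypothetical. A flag stands in for the read-side critical
section, the release callback only tears down mappings and records the
object, and the actual free happens in a later hook that runs outside
that section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct notifier;
struct notifier_ops {
	void (*release)(struct notifier *n);
};
struct notifier {
	const struct notifier_ops *ops;
};

/* Hypothetical stand-in for kfd_process; the notifier is embedded in it. */
struct proc_obj {
	struct notifier n;
	bool mappings_live;
};

static bool in_read_section;		/* models the read-side critical section */
static struct proc_obj *deferred;	/* one-slot queue of objects to free later */
static int freed_inside, freed_outside;

/* Frees the object and records whether that happened inside the section. */
static void proc_free(struct proc_obj *p)
{
	if (in_read_section)
		freed_inside++;
	else
		freed_outside++;
	free(p);
}

/* Release callback: cut off device access, but do not free the object. */
static void proc_release(struct notifier *n)
{
	struct proc_obj *p =
		(struct proc_obj *)((char *)n - offsetof(struct proc_obj, n));

	p->mappings_live = false;	/* secondary mappings torn down */
	deferred = p;			/* destruction postponed */
}

static const struct notifier_ops proc_ops = { .release = proc_release };

/* Models the caller that invokes release while holding the read lock. */
static void model_address_space_exit(struct notifier *n)
{
	in_read_section = true;
	n->ops->release(n);
	/* The caller may still touch n here, so n must remain valid. */
	in_read_section = false;
}

/* Models a later hook that runs once the read-side section has ended. */
static void model_exit_handler(void)
{
	if (deferred) {
		proc_free(deferred);
		deferred = NULL;
	}
}

/* Returns 1 if the object was freed exactly once, outside the section. */
int demo_deferred_destroy(void)
{
	struct proc_obj *p = calloc(1, sizeof(*p));

	if (!p)
		return -1;
	p->n.ops = &proc_ops;
	p->mappings_live = true;
	freed_inside = freed_outside = 0;

	model_address_space_exit(&p->n);
	model_exit_handler();
	return freed_inside == 0 && freed_outside == 1;
}
```

If proc_release() called proc_free() directly, the object containing the
notifier would vanish while model_address_space_exit() still holds a
pointer to it, which is the situation described above.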

You are more than welcome to take a look at the entire driver at
http://cgit.freedesktop.org/~gabbayo/linux/?h=kfd-0.6.x although
the driver will undergo some changes before we send the pull request to
Dave Airlie.

I believe that converting the amd_iommu_v2 driver to use this patch as
well will benefit all parties. AFAIK, KFD is the _only_ client of the
amd_iommu_v2 driver, so it is imperative that we work together on
this.

	Oded
> > If you know any other way to call mmu_notifier_unregister before the
> > end of mmput function than i am all ear. I am not adding this call
> > back just for the fun of it i spend serious time trying to find a
> > way to do thing without it. I might have miss a way so if i did please
> > show it to me.
> 
> Why do you need to call mmu_notifier_unregister manually when it is done
> implicitly in exit_mmap already? 
> 
> 
> 	Joerg
> 
> 

