From: "Gabbay, Oded" <Oded.Gabbay@amd.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
    "Lewycky, Andrew" <Andrew.Lewycky@amd.com>,
    "Cornwall, Jay" <Jay.Cornwall@amd.com>,
    "Bridgman, John" <John.Bridgman@amd.com>,
    Jerome Glisse <j.glisse@gmail.com>,
    "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
    "linux-mm@kvack.org" <linux-mm@kvack.org>,
    "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
    "mgorman@suse.de" <mgorman@suse.de>,
    "hpa@zytor.com" <hpa@zytor.com>,
    "peterz@infraread.org" <peterz@infraread.org>,
    "aarcange@redhat.com" <aarcange@redhat.com>,
    "riel@redhat.com" <riel@redhat.com>,
    "jweiner@redhat.com" <jweiner@redhat.com>,
    "torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
    Mark Hairgrove <mhairgrove@nvidia.com>,
    Jatin Kumar <jakumar@nvidia.com>,
    Subhash Gutti <sgutti@nvidia.com>,
    Lucien Dunning <ldunning@nvidia.com>,
    Cameron Buschardt <cabuschardt@nvidia.com>,
    Arvind Gopalakrishnan <arvindg@nvidia.com>,
    John Hubbard <jhubbard@nvidia.com>,
    Sherry Cheung <SCheung@nvidia.com>,
    Duncan Poole <dpoole@nvidia.com>,
    "iommu@lists.linux-foundation.org" <iommu@lists.linux-foundation.org>
Subject: Re: [PATCH 1/6] mmput: use notifier chain to call subsystem exit handler.
Date: Tue, 1 Jul 2014 09:29:49 +0000
Message-ID: <019CCE693E457142B37B791721487FD91806DD8B@storexdag01.amd.com>
In-Reply-To: <20140701091535.GF26537@8bytes.org>

On Tue, 2014-07-01 at 11:15 +0200, Joerg Roedel wrote:
> On Mon, Jun 30, 2014 at 02:35:57PM -0400, Jerome Glisse wrote:
> > We do intend to tear down all secondary mapping inside the release
> > callback but still we can not cleanup all the resources associated
> > with it.
> >
>
> And why can't you cleanup the other resources in the file close path?
> Tearing down the mappings is all you need to do in the release function
> anyway.
>
> > As said from the release call back you can not call
> > mmu_notifier_unregister and thus you can not fully cleanup things.
>
> You don't need to call mmu_notifier_unregister when the release function
> is already running from exit_mmap because this is equivalent to calling
> mmu_notifier_unregister.
>
> > Only way to achieve so is to do it outside mmu_notifier callback.
>
> The resources that can't be handled there can be cleaned up in the
> file-close path. No need for a new notifier in mm code.
>
> In the end all you need to do in the release function is to tear down
> the secondary mapping and make sure the device can no longer access the
> address space when the release function returns. Everything else, like
> freeing any resources can be done later when the file descriptors are
> torn down.

I will answer from the KFD perspective, as I'm AMD's maintainer of this
driver.

A little background: AMD's HSA Linux kernel driver (called radeon_kfd,
or KFD for short) has been developed by AMD over the past year to
support running Linux compute applications on AMD's HSA-enabled APUs,
i.e. Kaveri (A10-7850K/7700K). The driver will be up for kernel
community review in about 2-3 weeks so that we can push it during the
3.17 merge window. We have already discussed this driver with the
gpu/drm subsystem maintainers.

In the KFD, we need to maintain a notion of each compute process.
Therefore, we have an object called "kfd_process" that is created for
each process that uses the KFD. Naturally, we need to be able to track
the process's shutdown in order to clean up the resources it uses
(compute queues, virtual address space, GPU local memory allocations,
etc.).

To enable this tracking mechanism, we decided to associate the
kfd_process with the mm_struct, so that a kfd_process object has
exactly the same lifespan as the process it represents. We preferred
the mm_struct over a file descriptor because using a file descriptor to
track “process” shutdown is wrong in two ways:

* Technical: file descriptors can be passed to unrelated processes
  using AF_UNIX sockets. This means that a process can exit while the
  file stays open. Even if we implement this “correctly”, i.e. by
  keeping the address space & page tables alive until the file is
  finally released, it’s really dodgy.

* Philosophical: our ioctls are actually system calls in disguise. They
  operate on the process, not on a device.

Moreover, because the GPU interacts with the process only through
virtual memory (and not, e.g., through file descriptors), and because
the virtual address space is fundamental to an intuitive notion of what
a process is, associating the kfd_process with the mm_struct seems like
a natural choice.

Then came the issue of how the KFD is notified about the destruction of
an mm_struct. Because the mmu_notifier release callback is called under
an RCU read lock, it can't destroy the mmu_notifier object, which is
embedded in the kfd_process object itself. Therefore, I talked to
Jerome and Andrew Morton about a way to implement this, and after that
discussion (which took place in private emails), Jerome was kind enough
to write a patch, which is the patch we are now discussing.

You are more than welcome to take a look at the entire driver at
http://cgit.freedesktop.org/~gabbayo/linux/?h=kfd-0.6.x
although the driver will undergo some changes before we send the pull
request to Dave Airlie.

I believe that converting the amd_iommu_v2 driver to use this patch as
well will benefit all parties. AFAIK, KFD is the _only_ client of the
amd_iommu_v2 driver, so it is imperative that we work together on this.

	Oded

> > If you know any other way to call mmu_notifier_unregister before the
> > end of mmput function than i am all ear. I am not adding this call
> > back just for the fun of it i spend serious time trying to find a
> > way to do thing without it. I might have miss a way so if i did please
> > show it to me.
>
> Why do you need to call mmu_notifier_unregister manually when it is done
> implicitly in exit_mmap already?
>
>
> 	Joerg
>
>
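
The constraint described in the message above (a release callback that runs inside a read-side critical section may tear down mappings but may not free the object it is running on) can be illustrated with a small userspace sketch. This is only an analogy under stated assumptions: it uses no kernel API, a plain counter stands in for the RCU read-side section, and every name (`demo_process`, `demo_release`, `demo_run_deferred`, `demo_exit_mm`) is hypothetical and does not come from the KFD driver or the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-process driver object (not kfd_process). */
struct demo_process {
	bool mappings_live;                 /* device mappings still present */
	struct demo_process *next_deferred; /* link in the deferred-free list */
};

/* Assumption: a counter models "we are inside a read-side section". */
static int demo_read_depth;
static struct demo_process *demo_deferred; /* objects awaiting destruction */
static int demo_destroyed_count;           /* objects actually freed so far */

static struct demo_process *demo_process_create(void)
{
	struct demo_process *p = calloc(1, sizeof(*p));

	if (p)
		p->mappings_live = true;
	return p;
}

/*
 * Release callback: it runs inside the read-side section, so it only
 * tears down the mappings and queues the object for later destruction.
 */
static void demo_release(struct demo_process *p)
{
	assert(demo_read_depth > 0);
	p->mappings_live = false;
	p->next_deferred = demo_deferred;
	demo_deferred = p;
}

/*
 * Deferred cleanup: runs only after the read-side section has ended,
 * where freeing the object is safe. Returns how many objects were freed.
 */
static int demo_run_deferred(void)
{
	int freed = 0;

	assert(demo_read_depth == 0);
	while (demo_deferred) {
		struct demo_process *p = demo_deferred;

		demo_deferred = p->next_deferred;
		free(p);
		demo_destroyed_count++;
		freed++;
	}
	return freed;
}

/* Models address-space teardown: callback first, destruction afterwards. */
static void demo_exit_mm(struct demo_process *p)
{
	demo_read_depth++;      /* enter the simulated read-side section */
	demo_release(p);
	demo_read_depth--;      /* leave it */
	demo_run_deferred();    /* only now is freeing allowed */
}
```

Calling `demo_exit_mm(p)` on an object returned by `demo_process_create()` first invalidates its mappings and only frees it once the simulated read-side section is over. The disagreement in the thread is about where that second step should live: in a new exit hook in mm code, or in the file-close path.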