* [LSF/MM] schedule suggestion @ 2018-04-18 21:19 Jerome Glisse 2018-04-19 0:48 ` Dave Chinner 2018-04-19 1:55 ` Dave Chinner 0 siblings, 2 replies; 23+ messages in thread From: Jerome Glisse @ 2018-04-18 21:19 UTC (permalink / raw) To: linux-mm Cc: lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko Just wanted to suggest to push HMM status down one slot in the agenda to avoid having FS and MM first going into their own room and then merging back for GUP and DAX, and re-splitting after. Moreover, HMM and NUMA talks would be good to have back to back as they mostly deal with the same kind of thing. So on Monday afternoon GUP in the first slot would be nice :) Just a suggestion https://docs.google.com/spreadsheets/d/15XFz_Zsklmle--L9CO4-ygCqmSHwBsFPjvjDpiL5qwM/edit#gid=0 Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-18 21:19 [LSF/MM] schedule suggestion Jerome Glisse @ 2018-04-19 0:48 ` Dave Chinner 2018-04-19 1:55 ` Dave Chinner 1 sibling, 0 replies; 23+ messages in thread From: Dave Chinner @ 2018-04-19 0:48 UTC (permalink / raw) To: Jerome Glisse Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote: > Just wanted to suggest to push HMM status down one slot in the > agenda to avoid having FS and MM first going into their own > room and then merging back for GUP and DAX, and re-splitting > after. More over HMM and NUMA talks will be good to have back > to back as they deal with same kind of thing mostly. Yeah, I noticed there are a few of these shared sessions that have been placed in the middle slot of a session. Would be good to put all the shared fs/mm sessions into a complete session block so people don't have to keep moving rooms every half hour... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-18 21:19 [LSF/MM] schedule suggestion Jerome Glisse 2018-04-19 0:48 ` Dave Chinner @ 2018-04-19 1:55 ` Dave Chinner 2018-04-19 14:38 ` Jerome Glisse 2018-04-19 14:51 ` Chris Mason 1 sibling, 2 replies; 23+ messages in thread From: Dave Chinner @ 2018-04-19 1:55 UTC (permalink / raw) To: Jerome Glisse Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote: > Just wanted to suggest to push HMM status down one slot in the > agenda to avoid having FS and MM first going into their own > room and then merging back for GUP and DAX, and re-splitting > after. More over HMM and NUMA talks will be good to have back > to back as they deal with same kind of thing mostly. So while we are talking about schedule suggestions, we see that there's lots of empty slots in the FS track. We (xfs guys) were just chatting on #xfs about whether we'd have time to have a "XFS devel meeting" at some point during LSF/MM as we are rarely in the same place at the same time. I'd like to propose that we compact the fs sessions so that we get a 3-slot session reserved for "Individual filesystem discussions" one afternoon. That way we've got time in the schedule for all the ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and talk about things of interest only to their own filesystems. That means we all don't have to find time outside the schedule to do this, and I think this would be time very well spent for most fs people at the conf.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 1:55 ` Dave Chinner @ 2018-04-19 14:38 ` Jerome Glisse 2018-04-19 14:43 ` Matthew Wilcox 2018-04-19 14:51 ` Chris Mason 1 sibling, 1 reply; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 14:38 UTC (permalink / raw) To: Dave Chinner Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 11:55:08AM +1000, Dave Chinner wrote: > On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote: > > Just wanted to suggest to push HMM status down one slot in the > > agenda to avoid having FS and MM first going into their own > > room and then merging back for GUP and DAX, and re-splitting > > after. More over HMM and NUMA talks will be good to have back > > to back as they deal with same kind of thing mostly. > > So while we are talking about schedule suggestions, we see that > there's lots of empty slots in the FS track. We (xfs guys) were just > chatting on #xfs about whether we'd have time to have a "XFS devel > meeting" at some point during LSF/MM as we are rarely in the same > place at the same time. > > I'd like to propose that we compact the fs sessions so that we get a > 3-slot session reserved for "Individual filesystem discussions" one > afternoon. That way we've got time in the schedule for the all the > ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and > talk about things of interest only to their own fileystems. > > That means we all don't have to find time outside the schedule to do > this, and think this wold be time very well spent for most fs people > at the conf.... Oh, can I get one more small slot for fs? I want to ask if there are any people against having a callback every time a struct file is added to a task_struct, and also having a secondary array so that special files like device files can store something opaque per task_struct per struct file. I will try to stitch a patchset together tomorrow for that. 
A lot of device drivers would like to have this. Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 14:38 ` Jerome Glisse @ 2018-04-19 14:43 ` Matthew Wilcox 2018-04-19 16:30 ` Jerome Glisse 0 siblings, 1 reply; 23+ messages in thread From: Matthew Wilcox @ 2018-04-19 14:43 UTC (permalink / raw) To: Jerome Glisse Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > Oh can i get one more small slot for fs ? I want to ask if they are > any people against having a callback everytime a struct file is added > to a task_struct and also having a secondary array so that special > file like device file can store something opaque per task_struct per > struct file. Do you really want something per _thread_, and not per _mm_? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 14:43 ` Matthew Wilcox @ 2018-04-19 16:30 ` Jerome Glisse 2018-04-19 16:58 ` Jeff Layton 0 siblings, 1 reply; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 16:30 UTC (permalink / raw) To: Matthew Wilcox Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote: > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > > Oh can i get one more small slot for fs ? I want to ask if they are > > any people against having a callback everytime a struct file is added > > to a task_struct and also having a secondary array so that special > > file like device file can store something opaque per task_struct per > > struct file. > > Do you really want something per _thread_, and not per _mm_? Well, per mm would be fine but I do not see how to make that happen with a reasonable structure. The issue is that you can have multiple tasks with the same mm but different file descriptors (or am I wrong here?), and thus there would be no easy way, given a struct file, to look up the per-mm struct. So as an imperfect solution I see a new array in the fd table which would allow a device driver to store a pointer to its per-mm data structure. To be fair, usually you will only have a single fd in a single task for a given device. If you see an easy way to get a per-mm per-inode pointer stored somewhere with easy lookup, I am all ears :) Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 16:30 ` Jerome Glisse @ 2018-04-19 16:58 ` Jeff Layton 2018-04-19 17:26 ` Jerome Glisse 0 siblings, 1 reply; 23+ messages in thread From: Jeff Layton @ 2018-04-19 16:58 UTC (permalink / raw) To: Jerome Glisse, Matthew Wilcox Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote: > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote: > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > > > Oh can i get one more small slot for fs ? I want to ask if they are > > > any people against having a callback everytime a struct file is added > > > to a task_struct and also having a secondary array so that special > > > file like device file can store something opaque per task_struct per > > > struct file. > > > > Do you really want something per _thread_, and not per _mm_? > > Well per mm would be fine but i do not see how to make that happen with > reasonable structure. So issue is that you can have multiple task with > same mm but different file descriptors (or am i wrong here ?) thus there > would be no easy way given a struct file to lookup the per mm struct. > > So as a not perfect solution i see a new array in filedes which would > allow device driver to store a pointer to their per mm data structure. > To be fair usualy you will only have a single fd in a single task for > a given device. > > If you see an easy way to get a per mm per inode pointer store somewhere > with easy lookup i am all ears :) > I may be misunderstanding, but to be clear: struct files don't get added to a thread, per-se. When userland calls open() or similar, the struct file gets added to the files_struct. Those are generally shared with other threads within the same process. The files_struct can also be shared with other processes if you clone() with the right flags. 
Doing something per-thread on every open may be rather difficult to do. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 16:58 ` Jeff Layton @ 2018-04-19 17:26 ` Jerome Glisse 2018-04-19 18:31 ` Jeff Layton 2018-04-19 20:33 ` Al Viro 0 siblings, 2 replies; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 17:26 UTC (permalink / raw) To: Jeff Layton Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote: > On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote: > > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote: > > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > > > > Oh can i get one more small slot for fs ? I want to ask if they are > > > > any people against having a callback everytime a struct file is added > > > > to a task_struct and also having a secondary array so that special > > > > file like device file can store something opaque per task_struct per > > > > struct file. > > > > > > Do you really want something per _thread_, and not per _mm_? > > > > Well per mm would be fine but i do not see how to make that happen with > > reasonable structure. So issue is that you can have multiple task with > > same mm but different file descriptors (or am i wrong here ?) thus there > > would be no easy way given a struct file to lookup the per mm struct. > > > > So as a not perfect solution i see a new array in filedes which would > > allow device driver to store a pointer to their per mm data structure. > > To be fair usualy you will only have a single fd in a single task for > > a given device. > > > > If you see an easy way to get a per mm per inode pointer store somewhere > > with easy lookup i am all ears :) > > > > I may be misunderstanding, but to be clear: struct files don't get > added to a thread, per-se. > > When userland calls open() or similar, the struct file gets added to > the files_struct. Those are generally shared with other threads within > the same process. 
The files_struct can also be shared with other > processes if you clone() with the right flags. > > Doing something per-thread on every open may be rather difficult to do. Basically I want a callback in __fd_install(), do_dup2(), dup_fd() and to add void **private_data; to struct fdtable (also a default array to struct files_struct). The callback would be part of struct file_operations, and only called if it exists (so the overhead is only for device drivers that care). Did I miss something fundamental? copy_files() calls dup_fd(), so I should be all set here. I will work on patches; I was hoping this would not be too much work. Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
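[Editor's illustration] The side-array idea described here can be simulated in plain userspace C. This is only a sketch of the shape of the proposal, not kernel code: `fd_installed`, `fd_install_sim`, `dup_fd_sim`, and the toy driver are all invented names, and the real kernel structures are far more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FDS 64

struct file;

/* Hypothetical file_operations hook: called when the file is installed
 * into a descriptor table; its return value lands in the table's
 * parallel private_data slot. */
struct file_operations {
	void *(*fd_installed)(struct file *f);
};

struct file {
	const struct file_operations *f_op;
	int refcount;
};

struct fdtable {
	struct file *fd[MAX_FDS];    /* existing fd -> file array */
	void *private_data[MAX_FDS]; /* proposed parallel array   */
};

/* Analogue of __fd_install(): place the file and, only when the driver
 * cares (callback present), fill the side slot. */
static void fd_install_sim(struct fdtable *fdt, int fd, struct file *f)
{
	fdt->fd[fd] = f;
	f->refcount++;
	if (f->f_op && f->f_op->fd_installed)
		fdt->private_data[fd] = f->f_op->fd_installed(f);
}

/* Analogue of dup_fd() at clone() time: the side array must be copied
 * (or re-derived) along with the descriptors. */
static void dup_fd_sim(struct fdtable *dst, const struct fdtable *src)
{
	memcpy(dst->fd, src->fd, sizeof(dst->fd));
	memcpy(dst->private_data, src->private_data, sizeof(dst->private_data));
	for (int i = 0; i < MAX_FDS; i++)
		if (dst->fd[i])
			dst->fd[i]->refcount++;
}

/* A toy "device driver" that attaches a per-table context. */
static int contexts_created;

static void *toy_fd_installed(struct file *f)
{
	int *ctx = malloc(sizeof(*ctx));

	(void)f;
	*ctx = ++contexts_created;
	return ctx;
}

static const struct file_operations toy_fops = { .fd_installed = toy_fd_installed };
```

Note this toy sidesteps exactly the objections raised later in the thread (extra per-table refcounting, ownership of private_data when several drivers want it), which is why the real design moved elsewhere.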
* Re: [LSF/MM] schedule suggestion 2018-04-19 17:26 ` Jerome Glisse @ 2018-04-19 18:31 ` Jeff Layton 2018-04-19 19:31 ` Jerome Glisse 2018-04-19 20:33 ` Al Viro 1 sibling, 1 reply; 23+ messages in thread From: Jeff Layton @ 2018-04-19 18:31 UTC (permalink / raw) To: Jerome Glisse Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, 2018-04-19 at 13:26 -0400, Jerome Glisse wrote: > On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote: > > On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote: > > > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote: > > > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > > > > > Oh can i get one more small slot for fs ? I want to ask if they are > > > > > any people against having a callback everytime a struct file is added > > > > > to a task_struct and also having a secondary array so that special > > > > > file like device file can store something opaque per task_struct per > > > > > struct file. > > > > > > > > Do you really want something per _thread_, and not per _mm_? > > > > > > Well per mm would be fine but i do not see how to make that happen with > > > reasonable structure. So issue is that you can have multiple task with > > > same mm but different file descriptors (or am i wrong here ?) thus there > > > would be no easy way given a struct file to lookup the per mm struct. > > > > > > So as a not perfect solution i see a new array in filedes which would > > > allow device driver to store a pointer to their per mm data structure. > > > To be fair usualy you will only have a single fd in a single task for > > > a given device. > > > > > > If you see an easy way to get a per mm per inode pointer store somewhere > > > with easy lookup i am all ears :) > > > > > > > I may be misunderstanding, but to be clear: struct files don't get > > added to a thread, per-se. 
> > > > When userland calls open() or similar, the struct file gets added to > > the files_struct. Those are generally shared with other threads within > > the same process. The files_struct can also be shared with other > > processes if you clone() with the right flags. > > > > Doing something per-thread on every open may be rather difficult to do. > > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > add void * *private_data; to struct fdtable (also a default array to > struct files_struct). The callback would be part of struct file_operations. > and only call if it exist (os overhead is only for device driver that > care). > > Did i miss something fundamental ? copy_files() call dup_fd() so i > should be all set here. > > I will work on patches i was hoping this would not be too much work. > No, I think I misunderstood. I was thinking you wanted to iterate over all of the threads that might be associated with a struct file, and that's rather non-trivial. A callback when you add a file to the files_struct seems like it would probably be OK (in principle). -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 18:31 ` Jeff Layton @ 2018-04-19 19:31 ` Jerome Glisse 2018-04-19 19:56 ` Matthew Wilcox 0 siblings, 1 reply; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 19:31 UTC (permalink / raw) To: Jeff Layton Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 02:31:07PM -0400, Jeff Layton wrote: > On Thu, 2018-04-19 at 13:26 -0400, Jerome Glisse wrote: > > On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote: > > > On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote: > > > > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote: > > > > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote: > > > > > > Oh can i get one more small slot for fs ? I want to ask if they are > > > > > > any people against having a callback everytime a struct file is added > > > > > > to a task_struct and also having a secondary array so that special > > > > > > file like device file can store something opaque per task_struct per > > > > > > struct file. > > > > > > > > > > Do you really want something per _thread_, and not per _mm_? > > > > > > > > Well per mm would be fine but i do not see how to make that happen with > > > > reasonable structure. So issue is that you can have multiple task with > > > > same mm but different file descriptors (or am i wrong here ?) thus there > > > > would be no easy way given a struct file to lookup the per mm struct. > > > > > > > > So as a not perfect solution i see a new array in filedes which would > > > > allow device driver to store a pointer to their per mm data structure. > > > > To be fair usualy you will only have a single fd in a single task for > > > > a given device. 
> > > > > > > > If you see an easy way to get a per mm per inode pointer store somewhere > > > > with easy lookup i am all ears :) > > > > > > > I may be misunderstanding, but to be clear: struct files don't get > > > added to a thread, per-se. > > > > > > When userland calls open() or similar, the struct file gets added to > > > the files_struct. Those are generally shared with other threads within > > > the same process. The files_struct can also be shared with other > > > processes if you clone() with the right flags. > > > > > > Doing something per-thread on every open may be rather difficult to do. > > > > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > > add void * *private_data; to struct fdtable (also a default array to > > struct files_struct). The callback would be part of struct file_operations. > > and only call if it exist (os overhead is only for device driver that > > care). > > > > Did i miss something fundamental ? copy_files() call dup_fd() so i > > should be all set here. > > > > I will work on patches i was hoping this would not be too much work. > > > > No, I think I misunderstood. I was thinking you wanted to iterate over > all of the threads that might be associated with a struct file, and > that's rather non-trivial. > > A callback when you add a file to the files_struct seems like it would > probably be OK (in principle). Well, scratch that whole idea; I would need to add a new array to the task struct, which makes it a lot less appealing. Hence a better solution is to instead have this as part of the mm (well, indirectly). Thanks folks for chiming in. I will discuss this in my mmu_notifier session. Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 19:31 ` Jerome Glisse @ 2018-04-19 19:56 ` Matthew Wilcox 2018-04-19 20:15 ` Jerome Glisse 0 siblings, 1 reply; 23+ messages in thread From: Matthew Wilcox @ 2018-04-19 19:56 UTC (permalink / raw) To: Jerome Glisse Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 03:31:08PM -0400, Jerome Glisse wrote: > > > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > > > add void * *private_data; to struct fdtable (also a default array to > > > struct files_struct). The callback would be part of struct file_operations. > > > and only call if it exist (os overhead is only for device driver that > > > care). > > > > > > Did i miss something fundamental ? copy_files() call dup_fd() so i > > > should be all set here. > > > > > > I will work on patches i was hoping this would not be too much work. > > Well scratch that whole idea, i would need to add a new array to task > struct which make it a lot less appealing. Hence a better solution is > to instead have this as part of mm (well indirectly). It shouldn't be too bad to add a struct radix_tree to the fdtable. I'm sure we could just not support weird cases like sharing the fdtable without sharing the mm. Does anyone actually do that? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 19:56 ` Matthew Wilcox @ 2018-04-19 20:15 ` Jerome Glisse 2018-04-19 20:25 ` Matthew Wilcox 2018-04-19 20:51 ` Al Viro 0 siblings, 2 replies; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 20:15 UTC (permalink / raw) To: Matthew Wilcox Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote: > On Thu, Apr 19, 2018 at 03:31:08PM -0400, Jerome Glisse wrote: > > > > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > > > > add void * *private_data; to struct fdtable (also a default array to > > > > struct files_struct). The callback would be part of struct file_operations. > > > > and only call if it exist (os overhead is only for device driver that > > > > care). > > > > > > > > Did i miss something fundamental ? copy_files() call dup_fd() so i > > > > should be all set here. > > > > > > > > I will work on patches i was hoping this would not be too much work. > > > > Well scratch that whole idea, i would need to add a new array to task > > struct which make it a lot less appealing. Hence a better solution is > > to instead have this as part of mm (well indirectly). > > It shouldn't be too bad to add a struct radix_tree to the fdtable. > > I'm sure we could just not support weird cases like sharing the fdtable > without sharing the mm. Does anyone actually do that? Well, like you pointed out, what I really want is a 1:1 structure linking a device struct and a mm_struct. Given that this needs to be cleaned up when the mm goes away, tying this to mmu_notifier sounds like a better idea. I am thinking of adding a hashtable to mmu_notifier_mm using the file id for the hash, as this should be a good hash value for common cases. I only expect a few drivers to need that (GPU drivers, RDMA). 
Today GPU drivers do have a hashtable inside their driver and they hash on the mm struct pointer; I believe hashing mmu_notifier_mm using the file id will be better. Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
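[Editor's illustration] The per-driver table hashed on the mm_struct pointer that GPU drivers keep today looks roughly like this userspace toy. Names and sizes are invented for illustration; real kernel code would use its own hashtable and `hash_ptr()` machinery:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64 /* power of two, so the hash can be masked */

/* One slot: an opaque mm pointer as key, a driver context as value. */
struct mm_ctx {
	void *mm;  /* key: the mm_struct pointer (opaque here) */
	void *ctx; /* value: the driver's per-mm context       */
};

static struct mm_ctx table[TABLE_SIZE];

/* Mix the pointer bits into a slot index; the multiplier assumes a
 * 64-bit build and is purely illustrative. */
static size_t hash_mm(void *p)
{
	uintptr_t v = (uintptr_t)p;

	v ^= v >> 17;
	v *= (uintptr_t)0x9e3779b97f4a7c15ULL;
	return (size_t)(v & (TABLE_SIZE - 1));
}

/* Insert (or update) with linear probing; returns -1 when full. */
static int ctx_store(void *mm, void *ctx)
{
	size_t i = hash_mm(mm);

	for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
		if (!table[i].mm || table[i].mm == mm) {
			table[i].mm = mm;
			table[i].ctx = ctx;
			return 0;
		}
	}
	return -1;
}

/* Look up the per-mm context; NULL when this mm was never registered. */
static void *ctx_lookup(void *mm)
{
	size_t i = hash_mm(mm);

	for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
		if (table[i].mm == mm)
			return table[i].ctx;
		if (!table[i].mm)
			break;
	}
	return NULL;
}
```

The point of the thread is that every driver currently re-implements something like this privately; hanging one shared structure off the mm would remove the duplication.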
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:15 ` Jerome Glisse @ 2018-04-19 20:25 ` Matthew Wilcox 2018-04-19 20:39 ` Al Viro 2018-04-19 20:51 ` Al Viro 1 sibling, 1 reply; 23+ messages in thread From: Matthew Wilcox @ 2018-04-19 20:25 UTC (permalink / raw) To: Jerome Glisse Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote: > On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote: > > > Well scratch that whole idea, i would need to add a new array to task > > > struct which make it a lot less appealing. Hence a better solution is > > > to instead have this as part of mm (well indirectly). > > > > It shouldn't be too bad to add a struct radix_tree to the fdtable. > > > > I'm sure we could just not support weird cases like sharing the fdtable > > without sharing the mm. Does anyone actually do that? > > Well like you pointed out what i really want is a 1:1 structure linking > a device struct an a mm_struct. Given that this need to be cleanup when > mm goes away hence tying this to mmu_notifier sounds like a better idea. > > I am thinking of adding a hashtable to mmu_notifier_mm using file id for > hash as this should be a good hash value for common cases. I only expect > few drivers to need that (GPU drivers, RDMA). Today GPU drivers do have > a hashtable inside their driver and they has on the mm struct pointer, > i believe hash mmu_notifier_mm using file id will be better. file descriptors are small positive integers ... ideal for the radix tree. If you need to find your data based on the struct file address, then by all means a hashtable is the better data structure. ^ permalink raw reply [flat|nested] 23+ messages in thread
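[Editor's illustration] Matthew's contrast can be made concrete with a toy two-level index: a small dense integer key like an fd resolves by direct indexing at each level, which is the property the kernel's radix tree exploits and which a pointer-sized key lacks. This sketch is not the kernel radix tree API:

```c
#include <assert.h>
#include <stdlib.h>

#define SHIFT 6
#define FANOUT (1u << SHIFT) /* 64 slots per node */

/* A fixed two-level tree covering fds 0 .. FANOUT*FANOUT-1; leaf
 * arrays are allocated lazily, so sparse fd spaces stay cheap. */
struct toy_radix {
	void **level1[FANOUT]; /* top index -> lazily allocated leaf */
};

static int radix_insert(struct toy_radix *r, unsigned int fd, void *val)
{
	unsigned int hi = fd >> SHIFT, lo = fd & (FANOUT - 1);

	if (hi >= FANOUT)
		return -1; /* out of range for this toy */
	if (!r->level1[hi]) {
		r->level1[hi] = calloc(FANOUT, sizeof(void *));
		if (!r->level1[hi])
			return -1;
	}
	r->level1[hi][lo] = val;
	return 0;
}

static void *radix_lookup(struct toy_radix *r, unsigned int fd)
{
	unsigned int hi = fd >> SHIFT, lo = fd & (FANOUT - 1);

	if (hi >= FANOUT || !r->level1[hi])
		return NULL;
	return r->level1[hi][lo];
}
```

Lookup is two array indexes and no hashing; the flip side, as Al points out next, is that fd numbers are unstable keys (dup2(), close-and-reuse), so the key choice matters more than the container.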
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:25 ` Matthew Wilcox @ 2018-04-19 20:39 ` Al Viro 2018-04-19 21:08 ` Jerome Glisse 0 siblings, 1 reply; 23+ messages in thread From: Al Viro @ 2018-04-19 20:39 UTC (permalink / raw) To: Matthew Wilcox Cc: Jerome Glisse, Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 01:25:13PM -0700, Matthew Wilcox wrote: > On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote: > > On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote: > > > > Well scratch that whole idea, i would need to add a new array to task > > > > struct which make it a lot less appealing. Hence a better solution is > > > > to instead have this as part of mm (well indirectly). > > > > > > It shouldn't be too bad to add a struct radix_tree to the fdtable. > > > > > > I'm sure we could just not support weird cases like sharing the fdtable > > > without sharing the mm. Does anyone actually do that? > > > > Well like you pointed out what i really want is a 1:1 structure linking > > a device struct an a mm_struct. Given that this need to be cleanup when > > mm goes away hence tying this to mmu_notifier sounds like a better idea. > > > > I am thinking of adding a hashtable to mmu_notifier_mm using file id for > > hash as this should be a good hash value for common cases. I only expect > > few drivers to need that (GPU drivers, RDMA). Today GPU drivers do have > > a hashtable inside their driver and they has on the mm struct pointer, > > i believe hash mmu_notifier_mm using file id will be better. > > file descriptors are small positive integers ... ... except when there's a lot of them. Or when something uses dup2() in interesting ways, but hey - we could "just not support" that, right? > ideal for the radix tree. > If you need to find your data based on the struct file address, then by > all means a hashtable is the better data structure. 
Perhaps it would be a good idea to describe whatever is being attempted? FWIW, passing around descriptors is almost always a bloody bad idea. There are very few things really associated with those and just about every time I'd seen internal APIs that work in terms of those "small positive numbers" they had been badly racy and required massive redesign to get something even remotely sane. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:39 ` Al Viro @ 2018-04-19 21:08 ` Jerome Glisse 0 siblings, 0 replies; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 21:08 UTC (permalink / raw) To: Al Viro Cc: Matthew Wilcox, Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 09:39:53PM +0100, Al Viro wrote: > On Thu, Apr 19, 2018 at 01:25:13PM -0700, Matthew Wilcox wrote: > > On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote: > > > On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote: > > > > > Well scratch that whole idea, i would need to add a new array to task > > > > > struct which make it a lot less appealing. Hence a better solution is > > > > > to instead have this as part of mm (well indirectly). > > > > > > > > It shouldn't be too bad to add a struct radix_tree to the fdtable. > > > > > > > > I'm sure we could just not support weird cases like sharing the fdtable > > > > without sharing the mm. Does anyone actually do that? > > > > > > Well like you pointed out what i really want is a 1:1 structure linking > > > a device struct an a mm_struct. Given that this need to be cleanup when > > > mm goes away hence tying this to mmu_notifier sounds like a better idea. > > > > > > I am thinking of adding a hashtable to mmu_notifier_mm using file id for > > > hash as this should be a good hash value for common cases. I only expect > > > few drivers to need that (GPU drivers, RDMA). Today GPU drivers do have > > > a hashtable inside their driver and they has on the mm struct pointer, > > > i believe hash mmu_notifier_mm using file id will be better. > > > > file descriptors are small positive integers ... > > ... except when there's a lot of them. Or when something uses dup2() in > interesting ways, but hey - we could "just not support" that, right? > > > ideal for the radix tree. 
> > If you need to find your data based on the struct file address, then by > > all means a hashtable is the better data structure. > > Perhaps it would be a good idea to describe whatever is being attempted? > > FWIW, passing around descriptors is almost always a bloody bad idea. There > are very few things really associated with those and just about every time > I'd seen internal APIs that work in terms of those "small positive numbers" > they had been badly racy and required massive redesign to get something even > remotely sane. OK, I will use the struct device pointer as index, or something else (I would like to use the PCI domain:bus:slot but I don't want this to be PCIe only), maybe dev_t ... Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:15 ` Jerome Glisse 2018-04-19 20:25 ` Matthew Wilcox @ 2018-04-19 20:51 ` Al Viro 1 sibling, 0 replies; 23+ messages in thread From: Al Viro @ 2018-04-19 20:51 UTC (permalink / raw) To: Jerome Glisse Cc: Matthew Wilcox, Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote: > Well like you pointed out what i really want is a 1:1 structure linking > a device struct an a mm_struct. Given that this need to be cleanup when > mm goes away hence tying this to mmu_notifier sounds like a better idea. > > I am thinking of adding a hashtable to mmu_notifier_mm using file id for > hash as this should be a good hash value for common cases. I only expect > few drivers to need that (GPU drivers, RDMA). Today GPU drivers do have > a hashtable inside their driver and they has on the mm struct pointer, > i believe hash mmu_notifier_mm using file id will be better. What _is_ "file id"? If you are talking about file descriptors, you can very well have several for the same opened file. Moreover, you can bloody well have it opened, then dup'ed, then original descriptor closed and reused by another open... ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 17:26 ` Jerome Glisse 2018-04-19 18:31 ` Jeff Layton @ 2018-04-19 20:33 ` Al Viro 2018-04-19 20:58 ` Jerome Glisse 1 sibling, 1 reply; 23+ messages in thread From: Al Viro @ 2018-04-19 20:33 UTC (permalink / raw) To: Jerome Glisse Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 01:26:10PM -0400, Jerome Glisse wrote: > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > add void * *private_data; to struct fdtable (also a default array to > struct files_struct). The callback would be part of struct file_operations. > and only call if it exist (os overhead is only for device driver that > care). Hell, *NO*. This is insane - you would need to maintain extra counts ("how many descriptors refer to this struct file... for this descriptor table"). Besides, _what_ private_data? What would own and maintain it? A specific driver? What if more than one of them wants that thing? > Did i miss something fundamental ? copy_files() call dup_fd() so i > should be all set here. That looks like an extremely misguided kludge for hell knows what purpose, almost certainly architecturally insane. What are you actually trying to achieve? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:33 ` Al Viro @ 2018-04-19 20:58 ` Jerome Glisse 2018-04-19 21:21 ` Al Viro 0 siblings, 1 reply; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 20:58 UTC (permalink / raw) To: Al Viro Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 09:33:07PM +0100, Al Viro wrote: > On Thu, Apr 19, 2018 at 01:26:10PM -0400, Jerome Glisse wrote: > > > Basicly i want a callback in __fd_install(), do_dup2(), dup_fd() and > > add void * *private_data; to struct fdtable (also a default array to > > struct files_struct). The callback would be part of struct file_operations. > > and only call if it exist (os overhead is only for device driver that > > care). > > Hell, *NO*. This is insane - you would need to maintain extra counts > ("how many descriptors refer to this struct file... for this descriptor > table"). > > Besides, _what_ private_data? What would own and maintain it? A specific > driver? What if more than one of them wants that thing? I didn't have anything complex in mind (ie a lifetime link to struct file and no refcounting changes). But anyway i gave up on that idea and will add what i need in mmu_notifier. > > > Did i miss something fundamental ? copy_files() call dup_fd() so i > > should be all set here. > > That looks like an extremely misguided kludge for hell knows what purpose, > almost certainly architecturally insane. What are you actually trying to > achieve? I need a struct to link part of device context with mm struct for a process. Most of device context is link to the struct file of the device file (ie process open has a file descriptor for the device file). Device driver for GPU have some part of their process context tied to the process mm (accessing process address space directly from the GPU). 
However we can not store this context information in the struct file private data because of clone (same struct file accross different mm). So today driver have an hashtable in their global device structure to lookup context information for a given mm. This is sub-optimal and duplicate a lot of code among different drivers. Hence why i want something generic that allow a device driver to store context structure that is specific to a mm. I thought that adding a new array on the side of struct file array would be a good idea but it has too many kludges. So i will do something inside mmu_notifier and there will be no tie to any fs aspect. I expect only a handful of driver to care about this and for a given platform you won't see that many devices hence you won't have that many pointer to deal with. Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
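[Editorial aside: the per-driver lookup Jerome describes - a global hash from an mm pointer to a per-mm driver context - can be modeled in plain C. This is a userspace sketch with illustrative names, not the kernel code; the real versions live inside each GPU driver (radeon_mn.c, amdgpu_mn.c, etc.).]

```c
#include <stdlib.h>

/* Toy model of a per-driver context table: a small chaining hash
 * keyed on an opaque pointer standing in for struct mm_struct *. */
#define MN_BUCKETS 64

struct mm_ctx {
    const void *mm;        /* stand-in for struct mm_struct * */
    struct mm_ctx *next;
    int data;              /* driver-private per-mm state */
};

static struct mm_ctx *mm_buckets[MN_BUCKETS];

static unsigned mm_hash(const void *mm)
{
    /* Drop low alignment bits before taking the bucket index. */
    return (unsigned)(((unsigned long)mm >> 4) % MN_BUCKETS);
}

/* Get-or-create: exactly one context per mm for this driver. */
struct mm_ctx *mm_ctx_get(const void *mm)
{
    struct mm_ctx **b = &mm_buckets[mm_hash(mm)];
    for (struct mm_ctx *i = *b; i; i = i->next)
        if (i->mm == mm)
            return i;
    struct mm_ctx *c = calloc(1, sizeof(*c));
    c->mm = mm;
    c->next = *b;
    *b = c;
    return c;
}
```

Every driver carrying a private copy of this get-or-create logic is the duplication Jerome wants to hoist into mmu_notifier.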
* Re: [LSF/MM] schedule suggestion 2018-04-19 20:58 ` Jerome Glisse @ 2018-04-19 21:21 ` Al Viro 2018-04-19 21:47 ` Jerome Glisse 0 siblings, 1 reply; 23+ messages in thread From: Al Viro @ 2018-04-19 21:21 UTC (permalink / raw) To: Jerome Glisse Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote: > I need a struct to link part of device context with mm struct for a > process. Most of device context is link to the struct file of the > device file (ie process open has a file descriptor for the device > file). Er... You do realize that fd = open(...) mmap(fd, ...) close(fd) is absolutely legitimate, right? IOW, existence/stability/etc. of a file descriptor is absolutely *not* guaranteed - in fact, there might be not a single file descriptor referring to a given openen and mmaped file. > Device driver for GPU have some part of their process context tied to > the process mm (accessing process address space directly from the GPU). > However we can not store this context information in the struct file > private data because of clone (same struct file accross different mm). > > So today driver have an hashtable in their global device structure to > lookup context information for a given mm. This is sub-optimal and > duplicate a lot of code among different drivers. Umm... Examples? > Hence why i want something generic that allow a device driver to store > context structure that is specific to a mm. I thought that adding a > new array on the side of struct file array would be a good idea but it > has too many kludges. > > So i will do something inside mmu_notifier and there will be no tie to > any fs aspect. I expect only a handful of driver to care about this and > for a given platform you won't see that many devices hence you won't > have that many pointer to deal with. Let's step back for a second - lookups by _what_? 
If you are associating somethin with a mapping, vm_area_struct would be a natural candidate for storing such data, wouldn't it? What do you have and what do you want to find? ^ permalink raw reply [flat|nested] 23+ messages in thread
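[Editorial aside: the open()/mmap()/close() sequence Al cites is easy to verify from userspace - the mapping, and therefore the struct file behind it, remains fully usable after the last descriptor is closed. A minimal sketch, illustrative names only:]

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Demonstrate that a mapping outlives the file descriptor:
 * after fclose() there is no fd for the file, yet the pages
 * are still readable and writable through the mapping. */
int mapping_outlives_fd(void)
{
    FILE *f = tmpfile();
    if (!f)
        return 0;
    int fd = fileno(f);
    if (ftruncate(fd, 4096) != 0)
        return 0;
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return 0;
    fclose(f);                        /* last descriptor is gone */
    memcpy(p, "still here", 11);      /* mapping still works fine */
    int ok = (strcmp(p, "still here") == 0);
    munmap(p, 4096);
    return ok;
}
```

This is why, as Al says, nothing tied to the descriptor table can track the lifetime of the underlying struct file.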
* Re: [LSF/MM] schedule suggestion 2018-04-19 21:21 ` Al Viro @ 2018-04-19 21:47 ` Jerome Glisse 2018-04-19 22:13 ` Al Viro 0 siblings, 1 reply; 23+ messages in thread From: Jerome Glisse @ 2018-04-19 21:47 UTC (permalink / raw) To: Al Viro Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 10:21:37PM +0100, Al Viro wrote: > On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote: > > > I need a struct to link part of device context with mm struct for a > > process. Most of device context is link to the struct file of the > > device file (ie process open has a file descriptor for the device > > file). > > Er... You do realize that > fd = open(...) > mmap(fd, ...) > close(fd) > is absolutely legitimate, right? IOW, existence/stability/etc. of > a file descriptor is absolutely *not* guaranteed - in fact, there might > be not a single file descriptor referring to a given openen and mmaped > file. Yes and that's fine, on close(fd) the device driver is tear down and struct i want to store is tear down too and free. > > > Device driver for GPU have some part of their process context tied to > > the process mm (accessing process address space directly from the GPU). > > However we can not store this context information in the struct file > > private data because of clone (same struct file accross different mm). > > > > So today driver have an hashtable in their global device structure to > > lookup context information for a given mm. This is sub-optimal and > > duplicate a lot of code among different drivers. > > Umm... Examples? 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_mn.c https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/i915/i915_gem_userptr.c https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c RDMA folks too have a similar construct. > > > Hence why i want something generic that allow a device driver to store > > context structure that is specific to a mm. I thought that adding a > > new array on the side of struct file array would be a good idea but it > > has too many kludges. > > > > So i will do something inside mmu_notifier and there will be no tie to > > any fs aspect. I expect only a handful of driver to care about this and > > for a given platform you won't see that many devices hence you won't > > have that many pointer to deal with. > > Let's step back for a second - lookups by _what_? If you are associating > somethin with a mapping, vm_area_struct would be a natural candidate for > storing such data, wouldn't it? > > What do you have and what do you want to find? So you are in an ioctl against the device file: you have the struct file, and the driver stores a pointer to some file context info in the struct file private data, which itself has a pointer to some global device driver structure, which itself has a pointer to struct device. Hence i have the struct mm (from current->mm) and the dev_t easily available. The context information is tied to the mm for the device and can only be used against said mm. Even if the struct file of the device outlives the original process, no one can use that struct with a process that does not have the same mm. Moreover, that struct is freed if the mm is destroyed. If a child shares the struct file but has a different mm and wants to use the same feature, then a new structure is created with the same property, ie it can only be used against this new mm. 
The link with the struct file is not explicit, but you can only use objects tied to that struct through ioctls against the struct file. Hope this clarifies the use case. Cheers, Jérôme ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 21:47 ` Jerome Glisse @ 2018-04-19 22:13 ` Al Viro 0 siblings, 0 replies; 23+ messages in thread From: Al Viro @ 2018-04-19 22:13 UTC (permalink / raw) To: Jerome Glisse Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On Thu, Apr 19, 2018 at 05:47:51PM -0400, Jerome Glisse wrote: > On Thu, Apr 19, 2018 at 10:21:37PM +0100, Al Viro wrote: > > On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote: > > > > > I need a struct to link part of device context with mm struct for a > > > process. Most of device context is link to the struct file of the > > > device file (ie process open has a file descriptor for the device > > > file). > > > > Er... You do realize that > > fd = open(...) > > mmap(fd, ...) > > close(fd) > > is absolutely legitimate, right? IOW, existence/stability/etc. of > > a file descriptor is absolutely *not* guaranteed - in fact, there might > > be not a single file descriptor referring to a given openen and mmaped > > file. > > Yes and that's fine, on close(fd) the device driver is tear down No, it is not. _NOTHING_ is done on that close(fd), other than removing a reference from descriptor table. In this case struct file is still open and remains such until munmap(). That's why descriptor table is a very bad place for sticking that kind of information. Besides, as soon as your syscall (ioctl, write, whatever) has looked struct file up, the mapping from descriptors to struct file can change. Literally before fdget() has returned to caller. Another thread could do dup() and close() of the original descriptor. Or just plain close(), for that matter - struct file will remain open until fdput(). > and > struct i want to store is tear down too and free. So do a hash table indexed by pair (void *, struct mm_struct *) and do lookups there... And use radeon_device as the first half of the key. 
Or struct file *, or pointer to whatever private data you maintain for an opened file... ^ permalink raw reply [flat|nested] 23+ messages in thread
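[Editorial aside: Al's final suggestion - index the context by the pair (device pointer, mm pointer) rather than keeping one table per driver - changes only the key. A self-contained userspace sketch with illustrative names:]

```c
#include <stdlib.h>

/* Sketch of a shared table keyed by (device, mm), per Al's
 * suggestion of using e.g. the radeon_device pointer as the
 * first half of the key. Illustrative only, not kernel API. */
#define PAIR_BUCKETS 64

struct pair_ctx {
    const void *dev;       /* e.g. the driver's device struct */
    const void *mm;        /* stand-in for struct mm_struct * */
    struct pair_ctx *next;
};

static struct pair_ctx *pair_tbl[PAIR_BUCKETS];

static unsigned pair_hash(const void *dev, const void *mm)
{
    unsigned long h = (unsigned long)dev ^ ((unsigned long)mm >> 4);
    return (unsigned)(h % PAIR_BUCKETS);
}

/* Get-or-create one context per (device, mm) combination. */
struct pair_ctx *pair_ctx_get(const void *dev, const void *mm)
{
    struct pair_ctx **b = &pair_tbl[pair_hash(dev, mm)];
    for (struct pair_ctx *i = *b; i; i = i->next)
        if (i->dev == dev && i->mm == mm)
            return i;
    struct pair_ctx *c = calloc(1, sizeof(*c));
    c->dev = dev;
    c->mm = mm;
    c->next = *b;
    *b = c;
    return c;
}
```

With a composite key, one shared table serves every driver, and teardown on mm exit naturally frees all contexts whose second key half matches the dying mm.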
* Re: [LSF/MM] schedule suggestion 2018-04-19 1:55 ` Dave Chinner 2018-04-19 14:38 ` Jerome Glisse @ 2018-04-19 14:51 ` Chris Mason 2018-04-19 15:07 ` Martin K. Petersen 1 sibling, 1 reply; 23+ messages in thread From: Chris Mason @ 2018-04-19 14:51 UTC (permalink / raw) To: Dave Chinner Cc: Jerome Glisse, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko On 18 Apr 2018, at 21:55, Dave Chinner wrote: > On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote: >> Just wanted to suggest to push HMM status down one slot in the >> agenda to avoid having FS and MM first going into their own >> room and then merging back for GUP and DAX, and re-splitting >> after. More over HMM and NUMA talks will be good to have back >> to back as they deal with same kind of thing mostly. > > So while we are talking about schedule suggestions, we see that > there's lots of empty slots in the FS track. We (xfs guys) were just > chatting on #xfs about whether we'd have time to have a "XFS devel > meeting" at some point during LSF/MM as we are rarely in the same > place at the same time. > > I'd like to propose that we compact the fs sessions so that we get a > 3-slot session reserved for "Individual filesystem discussions" one > afternoon. That way we've got time in the schedule for the all the > ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and > talk about things of interest only to their own fileystems. > > That means we all don't have to find time outside the schedule to do > this, and think this wold be time very well spent for most fs people > at the conf.... I'd love this as well. -chris ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [LSF/MM] schedule suggestion 2018-04-19 14:51 ` Chris Mason @ 2018-04-19 15:07 ` Martin K. Petersen 0 siblings, 0 replies; 23+ messages in thread From: Martin K. Petersen @ 2018-04-19 15:07 UTC (permalink / raw) To: Chris Mason Cc: Dave Chinner, Jerome Glisse, linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko Chris, >> I'd like to propose that we compact the fs sessions so that we get a >> 3-slot session reserved for "Individual filesystem discussions" one >> afternoon. That way we've got time in the schedule for the all the >> ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and >> talk about things of interest only to their own fileystems. >> >> That means we all don't have to find time outside the schedule to do >> this, and think this wold be time very well spent for most fs people >> at the conf.... > > I'd love this as well. Based on feedback last year we explicitly added a third day to LSF/MM to facilitate hack time and project meetings. As usual the schedule is fluid and will be adjusted on the fly. Depending on track, I am hoping we'll be done with the scheduled topics either at the end of Tuesday or Wednesday morning. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2018-04-19 22:13 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-18 21:19 [LSF/MM] schedule suggestion Jerome Glisse 2018-04-19 0:48 ` Dave Chinner 2018-04-19 1:55 ` Dave Chinner 2018-04-19 14:38 ` Jerome Glisse 2018-04-19 14:43 ` Matthew Wilcox 2018-04-19 16:30 ` Jerome Glisse 2018-04-19 16:58 ` Jeff Layton 2018-04-19 17:26 ` Jerome Glisse 2018-04-19 18:31 ` Jeff Layton 2018-04-19 19:31 ` Jerome Glisse 2018-04-19 19:56 ` Matthew Wilcox 2018-04-19 20:15 ` Jerome Glisse 2018-04-19 20:25 ` Matthew Wilcox 2018-04-19 20:39 ` Al Viro 2018-04-19 21:08 ` Jerome Glisse 2018-04-19 20:51 ` Al Viro 2018-04-19 20:33 ` Al Viro 2018-04-19 20:58 ` Jerome Glisse 2018-04-19 21:21 ` Al Viro 2018-04-19 21:47 ` Jerome Glisse 2018-04-19 22:13 ` Al Viro 2018-04-19 14:51 ` Chris Mason 2018-04-19 15:07 ` Martin K. Petersen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).