linux-fsdevel.vger.kernel.org archive mirror
* [LSF/MM] schedule suggestion
@ 2018-04-18 21:19 Jerome Glisse
  2018-04-19  0:48 ` Dave Chinner
  2018-04-19  1:55 ` Dave Chinner
  0 siblings, 2 replies; 23+ messages in thread
From: Jerome Glisse @ 2018-04-18 21:19 UTC (permalink / raw)
  To: linux-mm
  Cc: lsf-pc, linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

Just wanted to suggest pushing the HMM status update down one slot
in the agenda, to avoid having FS and MM first going into their own
room, then merging back for GUP and DAX, and re-splitting after.
Moreover, the HMM and NUMA talks would be good to have back to back,
as they mostly deal with the same kind of thing.

So on Monday afternoon, GUP in the first slot would be nice :)

Just a suggestion

https://docs.google.com/spreadsheets/d/15XFz_Zsklmle--L9CO4-ygCqmSHwBsFPjvjDpiL5qwM/edit#gid=0

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-18 21:19 [LSF/MM] schedule suggestion Jerome Glisse
@ 2018-04-19  0:48 ` Dave Chinner
  2018-04-19  1:55 ` Dave Chinner
  1 sibling, 0 replies; 23+ messages in thread
From: Dave Chinner @ 2018-04-19  0:48 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner,
	Michal Hocko

On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote:
> Just wanted to suggest pushing the HMM status update down one slot
> in the agenda, to avoid having FS and MM first going into their own
> room, then merging back for GUP and DAX, and re-splitting after.
> Moreover, the HMM and NUMA talks would be good to have back to back,
> as they mostly deal with the same kind of thing.

Yeah, I noticed there are a few of these shared sessions that have
been placed in the middle slot of a session. It would be good to put
all the shared fs/mm sessions into a complete session block so
people don't have to keep moving rooms every half hour...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [LSF/MM] schedule suggestion
  2018-04-18 21:19 [LSF/MM] schedule suggestion Jerome Glisse
  2018-04-19  0:48 ` Dave Chinner
@ 2018-04-19  1:55 ` Dave Chinner
  2018-04-19 14:38   ` Jerome Glisse
  2018-04-19 14:51   ` Chris Mason
  1 sibling, 2 replies; 23+ messages in thread
From: Dave Chinner @ 2018-04-19  1:55 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner,
	Michal Hocko

On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote:
> Just wanted to suggest pushing the HMM status update down one slot
> in the agenda, to avoid having FS and MM first going into their own
> room, then merging back for GUP and DAX, and re-splitting after.
> Moreover, the HMM and NUMA talks would be good to have back to back,
> as they mostly deal with the same kind of thing.

So while we are talking about schedule suggestions, we see that
there's lots of empty slots in the FS track. We (xfs guys) were just
chatting on #xfs about whether we'd have time to have a "XFS devel
meeting" at some point during LSF/MM as we are rarely in the same
place at the same time.

I'd like to propose that we compact the fs sessions so that we get a
3-slot session reserved for "Individual filesystem discussions" one
afternoon. That way we've got time in the schedule for all the
ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and
talk about things of interest only to their own filesystems.

That means we don't all have to find time outside the schedule to do
this, and I think this would be time very well spent for most fs people
at the conf....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [LSF/MM] schedule suggestion
  2018-04-19  1:55 ` Dave Chinner
@ 2018-04-19 14:38   ` Jerome Glisse
  2018-04-19 14:43     ` Matthew Wilcox
  2018-04-19 14:51   ` Chris Mason
  1 sibling, 1 reply; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 14:38 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-mm, lsf-pc, linux-fsdevel, linux-block, Johannes Weiner,
	Michal Hocko

On Thu, Apr 19, 2018 at 11:55:08AM +1000, Dave Chinner wrote:
> On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote:
> > Just wanted to suggest pushing the HMM status update down one slot
> > in the agenda, to avoid having FS and MM first going into their own
> > room, then merging back for GUP and DAX, and re-splitting after.
> > Moreover, the HMM and NUMA talks would be good to have back to back,
> > as they mostly deal with the same kind of thing.
> 
> So while we are talking about schedule suggestions, we see that
> there's lots of empty slots in the FS track. We (xfs guys) were just
> chatting on #xfs about whether we'd have time to have a "XFS devel
> meeting" at some point during LSF/MM as we are rarely in the same
> place at the same time.
> 
> I'd like to propose that we compact the fs sessions so that we get a
> 3-slot session reserved for "Individual filesystem discussions" one
> afternoon. That way we've got time in the schedule for all the
> ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and
> talk about things of interest only to their own filesystems.
> 
> That means we don't all have to find time outside the schedule to do
> this, and I think this would be time very well spent for most fs
> people at the conf....

Oh, can I get one more small slot for fs? I want to ask if there are
any people against having a callback every time a struct file is added
to a task_struct, and also having a secondary array so that special
files like device files can store something opaque per task_struct per
struct file.

I will try to stitch a patchset together tomorrow for that. A lot of
device drivers would like to have this.

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 14:38   ` Jerome Glisse
@ 2018-04-19 14:43     ` Matthew Wilcox
  2018-04-19 16:30       ` Jerome Glisse
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2018-04-19 14:43 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block,
	Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> Oh, can I get one more small slot for fs? I want to ask if there are
> any people against having a callback every time a struct file is added
> to a task_struct, and also having a secondary array so that special
> files like device files can store something opaque per task_struct per
> struct file.

Do you really want something per _thread_, and not per _mm_?


* Re: [LSF/MM] schedule suggestion
  2018-04-19  1:55 ` Dave Chinner
  2018-04-19 14:38   ` Jerome Glisse
@ 2018-04-19 14:51   ` Chris Mason
  2018-04-19 15:07     ` Martin K. Petersen
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Mason @ 2018-04-19 14:51 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Jerome Glisse, linux-mm, lsf-pc, linux-fsdevel, linux-block,
	Johannes Weiner, Michal Hocko

On 18 Apr 2018, at 21:55, Dave Chinner wrote:

> On Wed, Apr 18, 2018 at 05:19:39PM -0400, Jerome Glisse wrote:
>> Just wanted to suggest pushing the HMM status update down one slot
>> in the agenda, to avoid having FS and MM first going into their own
>> room, then merging back for GUP and DAX, and re-splitting after.
>> Moreover, the HMM and NUMA talks would be good to have back to back,
>> as they mostly deal with the same kind of thing.
>
> So while we are talking about schedule suggestions, we see that
> there's lots of empty slots in the FS track. We (xfs guys) were just
> chatting on #xfs about whether we'd have time to have a "XFS devel
> meeting" at some point during LSF/MM as we are rarely in the same
> place at the same time.
>
> I'd like to propose that we compact the fs sessions so that we get a
> 3-slot session reserved for "Individual filesystem discussions" one
> afternoon. That way we've got time in the schedule for all the
> ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and
> talk about things of interest only to their own filesystems.
>
> That means we don't all have to find time outside the schedule to do
> this, and I think this would be time very well spent for most fs
> people at the conf....

I'd love this as well.

-chris


* Re: [LSF/MM] schedule suggestion
  2018-04-19 14:51   ` Chris Mason
@ 2018-04-19 15:07     ` Martin K. Petersen
  0 siblings, 0 replies; 23+ messages in thread
From: Martin K. Petersen @ 2018-04-19 15:07 UTC (permalink / raw)
  To: Chris Mason
  Cc: Dave Chinner, Jerome Glisse, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko


Chris,

>> I'd like to propose that we compact the fs sessions so that we get a
>> 3-slot session reserved for "Individual filesystem discussions" one
>> afternoon. That way we've got time in the schedule for all the
>> ext4/btrfs/XFS/NFS/CIFS devs to get together with each other and
>> talk about things of interest only to their own filesystems.
>>
>> That means we don't all have to find time outside the schedule to do
>> this, and I think this would be time very well spent for most fs
>> people at the conf....
>
> I'd love this as well.

Based on feedback last year we explicitly added a third day to LSF/MM to
facilitate hack time and project meetings.

As usual, the schedule is fluid and will be adjusted on the fly.
Depending on the track, I am hoping we'll be done with the scheduled
topics either at the end of Tuesday or on Wednesday morning.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [LSF/MM] schedule suggestion
  2018-04-19 14:43     ` Matthew Wilcox
@ 2018-04-19 16:30       ` Jerome Glisse
  2018-04-19 16:58         ` Jeff Layton
  0 siblings, 1 reply; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 16:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block,
	Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote:
> On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> > Oh, can I get one more small slot for fs? I want to ask if there are
> > any people against having a callback every time a struct file is
> > added to a task_struct, and also having a secondary array so that
> > special files like device files can store something opaque per
> > task_struct per struct file.
> 
> Do you really want something per _thread_, and not per _mm_?

Well, per mm would be fine, but I do not see how to make that happen
with a reasonable structure. The issue is that you can have multiple
tasks with the same mm but different file descriptors (or am I wrong
here?), thus there would be no easy way, given a struct file, to look
up the per-mm struct.

So as an imperfect solution I see a new array in the fdtable which
would allow device drivers to store a pointer to their per-mm data
structure. To be fair, usually you will only have a single fd in a
single task for a given device.

If you see an easy way to get a per-mm per-inode pointer stored
somewhere with easy lookup, I am all ears :)

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 16:30       ` Jerome Glisse
@ 2018-04-19 16:58         ` Jeff Layton
  2018-04-19 17:26           ` Jerome Glisse
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Layton @ 2018-04-19 16:58 UTC (permalink / raw)
  To: Jerome Glisse, Matthew Wilcox
  Cc: Dave Chinner, linux-mm, lsf-pc, linux-fsdevel, linux-block,
	Johannes Weiner, Michal Hocko

On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote:
> On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote:
> > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> > > Oh, can I get one more small slot for fs? I want to ask if there
> > > are any people against having a callback every time a struct file
> > > is added to a task_struct, and also having a secondary array so
> > > that special files like device files can store something opaque
> > > per task_struct per struct file.
> > 
> > Do you really want something per _thread_, and not per _mm_?
> 
> Well, per mm would be fine, but I do not see how to make that happen
> with a reasonable structure. The issue is that you can have multiple
> tasks with the same mm but different file descriptors (or am I wrong
> here?), thus there would be no easy way, given a struct file, to look
> up the per-mm struct.
> 
> So as an imperfect solution I see a new array in the fdtable which
> would allow device drivers to store a pointer to their per-mm data
> structure. To be fair, usually you will only have a single fd in a
> single task for a given device.
> 
> If you see an easy way to get a per-mm per-inode pointer stored
> somewhere with easy lookup, I am all ears :)
> 

I may be misunderstanding, but to be clear: struct files don't get
added to a thread, per se.

When userland calls open() or similar, the struct file gets added to
the files_struct. Those are generally shared with other threads within
the same process. The files_struct can also be shared with other
processes if you clone() with the right flags.
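
For instance (a userspace sketch): an fd opened by one thread is
immediately usable from every other thread in the process, precisely
because they all share one files_struct:

	/* build with: cc -pthread demo.c */
	#include <fcntl.h>
	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static int shared_fd = -1;

	static void *opener(void *arg)
	{
		(void)arg;
		shared_fd = open("/dev/null", O_RDONLY); /* opened by thread A */
		return NULL;
	}

	int main(void)
	{
		pthread_t t;
		char c;

		pthread_create(&t, NULL, opener, NULL);
		pthread_join(t, NULL);
		/* main thread can read() it: one files_struct per process */
		printf("read() -> %zd\n", read(shared_fd, &c, 1));
		close(shared_fd);
		return 0;
	}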

Doing something per-thread on every open may be rather difficult to do.
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [LSF/MM] schedule suggestion
  2018-04-19 16:58         ` Jeff Layton
@ 2018-04-19 17:26           ` Jerome Glisse
  2018-04-19 18:31             ` Jeff Layton
  2018-04-19 20:33             ` Al Viro
  0 siblings, 2 replies; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 17:26 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote:
> On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote:
> > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote:
> > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> > > > Oh, can I get one more small slot for fs? I want to ask if there
> > > > are any people against having a callback every time a struct
> > > > file is added to a task_struct, and also having a secondary
> > > > array so that special files like device files can store
> > > > something opaque per task_struct per struct file.
> > > 
> > > Do you really want something per _thread_, and not per _mm_?
> > 
> > Well, per mm would be fine, but I do not see how to make that happen
> > with a reasonable structure. The issue is that you can have multiple
> > tasks with the same mm but different file descriptors (or am I wrong
> > here?), thus there would be no easy way, given a struct file, to
> > look up the per-mm struct.
> > 
> > So as an imperfect solution I see a new array in the fdtable which
> > would allow device drivers to store a pointer to their per-mm data
> > structure. To be fair, usually you will only have a single fd in a
> > single task for a given device.
> > 
> > If you see an easy way to get a per-mm per-inode pointer stored
> > somewhere with easy lookup, I am all ears :)
> > 
> 
> I may be misunderstanding, but to be clear: struct files don't get
> added to a thread, per se.
> 
> When userland calls open() or similar, the struct file gets added to
> the files_struct. Those are generally shared with other threads within
> the same process. The files_struct can also be shared with other
> processes if you clone() with the right flags.
> 
> Doing something per-thread on every open may be rather difficult to do.

Basically I want a callback in __fd_install(), do_dup2(), dup_fd(), and
to add void **private_data; to struct fdtable (also a default array in
struct files_struct). The callback would be part of struct
file_operations, and only called if it exists (so the overhead only
hits device drivers that care).
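
Roughly, a sketch of what I have in mind (all names hypothetical, this
is not an existing API):

	/* new, optional method on struct file_operations */
	void (*fd_install)(struct file *file, struct files_struct *files,
			   unsigned int fd);

	/* parallel per-fd slot array, next to fdt->fd */
	struct fdtable {
		/* ... existing fields ... */
		void **private_data;	/* NULL for files that do not care */
	};

	/* in __fd_install() and friends, after the fd slot is published */
	if (file->f_op->fd_install)
		file->f_op->fd_install(file, files, fd);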

Did I miss something fundamental? copy_files() calls dup_fd(), so I
should be all set here.

I will work on patches; I was hoping this would not be too much work.

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 17:26           ` Jerome Glisse
@ 2018-04-19 18:31             ` Jeff Layton
  2018-04-19 19:31               ` Jerome Glisse
  2018-04-19 20:33             ` Al Viro
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff Layton @ 2018-04-19 18:31 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, 2018-04-19 at 13:26 -0400, Jerome Glisse wrote:
> On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote:
> > On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote:
> > > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote:
> > > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> > > > > Oh, can I get one more small slot for fs? I want to ask if
> > > > > there are any people against having a callback every time a
> > > > > struct file is added to a task_struct, and also having a
> > > > > secondary array so that special files like device files can
> > > > > store something opaque per task_struct per struct file.
> > > > 
> > > > Do you really want something per _thread_, and not per _mm_?
> > > 
> > > Well, per mm would be fine, but I do not see how to make that
> > > happen with a reasonable structure. The issue is that you can have
> > > multiple tasks with the same mm but different file descriptors (or
> > > am I wrong here?), thus there would be no easy way, given a struct
> > > file, to look up the per-mm struct.
> > > 
> > > So as an imperfect solution I see a new array in the fdtable which
> > > would allow device drivers to store a pointer to their per-mm data
> > > structure. To be fair, usually you will only have a single fd in a
> > > single task for a given device.
> > > 
> > > If you see an easy way to get a per-mm per-inode pointer stored
> > > somewhere with easy lookup, I am all ears :)
> > > 
> > 
> > I may be misunderstanding, but to be clear: struct files don't get
> > added to a thread, per se.
> > 
> > When userland calls open() or similar, the struct file gets added to
> > the files_struct. Those are generally shared with other threads within
> > the same process. The files_struct can also be shared with other
> > processes if you clone() with the right flags.
> > 
> > Doing something per-thread on every open may be rather difficult to do.
> 
> Basically I want a callback in __fd_install(), do_dup2(), dup_fd(),
> and to add void **private_data; to struct fdtable (also a default
> array in struct files_struct). The callback would be part of struct
> file_operations, and only called if it exists (so the overhead only
> hits device drivers that care).
> 
> Did I miss something fundamental? copy_files() calls dup_fd(), so I
> should be all set here.
> 
> I will work on patches; I was hoping this would not be too much work.
> 

No, I think I misunderstood. I was thinking you wanted to iterate over
all of the threads that might be associated with a struct file, and
that's rather non-trivial.

A callback when you add a file to the files_struct seems like it would
probably be OK (in principle).
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [LSF/MM] schedule suggestion
  2018-04-19 18:31             ` Jeff Layton
@ 2018-04-19 19:31               ` Jerome Glisse
  2018-04-19 19:56                 ` Matthew Wilcox
  0 siblings, 1 reply; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 19:31 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 02:31:07PM -0400, Jeff Layton wrote:
> On Thu, 2018-04-19 at 13:26 -0400, Jerome Glisse wrote:
> > On Thu, Apr 19, 2018 at 12:58:39PM -0400, Jeff Layton wrote:
> > > On Thu, 2018-04-19 at 12:30 -0400, Jerome Glisse wrote:
> > > > On Thu, Apr 19, 2018 at 07:43:56AM -0700, Matthew Wilcox wrote:
> > > > > On Thu, Apr 19, 2018 at 10:38:25AM -0400, Jerome Glisse wrote:
> > > > > > Oh, can I get one more small slot for fs? I want to ask if
> > > > > > there are any people against having a callback every time a
> > > > > > struct file is added to a task_struct, and also having a
> > > > > > secondary array so that special files like device files can
> > > > > > store something opaque per task_struct per struct file.
> > > > > 
> > > > > Do you really want something per _thread_, and not per _mm_?
> > > > 
> > > > Well, per mm would be fine, but I do not see how to make that
> > > > happen with a reasonable structure. The issue is that you can
> > > > have multiple tasks with the same mm but different file
> > > > descriptors (or am I wrong here?), thus there would be no easy
> > > > way, given a struct file, to look up the per-mm struct.
> > > > 
> > > > So as an imperfect solution I see a new array in the fdtable
> > > > which would allow device drivers to store a pointer to their
> > > > per-mm data structure. To be fair, usually you will only have a
> > > > single fd in a single task for a given device.
> > > > 
> > > > If you see an easy way to get a per-mm per-inode pointer stored
> > > > somewhere with easy lookup, I am all ears :)
> > > > 
> > > 
> > > I may be misunderstanding, but to be clear: struct files don't get
> > > added to a thread, per se.
> > > 
> > > When userland calls open() or similar, the struct file gets added to
> > > the files_struct. Those are generally shared with other threads within
> > > the same process. The files_struct can also be shared with other
> > > processes if you clone() with the right flags.
> > > 
> > > Doing something per-thread on every open may be rather difficult to do.
> > 
> > Basically I want a callback in __fd_install(), do_dup2(), dup_fd(),
> > and to add void **private_data; to struct fdtable (also a default
> > array in struct files_struct). The callback would be part of struct
> > file_operations, and only called if it exists (so the overhead only
> > hits device drivers that care).
> > 
> > Did I miss something fundamental? copy_files() calls dup_fd(), so I
> > should be all set here.
> > 
> > I will work on patches; I was hoping this would not be too much work.
> > 
> 
> No, I think I misunderstood. I was thinking you wanted to iterate over
> all of the threads that might be associated with a struct file, and
> that's rather non-trivial.
> 
> A callback when you add a file to the files_struct seems like it would
> probably be OK (in principle).

Well, scratch that whole idea: I would need to add a new array to
task_struct, which makes it a lot less appealing. Hence a better
solution is to instead have this as part of the mm (well, indirectly).

Thanks folks for chiming in. I will discuss this in my mmu_notifier
session.

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 19:31               ` Jerome Glisse
@ 2018-04-19 19:56                 ` Matthew Wilcox
  2018-04-19 20:15                   ` Jerome Glisse
  0 siblings, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2018-04-19 19:56 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 03:31:08PM -0400, Jerome Glisse wrote:
> > > Basically I want a callback in __fd_install(), do_dup2(),
> > > dup_fd(), and to add void **private_data; to struct fdtable (also
> > > a default array in struct files_struct). The callback would be
> > > part of struct file_operations, and only called if it exists (so
> > > the overhead only hits device drivers that care).
> > > 
> > > Did I miss something fundamental? copy_files() calls dup_fd(), so
> > > I should be all set here.
> > > 
> > > I will work on patches; I was hoping this would not be too much work.
> 
> Well, scratch that whole idea: I would need to add a new array to
> task_struct, which makes it a lot less appealing. Hence a better
> solution is to instead have this as part of the mm (well, indirectly).

It shouldn't be too bad to add a struct radix_tree to the fdtable.

I'm sure we could just not support weird cases like sharing the fdtable
without sharing the mm.  Does anyone actually do that?


* Re: [LSF/MM] schedule suggestion
  2018-04-19 19:56                 ` Matthew Wilcox
@ 2018-04-19 20:15                   ` Jerome Glisse
  2018-04-19 20:25                     ` Matthew Wilcox
  2018-04-19 20:51                     ` Al Viro
  0 siblings, 2 replies; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 20:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote:
> On Thu, Apr 19, 2018 at 03:31:08PM -0400, Jerome Glisse wrote:
> > > > Basically I want a callback in __fd_install(), do_dup2(),
> > > > dup_fd(), and to add void **private_data; to struct fdtable
> > > > (also a default array in struct files_struct). The callback
> > > > would be part of struct file_operations, and only called if it
> > > > exists (so the overhead only hits device drivers that care).
> > > > 
> > > > Did I miss something fundamental? copy_files() calls dup_fd(),
> > > > so I should be all set here.
> > > > 
> > > > I will work on patches; I was hoping this would not be too much work.
> > 
> > Well, scratch that whole idea: I would need to add a new array to
> > task_struct, which makes it a lot less appealing. Hence a better
> > solution is to instead have this as part of the mm (well,
> > indirectly).
> 
> It shouldn't be too bad to add a struct radix_tree to the fdtable.
> 
> I'm sure we could just not support weird cases like sharing the fdtable
> without sharing the mm.  Does anyone actually do that?

Well, like you pointed out, what I really want is a 1:1 structure
linking a device struct and a mm_struct. Given that this needs to be
cleaned up when the mm goes away, tying this to mmu_notifier sounds
like a better idea.

I am thinking of adding a hashtable to mmu_notifier_mm, using the file
id for the hash as this should be a good hash value for common cases.
I only expect a few drivers to need that (GPU drivers, RDMA). Today GPU
drivers have a hashtable inside their driver and they hash on the
mm_struct pointer; I believe hashing mmu_notifier_mm entries on the
file id will be better.

Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:15                   ` Jerome Glisse
@ 2018-04-19 20:25                     ` Matthew Wilcox
  2018-04-19 20:39                       ` Al Viro
  2018-04-19 20:51                     ` Al Viro
  1 sibling, 1 reply; 23+ messages in thread
From: Matthew Wilcox @ 2018-04-19 20:25 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jeff Layton, Dave Chinner, linux-mm, lsf-pc, linux-fsdevel,
	linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote:
> On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote:
> > > Well, scratch that whole idea: I would need to add a new array to
> > > task_struct, which makes it a lot less appealing. Hence a better
> > > solution is to instead have this as part of the mm (well,
> > > indirectly).
> > 
> > It shouldn't be too bad to add a struct radix_tree to the fdtable.
> > 
> > I'm sure we could just not support weird cases like sharing the fdtable
> > without sharing the mm.  Does anyone actually do that?
> 
> Well, like you pointed out, what I really want is a 1:1 structure
> linking a device struct and a mm_struct. Given that this needs to be
> cleaned up when the mm goes away, tying this to mmu_notifier sounds
> like a better idea.
> 
> I am thinking of adding a hashtable to mmu_notifier_mm, using the file
> id for the hash as this should be a good hash value for common cases.
> I only expect a few drivers to need that (GPU drivers, RDMA). Today
> GPU drivers have a hashtable inside their driver and they hash on the
> mm_struct pointer; I believe hashing mmu_notifier_mm entries on the
> file id will be better.

file descriptors are small positive integers ... ideal for the radix tree.
If you need to find your data based on the struct file address, then by
all means a hashtable is the better data structure.
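
Something like this, say (sketch only; fd_side_data and its placement
are hypothetical):

	#include <linux/radix-tree.h>

	/* would hang off the descriptor table: one tree per table */
	struct fd_side_data {
		struct radix_tree_root root;	/* fd -> driver pointer */
	};

	static int fd_side_store(struct fd_side_data *s, unsigned int fd,
				 void *data)
	{
		return radix_tree_insert(&s->root, fd, data);
	}

	static void *fd_side_lookup(struct fd_side_data *s, unsigned int fd)
	{
		return radix_tree_lookup(&s->root, fd);
	}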


* Re: [LSF/MM] schedule suggestion
  2018-04-19 17:26           ` Jerome Glisse
  2018-04-19 18:31             ` Jeff Layton
@ 2018-04-19 20:33             ` Al Viro
  2018-04-19 20:58               ` Jerome Glisse
  1 sibling, 1 reply; 23+ messages in thread
From: Al Viro @ 2018-04-19 20:33 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 01:26:10PM -0400, Jerome Glisse wrote:

> Basically I want a callback in __fd_install(), do_dup2(), dup_fd(),
> and to add void **private_data; to struct fdtable (also a default
> array in struct files_struct). The callback would be part of struct
> file_operations, and only called if it exists (so the overhead only
> hits device drivers that care).

Hell, *NO*.  This is insane - you would need to maintain extra counts
("how many descriptors refer to this struct file... for this descriptor
table").

Besides, _what_ private_data?  What would own and maintain it?  A specific
driver?  What if more than one of them wants that thing?
 
> Did I miss something fundamental? copy_files() calls dup_fd(), so I
> should be all set here.

That looks like an extremely misguided kludge for hell knows what purpose,
almost certainly architecturally insane.  What are you actually trying to
achieve?


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:25                     ` Matthew Wilcox
@ 2018-04-19 20:39                       ` Al Viro
  2018-04-19 21:08                         ` Jerome Glisse
  0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2018-04-19 20:39 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jerome Glisse, Jeff Layton, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 01:25:13PM -0700, Matthew Wilcox wrote:
> On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote:
> > On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote:
> > > > Well, scratch that whole idea: I would need to add a new array
> > > > to task_struct, which makes it a lot less appealing. Hence a
> > > > better solution is to instead have this as part of the mm (well,
> > > > indirectly).
> > > 
> > > It shouldn't be too bad to add a struct radix_tree to the fdtable.
> > > 
> > > I'm sure we could just not support weird cases like sharing the fdtable
> > > without sharing the mm.  Does anyone actually do that?
> > 
> > Well, like you pointed out, what I really want is a 1:1 structure
> > linking a device struct and a mm_struct. Given that this needs to be
> > cleaned up when the mm goes away, tying this to mmu_notifier sounds
> > like a better idea.
> > 
> > I am thinking of adding a hashtable to mmu_notifier_mm, using the
> > file id for the hash as this should be a good hash value for common
> > cases. I only expect a few drivers to need that (GPU drivers, RDMA).
> > Today GPU drivers have a hashtable inside their driver and they hash
> > on the mm_struct pointer; I believe hashing mmu_notifier_mm entries
> > on the file id will be better.
> 
> file descriptors are small positive integers ...

... except when there's a lot of them.  Or when something uses dup2() in
interesting ways, but hey - we could "just not support" that, right?

> ideal for the radix tree.
> If you need to find your data based on the struct file address, then by
> all means a hashtable is the better data structure.

Perhaps it would be a good idea to describe whatever is being attempted?

FWIW, passing around descriptors is almost always a bloody bad idea.  There
are very few things really associated with those and just about every time
I'd seen internal APIs that work in terms of those "small positive numbers"
they had been badly racy and required massive redesign to get something even
remotely sane.


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:15                   ` Jerome Glisse
  2018-04-19 20:25                     ` Matthew Wilcox
@ 2018-04-19 20:51                     ` Al Viro
  1 sibling, 0 replies; 23+ messages in thread
From: Al Viro @ 2018-04-19 20:51 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Matthew Wilcox, Jeff Layton, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote:

> Well, like you pointed out, what I really want is a 1:1 structure
> linking a device struct and a mm_struct. Given that this needs to be
> cleaned up when the mm goes away, tying this to mmu_notifier sounds
> like a better idea.
> 
> I am thinking of adding a hashtable to mmu_notifier_mm, using the file
> id for the hash as this should be a good hash value for common cases.
> I only expect a few drivers to need that (GPU drivers, RDMA). Today
> GPU drivers have a hashtable inside their driver and they hash on the
> mm_struct pointer; I believe hashing mmu_notifier_mm entries on the
> file id will be better.

What _is_ "file id"?  If you are talking about file descriptors, you can
very well have several for the same opened file.  Moreover, you can
bloody well have it opened, then dup'ed, then the original descriptor closed
and reused by another open...
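
A trivial userspace sketch of why the number is not a stable identity:

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		int a = open("/dev/null", O_RDONLY);  /* say a == 3 */
		int b = dup(a);              /* b == 4, same struct file */
		close(a);                    /* 3 is free again... */
		int c = open("/dev/zero", O_RDONLY);
		/* ...and c likely comes back as 3: same small number,
		   entirely different file */
		close(b);
		close(c);
		return 0;
	}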


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:33             ` Al Viro
@ 2018-04-19 20:58               ` Jerome Glisse
  2018-04-19 21:21                 ` Al Viro
  0 siblings, 1 reply; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 20:58 UTC (permalink / raw)
  To: Al Viro
  Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 09:33:07PM +0100, Al Viro wrote:
> On Thu, Apr 19, 2018 at 01:26:10PM -0400, Jerome Glisse wrote:
> 
> > Basically I want a callback in __fd_install(), do_dup2(), dup_fd(),
> > and to add void **private_data; to struct fdtable (also a default
> > array in struct files_struct). The callback would be part of struct
> > file_operations, and only called if it exists (so the overhead only
> > hits device drivers that care).
> 
> Hell, *NO*.  This is insane - you would need to maintain extra counts
> ("how many descriptors refer to this struct file... for this descriptor
> table").
> 
> Besides, _what_ private_data?  What would own and maintain it?  A specific
> driver?  What if more than one of them wants that thing?

I didn't have anything complex in mind (i.e. lifetime tied to the
struct file, and no refcounting changes). But anyway, I gave up on that
idea and will add what I need in mmu_notifier.

>  
> > Did I miss something fundamental? copy_files() calls dup_fd(), so I
> > should be all set here.
> 
> That looks like an extremely misguided kludge for hell knows what purpose,
> almost certainly architecturally insane.  What are you actually trying to
> achieve?

I need a struct linking part of the device context with the mm struct
for a process. Most of the device context is linked to the struct file
of the device file (i.e. the process has opened a file descriptor for
the device file).

Device drivers for GPUs have some part of their process context tied
to the process mm (accessing the process address space directly from
the GPU). However, we cannot store this context information in the
struct file private data because of clone() (same struct file across
different mms).

So today drivers have a hashtable in their global device structure to
look up context information for a given mm. This is sub-optimal and
duplicates a lot of code among different drivers.

Hence I want something generic that allows a device driver to store a
context structure that is specific to an mm. I thought that adding a
new array on the side of the struct file array would be a good idea,
but it has too many kludges.

So I will do something inside mmu_notifier and there will be no tie to
any fs aspect. I expect only a handful of drivers to care about this,
and for a given platform you won't see that many devices, hence you
won't have that many pointers to deal with.
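
As a first sketch (hypothetical shape, to be refined in the actual
patches):

	#include <linux/hashtable.h>

	struct mm_device_ctx {
		struct hlist_node node;
		const void *key;  /* device identity, e.g. struct device * */
		void *ctx;        /* driver-private per-mm context */
	};

	struct mmu_notifier_mm {
		/* ... existing fields ... */
		DECLARE_HASHTABLE(device_ctx, 4);  /* new: per-mm contexts */
	};

	/* hash_init(mn_mm->device_ctx) when the mm registers notifiers;
	 * entries added with hash_add() from the driver */
	static void *mm_device_ctx_lookup(struct mmu_notifier_mm *mn_mm,
					  const void *key)
	{
		struct mm_device_ctx *c;

		hash_for_each_possible(mn_mm->device_ctx, c, node,
				       (unsigned long)key)
			if (c->key == key)
				return c->ctx;
		return NULL;
	}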

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:39                       ` Al Viro
@ 2018-04-19 21:08                         ` Jerome Glisse
  0 siblings, 0 replies; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 21:08 UTC (permalink / raw)
  To: Al Viro
  Cc: Matthew Wilcox, Jeff Layton, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 09:39:53PM +0100, Al Viro wrote:
> On Thu, Apr 19, 2018 at 01:25:13PM -0700, Matthew Wilcox wrote:
> > On Thu, Apr 19, 2018 at 04:15:02PM -0400, Jerome Glisse wrote:
> > > On Thu, Apr 19, 2018 at 12:56:37PM -0700, Matthew Wilcox wrote:
> > > > > Well, scratch that whole idea: I would need to add a new array
> > > > > to task_struct, which makes it a lot less appealing. Hence a
> > > > > better solution is to instead have this as part of the mm
> > > > > (well, indirectly).
> > > > 
> > > > It shouldn't be too bad to add a struct radix_tree to the fdtable.
> > > > 
> > > > I'm sure we could just not support weird cases like sharing the fdtable
> > > > without sharing the mm.  Does anyone actually do that?
> > > 
> > > Well, like you pointed out, what I really want is a 1:1 structure
> > > linking a device struct and a mm_struct. Given that this needs to
> > > be cleaned up when the mm goes away, tying this to mmu_notifier
> > > sounds like a better idea.
> > > 
> > > I am thinking of adding a hashtable to mmu_notifier_mm, using the
> > > file id for the hash as this should be a good hash value for
> > > common cases. I only expect a few drivers to need that (GPU
> > > drivers, RDMA). Today GPU drivers have a hashtable inside their
> > > driver and they hash on the mm_struct pointer; I believe hashing
> > > mmu_notifier_mm entries on the file id will be better.
> > 
> > file descriptors are small positive integers ...
> 
> ... except when there's a lot of them.  Or when something uses dup2() in
> interesting ways, but hey - we could "just not support" that, right?
> 
> > ideal for the radix tree.
> > If you need to find your data based on the struct file address, then by
> > all means a hashtable is the better data structure.
> 
> Perhaps it would be a good idea to describe whatever is being attempted?
> 
> FWIW, passing around descriptors is almost always a bloody bad idea.  There
> are very few things really associated with those and just about every time
> I'd seen internal APIs that work in terms of those "small positive numbers"
> they had been badly racy and required massive redesign to get something even
> remotely sane.

OK, I will use the struct device pointer as the index, or something
else (I would like to use PCI domain:bus:slot, but I don't want this to
be PCIe only), maybe dev_t ...

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 20:58               ` Jerome Glisse
@ 2018-04-19 21:21                 ` Al Viro
  2018-04-19 21:47                   ` Jerome Glisse
  0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2018-04-19 21:21 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote:

> I need a struct linking part of the device context with the mm struct
> for a process. Most of the device context is linked to the struct file
> of the device file (i.e. the process has opened a file descriptor for
> the device file).

Er...  You do realize that
	fd = open(...)
	mmap(fd, ...)
	close(fd)
is absolutely legitimate, right?  IOW, existence/stability/etc. of
a file descriptor is absolutely *not* guaranteed - in fact, there might
not be a single file descriptor referring to a given opened and mmapped
file.

> Device drivers for GPUs have some part of their process context tied
> to the process mm (accessing the process address space directly from
> the GPU). However, we cannot store this context information in the
> struct file private data because of clone() (same struct file across
> different mms).
> 
> So today drivers have a hashtable in their global device structure to
> look up context information for a given mm. This is sub-optimal and
> duplicates a lot of code among different drivers.

Umm...  Examples?

> Hence I want something generic that allows a device driver to store a
> context structure that is specific to an mm. I thought that adding a
> new array on the side of the struct file array would be a good idea,
> but it has too many kludges.
> 
> So I will do something inside mmu_notifier and there will be no tie to
> any fs aspect. I expect only a handful of drivers to care about this,
> and for a given platform you won't see that many devices, hence you
> won't have that many pointers to deal with.

Let's step back for a second - lookups by _what_?  If you are associating
something with a mapping, vm_area_struct would be a natural candidate for
storing such data, wouldn't it?

What do you have and what do you want to find?


* Re: [LSF/MM] schedule suggestion
  2018-04-19 21:21                 ` Al Viro
@ 2018-04-19 21:47                   ` Jerome Glisse
  2018-04-19 22:13                     ` Al Viro
  0 siblings, 1 reply; 23+ messages in thread
From: Jerome Glisse @ 2018-04-19 21:47 UTC (permalink / raw)
  To: Al Viro
  Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 10:21:37PM +0100, Al Viro wrote:
> On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote:
> 
> > I need a struct linking part of the device context with the mm
> > struct for a process. Most of the device context is linked to the
> > struct file of the device file (i.e. the process has opened a file
> > descriptor for the device file).
> 
> Er...  You do realize that
> 	fd = open(...)
> 	mmap(fd, ...)
> 	close(fd)
> is absolutely legitimate, right?  IOW, existence/stability/etc. of
> a file descriptor is absolutely *not* guaranteed - in fact, there might
> > not be a single file descriptor referring to a given opened and mmapped
> file.

Yes, and that's fine: on close(fd) the device driver is torn down and
the struct I want to store is torn down and freed too.

> 
> > Device drivers for GPUs have some part of their process context tied
> > to the process mm (accessing the process address space directly from
> > the GPU). However, we cannot store this context information in the
> > struct file private data because of clone() (same struct file across
> > different mms).
> > 
> > So today drivers have a hashtable in their global device structure
> > to look up context information for a given mm. This is sub-optimal
> > and duplicates a lot of code among different drivers.
> 
> Umm...  Examples?

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_mn.c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/i915/i915_gem_userptr.c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c

RDMA folks have a similar construct too.

> 
> > Hence I want something generic that allows a device driver to store
> > a context structure that is specific to an mm. I thought that adding
> > a new array on the side of the struct file array would be a good
> > idea, but it has too many kludges.
> > 
> > So I will do something inside mmu_notifier and there will be no tie
> > to any fs aspect. I expect only a handful of drivers to care about
> > this, and for a given platform you won't see that many devices,
> > hence you won't have that many pointers to deal with.
> 
> Let's step back for a second - lookups by _what_?  If you are associating
> something with a mapping, vm_area_struct would be a natural candidate for
> storing such data, wouldn't it?
> 
> What do you have and what do you want to find?

So you are in an ioctl against the device file: you have the struct
file, and the driver stores a pointer to some file context info in the
struct file private data, which itself has a pointer to some global
device driver structure, which itself has a pointer to the struct
device.

Hence I have the struct mm (from current->mm) and a dev_t easily
available.

The context information is tied to the mm for the device and can only
be used against said mm. Even if the struct file of the device outlives
the original process, no one can use that struct with a process that
does not have the same mm. Moreover, that struct is freed if the mm is
destroyed.

If a child shares the struct file but has a different mm and wants to
use the same feature, then a new structure is created with the same
property, i.e. it can only be used against this new mm.

The link with the struct file is not explicit, but you can only use
objects tied to that struct through an ioctl against the struct file.

Hope this clarifies the use case.

Cheers,
Jérôme


* Re: [LSF/MM] schedule suggestion
  2018-04-19 21:47                   ` Jerome Glisse
@ 2018-04-19 22:13                     ` Al Viro
  0 siblings, 0 replies; 23+ messages in thread
From: Al Viro @ 2018-04-19 22:13 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jeff Layton, Matthew Wilcox, Dave Chinner, linux-mm, lsf-pc,
	linux-fsdevel, linux-block, Johannes Weiner, Michal Hocko

On Thu, Apr 19, 2018 at 05:47:51PM -0400, Jerome Glisse wrote:
> On Thu, Apr 19, 2018 at 10:21:37PM +0100, Al Viro wrote:
> > On Thu, Apr 19, 2018 at 04:58:20PM -0400, Jerome Glisse wrote:
> > 
> > > I need a struct linking part of the device context with the mm
> > > struct for a process. Most of the device context is linked to the
> > > struct file of the device file (i.e. the process has opened a file
> > > descriptor for the device file).
> > 
> > Er...  You do realize that
> > 	fd = open(...)
> > 	mmap(fd, ...)
> > 	close(fd)
> > is absolutely legitimate, right?  IOW, existence/stability/etc. of
> > a file descriptor is absolutely *not* guaranteed - in fact, there might
> > > not be a single file descriptor referring to a given opened and mmapped
> > file.
> 
> Yes, and that's fine: on close(fd) the device driver is torn down

No, it is not.  _NOTHING_ is done on that close(fd), other than removing
a reference from the descriptor table.  In this case the struct file is
still open and remains so until munmap().

That's why the descriptor table is a very bad place for sticking that
kind of information.  Besides, as soon as your syscall (ioctl, write,
whatever) has looked the struct file up, the mapping from descriptors
to struct file can change.  Literally before fdget() has returned to
the caller.  Another thread could do dup() and close() of the original
descriptor.  Or just plain close(), for that matter - the struct file
will remain open until fdput().

> and
> the struct I want to store is torn down and freed too.

So do a hash table indexed by pair (void *, struct mm_struct *) and
do lookups there...  And use radeon_device as the first half of the
key.  Or struct file *, or pointer to whatever private data you maintain
for an opened file...
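
I.e., roughly (a sketch; radeon_device just standing in for whatever
owns the context):

	#include <linux/hashtable.h>
	#include <linux/mm_types.h>

	struct owner_mm_entry {
		struct hlist_node node;
		void *owner;	/* e.g. a radeon_device, or file private data */
		struct mm_struct *mm;
		void *ctx;	/* the per-(owner, mm) context */
	};

	DEFINE_HASHTABLE(owner_mm_table, 6);

	static void *owner_mm_lookup(void *owner, struct mm_struct *mm)
	{
		struct owner_mm_entry *e;
		unsigned long key = (unsigned long)owner ^ (unsigned long)mm;

		hash_for_each_possible(owner_mm_table, e, node, key)
			if (e->owner == owner && e->mm == mm)
				return e->ctx;
		return NULL;
	}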

