* COW in XArray
       [not found] <CA+49okpy=FUsZpc-WcBG9tMUwzgP7MYNuPPKN22BR=dq3HQ9tA@mail.gmail.com>
From: Shawn Landden @ 2019-05-12 14:56 UTC
To: linux-fsdevel, Matthew Wilcox

Willy,

I am trying to implement epochs for pids. For this I need to allow
radix tree operations to be specified as COW (deletion does not need to
change). Radix trees look like they are under a lot of work by you, so
how can I best get this feature, and have some code I can work with to
write my feature?

-Shawn Landden

* Re: COW in XArray
From: Matthew Wilcox @ 2019-05-13 2:22 UTC
To: Shawn Landden; +Cc: linux-fsdevel

On Sun, May 12, 2019 at 09:56:47AM -0500, Shawn Landden wrote:
> I am trying to implement epochs for pids. For this I need to allow
> radix tree operations to be specified as COW (deletion does not need to
> change). Radix trees look like they are under a lot of work by you, so
> how can I best get this feature, and have some code I can work with to
> write my feature?

Hi Shawn,

I'd love to help, but I don't quite understand what you want.

Here's the conversion of the PID allocator from the IDR to the XArray:

http://git.infradead.org/users/willy/linux-dax.git/commitdiff/223ad3ae5dfffdfc5642b1ce54df2c7836b57ef1

What semantics do you want to change?

* Re: COW in XArray
From: Shawn Landden @ 2019-05-13 3:42 UTC
To: Matthew Wilcox; +Cc: linux-fsdevel

On Sun, May 12, 2019 at 9:22 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sun, May 12, 2019 at 09:56:47AM -0500, Shawn Landden wrote:
> > I am trying to implement epochs for pids. For this I need to allow
> > radix tree operations to be specified as COW (deletion does not need to
> > change). Radix trees look like they are under a lot of work by you, so
> > how can I best get this feature, and have some code I can work with to
> > write my feature?
>
> Hi Shawn,
>
> I'd love to help, but I don't quite understand what you want.
>
> Here's the conversion of the PID allocator from the IDR to the XArray:
>
> http://git.infradead.org/users/willy/linux-dax.git/commitdiff/223ad3ae5dfffdfc5642b1ce54df2c7836b57ef1
>
> What semantics do you want to change?

When allocating a pid, you pass an epoch number. If the pids being
allocated wrap, then the epoch is incremented, and a new radix tree is
created that is a COW of the last epoch. If the page that is found for
allocation is of an older epoch, it is copied and the allocation only
happens in the copy.

On freeing a pid, there is a single radix-tree bit for every still-active
epoch that is set to indicate that this slot has expired. This will be
used for the (new) waitpidv syscall, which can provide all the
functionality of wait4() and more, and allows processes to synchronize
their references to the current epoch.

The current versions of the pid syscalls will continue to operate with
the same existing racy semantics. New pid syscalls will be added that
take an epoch argument. A current pid epoch (u32) is added to
task_sched; it is reset on fork(), when a new process is allocated and
then a new pid is allocated, and the epoch has a prctl setter and getter.

If a syscall comes in and the epoch passed is not current AND has
passed the pid of the process (this is not a lock, because the current
and previous epochs are always available), then it might fail with
EEPOCH. The caller then has to call a new syscall, waitpidv(pidv
*pid_t, epoch, O_NONBLOCK), providing a list of pids it has references
to in a specific epoch, and it gets back a list of which processes
have exited.

The epoch of a process is always relative to its pid (not thread-id),
so the same epoch number can mean different things in different
places.

The process can then invalidate its own internal pids and use prctl to
indicate it doesn't need the old epoch. Processes also get a signal if
they haven't updated and are 2 full epochs behind. Being behind should
also count against a process in kernel memory accounting. I am sure
there is much more to consider....

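The copy-on-write allocation described in the message above can be
illustrated with a small userspace sketch in C. It is not kernel code
and does not use the XArray or IDR APIs. The names pid_tree, leaf,
tree_new_epoch, tree_store and tree_load, and the FANOUT constant, are
all hypothetical and chosen only for illustration. The sketch assumes a
fixed two-level tree, reads "page" in the proposal as a leaf node, and
leaves out locking, RCU, freeing and the per-epoch "expired" bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FANOUT  16                  /* slots per node; hypothetical value */
#define MAX_PID (FANOUT * FANOUT)   /* capacity of the two-level tree */

/* A leaf holds FANOUT slots and records the epoch allowed to modify it. */
struct leaf {
    uint32_t epoch;
    void *slot[FANOUT];
};

/* One root per epoch; leaves may be shared with older epochs. */
struct pid_tree {
    uint32_t epoch;
    struct leaf *leaf[FANOUT];
};

/* Start a new epoch: copy only the root and share every existing leaf. */
struct pid_tree *tree_new_epoch(const struct pid_tree *old)
{
    struct pid_tree *t = calloc(1, sizeof(*t));

    if (!t)
        return NULL;
    if (old) {
        memcpy(t->leaf, old->leaf, sizeof(t->leaf));
        t->epoch = old->epoch + 1;
    }
    return t;
}

/* Store an entry; a leaf owned by an older epoch is copied first (COW). */
int tree_store(struct pid_tree *t, unsigned int pid, void *entry)
{
    struct leaf *l;

    assert(pid < MAX_PID);
    l = t->leaf[pid / FANOUT];
    if (!l || l->epoch != t->epoch) {
        struct leaf *copy = calloc(1, sizeof(*copy));

        if (!copy)
            return -1;
        if (l)
            memcpy(copy->slot, l->slot, sizeof(copy->slot));
        copy->epoch = t->epoch;
        t->leaf[pid / FANOUT] = copy;   /* older epochs keep their leaf */
        l = copy;
    }
    l->slot[pid % FANOUT] = entry;
    return 0;
}

/* Look up an entry as seen by the given epoch's tree. */
void *tree_load(const struct pid_tree *t, unsigned int pid)
{
    const struct leaf *l;

    assert(pid < MAX_PID);
    l = t->leaf[pid / FANOUT];
    return l ? l->slot[pid % FANOUT] : NULL;
}
```

In this sketch, starting a new epoch copies only the root. A later
tree_store() into the new epoch copies just the one leaf covering that
pid, so a lookup through the older epoch's tree still returns the old
entry while all untouched leaves stay shared between the two epochs.
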
* Re: COW in XArray
From: Shawn Landden @ 2019-05-13 3:51 UTC
To: Matthew Wilcox; +Cc: linux-fsdevel

Actually, I am sorry. I will have to put some more thought into how to
do this, as it might be possible with only some bitmaps to keep track
of invalidated processes.