On Mon, 12 Apr 2010, Andrew Morton wrote:

> On Mon, 12 Apr 2010 07:22:17 +0100
> Eric B Munson wrote:
>
> > Andrew,
> >
> > I am resubmitting this patch because I believe that the discussion
> > has shown this to be an acceptable solution.
>
> To whom?  Some acked-by's would clarify.
>
> > I have fixed the 32-bit build errors, but other than that change,
> > the code is the same as Roland's V3 patch.
> >
> > From: Roland Dreier
> >
> > As discussed in and follow-up messages, libraries using RDMA would
> > like to track precisely when application code changes memory mapping
> > via free(), munmap(), etc.  Current pure-userspace solutions using
> > malloc hooks and other tricks are not robust, and the feeling among
> > experts is that the issue is unfixable without kernel help.
>
> But this info could be reassembled by tracking syscall activity, yes?
> Perhaps some discussion here explaining why the (possibly enhanced)
> ptrace, audit, etc. interfaces are unsuitable.
>
> > We solve this not by implementing the full API proposed in the email
> > linked above but rather with a simpler and more generic interface,
> > which may be useful in other contexts.  Specifically, we implement a
> > new character device driver, ummunotify, that creates a
> > /dev/ummunotify node.  A userspace process can open this node
> > read-only and use the fd as follows:
> >
> >  1. ioctl() to register/unregister an address range to watch in the
> >     kernel (cf. struct ummunotify_register_ioctl in
> >     <linux/ummunotify.h>).
> >
> >  2. read() to retrieve events generated when a mapping in a watched
> >     address range is invalidated (cf. struct ummunotify_event in
> >     <linux/ummunotify.h>).  select()/poll()/epoll() and SIGIO are
> >     handled for this IO.
> >
> >  3. mmap() one page at offset 0 to map a kernel page that contains a
> >     generation counter that is incremented each time an event is
> >     generated.  This allows userspace to have a fast path that checks
> >     that no events have occurred without a system call.
>
> OK, what's missing from this whole description and from ummunotify.txt
> is: how does one specify the target process?  Does /dev/ummunotify
> implicitly attach to current->mm?  If so, why, and what are the
> implications of this?
>
> If instead it is possible to attach to some other process's mmu
> activity (/proc/<pid>/ummunotify?) then how is that done and what are
> the security/permissions implications?
>
> Also, the whole thing is obviously racy: by the time userspace finds
> out that something has happened, it might have changed.  This
> inevitably reduces the applicability/usefulness of the whole thing as
> compared to some synchronous mechanism which halts the monitored
> thread until the request has been processed and acked.  All this
> should (IMO) be explored, explained and justified.
>
> Also, what prevents the obvious DoS which occurs when I register for
> events and just let them queue up until the kernel runs out of memory?
> Presumably events get dropped - what are the reliability implications
> of this and how is the max queue length managed?
>
> Also, ioctls are unpopular.  Were other interfaces considered?

I am reworking the Documentation to address all these questions and
will resubmit when finished.

Thanks for the feedback,
Eric
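
For concreteness, here is a minimal sketch of how a userspace consumer
might drive the interface described in the quoted text above: open
/dev/ummunotify, register a range with ioctl(), mmap() the
generation-counter page, and read() events.  The struct layouts, field
names, and ioctl request number below are illustrative assumptions, not
the patch's actual ABI; only the overall flow comes from the description
quoted above.

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/*
 * Assumed ABI, for illustration only: the real definitions would come
 * from <linux/ummunotify.h>, and the field names, ioctl magic/number,
 * and event layout here are guesses.
 */
struct ummunotify_register_ioctl {
	uint64_t start;		/* first byte of the range to watch */
	uint64_t end;		/* one past the last byte of the range */
	uint64_t user_cookie;	/* echoed back in events for this range */
};

struct ummunotify_event {
	uint32_t type;		/* e.g. "watched range invalidated" */
	uint32_t reserved;
	uint64_t user_cookie;	/* cookie given at registration time */
	uint64_t hint_start;	/* invalidated subrange, if reported */
	uint64_t hint_end;
};

#define UMMUNOTIFY_REGISTER_REGION \
	_IOW('U', 1, struct ummunotify_register_ioctl)	/* assumed number */

static char buf[1 << 20];	/* pretend an RDMA library registered this */

int main(void)
{
	/* O_NONBLOCK so read() returns at once when no events are queued. */
	int fd = open("/dev/ummunotify", O_RDONLY | O_NONBLOCK);
	if (fd < 0) {
		perror("open /dev/ummunotify");
		return 1;
	}

	/* Step 1: ask the kernel to watch [buf, buf + sizeof(buf)). */
	struct ummunotify_register_ioctl reg = {
		.start		= (uintptr_t) buf,
		.end		= (uintptr_t) buf + sizeof(buf),
		.user_cookie	= 42,
	};
	if (ioctl(fd, UMMUNOTIFY_REGISTER_REGION, &reg) < 0) {
		perror("ioctl(UMMUNOTIFY_REGISTER_REGION)");
		return 1;
	}

	/* Step 3: map the generation counter for a syscall-free fast path. */
	volatile uint64_t *gen = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ,
				      MAP_SHARED, fd, 0);
	if (gen == MAP_FAILED)
		return 1;
	uint64_t last_gen = *gen;

	/* ... application runs; something may munmap()/free() part of buf ... */

	/* Step 2: only fall back to read() when the counter has moved. */
	if (*gen != last_gen) {
		struct ummunotify_event ev;
		while (read(fd, &ev, sizeof(ev)) == (ssize_t) sizeof(ev))
			printf("cookie %llu: mapping changed\n",
			       (unsigned long long) ev.user_cookie);
		last_gen = *gen;
	}

	close(fd);
	return 0;
}

The mmap()'ed generation counter is what keeps the common no-event case
cheap: the library only pays for a read() system call when the mapped
counter has advanced since it last looked.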