On Mon, Jan 23 2017, Matthew Wilcox wrote:

> On Sun, Jan 22, 2017 at 03:45:01PM +1100, NeilBrown wrote:
>> On Sun, Jan 22 2017, Theodore Ts'o wrote:
>> > On Sat, Jan 21, 2017 at 11:11:41AM +1100, NeilBrown wrote:
>> >> What are the benefits of GFP_TEMPORARY? Presumably it doesn't guarantee
>> >> success any more than GFP_KERNEL does, but maybe it is slightly less
>> >> likely to fail, and somewhat less likely to block for a long time?? But
>> >> without some sort of promise, I wonder why anyone would use the
>> >> flag. Is there a promise? Or is it just "you can be nice to the MM
>> >> layer by setting this flag sometimes". ???
>> >
>> > My understanding is that the idea is to allow short-term use cases not
>> > to be mixed with long-term use cases --- in the Java world, to declare
>> > that a particular object will never be promoted from the "nursery"
>> > arena to the "tenured" arena, so that we don't end up with a situation
>> > where a page is used 90% for temporary objects, and 10% for a tenured
>> > object, such that later on we have a page which is 90% unused.
>> >
>> > Many of the existing users may in fact be for things like a temporary
>> > bounce buffer for I/O, where declaring this to the mm system could
>> > lead to less fragmented pages, but which would violate your proposed
>> > contract:
>
> I don't have a clear picture in my mind of when Java promotes objects
> from nursery to tenure ... which is not too different from my lack of
> understanding of what the MM layer considers "temporary" :-) Is it
> acceptable usage to allocate a SCSI command (guaranteed to be freed
> within 30 seconds) from the temporary area? Or should it only be used
> for allocations where the thread of control is not going to sleep between
> allocation and freeing?
>
>> You have used terms like "nursery" and "tenured" which don't really help
>> without definitions of those terms.
>> How about
>>
>>   GFP_TEMPORARY should be used when the memory allocated will either be
>>   freed, or will be placed in a reclaimable cache, after some sequence
>>   of events which is time-limited. i.e. there must be no indefinite
>>   wait on the path from allocation to freeing-or-caching.
>>   The memory will typically be allocated from a region dedicated to
>>   GFP_TEMPORARY allocations, thus ensuring that this region does not
>>   become fragmented. Consequently, the delay imposed on GFP_TEMPORARY
>>   allocations is likely to be less than for non-TEMPORARY allocations
>>   when memory pressure is high.
>
> I think you're overcomplicating your proposed contract by allowing for
> the "adding to a reclaimable cache" case. If that will happen, the
> code should be using GFP_RECLAIMABLE, not GFP_TEMPORARY as a matter of
> good documentation. And to allow the definitions to differ in future.
> Maybe they will always be the same bit pattern, but the code should
> distinguish the two cases (obviously there is no problem with allocating
> memory with GFP_RECLAIMABLE, then deciding you didn't need it after all
> and freeing it).

I only included the "Reclaimable cache" possibility because Michal said:

  I guess the original intention was to use this flag for allocations
  which will be either freed shortly or they are reclaimable.

>
>> ??
>> I think that for this definition to work, we would need to make it "a
>> movable cache", meaning that any item can be either freed or
>> re-allocated (presumably to a "tenured" location). I don't think we
>> currently have that concept for slabs do we? That implies that this
>> flag would only apply to whole-page allocations (which was part of the
>> original question). We could presumably add movability to
>> slab-shrinkers if these seemed like a good idea.
>
> Funnily, Christoph Lameter and I are working on just such a proposal.
> He put it up as a topic discussion at the LCA Kernel Miniconf, and I've
> done a proof of concept implementation for radix tree nodes. It needs
> changes to the radix tree API to make it work, so it's not published yet,
> but it's a useful proof of concept for things which can probably work
> and be more effective, like the dentry & inode caches.

Awesome!

>
>> I think that it would also make sense to require that the path from
>> allocation to freeing (or caching) of GFP_TEMPORARY allocation must not
>> wait for a non-TEMPORARY allocation, as that becomes an indefinite wait.
>
> ... can it even wait for *another* TEMPORARY allocation? I really think
> this discussion needs to take place in a room with more people present
> so we can get misunderstandings hammered out and general acceptance of
> the consensus.

I suspect you are right, but throwing around some thoughts in advance,
to spark new ideas, can't hurt? I hate going to meetings where the
agenda has a topic, but no background discussion. It means that I have
to do all my thinking on my feet (not that I'll be at this meeting).

NeilBrown
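
For background, the fragmentation argument quoted at the top of the thread
(a page that is 90% temporary objects and 10% tenured ends up 90% unused)
can be shown with a toy userspace model. This is only a sketch and not
kernel code: pinned_pages(), SLOTS_PER_PAGE and MAX_OBJECTS are hypothetical
names made up for the illustration, and the model assumes fixed-size
objects, sequential slot assignment, and that every short-lived object has
already been freed when the pages are counted. It says nothing about how the
page allocator actually treats GFP_TEMPORARY.

```c
#include <stdbool.h>

#define SLOTS_PER_PAGE 10                       /* hypothetical objects per page */
#define MAX_OBJECTS    1000                     /* hypothetical model size limit */
#define MAX_PAGES      (MAX_OBJECTS / SLOTS_PER_PAGE)

/*
 * Toy model: place total_objects objects into pages, where every
 * long_lived_every-th object is long-lived and the rest are temporary.
 * Then assume all temporary objects are freed and return how many pages
 * are still pinned by at least one long-lived object.
 *
 * segregate == false: all objects share one region, in allocation order.
 * segregate == true:  long-lived objects are packed into their own region.
 *
 * Returns -1 if the arguments fall outside what the model supports.
 */
int pinned_pages(bool segregate, int total_objects, int long_lived_every)
{
    int shared_region[MAX_PAGES] = {0};     /* long-lived objects per page */
    int long_lived_region[MAX_PAGES] = {0}; /* long-lived objects per page */
    int shared_slot = 0;
    int long_lived_slot = 0;
    int pinned = 0;

    if (total_objects > MAX_OBJECTS || long_lived_every <= 0)
        return -1;

    for (int i = 0; i < total_objects; i++) {
        bool long_lived = (i % long_lived_every) == 0;

        if (long_lived && segregate) {
            /* Long-lived objects get packed together in a separate region. */
            long_lived_region[long_lived_slot / SLOTS_PER_PAGE]++;
            long_lived_slot++;
        } else {
            /* Everything else takes the next slot in the shared region. */
            if (long_lived)
                shared_region[shared_slot / SLOTS_PER_PAGE]++;
            shared_slot++;
        }
    }

    /* A page cannot be returned while any long-lived object remains on it. */
    for (int p = 0; p < MAX_PAGES; p++) {
        if (shared_region[p] > 0)
            pinned++;
        if (long_lived_region[p] > 0)
            pinned++;
    }
    return pinned;
}
```

With 1000 objects of which every 10th is long-lived, the mixed layout
leaves all 100 pages pinned by a single long-lived object each, while the
segregated layout pins only 10 pages. That is the motivation for keeping
short-term and long-term allocations apart, independent of whatever
contract GFP_TEMPORARY ends up with.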