Returning ENOSPC when you have free space you can't yet prove is safer than not returning it and risking a data loss when you get hit by a write/commit storm. :) On Thu, Jul 30, 2015 at 9:44 PM, OGAWA Hirofumi wrote: > Jan Kara writes: > > >> > Yes, if userspace truncates the file, the situation we end up with is > >> > basically the same. However for truncate to happen some malicious > process > >> > has to come and truncate the file - a failure scenario that is > acceptable > >> > for most use cases since it doesn't happen unless someone is actively > >> > trying to screw you. With page forking it is enough for flusher thread > >> > to start writeback for that page to trigger the problem - event that > is > >> > basically bound to happen without any other userspace application > >> > interfering. > >> > >> Acceptable conclusion is where came from? That pseudocode logic doesn't > >> say about usage at all. And even if assume it is acceptable, as far as I > >> can see, for example /proc/sys/vm/drop_caches is enough to trigger, or a > >> page on non-exists block (sparse file. i.e. missing disk space check in > >> your logic). And if really no any lock/check, there would be another > >> races. > > > > So drop_caches won't cause any issues because it avoids mmaped pages. > > Also page reclaim or page migration don't cause any issues because > > they avoid pages with increased refcount (and increased refcount would > stop > > drop_caches from reclaiming the page as well if it was not for the mmaped > > check before). Generally, elevated page refcount currently guarantees > page > > isn't migrated, reclaimed, or otherwise detached from the mapping (except > > for truncate where the combination of mapping-index becomes invalid) and > > your page forking would change that assumption - which IMHO has a big > > potential for some breakage somewhere. > > Lifetime and visibility from user are different topic. The issue here > is visibility. Of course, those has relation more or less though, > refcount doesn't stop to drop page from radix-tree at all. > > Well, anyway, your claim seems to be assuming the userspace app > workarounds the issues. And it sounds like still not workarounds the > ENOSPC issue (validate at page fault/GUP) even if assuming userspace > behave as perfect. Calling it as kernel assumption is strange. > > If you claim, there is strange logic widely used already, and of course, > we can't simply break it because of compatibility. I would be able to > agree. But your claim sounds like that logic is sane and well designed > behavior. So I disagree. > > > And frankly I fail to see why you and Daniel care so much about this > > corner case because from performance POV it's IMHO a non-issue and you > > bother with page forking because of performance, don't you? > > Trying to penalize the corner case path, instead of normal path, should > try at first. Penalizing normal path to allow corner case path is insane > basically. > > Make normal path faster and more reliable is what we are trying. > > > So you can have a look for example at > > drivers/media/v4l2-core/videobuf2-dma-contig.c which implements setting > up > > of a video device buffer at virtual address specified by user. Now I > don't > > know whether there really is any userspace video program that sets up the > > video buffer in mmaped file. I would agree with you that it would be a > > strange thing to do but I've seen enough strange userspace code that I > > would not be too surprised. > > > > Another example of similar kind is at > > drivers/infiniband/core/umem.c where we again set up buffer for > infiniband > > cards at users specified virtual address. And there are more drivers in > > kernel like that. > > Unfortunately, I'm not looking those yet though. I guess those would be > helpful to see the details. > > Thanks. > -- > OGAWA Hirofumi > > _______________________________________________ > Tux3 mailing list > Tux3@phunq.net > http://phunq.net/mailman/listinfo/tux3 >