linux-mm.kvack.org archive mirror
* Why do we let munmap fail?
@ 2018-05-21 22:07 Daniel Colascione
  2018-05-21 22:12 ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Colascione @ 2018-05-21 22:07 UTC (permalink / raw)
  To: linux-mm; +Cc: Tim Murray, Minchan Kim

Right now, we have this system knob max_map_count that caps the number of
VMAs we can have in a single address space. Put aside for the moment the question of
whether this knob should exist: even if it does, enforcing it for munmap,
mprotect, etc. produces weird and counter-intuitive situations in which
it's possible to fail to return resources (address space and commit charge)
to the system. At a deep philosophical level, that's the kind of operation
that should never fail. A library that does all the right things can still
experience a failure to deallocate resources it allocated itself if it gets
unlucky with VMA merging. Why should we allow that to happen?
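
A contrived sketch of the failure mode I mean (assuming the default
vm.max_map_count of 65530; the sizes are arbitrary): punch holes in one large
anonymous mapping until munmap itself starts failing.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        long page = sysconf(_SC_PAGESIZE);
        size_t npages = 1UL << 20;              /* 4 GiB of address space */
        char *p = mmap(NULL, npages * page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (p == MAP_FAILED)
                return 1;

        /* Each interior hole splits one VMA into two. */
        for (size_t i = 1; i < npages; i += 2) {
                if (munmap(p + i * page, page) != 0) {
                        printf("munmap: %s after %zu holes\n",
                               strerror(errno), i / 2);
                        return 0;
                }
        }
        printf("never hit the limit\n");
        return 0;
}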

Now let's return to max_map_count itself: what is it supposed to achieve?
If we want to limit application kernel memory resource consumption, let's
limit application kernel memory resource consumption, accounting for it on
a byte basis the same way we account for other kernel objects allocated on
behalf of userspace. Why should we have a separate cap just for the VMA
count?

I propose the following changes:

1) Let -1 mean "no VMA count limit".
2) Default max_map_count to -1.
3) Do not enforce max_map_count on munmap and mprotect.

Alternatively, can we account VMAs toward max_map_count on a page count
basis instead of a VMA basis? This way, no matter how you split and merge
your VMAs, you'll never see a weird failure to release resources. We'd have
to bump the default value of max_map_count to compensate for its new
interpretation.


* Re: Why do we let munmap fail?
  2018-05-21 22:07 Why do we let munmap fail? Daniel Colascione
@ 2018-05-21 22:12 ` Dave Hansen
  2018-05-21 22:20   ` Daniel Colascione
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2018-05-21 22:12 UTC (permalink / raw)
  To: Daniel Colascione, linux-mm; +Cc: Tim Murray, Minchan Kim

On 05/21/2018 03:07 PM, Daniel Colascione wrote:
> Now let's return to max_map_count itself: what is it supposed to achieve?
> If we want to limit application kernel memory resource consumption, let's
> limit application kernel memory resource consumption, accounting for it on
> a byte basis the same way we account for other kernel objects allocated on
> behalf of userspace. Why should we have a separate cap just for the VMA
> count?

VMAs consume kernel memory and we can't reclaim them.  That's what it
boils down to.


* Re: Why do we let munmap fail?
  2018-05-21 22:12 ` Dave Hansen
@ 2018-05-21 22:20   ` Daniel Colascione
  2018-05-21 22:29     ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Colascione @ 2018-05-21 22:20 UTC (permalink / raw)
  To: dave.hansen; +Cc: linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 3:12 PM Dave Hansen <dave.hansen@intel.com> wrote:

> On 05/21/2018 03:07 PM, Daniel Colascione wrote:
> > Now let's return to max_map_count itself: what is it supposed to achieve?
> > If we want to limit application kernel memory resource consumption, let's
> > limit application kernel memory resource consumption, accounting for it on
> > a byte basis the same way we account for other kernel objects allocated on
> > behalf of userspace. Why should we have a separate cap just for the VMA
> > count?

> VMAs consume kernel memory and we can't reclaim them.  That's what it
> boils down to.

How is it different from memfd in that respect?


* Re: Why do we let munmap fail?
  2018-05-21 22:20   ` Daniel Colascione
@ 2018-05-21 22:29     ` Dave Hansen
  2018-05-21 22:35       ` Daniel Colascione
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2018-05-21 22:29 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: linux-mm, Tim Murray, Minchan Kim

On 05/21/2018 03:20 PM, Daniel Colascione wrote:
>> VMAs consume kernel memory and we can't reclaim them.  That's what it
>> boils down to.
> How is it different from memfd in that respect?

I don't really know what you mean.  I know folks use memfd to figure out
how much memory pressure we are under.  I guess that would trigger when
you consume lots of memory with VMAs.

VMAs are probably the most similar to things like page tables that are
kernel memory that can't be directly reclaimed, but do get freed at
OOM-kill-time.  But, VMAs are a bit harder than page tables because
freeing a page worth of VMAs does not necessarily free an entire page.


* Re: Why do we let munmap fail?
  2018-05-21 22:29     ` Dave Hansen
@ 2018-05-21 22:35       ` Daniel Colascione
  2018-05-21 22:48         ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Colascione @ 2018-05-21 22:35 UTC (permalink / raw)
  To: dave.hansen; +Cc: linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 3:29 PM Dave Hansen <dave.hansen@intel.com> wrote:

> On 05/21/2018 03:20 PM, Daniel Colascione wrote:
> >> VMAs consume kernel memory and we can't reclaim them.  That's what it
> >> boils down to.
> > How is it different from memfd in that respect?

> I don't really know what you mean.

I should have been more clear. I meant, in general, that processes can
*already* ask the kernel to allocate memory on behalf of the process, and
sometimes this memory can't be reclaimed without an OOM kill. (You can swap
memfd/tmpfs contents, but for simplicity, imagine we're running without a
pagefile.)
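
Concretely, something like this (just a sketch; it assumes a glibc new enough
to expose memfd_create(), otherwise you'd invoke the syscall directly) already
lets a process tie up a large chunk of kernel-managed memory that, with no swap
configured, only exiting or an OOM kill will release:

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        size_t len = 256UL << 20;               /* 256 MiB of tmpfs pages */
        int fd = memfd_create("pinned", 0);
        char *p;

        if (fd < 0 || ftruncate(fd, len) != 0)
                return 1;

        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                return 1;

        memset(p, 0x5a, len);   /* instantiate the tmpfs pages */
        pause();                /* without swap, they stay until kill/OOM */
        return 0;
}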

> I know folks use memfd to figure out
> how much memory pressure we are under.  I guess that would trigger when
> you consume lots of memory with VMAs.

I think you're thinking of the VM pressure level special files, not memfd,
which creates an anonymous tmpfs file.

> VMAs are probably the most similar to things like page tables that are
> kernel memory that can't be directly reclaimed, but do get freed at
> OOM-kill-time.  But, VMAs are a bit harder than page tables because
> freeing a page worth of VMAs does not necessarily free an entire page.

I don't understand. We can reclaim memory used by VMAs by killing the
process or processes attached to the address space that owns those VMAs.
The OOM killer should Just Work. Why do we have to have some special limit
of VMA count?


* Re: Why do we let munmap fail?
  2018-05-21 22:35       ` Daniel Colascione
@ 2018-05-21 22:48         ` Dave Hansen
  2018-05-21 22:54           ` Daniel Colascione
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2018-05-21 22:48 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: linux-mm, Tim Murray, Minchan Kim

On 05/21/2018 03:35 PM, Daniel Colascione wrote:
>> I know folks use memfd to figure out
>> how much memory pressure we are under.  I guess that would trigger when
>> you consume lots of memory with VMAs.
> 
> I think you're thinking of the VM pressure level special files, not memfd,
> which creates an anonymous tmpfs file.

Yep, you're right.

>> VMAs are probably the most similar to things like page tables that are
>> kernel memory that can't be directly reclaimed, but do get freed at
>> OOM-kill-time.  But, VMAs are a bit harder than page tables because
>> freeing a page worth of VMAs does not necessarily free an entire page.
> 
> I don't understand. We can reclaim memory used by VMAs by killing the
> process or processes attached to the address space that owns those VMAs.
> The OOM killer should Just Work. Why do we have to have some special limit
> of VMA count?

The OOM killer doesn't take the VMA count into consideration as far as I
remember.  I can't think of any reason why not except for the internal
fragmentation problem.

The current VMA limit is ~12MB of VMAs per process, which is quite a
bit.  I think it would be reasonable to start considering that in OOM
decisions, although it's surely inconsequential except on very small
systems.
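
(Back-of-the-envelope for that figure, assuming sizeof(struct vm_area_struct)
is on the order of 200 bytes on x86_64: 65530 VMAs * ~200 bytes ~= 12-13 MB of
unreclaimable slab per address space.)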

There are also certainly denial-of-service concerns if you allow
arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
I'd be willing to bet there are plenty of things that fall over if you
let the ~65k limit get 10x or 100x larger.


* Re: Why do we let munmap fail?
  2018-05-21 22:48         ` Dave Hansen
@ 2018-05-21 22:54           ` Daniel Colascione
  2018-05-21 23:02             ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Colascione @ 2018-05-21 22:54 UTC (permalink / raw)
  To: dave.hansen; +Cc: linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 3:48 PM Dave Hansen <dave.hansen@intel.com> wrote:

> On 05/21/2018 03:35 PM, Daniel Colascione wrote:
> >> I know folks use memfd to figure out
> >> how much memory pressure we are under.  I guess that would trigger when
> >> you consume lots of memory with VMAs.
> >
> > I think you're thinking of the VM pressure level special files, not memfd,
> > which creates an anonymous tmpfs file.

> Yep, you're right.

> >> VMAs are probably the most similar to things like page tables that are
> >> kernel memory that can't be directly reclaimed, but do get freed at
> >> OOM-kill-time.  But, VMAs are a bit harder than page tables because
> >> freeing a page worth of VMAs does not necessarily free an entire page.
> >
> > I don't understand. We can reclaim memory used by VMAs by killing the
> > process or processes attached to the address space that owns those VMAs.
> > The OOM killer should Just Work. Why do we have to have some special limit
> > of VMA count?

> The OOM killer doesn't take the VMA count into consideration as far as I
> remember.  I can't think of any reason why not except for the internal
> fragmentation problem.

> The current VMA limit is ~12MB of VMAs per process, which is quite a
> bit.  I think it would be reasonable to start considering that in OOM
> decisions, although it's surely inconsequential except on very small
> systems.

> There are also certainly denial-of-service concerns if you allow
> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
> I'd be willing to bet there are plenty of things that fall over if you
> let the ~65k limit get 10x or 100x larger.

Sure. I'm receptive to the idea of having *some* VMA limit. I just think
it's unacceptable to let deallocation routines fail.

What about the proposal at the end of my original message? If we account
for mapped address space by counting pages instead of counting VMAs, no
amount of VMA splitting can trip us over the threshold. We could just
impose a system-wide vsize limit in addition to RLIMIT_AS, with the
effective limit being the smaller of the two. (On further thought, we'd
probably want to leave the meaning of max_map_count unchanged and introduce
a new knob.)


* Re: Why do we let munmap fail?
  2018-05-21 22:54           ` Daniel Colascione
@ 2018-05-21 23:02             ` Dave Hansen
  2018-05-21 23:16               ` Daniel Colascione
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2018-05-21 23:02 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: linux-mm, Tim Murray, Minchan Kim

On 05/21/2018 03:54 PM, Daniel Colascione wrote:
>> There are also certainly denial-of-service concerns if you allow
>> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
>> I'd be willing to bet there are plenty of things that fall over if you
>> let the ~65k limit get 10x or 100x larger.
> Sure. I'm receptive to the idea of having *some* VMA limit. I just think
> it's unacceptable to let deallocation routines fail.

If you have a resource limit and deallocation consumes resources, you
*eventually* have to fail a deallocation.  Right?


* Re: Why do we let munmap fail?
  2018-05-21 23:02             ` Dave Hansen
@ 2018-05-21 23:16               ` Daniel Colascione
  2018-05-21 23:32                 ` Dave Hansen
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Colascione @ 2018-05-21 23:16 UTC (permalink / raw)
  To: dave.hansen; +Cc: linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 4:02 PM Dave Hansen <dave.hansen@intel.com> wrote:

> On 05/21/2018 03:54 PM, Daniel Colascione wrote:
> >> There are also certainly denial-of-service concerns if you allow
> >> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
> >> I'd be willing to bet there are plenty of things that fall over if you
> >> let the ~65k limit get 10x or 100x larger.
> > Sure. I'm receptive to the idea of having *some* VMA limit. I just think
> > it's unacceptable to let deallocation routines fail.

> If you have a resource limit and deallocation consumes resources, you
> *eventually* have to fail a deallocation.  Right?

That's why robust software sets aside at allocation time whatever resources
are needed to make forward progress at deallocation time. That's what I'm
trying to propose here, essentially: if we specify the VMA limit in terms
of pages and not the number of VMAs, we've effectively "budgeted" for the
worst case of VMA splitting, since in the worst case, you end up with one
page per VMA.
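
(To put rough numbers on it, under the same ~200-bytes-per-vm_area_struct
assumption: a budget of 65,530 pages, i.e. roughly 256 MiB of mappable address
space with 4 KiB pages, bounds the worst case at 65,530 single-page VMAs, about
the same ~12 MB of kernel memory the current limit allows, no matter how the
mappings get split.)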

Done this way, we still prevent runaway VMA tree growth, but we can also
make sure that anyone who's successfully called mmap can successfully call
munmap.


* Re: Why do we let munmap fail?
  2018-05-21 23:16               ` Daniel Colascione
@ 2018-05-21 23:32                 ` Dave Hansen
  2018-05-22  0:00                   ` Daniel Colascione
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2018-05-21 23:32 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: linux-mm, Tim Murray, Minchan Kim

On 05/21/2018 04:16 PM, Daniel Colascione wrote:
> On Mon, May 21, 2018 at 4:02 PM Dave Hansen <dave.hansen@intel.com> wrote:
> 
>> On 05/21/2018 03:54 PM, Daniel Colascione wrote:
>>>> There are also certainly denial-of-service concerns if you allow
>>>> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
>>>> I'd be willing to bet there are plenty of things that fall over if you
>>>> let the ~65k limit get 10x or 100x larger.
>>> Sure. I'm receptive to the idea of having *some* VMA limit. I just think
>>> it's unacceptable to let deallocation routines fail.
>> If you have a resource limit and deallocation consumes resources, you
>> *eventually* have to fail a deallocation.  Right?
> That's why robust software sets aside at allocation time whatever resources
> are needed to make forward progress at deallocation time.

I think there's still a potential dead-end here.  "Deallocation" does
not always free resources.

> That's what I'm trying to propose here, essentially: if we specify
> the VMA limit in terms of pages and not the number of VMAs, we've
> effectively "budgeted" for the worst case of VMA splitting, since in
> the worst case, you end up with one page per VMA.
Not a bad idea, but it's not really how we allocate VMAs today.  You
would somehow need per-process (mm?) slabs.  Such a scheme would
probably, on average, waste half of a page per mm.

> Done this way, we still prevent runaway VMA tree growth, but we can also
> make sure that anyone who's successfully called mmap can successfully call
> munmap.

I'd be curious how this works out, but I bet you end up reserving a lot
more resources than people want.


* Re: Why do we let munmap fail?
  2018-05-21 23:32                 ` Dave Hansen
@ 2018-05-22  0:00                   ` Daniel Colascione
  2018-05-22  0:22                     ` Matthew Wilcox
  2018-05-22  5:34                     ` Nicholas Piggin
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Colascione @ 2018-05-22  0:00 UTC (permalink / raw)
  To: dave.hansen; +Cc: linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 4:32 PM Dave Hansen <dave.hansen@intel.com> wrote:

> On 05/21/2018 04:16 PM, Daniel Colascione wrote:
> > On Mon, May 21, 2018 at 4:02 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> >> On 05/21/2018 03:54 PM, Daniel Colascione wrote:
> >>>> There are also certainly denial-of-service concerns if you allow
> >>>> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
> >>>> I'd be willing to bet there are plenty of things that fall over if you
> >>>> let the ~65k limit get 10x or 100x larger.
> >>> Sure. I'm receptive to the idea of having *some* VMA limit. I just think
> >>> it's unacceptable to let deallocation routines fail.
> >> If you have a resource limit and deallocation consumes resources, you
> >> *eventually* have to fail a deallocation.  Right?
> > That's why robust software sets aside at allocation time whatever resources
> > are needed to make forward progress at deallocation time.

> I think there's still a potential dead-end here.  "Deallocation" does
> not always free resources.

Sure, but the general principle applies: reserve resources when you *can*
fail so that you don't fail where you can't fail.

> > That's what I'm trying to propose here, essentially: if we specify
> > the VMA limit in terms of pages and not the number of VMAs, we've
> > effectively "budgeted" for the worst case of VMA splitting, since in
> > the worst case, you end up with one page per VMA.
> Not a bad idea, but it's not really how we allocate VMAs today.  You
> would somehow need per-process (mm?) slabs.  Such a scheme would
> probably, on average, waste half of a page per mm.

> > Done this way, we still prevent runaway VMA tree growth, but we can also
> > make sure that anyone who's successfully called mmap can successfully call
> > munmap.

> I'd be curious how this works out, but I bet you end up reserving a lot
> more resources than people want.

I'm not sure. We're talking about two separate goals, I think. Goal #1 is
preventing the VMA tree becoming so large that we effectively DoS the
system. Goal #2 is about ensuring that the munmap path can't fail. Right
now, the system only achieves goal #1.

All we have to do to continue to achieve goal #1 is impose *some* sanity
limit on the VMA count, right? It doesn't really matter whether the limit
is specified in pages or number-of-VMAs so long as it's larger than most
applications will need but smaller than the DoS threshold. The resource
we're allocating at mmap time isn't really bytes of
struct-vm_area_struct-backing-storage, but sort of virtual anti-DoS
credits. Right now, these anti-DoS credits are denominated in number of
VMAs, but if we changed the denomination to page counts instead, we'd still
achieve goal #1 while avoiding the munmap-failing-with-ENOMEM weirdness.
Granted, if we make only this change, then munmap internal allocations
*still* fail if the actual VMA allocation failed, but I think the default
kernel OOM killer strategy will suffice for handling this kind of global
extreme memory pressure situation. All we have to do is change the *limit
check* during VMA creation, not the actual allocation strategy.
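
In pseudo-kernel-C, purely to illustrate the shape of the check I mean (this is
not the actual mm/mmap.c code; sysctl_max_map_pages is a made-up knob, while
mm->total_vm is the existing per-mm count of mapped pages):

/* Illustration only: deny new address space, never its release. */
static bool may_add_mapping(struct mm_struct *mm, unsigned long npages)
{
        /* Consulted on mmap/brk/mremap-grow paths only; munmap and
         * mprotect splits never check it, so they cannot fail here. */
        return mm->total_vm + npages <= sysctl_max_map_pages;
}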

Another way of looking at it: Linux is usually configured to overcommit
with respect to *commit charge*. This behavior is well-known and widely
understood. What the VMA limit does is effectively overcommit with respect
to *address space*, which is weird and surprising because we normally think
of address space as being strictly accounted. If we can easily and cheaply
make address space actually strictly accounted, why not give it a shot?

Goal #2 is interesting as well, and I think it's what your slab-allocation
proposal would help address. If we literally set aside memory for all
possible VMAs, we'd ensure that internal allocations on the munmap path
could never fail. In the abstract, I'd like that (I'm a fan of strict
commit accounting generally), but I don't think it's necessary for fixing
the problem that motivated this thread.


* Re: Why do we let munmap fail?
  2018-05-22  0:00                   ` Daniel Colascione
@ 2018-05-22  0:22                     ` Matthew Wilcox
  2018-05-22  0:38                       ` Daniel Colascione
  2018-05-22  5:34                     ` Nicholas Piggin
  1 sibling, 1 reply; 19+ messages in thread
From: Matthew Wilcox @ 2018-05-22  0:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 05:00:47PM -0700, Daniel Colascione wrote:
> On Mon, May 21, 2018 at 4:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > I think there's still a potential dead-end here.  "Deallocation" does
> > not always free resources.
> 
> Sure, but the general principle applies: reserve resources when you *can*
> fail so that you don't fail where you can't fail.

Umm.  OK.  But you want an mmap of 4TB to succeed, right?  That implies
preallocating one billion * sizeof(*vma).  That's, what, dozens of
gigabytes right there?

I'm sympathetic to wanting to keep both vma-merging and
unmap-anything-i-mapped working, but your proposal isn't going to fix it.

You need to handle the attacker writing a program which mmaps 46 bits
of address space and then munmaps alternate pages.  That program needs
to be detected and stopped.


* Re: Why do we let munmap fail?
  2018-05-22  0:22                     ` Matthew Wilcox
@ 2018-05-22  0:38                       ` Daniel Colascione
  2018-05-22  1:19                         ` Theodore Y. Ts'o
  2018-05-22  1:22                         ` Matthew Wilcox
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Colascione @ 2018-05-22  0:38 UTC (permalink / raw)
  To: willy; +Cc: dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 5:22 PM Matthew Wilcox <willy@infradead.org> wrote:

> On Mon, May 21, 2018 at 05:00:47PM -0700, Daniel Colascione wrote:
> > On Mon, May 21, 2018 at 4:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > > I think there's still a potential dead-end here.  "Deallocation" does
> > > not always free resources.
> >
> > Sure, but the general principle applies: reserve resources when you *can*
> > fail so that you don't fail where you can't fail.

> Umm.  OK.  But you want an mmap of 4TB to succeed, right?  That implies
> preallocating one billion * sizeof(*vma).  That's, what, dozens of
> gigabytes right there?

That's not what I'm proposing here. I'd hoped to make that clear in the
remainder of the email to which you've replied.

> I'm sympathetic to wanting to keep both vma-merging and
> unmap-anything-i-mapped working, but your proposal isn't going to fix it.

> You need to handle the attacker writing a program which mmaps 46 bits
> of address space and then munmaps alternate pages.  That program needs
> to be detected and stopped.

Let's look at why it's bad to mmap 46 bits of address space and munmap
alternate pages. It can't be that doing so would just use too much memory:
you can mmap 46 bits of address space *already* and touch each page, one by
one, until the kernel gets fed up and the OOM killer kills you.

So it's not because we'd allocate a lot of memory that having a huge VMA
tree is bad, because we already let processes allocate globs of memory in
other ways. The badness comes, AIUI, from the asymptotic behavior of the
address lookup algorithm in a tree that big.

One approach to dealing with this badness, the one I proposed earlier, is
to prevent that giant mmap from appearing in the first place (because we'd
cap vsize). If that giant mmap never appears, you can't generate a huge VMA
tree by splitting it.

Maybe that's not a good approach. Maybe processes really need mappings that
big. If they do, then maybe the right approach is to just make 8 billion
VMAs not "DoS the system". What actually goes wrong if we just let the VMA
tree grow that large? So what if VMA lookup ends up taking a while --- the
process with the pathological allocation pattern is paying the cost, right?


* Re: Why do we let munmap fail?
  2018-05-22  0:38                       ` Daniel Colascione
@ 2018-05-22  1:19                         ` Theodore Y. Ts'o
  2018-05-22  1:41                           ` Daniel Colascione
  2018-05-22  1:22                         ` Matthew Wilcox
  1 sibling, 1 reply; 19+ messages in thread
From: Theodore Y. Ts'o @ 2018-05-22  1:19 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: willy, dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 05:38:06PM -0700, Daniel Colascione wrote:
> 
> One approach to dealing with this badness, the one I proposed earlier, is
> to prevent that giant mmap from appearing in the first place (because we'd
> cap vsize). If that giant mmap never appears, you can't generate a huge VMA
> tree by splitting it.
> 
> Maybe that's not a good approach. Maybe processes really need mappings that
> big. If they do, then maybe the right approach is to just make 8 billion
> VMAs not "DoS the system". What actually goes wrong if we just let the VMA
> tree grow that large? So what if VMA lookup ends up taking a while --- the
> process with the pathological allocation pattern is paying the cost, right?
>

Fine.  Let's pick a more reasonable size --- say, 1GB.  That's still
2**18 4k pages.  Someone who munmap's every other 4k page is going to
create 2**17 VMA's.  That's a lot of VMA's.  So now the question is do
we pre-preserve enough VMA's for this worst case scenario, for all
processes in the system?  Or do we fail or otherwise kill the process
who is clearly attempting a DOS attack on the system?
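
(Rough numbers, assuming ~200 bytes per vm_area_struct: 2**17 VMA's is about
131,072 * 200 bytes ~= 25 MB of unreclaimable kernel memory for every GB that
gets mapped and split this way.)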

If your goal is that munmap must ***never*** fail, then effectively
you have to preserve enough resources for 50% of all 4k pages in all
of the virtual address spaces in use by all of the processes in the
system.  That's a horrible waste of resources, just to guarantee that
munmap(2) must never fail.

Personally, I think it's not worth it.

Why is it so important to you that munmap(2) must not fail?  Is it not
enough to say that if you mmap(2) a region, if you munmap(2) that
exact same size region as you mmap(2)'ed, it must not fail?  That's a
much easier guarantee to make....

						- Ted


* Re: Why do we let munmap fail?
  2018-05-22  0:38                       ` Daniel Colascione
  2018-05-22  1:19                         ` Theodore Y. Ts'o
@ 2018-05-22  1:22                         ` Matthew Wilcox
  1 sibling, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2018-05-22  1:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 05:38:06PM -0700, Daniel Colascione wrote:
> On Mon, May 21, 2018 at 5:22 PM Matthew Wilcox <willy@infradead.org> wrote:
> > On Mon, May 21, 2018 at 05:00:47PM -0700, Daniel Colascione wrote:
> > > On Mon, May 21, 2018 at 4:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > > > I think there's still a potential dead-end here.  "Deallocation" does
> > > > not always free resources.
> > >
> > > Sure, but the general principle applies: reserve resources when you *can*
> > > fail so that you don't fail where you can't fail.
> 
> > Umm.  OK.  But you want an mmap of 4TB to succeed, right?  That implies
> > preallocating one billion * sizeof(*vma).  That's, what, dozens of
> > gigabytes right there?
> 
> That's not what I'm proposing here. I'd hoped to make that clear in the
> remainder of the email to which you've replied.
> 
> > I'm sympathetic to wanting to keep both vma-merging and
> > unmap-anything-i-mapped working, but your proposal isn't going to fix it.
> 
> > You need to handle the attacker writing a program which mmaps 46 bits
> > of address space and then munmaps alternate pages.  That program needs
> > to be detected and stopped.
> 
> Let's look at why it's bad to mmap 46 bits of address space and munmap
> alternate pages. It can't be that doing so would just use too much memory:
> you can mmap 46 bits of address space *already* and touch each page, one by
> one, until the kernel gets fed up and the OOM killer kills you.

If it's anonymous memory, sure, the kernel will kill you.  If it's
file-backed memory, the kernel will page it out again.  Sure, page
table consumption might also kill you, but 8 bytes per page is a lot
less memory consumption than ~200 bytes per page!

> So it's not because we'd allocate a lot of memory that having a huge VMA
> tree is bad, because we already let processes allocate globs of memory in
> other ways. The badness comes, AIUI, from the asymptotic behavior of the
> address lookup algorithm in a tree that big.

There's an order of magnitude difference in memory consumption though.

> One approach to dealing with this badness, the one I proposed earlier, is
> to prevent that giant mmap from appearing in the first place (because we'd
> cap vsize). If that giant mmap never appears, you can't generate a huge VMA
> tree by splitting it.

I have 16GB of memory in this laptop.  At 200 bytes per page, allocating
10% of my memory to vm_area_structs (a ridiculously high overhead)
restricts the total amount I can mmap (spread between all processes)
to 8 million pages, 32GB.  Firefox alone is taking 3.6GB and gnome-shell
another 4.4GB.  Your proposal just doesn't work.

> Maybe that's not a good approach. Maybe processes really need mappings that
> big. If they do, then maybe the right approach is to just make 8 billion
> VMAs not "DoS the system". What actually goes wrong if we just let the VMA
> tree grow that large? So what if VMA lookup ends up taking a while --- the
> process with the pathological allocation pattern is paying the cost, right?

There's a per-inode tree of every mapping of that file, so if I mmap
libc and then munmap alternate pages, every user of libc pays the price.


* Re: Why do we let munmap fail?
  2018-05-22  1:19                         ` Theodore Y. Ts'o
@ 2018-05-22  1:41                           ` Daniel Colascione
  2018-05-22  2:09                             ` Daniel Colascione
  2018-05-22  2:11                             ` Matthew Wilcox
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Colascione @ 2018-05-22  1:41 UTC (permalink / raw)
  To: tytso; +Cc: willy, dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 6:19 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:

> On Mon, May 21, 2018 at 05:38:06PM -0700, Daniel Colascione wrote:
> >
> > One approach to dealing with this badness, the one I proposed earlier, is
> > to prevent that giant mmap from appearing in the first place (because we'd
> > cap vsize). If that giant mmap never appears, you can't generate a huge VMA
> > tree by splitting it.
> >
> > Maybe that's not a good approach. Maybe processes really need mappings that
> > big. If they do, then maybe the right approach is to just make 8 billion
> > VMAs not "DoS the system". What actually goes wrong if we just let the VMA
> > tree grow that large? So what if VMA lookup ends up taking a while --- the
> > process with the pathological allocation pattern is paying the cost, right?
> >

> Fine.  Let's pick a more reasonable size --- say, 1GB.  That's still
> 2**18 4k pages.  Someone who munmap's every other 4k page is going to
> create 2**17 VMA's.  That's a lot of VMA's.  So now the question is do
> we pre-preserve enough VMA's for this worst case scenario, for all
> processes in the system?  Or do we fail or otherwise kill the process
> who is clearly attempting a DOS attack on the system?

> If your goal is that munmap must ***never*** fail, then effectively
> you have to preserve enough resources for 50% of all 4k pages in all
> of the virtual address spaces in use by all of the processes in the
> system.  That's a horrible waste of resources, just to guarantee that
> munmap(2) must never fail.

To be clear, I'm not suggesting that we actually perform this
preallocation. (Maybe in the distant future, with strict commit accounting,
it'd be useful.) I'm just suggesting that we perform the accounting as if
we did. But I think Matthew's convinced me that there's no vsize cap small
enough to be safe and still large enough to be useful, so I'll retract the
vsize cap idea.

> Personally, I think it's not worth it.

> Why is it so important to you that munmap(2) must not fail?  Is it not
> enough to say that if you mmap(2) a region, if you munmap(2) that
> exact same size region as you mmap(2)'ed, it must not fail?  That's a
> much easier guarantee to make....

That'd be good too, but I don't see how this guarantee would be easier to
make. If you call mmap three times, those three allocations might end up
merged into the same VMA, and if you called munmap on the middle
allocation, you'd still have to split. Am I misunderstanding something?
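
For concreteness, a sketch of the case I have in mind (illustrative only;
whether the three mappings actually merge is up to the kernel):

/* Three adjacent anonymous mappings with identical flags will typically
 * merge into a single VMA; unmapping the middle one then forces a split. */
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        long page = sysconf(_SC_PAGESIZE);
        char *base = mmap(NULL, 3 * page, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
                return 1;

        /* Three separate mmap() calls, back to back. */
        for (int i = 0; i < 3; i++)
                mmap(base + i * page, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

        system("cat /proc/self/maps");  /* usually one merged rw region */
        munmap(base + page, page);      /* the middle "allocation" goes away */
        system("cat /proc/self/maps");  /* now two VMAs around the hole */
        return 0;
}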


* Re: Why do we let munmap fail?
  2018-05-22  1:41                           ` Daniel Colascione
@ 2018-05-22  2:09                             ` Daniel Colascione
  2018-05-22  2:11                             ` Matthew Wilcox
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Colascione @ 2018-05-22  2:09 UTC (permalink / raw)
  To: tytso; +Cc: willy, dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 6:41 PM Daniel Colascione <dancol@google.com> wrote:
> That'd be good too, but I don't see how this guarantee would be easier to
> make. If you call mmap three times, those three allocations might end up
> merged into the same VMA, and if you called munmap on the middle
> allocation, you'd still have to split. Am I misunderstanding something?

Oh: a sequence number stored in the VMA, combined with a refusal to merge
across sequence number differences.
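
Hand-waving in pseudo-C, just to sketch the idea (vm_seq and the merge check
below are invented for illustration, not existing kernel fields):

/* Tag each VMA with the mmap() call that created it and refuse to merge
 * VMAs from different calls, so an exact munmap() of an earlier mapping
 * never needs a split. */
struct tagged_vma {
        /* ...the usual vm_area_struct fields... */
        u64 vm_seq;     /* invented: id of the creating mmap() call */
};

static bool seq_mergeable(const struct tagged_vma *a,
                          const struct tagged_vma *b)
{
        return a->vm_seq == b->vm_seq;
}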


* Re: Why do we let munmap fail?
  2018-05-22  1:41                           ` Daniel Colascione
  2018-05-22  2:09                             ` Daniel Colascione
@ 2018-05-22  2:11                             ` Matthew Wilcox
  1 sibling, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2018-05-22  2:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: tytso, dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, May 21, 2018 at 06:41:12PM -0700, Daniel Colascione wrote:
> On Mon, May 21, 2018 at 6:19 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> 
> > On Mon, May 21, 2018 at 05:38:06PM -0700, Daniel Colascione wrote:
> > >
> > > One approach to dealing with this badness, the one I proposed earlier, is
> > > to prevent that giant mmap from appearing in the first place (because we'd
> > > cap vsize). If that giant mmap never appears, you can't generate a huge VMA
> > > tree by splitting it.
> > >
> > > Maybe that's not a good approach. Maybe processes really need mappings that
> > > big. If they do, then maybe the right approach is to just make 8 billion
> > > VMAs not "DoS the system". What actually goes wrong if we just let the VMA
> > > tree grow that large? So what if VMA lookup ends up taking a while --- the
> > > process with the pathological allocation pattern is paying the cost, right?
> > >
> 
> > Fine.  Let's pick a more reasonable size --- say, 1GB.  That's still
> > 2**18 4k pages.  Someone who munmap's every other 4k page is going to
> > create 2**17 VMA's.  That's a lot of VMA's.  So now the question is do
> > we pre-preserve enough VMA's for this worst case scenario, for all
> > processes in the system?  Or do we fail or otherwise kill the process
> > who is clearly attempting a DOS attack on the system?
> 
> > If your goal is that munmap must ***never*** fail, then effectively
> > you have to preserve enough resources for 50% of all 4k pages in all
> > of the virtual address spaces in use by all of the processes in the
> > system.  That's a horrible waste of resources, just to guarantee that
> > munmap(2) must never fail.
> 
> To be clear, I'm not suggesting that we actually perform this
> preallocation. (Maybe in the distant future, with strict commit accounting,
> it'd be useful.) I'm just suggesting that we perform the accounting as if
> we did. But I think Matthew's convinced me that there's no vsize cap small
> enough to be safe and still large enough to be useful, so I'll retract the
> vsize cap idea.
> 
> > Personally, I think it's not worth it.
> 
> > Why is it so important to you that munmap(2) must not fail?  Is it not
> > enough to say that if you mmap(2) a region, if you munmap(2) that
> > exact same size region as you mmap(2)'ed, it must not fail?  That's a
> > much easier guarantee to make....
> 
> That'd be good too, but I don't see how this guarantee would be easier to
> make. If you call mmap three times, those three allocations might end up
> merged into the same VMA, and if you called munmap on the middle
> allocation, you'd still have to split. Am I misunderstanding something?

What I think Ted's proposing (and I was too) is that we either preallocate
or make a note of how many VMAs we've merged.  So you can unmap as many
times as you've mapped without risking failure.  If you start unmapping
in the middle, then you might see munmap failures, but if you only unmap
things that you already mapped, we can guarantee that munmap won't fail.


* Re: Why do we let munmap fail?
  2018-05-22  0:00                   ` Daniel Colascione
  2018-05-22  0:22                     ` Matthew Wilcox
@ 2018-05-22  5:34                     ` Nicholas Piggin
  1 sibling, 0 replies; 19+ messages in thread
From: Nicholas Piggin @ 2018-05-22  5:34 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: dave.hansen, linux-mm, Tim Murray, Minchan Kim

On Mon, 21 May 2018 17:00:47 -0700
Daniel Colascione <dancol@google.com> wrote:

> On Mon, May 21, 2018 at 4:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> 
> > On 05/21/2018 04:16 PM, Daniel Colascione wrote:
> > > On Mon, May 21, 2018 at 4:02 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > >> On 05/21/2018 03:54 PM, Daniel Colascione wrote:
> > >>>> There are also certainly denial-of-service concerns if you allow
> > >>>> arbitrary numbers of VMAs.  The rbtree, for instance, is O(log(n)), but
> > >>>> I'd be willing to bet there are plenty of things that fall over if you
> > >>>> let the ~65k limit get 10x or 100x larger.
> > >>> Sure. I'm receptive to the idea of having *some* VMA limit. I just think
> > >>> it's unacceptable to let deallocation routines fail.
> > >> If you have a resource limit and deallocation consumes resources, you
> > >> *eventually* have to fail a deallocation.  Right?
> > > That's why robust software sets aside at allocation time whatever resources
> > > are needed to make forward progress at deallocation time.
> 
> > I think there's still a potential dead-end here.  "Deallocation" does
> > not always free resources.  
> 
> Sure, but the general principle applies: reserve resources when you *can*
> fail so that you don't fail where you can't fail.

munmap != deallocation, it's a request to change the address mapping.
A more complex mapping uses more resources. mmap can free resources
if it transforms your mapping to a simpler one.

Thanks,
Nick

