On Sun, Oct 30, 2022 at 11:51 AM Linus Torvalds wrote:
>
> We could keep the current placement of the TLB flush, to just before
> we drop the page table lock.
>
> And we could do all the things we do in 'page_remove_rmap()' right now
> *except* for the mapcount stuff.
>
> And only move the mapcount code to the page freeing stage.

So I actually have a three-commit series to do the rmap simplification,
but let me just post the end result of that series, because the end
result is actually smaller than the individual commits (I did it as
three incremental commits just to make it more obvious to me how to get
to that end result).

The three commits end up being

    mm: introduce simplified versions of 'page_remove_rmap()'
    mm: inline simpler case of page_remove_file_rmap()
    mm: re-unify the simplified page_zap_*_rmap() function

and the end result of them is this attached patch.

I'm *claiming* that the attached patch is semantically identical to what
we did before it, just _hugely_ simplified. Basically, that new
'page_zap_pte_rmap()' does the same things that 'page_remove_rmap()'
did, except it is limited to only last-level PTE entries (and
munlock_vma_page() has to be called separately).

The simplification comes from 'compound' being false, from it always
being about small pages, and from the atomic mapcount decrement having
been moved outside the memcg lock, since it is independent of it.

Anyway, this simplification patch basically means that the *next* step
could be to just move that 'page_zap_pte_rmap()' to after the TLB
flush, and now it's trivial and no longer scary.

I did *not* do that yet, because it still needs that "encoded_page[]"
array - except now it doesn't encode the 'dirty' bit, now it would
encode the 'do a page->_mapcount decrement' bit.

I didn't do that part, because I needed to do the rc3 release, plus I'd
like to have somebody look at this introductory patch first.

              Linus
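
Just to make the shape of that concrete, here is a rough sketch of what
such a simplified helper could look like - this is only an illustration
of the description above (mapcount decrement first, outside the memcg
lock, then the memcg-locked statistics update), not the attached patch,
and it assumes the accounting side uses lock_page_memcg() and
__dec_lruvec_page_state():

	/*
	 * Illustrative sketch: zap the rmap for one last-level PTE mapping
	 * of a small page.  The mapcount decrement does not need the memcg
	 * lock, so do it first; only take lock_page_memcg() for the stats
	 * update when this was the last mapping.
	 */
	static void page_zap_pte_rmap(struct page *page)
	{
		if (!atomic_add_negative(-1, &page->_mapcount))
			return;

		lock_page_memcg(page);
		__dec_lruvec_page_state(page,
			PageAnon(page) ? NR_ANON_MAPPED : NR_FILE_MAPPED);
		unlock_page_memcg(page);
	}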
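
And for the "encoded_page[]" idea mentioned above: since 'struct page'
pointers are aligned, the low bit is free to carry a per-page flag
through the deferred-flush array.  Purely hypothetical helper names,
just to illustrate the kind of encoding involved:

	#define ZAP_RMAP_BIT	1UL

	/* Stash a "do the _mapcount decrement later" flag in the low bit. */
	static inline unsigned long encode_zap_page(struct page *page, bool zap_rmap)
	{
		return (unsigned long)page | (zap_rmap ? ZAP_RMAP_BIT : 0);
	}

	static inline struct page *zap_page_ptr(unsigned long val)
	{
		return (struct page *)(val & ~ZAP_RMAP_BIT);
	}

	static inline bool zap_page_rmap_flag(unsigned long val)
	{
		return val & ZAP_RMAP_BIT;
	}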