* scalability regressions related to hugetlb_fault() changes
@ 2022-03-24 20:12 Ray Fucillo
  2022-03-24 21:55 ` Randy Dunlap
  2022-03-30 12:22 ` Thorsten Leemhuis
  0 siblings, 2 replies; 8+ messages in thread
From: Ray Fucillo @ 2022-03-24 20:12 UTC
  To: linux-kernel

In moving to newer versions of the kernel, our customers have experienced dramatic new scalability problems in our database application, InterSystems IRIS.  Our research has narrowed this down to new processes that attach to the database's shared memory segment taking very long delays (in some cases ~100ms!) acquiring the i_mmap_lock_read() in hugetlb_fault() as they fault in the huge page for the first time.  The addition of this lock in hugetlb_fault() matches the versions where we see this problem.  It's not just slowing the new process that incurs the delay, but backing up other processes if the page fault occurs inside a critical section within the database application.

Is there something that can be improved here?  

The read locks in hugetlb_fault() contend with write locks that seem to be taken in very common application code paths: shmat(), process exit, fork() (not vfork()), shmdt(), presumably others.  So read-side contention in hugetlb_fault() turns out to be common.  When the system is loaded, there will be many new processes faulting in pages that may block the write lock, which in turn blocks more readers in the fault path behind it, and so on...  I don't think there's any support for shared page tables in hugetlb to avoid the faults altogether.
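
For anyone who wants to observe the delay from user space, here is a minimal sketch of a probe, not the instrumentation we actually used.  It attaches to an existing SysV segment and times the first read of its first byte, which is where the fault would be taken.  The names elapsed_ns() and attach_and_time_first_touch() are hypothetical, and the sketch assumes the segment was created elsewhere with SHM_HUGETLB; seeing contention would also require other processes concurrently running shmat(), shmdt(), fork() or exiting.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <time.h>

/* Nanoseconds elapsed from start to end. */
long long elapsed_ns(struct timespec start, struct timespec end)
{
    return (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

/*
 * Attach to an existing SysV shared memory segment and time the first
 * read of its first byte.  For a segment created with SHM_HUGETLB
 * (an assumption of this sketch), that first access is what faults
 * the huge page into the new mapping.
 * Returns the elapsed time in nanoseconds, or -1 if shmat() fails.
 */
long long attach_and_time_first_touch(int shmid)
{
    struct timespec t0, t1;
    volatile char *p = shmat(shmid, NULL, 0);

    if (p == (void *)-1)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    (void)p[0];                 /* first touch of the mapping */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    shmdt((const void *)p);
    return elapsed_ns(t0, t1);
}
```

Calling attach_and_time_first_touch() from many short-lived processes against the same shmid, while the system is loaded, should approximate the workload described above.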

Switching to 1GB huge pages instead of 2MB is a good mitigation because it reduces the frequency of faults, but it is not a complete solution.
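
To put a rough number on that mitigation, the sketch below computes an upper bound on first-touch faults per attaching process, assuming a process faults at most once for each huge page of the segment that it touches.  The function name max_first_touch_faults() and the 64 GiB segment size in the note that follows are illustrative assumptions, not figures from our deployments.

```c
#include <stdint.h>

#define HPAGE_2M (UINT64_C(2) << 20)   /* 2 MiB huge page */
#define HPAGE_1G (UINT64_C(1) << 30)   /* 1 GiB huge page */

/*
 * Number of huge pages needed to cover a segment, rounding up.
 * Under the one-fault-per-huge-page assumption this is the most
 * first-touch faults a single attaching process can take.
 */
uint64_t max_first_touch_faults(uint64_t segment_bytes, uint64_t hpage_bytes)
{
    return (segment_bytes + hpage_bytes - 1) / hpage_bytes;
}
```

For a hypothetical 64 GiB segment this gives 32768 possible faults with 2MB pages and 64 with 1GB pages, a factor of 512.  Each remaining fault still takes the same lock, which is why the larger page size reduces the problem without removing it.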

Thanks for considering.

Ray

Thread overview: 8+ messages
2022-03-24 20:12 scalability regressions related to hugetlb_fault() changes Ray Fucillo
2022-03-24 21:55 ` Randy Dunlap
2022-03-24 22:41   ` Mike Kravetz
2022-03-25  0:02     ` Ray Fucillo
2022-03-25  4:40       ` Mike Kravetz
2022-03-25 13:33         ` Ray Fucillo
2022-03-28 18:30           ` Mike Kravetz
2022-03-30 12:22 ` Thorsten Leemhuis