From: "Paul E. McKenney" <email@example.com> To: Thomas Gleixner <firstname.lastname@example.org> Cc: Michal Hocko <email@example.com>, Uladzislau Rezki <firstname.lastname@example.org>, LKML <email@example.com>, RCU <firstname.lastname@example.org>, email@example.com, Andrew Morton <firstname.lastname@example.org>, Vlastimil Babka <email@example.com>, Matthew Wilcox <firstname.lastname@example.org>, "Theodore Y . Ts'o" <email@example.com>, Joel Fernandes <firstname.lastname@example.org>, Sebastian Andrzej Siewior <email@example.com>, Oleksiy Avramchenko <firstname.lastname@example.org>, Peter Zijlstra <email@example.com> Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag Date: Fri, 14 Aug 2020 20:01:48 -0700 [thread overview] Message-ID: <20200815030148.GX4295@paulmck-ThinkPad-P72> (raw) In-Reply-To: <firstname.lastname@example.org> On Sat, Aug 15, 2020 at 02:43:51AM +0200, Thomas Gleixner wrote: > Paul, > > On Fri, Aug 14 2020 at 16:41, Paul E. McKenney wrote: > > On Sat, Aug 15, 2020 at 01:14:53AM +0200, Thomas Gleixner wrote: > >> As a matter of fact I assume^Wdeclare that removing struct rcu_head which > >> provides a fallback is not an option at all. I know that you want to, > >> but it wont work ever. Dream on, but as we agreed on recently there is > >> this thing called reality which ruins everything. > > > > For call_rcu(), agreed. For kfree_rcu(), we know what the callback is > > going to do, plus single-argument kfree_rcu() can only be invoked from > > sleepable context. (If you want to kfree_rcu() from non-sleepable > > context, that will cost you an rcu_head in the data structure being > > freed.) > > kfree_rcu() as of today is just a conveniance wrapper around > call_rcu(obj, rcu) which can be called from any context and it still > takes TWO arguments. > > Icepack? Indeed. Make that not kfree_rcu(), but rather kvfree_rcu(), which is in mainline. :-/ > So if you come up with a new kfree_rcu_magic(void *obj) single argument > variant which can only be called from sleepable contexts then this does > not require any of the raw lock vs. non raw hacks at all because you can > simply allocate without holding the raw lock in the rare case that you > run out of storage space. With four 4k pages preallocated per CPU that's > every 2048 invocations per CPU on 64bit. > > So if you run into that situation then you drop the lock and then it's > racy because you might be preempted or migrated after dropping the lock > and you might have done a useless allocation, but that does not justify > having a special allocator just for that? You have an extra page, so > what? > > To prevent subsequent callers to add to the allocation race you simply > can let them wait on the first allocating attempt to finish That avoids > more pointless allocations and as a side effect prevents all of them to > create more pressure by continuing their open/close loop naturally > without extra work. Agreed, as I said, it is the double-argument version that is the challenge. > > So if the single-argument kfree_rcu() case gets hit with a > > memory-allocation failure, it can fall back to waiting for a grace > > period and doing the free. Of course, grace-period waits have horrible > > latency, but under OOM life is hard. If this becomes a problem in > > non-OOM situations due to the lockless caches becoming empty, we will > > have to allocate memory if needed before acquiring the lock with the > > usual backout logic. Doing that means that we can let the allocator > > acquire locks and maybe even do a little bit of blocking, so that the > > inline grace-period-wait would only happen if the system was well and > > truly OOMed. > > No. It dropped the rcu internal lock and does a regular GFP_KENRNEL > allocation which waits for the page to become available. Which is a good > thing in the open/close scenario because it throttles the offender. Understood, especially that last. But it really doesn't want to be waiting in the memory allocator for more than a grace period. But that was hashed out quite some time ago, and there is a combination of GFP_* flags that achieves the right balance for the can-sleep situation. > >> For normal operations a couple of pages which can be preallocated are > >> enough. What you are concerned of is the case where you run out of > >> pointer storage space. > > > > Agreed. > > > >> There are two reasons why that can happen: > >> > >> 1) RCU call flooding > >> 2) RCU not being able to run and mop up the backlog > >> > >> #1 is observable by looking at the remaining storage space and the RCU > >> call frequency > >> > >> #2 is uninteresting because it's caused by RCU being stalled / delayed > >> e.g. by a runaway of some sorts or a plain RCU usage bug. > >> > >> Allocating more memory in that case does not solve or improve anything. > > > > Yes, #2 is instead RCU CPU stall warning territory. > > > > If this becomes a problem, one approach is to skip the page-of-pointers > > allocation if the grace period is more than (say) one second old. If > > the grace period never completes, OOM is unavoidable, but this is a way > > of putting it off for a bit. > > Don't even think about optimizing your new thing for #2. It's a > pointless exercise. If the task which runs into the 'can't allocate' > case then is sleeps and waits. End of story. Agreed, and hence my "If this becomes a problem". Until such time, it is pointless. For one thing, we don't yet know the failure mode. But it has been helpful for me to think a move or two ahead when playing against RCU, hence the remainder of my paragraph. > >> So the interesting case is #1. Which means we need to look at the > >> potential sources of the flooding: > >> > >> 1) User space via syscalls, e.g. open/close > >> 2) Kernel thread > >> 3) Softirq > >> 4) Device interrupt > >> 5) System interrupts, deep atomic context, NMI ... > >> > >> #1 trivial fix is to force switching to an high prio thread or a soft > >> interrupt which does the allocation > >> > >> #2 Similar to #1 unless that thread loops with interrupts, softirqs or > >> preemption disabled. If that's the case then running out of RCU > >> storage space is the least of your worries. > >> > >> #3 Similar to #2. The obvious candidates (e.g. NET) for monopolizing a > >> CPU have loop limits in place already. If there is a bug which fails > >> to care about the limit, why would RCU care and allocate more memory? > >> > >> #4 Similar to #3. If the interrupt handler loops forever or if the > >> interrupt is a runaway which prevents task/softirq processing then > >> RCU free performance is the least of your worries. > >> > >> #5 Clearly a bug and making RCU accomodate for that is beyond silly. > >> > >> So if call_rcu() detects that the remaining storage space for pointers > >> goes below the critical point or if it observes high frequency calls > >> then it simply should force a soft interrupt which does the allocation. > > > > Unless call_rcu() has been invoked with scheduler locks held. But > > eventually call_rcu() should be invoked with interrupts enabled, and at > > that point it would be safe to raise_softirq(), wake_up(), or > > whatever. > > If this atomic context corner case is hit within a problematic context > then we talk about the RCU of today and not about the future single > argument thing. And that oldschool RCU has a fallback. We are talking > about pressure corner cases and you really want to squeeze out the last > cache miss? What for? If there is pressure then these cache misses are > irrelevant. Of course. My point was instead that even this atomic corner case was likely to have escape routes in the form of occasional non-atomic calls, and that these could do the wakeups. Again, thank you. Thanx, Paul
next prev parent reply other threads:[~2020-08-15 22:11 UTC|newest] Thread overview: 111+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-08-09 20:43 [RFC-PATCH 0/2] __GFP_NO_LOCKS Uladzislau Rezki (Sony) 2020-08-09 20:43 ` [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag Uladzislau Rezki (Sony) 2020-08-10 12:31 ` Michal Hocko 2020-08-10 16:07 ` Uladzislau Rezki 2020-08-10 19:25 ` Michal Hocko 2020-08-11 8:19 ` Michal Hocko 2020-08-11 9:37 ` Uladzislau Rezki 2020-08-11 9:42 ` Uladzislau Rezki 2020-08-11 10:28 ` Michal Hocko 2020-08-11 10:45 ` Uladzislau Rezki 2020-08-11 10:26 ` Michal Hocko 2020-08-11 11:33 ` Uladzislau Rezki 2020-08-11 9:18 ` Uladzislau Rezki 2020-08-11 10:21 ` Michal Hocko 2020-08-11 11:10 ` Uladzislau Rezki 2020-08-11 14:44 ` Thomas Gleixner 2020-08-11 15:22 ` Thomas Gleixner 2020-08-12 11:38 ` Thomas Gleixner 2020-08-12 12:01 ` Uladzislau Rezki 2020-08-13 7:18 ` Michal Hocko 2020-08-11 15:33 ` Paul E. McKenney 2020-08-11 15:43 ` Thomas Gleixner 2020-08-11 15:56 ` Sebastian Andrzej Siewior 2020-08-11 16:02 ` Paul E. McKenney 2020-08-11 16:19 ` Paul E. McKenney 2020-08-11 19:39 ` Thomas Gleixner 2020-08-11 21:09 ` Paul E. McKenney 2020-08-12 0:13 ` Thomas Gleixner 2020-08-12 4:29 ` Paul E. McKenney 2020-08-12 8:32 ` Thomas Gleixner 2020-08-12 13:30 ` Paul E. McKenney 2020-08-13 7:50 ` Michal Hocko 2020-08-13 9:58 ` Uladzislau Rezki 2020-08-13 11:15 ` Michal Hocko 2020-08-13 13:27 ` Thomas Gleixner 2020-08-13 13:45 ` Michal Hocko 2020-08-13 14:32 ` Matthew Wilcox 2020-08-13 16:14 ` Thomas Gleixner 2020-08-13 16:22 ` Matthew Wilcox 2020-08-13 13:22 ` Thomas Gleixner 2020-08-13 13:33 ` Michal Hocko 2020-08-13 14:34 ` Thomas Gleixner 2020-08-13 14:53 ` Michal Hocko 2020-08-13 15:41 ` Paul E. McKenney 2020-08-13 15:54 ` Michal Hocko 2020-08-13 16:04 ` Paul E. McKenney 2020-08-13 16:13 ` Michal Hocko 2020-08-13 16:29 ` Paul E. McKenney 2020-08-13 17:12 ` Michal Hocko 2020-08-13 17:27 ` Paul E. McKenney 2020-08-13 18:31 ` peterz 2020-08-13 19:13 ` Michal Hocko 2020-08-13 16:20 ` Uladzislau Rezki 2020-08-13 16:36 ` Michal Hocko 2020-08-14 11:54 ` Uladzislau Rezki 2020-08-13 17:09 ` Thomas Gleixner 2020-08-13 17:22 ` Michal Hocko 2020-08-14 7:17 ` Michal Hocko 2020-08-14 12:15 ` Uladzislau Rezki 2020-08-14 12:48 ` Michal Hocko 2020-08-14 13:34 ` Paul E. McKenney 2020-08-14 14:06 ` Michal Hocko 2020-08-14 18:01 ` Paul E. McKenney 2020-08-14 23:14 ` Thomas Gleixner 2020-08-14 23:41 ` Paul E. McKenney 2020-08-15 0:43 ` Thomas Gleixner 2020-08-15 3:01 ` Paul E. McKenney [this message] 2020-08-15 8:27 ` Peter Zijlstra 2020-08-15 13:03 ` Paul E. McKenney 2020-08-15 8:42 ` Peter Zijlstra 2020-08-15 14:18 ` Paul E. McKenney 2020-08-15 14:23 ` Paul E. McKenney 2020-08-17 8:47 ` Michal Hocko 2020-08-13 18:26 ` peterz 2020-08-13 18:52 ` Paul E. McKenney 2020-08-13 22:06 ` peterz 2020-08-13 23:23 ` Paul E. McKenney 2020-08-13 23:59 ` Thomas Gleixner 2020-08-14 8:30 ` Peter Zijlstra 2020-08-14 10:23 ` peterz 2020-08-14 15:26 ` Paul E. McKenney 2020-08-14 14:14 ` Paul E. McKenney 2020-08-14 16:11 ` Paul E. McKenney 2020-08-14 17:49 ` Peter Zijlstra 2020-08-14 18:02 ` Paul E. McKenney 2020-08-14 19:33 ` Thomas Gleixner 2020-08-14 20:41 ` Paul E. McKenney 2020-08-14 21:52 ` Peter Zijlstra 2020-08-14 23:27 ` Paul E. McKenney 2020-08-14 23:40 ` Thomas Gleixner 2020-08-16 22:56 ` Uladzislau Rezki 2020-08-17 8:28 ` Michal Hocko 2020-08-17 10:36 ` Uladzislau Rezki 2020-08-17 22:28 ` Paul E. McKenney 2020-08-18 7:43 ` Michal Hocko 2020-08-18 13:53 ` Paul E. McKenney 2020-08-18 14:43 ` Thomas Gleixner 2020-08-18 16:13 ` Paul E. McKenney 2020-08-18 16:55 ` Thomas Gleixner 2020-08-18 17:13 ` Paul E. McKenney 2020-08-18 23:26 ` Thomas Gleixner 2020-08-19 23:07 ` Paul E. McKenney 2020-08-18 15:02 ` Michal Hocko 2020-08-18 15:45 ` Uladzislau Rezki 2020-08-18 16:18 ` Paul E. McKenney 2020-08-14 16:19 ` peterz 2020-08-14 18:15 ` Paul E. McKenney 2020-08-13 13:29 ` Uladzislau Rezki 2020-08-13 13:41 ` Michal Hocko 2020-08-13 14:22 ` Uladzislau Rezki 2020-08-09 20:43 ` [PATCH 2/2] rcu/tree: use " Uladzislau Rezki (Sony)
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200815030148.GX4295@paulmck-ThinkPad-P72 \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --subject='Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).