From: "Michal Koutný" <mkoutny@suse.com> To: Jan Kara <jack@suse.cz> Cc: Paolo Valente <paolo.valente@linaro.org>, linux-block@vger.kernel.org, fvogdt@suse.de, cgroups@vger.kernel.org Subject: Re: Use after free with BFQ and cgroups Date: Fri, 26 Nov 2021 15:47:24 +0100 [thread overview] Message-ID: <20211126144724.GA31093@blackbody.suse.cz> (raw) In-Reply-To: <20211125172809.GC19572@quack2.suse.cz> Hello. On Thu, Nov 25, 2021 at 06:28:09PM +0100, Jan Kara <jack@suse.cz> wrote: [...] +Cc cgroups ML https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/ I understand there are more objects than blkcgs but I assume it can eventually boil down to blkcg references, so I suggest another alternative. (But I may easily miss the relations between BFQ objects, so consider this only high-level opinion.) > After some poking, looking into crashdumps, and applying some debug patches > the following seems to be happening: We have a process P in blkcg G. Now > G is taken offline so bfq_group is cleaned up in bfq_pd_offline() but P > still holds reference to G from its bfq_queue. Then P submits IO, G gets > inserted into service tree despite being already offline. (If G is offline, P can only be zombie, just saying. (I guess it can still be Q's IO on behalf of G.)) IIUC, the reference to G is only held by P. If the G reference is copied into another structure (the service tree) it should get another reference. My naïve proposal would be css_get(). (1) > IO completes, P exits, bfq_queue pointing to G gets destroyed, the > last reference to G is dropped, G gets freed although is it still > inserted in the service tree. Eventually someone trips over the freed > memory. Isn't it the bfq_queue.bfq_entity that's inserted in the service tree (not blkcg G)? You write bfq_queue is destroyed, shouldn't that remove it from the service tree? (2) > Now I was looking into how to best fix this. There are several > possibilities and I'm not sure which one to pick so that's why I'm writing > to you. bfq_pd_offline() is walking all entities in service trees and > trying to get rid of references to bfq_group (by reparenting entities). > Is this guaranteed to see all entities that point to G? From the scenario > I'm observing it seems this can miss entities pointing to G - e.g. if they > are in idle tree, we will just remove them from the idle tree but we won't > change entity->parent so they still point to G. This can be seen as one > culprit of the bug. There can be two types of references to blkcg (transitively via bfq_group): a) "plain" (just a pointer stored somewhere), b) "pinned" (marked by css_get() of the respective blkcg). The bfq_pd_offline() callback should erase all plain references (e.g. by reparenting) or poke the holders of pinned references to release (unpin) them eventually (so that blkcg goes away). I reckon it's not possible to traverse all references in the bfq_pd_offline(). > Or alternatively, should we e.g. add __bfq_deactivate_entity() to > bfq_put_queue() when that function is dropping last queue in a bfq_group? I guess this is what I wondered about in (2). (But I'm not sure this really is proof against subsequent re-insertions into the tree.) > Or should we just reparent bfq queues that have already dead parent on > activation? If (1) used css_tryget_online(), the parent (or ancestor if it happened to be offlined too) could be the fallback. > What's your opinion? The question here is how long would stay the offlined blkcgs around if they were directly pinned upon the IO submission. If it's unbound, then reparenting makes more sense. Michal
WARNING: multiple messages have this Message-ID (diff)
From: "Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org> To: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> Cc: Paolo Valente <paolo.valente-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>, linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, fvogdt-l3A5Bk7waGM@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Subject: Re: Use after free with BFQ and cgroups Date: Fri, 26 Nov 2021 15:47:24 +0100 [thread overview] Message-ID: <20211126144724.GA31093@blackbody.suse.cz> (raw) In-Reply-To: <20211125172809.GC19572-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org> Hello. On Thu, Nov 25, 2021 at 06:28:09PM +0100, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote: [...] +Cc cgroups ML https://lore.kernel.org/linux-block/20211125172809.GC19572-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org/ I understand there are more objects than blkcgs but I assume it can eventually boil down to blkcg references, so I suggest another alternative. (But I may easily miss the relations between BFQ objects, so consider this only high-level opinion.) > After some poking, looking into crashdumps, and applying some debug patches > the following seems to be happening: We have a process P in blkcg G. Now > G is taken offline so bfq_group is cleaned up in bfq_pd_offline() but P > still holds reference to G from its bfq_queue. Then P submits IO, G gets > inserted into service tree despite being already offline. (If G is offline, P can only be zombie, just saying. (I guess it can still be Q's IO on behalf of G.)) IIUC, the reference to G is only held by P. If the G reference is copied into another structure (the service tree) it should get another reference. My naïve proposal would be css_get(). (1) > IO completes, P exits, bfq_queue pointing to G gets destroyed, the > last reference to G is dropped, G gets freed although is it still > inserted in the service tree. Eventually someone trips over the freed > memory. Isn't it the bfq_queue.bfq_entity that's inserted in the service tree (not blkcg G)? You write bfq_queue is destroyed, shouldn't that remove it from the service tree? (2) > Now I was looking into how to best fix this. There are several > possibilities and I'm not sure which one to pick so that's why I'm writing > to you. bfq_pd_offline() is walking all entities in service trees and > trying to get rid of references to bfq_group (by reparenting entities). > Is this guaranteed to see all entities that point to G? From the scenario > I'm observing it seems this can miss entities pointing to G - e.g. if they > are in idle tree, we will just remove them from the idle tree but we won't > change entity->parent so they still point to G. This can be seen as one > culprit of the bug. There can be two types of references to blkcg (transitively via bfq_group): a) "plain" (just a pointer stored somewhere), b) "pinned" (marked by css_get() of the respective blkcg). The bfq_pd_offline() callback should erase all plain references (e.g. by reparenting) or poke the holders of pinned references to release (unpin) them eventually (so that blkcg goes away). I reckon it's not possible to traverse all references in the bfq_pd_offline(). > Or alternatively, should we e.g. add __bfq_deactivate_entity() to > bfq_put_queue() when that function is dropping last queue in a bfq_group? I guess this is what I wondered about in (2). (But I'm not sure this really is proof against subsequent re-insertions into the tree.) > Or should we just reparent bfq queues that have already dead parent on > activation? If (1) used css_tryget_online(), the parent (or ancestor if it happened to be offlined too) could be the fallback. > What's your opinion? The question here is how long would stay the offlined blkcgs around if they were directly pinned upon the IO submission. If it's unbound, then reparenting makes more sense. Michal
next prev parent reply other threads:[~2021-11-26 14:49 UTC|newest] Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-11-25 17:28 Use after free with BFQ and cgroups Jan Kara 2021-11-26 14:47 ` Michal Koutný [this message] 2021-11-26 14:47 ` Michal Koutný 2021-11-29 17:11 ` Jan Kara 2021-11-29 17:11 ` Jan Kara 2021-12-09 2:23 ` yukuai (C) 2021-12-09 2:23 ` yukuai (C) 2021-12-09 15:33 ` Paolo Valente 2021-12-09 15:33 ` Paolo Valente 2021-12-13 17:33 ` Jan Kara 2021-12-14 1:24 ` yukuai (C) 2021-12-14 1:24 ` yukuai (C) 2021-12-20 18:38 ` Jan Kara 2021-12-20 18:38 ` Jan Kara 2021-12-22 15:21 ` Jan Kara 2021-12-22 15:21 ` Jan Kara 2021-12-23 1:02 ` yukuai (C) 2021-12-23 1:02 ` yukuai (C) 2021-12-23 17:13 ` Jan Kara 2021-12-23 17:13 ` Jan Kara 2021-11-29 17:12 ` Tejun Heo 2021-11-30 11:50 ` Jan Kara 2021-11-30 16:22 ` Tejun Heo 2021-11-30 16:22 ` Tejun Heo
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20211126144724.GA31093@blackbody.suse.cz \ --to=mkoutny@suse.com \ --cc=cgroups@vger.kernel.org \ --cc=fvogdt@suse.de \ --cc=jack@suse.cz \ --cc=linux-block@vger.kernel.org \ --cc=paolo.valente@linaro.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.