linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "yukuai (C)" <yukuai3@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: "Michal Koutný" <mkoutny@suse.com>,
	"Paolo Valente" <paolo.valente@linaro.org>,
	linux-block@vger.kernel.org, fvogt@suse.de,
	cgroups@vger.kernel.org
Subject: Re: Use after free with BFQ and cgroups
Date: Tue, 14 Dec 2021 09:24:58 +0800	[thread overview]
Message-ID: <896161a3-922c-63b3-6e6f-9f6005a46bd4@huawei.com> (raw)
In-Reply-To: <20211213173354.GE14044@quack2.suse.cz>

在 2021/12/14 1:33, Jan Kara 写道:
> Hello!
> 
> On Thu 09-12-21 10:23:33, yukuai (C) wrote:
>> 在 2021/11/30 1:11, Jan Kara 写道:
>>> On Fri 26-11-21 15:47:24, Michal Koutný wrote:
>>>> Hello.
>>>>
>>>> On Thu, Nov 25, 2021 at 06:28:09PM +0100, Jan Kara <jack@suse.cz> wrote:
>>>> [...]
>>>> +Cc cgroups ML
>>>> https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/
>>>>
>>>>
>>>> I understand there are more objects than blkcgs but I assume it can
>>>> eventually boil down to blkcg references, so I suggest another
>>>> alternative. (But I may easily miss the relations between BFQ objects,
>>>> so consider this only high-level opinion.)
>>>>
>>>>> After some poking, looking into crashdumps, and applying some debug patches
>>>>> the following seems to be happening: We have a process P in blkcg G. Now
>>>>> G is taken offline so bfq_group is cleaned up in bfq_pd_offline() but P
>>>>> still holds reference to G from its bfq_queue. Then P submits IO, G gets
>>>>> inserted into service tree despite being already offline.
>>>>
>>>> (If G is offline, P can only be zombie, just saying. (I guess it can
>>>> still be Q's IO on behalf of G.))
>>>>
>>>> IIUC, the reference to G is only held by P. If the G reference is copied
>>>> into another structure (the service tree) it should get another
>>>> reference. My naïve proposal would be css_get(). (1)
>>>
>>> So I was looking into this puzzle. The answer is following:
>>>
>>> The process P (podman, pid 2571) is currently attached to the root cgroup
>>> but it has io_context with BFQ queue that points to the already-offline G
>>> as a parent. The bio is thus associated with the root cgroup (via
>>> bio->bi_blkg) but BFQ uses io_context to lookup the BFQ queue where IO
>>> should be queued and then uses its parent to determine blkg which it should
>>> be charged and thus gets to the dying cgroup.
>>
>> After some code review, we found that the root cause of the problem
>> semms to be different.
>>
>> If the process is moved from group G to root group, and a new io is
>> issued from the process, then bfq should detect this and changing
>> bfq_queue's parent to root bfq_group:
>>
>> bfq_insert_request
>>   bfq_init_rq
>>    bfq_bic_update_cgroup
>>     serial_nr = __bio_blkcg(bio)->css.serial_nr; -> from root group
>>     bic->blkcg_serial_nr == serial_nr -> this do not pass,because
>> bic->blkcg_serial_nr is still from group G
> 
> So in the crashdump I have available, I can see that
> _bio_blkcg(bio)->css.serial_nr is 4. Also bic->blkcg_serial_nr is 4. But
> bic->bfqq[1] is a bfq_queue that has its entity->parent set to already
> offlined bfq_group. Not sure how that is possible...
> 
>>     __bfq_bic_change_cgroup -> bfq_queue parent will be changed to root group
>>
>> And we think the following path is possible to trigger the problem:
>>
>> 1) process P1 and P2 is currently in cgroup C1, corresponding to
>> bfq_queue q1, q2 and bfq_group g1. And q1 and q2 are merged:
>> q1->next_bfqq = q2.
> 
> I agree shared queues are some factor in this - the problematic bfq_queue
> pointing to the dead bfq_group has 'coop' flag set, pid == -1, bic ==
> NULL. So clearly it has been merged with another bfq_queue.
> 
>> 2) move P1 from C1 to root_cgroup, q1->next_bfqq is still q2
>> and flag BFQQF_split_coop is not set yet.
> 
> There's no next_bfqq in the kernel I'm looking into... Generally the merge
> code seems to be working somewhat differently to what you describe (I'm
> looking into 5.16-rc3 kernel).

Hi, Jan

Sorry, It should be new_bfqq, which will be set if the queue is merged
to other queue.
> 
>> 3) P2 exit, q2 won't exit because it's still referenced through
>> queue merge.
>>
>> 4) delete C1, g1 is offlined
>>
>> 5) issue a new io in q1, q1's parent entity will change to root,
>> however the io will end up in q1->next_bfqq = q2, and thus the
>> offlined g1 is inserted to service tree through q2.
>>
>> 6) P1 exit, q2 exit, and finially g1 is freed, while g1 is still
>> in service tree of it's parent.
>>
>> We confirmed this by our reproducer through a simple patch:
>> stop merging bfq_queues if their parents are different.
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 1ce1a99a7160..14c1d1c3811e 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -2626,6 +2626,11 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct
>> bfq_queue *new_bfqq)
>>          while ((__bfqq = new_bfqq->new_bfqq)) {
>>                  if (__bfqq == bfqq)
>>                          return NULL;
>> +               if (__bfqq->entity.parent != bfqq->entity.parent) {
>> +                       if (bfq_bfqq_coop(__bfqq))
>> +                               bfq_mark_bfqq_split_coop(__bfqq);
>> +                       return NULL;
>> +               }
>>                  new_bfqq = __bfqq;
>>          }
>>
>> @@ -2825,8 +2830,16 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct
>> bfq_queue *bfqq,
>>          if (bfq_too_late_for_merging(bfqq))
>>                  return NULL;
>>
>> -       if (bfqq->new_bfqq)
>> -               return bfqq->new_bfqq;
>> +       if (bfqq->new_bfqq) {
>> +               struct bfq_queue *new_bfqq = bfqq->new_bfqq;
>> +
>> +               if(bfqq->entity.parent == new_bfqq->entity.parent)
>> +                       return new_bfqq;
>> +
>> +               if(bfq_bfqq_coop(new_bfqq))
>> +                       bfq_mark_bfqq_split_coop(new_bfqq);
>> +               return NULL;
>> +       }
>>
>> Do you think this analysis is correct?
> 
> Honestly, I'm not sure. At this point I'm not sure why the bfqq pointed to
> from bic didn't get reparented to the new cgroup when bio was submitted...

After queue merging, bic is set to the new bfqq, for example:

__bfq_insert_request
  new_bfqq = bfq_setup_cooperator
  bfq_merge_bfqqs -> before this is done, bic is set to old bfqq
   bic_set_bfqq(bic, new_bfqq, 1); -> bic is set to new bfqq
  rq->elv.priv[1] = new_bfqq;

I think current problem is that if bfq_queue is merged, task migration
won't break such cooperation,thus issue io from one cgroup may endup to
a bfq_queue that is from another cgroup, which might be offlined.

Thanks,
Kuai
> 
> 								Honza
> 

  reply	other threads:[~2021-12-14  1:25 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-25 17:28 Use after free with BFQ and cgroups Jan Kara
2021-11-26 14:47 ` Michal Koutný
2021-11-29 17:11   ` Jan Kara
2021-12-09  2:23     ` yukuai (C)
2021-12-09 15:33       ` Paolo Valente
2021-12-13 17:33       ` Jan Kara
2021-12-14  1:24         ` yukuai (C) [this message]
2021-12-20 18:38           ` Jan Kara
2021-12-22 15:21       ` Jan Kara
2021-12-23  1:02         ` yukuai (C)
2021-12-23 17:13           ` Jan Kara
2021-11-29 17:12   ` Tejun Heo
2021-11-30 11:50     ` Jan Kara
2021-11-30 16:22       ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=896161a3-922c-63b3-6e6f-9f6005a46bd4@huawei.com \
    --to=yukuai3@huawei.com \
    --cc=cgroups@vger.kernel.org \
    --cc=fvogt@suse.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=mkoutny@suse.com \
    --cc=paolo.valente@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).