From: Mike Christie <michael.christie@oracle.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	linux@leemhuis.info, nicolas.dichtel@6wind.com, axboe@kernel.dk,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, mst@redhat.com,
	sgarzare@redhat.com, jasowang@redhat.com, stefanha@redhat.com,
	brauner@kernel.org
Subject: Re: [CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
Date: Tue, 6 Jun 2023 10:57:46 -0500	[thread overview]
Message-ID: <39f5913c-e658-e476-0378-62236bb4ed49@oracle.com> (raw)
In-Reply-To: <20230606121643.GD7542@redhat.com>

On 6/6/23 7:16 AM, Oleg Nesterov wrote:
> On 06/05, Mike Christie wrote:
>>
>> On 6/5/23 10:10 AM, Oleg Nesterov wrote:
>>> On 06/03, michael.christie@oracle.com wrote:
>>>>
>>>> On 6/2/23 11:15 PM, Eric W. Biederman wrote:
>>>> The problem is that, as part of the flush, the drivers/vhost/scsi.c code
>>>> will wait for outstanding commands, because we can't free the device and
>>>> its resources before the commands complete, or we will hit a
>>>> use-after-free bug.
>>>
>>> ignoring send-fd/clone issues, can we assume that the final fput/release
>>> should always come from vhost_worker's sub-thread (which shares mm/etc) ?
>>
>> I think I'm misunderstanding the sub-thread term.
>>
>> - Is it the task_struct's context that we did the
>> kernel/vhost_task.c:vhost_task_create() from? Below it would be the
>> thread we did VHOST_SET_OWNER from.
> 
> Yes,
> 
>> So it works as if we were still using a kthread:
>>
>> 1. Userspace thread0 opens /dev/vhost-$something.
>> 2. thread0 does VHOST_SET_OWNER ioctl. This calls vhost_task_create() to
>> create the task_struct which runs the vhost_worker() function which handles
>> the work->fns.
>> 3. If userspace now does a SIGKILL or just exits without doing a close() on
>> /dev/vhost-$something, then when thread0 does exit_files() that will do the
>> fput that does vhost-$something's file_operations->release.
> 
> So, at least in this simple case vhost_worker() can just exit after SIGKILL,
> and thread0 can flush the outstanding commands when it calls vhost_dev_flush()
> rather than wait for vhost_worker().
> 
> Right?

With the current code, the answer is no. We would hang like I mentioned here:

https://lore.kernel.org/lkml/ae250076-7d55-c407-1066-86b37014c69c@oracle.com/

We need to add code like I mentioned in that reply, because we don't have a
way to call into the layers below us to flush those commands. What we need is
more of an "abort, and don't call back into us" type of operation. Alternatively,
I'm trying to add a check where we detect what happened and then, instead of
trying to use the vhost_task, complete the command in the context the lower
level completes us in.


Thread overview: 49+ messages
2023-06-01 18:32 [PATCH 1/1] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression Mike Christie
2023-06-01 19:11 ` Michael S. Tsirkin
2023-06-02  0:43 ` Eric W. Biederman
2023-06-02 14:34 ` Nicolas Dichtel
2023-06-02 19:22 ` Oleg Nesterov
2023-06-03  3:44   ` Eric W. Biederman
2023-06-05 13:26     ` Oleg Nesterov
2023-06-03  4:15   ` [CFT][PATCH v3] " Eric W. Biederman
2023-06-04  3:28     ` michael.christie
2023-06-05 15:10       ` Oleg Nesterov
2023-06-05 15:46         ` Mike Christie
2023-06-06 12:16           ` Oleg Nesterov
2023-06-06 15:57             ` Mike Christie [this message]
2023-06-06 19:39               ` Oleg Nesterov
2023-06-06 20:38                 ` Mike Christie
2023-06-14  6:02                   ` Can vhost translate to io_uring? Eric W. Biederman
2023-06-14  6:25                     ` michael.christie
2023-06-14 14:30                       ` Jens Axboe
2023-06-14 17:59                       ` Mike Christie
2023-06-14 14:20                     ` Michael S. Tsirkin
2023-06-14 15:02                     ` Michael S. Tsirkin
2023-06-11 20:27                 ` [CFT][PATCH v3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression Eric W. Biederman
2023-06-14 17:08                   ` Oleg Nesterov
2023-06-05 12:38     ` Oleg Nesterov
2023-06-05 13:48 ` [PATCH 1/1] " Oleg Nesterov
