From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93C5EC77B75 for ; Mon, 15 May 2023 14:24:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239091AbjEOOX6 (ORCPT ); Mon, 15 May 2023 10:23:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229568AbjEOOXz (ORCPT ); Mon, 15 May 2023 10:23:55 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC28FF4 for ; Mon, 15 May 2023 07:23:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3FED561DE3 for ; Mon, 15 May 2023 14:23:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1CE93C433EF; Mon, 15 May 2023 14:23:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1684160633; bh=BS6JVi3U6AfGKU8Juy8abcYENm4+fIaoNY9mN6+Bwy8=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=GrK32zAplnlrqy40OU3XJSBXClfmG9O7ZOlTlNfLZCNAlZqPWij/MvKdYbCd95gR2 aqBJ9Phg/BhKmSH3JKt+kK36Yp/sCMMpu/hWfqT82d/Jb2G4M3mJusvu6e3ksv2lpE pRedKzAVyDGXLVPsGb1RLCwMmDe33hoWngAKowOcuFqmAj6ue952HtF7GcJd31dx8S GaX/sf4DiWvSy52m27xRONQ8jgx9Z9kAUXUy8FoHrvVXYrv/R0FQntd8ziWJx9Zt9x u40aiCpN4LNpF3i/wJXZaAFB4YJPNraVBZqlzdMwTf3oHdOkbawI64usc7zsQu3nJH Xl2vfX5SxQyRA== Date: Mon, 15 May 2023 16:23:46 +0200 From: Christian Brauner To: Linus Torvalds Cc: Thorsten Leemhuis , Mike Christie , nicolas.dichtel@6wind.com, Linux kernel regressions list , hch@infradead.org, stefanha@redhat.com, jasowang@redhat.com, mst@redhat.com, sgarzare@redhat.com, virtualization@lists.linux-foundation.org, ebiederm@xmission.com, konrad.wilk@oracle.com, linux-kernel@vger.kernel.org, Jens Axboe Subject: Re: [PATCH v11 8/8] vhost: use vhost_tasks for worker threads Message-ID: <20230515-vollrausch-liebgeworden-2765f3ca3540@brauner> References: <20230202232517.8695-1-michael.christie@oracle.com> <20230202232517.8695-9-michael.christie@oracle.com> <78c5e150-26cf-7724-74ee-4a0b16b944b1@oracle.com> <48842e92-835e-bc3f-7118-48b8f415f532@leemhuis.info> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, May 13, 2023 at 10:08:04AM -0500, Linus Torvalds wrote: > On Sat, May 13, 2023 at 7:39 AM Thorsten Leemhuis wrote: > > > > Jumping in here, as I found another problem with that patch: it broke > > s2idle on my laptop when a qemu-kvm VM is running, as freezing user > > space processes now fails for me: > > Hmm. kthreads have PF_NOFREEZE by default, which is probably the reason. > > Adding > > current->flags |= PF_NOFREEZE; > > to the vhost_task setup might just fix it, but it feels a bit off. > > The way io_uring does this is to do > > if (signal_pending(current)) { > struct ksignal ksig; > > if (!get_signal(&ksig)) > continue; > break; > } > > in the main loop, which ends up handling the freezer situation too. > But it should handle things like SIGSTOP etc as well, and also exit on > actual signals. > > I get the feeling that the whole "vhost_task_should_stop()" logic > should have the exact logic above, and basically make those threads > killable as well. > > Hmm? I'm still trying to catch up after LSFMM with everything that's happened on the fs side so coming back to this thread with a fresh set of eyes is difficult. Sorry about the delay here. So we seem to two immediate issues: (1) The current logic breaks ps output because vhost creates helper processes instead of threads. The suggested patch by Mike was to make them proper threads again but somehow special threads in the sense that they don't unshare signal handlers. The latter part is possibly broken and seems hacky. (That's earlier in the thread.) (2) Freezing of vhost tasks fails. (This mail.) So I think we will be able to address (1) and (2) by making vhost tasks proper threads and blocking every signal except for SIGKILL and SIGSTOP and then having vhost handle get_signal() - as you mentioned - the same way io uring already does. We should also remove the ingore_signals thing completely imho. I don't think we ever want to do this with user workers. @Mike, can you get a patch ready ideally this week so we can get this fixed soon?