live-patching.vger.kernel.org archive mirror
From: Miroslav Benes <mbenes@suse.cz>
To: Josef Bacik <josef@toxicpanda.com>
Cc: xiaojun.zhao141@gmail.com, linux-kernel@vger.kernel.org,
	live-patching@vger.kernel.org
Subject: Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
Date: Thu, 15 Apr 2021 10:37:44 +0200 (CEST)
Message-ID: <alpine.LSU.2.21.2104151026100.15642@pobox.suse.cz>
In-Reply-To: <f7698105-23a4-4558-7b65-9116e8587848@toxicpanda.com>

On Wed, 14 Apr 2021, Josef Bacik wrote:

> On 4/14/21 11:21 AM, xiaojun.zhao141@gmail.com wrote:
> > On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> > Miroslav Benes <mbenes@suse.cz> wrote:
> > 
> >> Hi,
> >>
> >> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
> >>
> > >>> I found that the qemu-nbd process (started with qemu-nbd -t -c /dev/nbd0
> > >>> nbd.qcow2) exits on its own when I apply a livepatch to nbd
> > >>> functions.
> > >>>
> > >>> The relevant nbd source:
> > >>> static int nbd_start_device_ioctl(struct nbd_device *nbd,
> > >>> 				  struct block_device *bdev)
> > >>> {
> > >>> 	struct nbd_config *config = nbd->config;
> > >>> 	int ret;
> > >>>
> > >>> 	ret = nbd_start_device(nbd);
> > >>> 	if (ret)
> > >>> 		return ret;
> > >>>
> > >>> 	if (max_part)
> > >>> 		bdev->bd_invalidated = 1;
> > >>> 	mutex_unlock(&nbd->config_lock);
> > >>> 	ret = wait_event_interruptible(config->recv_wq,
> > >>> 			atomic_read(&config->recv_threads) == 0);
> > >>> 	if (ret)
> > >>> 		sock_shutdown(nbd);
> > >>> 	flush_workqueue(nbd->recv_workq);
> > >>>
> > >>> 	mutex_lock(&nbd->config_lock);
> > >>> 	nbd_bdev_reset(bdev);
> > >>> 	/* user requested, ignore socket errors */
> > >>> 	if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> > >>> 		ret = 0;
> > >>> 	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> > >>> 		ret = -ETIMEDOUT;
> > >>> 	return ret;
> > >>> }
> >>
> > >> So my understanding is that nbd spawns a number
> > >> (config->recv_threads) of workqueue jobs and then waits for them to
> > >> finish. It waits interruptibly. Now, any signal would make
> > >> wait_event_interruptible() return -ERESTARTSYS. The livepatch fake
> > >> signal is no exception there. The error is then propagated back to
> > >> userspace, unless a user requested a disconnection or a timeout
> > >> is set. How does userspace then react to it? Is
> > >> _interruptible there because userspace sends a signal when
> > >> NBD_RT_DISCONNECT_REQUESTED is set? How does userspace handle
> > >> ordinary signals? This all sounds a bit strange, but I may easily be
> > >> missing something.
> >>
> > >>> While nbd waits for atomic_read(&config->recv_threads) == 0, klp
> > >>> sends a fake signal to the task and the qemu-nbd process then exits.
> > >>> The sysfs signal attribute that controlled this behaviour was removed
> > >>> in commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Is
> > >>> there another way to control this behaviour? How?
> >>
> >> No, there is no way currently. We send a fake signal automatically.
> >>
> >> Regards
> >> Miroslav
> > > An I/O error occurs on the nbd device when I apply a livepatch to
> > > nbd, and I guess that a livepatch to any other kernel code may cause
> > > the same I/O error. For now I have decided to work around the problem by
> > > adding a livepatch for klp itself that disables the automatic fake signal.
> > 
> 
> Would wait_event_killable() fix this problem?  I'm not sure any client
> implementations depend on being able to send other signals to the client
> process, so it should be safe from that standpoint.  Not sure if the livepatch
> thing would still get an error at that point tho.  Thanks,

wait_event_killable() means that you would sleep uninterruptibly (while
still reacting to fatal signals), so the fake signal from livepatch would
not wake the task at all: set_notify_signal() only wakes TASK_INTERRUPTIBLE
tasks. There would be no disruption for userspace and it would fix this
problem.
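
To illustrate the state-mask argument above, here is a minimal userspace
sketch. It is not kernel code: struct model_task and the model_* helpers
are hypothetical stand-ins, and the state bit values are assumed to mirror
include/linux/sched.h. It only models the point that a wake-up restricted
to TASK_INTERRUPTIBLE does not match a sleeper in TASK_KILLABLE, while a
fatal signal (which also carries TASK_WAKEKILL) does.

```c
#include <stdbool.h>

/* Task state bits; assumed to mirror include/linux/sched.h. */
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_WAKEKILL        0x0100
#define TASK_KILLABLE        (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for a sleeping task, not the kernel's task_struct. */
struct model_task {
	unsigned int state;   /* 0 models TASK_RUNNING */
	bool notify_signal;   /* models the TIF_NOTIFY_SIGNAL flag */
};

/* Models wake_up_state(): wake the task only if its state matches the mask. */
static bool model_wake_up_state(struct model_task *t, unsigned int mask)
{
	if (!(t->state & mask))
		return false;
	t->state = 0;
	return true;
}

/*
 * Models the livepatch fake signal: the flag is set, but only a
 * TASK_INTERRUPTIBLE sleeper is woken.
 */
static bool model_fake_signal(struct model_task *t)
{
	t->notify_signal = true;
	return model_wake_up_state(t, TASK_INTERRUPTIBLE);
}

/* Models a fatal signal: the wake-up mask also includes TASK_WAKEKILL. */
static bool model_fatal_signal(struct model_task *t)
{
	return model_wake_up_state(t, TASK_INTERRUPTIBLE | TASK_WAKEKILL);
}
```

In this model, a task sleeping with state TASK_INTERRUPTIBLE (as in
wait_event_interruptible()) is woken by model_fake_signal(), whereas a task
sleeping with state TASK_KILLABLE (as in wait_event_killable()) stays
asleep until model_fatal_signal() is called.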

There is a catch on the livepatch side of things. If there is a live patch
for nbd_start_device_ioctl(), the transition would get stuck until the task
leaves the function (that is, until all workqueue jobs are processed). I
gather the wait is unlikely to be indefinite, so we can live with that, I
think.

Miroslav
