live-patching.vger.kernel.org archive mirror
* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
       [not found] <20210414115548.0cdb529b@slime>
@ 2021-04-14 11:27 ` Miroslav Benes
  2021-04-14 14:52   ` xiaojun.zhao141
  2021-04-14 15:21   ` xiaojun.zhao141
  0 siblings, 2 replies; 6+ messages in thread
From: Miroslav Benes @ 2021-04-14 11:27 UTC (permalink / raw)
  To: xiaojun.zhao141; +Cc: josef, linux-kernel, live-patching

Hi,

On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:

> I found that the qemu-nbd process (started with qemu-nbd -t -c /dev/nbd0
> nbd.qcow2) exits automatically when I livepatch functions of the nbd
> driver.
> 
> The relevant nbd source:
> static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
> {                                                                               
>         struct nbd_config *config = nbd->config;                                
>         int ret;                                                                
>                                                                                 
>         ret = nbd_start_device(nbd);                                            
>         if (ret)                                                                
>                 return ret;                                                     
>                                                                                 
>         if (max_part)                                                           
>                 bdev->bd_invalidated = 1;                                       
>         mutex_unlock(&nbd->config_lock);                                        
>         ret = wait_event_interruptible(config->recv_wq,                         
>                                          atomic_read(&config->recv_threads) == 0);
>         if (ret)                                                                
>                 sock_shutdown(nbd);                                             
>         flush_workqueue(nbd->recv_workq);                                       
>                                                                                 
>         mutex_lock(&nbd->config_lock);                                          
>         nbd_bdev_reset(bdev);                                                   
>         /* user requested, ignore socket errors */                              
>         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))      
>                 ret = 0;                                                        
>         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))                  
>                 ret = -ETIMEDOUT;                                               
>         return ret;                                                             
> }

So my understanding is that nbd spawns a number (config->recv_threads) of 
workqueue jobs and then waits for them to finish. It waits interruptibly. 
Now, any signal would make wait_event_interruptible() return 
-ERESTARTSYS. The livepatch fake signal is no exception. The error is 
then propagated back to userspace, unless a user requested a 
disconnection (ret is reset to 0) or NBD_RT_TIMEDOUT is set (ret becomes 
-ETIMEDOUT). How does userspace then react to it? Is _interruptible 
there because userspace sends a signal when NBD_RT_DISCONNECT_REQUESTED 
is set? How does userspace handle ordinary signals? This all sounds a 
bit strange, but I may easily be missing something.
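
As a rough userspace analogy (not kernel code), the sketch below shows
the same pattern: a blocking wait comes back early with an error as soon
as a handled signal arrives. In the kernel the wait returns -ERESTARTSYS;
here nanosleep() fails with EINTR. The helper name wait_interruptibly()
and its parameters are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGALRM non-fatal. */
static void on_alarm(int signo)
{
	(void)signo;
}

/*
 * Hypothetical helper: sleep for sleep_s seconds while SIGALRM is
 * scheduled to arrive after alarm_s seconds. Returns 0 if the sleep ran
 * to completion, or a negative errno value if it was cut short.
 */
int wait_interruptibly(unsigned int alarm_s, unsigned int sleep_s)
{
	struct sigaction sa;
	struct timespec ts = { .tv_sec = sleep_s, .tv_nsec = 0 };

	/* The signal must be due before the sleep would end on its own. */
	assert(alarm_s > 0 && alarm_s < sleep_s);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -errno;

	alarm(alarm_s);
	if (nanosleep(&ts, NULL) == -1)
		return -errno;	/* -EINTR when the signal arrived first */
	return 0;
}
```

Calling wait_interruptibly(1, 3) comes back after about one second with
-EINTR instead of sleeping the full three seconds. In the nbd function
quoted above, the corresponding non-zero return is what leads to
sock_shutdown(nbd).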

> While nbd waits for atomic_read(&config->recv_threads) == 0, klp
> sends a fake signal to the task and the qemu-nbd process then exits. The
> sysfs signal attribute that controlled this behavior was removed in
> commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Is there
> any other way to control it? How?

No, there is no way currently. We send a fake signal automatically.

Regards
Miroslav


* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
  2021-04-14 11:27 ` the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks' Miroslav Benes
@ 2021-04-14 14:52   ` xiaojun.zhao141
  2021-04-14 15:21   ` xiaojun.zhao141
  1 sibling, 0 replies; 6+ messages in thread
From: xiaojun.zhao141 @ 2021-04-14 14:52 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: xiaojun.zhao141, josef, linux-kernel, live-patching

On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
Miroslav Benes <mbenes@suse.cz> wrote:

> Hi,
> 
> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
> 
> > I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
> > nbd.qcow2) will automatically exit when I patched for functions of
> > the nbd with livepatch.
> > 
> > The nbd relative source:
> > static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
> > {
> >         struct nbd_config *config = nbd->config;
> >         int ret;
> >
> >         ret = nbd_start_device(nbd);
> >         if (ret)
> >                 return ret;
> >
> >         if (max_part)
> >                 bdev->bd_invalidated = 1;
> >         mutex_unlock(&nbd->config_lock);
> >         ret = wait_event_interruptible(config->recv_wq,
> >                                          atomic_read(&config->recv_threads) == 0);
> >         if (ret)
> >                 sock_shutdown(nbd);
> >         flush_workqueue(nbd->recv_workq);
> >
> >         mutex_lock(&nbd->config_lock);
> >         nbd_bdev_reset(bdev);
> >         /* user requested, ignore socket errors */
> >         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >                 ret = 0;
> >         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >                 ret = -ETIMEDOUT;
> >         return ret;
> > }
> 
> So my understanding is that ndb spawns a number
> (config->recv_threads) of workqueue jobs and then waits for them to
> finish. It waits interruptedly. Now, any signal would make
> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
> signal is no exception there. The error is then propagated back to
> the userspace. Unless a user requested a disconnection or there is
> timeout set. How does the userspace then reacts to it? Is
> _interruptible there because the userspace sends a signal in case of
> NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
> ordinary signals? This all sounds a bit strange, but I may be missing
> something easily.
>
Sorry, I don't know yet how qemu-nbd handles these signals either. I
need to look at its source.

Thank you very much.

> > When the nbd waits for atomic_read(&config->recv_threads) == 0, the
> > klp will send a fake signal to it then the qemu-nbd process exits.
> > And the signal of sysfs to control this action was removed in the
> > commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Are
> > there other ways to control this action? How?  
> 
> No, there is no way currently. We send a fake signal automatically.
> 
> Regards
> Miroslav



* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
  2021-04-14 11:27 ` the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks' Miroslav Benes
  2021-04-14 14:52   ` xiaojun.zhao141
@ 2021-04-14 15:21   ` xiaojun.zhao141
  2021-04-14 17:21     ` Josef Bacik
  1 sibling, 1 reply; 6+ messages in thread
From: xiaojun.zhao141 @ 2021-04-14 15:21 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: xiaojun.zhao141, josef, linux-kernel, live-patching

On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
Miroslav Benes <mbenes@suse.cz> wrote:

> Hi,
> 
> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
> 
> > I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
> > nbd.qcow2) will automatically exit when I patched for functions of
> > the nbd with livepatch.
> > 
> > The nbd relative source:
> > static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
> > {
> >         struct nbd_config *config = nbd->config;
> >         int ret;
> >
> >         ret = nbd_start_device(nbd);
> >         if (ret)
> >                 return ret;
> >
> >         if (max_part)
> >                 bdev->bd_invalidated = 1;
> >         mutex_unlock(&nbd->config_lock);
> >         ret = wait_event_interruptible(config->recv_wq,
> >                                          atomic_read(&config->recv_threads) == 0);
> >         if (ret)
> >                 sock_shutdown(nbd);
> >         flush_workqueue(nbd->recv_workq);
> >
> >         mutex_lock(&nbd->config_lock);
> >         nbd_bdev_reset(bdev);
> >         /* user requested, ignore socket errors */
> >         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >                 ret = 0;
> >         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >                 ret = -ETIMEDOUT;
> >         return ret;
> > }
> 
> So my understanding is that ndb spawns a number
> (config->recv_threads) of workqueue jobs and then waits for them to
> finish. It waits interruptedly. Now, any signal would make
> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
> signal is no exception there. The error is then propagated back to
> the userspace. Unless a user requested a disconnection or there is
> timeout set. How does the userspace then reacts to it? Is
> _interruptible there because the userspace sends a signal in case of
> NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
> ordinary signals? This all sounds a bit strange, but I may be missing
> something easily.
> 
> > When the nbd waits for atomic_read(&config->recv_threads) == 0, the
> > klp will send a fake signal to it then the qemu-nbd process exits.
> > And the signal of sysfs to control this action was removed in the
> > commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Are
> > there other ways to control this action? How?  
> 
> No, there is no way currently. We send a fake signal automatically.
> 
> Regards
> Miroslav

An I/O error occurs on the nbd device when I livepatch nbd, and I guess
that a livepatch of any other kernel code may cause the same I/O error.
For now I have decided to work around the problem by adding a livepatch
for klp itself that disables the automatic fake signal.

Regards.



* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
  2021-04-14 15:21   ` xiaojun.zhao141
@ 2021-04-14 17:21     ` Josef Bacik
  2021-04-15  6:27       ` xiaojun.zhao141
  2021-04-15  8:37       ` Miroslav Benes
  0 siblings, 2 replies; 6+ messages in thread
From: Josef Bacik @ 2021-04-14 17:21 UTC (permalink / raw)
  To: xiaojun.zhao141, Miroslav Benes; +Cc: linux-kernel, live-patching

On 4/14/21 11:21 AM, xiaojun.zhao141@gmail.com wrote:
> On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> Miroslav Benes <mbenes@suse.cz> wrote:
> 
>> Hi,
>>
>> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
>>
>>> I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
>>> nbd.qcow2) will automatically exit when I patched for functions of
>>> the nbd with livepatch.
>>>
>>> The nbd relative source:
>>> static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
>>> {
>>>         struct nbd_config *config = nbd->config;
>>>         int ret;
>>>
>>>         ret = nbd_start_device(nbd);
>>>         if (ret)
>>>                 return ret;
>>>
>>>         if (max_part)
>>>                 bdev->bd_invalidated = 1;
>>>         mutex_unlock(&nbd->config_lock);
>>>         ret = wait_event_interruptible(config->recv_wq,
>>>                                          atomic_read(&config->recv_threads) == 0);
>>>         if (ret)
>>>                 sock_shutdown(nbd);
>>>         flush_workqueue(nbd->recv_workq);
>>>
>>>         mutex_lock(&nbd->config_lock);
>>>         nbd_bdev_reset(bdev);
>>>         /* user requested, ignore socket errors */
>>>         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
>>>                 ret = 0;
>>>         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
>>>                 ret = -ETIMEDOUT;
>>>         return ret;
>>> }
>>
>> So my understanding is that ndb spawns a number
>> (config->recv_threads) of workqueue jobs and then waits for them to
>> finish. It waits interruptedly. Now, any signal would make
>> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
>> signal is no exception there. The error is then propagated back to
>> the userspace. Unless a user requested a disconnection or there is
>> timeout set. How does the userspace then reacts to it? Is
>> _interruptible there because the userspace sends a signal in case of
>> NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
>> ordinary signals? This all sounds a bit strange, but I may be missing
>> something easily.
>>
>>> When the nbd waits for atomic_read(&config->recv_threads) == 0, the
>>> klp will send a fake signal to it then the qemu-nbd process exits.
>>> And the signal of sysfs to control this action was removed in the
>>> commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Are
>>> there other ways to control this action? How?
>>
>> No, there is no way currently. We send a fake signal automatically.
>>
>> Regards
>> Miroslav
> It occurs IO error of the nbd device when I use livepatch of the
> nbd, and I guess that any livepatch on other kernel source maybe cause
> the IO error. Well, now I decide to workaround for this problem by
> adding a livepatch for the klp to disable a automatic fake signal.
> 

Would wait_event_killable() fix this problem?  I'm not aware of any client 
implementations that depend on being able to send other signals to the client 
process, so it should be safe from that standpoint.  I'm not sure whether the 
livepatch side would still get an error at that point, though.  Thanks,

Josef


* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
  2021-04-14 17:21     ` Josef Bacik
@ 2021-04-15  6:27       ` xiaojun.zhao141
  2021-04-15  8:37       ` Miroslav Benes
  1 sibling, 0 replies; 6+ messages in thread
From: xiaojun.zhao141 @ 2021-04-15  6:27 UTC (permalink / raw)
  To: Josef Bacik; +Cc: xiaojun.zhao141, Miroslav Benes, linux-kernel, live-patching

On Wed, 14 Apr 2021 13:21:37 -0400
Josef Bacik <josef@toxicpanda.com> wrote:

> On 4/14/21 11:21 AM, xiaojun.zhao141@gmail.com wrote:
> > On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> > Miroslav Benes <mbenes@suse.cz> wrote:
> >   
> >> Hi,
> >>
> >> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
> >>  
> >>> I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
> >>> nbd.qcow2) will automatically exit when I patched for functions of
> >>> the nbd with livepatch.
> >>>
> >>> The nbd relative source:
> >>> static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
> >>> {
> >>>         struct nbd_config *config = nbd->config;
> >>>         int ret;
> >>>
> >>>         ret = nbd_start_device(nbd);
> >>>         if (ret)
> >>>                 return ret;
> >>>
> >>>         if (max_part)
> >>>                 bdev->bd_invalidated = 1;
> >>>         mutex_unlock(&nbd->config_lock);
> >>>         ret = wait_event_interruptible(config->recv_wq,
> >>>                                          atomic_read(&config->recv_threads) == 0);
> >>>         if (ret)
> >>>                 sock_shutdown(nbd);
> >>>         flush_workqueue(nbd->recv_workq);
> >>>
> >>>         mutex_lock(&nbd->config_lock);
> >>>         nbd_bdev_reset(bdev);
> >>>         /* user requested, ignore socket errors */
> >>>         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >>>                 ret = 0;
> >>>         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >>>                 ret = -ETIMEDOUT;
> >>>         return ret;
> >>> }
> >>
> >> So my understanding is that ndb spawns a number
> >> (config->recv_threads) of workqueue jobs and then waits for them to
> >> finish. It waits interruptedly. Now, any signal would make
> >> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
> >> signal is no exception there. The error is then propagated back to
> >> the userspace. Unless a user requested a disconnection or there is
> >> timeout set. How does the userspace then reacts to it? Is
> >> _interruptible there because the userspace sends a signal in case
> >> of NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
> >> ordinary signals? This all sounds a bit strange, but I may be
> >> missing something easily.
> >>  
> >>> When the nbd waits for atomic_read(&config->recv_threads) == 0,
> >>> the klp will send a fake signal to it then the qemu-nbd process
> >>> exits. And the signal of sysfs to control this action was removed
> >>> in the commit 10b3d52790e 'livepatch: Remove signal sysfs
> >>> attribute'. Are there other ways to control this action? How?  
> >>
> >> No, there is no way currently. We send a fake signal automatically.
> >>
> >> Regards
> >> Miroslav  
> > It occurs IO error of the nbd device when I use livepatch of the
> > nbd, and I guess that any livepatch on other kernel source maybe
> > cause the IO error. Well, now I decide to workaround for this
> > problem by adding a livepatch for the klp to disable a automatic
> > fake signal. 
> 
> Would wait_event_killable() fix this problem?  I'm not sure any
> client implementations depend on being able to send other signals to
> the client process, so it should be safe from that standpoint.  Not
> sure if the livepatch thing would still get an error at that point
> tho.  Thanks,
> Josef

Yes, I tested it, and wait_event_killable() fixes this problem.

Thanks.



* Re: the qemu-nbd process automatically exit with the commit 43347d56c 'livepatch: send a fake signal to all blocking tasks'
  2021-04-14 17:21     ` Josef Bacik
  2021-04-15  6:27       ` xiaojun.zhao141
@ 2021-04-15  8:37       ` Miroslav Benes
  1 sibling, 0 replies; 6+ messages in thread
From: Miroslav Benes @ 2021-04-15  8:37 UTC (permalink / raw)
  To: Josef Bacik; +Cc: xiaojun.zhao141, linux-kernel, live-patching

On Wed, 14 Apr 2021, Josef Bacik wrote:

> On 4/14/21 11:21 AM, xiaojun.zhao141@gmail.com wrote:
> > On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> > Miroslav Benes <mbenes@suse.cz> wrote:
> > 
> >> Hi,
> >>
> >> On Wed, 14 Apr 2021, xiaojun.zhao141@gmail.com wrote:
> >>
> >>> I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
> >>> nbd.qcow2) will automatically exit when I patched for functions of
> >>> the nbd with livepatch.
> >>>
> >>> The nbd relative source:
> >>> static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
> >>> {
> >>>         struct nbd_config *config = nbd->config;
> >>>         int ret;
> >>>
> >>>         ret = nbd_start_device(nbd);
> >>>         if (ret)
> >>>                 return ret;
> >>>
> >>>         if (max_part)
> >>>                 bdev->bd_invalidated = 1;
> >>>         mutex_unlock(&nbd->config_lock);
> >>>         ret = wait_event_interruptible(config->recv_wq,
> >>>                                          atomic_read(&config->recv_threads) == 0);
> >>>         if (ret)
> >>>                 sock_shutdown(nbd);
> >>>         flush_workqueue(nbd->recv_workq);
> >>>
> >>>         mutex_lock(&nbd->config_lock);
> >>>         nbd_bdev_reset(bdev);
> >>>         /* user requested, ignore socket errors */
> >>>         if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >>>                 ret = 0;
> >>>         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >>>                 ret = -ETIMEDOUT;
> >>>         return ret;
> >>> }
> >>
> >> So my understanding is that ndb spawns a number
> >> (config->recv_threads) of workqueue jobs and then waits for them to
> >> finish. It waits interruptedly. Now, any signal would make
> >> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
> >> signal is no exception there. The error is then propagated back to
> >> the userspace. Unless a user requested a disconnection or there is
> >> timeout set. How does the userspace then reacts to it? Is
> >> _interruptible there because the userspace sends a signal in case of
> >> NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
> >> ordinary signals? This all sounds a bit strange, but I may be missing
> >> something easily.
> >>
> >>> When the nbd waits for atomic_read(&config->recv_threads) == 0, the
> >>> klp will send a fake signal to it then the qemu-nbd process exits.
> >>> And the signal of sysfs to control this action was removed in the
> >>> commit 10b3d52790e 'livepatch: Remove signal sysfs attribute'. Are
> >>> there other ways to control this action? How?
> >>
> >> No, there is no way currently. We send a fake signal automatically.
> >>
> >> Regards
> >> Miroslav
> > It occurs IO error of the nbd device when I use livepatch of the
> > nbd, and I guess that any livepatch on other kernel source maybe cause
> > the IO error. Well, now I decide to workaround for this problem by
> > adding a livepatch for the klp to disable a automatic fake signal.
> > 
> 
> Would wait_event_killable() fix this problem?  I'm not sure any client
> implementations depend on being able to send other signals to the client
> process, so it should be safe from that standpoint.  Not sure if the livepatch
> thing would still get an error at that point tho.  Thanks,

wait_event_killable() means that you would sleep uninterruptibly (still 
reacting to fatal signals), so the fake signal from livepatch would not be 
sent at all. set_notify_signal() handles only TASK_INTERRUPTIBLE tasks. 
There would be no disruption for userspace, and it would fix this problem.
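
The difference can be mimicked in userspace, again only as an analogy:
if the non-fatal signal is blocked for the duration of the wait, the
sleep runs to completion and the signal is handled afterwards, while
SIGKILL (which cannot be blocked) would still end the process.
wait_event_killable() gets its effect from the sleep state of the task
rather than from a signal mask, so this sketches the behaviour, not the
mechanism. wait_shielded() and alarm_was_delivered() are hypothetical
names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Set by the handler once SIGALRM has actually been delivered. */
static volatile sig_atomic_t alarm_seen;

static void on_alarm(int signo)
{
	(void)signo;
	alarm_seen = 1;
}

/*
 * Hypothetical helper: sleep for sleep_s seconds with SIGALRM blocked,
 * although the signal is raised after alarm_s seconds. Returns 0 if the
 * sleep completed, or a negative errno value otherwise.
 */
int wait_shielded(unsigned int alarm_s, unsigned int sleep_s)
{
	struct sigaction sa;
	sigset_t block, old;
	struct timespec ts = { .tv_sec = sleep_s, .tv_nsec = 0 };
	int ret = 0;

	/* The signal must be raised while the sleep is still in progress. */
	assert(alarm_s > 0 && alarm_s < sleep_s);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -errno;

	/* Keep SIGALRM pending instead of letting it interrupt the sleep. */
	sigemptyset(&block);
	sigaddset(&block, SIGALRM);
	if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
		return -errno;

	alarm_seen = 0;
	alarm(alarm_s);
	if (nanosleep(&ts, NULL) == -1)
		ret = -errno;

	/* Restoring the old mask delivers the pending SIGALRM now. */
	if (sigprocmask(SIG_SETMASK, &old, NULL) != 0 && ret == 0)
		ret = -errno;
	return ret;
}

/* Reports whether the handler ran during the last wait_shielded() call. */
int alarm_was_delivered(void)
{
	return alarm_seen;
}
```

With SIGALRM unblocked on entry, wait_shielded(1, 2) sleeps the full two
seconds and returns 0, and the handler only runs once the mask is
restored.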

There is a catch on the livepatch side of things. If there is a live patch 
for nbd_start_device_ioctl(), the transition process would get stuck until 
the task leaves the function (all workqueue jobs are processed). I gather 
it is unlikely to be stuck indefinitely, so we can live with that, I think.

Miroslav

