Re: [PATCH] block: Bail out iteration functions upon SIGKILL.

From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Bob Liu <bob.liu@oracle.com>, "axboe@kernel.dk" <axboe@kernel.dk>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [PATCH] block: Bail out iteration functions upon SIGKILL.
Date: Wed, 13 Nov 2019 07:11:36 +0000	[thread overview]
Message-ID: <BYAPR04MB581697D2911048D329DFF5B2E7760@BYAPR04MB5816.namprd04.prod.outlook.com> (raw)
In-Reply-To: 20191113065523.GA1985@ming.t460p

On 2019/11/13 15:55, Ming Lei wrote:
> On Wed, Nov 13, 2019 at 01:54:14AM +0000, Damien Le Moal wrote:
>> On 2019/11/12 23:48, Tetsuo Handa wrote:
>> [...]
>>>>> +static int blk_should_abort(struct bio *bio)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	cond_resched();
>>>>> +	if (!fatal_signal_pending(current))
>>>>> +		return 0;
>>>>> +	ret = submit_bio_wait(bio);
>>>>
>>>> This will change the behavior of __blkdev_issue_discard() to a sync IO
>>>> execution instead of the current async execution since submit_bio_wait()
>>>> call is the responsibility of the caller (e.g. blkdev_issue_discard()).
>>>> Have you checked if users of __blkdev_issue_discard() are OK with that ?
>>>> f2fs, ext4, xfs, dm and nvme use this function.
>>>
>>> I'm not sure...
>>>
>>>>
>>>> Looking at f2fs, this does not look like it is going to work as expected
>>>> since the bio setup, including end_io callback, is done after this
>>>> function is called and a regular submit_bio() execution is being used.
>>>
>>> Then, just breaking the iteration like below?
>>> nvmet_bdev_execute_write_zeroes() ignores -EINTR if "*biop = bio;" is done. Is that no problem?
>>>
>>> --- a/block/blk-lib.c
>>> +++ b/block/blk-lib.c
>>> @@ -7,6 +7,7 @@
>>>  #include <linux/bio.h>
>>>  #include <linux/blkdev.h>
>>>  #include <linux/scatterlist.h>
>>> +#include <linux/sched/signal.h>
>>>  
>>>  #include "blk.h"
>>>  
>>> @@ -30,6 +31,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  	struct bio *bio = *biop;
>>>  	unsigned int op;
>>>  	sector_t bs_mask;
>>> +	int ret = 0;
>>>  
>>>  	if (!q)
>>>  		return -ENXIO;
>>> @@ -76,10 +78,14 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>>>  		 * is disabled.
>>>  		 */
>>>  		cond_resched();
>>> +		if (fatal_signal_pending(current)) {
>>> +			ret = -EINTR;
>>> +			break;
>>> +		}
>>>  	}
>>>  
>>>  	*biop = bio;
>>> -	return 0;
>>> +	return ret;
>>
>> This will leak a bio as blkdev_issue_discard() executes the bio only in
>> the case "if (!ret && bio)". So that does not work as is, unless all
>> callers of __blkdev_issue_discard() are also changed. Same problem for
>> the other __blkdev_issue_xxx() functions.
>>
>> Looking more into this, if an error is returned here, no bio should be
>> returned and we need to make sure that all started bios are also
>> completed. So your helper blk_should_abort() did the right thing calling
>> submit_bio_wait(). However, I Think it would be better to fail
>> immediately the current loop bio instead of executing it and then
>> reporting the -EINTR error, unconditionally, regardless of what the
>> started bios completion status is.
>>
>> This could be done with the help of a function like this, very similar
>> to submit_bio_wait().
>>
>> void bio_chain_end_wait(struct bio *bio)
>> {
>> 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
>>
>> 	bio->bi_private = &done;
>> 	bio->bi_end_io = submit_bio_wait_endio;
>> 	bio->bi_opf |= REQ_SYNC;
>> 	bio_endio(bio);
>> 	wait_for_completion_io(&done);
>> }
>>
>> And then your helper function becomes something like this:
>>
>> static int blk_should_abort(struct bio *bio)
>> {
>> 	int ret;
>>
>> 	cond_resched();
>> 	if (!fatal_signal_pending(current))
>> 		return 0;
>>
>> 	if (bio_flagged(bio, BIO_CHAIN))
>> 		bio_chain_end_wait(bio);
>> 	bio_put(bio);
>>
>> 	return -EINTR;
>> }
>>
>> Thoughts ?
> 
> DISCARD request can be quite big, and any sync bio submission may cause
> serious performance regression.

Yes indeed. But if the bio issuing loop is interrupted with discard BIOs
already issued, I do not think there is any other choice but to wait for
their completion before returning.

> Not mention blkdev_issue_discard() may be called in non-block context.

This loop is calling cond_resched(), which checks might_sleep(). So
certainly this function can block, no ?

-- 
Damien Le Moal
Western Digital Research