From: valdis.kletnieks@vt.edu
To: Dennis Zhou <dennis@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [BUG] ext4/block null pointer crashes in linux-next
Date: Fri, 19 Oct 2018 22:47:19 -0400
Message-ID: <19715.1540003639@turing-police.cc.vt.edu>
In-Reply-To: <20181019222100.GA20900@dennisz-mbp.dhcp.thefacebook.com>

On Fri, 19 Oct 2018 18:21:00 -0400, Dennis Zhou said:

> Do you by chance run any encryption or anything on top of your hard
> drive or ssd?

ext4 on an LVM LV that's part of a PV that's inside a cryptLUKS partition on a hard drive.

So lots of nested levels there.

> I thought of another issue that may explain what's going on. It has to
> do with how a bio can go through make_request() several times. However,
> I do association on the first entry, but subsequent requests may go to
> separate queues. Therefore the association and blk_get_rl() return the
> wrong request_list. It may be that a particular blkg doesn't have a
> fully initialized request_list.
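
For anyone following along, the situation described above can be modeled in user space. This is only a rough sketch, not kernel code: every `toy_*` name is made up for illustration, the three-level stack (LV on dm-crypt on disk) just mirrors my setup, and the `rl_ready` flag is an assumption standing in for "a blkg whose request_list may not be fully initialized". The bio is associated once on first entry and then remapped down the stack, so at the lower levels its blkg still points at the top queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures; all names are illustrative. */
struct toy_queue;

struct toy_blkg {
	struct toy_queue *q;	/* queue this blkg belongs to */
	int rl_ready;		/* assumption: 1 if its request list is set up */
};

struct toy_queue {
	const char *name;
	struct toy_blkg blkg;	/* one blkg per queue in this model */
	struct toy_queue *lower;	/* next device in the stack, or NULL */
};

struct toy_bio {
	struct toy_queue *q;	/* queue the bio is currently aimed at */
	struct toy_blkg *blkg;	/* association, made on first entry only */
};

/* Associate only if the bio has no blkg yet (no reassociation). */
static void toy_associate_once(struct toy_bio *bio)
{
	if (!bio->blkg)
		bio->blkg = &bio->q->blkg;
}

/*
 * Walk the bio down the stack, one "make_request" per level.
 * Returns how many levels saw a blkg that belongs to another queue.
 */
static int toy_submit_count_stale(struct toy_bio *bio)
{
	int stale = 0;

	while (bio->q) {
		toy_associate_once(bio);
		if (bio->blkg->q != bio->q) {
			printf("stale blkg at %s: belongs to %s, rl_ready=%d\n",
			       bio->q->name, bio->blkg->q->name,
			       bio->blkg->rl_ready);
			stale++;
		}
		bio->q = bio->q->lower;	/* remap to the underlying device */
	}
	return stale;
}

/* Build a three-level stack and count the stale lookups. */
static int toy_stale_demo(void)
{
	struct toy_queue disk = { "disk", { NULL, 1 }, NULL };
	struct toy_queue crypt = { "dm-crypt", { NULL, 0 }, &disk };
	struct toy_queue lv = { "lvm-lv", { NULL, 0 }, &crypt };
	struct toy_bio bio = { &lv, NULL };

	disk.blkg.q = &disk;
	crypt.blkg.q = &crypt;
	lv.blkg.q = &lv;

	return toy_submit_count_stale(&bio);
}
```

Calling `toy_stale_demo()` from a `main()` reports the two lower levels as stale in this model, since only the first entry matches the queue the bio was associated with.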

> Thanks for being patient with me. Would you be able to try the following
> on Jens' for-4.20/block branch? His tree is available here:
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git

No problem.  I've managed to trip over issues that took a *lot* longer to resolve
(I think back around 2.5.47 or so, the PCMCIA slot in my Dell Latitude kept finding
different ways to explode the kernel for close to 8-9 months...)

I checked, and linux-next was all of 1 commit behind Jens' for-4.20 tree, so
I applied the patch to linux-next instead (I had a linux-next tree that works, but I'm a git idiot so
figuring out how to graft that tree on was going to take a while...)

Result:

Script started on 2018-10-19 22:29:32-04:00
[root@turing-police x86_64]# uname -a
Linux turing-police.cc.vt.edu 4.19.0-rc8-next-20181019-dirty #641 SMP PREEMPT Fri Oct 19 21:18:19 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@turing-police x86_64]# rpm -Uvh --force dracut-049-4.git20181010.fc30.x86_64.rpm
Verifying...                          ################################# [100%]
warning: Unable to get systemd shutdown inhibition lock: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
Preparing...                          ################################# [100%]
Updating / installing...
   1:dracut-049-4.git20181010.fc30    ################################# [100%]
[root@turing-police x86_64]# exit
exit

Script done on 2018-10-19 22:29:59-04:00

System stable, RPM works, dnf works, some good-sized compiles worked.

Looks like it's time to commit that, and add these:

Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>

:)

> ---
>  block/bio.c         | 20 ++++++++++++++++++++
>  block/blk-core.c    |  1 +
>  include/linux/bio.h |  3 +++
>  3 files changed, 24 insertions(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index 17a8b0aa7050..bbfeb4ee2892 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -2083,6 +2083,26 @@ int bio_associate_create_blkg(struct request_queue *q, struct bio *bio)
>  	return ret;
>  }
>
> +/**
> + * bio_reassociate_blkg - reassociate a bio with a blkg from q
> + * @q: request_queue where bio is going
> + * @bio: target bio
> + *
> + * When submitting a bio, multiple recursive calls to make_request() may occur.
> + * This causes the initial associate done in blkcg_bio_issue_check() to be
> + * incorrect and reference the prior request_queue.  This performs reassociation
> + * when this situation happens.
> + */
> +int bio_reassociate_blkg(struct request_queue *q, struct bio *bio)
> +{
> +	if (bio->bi_blkg) {
> +		blkg_put(bio->bi_blkg);
> +		bio->bi_blkg = NULL;
> +	}
> +
> +	return bio_associate_create_blkg(q, bio);
> +}
> +
>  /**
>   * bio_disassociate_task - undo bio_associate_current()
>   * @bio: target bio
> diff --git a/block/blk-core.c b/block/blk-core.c
> index cdfabc5646da..3ed60723e242 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2433,6 +2433,7 @@ blk_qc_t generic_make_request(struct bio *bio)
>  			if (q)
>  				blk_queue_exit(q);
>  			q = bio->bi_disk->queue;
> +			bio_reassociate_blkg(q, bio);
>  			flags = 0;
>  			if (bio->bi_opf & REQ_NOWAIT)
>  				flags = BLK_MQ_REQ_NOWAIT;
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index f447b0ebb288..b47c7f716731 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -514,6 +514,7 @@ int bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg);
>  int bio_associate_blkg_from_css(struct bio *bio,
>  				struct cgroup_subsys_state *css);
>  int bio_associate_create_blkg(struct request_queue *q, struct bio *bio);
> +int bio_reassociate_blkg(struct request_queue *q, struct bio *bio);
>  void bio_disassociate_task(struct bio *bio);
>  void bio_clone_blkg_association(struct bio *dst, struct bio *src);
>  #else	/* CONFIG_BLK_CGROUP */
> @@ -522,6 +523,8 @@ static inline int bio_associate_blkg_from_css(struct bio *bio,
>  { return 0; }
>  static inline int bio_associate_create_blkg(struct request_queue *q,
>  					    struct bio *bio) { return 0; }
> +static inline int bio_reassociate_blkg(struct request_queue *q, struct bio *bio)
> +{ return 0; }
>  static inline void bio_disassociate_task(struct bio *bio) { }
>  static inline void bio_clone_blkg_association(struct bio *dst,
>  					      struct bio *src) { }
> --
> 2.17.1
>
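
As a sanity check on the idea in the patch (drop the old blkg reference, then associate against the queue the bio is entering), here is a small user-space sketch. It is not the kernel implementation: the `toy_*` names and the plain integer refcount are assumptions for illustration, and the real `bio_associate_create_blkg()` does a lookup that this model replaces with one blkg per queue.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; names and the integer refcount are illustrative only. */
struct toy_queue;

struct toy_blkg {
	struct toy_queue *q;	/* queue this blkg belongs to */
	int refcnt;		/* references held by in-flight bios */
};

struct toy_queue {
	struct toy_blkg blkg;	/* one blkg per queue in this model */
	struct toy_queue *lower;	/* next device in the stack, or NULL */
};

struct toy_bio {
	struct toy_queue *q;
	struct toy_blkg *blkg;
};

static void toy_blkg_get(struct toy_blkg *b)
{
	b->refcnt++;
}

static void toy_blkg_put(struct toy_blkg *b)
{
	assert(b->refcnt > 0);
	b->refcnt--;
}

/* Same shape as the patch: put the old blkg, then associate with q. */
static void toy_reassociate(struct toy_queue *q, struct toy_bio *bio)
{
	if (bio->blkg) {
		toy_blkg_put(bio->blkg);
		bio->blkg = NULL;
	}
	toy_blkg_get(&q->blkg);
	bio->blkg = &q->blkg;
}

/* Walk the stack, reassociating each time the target queue changes. */
static void toy_submit(struct toy_bio *bio)
{
	struct toy_queue *q;

	for (q = bio->q; q; q = q->lower) {
		bio->q = q;
		toy_reassociate(q, bio);
	}
}

/* Returns 1 if the association and refcounts end up as expected. */
static int toy_reassoc_demo(void)
{
	struct toy_queue disk = { { NULL, 0 }, NULL };
	struct toy_queue crypt = { { NULL, 0 }, &disk };
	struct toy_queue lv = { { NULL, 0 }, &crypt };
	struct toy_bio bio = { &lv, NULL };
	int ok;

	disk.blkg.q = &disk;
	crypt.blkg.q = &crypt;
	lv.blkg.q = &lv;

	toy_submit(&bio);

	/* The bio must now reference only the bottom queue's blkg. */
	ok = bio.blkg == &disk.blkg &&
	     lv.blkg.refcnt == 0 &&
	     crypt.blkg.refcnt == 0 &&
	     disk.blkg.refcnt == 1;

	/* Completion drops the last reference. */
	toy_blkg_put(bio.blkg);
	bio.blkg = NULL;

	return ok && disk.blkg.refcnt == 0;
}
```

In this model the put-before-associate step is what keeps the upper queues' counts from leaking; skipping it would leave one reference behind on each level the bio passed through.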

