All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dennis Zhou <dennis@kernel.org>
To: dsterba@suse.cz, David Sterba <dsterba@suse.com>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	kernel-team@fb.com, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/2] btrfs: fix compressed write bio attribution
Date: Thu, 12 Dec 2019 08:19:43 -0800	[thread overview]
Message-ID: <20191212161943.GA30404@dennisz-mbp> (raw)
In-Reply-To: <20191212151853.GT3929@twin.jikos.cz>

On Thu, Dec 12, 2019 at 04:18:53PM +0100, David Sterba wrote:
> On Wed, Dec 11, 2019 at 04:07:07PM -0800, Dennis Zhou wrote:
> > Bio attribution is handled at bio_set_dev() as once we have a device, we
> > have a corresponding request_queue and then can derive the current css.
> > In special cases, we want to attribute to bio to someone else. This can
> > be done by calling bio_associate_blkg_from_css(). Btrfs does this for
> > compressed writeback as they are handled by kworkers which would be of
> > the root cgroup rather than the cgroup designated by the wbc.
> > 
> > Commit 1a41802701ec ("btrfs: drop bio_set_dev where not needed") removes
> > early bio_set_dev() calls prior to submit_stripe_bio(). This breaks the
> > above assumption that we'll have a request_queue when we are doing
> > association. To fix this, special case passing the bio through just for
> > btrfs_submit_compressed_write().
> > 
> > Without this, we crash in btrfs/024:
> > [ 3052.093088] BUG: kernel NULL pointer dereference, address: 0000000000000510
> > [ 3052.107013] #PF: supervisor read access in kernel mode
> > [ 3052.107014] #PF: error_code(0x0000) - not-present page
> > [ 3052.107015] PGD 0 P4D 0
> > [ 3052.107021] Oops: 0000 [#1] SMP
> > [ 3052.138904] CPU: 42 PID: 201270 Comm: kworker/u161:0 Kdump: loaded Not tainted 5.5.0-rc1-00062-g4852d8ac90a9 #712
> > [ 3052.138905] Hardware name: Quanta Tioga Pass Single Side 01-0032211004/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
> > [ 3052.138912] Workqueue: btrfs-delalloc btrfs_work_helper
> > [ 3052.191375] RIP: 0010:bio_associate_blkg_from_css+0x1e/0x3c0
> > [ 3052.191377] Code: ff 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 89 f3 48 83 ec 08 48 8b 47 08 65 ff 05 ea 6e 9f 7e <48> 8b a8 10 05 00 00 45 31 c9 45 31 c0 31 d2 31 f6 b9 02 00 00 00
> > [ 3052.191379] RSP: 0018:ffffc900210cfc90 EFLAGS: 00010282
> > [ 3052.191380] RAX: 0000000000000000 RBX: ffff88bfe5573c00 RCX: 0000000000000000
> > [ 3052.191382] RDX: ffff889db48ec2f0 RSI: ffff88bfe5573c00 RDI: ffff889db48ec2f0
> > [ 3052.191386] RBP: 0000000000000800 R08: 0000000000203bb0 R09: ffff889db16b2400
> > [ 3052.293364] R10: 0000000000000000 R11: ffff88a07fffde80 R12: ffff889db48ec2f0
> > [ 3052.293365] R13: 0000000000001000 R14: ffff889de82bc000 R15: ffff889e2b7bdcc8
> > [ 3052.293367] FS:  0000000000000000(0000) GS:ffff889ffba00000(0000) knlGS:0000000000000000
> > [ 3052.293368] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 3052.293369] CR2: 0000000000000510 CR3: 0000000002611001 CR4: 00000000007606e0
> > [ 3052.293370] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 3052.293371] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 3052.293372] PKRU: 55555554
> > [ 3052.293376] Call Trace:
> > [ 3052.402552]  btrfs_submit_compressed_write+0x137/0x390
> 
> Isn't it the same crash that Chris Murphy reported?
> 
> https://lore.kernel.org/linux-btrfs/CAJCQCtS_7vjBnqeDsedBQJYuE_ap+Xo6D=MXY=rOxf66oJZkrA@mail.gmail.com/
> 

Yeah, looks like the same crash.

> > [ 3052.402558]  submit_compressed_extents+0x40f/0x4c0
> > [ 3052.422401]  btrfs_work_helper+0x246/0x5a0
> > [ 3052.422408]  process_one_work+0x200/0x570
> > [ 3052.438601]  ? process_one_work+0x180/0x570
> > [ 3052.438605]  worker_thread+0x4c/0x3e0
> > [ 3052.438614]  kthread+0x103/0x140
> > [ 3052.460735]  ? process_one_work+0x570/0x570
> > [ 3052.460737]  ? kthread_mod_delayed_work+0xc0/0xc0
> > [ 3052.460744]  ret_from_fork+0x24/0x30
> > 
> > Fixes: 1a41802701ec ("btrfs: drop bio_set_dev where not needed")
> > Cc: David Sterba <dsterba@suse.com>
> > Cc: Josef Bacik <josef@toxicpanda.com>
> > Signed-off-by: Dennis Zhou <dennis@kernel.org>
> > ---
> >  fs/btrfs/compression.c | 14 +++++---------
> >  fs/btrfs/volumes.c     | 18 ++++++++++++++----
> >  fs/btrfs/volumes.h     |  3 +++
> >  3 files changed, 22 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> > index 4ce81571f0cd..67d604fcb606 100644
> > --- a/fs/btrfs/compression.c
> > +++ b/fs/btrfs/compression.c
> > @@ -444,11 +444,9 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
> >  	bio->bi_opf = REQ_OP_WRITE | write_flags;
> >  	bio->bi_private = cb;
> >  	bio->bi_end_io = end_compressed_bio_write;
> > -
> > -	if (blkcg_css) {
> > +	if (blkcg_css)
> >  		bio->bi_opf |= REQ_CGROUP_PUNT;
> > -		bio_associate_blkg_from_css(bio, blkcg_css);
> > -	}
> > +
> >  	refcount_set(&cb->pending_bios, 1);
> >  
> >  	/* create and submit bios for the compressed pages */
> > @@ -481,7 +479,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
> >  				BUG_ON(ret); /* -ENOMEM */
> >  			}
> >  
> > -			ret = btrfs_map_bio(fs_info, bio, 0);
> > +			ret = __btrfs_map_bio(fs_info, bio, 0, blkcg_css);
> >  			if (ret) {
> >  				bio->bi_status = ret;
> >  				bio_endio(bio);
> > @@ -491,10 +489,8 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
> >  			bio->bi_opf = REQ_OP_WRITE | write_flags;
> >  			bio->bi_private = cb;
> >  			bio->bi_end_io = end_compressed_bio_write;
> > -			if (blkcg_css) {
> > +			if (blkcg_css)
> >  				bio->bi_opf |= REQ_CGROUP_PUNT;
> > -				bio_associate_blkg_from_css(bio, blkcg_css);
> > -			}
> >  			bio_add_page(bio, page, PAGE_SIZE, 0);
> >  		}
> >  		if (bytes_left < PAGE_SIZE) {
> > @@ -515,7 +511,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
> >  		BUG_ON(ret); /* -ENOMEM */
> >  	}
> >  
> > -	ret = btrfs_map_bio(fs_info, bio, 0);
> > +	ret = __btrfs_map_bio(fs_info, bio, 0, blkcg_css);
> >  	if (ret) {
> >  		bio->bi_status = ret;
> >  		bio_endio(bio);
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 66377e678504..c68d93a1aae8 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -6240,7 +6240,8 @@ static void btrfs_end_bio(struct bio *bio)
> >  }
> >  
> >  static void submit_stripe_bio(struct btrfs_bio *bbio, struct bio *bio,
> > -			      u64 physical, int dev_nr)
> > +			      u64 physical, int dev_nr,
> > +			      struct cgroup_subsys_state *blkcg_css)
> >  {
> >  	struct btrfs_device *dev = bbio->stripes[dev_nr].dev;
> >  	struct btrfs_fs_info *fs_info = bbio->fs_info;
> > @@ -6255,6 +6256,8 @@ static void submit_stripe_bio(struct btrfs_bio *bbio, struct bio *bio,
> >  		(u_long)dev->bdev->bd_dev, rcu_str_deref(dev->name), dev->devid,
> >  		bio->bi_iter.bi_size);
> >  	bio_set_dev(bio, dev->bdev);
> > +	if (blkcg_css)
> > +		bio_associate_blkg_from_css(bio, blkcg_css);
> 
> At this point we know the bdev is the correct one, but is the blkcg_css
> different for each device or is there one for all?
> 

It's a single blkcg_css for all devices. It would be different blkgs as
those are the request_queue-blkcg pairs.

> Passing the blkcg_css is one way, one single point where the bio and css
> are associated and probably the cleanest. I only don't like the need to
> pass the blkcg_css around but that's probably not a big deal than to
> forget to set bdev somewhere.

Actually, let me try spinning one other way of doing this and get back
to you in a bit.

  reply	other threads:[~2019-12-12 16:19 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-12  0:07 [PATCH 1/2] btrfs: punt all bios created in btrfs_submit_compressed_write() Dennis Zhou
2019-12-12  0:07 ` [PATCH 2/2] btrfs: fix compressed write bio attribution Dennis Zhou
2019-12-12 15:18   ` David Sterba
2019-12-12 16:19     ` Dennis Zhou [this message]
2019-12-12 18:19   ` [PATCH v2 " Dennis Zhou
2019-12-13 12:24     ` David Sterba
2019-12-13 22:21       ` Dennis Zhou
2019-12-17 15:05         ` David Sterba
2019-12-17 18:44           ` Dennis Zhou
2019-12-12  0:15 ` [PATCH 1/2] btrfs: punt all bios created in btrfs_submit_compressed_write() Chris Mason
2019-12-30 15:08 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191212161943.GA30404@dennisz-mbp \
    --to=dennis@kernel.org \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=dsterba@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.