* IO errors after "block: remove bio_get_nr_vecs()"
@ 2015-12-20 17:51 Linus Torvalds
  2015-12-20 18:18 ` Christoph Hellwig
                   ` (4 more replies)
  0 siblings, 5 replies; 45+ messages in thread
From: Linus Torvalds @ 2015-12-20 17:51 UTC (permalink / raw)
  To: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Artem S. Tashkinov
  Cc: Steven Whitehouse, Tejun Heo, IDE-ML, Linux Kernel Mailing List

Kent, Jens, Christoph et al,
 please see this bugzilla:

  https://bugzilla.kernel.org/show_bug.cgi?id=109661

where Artem Tashkinov bisected his problems with 4.3 down to commit
b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
signed off on.

(Also Tejun - maybe you can see what's up - maybe that error message
tells you something)

I'm not sure what's up with his machine, the disk doesn't seem to be
anything particularly unusual, it looks like a 1TB Seagate Barracuda:

  ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133

which doesn't strike me as odd.

Looking at the dmesg, it also looks like it's a pretty normal
Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
ID for the AHCI chip seems to be (INTEL, 0x1c02).

Any ideas? Anybody?

                       Linus


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
@ 2015-12-20 18:18 ` Christoph Hellwig
  2015-12-20 18:41   ` Linus Torvalds
                     ` (2 more replies)
  2015-12-20 23:23 ` Artem S. Tashkinov
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 45+ messages in thread
From: Christoph Hellwig @ 2015-12-20 18:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kent Overstreet, Ming Lin, Jens Axboe, Artem S. Tashkinov,
	Steven Whitehouse, Tejun Heo, IDE-ML, Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
> Kent, Jens, Christoph et al,
>  please see this bugzilla:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=109661
> 
> where Artem Tashkinov bisected his problems with 4.3 down to commit
> b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> signed off on.

Artem,

can you re-check the commits around this series again?  I would be
extremely surprised if it's really this particular commit and not
one just before it causing the problem - it just allocates bios
to the biggest possible instead of only allocating up to what
bio_add_page would accept.


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:18 ` Christoph Hellwig
@ 2015-12-20 18:41   ` Linus Torvalds
  2015-12-20 23:36     ` Artem S. Tashkinov
  2015-12-21 11:21     ` Dan Aloni
  2015-12-20 18:44   ` Kent Overstreet
  2015-12-20 23:25   ` Artem S. Tashkinov
  2 siblings, 2 replies; 45+ messages in thread
From: Linus Torvalds @ 2015-12-20 18:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kent Overstreet, Ming Lin, Jens Axboe, Artem S. Tashkinov,
	Steven Whitehouse, Tejun Heo, IDE-ML, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2167 bytes --]

On Sun, Dec 20, 2015 at 10:18 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> Artem,
>
> can you re-check the commits around this series again?  I would be
> extremely surprised if it's really this particular commit and not
> one just before it causing the problem - it just allocates bios
> to the biggest possible instead of only allocating up to what
> bio_add_page would accept.

Judging by Artem's bisect log, the last commit he tested before the
bad one was the commit before: commit 6cf66b4caf9c ("fs: use helper
bio_add_page() instead of open coding on bi_io_vec") and he marked
that one good.

Sadly, without CONFIG_LOCALVERSION_AUTO, there's no way to match up
the dmesg files (in the same bisection tar-file as the bisection log)
with the actual versions. Also, Artem's bisect.log isn't actually the
.git/BISECT_LOG file that contains the full information about what was
marked good and bad, so it's a bit hard to read (ie I can tell that
Artem had to mark commit 6cf66b4caf9c as "good" not because his log
says so, but because that explains the next commit to be tested).

Of course, it's fairly easy to make a mistake while bisecting (just
doing a thinko), but usually bisection mistakes end up causing you to
go into some "all good" or "all bad" region of commits, and the fact
that Artem seems to have marked the previous commit good and the final
commit bad does seem to imply the bisection was successful.

But yes, it is always nice to double-check the bisection results. The
best way to do it is generally to try to revert the bad commit and
verify that things work after that, but that commit doesn't revert
cleanly on top of 4.3 due to other changes.

Attached is a *COMPLETELY*UNTESTED* revertish patch for 4.3. It's
basically a revert of b54ffb73cadc, but with a few fixups to make the
revert work on top of 4.3.

So Artem, if you can test whether 4.3 works with that revert, and/or
double-check booting that b54ffb73cadc again (to verify that it's
really bad), and its parent (to double-check that it's really good),
that would be a good way to verify that yes, it is really that *one*
commit that breaks things for you.

                Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/plain, Size: 11122 bytes --]

 block/bio.c            | 23 +++++++++++++++++++++++
 drivers/md/dm-io.c     |  2 +-
 fs/btrfs/compression.c |  5 ++++-
 fs/btrfs/extent_io.c   |  9 +++++++--
 fs/btrfs/inode.c       |  3 ++-
 fs/btrfs/scrub.c       | 18 ++++++++++++++++--
 fs/direct-io.c         |  2 +-
 fs/ext4/page-io.c      |  3 ++-
 fs/ext4/readpage.c     |  2 +-
 fs/f2fs/data.c         |  2 +-
 fs/gfs2/lops.c         |  9 ++++++++-
 fs/logfs/dev_bdev.c    |  4 ++--
 fs/mpage.c             |  4 ++--
 fs/nilfs2/segbuf.c     |  2 +-
 fs/xfs/xfs_aops.c      |  3 ++-
 include/linux/bio.h    |  1 +
 16 files changed, 74 insertions(+), 18 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index ad3f276d74bc..d483dbb0162d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -694,6 +694,29 @@ integrity_clone:
 EXPORT_SYMBOL(bio_clone_bioset);
 
 /**
+ *	bio_get_nr_vecs		- return approx number of vecs
+ *	@bdev:  I/O target
+ *
+ *	Return the approximate number of pages we can send to this target.
+ *	There's no guarantee that you will be able to fit this number of pages
+ *	into a bio, it does not account for dynamic restrictions that vary
+ *	on offset.
+ */
+int bio_get_nr_vecs(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	int nr_pages;
+
+	nr_pages = min_t(unsigned,
+		     queue_max_segments(q),
+		     queue_max_sectors(q) / (PAGE_SIZE >> 9) + 1);
+
+	return min_t(unsigned, nr_pages, BIO_MAX_PAGES);
+
+}
+EXPORT_SYMBOL(bio_get_nr_vecs);
+
+/**
  *	bio_add_pc_page	-	attempt to add page to bio
  *	@q: the target queue
  *	@bio: destination bio
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 6f8e83b2a6f8..c84714f70378 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -316,7 +316,7 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
 		if ((rw & REQ_DISCARD) || (rw & REQ_WRITE_SAME))
 			num_bvecs = 1;
 		else
-			num_bvecs = min_t(int, BIO_MAX_PAGES,
+			num_bvecs = min_t(int, bio_get_nr_vecs(where->bdev),
 					  dm_sector_div_up(remaining, (PAGE_SIZE >> SECTOR_SHIFT)));
 
 		bio = bio_alloc_bioset(GFP_NOIO, num_bvecs, io->client->bios);
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 57ee8ca29b06..302266ec2cdb 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -97,7 +97,10 @@ static inline int compressed_bio_size(struct btrfs_root *root,
 static struct bio *compressed_bio_alloc(struct block_device *bdev,
 					u64 first_byte, gfp_t gfp_flags)
 {
-	return btrfs_bio_alloc(bdev, first_byte >> 9, BIO_MAX_PAGES, gfp_flags);
+	int nr_vecs;
+
+	nr_vecs = bio_get_nr_vecs(bdev);
+	return btrfs_bio_alloc(bdev, first_byte >> 9, nr_vecs, gfp_flags);
 }
 
 static int check_compressed_csum(struct inode *inode,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3915c9473e94..f39f819cf7e8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2803,7 +2803,9 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree,
 {
 	int ret = 0;
 	struct bio *bio;
+	int nr;
 	int contig = 0;
+	int this_compressed = bio_flags & EXTENT_BIO_COMPRESSED;
 	int old_compressed = prev_bio_flags & EXTENT_BIO_COMPRESSED;
 	size_t page_size = min_t(size_t, size, PAGE_CACHE_SIZE);
 
@@ -2831,9 +2833,12 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree,
 			return 0;
 		}
 	}
+	if (this_compressed)
+		nr = BIO_MAX_PAGES;
+	else
+		nr = bio_get_nr_vecs(bdev);
 
-	bio = btrfs_bio_alloc(bdev, sector, BIO_MAX_PAGES,
-			GFP_NOFS | __GFP_HIGH);
+	bio = btrfs_bio_alloc(bdev, sector, nr, GFP_NOFS | __GFP_HIGH);
 	if (!bio)
 		return -ENOMEM;
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 611b66d73e80..4e1aea2ec8b7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7992,8 +7992,9 @@ out:
 static struct bio *btrfs_dio_bio_alloc(struct block_device *bdev,
 				       u64 first_sector, gfp_t gfp_flags)
 {
+	int nr_vecs = bio_get_nr_vecs(bdev);
 	struct bio *bio;
-	bio = btrfs_bio_alloc(bdev, first_sector, BIO_MAX_PAGES, gfp_flags);
+	bio = btrfs_bio_alloc(bdev, first_sector, nr_vecs, gfp_flags);
 	if (bio)
 		bio_associate_current(bio);
 	return bio;
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index a39f5d1144e8..3b9296e92cc5 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -464,14 +464,27 @@ struct scrub_ctx *scrub_setup_ctx(struct btrfs_device *dev, int is_dev_replace)
 	struct scrub_ctx *sctx;
 	int		i;
 	struct btrfs_fs_info *fs_info = dev->dev_root->fs_info;
+	int pages_per_rd_bio;
 	int ret;
 
+	/*
+	 * the setting of pages_per_rd_bio is correct for scrub but might
+	 * be wrong for the dev_replace code where we might read from
+	 * different devices in the initial huge bios. However, that
+	 * code is able to correctly handle the case when adding a page
+	 * to a bio fails.
+	 */
+	if (dev->bdev)
+		pages_per_rd_bio = min_t(int, SCRUB_PAGES_PER_RD_BIO,
+					 bio_get_nr_vecs(dev->bdev));
+	else
+		pages_per_rd_bio = SCRUB_PAGES_PER_RD_BIO;
 	sctx = kzalloc(sizeof(*sctx), GFP_NOFS);
 	if (!sctx)
 		goto nomem;
 	atomic_set(&sctx->refs, 1);
 	sctx->is_dev_replace = is_dev_replace;
-	sctx->pages_per_rd_bio = SCRUB_PAGES_PER_RD_BIO;
+	sctx->pages_per_rd_bio = pages_per_rd_bio;
 	sctx->curr = -1;
 	sctx->dev_root = dev->dev_root;
 	for (i = 0; i < SCRUB_BIOS_PER_SCTX; ++i) {
@@ -4053,7 +4066,8 @@ static int scrub_setup_wr_ctx(struct scrub_ctx *sctx,
 		return 0;
 
 	WARN_ON(!dev->bdev);
-	wr_ctx->pages_per_wr_bio = SCRUB_PAGES_PER_WR_BIO;
+	wr_ctx->pages_per_wr_bio = min_t(int, SCRUB_PAGES_PER_WR_BIO,
+					 bio_get_nr_vecs(dev->bdev));
 	wr_ctx->tgtdev = dev;
 	atomic_set(&wr_ctx->flush_all_writes, 0);
 	return 0;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 11256291642e..818c647f36d3 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -655,7 +655,7 @@ static inline int dio_new_bio(struct dio *dio, struct dio_submit *sdio,
 	if (ret)
 		goto out;
 	sector = start_sector << (sdio->blkbits - 9);
-	nr_pages = min(sdio->pages_in_io, BIO_MAX_PAGES);
+	nr_pages = min(sdio->pages_in_io, bio_get_nr_vecs(map_bh->b_bdev));
 	BUG_ON(nr_pages <= 0);
 	dio_bio_alloc(dio, sdio, map_bh->b_bdev, sector, nr_pages);
 	sdio->boundary = 0;
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 84ba4d2b3a35..5b3fcb8a010f 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -374,9 +374,10 @@ void ext4_io_submit_init(struct ext4_io_submit *io,
 static int io_submit_init_bio(struct ext4_io_submit *io,
 			      struct buffer_head *bh)
 {
+	int nvecs = bio_get_nr_vecs(bh->b_bdev);
 	struct bio *bio;
 
-	bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
+	bio = bio_alloc(GFP_NOIO, min(nvecs, BIO_MAX_PAGES));
 	if (!bio)
 		return -ENOMEM;
 	wbc_init_bio(io->io_wbc, bio);
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 560af0437704..a4823d88ae26 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -284,7 +284,7 @@ int ext4_mpage_readpages(struct address_space *mapping,
 					goto set_error_page;
 			}
 			bio = bio_alloc(GFP_KERNEL,
-				min_t(int, nr_pages, BIO_MAX_PAGES));
+				min_t(int, nr_pages, bio_get_nr_vecs(bdev)));
 			if (!bio) {
 				if (ctx)
 					ext4_release_crypto_ctx(ctx);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a82abe921b89..432496daacae 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -954,7 +954,7 @@ submit_and_realloc:
 			}
 
 			bio = bio_alloc(GFP_KERNEL,
-				min_t(int, nr_pages, BIO_MAX_PAGES));
+				min_t(int, nr_pages, bio_get_nr_vecs(bdev)));
 			if (!bio) {
 				if (ctx)
 					f2fs_release_crypto_ctx(ctx);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index d5369a109781..4052116959fb 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -261,11 +261,18 @@ void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw)
 static struct bio *gfs2_log_alloc_bio(struct gfs2_sbd *sdp, u64 blkno)
 {
 	struct super_block *sb = sdp->sd_vfs;
+	unsigned nrvecs = bio_get_nr_vecs(sb->s_bdev);
 	struct bio *bio;
 
 	BUG_ON(sdp->sd_log_bio);
 
-	bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
+	while (1) {
+		bio = bio_alloc(GFP_NOIO, nrvecs);
+		if (likely(bio))
+			break;
+		nrvecs = max(nrvecs/2, 1U);
+	}
+
 	bio->bi_iter.bi_sector = blkno * (sb->s_blocksize >> 9);
 	bio->bi_bdev = sb->s_bdev;
 	bio->bi_end_io = gfs2_end_log_write;
diff --git a/fs/logfs/dev_bdev.c b/fs/logfs/dev_bdev.c
index a7fdbd868474..cea0cc9878b7 100644
--- a/fs/logfs/dev_bdev.c
+++ b/fs/logfs/dev_bdev.c
@@ -81,7 +81,7 @@ static int __bdev_writeseg(struct super_block *sb, u64 ofs, pgoff_t index,
 	unsigned int max_pages;
 	int i;
 
-	max_pages = min(nr_pages, BIO_MAX_PAGES);
+	max_pages = min(nr_pages, (size_t) bio_get_nr_vecs(super->s_bdev));
 
 	bio = bio_alloc(GFP_NOFS, max_pages);
 	BUG_ON(!bio);
@@ -171,7 +171,7 @@ static int do_erase(struct super_block *sb, u64 ofs, pgoff_t index,
 	unsigned int max_pages;
 	int i;
 
-	max_pages = min(nr_pages, BIO_MAX_PAGES);
+	max_pages = min(nr_pages, (size_t) bio_get_nr_vecs(super->s_bdev));
 
 	bio = bio_alloc(GFP_NOFS, max_pages);
 	BUG_ON(!bio);
diff --git a/fs/mpage.c b/fs/mpage.c
index a7c34274f207..7b0b90e23cbe 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -278,7 +278,7 @@ alloc_new:
 				goto out;
 		}
 		bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
-				min_t(int, nr_pages, BIO_MAX_PAGES), gfp);
+				min_t(int, nr_pages, bio_get_nr_vecs(bdev)), gfp);
 		if (bio == NULL)
 			goto confused;
 	}
@@ -605,7 +605,7 @@ alloc_new:
 			}
 		}
 		bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
-				BIO_MAX_PAGES, GFP_NOFS|__GFP_HIGH);
+				bio_get_nr_vecs(bdev), GFP_NOFS|__GFP_HIGH);
 		if (bio == NULL)
 			goto confused;
 
diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
index f63620ce3892..550b10efb14e 100644
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@ -414,7 +414,7 @@ static void nilfs_segbuf_prepare_write(struct nilfs_segment_buffer *segbuf,
 {
 	wi->bio = NULL;
 	wi->rest_blocks = segbuf->sb_sum.nblocks;
-	wi->max_pages = BIO_MAX_PAGES;
+	wi->max_pages = bio_get_nr_vecs(wi->nilfs->ns_bdev);
 	wi->nr_vecs = min(wi->max_pages, wi->rest_blocks);
 	wi->start = wi->end = 0;
 	wi->blocknr = segbuf->sb_pseg_start;
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 50ab2879b9da..760dd2a5bf05 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -380,7 +380,8 @@ STATIC struct bio *
 xfs_alloc_ioend_bio(
 	struct buffer_head	*bh)
 {
-	struct bio		*bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
+	int			nvecs = bio_get_nr_vecs(bh->b_bdev);
+	struct bio		*bio = bio_alloc(GFP_NOIO, nvecs);
 
 	ASSERT(bio->bi_private == NULL);
 	bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b9b6e046b52e..f2198d5b8be1 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -451,6 +451,7 @@ void bio_chain(struct bio *, struct bio *);
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
 			   unsigned int, unsigned int);
+extern int bio_get_nr_vecs(struct block_device *);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
 				    const struct iov_iter *, gfp_t);
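
For readers following the arithmetic in the restored helper, here is a
rough user-space model of it. It is only a sketch: the struct and the
function name are made up for illustration, the 4 KiB page size is an
assumption, and the limit values used with it are hypothetical rather
than read from Artem's hardware. BIO_MAX_PAGES is taken to be 256, which
is what kernels of that era used as far as I recall.

```c
#define MODEL_PAGE_SIZE     4096u /* assumed page size */
#define MODEL_BIO_MAX_PAGES 256u  /* assumed value of BIO_MAX_PAGES */

/* Hypothetical stand-in for the two queue limits the helper consults. */
struct queue_limits_model {
	unsigned max_segments; /* what queue_max_segments() would return */
	unsigned max_sectors;  /* what queue_max_sectors() would return, 512-byte units */
};

static unsigned min_u(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/*
 * Mirrors the expression in the patch: one page per segment, bounded by
 * the number of pages that fit in max_sectors (plus one), and finally
 * capped at the largest bio the allocator will hand out.
 */
static unsigned model_bio_get_nr_vecs(const struct queue_limits_model *q)
{
	unsigned nr_pages = min_u(q->max_segments,
				  q->max_sectors / (MODEL_PAGE_SIZE >> 9) + 1);

	return min_u(nr_pages, MODEL_BIO_MAX_PAGES);
}
```

With a hypothetical queue of 168 segments and 2560 sectors, the sector
term is 2560 / 8 + 1 = 321, so the segment limit wins and the model
returns 168. After the removal, callers ask for BIO_MAX_PAGES
regardless of these limits.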


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:18 ` Christoph Hellwig
  2015-12-20 18:41   ` Linus Torvalds
@ 2015-12-20 18:44   ` Kent Overstreet
  2015-12-20 23:41     ` Artem S. Tashkinov
  2015-12-20 23:25   ` Artem S. Tashkinov
  2 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-20 18:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Ming Lin, Jens Axboe, Artem S. Tashkinov,
	Steven Whitehouse, Tejun Heo, IDE-ML, Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 07:18:01PM +0100, Christoph Hellwig wrote:
> On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
> > Kent, Jens, Christoph et al,
> >  please see this bugzilla:
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=109661
> > 
> > where Artem Tashkinov bisected his problems with 4.3 down to commit
> > b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> > signed off on.
> 
> Artem,
> 
> can you re-check the commits around this series again?  I would be
> extremely surprised if it's really this particular commit and not
> one just before it causing the problem - it just allocates bios
> to the biggest possible instead of only allocating up to what
> bio_add_page would accept.

pretty sure it's something with how blk_bio_segment_split() decides what
segments are mergeable and not. bio_get_nr_vecs() was just returning nr_pages ==
queue_max_segments (ignoring sectors for the moment) - so wait, wtf? that's
basically assuming no segment merging can ever happen, if it does then this was
causing us to send smaller requests to the device than we could have been.

so actually two possibilities I can see:
 - in blk_bio_segment_split(), something's screwed up with how it decides what
   segments are going to be mergeable or not. but I don't think that's likely
   since it's doing the exact same thing the rest of the segment merging code
   does.
 - or, the driver was lying in its queue limits, using queue_max_segments for
   "the maximum number of pages I can possibly take", and that bug lurked
   undiscovered because of the screwed-upness in bio_get_nr_vecs().

Offhand I don't know where to start digging in the driver code to look into the
second theory though. Tejun, you got any ideas?
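
To make the "no segment merging" point concrete, here is a small
user-space model of how pages collapse into segments. It is a sketch
under stated assumptions, not the kernel's algorithm: pages are
identified by made-up page frame numbers, two neighbours merge only when
they are physically contiguous and the merged segment stays within a
size cap, and constraints the real code also honours (segment boundary
masks, for example) are ignored. All names are hypothetical.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size */

/*
 * Count the segments needed for n pages given by page frame number.
 * A page joins the current segment when it directly follows the
 * previous page in physical memory and the segment would not grow
 * beyond max_seg_bytes; otherwise it starts a new segment.
 */
static size_t model_count_segments(const unsigned long *pfn, size_t n,
				   unsigned max_seg_bytes)
{
	size_t segs = 0;
	unsigned cur_bytes = 0;

	for (size_t i = 0; i < n; i++) {
		int contiguous = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (segs > 0 && contiguous &&
		    cur_bytes + MODEL_PAGE_SIZE <= max_seg_bytes) {
			cur_bytes += MODEL_PAGE_SIZE;
		} else {
			segs++;
			cur_bytes = MODEL_PAGE_SIZE;
		}
	}
	return segs;
}
```

Seven pages with frame numbers 10, 11, 12, 13, 20, 30, 31 need only
three segments under a 64 KiB cap, so a bio can legitimately carry more
pages than queue_max_segments. Sizing bios as if pages == segments hid
that case, which is why a driver treating the segment limit as a page
limit could have gone unnoticed.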


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
  2015-12-20 18:18 ` Christoph Hellwig
@ 2015-12-20 23:23 ` Artem S. Tashkinov
  2015-12-21  1:38 ` Ming Lei
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-20 23:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List, linus971

On 2015-12-20 22:51, Linus Torvalds wrote:
> Kent, Jens, Christoph et al,
>  please see this bugzilla:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=109661
> 
> where Artem Tashkinov bisected his problems with 4.3 down to commit
> b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> signed off on.
> 
> (Also Tejun - maybe you can see what's up - maybe that error message
> tells you something)
> 
> I'm not sure what's up with his machine, the disk doesn't seem to be
> anything particularly unusual, it looks like a 1TB Seagate Barracuda:
> 
>   ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
> 
> which doesn't strike me as odd.
> 
> Looking at the dmesg, it also looks like it's a pretty normal
> Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
> ID for the AHCI chip seems to be (INTEL, 0x1c02).
> 
> Any ideas? Anybody?
> 

That's correct. It's a very ordinary Asus P8P67 Pro motherboard (Intel
P67 chipset) in AHCI mode with a run-of-the-mill HDD, which is the one
you identified.


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:18 ` Christoph Hellwig
  2015-12-20 18:41   ` Linus Torvalds
  2015-12-20 18:44   ` Kent Overstreet
@ 2015-12-20 23:25   ` Artem S. Tashkinov
  2015-12-20 23:42     ` Kent Overstreet
  2 siblings, 1 reply; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-20 23:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Kent Overstreet, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

On 2015-12-20 23:18, Christoph Hellwig wrote:
> On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
>> Kent, Jens, Christoph et al,
>>  please see this bugzilla:
>> 
>>   https://bugzilla.kernel.org/show_bug.cgi?id=109661
>> 
>> where Artem Tashkinov bisected his problems with 4.3 down to commit
>> b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
>> signed off on.
> 
> Artem,
> 
> can you re-check the commits around this series again?  I would be
> extremely surprised if it's really this particular commit and not
> one just before it causing the problem - it just allocates bios
> to the biggest possible instead of only allocating up to what
> bio_add_page would accept.

I'm positive about this particular commit. Of course, it might be another
GCC 4.7.4 miscompilation which causes the errors which shouldn't be there
but I'm not an expert, so.


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:41   ` Linus Torvalds
@ 2015-12-20 23:36     ` Artem S. Tashkinov
  2015-12-21 11:21     ` Dan Aloni
  1 sibling, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-20 23:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Kent Overstreet, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List, linus971

On 2015-12-20 23:41, Linus Torvalds wrote:
> On Sun, Dec 20, 2015 at 10:18 AM, Christoph Hellwig wrote:
>> 
>> Artem,
>> 
>> can you re-check the commits around this series again?  I would be
>> extremely surprised if it's really this particular commit and not
>> one just before it causing the problem - it just allocates bios
>> to the biggest possible instead of only allocating up to what
>> bio_add_page would accept.
> 
> Judging by Artem's bisect log, the last commit he tested before the
> bad one was the commit before: commit 6cf66b4caf9c ("fs: use helper
> bio_add_page() instead of open coding on bi_io_vec") and he marked
> that one good.
> 
> Sadly, without CONFIG_LOCALVERSION_AUTO, there's no way to match up
> the dmesg files (in the same bisection tar-file as the bisection log)
> with the actual versions. Also, Artem's bisect.log isn't actually the
> .git/BISECT_LOG file that contains the full information about what was
> marked good and bad, so it's a bit hard to read (ie I can tell that
> Artem had to mark commit 6cf66b4caf9c as "good" not because his log
> says so, but because that explains the next commit to be tested).
> 
> Of course, it's fairly easy to make a mistake while bisecting (just
> doing a thinko), but usually bisection mistakes end up causing you to
> go into some "all good" or "all bad" region of commits, and the fact
> that Artem seems to have marked the previous commit good and the final
> commit bad does seem to imply the bisection was successful.
> 
> But yes, it is always nice to double-check the bisection results. The
> best way to do it is generally to try to revert the bad commit and
> verify that things work after that, but that commit doesn't revert
> cleanly on top of 4.3 due to other changes.
> 
> Attached is a *COMPLETELY*UNTESTED* revertish patch for 4.3. It's
> basically a revert of b54ffb73cadc, but with a few fixups to make the
> revert work on top of 4.3.
> 
> So Artem, if you can test whether 4.3 works with that revert, and/or
> double-check booting that b54ffb73cadc again (to verify that it's
> really bad), and its parent (to double-check that it's really good),
> that would be a good way to verify that yes, it is really that *one*
> commit that breaks things for you.
> 

After applying this revert patch on top of 4.3.3, everything is back to
normal. It is indeed the guilty commit.



* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:44   ` Kent Overstreet
@ 2015-12-20 23:41     ` Artem S. Tashkinov
  0 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-20 23:41 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, Linus Torvalds, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

On 2015-12-20 23:44, Kent Overstreet wrote:
> On Sun, Dec 20, 2015 at 07:18:01PM +0100, Christoph Hellwig wrote:
>> On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
>> > Kent, Jens, Christoph et al,
>> >  please see this bugzilla:
>> >
>> >   https://bugzilla.kernel.org/show_bug.cgi?id=109661
>> >
>> > where Artem Tashkinov bisected his problems with 4.3 down to commit
>> > b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
>> > signed off on.
>> 
>> Artem,
>> 
>> can you re-check the commits around this series again?  I would be
>> extremely surprised if it's really this particular commit and not
>> one just before it causing the problem - it just allocates bios
>> to the biggest possible instead of only allocating up to what
>> bio_add_page would accept.
> 
> pretty sure it's something with how blk_bio_segment_split() decides what
> segments are mergeable and not. bio_get_nr_vecs() was just returning nr_pages ==
> queue_max_segments (ignoring sectors for the moment) - so wait, wtf? that's
> basically assuming no segment merging can ever happen, if it does then this was
> causing us to send smaller requests to the device than we could have been.
> 
> so actually two possibilities I can see:
>  - in blk_bio_segment_split(), something's screwed up with how it decides what
>    segments are going to be mergeable or not. but I don't think that's likely
>    since it's doing the exact same thing the rest of the segment merging code
>    does.
>  - or, the driver was lying in its queue limits, using queue_max_segments for
>    "the maximum number of pages I can possibly take", and that bug lurked
>    undiscovered because of the screwed-upness in bio_get_nr_vecs().
> 
> Offhand I don't know where to start digging in the driver code to look into the
> second theory though. Tejun, you got any ideas?

Here's an actual bisect log which Linus was missing:

git bisect start
# bad: [6a13feb9c82803e2b815eca72fa7a9f5561d7861] Linux 4.3
git bisect bad 6a13feb9c82803e2b815eca72fa7a9f5561d7861
# good: [64291f7db5bd8150a74ad2036f1037e6a0428df2] Linux 4.2
git bisect good 64291f7db5bd8150a74ad2036f1037e6a0428df2
# bad: [807249d3ada1ff28a47c4054ca4edd479421b671] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
git bisect bad 807249d3ada1ff28a47c4054ca4edd479421b671
# good: [102178108e2246cb4b329d3fb7872cd3d7120205] Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good 102178108e2246cb4b329d3fb7872cd3d7120205
# good: [62da98656b62a5ca57f22263705175af8ded5aa1] netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
git bisect good 62da98656b62a5ca57f22263705175af8ded5aa1
# good: [f1a3c0b933e7ff856223d6fcd7456d403e54e4e5] Merge tag 'devicetree-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect good f1a3c0b933e7ff856223d6fcd7456d403e54e4e5
# bad: [9cbf22b37ae0592dea809cb8d424990774c21786] Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
git bisect bad 9cbf22b37ae0592dea809cb8d424990774c21786
# good: [8bdc69b764013a9b5ebeef7df8f314f1066c5d79] Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
git bisect good 8bdc69b764013a9b5ebeef7df8f314f1066c5d79
# good: [df910390e2db07a76c87f258475f6c96253cee6c] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good df910390e2db07a76c87f258475f6c96253cee6c
# bad: [d975f309a8b250e67b66eabeb56be6989c783629] Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block
git bisect bad d975f309a8b250e67b66eabeb56be6989c783629
# bad: [89e2a8404e4415da1edbac6ca4f7332b4a74fae2] crypto/omap-sham: remove an open coded access to ->page_link
git bisect bad 89e2a8404e4415da1edbac6ca4f7332b4a74fae2
# good: [0e28997ec476bad4c7dbe0a08775290051325f53] btrfs: remove bio splitting and merge_bvec_fn() calls
git bisect good 0e28997ec476bad4c7dbe0a08775290051325f53
# bad: [2ec3182f9c20a9eef0dacc0512cf2ca2df7be5ad] Documentation: update notes in biovecs about arbitrarily sized bios
git bisect bad 2ec3182f9c20a9eef0dacc0512cf2ca2df7be5ad
# good: [7140aafce2fc14c5af02fdb7859b6bea0108be3d] md/raid5: get rid of bio_fits_rdev()
git bisect good 7140aafce2fc14c5af02fdb7859b6bea0108be3d
# good: [6cf66b4caf9c71f64a5486cadbd71ab58d0d4307] fs: use helper bio_add_page() instead of open coding on bi_io_vec
git bisect good 6cf66b4caf9c71f64a5486cadbd71ab58d0d4307
# bad: [b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c] block: remove bio_get_nr_vecs()
git bisect bad b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c

And, as he said, since the step before the last one was good and the
very last one was bad, there is no way I could have made a mistake.
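
The reasoning behind that can be modelled in a few lines. The sketch
below assumes a simplified linear history (real git history is a graph,
which git bisect handles differently) and a hypothetical predicate that
reports whether a given commit index is bad; none of these names exist
in git itself.

```c
#include <stddef.h>

/* Hypothetical predicate: ctx points at the index of the first bad commit. */
static int model_is_bad_from(size_t idx, void *ctx)
{
	const size_t *first_bad = ctx;

	return idx >= *first_bad;
}

/*
 * Binary search over a linear history. Requires that commit 'good' is
 * good, commit 'bad' is bad, and every commit after the first bad one
 * is bad too. Returns the index of the first bad commit; if 'steps' is
 * not NULL, it is incremented once per commit tested.
 */
static size_t model_bisect(size_t good, size_t bad,
			   int (*is_bad)(size_t idx, void *ctx), void *ctx,
			   size_t *steps)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
		if (steps)
			(*steps)++;
	}
	return bad;
}
```

The loop only stops once a good commit and a bad commit are direct
neighbours, which is the state the end of the log shows. A wrong answer
in an earlier step would instead tend to push the search into a range
that is all good or all bad.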


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 23:25   ` Artem S. Tashkinov
@ 2015-12-20 23:42     ` Kent Overstreet
  2015-12-20 23:49       ` Artem S. Tashkinov
  0 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-20 23:42 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Christoph Hellwig, Linus Torvalds, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

On Mon, Dec 21, 2015 at 04:25:12AM +0500, Artem S. Tashkinov wrote:
> On 2015-12-20 23:18, Christoph Hellwig wrote:
> >On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
> >>Kent, Jens, Christoph et al,
> >> please see this bugzilla:
> >>
> >>  https://bugzilla.kernel.org/show_bug.cgi?id=109661
> >>
> >>where Artem Tashkinov bisected his problems with 4.3 down to commit
> >>b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> >>signed off on.
> >
> >Artem,
> >
> >can you re-check the commits around this series again?  I would be
> >extremely surprised if it's really this particular commit and not
> >one just before it causing the problem - it just allocates bios
> >to the biggest possible instead of only allocating up to what
> >bio_add_page would accept.
> 
> I'm positive about this particular commit. Of course, it might be another
> GCC 4.7.4 miscompilation which causes the errors which shouldn't be there
> but I'm not an expert, so.

I believe you on the commit, and I doubt this has anything to do with gcc - the
errors you're getting are exactly what you normally get when you send the device
an sglist to dma to/from that it doesn't like.

The queue limits stuff is annoyingly fragile, you'd think we'd be able to check
directly in the driver that the stuff we're sending the device is sane but we
don't.

If I came up with a debug patch could you try it out? I don't have any ideas for
one yet, but if someone who knows the ATA code doesn't jump in I'll call up
Tejun and make him walk me through it.

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 23:42     ` Kent Overstreet
@ 2015-12-20 23:49       ` Artem S. Tashkinov
  0 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-20 23:49 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christoph Hellwig, Linus Torvalds, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

On 2015-12-21 04:42, Kent Overstreet wrote:
> On Mon, Dec 21, 2015 at 04:25:12AM +0500, Artem S. Tashkinov wrote:
>> On 2015-12-20 23:18, Christoph Hellwig wrote:
>> >On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
>> >>Kent, Jens, Christoph et al,
>> >> please see this bugzilla:
>> >>
>> >>  https://bugzilla.kernel.org/show_bug.cgi?id=109661
>> >>
>> >>where Artem Tashkinov bisected his problems with 4.3 down to commit
>> >>b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
>> >>signed off on.
>> >
>> >Artem,
>> >
>> >can you re-check the commits around this series again?  I would be
>> >extremely surprised if it's really this particular commit and not
>> >one just before it causing the problem - it just allocates bios
>> >to the biggest possible instead of only allocating up to what
>> >bio_add_page would accept.
>> 
>> I'm positive about this particular commit. Of course, it might be 
>> another
>> GCC 4.7.4 miscompilation which causes the errors which shouldn't be 
>> there
>> but
>> I'm not an expert, so.
> 
> I believe you on the commit, and I doubt this has anything to do with 
> gcc - the
> errors you're getting are exactly what you normally get when you send 
> the device
> an sglist to dma to/from that it doesn't like.
> 
> The queue limits stuff is annoyingly fragile, you'd think we'd be able 
> to check
> directly in the driver that the stuff we're sending the device is sane 
> but we
> don't.
> 
> If I came up with a debug patch could you try it out? I don't have any 
> ideas for
> one yet, but if someone who knows the ATA code doesn't jump in I'll 
> call up
> Tejun and make him walk me through it.

No problem, I just hope that this particular access mode (and your debug
patch) won't decrease the lifespan of my HDD. Seagate HDDs have been
very fragile (read: atrociously unreliable) for the past five years.

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
  2015-12-20 18:18 ` Christoph Hellwig
  2015-12-20 23:23 ` Artem S. Tashkinov
@ 2015-12-21  1:38 ` Ming Lei
  2015-12-21  1:50   ` Artem S. Tashkinov
  2015-12-21  4:26 ` Tejun Heo
  2015-12-21  6:55 ` Tejun Heo
  4 siblings, 1 reply; 45+ messages in thread
From: Ming Lei @ 2015-12-21  1:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

On Mon, Dec 21, 2015 at 1:51 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Kent, Jens, Christoph et al,
>  please see this bugzilla:
>
>   https://bugzilla.kernel.org/show_bug.cgi?id=109661
>
> where Artem Tashkinov bisected his problems with 4.3 down to commit
> b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> signed off on.
>
> (Also Tejun - maybe you can see what's up - maybe that error message
> tells you something)
>
> I'm not sure what's up with his machine, the disk doesn't seem to be
> anything particularly unusual, it looks like a 1TB Seagate Barracuda:
>
>   ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
>
> which doesn't strike me as odd.
>
> Looking at the dmesg, it also looks like it's a pretty normal
> Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
> ID for the AHCI chip seems to be (INTEL, 0x1c02).
>
> Any ideas? Anybody?

BTW, I have posted a very similar issue at this link:

http://marc.info/?l=linux-ide&m=145066119623811&w=2

Artem, I noticed from bugzilla that the hardware is i386; is PAE
enabled?  If yes, I am more confident that the two reports describe
similar, or the same, issues.

Thanks,

>
>                        Linus



-- 
Ming Lei

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  1:38 ` Ming Lei
@ 2015-12-21  1:50   ` Artem S. Tashkinov
  2015-12-21  2:18     ` Ming Lei
                       ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  1:50 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linus Torvalds, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On 2015-12-21 06:38, Ming Lei wrote:
> On Mon, Dec 21, 2015 at 1:51 AM, Linus Torvalds wrote:
>> Kent, Jens, Christoph et al,
>>  please see this bugzilla:
>> 
>>   https://bugzilla.kernel.org/show_bug.cgi?id=109661
>> 
>> where Artem Tashkinov bisected his problems with 4.3 down to commit
>> b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
>> signed off on.
>> 
>> (Also Tejun - maybe you can see what's up - maybe that error message
>> tells you something)
>> 
>> I'm not sure what's up with his machine, the disk doesn't seem to be
>> anything particularly unusual, it looks like a 1TB Seagate Barracuda:
>> 
>>   ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
>> 
>> which doesn't strike me as odd.
>> 
>> Looking at the dmesg, it also looks like it's a pretty normal
>> Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
>> ID for the AHCI chip seems to be (INTEL, 0x1c02).
>> 
>> Any ideas? Anybody?
> 
> BTW, I have posted very similar issue in the link:
> 
> http://marc.info/?l=linux-ide&m=145066119623811&w=2
> 
> Artem, I noticed from bugzilla that the hardware is i386, just
> wondering if PAE is enabled?  If yes, I am more confident
> that both the two kinds of report are similar or same.
> 

Yes, I'm on i686 with PAE (16GB of RAM here) - it's specifically 
mentioned in the corresponding bug report.

P.S. I know Linus doesn't condone PAE, but I still find it
preferable to running a mixed environment with almost zero benefit in
regard to performance and quite obvious performance regressions related 
to an increased number of libraries being loaded (i686 + x86_64) and 
slightly bloated code which sometimes cannot fit in the CPU cache. Call 
me old fashioned but I won't upgrade to x86_64 until most of the things 
that I run locally are available for x86_64 and that won't happen any 
time soon.

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  1:50   ` Artem S. Tashkinov
@ 2015-12-21  2:18     ` Ming Lei
  2015-12-21  2:25       ` Artem S. Tashkinov
  2015-12-21  2:32     ` Kent Overstreet
  2015-12-21  4:32     ` Linus Torvalds
  2 siblings, 1 reply; 45+ messages in thread
From: Ming Lei @ 2015-12-21  2:18 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Linus Torvalds, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On Mon, Dec 21, 2015 at 9:50 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>> BTW, I have posted very similar issue in the link:
>>
>> http://marc.info/?l=linux-ide&m=145066119623811&w=2
>>
>> Artem, I noticed from bugzilla that the hardware is i386, just
>> wondering if PAE is enabled?  If yes, I am more confident
>> that both the two kinds of report are similar or same.
>>
>
> Yes, I'm on i686 with PAE (16GB of RAM here) - it's specifically mentioned
> in the corresponding bug report.

OK, could you dump the values of the following files under /sys/block/sdN/queue/ ?

            max_hw_sectors_kb
            max_sectors_kb
            max_segments
            max_segment_size

'sdN' is the name of the faulty disk.

Thanks,
Ming Lei

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  2:18     ` Ming Lei
@ 2015-12-21  2:25       ` Artem S. Tashkinov
  0 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  2:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: Linus Torvalds, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On 2015-12-21 07:18, Ming Lei wrote:
> On Mon, Dec 21, 2015 at 9:50 AM, Artem S. Tashkinov wrote:
>>> BTW, I have posted very similar issue in the link:
>>> 
>>> http://marc.info/?l=linux-ide&m=145066119623811&w=2
>>> 
>>> Artem, I noticed from bugzilla that the hardware is i386, just
>>> wondering if PAE is enabled?  If yes, I am more confident
>>> that both the two kinds of report are similar or same.
>>> 
>> 
>> Yes, I'm on i686 with PAE (16GB of RAM here) - it's specifically 
>> mentioned
>> in the corresponding bug report.
> 
> OK, could you dump value of the following files under 
> /sys/block/sdN/queue/ ?
> 
>             max_hw_sectors_kb
>             max_sectors_kb
>             max_segments
>             max_segment_size
> 
> 'sdN' is the faulted disk name.
> 

# cat 
/sys/block/sda/queue/{max_hw_sectors_kb,max_sectors_kb,max_segments,max_segment_size}
32767
32767
168
65536
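
For reference, a rough sketch of what these four values imply when taken
together. It assumes that the largest single request is bounded both by
max_sectors_kb and by max_segments * max_segment_size; the struct and
function names are hypothetical, and only the numbers come from the
output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the sysfs values; not a kernel structure. */
struct queue_limits_snapshot {
	uint32_t max_sectors_kb;	/* KiB per request */
	uint32_t max_segments;		/* sg entries per request */
	uint32_t max_segment_size;	/* bytes per sg entry */
};

/* Largest request, in bytes, that satisfies both bounds at once. */
uint64_t effective_max_bytes(const struct queue_limits_snapshot *q)
{
	uint64_t by_sectors, by_segments;

	assert(q != NULL);
	by_sectors = (uint64_t)q->max_sectors_kb * 1024;
	by_segments = (uint64_t)q->max_segments * q->max_segment_size;
	return by_segments < by_sectors ? by_segments : by_sectors;
}
```

With the values above, 168 * 65536 = 11010048 bytes (10752 KiB), so the
segment limits are the tighter bound, well under the 32767 KiB allowed
by max_sectors_kb.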

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  1:50   ` Artem S. Tashkinov
  2015-12-21  2:18     ` Ming Lei
@ 2015-12-21  2:32     ` Kent Overstreet
  2015-12-21  3:21       ` Ming Lei
  2015-12-21  4:32     ` Linus Torvalds
  2 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-21  2:32 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Ming Lei, Linus Torvalds, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, Martin K. Petersen

On Mon, Dec 21, 2015 at 06:50:21AM +0500, Artem S. Tashkinov wrote:
> On 2015-12-21 06:38, Ming Lei wrote:
> >On Mon, Dec 21, 2015 at 1:51 AM, Linus Torvalds wrote:
> >>Kent, Jens, Christoph et al,
> >> please see this bugzilla:
> >>
> >>  https://bugzilla.kernel.org/show_bug.cgi?id=109661
> >>
> >>where Artem Tashkinov bisected his problems with 4.3 down to commit
> >>b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> >>signed off on.
> >>
> >>(Also Tejun - maybe you can see what's up - maybe that error message
> >>tells you something)
> >>
> >>I'm not sure what's up with his machine, the disk doesn't seem to be
> > >>anything particularly unusual, it looks like a 1TB Seagate Barracuda:
> >>
> >>  ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
> >>
> >>which doesn't strike me as odd.
> >>
> >>Looking at the dmesg, it also looks like it's a pretty normal
> >>Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
> >>ID for the AHCI chip seems to be (INTEL, 0x1c02).
> >>
> >>Any ideas? Anybody?
> >
> >BTW, I have posted very similar issue in the link:
> >
> >http://marc.info/?l=linux-ide&m=145066119623811&w=2
> >
> >Artem, I noticed from bugzilla that the hardware is i386, just
> >wondering if PAE is enabled?  If yes, I am more confident
> >that both the two kinds of report are similar or same.
> >
> 
> Yes, I'm on i686 with PAE (16GB of RAM here) - it's specifically mentioned
> in the corresponding bug report.
> 
> P.S. I know Linus doesn't condone PAE but I still find it more preferable
> than running a mixed environment with almost zero benefit in regard to
> performance and quite obvious performance regressions related to an
> increased number of libraries being loaded (i686 + x86_64) and slightly
> bloated code which sometimes cannot fit in the CPU cache. Call me old
> fashioned but I won't upgrade to x86_64 until most of the things that I run
> locally are available for x86_64 and that won't happen any time soon.

oy vey. WTF's been happening in blk-merge.c?

They're not the same bug. The bug in your thread was introduced by Jens in
5014c311ba "block: fix bogus compiler warnings in blk-merge.c", where he screwed
up the bvprv handling - but that patch comes after the patch Artem bisected to.

blk_bio_segment_split() looks correct in b54ffb73ca. 

What we need to do is:
 in the _driver_, immediately before handing the sglist off to the device, walk
 the sglist and verify it obeys all the restrictions for that particular device
 - and if it's not, print out exactly what we screwed up.

I don't know where that code lives in the ahci driver, and more importantly I
don't know where the dma restrictions come from, but if someone who knows the
driver code can walk me through it I'll write the patch.

--------------

Also - Ming, Christoph, anyone else who might be working on this stuff in the
future:

The way all the queue limits stuff works is still way too fragile; this has been
a recurring source of bugs. There's way too many different restrictions
different devices need, and it's easy for a driver to specify the restrictions
incorrectly in a way that just happens to work, but for the wrong reasons - e.g.
"I can't handle more than x segments, but saying I can't handle more than x
sectors happens to work for now because of some other bug in the upper layers" -
and then when we have to debug that later, we're screwed.

My intent when I was working on this was to eventually push the implementation
of the limits down as much as possible to the actual drivers - i.e. where the
limitations come from, so the driver can say, for example:

"ok, my device can only do scatter/gather dma to max 20 different addresses, so
I'll allocate sglists with 20 entries, and it doesn't matter if the bio or
request or whatever is bigger because when I call blk_rq_map_sg() it's just
going to map as much of the request as will fit in a given sglist and requests
will get processed incrementally until they're finished - and if a particular sg
entry can only be a particular size, or has alignment restrictions or whatever,
I'll just pass that directly to blk_rq_map_sg()"

so that the driver is ideally specifying _only_ its real restrictions, and
they're being specified in the code exactly where they're being used.

-------

Basically, blk_queue_split() was only meant to be an interim solution, so I'd
suggest that instead of doing performance optimizations on that codepath a
better use of time and effort would be to work towards ripping it out entirely.
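
A minimal userspace sketch of the kind of check described above (walk
the sglist right before it goes to the device and report the first rule
it breaks). Everything here is a hypothetical stand-in: struct sg_entry
and struct dma_limits are not the kernel's struct scatterlist or
struct queue_limits, and the three rules checked are only the ones
named in this thread (segment count, segment size, segment boundary).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scatter/gather entry. */
struct sg_entry {
	uint64_t addr;	/* DMA address of the segment */
	uint32_t len;	/* segment length in bytes */
};

/* Hypothetical per-device restrictions. */
struct dma_limits {
	size_t max_segments;		/* max entries per command */
	uint32_t max_segment_size;	/* max bytes per entry */
	uint64_t seg_boundary_mask;	/* 2^n - 1; an entry must not cross it */
};

enum sg_check {
	SG_OK = 0,
	SG_TOO_MANY,		/* more entries than the device accepts */
	SG_EMPTY_SEG,		/* zero-length entry */
	SG_TOO_BIG,		/* entry larger than max_segment_size */
	SG_CROSSES_BOUNDARY	/* entry straddles a segment boundary */
};

/* Report the first violated restriction, or SG_OK if there is none. */
enum sg_check check_sg_limits(const struct sg_entry *sg, size_t n,
			      const struct dma_limits *lim)
{
	size_t i;

	assert(lim != NULL && (n == 0 || sg != NULL));
	if (n > lim->max_segments)
		return SG_TOO_MANY;
	for (i = 0; i < n; i++) {
		uint64_t first = sg[i].addr;
		uint64_t last;

		if (sg[i].len == 0)
			return SG_EMPTY_SEG;
		if (sg[i].len > lim->max_segment_size)
			return SG_TOO_BIG;
		last = first + sg[i].len - 1;
		if ((first | lim->seg_boundary_mask) !=
		    (last | lim->seg_boundary_mask))
			return SG_CROSSES_BOUNDARY;
	}
	return SG_OK;
}
```

In a real driver the result would presumably feed a WARN_ON_ONCE() plus
a dump of the offending entry rather than a return code; that part is
left out because it depends on driver internals not shown here.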

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  2:32     ` Kent Overstreet
@ 2015-12-21  3:21       ` Ming Lei
  2015-12-21  3:36         ` Artem S. Tashkinov
  0 siblings, 1 reply; 45+ messages in thread
From: Ming Lei @ 2015-12-21  3:21 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Artem S. Tashkinov, Linus Torvalds, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, Martin K. Petersen

On Mon, Dec 21, 2015 at 10:25 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> # cat
> /sys/block/sda/queue/{max_hw_sectors_kb,max_sectors_kb,max_segments,max_segment_size}
> 32767
> 32767
> 168
> 65536

That looks fine, so maybe it is related to BIOVEC_PHYS_MERGEABLE(),
BIOVEC_SEG_BOUNDARY() or something similar, because dma_addr_t and
phys_addr_t become 64-bit with PAE, but 'unsigned long' and 'void *'
are still 32-bit.

It was confirmed that the issue doesn't occur if PAE is disabled.

Dumping both the sata/ahci hw sg table and the bio's bvecs might be helpful.

On Mon, Dec 21, 2015 at 10:32 AM, Kent Overstreet
<kent.overstreet@gmail.com> wrote:
>
> oy vey. WTF's been happening in blk-merge.c?
>
> They're not the same bug. The bug in your thread was introduced by Jens in
> 5014c311ba "block: fix bogus compiler warnings in blk-merge.c", where he screwed
> up the bvprv handling - but that patch comes after the patch Artem bisected to.
>
> blk_bio_segment_split() looks correct in b54ffb73ca.

Yes, that is why reverting 578270bfb (block: fix segment split) can make the
issue disappear, because 5014c311ba "block: fix bogus compiler
warnings in blk-merge.c" basically disables sg-merge and prevents the
issue from being triggered.



Thanks,
Ming Lei

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  3:21       ` Ming Lei
@ 2015-12-21  3:36         ` Artem S. Tashkinov
  0 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  3:36 UTC (permalink / raw)
  To: Ming Lei
  Cc: Kent Overstreet, Linus Torvalds, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, Martin K. Petersen

On 2015-12-21 08:21, Ming Lei wrote:
> On Mon, Dec 21, 2015 at 10:25 AM, Artem S. Tashkinov wrote:
>> # cat
>> /sys/block/sda/queue/{max_hw_sectors_kb,max_sectors_kb,max_segments,max_segment_size}
>> 32767
>> 32767
>> 168
>> 65536
> 
> Looks it is fine, then maybe it is related with 
> BIOVEC_PHYS_MERGEABLE(),
> BIOVEC_SEG_BOUNDARY() or sort of thing, because dma_addr_t and
> phys_addr_t turn to 64-bit with PAE, but 'unsigned long' and 'void *'
> is still 32bit.
> 
> It was confirmed that there isn't the issue if PAE is disabled.
> 
> Dumping both sata/ahci hw sg table and bio's bvec might be helpful.

Um, sorry, what exact variables/files do you want to see? I'm not an 
expert in /sys.

> 
> On Mon, Dec 21, 2015 at 10:32 AM, Kent Overstreet wrote:
>> 
>> oy vey. WTF's been happening in blk-merge.c?
>> 
>> They're not the same bug. The bug in your thread was introduced by
>> Jens in
>> 5014c311ba "block: fix bogus compiler warnings in blk-merge.c", where 
>> he screwed
>> up the bvprv handling - but that patch comes after the patch Artem 
>> bisected to.
>> 
>> blk_bio_segment_split() looks correct in b54ffb73ca.
> 
> Yes, that is why reverting 578270bfb(block: fix segment split) can make 
> the
> issue disappear, because 5014c311ba "block: fix bogus compiler
> warnings in blk-merge.c" basically disables sg-merge and prevents the
> issue from being
> triggered.

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
                   ` (2 preceding siblings ...)
  2015-12-21  1:38 ` Ming Lei
@ 2015-12-21  4:26 ` Tejun Heo
  2015-12-21  5:10   ` Linus Torvalds
  2015-12-21  6:55 ` Tejun Heo
  4 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2015-12-21  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List

Hello, Linus.

On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
...
> (Also Tejun - maybe you can see what's up - maybe that error message
> tells you something)

Hmmm... all it says is that something went wrong on the PCI side.

> I'm not sure what's up with his machine, the disk doesn't seem to be
> anything particularly unusual, it looks like a 1TB Seagate Barracuda:
> 
>   ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
> 
> which doesn't strike me as odd.
> 
> Looking at the dmesg, it also looks like it's a pretty normal
> Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
> ID for the AHCI chip seems to be (INTEL, 0x1c02).
> 
> Any ideas? Anybody?

I wonder whether ahci is screwing up command / sg table setup in a way
that e.g. if there are too many segments the sg table overflows into
the neighboring one which is now being exposed by upper layer being
fixed to send down larger commands.  Looking into it.
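
To make the "overflows into the neighboring one" scenario concrete,
here is a small sketch of the per-slot command table layout. The
constants are as recalled from drivers/ata/ahci.h and should be treated
as assumptions; AHCI_SG_ENTRY_SZ and both helper functions are
hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, as recalled from drivers/ata/ahci.h. */
#define AHCI_MAX_SG		168	/* hardware sg entries per command */
#define AHCI_SG_ENTRY_SZ	16	/* hypothetical name: bytes per sg entry */
#define AHCI_CMD_TBL_HDR_SZ	0x80
#define AHCI_CMD_TBL_SZ		(AHCI_CMD_TBL_HDR_SZ + AHCI_MAX_SG * AHCI_SG_ENTRY_SZ)

/* Offset of the command table for slot `tag` within the cmd_tbl area. */
size_t cmd_tbl_offset(unsigned int tag)
{
	assert(tag < 32);	/* AHCI has at most 32 command slots */
	return (size_t)tag * AHCI_CMD_TBL_SZ;
}

/*
 * Offset of sg entry `idx` in slot `tag`. The index is deliberately not
 * bounds-checked, to show where an out-of-range entry would land.
 */
size_t sg_entry_offset(unsigned int tag, unsigned int idx)
{
	return cmd_tbl_offset(tag) + AHCI_CMD_TBL_HDR_SZ +
	       (size_t)idx * AHCI_SG_ENTRY_SZ;
}
```

Under these assumptions each table is 2816 bytes, entry 167 is the last
one that fits, and a 169th entry (index 168) of slot 0 would sit exactly
at the start of slot 1's command table.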

Thanks.

-- 
tejun

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  1:50   ` Artem S. Tashkinov
  2015-12-21  2:18     ` Ming Lei
  2015-12-21  2:32     ` Kent Overstreet
@ 2015-12-21  4:32     ` Linus Torvalds
  2015-12-21  4:43       ` Artem S. Tashkinov
  2 siblings, 1 reply; 45+ messages in thread
From: Linus Torvalds @ 2015-12-21  4:32 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 5:50 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>
> P.S. I know Linus doesn't condone PAE but I still find it more preferable
> than running a mixed environment with almost zero benefit in regard to
> performance and quite obvious performance regressions related to an
> increased number of libraries being loaded (i686 + x86_64) and slightly
> bloated code which sometimes cannot fit in the CPU cache. Call me old
> fashioned but I won't upgrade to x86_64 until most of the things that I run
> locally are available for x86_64 and that won't happen any time soon.

Don't upgrade *user* land. User land doesn't use the braindamage that is PAE.

Just run a 64-bit kernel. Keep all your 32-bit userland apps and libraries.

Trust me, that *will* be faster. PAE works really horribly badly,
because all your really important data structures like your inodes and
directory cache will all be in the low 1GB even if you have 16GB of
RAM.

Of course, I'd also like more people to run things that way just to
get more coverage of the whole "yes, we do all the compat stuff
correctly". So I have some other reasons to prefer people running
64-bit kernels with 32-bit user land. But PAE really is a disaster.

                 Linus

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  4:32     ` Linus Torvalds
@ 2015-12-21  4:43       ` Artem S. Tashkinov
  2015-12-21  4:47         ` Linus Torvalds
  0 siblings, 1 reply; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  4:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, linus971

On 2015-12-21 09:32, Linus Torvalds wrote:
> On Sun, Dec 20, 2015 at 5:50 PM, Artem S. Tashkinov wrote:
>> 
>> P.S. I know Linus doesn't condone PAE but I still find it more 
>> preferable
>> than running a mixed environment with almost zero benefit in regard to
>> performance and quite obvious performance regressions related to an
>> increased number of libraries being loaded (i686 + x86_64) and 
>> slightly
>> bloated code which sometimes cannot fit in the CPU cache. Call me old
>> fashioned but I won't upgrade to x86_64 until most of the things that 
>> I run
>> locally are available for x86_64 and that won't happen any time soon.
> 
> Don't upgrade *user* land. User land doesn't use the braindamage that 
> is PAE.
> 
> Just run a 64-bit kernel. Keep all your 32-bit userland apps and 
> libraries.
> 
> Trust me, that *will* be faster. PAE works really horribly badly,
> because all your really important data structures like your inodes and
> directory cache will all be in the low 1GB even if you have 16GB of
> RAM.
> 
> Of course, I'd also like more people to run things that way just to
> get more coverage of the whole "yes, we do all the compat stuff
> correctly". So I have some other reasons to prefer people running
> 64-bit kernels with 32-bit user land. But PAE really is a disaster.
> 

In the past I happily ran an x86_64 kernel together with a 32-bit
userland for quite some time, but then I hit a wall: VirtualBox expects
its kernel modules to have the same bitness as the application itself, so
I had to revert to an i686 PAE setup. It's probably high time to
try qemu; however, the last time I looked at it, a few years ago, it lacked
several crucial features I need from a VM.

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  4:43       ` Artem S. Tashkinov
@ 2015-12-21  4:47         ` Linus Torvalds
  2015-12-21  5:23           ` Linus Torvalds
  0 siblings, 1 reply; 45+ messages in thread
From: Linus Torvalds @ 2015-12-21  4:47 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 8:43 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>
> In the past I happily ran an x86_64 bit kernel together with 32bit userland
> for quite some time but then I hit a wall: VirtualBox expects its kernel
> modules to have the same bitness as the application itself so I had to
> revert back to an i686 PAE setup.

Ugh, ok. That kind of forces your hand, yes.

Although:

> It's probably high time to try qemu however last time I looked at it a few
> years ago it lacked several crucial features I need from a VM.

kvm-qemu really ends up working pretty well.. Give it a try.

That said, we obviously need to figure out this current problem
regardless first..

                 Linus

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  4:26 ` Tejun Heo
@ 2015-12-21  5:10   ` Linus Torvalds
  0 siblings, 0 replies; 45+ messages in thread
From: Linus Torvalds @ 2015-12-21  5:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 8:26 PM, Tejun Heo <tj@kernel.org> wrote:
>
> I wonder whether ahci is screwing up command / sg table setup in a way
> that e.g. if there are too many segments the sg table overflows into
> the neighboring one which is now being exposed by upper layer being
> fixed to send down larger commands.  Looking into it.

That would explain the

  Corrupted low memory at c0001000 ...

that Artem also saw.

Anyway, it would be lovely to have some verification in the ATA
routines that the passed-on IO actually honors the limits it set.
Could you add a "WARN_ON_ONCE(check_io_limits())" or similar, and maybe
we could catch whatever causes the overflow red-handed?

On a totally separate issue:

Just looking at some of the merging code, and I have to say that it
strikes me as insane. This in particular:

  #define __BIO_SEG_BOUNDARY(addr1, addr2, mask) \
        (((addr1) | (mask)) == (((addr2) - 1) | (mask)))
  #define BIOVEC_SEG_BOUNDARY(q, b1, b2) \
        __BIO_SEG_BOUNDARY(bvec_to_phys((b1)), \
                bvec_to_phys((b2)) + (b2)->bv_len, \
                queue_segment_boundary((q)))

seems just *stupid*.

Why does it do that "bvec_to_phys((b2)) + (b2)->bv_len - 1" on the
second bvec? That's the "physical address of the last byte of the
second bvec".

I understand the "round both addresses up by the mask, and we want to
make sure that they are in the same segment" part.

But since an individual bvec had better be fully inside one segment
(since we split at bvec boundaries anyway), why do all that
crap anyway? The end address doesn't matter, you could just use the
beginning.

So remove the "-1" and remove the "+bv_len".

At which point it would become just

  #define __BIO_SEG_BOUNDARY(addr1, addr2, mask) \
        (((addr1) | (mask)) == ((addr2) | (mask)))
  #define BIOVEC_SEG_BOUNDARY(q, b1, b2) \
        __BIO_SEG_BOUNDARY(bvec_to_phys((b1)), bvec_to_phys((b2)), \
                queue_segment_boundary((q)))

which seems simpler and more understandable. "Are the beginning
addresses within the same segment"

Or are there ever bv_len == 0 things at the boundary that we want to
merge? Because then the "-1+bv_len" case might make sense.

Anyway, that shouldn't change the end result in any way, so that
doesn't all *matter*, but it worries me when things look more
complicated than I think they should be.

Is there something I'm missing?
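
The two variants of the boundary test can be compared side by side in
plain C. This is only a sketch of the arithmetic in the macros above,
with hypothetical function names and plain integers in place of bvecs;
it says nothing about which inputs the block layer actually produces.

```c
#include <assert.h>
#include <stdint.h>

/* A boundary mask is expected to have the form 2^n - 1. */
static int mask_is_valid(uint64_t mask)
{
	return (mask & (mask + 1)) == 0;
}

/* Existing form: start of the first bvec vs. last byte of the second. */
int same_segment_by_end(uint64_t start1, uint64_t start2, uint32_t len2,
			uint64_t mask)
{
	assert(mask_is_valid(mask));
	return (start1 | mask) == ((start2 + len2 - 1) | mask);
}

/* Simplified form: compare only the two start addresses. */
int same_segment_by_start(uint64_t start1, uint64_t start2, uint64_t mask)
{
	assert(mask_is_valid(mask));
	return (start1 | mask) == (start2 | mask);
}
```

With a 64KiB boundary (mask 0xffff) the two agree whenever the second
bvec lies entirely inside one segment. They differ in two cases: when
the second bvec itself crosses a boundary (only the end-based form
notices), and when a zero-length bvec starts exactly on a boundary
(the end-based form then treats it as belonging to the previous
segment).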

               Linus

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  4:47         ` Linus Torvalds
@ 2015-12-21  5:23           ` Linus Torvalds
  2015-12-21  7:31             ` Artem S. Tashkinov
  2015-12-22  4:06             ` Artem S. Tashkinov
  0 siblings, 2 replies; 45+ messages in thread
From: Linus Torvalds @ 2015-12-21  5:23 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List

On Sun, Dec 20, 2015 at 8:47 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> That said, we obviously need to figure out this current problem
> regardless first..

... although maybe it *would* be interesting to hear what happens if
you just compile a 64-bit kernel instead?

Do you still see the problem? Because if not, then we should look very
specifically for some 32-bit PAE issue.

For example, maybe we use "unsigned long" somewhere where we should
use "phys_addr_t". On x86-64, they obviously end up being the same. On
normal non-PAE x86-32, they are also the same. But ..
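
A small sketch of how such a width mismatch could go wrong, assuming a
contiguity test of the "does b start where a ends" kind. The typedefs
are stand-ins (a 64-bit host has a 64-bit 'unsigned long', so the
32-bit case is emulated with uint32_t), the function names are
hypothetical, and nothing in this thread has yet shown that the kernel
actually does this.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pae_phys_t;	/* stand-in for phys_addr_t with PAE */
typedef uint32_t ulong32_t;	/* stand-in for 'unsigned long' on 32-bit x86 */

/* Does b start exactly where a ends? Compared at full 64-bit width. */
int phys_contiguous(pae_phys_t a, uint32_t a_len, pae_phys_t b)
{
	assert(a_len > 0);
	return a + a_len == b;
}

/* The same test after both sides have passed through a 32-bit type. */
int phys_contiguous_truncated(pae_phys_t a, uint32_t a_len, pae_phys_t b)
{
	assert(a_len > 0);
	return (ulong32_t)(a + a_len) == (ulong32_t)b;
}
```

For a page at 0x100001000 (just above 4GB) followed by one at 0x2000,
the full-width test says "not contiguous" while the truncated test says
"contiguous", which is the kind of false merge that could only show up
with PAE and more than 4GB of RAM.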

                 Linus

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
                   ` (3 preceding siblings ...)
  2015-12-21  4:26 ` Tejun Heo
@ 2015-12-21  6:55 ` Tejun Heo
  2015-12-21  7:25   ` Artem S. Tashkinov
  4 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2015-12-21  6:55 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Kent Overstreet, Christoph Hellwig, Ming Lin, Jens Axboe,
	Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List

Artem, can you please reproduce the issue with the following patch
applied and attach the kernel log?

Thanks.

---
 drivers/ata/libahci.c     |   40 ++++++++++++++++++++++++++++++++++++++--
 drivers/ata/libata-eh.c   |    4 ++++
 drivers/ata/libata-scsi.c |    1 +
 3 files changed, 43 insertions(+), 2 deletions(-)

--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -2278,7 +2278,7 @@ static int ahci_port_start(struct ata_po
 	struct ahci_host_priv *hpriv = ap->host->private_data;
 	struct device *dev = ap->host->dev;
 	struct ahci_port_priv *pp;
-	void *mem;
+	void *mem, *base;
 	dma_addr_t mem_dma;
 	size_t dma_sz, rx_fis_sz;
 
@@ -2319,7 +2319,9 @@ static int ahci_port_start(struct ata_po
 		rx_fis_sz = AHCI_RX_FIS_SZ;
 	}
 
-	mem = dmam_alloc_coherent(dev, dma_sz, &mem_dma, GFP_KERNEL);
+	base = mem = dmam_alloc_coherent(dev, dma_sz, &mem_dma, GFP_KERNEL);
+	printk("XXX port %d dma_sz=%zu mem=%p mem_dma=%p",
+	       ap->port_no, dma_sz, mem, (void *)mem_dma);
 	if (!mem)
 		return -ENOMEM;
 	memset(mem, 0, dma_sz);
@@ -2331,6 +2333,8 @@ static int ahci_port_start(struct ata_po
 	pp->cmd_slot = mem;
 	pp->cmd_slot_dma = mem_dma;
 
+	pr_cont(" cmd_slot=%zu", mem - base);
+
 	mem += AHCI_CMD_SLOT_SZ;
 	mem_dma += AHCI_CMD_SLOT_SZ;
 
@@ -2340,6 +2344,8 @@ static int ahci_port_start(struct ata_po
 	pp->rx_fis = mem;
 	pp->rx_fis_dma = mem_dma;
 
+	pr_cont(" rx_fis=%td", mem - base);
+
 	mem += rx_fis_sz;
 	mem_dma += rx_fis_sz;
 
@@ -2350,6 +2356,8 @@ static int ahci_port_start(struct ata_po
 	pp->cmd_tbl = mem;
 	pp->cmd_tbl_dma = mem_dma;
 
+	pr_cont(" cmd_tbl=%td\n", mem - base);
+
 	/*
 	 * Save off initial list of interrupts to be enabled.
 	 * This could be changed later
@@ -2540,6 +2548,34 @@ int ahci_host_activate(struct ata_host *
 }
 EXPORT_SYMBOL_GPL(ahci_host_activate);
 
+void ahci_dump_dma(struct ata_queued_cmd *qc)
+{
+	struct ata_port *ap = qc->ap;
+	struct ahci_port_priv *pp = ap->private_data;
+	struct ahci_cmd_hdr *cmd = &pp->cmd_slot[qc->tag];
+	void *cmd_tbl = pp->cmd_tbl + qc->tag * AHCI_CMD_TBL_SZ;
+	u32 *fis = cmd_tbl;
+	struct ahci_sg *ahci_sg = cmd_tbl + AHCI_CMD_TBL_HDR_SZ;
+	int prdtl = (cmd->opts & 0xffff0000) >> 16;
+	int i;
+
+	printk("XXX cmd=%p cmd_tbl=%p ahci_sg=%p\n", cmd, cmd_tbl, ahci_sg);
+	printk("XXX opts=%x st=%x addr=%x addr_hi=%x rsvd=%x:%x:%x:%x\n",
+	       cmd->opts, cmd->status, cmd->tbl_addr, cmd->tbl_addr_hi,
+	       cmd->reserved[0], cmd->reserved[1], cmd->reserved[2], cmd->reserved[3]);
+	printk("XXX fis=%08x:%08x:%08x:%08x %08x:%08x:%08x:%08x\n",
+	       fis[0], fis[1], fis[2], fis[3],
+	       fis[4], fis[5], fis[6], fis[7]);
+
+	printk("XXX qc->n_elem=%d fis_len=%d prdtl=%d\n",
+	       qc->n_elem, cmd->opts & 0xf, prdtl);
+
+	for (i = 0; i < prdtl; i++)
+		printk("XXX sg[%d] = %x %x %x (%d)\n",
+		       i, ahci_sg[i].addr, ahci_sg[i].addr_hi, ahci_sg[i].flags_size,
+		       (ahci_sg[i].flags_size & 0x7fffffff) + 1);
+}
+
 MODULE_AUTHOR("Jeff Garzik");
 MODULE_DESCRIPTION("Common AHCI SATA low-level routines");
 MODULE_LICENSE("GPL");
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1059,6 +1059,7 @@ static int ata_do_link_abort(struct ata_
 
 		if (qc && (!link || qc->dev->link == link)) {
 			qc->flags |= ATA_QCFLAG_FAILED;
+			qc->err_mask = AC_ERR_DEV;
 			ata_qc_complete(qc);
 			nr_aborted++;
 		}
@@ -2416,6 +2417,8 @@ const char *ata_get_cmd_descript(u8 comm
 }
 EXPORT_SYMBOL_GPL(ata_get_cmd_descript);
 
+void ahci_dump_dma(struct ata_queued_cmd *qc);
+
 /**
  *	ata_eh_link_report - report error handling to user
  *	@link: ATA link EH is going on
@@ -2590,6 +2593,7 @@ static void ata_eh_link_report(struct at
 			  res->feature & ATA_IDNF ? "IDNF " : "",
 			  res->feature & ATA_ABORTED ? "ABRT " : "");
 #endif
+		ahci_dump_dma(qc);
 	}
 }
 
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -4035,6 +4035,7 @@ int ata_scsi_user_scan(struct Scsi_Host
 	}
 
 	if (rc == 0) {
+		ata_port_freeze(ap);
 		ata_port_schedule_eh(ap);
 		spin_unlock_irqrestore(ap->lock, flags);
 		ata_port_wait_eh(ap);
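
For anyone reading the "XXX opts=" and "XXX sg[...]" lines that
ahci_dump_dma() prints, a small userspace decoder may help. This is only
a sketch: it assumes the AHCI 1.x layout as I understand it (command
header DW0 carries the command FIS length in bits 4:0 and PRDTL in bits
31:16; PRD entry DW3 carries a zero-based byte count in bits 21:0). The
struct and function names are made up for illustration and are not part
of libahci.

```c
#include <stdint.h>

/* Hypothetical container for the DW0 fields of an AHCI command header. */
struct ahci_opts_fields {
	unsigned int fis_len_dwords;	/* CFL, bits 4:0, in dwords */
	unsigned int is_atapi;		/* A, bit 5 */
	unsigned int is_write;		/* W, bit 6 */
	unsigned int prdtl;		/* PRDTL, bits 31:16 */
};

/* Split a raw "opts" word (already in CPU byte order) into its fields. */
static struct ahci_opts_fields decode_opts(uint32_t opts)
{
	struct ahci_opts_fields f;

	f.fis_len_dwords = opts & 0x1f;
	f.is_atapi = (opts >> 5) & 1;
	f.is_write = (opts >> 6) & 1;
	f.prdtl = opts >> 16;
	return f;
}

/*
 * Byte count described by one PRD entry. Assumes the count lives in
 * bits 21:0 and is stored minus one; bit 31 (interrupt) is ignored.
 */
static uint32_t prd_byte_count(uint32_t flags_size)
{
	return (flags_size & 0x3fffff) + 1;
}
```

Summing prd_byte_count() over the first prdtl entries should give the
transfer length the controller was asked to do, which can then be
compared against the sector count in the FIS.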

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  6:55 ` Tejun Heo
@ 2015-12-21  7:25   ` Artem S. Tashkinov
  2015-12-21 19:35     ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  7:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 382 bytes --]

On 2015-12-21 11:55, Tejun Heo wrote:
> Artem, can you please reproduce the issue with the following patch
> applied and attach the kernel log?
> 
> Thanks.
> 

I've applied this patch on top of a vanilla 4.3.3 kernel (without Linus's 
revert). Hopefully that's how you intended it to be applied.

Here's the result (I skipped the beginning of dmesg - it's the same as 
always - see bugzilla).
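
Each "Corrupted low memory" line in the attached log pairs a kernel
virtual address with a physical one, and in this log they differ by
0xc0000000 (for example c0001000 vs. 1000 phys). A minimal sketch for
checking that relationship over a saved log follows. The helper name and
the PAGE_OFFSET_ASSUMED constant are hypothetical; the constant is taken
from what the log itself shows and would differ on a kernel with another
address split.

```c
#include <stdio.h>
#include <string.h>

/* Assumed direct-map offset, inferred from the log lines themselves. */
#define PAGE_OFFSET_ASSUMED 0xc0000000UL

/*
 * Parse one "Corrupted low memory" line.
 * Returns  1 if it parses and virt == phys + PAGE_OFFSET_ASSUMED,
 *          0 if it parses but the addresses do not line up,
 *         -1 if the line is not of this form.
 * On success *phys and *val hold the physical address and the word found.
 */
static int check_lowmem_line(const char *line, unsigned long *phys,
			     unsigned long *val)
{
	unsigned long virt;
	const char *p = strstr(line, "Corrupted low memory at ");

	if (!p)
		return -1;
	if (sscanf(p, "Corrupted low memory at %lx (%lx phys) = %lx",
		   &virt, phys, val) != 3)
		return -1;
	return virt == *phys + PAGE_OFFSET_ASSUMED;
}
```

Feeding every line of dmesg.log through this and tracking the lowest and
highest *phys gives the extent of the overwritten region in one pass.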

[-- Attachment #2: dmesg.log --]
[-- Type: text/plain, Size: 81753 bytes --]

[   60.387407] Corrupted low memory at c0001000 (1000 phys) = cba3d25f
[   60.387411] Corrupted low memory at c0001004 (1004 phys) = e8f17ba7
[   60.387413] Corrupted low memory at c0001008 (1008 phys) = 61cfa79a
[   60.387415] Corrupted low memory at c000100c (100c phys) = dc4d5d71
[   60.387417] Corrupted low memory at c0001010 (1010 phys) = adbdc15b
[   60.387418] Corrupted low memory at c0001014 (1014 phys) = dee76bdc
[   60.387420] Corrupted low memory at c0001018 (1018 phys) = 827dee31
[   60.387422] Corrupted low memory at c000101c (101c phys) = ef70cf7b
[   60.387423] Corrupted low memory at c0001020 (1020 phys) = 82fdee4d
[   60.387425] Corrupted low memory at c0001024 (1024 phys) = 77533c7b
[   60.387427] Corrupted low memory at c0001028 (1028 phys) = ddd4cf35
[   60.387428] Corrupted low memory at c000102c (102c phys) = 7beea149
[   60.387430] Corrupted low memory at c0001030 (1030 phys) = 798fe878
[   60.387432] Corrupted low memory at c0001034 (1034 phys) = 4283a7a8
[   60.387434] Corrupted low memory at c0001038 (1038 phys) = 4dee093d
[   60.387435] Corrupted low memory at c000103c (103c phys) = ee21ef73
[   60.387437] Corrupted low memory at c0001040 (1040 phys) = fe3dc93d
[   60.387439] Corrupted low memory at c0001044 (1044 phys) = b8e7cf0d
[   60.387440] Corrupted low memory at c0001048 (1048 phys) = af3c9977
[   60.387442] Corrupted low memory at c000104c (104c phys) = b80b7b8b
[   60.387444] Corrupted low memory at c0001050 (1050 phys) = b6f73d77
[   60.387445] Corrupted low memory at c0001054 (1054 phys) = f7276f70
[   60.387447] Corrupted low memory at c0001058 (1058 phys) = c62f70f6
[   60.387449] Corrupted low memory at c000105c (105c phys) = 3ef734bd
[   60.387451] Corrupted low memory at c0001060 (1060 phys) = 1ef79f40
[   60.387452] Corrupted low memory at c0001064 (1064 phys) = f1cf9f65
[   60.387454] Corrupted low memory at c0001068 (1068 phys) = 297a5390
[   60.387456] Corrupted low memory at c000106c (106c phys) = a7f14fbc
[   60.387457] Corrupted low memory at c0001070 (1070 phys) = 57ef71af
[   60.387459] Corrupted low memory at c0001074 (1074 phys) = 219d15e4
[   60.387461] Corrupted low memory at c0001078 (1078 phys) = 7b99a2af
[   60.387462] Corrupted low memory at c000107c (107c phys) = c56d281b
[   60.387464] Corrupted low memory at c0001080 (1080 phys) = 3c84de6e
[   60.387466] Corrupted low memory at c0001084 (1084 phys) = edee56ec
[   60.387468] Corrupted low memory at c0001088 (1088 phys) = 49b557a7
[   60.387469] Corrupted low memory at c000108c (108c phys) = 01baeb6a
[   60.387471] Corrupted low memory at c0001090 (1090 phys) = b775acde
[   60.387473] Corrupted low memory at c0001094 (1094 phys) = 30dd6851
[   60.387474] Corrupted low memory at c0001098 (1098 phys) = f328fd0f
[   60.387476] Corrupted low memory at c000109c (109c phys) = 17ad185c
[   60.387478] Corrupted low memory at c00010a0 (10a0 phys) = b83985f5
[   60.387479] Corrupted low memory at c00010a4 (10a4 phys) = 775b8af5
[   60.387481] Corrupted low memory at c00010a8 (10a8 phys) = 3d35e4bc
[   60.387483] Corrupted low memory at c00010ac (10ac phys) = bf4d7b90
[   60.387485] Corrupted low memory at c00010b0 (10b0 phys) = 1db6fd99
[   60.387486] Corrupted low memory at c00010b4 (10b4 phys) = 3b94bf2f
[   60.387488] Corrupted low memory at c00010b8 (10b8 phys) = 5f447e55
[   60.387490] Corrupted low memory at c00010bc (10bc phys) = dcfe6395
[   60.387491] Corrupted low memory at c00010c0 (10c0 phys) = fc0b7a23
[   60.387493] Corrupted low memory at c00010c4 (10c4 phys) = 32fa23aa
[   60.387495] Corrupted low memory at c00010c8 (10c8 phys) = e88ef3f8
[   60.387496] Corrupted low memory at c00010cc (10cc phys) = 1ed7e14b
[   60.387498] Corrupted low memory at c00010d0 (10d0 phys) = 9fc3d7d1
[   60.387500] Corrupted low memory at c00010d4 (10d4 phys) = 015f447f
[   60.387501] Corrupted low memory at c00010d8 (10d8 phys) = 7d11c17f
[   60.387503] Corrupted low memory at c00010dc (10dc phys) = 4785fc2d
[   60.387505] Corrupted low memory at c00010e0 (10e0 phys) = 5fe16bf4
[   60.387507] Corrupted low memory at c00010e4 (10e4 phys) = 4de3fcc5
[   60.387508] Corrupted low memory at c00010e8 (10e8 phys) = 4f477297
[   60.387510] Corrupted low memory at c00010ec (10ec phys) = 59a47d35
[   60.387512] Corrupted low memory at c00010f0 (10f0 phys) = c97c78df
[   60.387513] Corrupted low memory at c00010f4 (10f4 phys) = e3aafa4b
[   60.387515] Corrupted low memory at c00010f8 (10f8 phys) = 658bd8cb
[   60.387517] Corrupted low memory at c00010fc (10fc phys) = 6f5eb91f
[   60.387518] Corrupted low memory at c0001100 (1100 phys) = ca66ce3a
[   60.387520] Corrupted low memory at c0001104 (1104 phys) = b7d96a87
[   60.387522] Corrupted low memory at c0001108 (1108 phys) = ee0eeb4f
[   60.387523] Corrupted low memory at c000110c (110c phys) = 9e6f3671
[   60.387525] Corrupted low memory at c0001110 (1110 phys) = e0e797d5
[   60.387527] Corrupted low memory at c0001114 (1114 phys) = f099c96e
[   60.387528] Corrupted low memory at c0001118 (1118 phys) = 735ee4b5
[   60.387530] Corrupted low memory at c000111c (111c phys) = 6b7826fc
[   60.387532] Corrupted low memory at c0001120 (1120 phys) = f499b38e
[   60.387533] Corrupted low memory at c0001124 (1124 phys) = 629479a1
[   60.387535] Corrupted low memory at c0001128 (1128 phys) = e4f5b93e
[   60.387537] Corrupted low memory at c000112c (112c phys) = 03be3dce
[   60.387538] Corrupted low memory at c0001130 (1130 phys) = dc52b7af
[   60.387540] Corrupted low memory at c0001134 (1134 phys) = f043c7d5
[   60.387542] Corrupted low memory at c0001138 (1138 phys) = 4561e85a
[   60.387543] Corrupted low memory at c000113c (113c phys) = 2a5784ce
[   60.387545] Corrupted low memory at c0001140 (1140 phys) = 41f1cb70
[   60.387547] Corrupted low memory at c0001144 (1144 phys) = df8e1b78
[   60.387549] Corrupted low memory at c0001148 (1148 phys) = f8e3af06
[   60.387550] Corrupted low memory at c000114c (114c phys) = 1db6f09d
[   60.387552] Corrupted low memory at c0001150 (1150 phys) = 25eeb367
[   60.387554] Corrupted low memory at c0001154 (1154 phys) = ee0e5b7d
[   60.387555] Corrupted low memory at c0001158 (1158 phys) = 3d6ee4cd
[   60.387557] Corrupted low memory at c000115c (115c phys) = 336f063e
[   60.387559] Corrupted low memory at c0001160 (1160 phys) = 7e396c3d
[   60.387560] Corrupted low memory at c0001164 (1164 phys) = f0b6be17
[   60.387562] Corrupted low memory at c0001168 (1168 phys) = 67879b38
[   60.387564] Corrupted low memory at c000116c (116c phys) = c6ee8558
[   60.387565] Corrupted low memory at c0001170 (1170 phys) = 0cf1f553
[   60.387567] Corrupted low memory at c0001174 (1174 phys) = 3e3daed2
[   60.387569] Corrupted low memory at c0001178 (1178 phys) = f1e16f05
[   60.387570] Corrupted low memory at c000117c (117c phys) = 3fabbc69
[   60.387572] Corrupted low memory at c0001180 (1180 phys) = f62f1f7e
[   60.387574] Corrupted low memory at c0001184 (1184 phys) = f402f0fd
[   60.387576] Corrupted low memory at c0001188 (1188 phys) = f09cd6c9
[   60.387577] Corrupted low memory at c000118c (118c phys) = 67ce3dc8
[   60.387579] Corrupted low memory at c0001190 (1190 phys) = 7ddd11e9
[   60.387581] Corrupted low memory at c0001194 (1194 phys) = 26779f46
[   60.387582] Corrupted low memory at c0001198 (1198 phys) = bbf63cc7
[   60.387584] Corrupted low memory at c000119c (119c phys) = 3c5dd390
[   60.387586] Corrupted low memory at c00011a0 (11a0 phys) = b5ee2ad6
[   60.387587] Corrupted low memory at c00011a4 (11a4 phys) = 87f3daf3
[   60.387589] Corrupted low memory at c00011a8 (11a8 phys) = 262ee44d
[   60.387591] Corrupted low memory at c00011ac (11ac phys) = 9e9f1b77
[   60.387592] Corrupted low memory at c00011b0 (11b0 phys) = 76b12ec7
[   60.387594] Corrupted low memory at c00011b4 (11b4 phys) = 61414d67
[   60.387596] Corrupted low memory at c00011b8 (11b8 phys) = 8ebdc8cb
[   60.387597] Corrupted low memory at c00011bc (11bc phys) = 734d5c3f
[   60.387599] Corrupted low memory at c00011c0 (11c0 phys) = dc0d018e
[   60.387601] Corrupted low memory at c00011c4 (11c4 phys) = da53925b
[   60.387602] Corrupted low memory at c00011c8 (11c8 phys) = 0df906e0
[   60.387604] Corrupted low memory at c00011cc (11cc phys) = dc86b81e
[   60.387606] Corrupted low memory at c00011d0 (11d0 phys) = acf34a75
[   60.387607] Corrupted low memory at c00011d4 (11d4 phys) = 6654c9cf
[   60.387609] Corrupted low memory at c00011d8 (11d8 phys) = f38ecbc6
[   60.387611] Corrupted low memory at c00011dc (11dc phys) = 579f2f89
[   60.387613] Corrupted low memory at c00011e0 (11e0 phys) = 31e6068a
[   60.387614] Corrupted low memory at c00011e4 (11e4 phys) = 0936e4fd
[   60.387616] Corrupted low memory at c00011e8 (11e8 phys) = 991a2737
[   60.387618] Corrupted low memory at c00011ec (11ec phys) = ed540363
[   60.387619] Corrupted low memory at c00011f0 (11f0 phys) = bd19d711
[   60.387621] Corrupted low memory at c00011f4 (11f4 phys) = cc0139ea
[   60.387623] Corrupted low memory at c00011f8 (11f8 phys) = 0bdd4097
[   60.387624] Corrupted low memory at c00011fc (11fc phys) = f51ce354
[   60.387626] Corrupted low memory at c0001200 (1200 phys) = 02e3e2e2
[   60.387628] Corrupted low memory at c0001204 (1204 phys) = e9acb05d
[   60.387629] Corrupted low memory at c0001208 (1208 phys) = 265eaf0f
[   60.387631] Corrupted low memory at c000120c (120c phys) = 080f93c0
[   60.387633] Corrupted low memory at c0001210 (1210 phys) = 4f2cf26c
[   60.387634] Corrupted low memory at c0001214 (1214 phys) = 2e8a4bf6
[   60.387636] Corrupted low memory at c0001218 (1218 phys) = 7d45ca50
[   60.387638] Corrupted low memory at c000121c (121c phys) = 2315211e
[   60.387640] Corrupted low memory at c0001220 (1220 phys) = 5680c9ac
[   60.387641] Corrupted low memory at c0001224 (1224 phys) = e1c359c0
[   60.387643] Corrupted low memory at c0001228 (1228 phys) = 781bf2ad
[   60.387645] Corrupted low memory at c000122c (122c phys) = 3600c5d8
[   60.387646] Corrupted low memory at c0001230 (1230 phys) = d4c7dd41
[   60.387648] Corrupted low memory at c0001234 (1234 phys) = d099ecff
[   60.387650] Corrupted low memory at c0001238 (1238 phys) = 1fc29dd3
[   60.387651] Corrupted low memory at c000123c (123c phys) = e161fe13
[   60.387653] Corrupted low memory at c0001240 (1240 phys) = 1ff857bf
[   60.387655] Corrupted low memory at c0001244 (1244 phys) = e7e23e14
[   60.387657] Corrupted low memory at c0001248 (1248 phys) = 17ddd1c8
[   60.387658] Corrupted low memory at c000124c (124c phys) = 8b7fac8d
[   60.387660] Corrupted low memory at c0001250 (1250 phys) = f7e808aa
[   60.387662] Corrupted low memory at c0001254 (1254 phys) = 819f87f8
[   60.387663] Corrupted low memory at c0001258 (1258 phys) = df872af3
[   60.387665] Corrupted low memory at c000125c (125c phys) = 2373f384
[   60.387667] Corrupted low memory at c0001260 (1260 phys) = c849bfc6
[   60.387668] Corrupted low memory at c0001264 (1264 phys) = 81ef4a5d
[   60.387670] Corrupted low memory at c0001268 (1268 phys) = 1bd65ecc
[   60.387672] Corrupted low memory at c000126c (126c phys) = a5dc00ce
[   60.387673] Corrupted low memory at c0001270 (1270 phys) = e5e273bb
[   60.387675] Corrupted low memory at c0001274 (1274 phys) = 1bda2321
[   60.387677] Corrupted low memory at c0001278 (1278 phys) = 6ec31a3c
[   60.387678] Corrupted low memory at c000127c (127c phys) = dadd8097
[   60.387680] Corrupted low memory at c0001280 (1280 phys) = 6dc00710
[   60.387682] Corrupted low memory at c0001284 (1284 phys) = c049ee14
[   60.387683] Corrupted low memory at c0001288 (1288 phys) = 1a787349
[   60.387685] Corrupted low memory at c000128c (128c phys) = fc39bbce
[   60.387687] Corrupted low memory at c0001290 (1290 phys) = 710f9c10
[   60.387689] Corrupted low memory at c0001294 (1294 phys) = a73849f8
[   60.387690] Corrupted low memory at c0001298 (1298 phys) = 3809fc38
[   60.387692] Corrupted low memory at c000129c (129c phys) = cfe1cbbf
[   60.387694] Corrupted low memory at c00012a0 (12a0 phys) = 7d9979c1
[   60.387695] Corrupted low memory at c00012a4 (12a4 phys) = 292ffe09
[   60.387697] Corrupted low memory at c00012a8 (12a8 phys) = 394e4bcc
[   60.387699] Corrupted low memory at c00012ac (12ac phys) = 59c01787
[   60.387700] Corrupted low memory at c00012b0 (12b0 phys) = e1ebe1c2
[   60.387702] Corrupted low memory at c00012b4 (12b4 phys) = 16f0e5ac
[   60.387704] Corrupted low memory at c00012b8 (12b8 phys) = bc39db38
[   60.387705] Corrupted low memory at c00012bc (12bc phys) = 870f4e13
[   60.387707] Corrupted low memory at c00012c0 (12c0 phys) = 3764e4f7
[   60.387709] Corrupted low memory at c00012c4 (12c4 phys) = 0e706378
[   60.387710] Corrupted low memory at c00012c8 (12c8 phys) = e70d3870
[   60.387712] Corrupted low memory at c00012cc (12cc phys) = 207f0e7e
[   60.387714] Corrupted low memory at c00012d0 (12d0 phys) = d812efa7
[   60.387715] Corrupted low memory at c00012d4 (12d4 phys) = 0cc31cf2
[   60.387717] Corrupted low memory at c00012d8 (12d8 phys) = 03e29f00
[   60.387719] Corrupted low memory at c00012dc (12dc phys) = 14fca644
[   60.387720] Corrupted low memory at c00012e0 (12e0 phys) = f872cf9c
[   60.387722] Corrupted low memory at c00012e4 (12e4 phys) = c47e706b
[   60.387724] Corrupted low memory at c00012e8 (12e8 phys) = a7315fe1
[   60.387725] Corrupted low memory at c00012ec (12ec phys) = a2751968
[   60.387727] Corrupted low memory at c00012f0 (12f0 phys) = 35a2b5e8
[   60.387729] Corrupted low memory at c00012f4 (12f4 phys) = 7c20dfc2
[   60.387731] Corrupted low memory at c00012f8 (12f8 phys) = 07457bd1
[   60.387732] Corrupted low memory at c00012fc (12fc phys) = ac6497a8
[   60.387734] Corrupted low memory at c0001300 (1300 phys) = 69ff8587
[   60.387736] Corrupted low memory at c0001304 (1304 phys) = bff065d6
[   60.387737] Corrupted low memory at c0001308 (1308 phys) = a1e1bba2
[   60.387739] Corrupted low memory at c000130c (130c phys) = c2de8bc5
[   60.387741] Corrupted low memory at c0001310 (1310 phys) = c55c45b7
[   60.387742] Corrupted low memory at c0001314 (1314 phys) = 94fca63d
[   60.387744] Corrupted low memory at c0001318 (1318 phys) = 7d838e10
[   60.387746] Corrupted low memory at c000131c (131c phys) = b72fc2c5
[   60.387747] Corrupted low memory at c0001320 (1320 phys) = 44a63c46
[   60.387749] Corrupted low memory at c0001324 (1324 phys) = 914f01a3
[   60.387751] Corrupted low memory at c0001328 (1328 phys) = add331ae
[   60.387752] Corrupted low memory at c000132c (132c phys) = c41afe10
[   60.387754] Corrupted low memory at c0001330 (1330 phys) = fc9d59cb
[   60.387756] Corrupted low memory at c0001334 (1334 phys) = cc5947df
[   60.387757] Corrupted low memory at c0001338 (1338 phys) = bee8bcdd
[   60.387759] Corrupted low memory at c000133c (133c phys) = 2fdff0a4
[   60.387761] Corrupted low memory at c0001340 (1340 phys) = ff084ffc
[   60.387763] Corrupted low memory at c0001344 (1344 phys) = 0dff858f
[   60.387764] Corrupted low memory at c0001348 (1348 phys) = f844ffe1
[   60.387766] Corrupted low memory at c000134c (134c phys) = e2fdc2ff
[   60.387768] Corrupted low memory at c0001350 (1350 phys) = 2d3fc245
[   60.387769] Corrupted low memory at c0001354 (1354 phys) = ff096ffc
[   60.387771] Corrupted low memory at c0001358 (1358 phys) = 125fb99f
[   60.387773] Corrupted low memory at c000135c (135c phys) = f0a4fe14
[   60.387774] Corrupted low memory at c0001360 (1360 phys) = fc955345
[   60.387776] Corrupted low memory at c0001364 (1364 phys) = 2a7ff0ad
[   60.387778] Corrupted low memory at c0001368 (1368 phys) = fee4fffc
[   60.387779] Corrupted low memory at c000136c (136c phys) = dfe1229c
[   60.387781] Corrupted low memory at c0001370 (1370 phys) = 29d2fdcc
[   60.387783] Corrupted low memory at c0001374 (1374 phys) = f84c7fe1
[   60.387784] Corrupted low memory at c0001378 (1378 phys) = 9fff84e7
[   60.387786] Corrupted low memory at c000137c (137c phys) = a565a6bc
[   60.387788] Corrupted low memory at c0001380 (1380 phys) = 45acb9fc
[   60.387789] Corrupted low memory at c0001384 (1384 phys) = f8449bf8
[   60.387791] Corrupted low memory at c0001388 (1388 phys) = 0b77e156
[   60.387793] Corrupted low memory at c000138c (138c phys) = 1092447f
[   60.387794] Corrupted low memory at c0001390 (1390 phys) = fc2131fc
[   60.387796] Corrupted low memory at c0001394 (1394 phys) = 2bdfe129
[   60.387798] Corrupted low memory at c0001398 (1398 phys) = ff09f7fc
[   60.387800] Corrupted low memory at c000139c (139c phys) = 47dd9109
[   60.387801] Corrupted low memory at c00013a0 (13a0 phys) = fb2db68d
[   60.387803] Corrupted low memory at c00013a4 (13a4 phys) = 2bcbf726
[   60.387805] Corrupted low memory at c00013a8 (13a8 phys) = 746cdf85
[   60.387806] Corrupted low memory at c00013ac (13ac phys) = dc8ae182
[   60.387808] Corrupted low memory at c00013b0 (13b0 phys) = 7f09d7cb
[   60.387810] Corrupted low memory at c00013b4 (13b4 phys) = 7aac19b3
[   60.387811] Corrupted low memory at c00013b8 (13b8 phys) = 6df97594
[   60.387813] Corrupted low memory at c00013bc (13bc phys) = 78b0d222
[   60.387815] Corrupted low memory at c00013c0 (13c0 phys) = 31fe169e
[   60.387816] Corrupted low memory at c00013c4 (13c4 phys) = fc253fe1
[   60.387818] Corrupted low memory at c00013c8 (13c8 phys) = e8be17df
[   60.387820] Corrupted low memory at c00013cc (13cc phys) = 44e7f281
[   60.387821] Corrupted low memory at c00013d0 (13d0 phys) = f3c4d7ba
[   60.387823] Corrupted low memory at c00013d4 (13d4 phys) = a2bf73df
[   60.387825] Corrupted low memory at c00013d8 (13d8 phys) = 8abe0f84
[   60.387826] Corrupted low memory at c00013dc (13dc phys) = 2714f9ab
[   60.387828] Corrupted low memory at c00013e0 (13e0 phys) = 52d674b4
[   60.387830] Corrupted low memory at c00013e4 (13e4 phys) = 9d61113c
[   60.387831] Corrupted low memory at c00013e8 (13e8 phys) = ad15f89c
[   60.387833] Corrupted low memory at c00013ec (13ec phys) = d1ac51f9
[   60.387835] Corrupted low memory at c00013f0 (13f0 phys) = f5e765f5
[   60.387836] Corrupted low memory at c00013f4 (13f4 phys) = 57eafd5f
[   60.387838] Corrupted low memory at c00013f8 (13f8 phys) = 162f3a3f
[   60.387840] Corrupted low memory at c00013fc (13fc phys) = 7a9bdc95
[   60.387841] Corrupted low memory at c0001400 (1400 phys) = e02cc3ed
[   60.387843] Corrupted low memory at c0001404 (1404 phys) = 02d86717
[   60.387845] Corrupted low memory at c0001408 (1408 phys) = c553e72b
[   60.387847] Corrupted low memory at c000140c (140c phys) = 8d4670ad
[   60.387848] Corrupted low memory at c0001410 (1410 phys) = 88f8275c
[   60.387850] Corrupted low memory at c0001414 (1414 phys) = ebcfddd7
[   60.387852] Corrupted low memory at c0001418 (1418 phys) = 0af34e73
[   60.387853] Corrupted low memory at c000141c (141c phys) = 8d97205f
[   60.387855] Corrupted low memory at c0001420 (1420 phys) = 8a778ab7
[   60.387857] Corrupted low memory at c0001424 (1424 phys) = 565a8ad2
[   60.387858] Corrupted low memory at c0001428 (1428 phys) = cff091fc
[   60.387860] Corrupted low memory at c000142c (142c phys) = 8c778c71
[   60.387862] Corrupted low memory at c0001430 (1430 phys) = 60aa3116
[   60.387863] Corrupted low memory at c0001434 (1434 phys) = ffe165c5
[   60.387865] Corrupted low memory at c0001438 (1438 phys) = d110cf9c
[   60.387867] Corrupted low memory at c000143c (143c phys) = 0bb7736e
[   60.387868] Corrupted low memory at c0001440 (1440 phys) = 2bf728ff
[   60.387870] Corrupted low memory at c0001444 (1444 phys) = 26ff0be7
[   60.387872] Corrupted low memory at c0001448 (1448 phys) = 3f8447e1
[   60.387873] Corrupted low memory at c000144c (144c phys) = a55d746c
[   60.387875] Corrupted low memory at c0001450 (1450 phys) = 7f95dc18
[   60.387877] Corrupted low memory at c0001454 (1454 phys) = 11bffc20
[   60.387878] Corrupted low memory at c0001458 (1458 phys) = bf713ffe
[   60.387880] Corrupted low memory at c000145c (145c phys) = dfe12aea
[   60.387882] Corrupted low memory at c0001460 (1460 phys) = 68317c20
[   60.387883] Corrupted low memory at c0001464 (1464 phys) = 2f85baab
[   60.387885] Corrupted low memory at c0001468 (1468 phys) = e8956f06
[   60.387887] Corrupted low memory at c000146c (146c phys) = 78312a92
[   60.387889] Corrupted low memory at c0001470 (1470 phys) = af20c4ab
[   60.387890] Corrupted low memory at c0001474 (1474 phys) = 31f84dbe
[   60.387892] Corrupted low memory at c0001478 (1478 phys) = e9623fe1
[   60.387894] Corrupted low memory at c000147c (147c phys) = c2e3eae9
[   60.387895] Corrupted low memory at c0001480 (1480 phys) = 6ff85a7f
[   60.387897] Corrupted low memory at c0001484 (1484 phys) = c2bbff08
[   60.387899] Corrupted low memory at c0001488 (1488 phys) = faaa7317
[   60.387900] Corrupted low memory at c000148c (148c phys) = d79fdc57
[   60.387902] Corrupted low memory at c0001490 (1490 phys) = e141fc24
[   60.387904] Corrupted low memory at c0001494 (1494 phys) = 9c7fc2eb
[   60.387905] Corrupted low memory at c0001498 (1498 phys) = c7de6757
[   60.387907] Corrupted low memory at c000149c (149c phys) = e175ff84
[   60.387909] Corrupted low memory at c00014a0 (14a0 phys) = fffc2abf
[   60.387910] Corrupted low memory at c00014a4 (14a4 phys) = 6e6af1e0
[   60.387912] Corrupted low memory at c00014a8 (14a8 phys) = 9a9d9731
[   60.387914] Corrupted low memory at c00014ac (14ac phys) = 145f080f
[   60.387915] Corrupted low memory at c00014b0 (14b0 phys) = f844b6b9
[   60.387917] Corrupted low memory at c00014b4 (14b4 phys) = bb7f84ab
[   60.387919] Corrupted low memory at c00014b8 (14b8 phys) = e0f947f0
[   60.387920] Corrupted low memory at c00014bc (14bc phys) = f4604da8
[   60.387922] Corrupted low memory at c00014c0 (14c0 phys) = d6c99d10
[   60.387924] Corrupted low memory at c00014c4 (14c4 phys) = 39d3b283
[   60.387925] Corrupted low memory at c00014c8 (14c8 phys) = ceb08103
[   60.387927] Corrupted low memory at c00014cc (14cc phys) = 579d15ab
[   60.387929] Corrupted low memory at c00014d0 (14d0 phys) = f6ad3a2b
[   60.387930] Corrupted low memory at c00014d4 (14d4 phys) = afbff0b2
[   60.387932] Corrupted low memory at c00014d8 (14d8 phys) = bdad1fbb
[   60.387934] Corrupted low memory at c00014dc (14dc phys) = 53f4b390
[   60.387935] Corrupted low memory at c00014e0 (14e0 phys) = 62f8579d
[   60.387937] Corrupted low memory at c00014e4 (14e4 phys) = 102dd686
[   60.387939] Corrupted low memory at c00014e8 (14e8 phys) = 5fc2b5fe
[   60.387941] Corrupted low memory at c00014ec (14ec phys) = dd5efc2b
[   60.387942] Corrupted low memory at c00014f0 (14f0 phys) = d7868e7d
[   60.387944] Corrupted low memory at c00014f4 (14f4 phys) = 9c021039
[   60.387946] Corrupted low memory at c00014f8 (14f8 phys) = 0a190344
[   60.387947] Corrupted low memory at c00014fc (14fc phys) = 761a4ee7
[   60.387949] Corrupted low memory at c0001500 (1500 phys) = 7b8a1a54
[   60.387951] Corrupted low memory at c0001504 (1504 phys) = dd652f17
[   60.387952] Corrupted low memory at c0001508 (1508 phys) = 5e7f0863
[   60.387954] Corrupted low memory at c000150c (150c phys) = a30670df
[   60.387956] Corrupted low memory at c0001510 (1510 phys) = 8c33cdf3
[   60.387957] Corrupted low memory at c0001514 (1514 phys) = 1f816272
[   60.387959] Corrupted low memory at c0001518 (1518 phys) = bd28c739
[   60.387961] Corrupted low memory at c000151c (151c phys) = 1d7eb01c
[   60.387962] Corrupted low memory at c0001520 (1520 phys) = 0e7f3f1e
[   60.387964] Corrupted low memory at c0001524 (1524 phys) = 22b477ee
[   60.387966] Corrupted low memory at c0001528 (1528 phys) = 75e32148
[   60.387967] Corrupted low memory at c000152c (152c phys) = 704d8d70
[   60.387969] Corrupted low memory at c0001530 (1530 phys) = d993c53c
[   60.387971] Corrupted low memory at c0001534 (1534 phys) = 3a2c7a58
[   60.387972] Corrupted low memory at c0001538 (1538 phys) = 8c50cb12
[   60.387974] Corrupted low memory at c000153c (153c phys) = cc484a91
[   60.387976] Corrupted low memory at c0001540 (1540 phys) = 464cd112
[   60.387978] Corrupted low memory at c0001544 (1544 phys) = 9011a404
[   60.387979] Corrupted low memory at c0001548 (1548 phys) = 12a6658e
[   60.387981] Corrupted low memory at c000154c (154c phys) = 55301e59
[   60.387983] Corrupted low memory at c0001550 (1550 phys) = 3654c906
[   60.387984] Corrupted low memory at c0001554 (1554 phys) = 2360c89e
[   60.387986] Corrupted low memory at c0001558 (1558 phys) = 4af2ce3e
[   60.387988] Corrupted low memory at c000155c (155c phys) = 38f37767
[   60.387989] Corrupted low memory at c0001560 (1560 phys) = 44fb56ea
[   60.387991] Corrupted low memory at c0001564 (1564 phys) = 112a746a
[   60.387993] Corrupted low memory at c0001568 (1568 phys) = dbdeaccb
[   60.387994] Corrupted low memory at c000156c (156c phys) = 35d3dedb
[   60.387996] Corrupted low memory at c0001570 (1570 phys) = eeabe880
[   60.387998] Corrupted low memory at c0001574 (1574 phys) = 2cec4cca
[   60.387999] Corrupted low memory at c0001578 (1578 phys) = c46c511c
[   60.388001] Corrupted low memory at c000157c (157c phys) = e160b453
[   60.388003] Corrupted low memory at c0001580 (1580 phys) = d4e9bd68
[   60.388004] Corrupted low memory at c0001584 (1584 phys) = fd833c44
[   60.388006] Corrupted low memory at c0001588 (1588 phys) = df1409fa
[   60.388008] Corrupted low memory at c000158c (158c phys) = 0205becc
[   60.388010] Corrupted low memory at c0001590 (1590 phys) = 3c58d8be
[   60.388011] Corrupted low memory at c0001594 (1594 phys) = de8ceb19
[   60.388013] Corrupted low memory at c0001598 (1598 phys) = d145dd33
[   60.388015] Corrupted low memory at c000159c (159c phys) = acb2c677
[   60.388016] Corrupted low memory at c00015a0 (15a0 phys) = dc108d76
[   60.388018] Corrupted low memory at c00015a4 (15a4 phys) = 2393a666
[   60.388020] Corrupted low memory at c00015a8 (15a8 phys) = eca9f1e1
[   60.388021] Corrupted low memory at c00015ac (15ac phys) = b250ea68
[   60.388023] Corrupted low memory at c00015b0 (15b0 phys) = 96f670af
[   60.388025] Corrupted low memory at c00015b4 (15b4 phys) = 0348f878
[   60.388026] Corrupted low memory at c00015b8 (15b8 phys) = b1b1b664
[   60.388028] Corrupted low memory at c00015bc (15bc phys) = 1b1d3a58
[   60.388030] Corrupted low memory at c00015c0 (15c0 phys) = 0733448d
[   60.388031] Corrupted low memory at c00015c4 (15c4 phys) = 55a70a7a
[   60.388033] Corrupted low memory at c00015c8 (15c8 phys) = 9a2921a3
[   60.388035] Corrupted low memory at c00015cc (15cc phys) = 5edc27ab
[   60.388036] Corrupted low memory at c00015d0 (15d0 phys) = 332a4138
[   60.388038] Corrupted low memory at c00015d4 (15d4 phys) = 55504f16
[   60.388040] Corrupted low memory at c00015d8 (15d8 phys) = 64806aa0
[   60.388041] Corrupted low memory at c00015dc (15dc phys) = 2c8e1737
[   60.388043] Corrupted low memory at c00015e0 (15e0 phys) = a88a5d73
[   60.388045] Corrupted low memory at c00015e4 (15e4 phys) = 226c44b1
[   60.388046] Corrupted low memory at c00015e8 (15e8 phys) = 874f4d96
[   60.388048] Corrupted low memory at c00015ec (15ec phys) = 43c9a9b3
[   60.388050] Corrupted low memory at c00015f0 (15f0 phys) = f8c8ecd1
[   60.388052] Corrupted low memory at c00015f4 (15f4 phys) = cc24336d
[   60.388053] Corrupted low memory at c00015f8 (15f8 phys) = af1b2c8b
[   60.388055] Corrupted low memory at c00015fc (15fc phys) = 8e993623
[   60.388057] Corrupted low memory at c0001600 (1600 phys) = b1b3448d
[   60.388058] Corrupted low memory at c0001604 (1604 phys) = d533ddd1
[   60.388060] Corrupted low memory at c0001608 (1608 phys) = b242464a
[   60.388062] Corrupted low memory at c000160c (160c phys) = b97c0866
[   60.388063] Corrupted low memory at c0001610 (1610 phys) = 9a3644e2
[   60.388065] Corrupted low memory at c0001614 (1614 phys) = 44748f8d
[   60.388067] Corrupted low memory at c0001618 (1618 phys) = 333dd1b3
[   60.388068] Corrupted low memory at c000161c (161c phys) = aa35aa75
[   60.388070] Corrupted low memory at c0001620 (1620 phys) = 962672fc
[   60.388072] Corrupted low memory at c0001624 (1624 phys) = 6227c48d
[   60.388073] Corrupted low memory at c0001628 (1628 phys) = 606ca9a9
[   60.388075] Corrupted low memory at c000162c (162c phys) = 18b38286
[   60.388077] Corrupted low memory at c0001630 (1630 phys) = 2307d969
[   60.388079] Corrupted low memory at c0001634 (1634 phys) = 48959609
[   60.388080] Corrupted low memory at c0001638 (1638 phys) = d5d2dab3
[   60.388082] Corrupted low memory at c000163c (163c phys) = 7d98e062
[   60.388084] Corrupted low memory at c0001640 (1640 phys) = 3b123fbd
[   60.388085] Corrupted low memory at c0001644 (1644 phys) = c2a48d9c
[   60.388087] Corrupted low memory at c0001648 (1648 phys) = fd03c353
[   60.388089] Corrupted low memory at c000164c (164c phys) = 7dbfe421
[   60.388090] Corrupted low memory at c0001650 (1650 phys) = 3c4f7522
[   60.388092] Corrupted low memory at c0001654 (1654 phys) = 4aae8891
[   60.388094] Corrupted low memory at c0001658 (1658 phys) = 66aff7a5
[   60.388095] Corrupted low memory at c000165c (165c phys) = 3ab13e03
[   60.388097] Corrupted low memory at c0001660 (1660 phys) = 536462cb
[   60.388099] Corrupted low memory at c0001664 (1664 phys) = a9e1a999
[   60.388100] Corrupted low memory at c0001668 (1668 phys) = b5c6fc74
[   60.388102] Corrupted low memory at c000166c (166c phys) = 66d8904d
[   60.388104] Corrupted low memory at c0001670 (1670 phys) = 16b26a42
[   60.388105] Corrupted low memory at c0001674 (1674 phys) = 4cef6c63
[   60.388107] Corrupted low memory at c0001678 (1678 phys) = 09a386a4
[   60.388109] Corrupted low memory at c000167c (167c phys) = 6326ec36
[   60.388110] Corrupted low memory at c0001680 (1680 phys) = 963c6ce9
[   60.388112] Corrupted low memory at c0001684 (1684 phys) = 9c7cb669
[   60.388114] Corrupted low memory at c0001688 (1688 phys) = 5f1ab57f
[   60.388115] Corrupted low memory at c000168c (168c phys) = 5247260d
[   60.388117] Corrupted low memory at c0001690 (1690 phys) = 3d74b1a3
[   60.388119] Corrupted low memory at c0001694 (1694 phys) = 1d3de9c4
[   60.388120] Corrupted low memory at c0001698 (1698 phys) = 8efea1c1
[   60.388122] Corrupted low memory at c000169c (169c phys) = 7dceee9e
[   60.388124] Corrupted low memory at c00016a0 (16a0 phys) = 7e9ee3c6
[   60.388126] Corrupted low memory at c00016a4 (16a4 phys) = 7f7b63cb
[   60.388127] Corrupted low memory at c00016a8 (16a8 phys) = 1a091030
[   60.388129] Corrupted low memory at c00016ac (16ac phys) = dede818f
[   60.388131] Corrupted low memory at c00016b0 (16b0 phys) = a3477760
[   60.388132] Corrupted low memory at c00016b4 (16b4 phys) = 0ff48c7c
[   60.388134] Corrupted low memory at c00016b8 (16b8 phys) = f7b3b774
[   60.388136] Corrupted low memory at c00016bc (16bc phys) = 18774628
[   60.388137] Corrupted low memory at c00016c0 (16c0 phys) = c1c2876f
[   60.388139] Corrupted low memory at c00016c4 (16c4 phys) = aefb7f40
[   60.388141] Corrupted low memory at c00016c8 (16c8 phys) = ff508ea1
[   60.388142] Corrupted low memory at c00016cc (16cc phys) = 1c286750
[   60.388144] Corrupted low memory at c00016d0 (16d0 phys) = 6b30d441
[   60.388146] Corrupted low memory at c00016d4 (16d4 phys) = d4c4dedb
[   60.388147] Corrupted low memory at c00016d8 (16d8 phys) = d33278fe
[   60.388149] Corrupted low memory at c00016dc (16dc phys) = 8cd3b9d1
[   60.388151] Corrupted low memory at c00016e0 (16e0 phys) = f7dec1da
[   60.388152] Corrupted low memory at c00016e4 (16e4 phys) = 2206da87
[   60.388154] Corrupted low memory at c00016e8 (16e8 phys) = 7b6e9ea1
[   60.388156] Corrupted low memory at c00016ec (16ec phys) = 74b625ba
[   60.388158] Corrupted low memory at c00016f0 (16f0 phys) = a3b64cf1
[   60.388159] Corrupted low memory at c00016f4 (16f4 phys) = 40ccf953
[   60.388161] Corrupted low memory at c00016f8 (16f8 phys) = a4afd838
[   60.388163] Corrupted low memory at c00016fc (16fc phys) = 320667c1
[   60.388164] Corrupted low memory at c0001700 (1700 phys) = 99229969
[   60.388166] Corrupted low memory at c0001704 (1704 phys) = a254e259
[   60.388168] Corrupted low memory at c0001708 (1708 phys) = 64d3253a
[   60.388169] Corrupted low memory at c000170c (170c phys) = b72d8366
[   60.388171] Corrupted low memory at c0001710 (1710 phys) = 1d9efede
[   60.388173] Corrupted low memory at c0001714 (1714 phys) = d0e0cea1
[   60.388174] Corrupted low memory at c0001718 (1718 phys) = 0b883586
[   60.388176] Corrupted low memory at c000171c (171c phys) = 735ac1f4
[   60.388178] Corrupted low memory at c0001720 (1720 phys) = 293903ac
[   60.388179] Corrupted low memory at c0001724 (1724 phys) = 07f6f4d9
[   60.388181] Corrupted low memory at c0001728 (1728 phys) = 6ef68477
[   60.388183] Corrupted low memory at c000172c (172c phys) = 0c870eb0
[   60.388185] Corrupted low memory at c0001730 (1730 phys) = 49969f8d
[   60.388186] Corrupted low memory at c0001734 (1734 phys) = 8bd5d85e
[   60.388188] Corrupted low memory at c0001738 (1738 phys) = 9d1e1df0
[   60.388190] Corrupted low memory at c000173c (173c phys) = a97fb3d7
[   60.388191] Corrupted low memory at c0001740 (1740 phys) = c33f2219
[   60.388193] Corrupted low memory at c0001744 (1744 phys) = a1999742
[   60.388195] Corrupted low memory at c0001748 (1748 phys) = e0a1eea1
[   60.388196] Corrupted low memory at c000174c (174c phys) = 3bb048de
[   60.388198] Corrupted low memory at c0001750 (1750 phys) = cea2dacc
[   60.388200] Corrupted low memory at c0001754 (1754 phys) = a025be0c
[   60.388201] Corrupted low memory at c0001758 (1758 phys) = ef760a46
[   60.388203] Corrupted low memory at c000175c (175c phys) = 56b3750c
[   60.388205] Corrupted low memory at c0001760 (1760 phys) = b50de9ef
[   60.388206] Corrupted low memory at c0001764 (1764 phys) = 9c3ca243
[   60.388208] Corrupted low memory at c0001768 (1768 phys) = c4ca1d4e
[   60.388210] Corrupted low memory at c000176c (176c phys) = f3e22dd2
[   60.388211] Corrupted low memory at c0001770 (1770 phys) = 7afb24b1
[   60.388213] Corrupted low memory at c0001774 (1774 phys) = cb8d85ac
[   60.388215] Corrupted low memory at c0001778 (1778 phys) = 363c58f8
[   60.388216] Corrupted low memory at c000177c (177c phys) = 1bead68a
[   60.388218] Corrupted low memory at c0001780 (1780 phys) = 86504508
[   60.388220] Corrupted low memory at c0001784 (1784 phys) = 26eb10ce
[   60.388222] Corrupted low memory at c0001788 (1788 phys) = c5a53334
[   60.388223] Corrupted low memory at c000178c (178c phys) = 2a7bb8d8
[   60.388225] Corrupted low memory at c0001790 (1790 phys) = 4e896d9e
[   60.388227] Corrupted low memory at c0001794 (1794 phys) = 6ce2264e
[   60.388228] Corrupted low memory at c0001798 (1798 phys) = c953c4d2
[   60.388230] Corrupted low memory at c000179c (179c phys) = e99f7915
[   60.388232] Corrupted low memory at c00017a0 (17a0 phys) = 0cf5760f
[   60.388233] Corrupted low memory at c00017a4 (17a4 phys) = 19503b06
[   60.388235] Corrupted low memory at c00017a8 (17a8 phys) = 363a7391
[   60.388237] Corrupted low memory at c00017ac (17ac phys) = 1b183a91
[   60.388238] Corrupted low memory at c00017b0 (17b0 phys) = a0512fcd
[   60.388240] Corrupted low memory at c00017b4 (17b4 phys) = d1d823a3
[   60.388242] Corrupted low memory at c00017b8 (17b8 phys) = 8e8c4660
[   60.388243] Corrupted low memory at c00017bc (17bc phys) = f346c68e
[   60.388245] Corrupted low memory at c00017c0 (17c0 phys) = 15df6c73
[   60.388247] Corrupted low memory at c00017c4 (17c4 phys) = c969dee8
[   60.388249] Corrupted low memory at c00017c8 (17c8 phys) = 9a3c6471
[   60.388250] Corrupted low memory at c00017cc (17cc phys) = 5d1b1fdc
[   60.388252] Corrupted low memory at c00017d0 (17d0 phys) = b3b5179e
[   60.388254] Corrupted low memory at c00017d4 (17d4 phys) = 6cb2c027
[   60.388255] Corrupted low memory at c00017d8 (17d8 phys) = 52266334
[   60.388257] Corrupted low memory at c00017dc (17dc phys) = 073ce751
[   60.388259] Corrupted low memory at c00017e0 (17e0 phys) = dd83fbfa
[   60.388260] Corrupted low memory at c00017e4 (17e4 phys) = 29cf3511
[   60.388262] Corrupted low memory at c00017e8 (17e8 phys) = d992dd86
[   60.388264] Corrupted low memory at c00017ec (17ec phys) = c11da8fc
[   60.388265] Corrupted low memory at c00017f0 (17f0 phys) = 4e81811d
[   60.388267] Corrupted low memory at c00017f4 (17f4 phys) = 209b2a35
[   60.388269] Corrupted low memory at c00017f8 (17f8 phys) = 2862b651
[   60.388270] Corrupted low memory at c00017fc (17fc phys) = 6103679f
[   60.388272] Corrupted low memory at c0001800 (1800 phys) = da2ed430
[   60.388274] Corrupted low memory at c0001804 (1804 phys) = 8dbf25b8
[   60.388275] Corrupted low memory at c0001808 (1808 phys) = df167c33
[   60.388277] Corrupted low memory at c000180c (180c phys) = 2ba29f9c
[   60.388279] Corrupted low memory at c0001810 (1810 phys) = 04740918
[   60.388281] Corrupted low memory at c0001814 (1814 phys) = 83360122
[   60.388282] Corrupted low memory at c0001818 (1818 phys) = 93d61142
[   60.388284] Corrupted low memory at c000181c (181c phys) = 1d287288
[   60.388286] Corrupted low memory at c0001820 (1820 phys) = 890eb2cf
[   60.388287] Corrupted low memory at c0001824 (1824 phys) = 0ca3a365
[   60.388289] Corrupted low memory at c0001828 (1828 phys) = 5364b4ab
[   60.388291] Corrupted low memory at c000182c (182c phys) = ccf1a32d
[   60.388292] Corrupted low memory at c0001830 (1830 phys) = 77e81681
[   60.388294] Corrupted low memory at c0001834 (1834 phys) = d8d96225
[   60.388296] Corrupted low memory at c0001838 (1838 phys) = da5cfc96
[   60.388297] Corrupted low memory at c000183c (183c phys) = 82bbed03
[   60.388299] Corrupted low memory at c0001840 (1840 phys) = 462d8b32
[   60.388301] Corrupted low memory at c0001844 (1844 phys) = 91d12316
[   60.388302] Corrupted low memory at c0001848 (1848 phys) = 7a5e58f1
[   60.388304] Corrupted low memory at c000184c (184c phys) = 7d918aa5
[   60.388306] Corrupted low memory at c0001850 (1850 phys) = 3d08acbd
[   60.388307] Corrupted low memory at c0001854 (1854 phys) = 8147f2b7
[   60.388309] Corrupted low memory at c0001858 (1858 phys) = 0b5a3829
[   60.388311] Corrupted low memory at c000185c (185c phys) = 6b515d52
[   60.388312] Corrupted low memory at c0001860 (1860 phys) = 5b16a46b
[   60.388314] Corrupted low memory at c0001864 (1864 phys) = b5361eea
[   60.388316] Corrupted low memory at c0001868 (1868 phys) = a0b6e4cd
[   60.388318] Corrupted low memory at c000186c (186c phys) = e1c8b6e5
[   60.388319] Corrupted low memory at c0001870 (1870 phys) = cefec15d
[   60.388321] Corrupted low memory at c0001874 (1874 phys) = 8dddf69e
[   60.388323] Corrupted low memory at c0001878 (1878 phys) = 680caa32
[   60.388324] Corrupted low memory at c000187c (187c phys) = 6b522332
[   60.388326] Corrupted low memory at c0001880 (1880 phys) = 04eaedec
[   60.388328] Corrupted low memory at c0001884 (1884 phys) = b37091da
[   60.388329] Corrupted low memory at c0001888 (1888 phys) = 8d8475ed
[   60.388331] Corrupted low memory at c000188c (188c phys) = 1ef91afa
[   60.388333] Corrupted low memory at c0001890 (1890 phys) = 6750f80f
[   60.388334] Corrupted low memory at c0001894 (1894 phys) = 687dd6e0
[   60.388336] Corrupted low memory at c0001898 (1898 phys) = 33a05091
[   60.388338] Corrupted low memory at c000189c (189c phys) = adfa2b74
[   60.388339] Corrupted low memory at c00018a0 (18a0 phys) = 2d6fb033
[   60.388341] Corrupted low memory at c00018a4 (18a4 phys) = 9c6a7516
[   60.388343] Corrupted low memory at c00018a8 (18a8 phys) = 0b20f750
[   60.388345] Corrupted low memory at c00018ac (18ac phys) = 99e3e0ef
[   60.388346] Corrupted low memory at c00018b0 (18b0 phys) = ef1a35de
[   60.388348] Corrupted low memory at c00018b4 (18b4 phys) = 98fd8306
[   60.388350] Corrupted low memory at c00018b8 (18b8 phys) = a1a11841
[   60.388351] Corrupted low memory at c00018bc (18bc phys) = d966bc76
[   60.388353] Corrupted low memory at c00018c0 (18c0 phys) = 8a77f40e
[   60.388355] Corrupted low memory at c00018c4 (18c4 phys) = 369ebb59
[   60.388356] Corrupted low memory at c00018c8 (18c8 phys) = d6d1b2d7
[   60.388358] Corrupted low memory at c00018cc (18cc phys) = 7f7444f8
[   60.388360] Corrupted low memory at c00018d0 (18d0 phys) = 96e1f56c
[   60.388361] Corrupted low memory at c00018d4 (18d4 phys) = 792a48f8
[   60.388363] Corrupted low memory at c00018d8 (18d8 phys) = 133470e3
[   60.388365] Corrupted low memory at c00018dc (18dc phys) = 4ac378db
[   60.388366] Corrupted low memory at c00018e0 (18e0 phys) = b83ba294
[   60.388368] Corrupted low memory at c00018e4 (18e4 phys) = ee9f6187
[   60.388370] Corrupted low memory at c00018e8 (18e8 phys) = 5c309860
[   60.388371] Corrupted low memory at c00018ec (18ec phys) = 6510ef60
[   60.388373] Corrupted low memory at c00018f0 (18f0 phys) = 40856612
[   60.388375] Corrupted low memory at c00018f4 (18f4 phys) = 33f18b53
[   60.388376] Corrupted low memory at c00018f8 (18f8 phys) = ec1037d4
[   60.388378] Corrupted low memory at c00018fc (18fc phys) = dcc187df
[   60.388380] Corrupted low memory at c0001900 (1900 phys) = 30460e1f
[   60.388381] Corrupted low memory at c0001904 (1904 phys) = 9ee14359
[   60.388383] Corrupted low memory at c0001908 (1908 phys) = 379bcd96
[   60.388385] Corrupted low memory at c000190c (190c phys) = db59696c
[   60.388387] Corrupted low memory at c0001910 (1910 phys) = 6daf5668
[   60.388388] Corrupted low memory at c0001914 (1914 phys) = 4d83bba0
[   60.388390] Corrupted low memory at c0001918 (1918 phys) = e309ac73
[   60.388392] Corrupted low memory at c000191c (191c phys) = 55a0188c
[   60.388393] Corrupted low memory at c0001920 (1920 phys) = eaf47083
[   60.388395] Corrupted low memory at c0001924 (1924 phys) = 9a246a44
[   60.388397] Corrupted low memory at c0001928 (1928 phys) = 21a40988
[   60.388398] Corrupted low memory at c000192c (192c phys) = 069355bc
[   60.388400] Corrupted low memory at c0001930 (1930 phys) = 6a3cc6e6
[   60.388402] Corrupted low memory at c0001934 (1934 phys) = 77740feb
[   60.388403] Corrupted low memory at c0001938 (1938 phys) = 6ac22c88
[   60.388405] Corrupted low memory at c000193c (193c phys) = 68e9c316
[   60.388407] Corrupted low memory at c0001940 (1940 phys) = dc9e3472
[   60.388408] Corrupted low memory at c0001944 (1944 phys) = 67a1332f
[   60.388410] Corrupted low memory at c0001948 (1948 phys) = d7da3125
[   60.388412] Corrupted low memory at c000194c (194c phys) = 8621b5ec
[   60.388413] Corrupted low memory at c0001950 (1950 phys) = 09aed6df
[   60.388415] Corrupted low memory at c0001954 (1954 phys) = 9b30af61
[   60.388417] Corrupted low memory at c0001958 (1958 phys) = a3c9649d
[   60.388419] Corrupted low memory at c000195c (195c phys) = c1b1565b
[   60.388420] Corrupted low memory at c0001960 (1960 phys) = 7565bab3
[   60.388422] Corrupted low memory at c0001964 (1964 phys) = b2363266
[   60.388424] Corrupted low memory at c0001968 (1968 phys) = 8a4a259a
[   60.388425] Corrupted low memory at c000196c (196c phys) = b55dac52
[   60.388427] Corrupted low memory at c0001970 (1970 phys) = da43b566
[   60.388429] Corrupted low memory at c0001974 (1974 phys) = 769657b5
[   60.388430] Corrupted low memory at c0001978 (1978 phys) = b615eaed
[   60.388432] Corrupted low memory at c000197c (197c phys) = 60ae81b4
[   60.388434] Corrupted low memory at c0001980 (1980 phys) = 1ed037b8
[   60.388435] Corrupted low memory at c0001984 (1984 phys) = 9ecd0f0c
[   60.388437] Corrupted low memory at c0001988 (1988 phys) = 332c68c1
[   60.388439] Corrupted low memory at c000198c (198c phys) = 024f8e92
[   60.388440] Corrupted low memory at c0001990 (1990 phys) = 6f5e510b
[   60.388442] Corrupted low memory at c0001994 (1994 phys) = d59ac93b
[   60.388444] Corrupted low memory at c0001998 (1998 phys) = 679fadcb
[   60.388445] Corrupted low memory at c000199c (199c phys) = 89d03103
[   60.388447] Corrupted low memory at c00019a0 (19a0 phys) = 58b15032
[   60.388449] Corrupted low memory at c00019a4 (19a4 phys) = 3a133e2a
[   60.388450] Corrupted low memory at c00019a8 (19a8 phys) = b44d54cc
[   60.388452] Corrupted low memory at c00019ac (19ac phys) = 36aadd84
[   60.388454] Corrupted low memory at c00019b0 (19b0 phys) = b3f31ed3
[   60.388455] Corrupted low memory at c00019b4 (19b4 phys) = 5a1d0e87
[   60.388457] Corrupted low memory at c00019b8 (19b8 phys) = ebc62ea5
[   60.388459] Corrupted low memory at c00019bc (19bc phys) = b8d9acd6
[   60.388461] Corrupted low memory at c00019c0 (19c0 phys) = d902605a
[   60.388462] Corrupted low memory at c00019c4 (19c4 phys) = 392ccb6f
[   60.388464] Corrupted low memory at c00019c8 (19c8 phys) = 306c5f18
[   60.388466] Corrupted low memory at c00019cc (19cc phys) = 74be0f28
[   60.388467] Corrupted low memory at c00019d0 (19d0 phys) = 046c64a1
[   60.388469] Corrupted low memory at c00019d4 (19d4 phys) = b03239f2
[   60.388471] Corrupted low memory at c00019d8 (19d8 phys) = 1c306773
[   60.388472] Corrupted low memory at c00019dc (19dc phys) = dc25ba19
[   60.388474] Corrupted low memory at c00019e0 (19e0 phys) = 3c4a4dd3
[   60.388476] Corrupted low memory at c00019e4 (19e4 phys) = 8365fa26
[   60.388477] Corrupted low memory at c00019e8 (19e8 phys) = 7fb53331
[   60.388479] Corrupted low memory at c00019ec (19ec phys) = 87832c7f
[   60.388481] Corrupted low memory at c00019f0 (19f0 phys) = 32bb8acc
[   60.388482] Corrupted low memory at c00019f4 (19f4 phys) = 3532e4a9
[   60.388484] Corrupted low memory at c00019f8 (19f8 phys) = aecaff25
[   60.388486] Corrupted low memory at c00019fc (19fc phys) = c6275748
[   60.388487] Corrupted low memory at c0001a00 (1a00 phys) = d582e829
[   60.388489] Corrupted low memory at c0001a04 (1a04 phys) = c489d9e3
[   60.388491] Corrupted low memory at c0001a08 (1a08 phys) = 598e2552
[   60.388492] Corrupted low memory at c0001a0c (1a0c phys) = bcb6ddff
[   60.388494] Corrupted low memory at c0001a10 (1a10 phys) = c56ddb1d
[   60.388496] Corrupted low memory at c0001a14 (1a14 phys) = f20489e1
[   60.388498] Corrupted low memory at c0001a18 (1a18 phys) = 46f5760a
[   60.388499] Corrupted low memory at c0001a1c (1a1c phys) = f4750df6
[   60.388501] Corrupted low memory at c0001a20 (1a20 phys) = b95c83b4
[   60.388503] Corrupted low memory at c0001a24 (1a24 phys) = ea1666e9
[   60.388504] Corrupted low memory at c0001a28 (1a28 phys) = 674b183f
[   60.388506] Corrupted low memory at c0001a2c (1a2c phys) = 8bad8c58
[   60.388508] Corrupted low memory at c0001a30 (1a30 phys) = be2ed359
[   60.388509] Corrupted low memory at c0001a34 (1a34 phys) = 4de95b79
[   60.388511] Corrupted low memory at c0001a38 (1a38 phys) = f67ec198
[   60.388513] Corrupted low memory at c0001a3c (1a3c phys) = bad99926
[   60.388514] Corrupted low memory at c0001a40 (1a40 phys) = cc5c1518
[   60.388516] Corrupted low memory at c0001a44 (1a44 phys) = ec535a1f
[   60.388518] Corrupted low memory at c0001a48 (1a48 phys) = abcf18b9
[   60.388519] Corrupted low memory at c0001a4c (1a4c phys) = 51a32db7
[   60.388521] Corrupted low memory at c0001a50 (1a50 phys) = 5947b3db
[   60.388523] Corrupted low memory at c0001a54 (1a54 phys) = 9ae93197
[   60.388524] Corrupted low memory at c0001a58 (1a58 phys) = 4aba144c
[   60.388526] Corrupted low memory at c0001a5c (1a5c phys) = 1310748c
[   60.388528] Corrupted low memory at c0001a60 (1a60 phys) = 8a415cd4
[   60.388529] Corrupted low memory at c0001a64 (1a64 phys) = 3fbd8981
[   60.388531] Corrupted low memory at c0001a68 (1a68 phys) = a3a58c96
[   60.388533] Corrupted low memory at c0001a6c (1a6c phys) = 2c535589
[   60.388534] Corrupted low memory at c0001a70 (1a70 phys) = c9999b1a
[   60.388536] Corrupted low memory at c0001a74 (1a74 phys) = bc3e2aec
[   60.388538] Corrupted low memory at c0001a78 (1a78 phys) = 9b5a2df1
[   60.388540] Corrupted low memory at c0001a7c (1a7c phys) = d363d4d9
[   60.388541] Corrupted low memory at c0001a80 (1a80 phys) = 7b83d692
[   60.388543] Corrupted low memory at c0001a84 (1a84 phys) = 58913679
[   60.388545] Corrupted low memory at c0001a88 (1a88 phys) = 5f73227a
[   60.388546] Corrupted low memory at c0001a8c (1a8c phys) = a184ef03
[   60.388548] Corrupted low memory at c0001a90 (1a90 phys) = 423b1d9e
[   60.388550] Corrupted low memory at c0001a94 (1a94 phys) = a05d0ced
[   60.388551] Corrupted low memory at c0001a98 (1a98 phys) = b1b1a9a0
[   60.388553] Corrupted low memory at c0001a9c (1a9c phys) = 9a3c48f8
[   60.388555] Corrupted low memory at c0001aa0 (1aa0 phys) = 25641d10
[   60.388556] Corrupted low memory at c0001aa4 (1aa4 phys) = f81d4233
[   60.388558] Corrupted low memory at c0001aa8 (1aa8 phys) = d953eb3c
[   60.388560] Corrupted low memory at c0001aac (1aac phys) = 5622cdbf
[   60.388561] Corrupted low memory at c0001ab0 (1ab0 phys) = d03fa06f
[   60.388563] Corrupted low memory at c0001ab4 (1ab4 phys) = b831bbc5
[   60.388565] Corrupted low memory at c0001ab8 (1ab8 phys) = f48612c4
[   60.388566] Corrupted low memory at c0001abc (1abc phys) = 4b52dd2c
[   60.388568] Corrupted low memory at c0001ac0 (1ac0 phys) = 0d20fca0
[   60.388570] Corrupted low memory at c0001ac4 (1ac4 phys) = 5408b0b0
[   60.388571] Corrupted low memory at c0001ac8 (1ac8 phys) = 7d5aba8b
[   60.388573] Corrupted low memory at c0001acc (1acc phys) = 1deeea11
[   60.388575] Corrupted low memory at c0001ad0 (1ad0 phys) = a865b088
[   60.388577] Corrupted low memory at c0001ad4 (1ad4 phys) = d423a7bd
[   60.388578] Corrupted low memory at c0001ad8 (1ad8 phys) = fa8973bd
[   60.388580] Corrupted low memory at c0001adc (1adc phys) = 892d93aa
[   60.388582] Corrupted low memory at c0001ae0 (1ae0 phys) = c19d8103
[   60.388583] Corrupted low memory at c0001ae4 (1ae4 phys) = cf606ab5
[   60.388585] Corrupted low memory at c0001ae8 (1ae8 phys) = b04d4440
[   60.388587] Corrupted low memory at c0001aec (1aec phys) = 3e0129c2
[   60.388588] Corrupted low memory at c0001af0 (1af0 phys) = ddce81e4
[   60.388590] Corrupted low memory at c0001af4 (1af4 phys) = 22521996
[   60.388592] Corrupted low memory at c0001af8 (1af8 phys) = 81992812
[   60.388593] Corrupted low memory at c0001afc (1afc phys) = cd968ec8
[   60.388595] Corrupted low memory at c0001b00 (1b00 phys) = 09170a9b
[   60.388597] Corrupted low memory at c0001b04 (1b04 phys) = afa916f1
[   60.388598] Corrupted low memory at c0001b08 (1b08 phys) = ba067bdb
[   60.388600] Corrupted low memory at c0001b0c (1b0c phys) = 211b1709
[   60.388602] Corrupted low memory at c0001b10 (1b10 phys) = 9fa048e1
[   60.388603] Corrupted low memory at c0001b14 (1b14 phys) = 477743b5
[   60.388605] Corrupted low memory at c0001b18 (1b18 phys) = fcb2ef70
[   60.388607] Corrupted low memory at c0001b1c (1b1c phys) = a2565ef0
[   60.388608] Corrupted low memory at c0001b20 (1b20 phys) = 201979a5
[   60.388610] Corrupted low memory at c0001b24 (1b24 phys) = eb17b05b
[   60.388612] Corrupted low memory at c0001b28 (1b28 phys) = 902dacd6
[   60.388614] Corrupted low memory at c0001b2c (1b2c phys) = dac5ac95
[   60.388615] Corrupted low memory at c0001b30 (1b30 phys) = e279ced5
[   60.388617] Corrupted low memory at c0001b34 (1b34 phys) = 7abda5bd
[   60.388619] Corrupted low memory at c0001b38 (1b38 phys) = 468fd17a
[   60.388620] Corrupted low memory at c0001b3c (1b3c phys) = 87740fc3
[   60.388622] Corrupted low memory at c0001b40 (1b40 phys) = 7b7b7a07
[   60.388624] Corrupted low memory at c0001b44 (1b44 phys) = 0ec123fa
[   60.388625] Corrupted low memory at c0001b48 (1b48 phys) = 55bc211e
[   60.388627] Corrupted low memory at c0001b4c (1b4c phys) = c2046159
[   60.388629] Corrupted low memory at c0001b50 (1b50 phys) = eb65e81e
[   60.388630] Corrupted low memory at c0001b54 (1b54 phys) = 088e1600
[   60.388632] Corrupted low memory at c0001b58 (1b58 phys) = ff59655b
[   60.388634] Corrupted low memory at c0001b5c (1b5c phys) = 0676319a
[   60.388635] Corrupted low memory at c0001b60 (1b60 phys) = c2c50a7b
[   60.388637] Corrupted low memory at c0001b64 (1b64 phys) = e21a18c0
[   60.388639] Corrupted low memory at c0001b68 (1b68 phys) = 9f273424
[   60.388640] Corrupted low memory at c0001b6c (1b6c phys) = 5f308687
[   60.388642] Corrupted low memory at c0001b70 (1b70 phys) = 885e3076
[   60.388644] Corrupted low memory at c0001b74 (1b74 phys) = 9c703938
[   60.388645] Corrupted low memory at c0001b78 (1b78 phys) = 602cb1dd
[   60.388647] Corrupted low memory at c0001b7c (1b7c phys) = dbfb022c
[   60.388649] Corrupted low memory at c0001b80 (1b80 phys) = c7f74709
[   60.388650] Corrupted low memory at c0001b84 (1b84 phys) = f3312ff6
[   60.388652] Corrupted low memory at c0001b88 (1b88 phys) = 29d9ecef
[   60.388654] Corrupted low memory at c0001b8c (1b8c phys) = d48c1ed6
[   60.388656] Corrupted low memory at c0001b90 (1b90 phys) = debfafeb
[   60.388657] Corrupted low memory at c0001b94 (1b94 phys) = ab4371cf
[   60.388659] Corrupted low memory at c0001b98 (1b98 phys) = c16698bf
[   60.388661] Corrupted low memory at c0001b9c (1b9c phys) = b483044e
[   60.388662] Corrupted low memory at c0001ba0 (1ba0 phys) = 2e0464e9
[   60.388664] Corrupted low memory at c0001ba4 (1ba4 phys) = c0be0126
[   60.388666] Corrupted low memory at c0001ba8 (1ba8 phys) = 953679b0
[   60.388667] Corrupted low memory at c0001bac (1bac phys) = e18c9e9d
[   60.388669] Corrupted low memory at c0001bb0 (1bb0 phys) = 83cd8861
[   60.388671] Corrupted low memory at c0001bb4 (1bb4 phys) = e47470c4
[   60.388672] Corrupted low memory at c0001bb8 (1bb8 phys) = 303d3540
[   60.388674] Corrupted low memory at c0001bbc (1bbc phys) = a319297a
[   60.388676] Corrupted low memory at c0001bc0 (1bc0 phys) = a48d0def
[   60.388677] Corrupted low memory at c0001bc4 (1bc4 phys) = fed992a6
[   60.388679] Corrupted low memory at c0001bc8 (1bc8 phys) = 05147458
[   60.388681] Corrupted low memory at c0001bcc (1bcc phys) = 0d14e219
[   60.388682] Corrupted low memory at c0001bd0 (1bd0 phys) = 3a6c9981
[   60.388684] Corrupted low memory at c0001bd4 (1bd4 phys) = 77409d16
[   60.388686] Corrupted low memory at c0001bd8 (1bd8 phys) = d2c54df6
[   60.388687] Corrupted low memory at c0001bdc (1bdc phys) = 3ac921d3
[   60.388689] Corrupted low memory at c0001be0 (1be0 phys) = 64952423
[   60.388691] Corrupted low memory at c0001be4 (1be4 phys) = 259a4e5c
[   60.388693] Corrupted low memory at c0001be8 (1be8 phys) = e18a7bbb
[   60.388694] Corrupted low memory at c0001bec (1bec phys) = 9211469a
[   60.388696] Corrupted low memory at c0001bf0 (1bf0 phys) = 4d01149a
[   60.388698] Corrupted low memory at c0001bf4 (1bf4 phys) = 28160014
[   60.388699] Corrupted low memory at c0001bf8 (1bf8 phys) = 4f1a3070
[   60.388701] Corrupted low memory at c0001bfc (1bfc phys) = 31138744
[   60.388703] Corrupted low memory at c0001c00 (1c00 phys) = b0510990
[   60.388704] Corrupted low memory at c0001c04 (1c04 phys) = 76c93d9b
[   60.388706] Corrupted low memory at c0001c08 (1c08 phys) = d8fef5fe
[   60.388708] Corrupted low memory at c0001c0c (1c0c phys) = 5af19c68
[   60.388709] Corrupted low memory at c0001c10 (1c10 phys) = b99a06d9
[   60.388711] Corrupted low memory at c0001c14 (1c14 phys) = 3aff2359
[   60.388713] Corrupted low memory at c0001c18 (1c18 phys) = f2df2c4a
[   60.388714] Corrupted low memory at c0001c1c (1c1c phys) = 33424627
[   60.388716] Corrupted low memory at c0001c20 (1c20 phys) = a993295e
[   60.388718] Corrupted low memory at c0001c24 (1c24 phys) = 4ecedf96
[   60.388719] Corrupted low memory at c0001c28 (1c28 phys) = 10cc0b11
[   60.388721] Corrupted low memory at c0001c2c (1c2c phys) = b63a54d2
[   60.388723] Corrupted low memory at c0001c30 (1c30 phys) = d7d8c93b
[   60.388724] Corrupted low memory at c0001c34 (1c34 phys) = 3d23d476
[   60.388726] Corrupted low memory at c0001c38 (1c38 phys) = 92b8cedd
[   60.388728] Corrupted low memory at c0001c3c (1c3c phys) = 0c2df932
[   60.388729] Corrupted low memory at c0001c40 (1c40 phys) = e318b1a5
[   60.388731] Corrupted low memory at c0001c44 (1c44 phys) = 32735901
[   60.388733] Corrupted low memory at c0001c48 (1c48 phys) = e9c1d44c
[   60.388734] Corrupted low memory at c0001c4c (1c4c phys) = 2860fa34
[   60.388736] Corrupted low memory at c0001c50 (1c50 phys) = d9b2cf19
[   60.388738] Corrupted low memory at c0001c54 (1c54 phys) = 3627bf1d
[   60.388740] Corrupted low memory at c0001c58 (1c58 phys) = ea8be84a
[   60.388741] Corrupted low memory at c0001c5c (1c5c phys) = be2070cd
[   60.388743] Corrupted low memory at c0001c60 (1c60 phys) = decee8c0
[   60.388745] Corrupted low memory at c0001c64 (1c64 phys) = 9c4a3a68
[   60.388746] Corrupted low memory at c0001c68 (1c68 phys) = e87b10d8
[   60.388748] Corrupted low memory at c0001c6c (1c6c phys) = cf43dd4a
[   60.388750] Corrupted low memory at c0001c70 (1c70 phys) = 8ca4bc8d
[   60.388751] Corrupted low memory at c0001c74 (1c74 phys) = 394c6267
[   60.388753] Corrupted low memory at c0001c78 (1c78 phys) = 308a1b50
[   60.388755] Corrupted low memory at c0001c7c (1c7c phys) = 8b0842c4
[   60.388756] Corrupted low memory at c0001c80 (1c80 phys) = 4e461450
[   60.388758] Corrupted low memory at c0001c84 (1c84 phys) = 4799bba5
[   60.388760] Corrupted low memory at c0001c88 (1c88 phys) = 99a3c749
[   60.388761] Corrupted low memory at c0001c8c (1c8c phys) = 908258fe
[   60.388763] Corrupted low memory at c0001c90 (1c90 phys) = c50e9227
[   60.388765] Corrupted low memory at c0001c94 (1c94 phys) = c4f737d9
[   60.388766] Corrupted low memory at c0001c98 (1c98 phys) = dc40e3b3
[   60.388768] Corrupted low memory at c0001c9c (1c9c phys) = 9e182e34
[   60.388770] Corrupted low memory at c0001ca0 (1ca0 phys) = a48865e2
[   60.388771] Corrupted low memory at c0001ca4 (1ca4 phys) = 16ccb610
[   60.388773] Corrupted low memory at c0001ca8 (1ca8 phys) = 9b2b0416
[   60.388775] Corrupted low memory at c0001cac (1cac phys) = e51856cc
[   60.388776] Corrupted low memory at c0001cb0 (1cb0 phys) = 92a74943
[   60.388778] Corrupted low memory at c0001cb4 (1cb4 phys) = 568d307c
[   60.388780] Corrupted low memory at c0001cb8 (1cb8 phys) = f112e950
[   60.388782] Corrupted low memory at c0001cbc (1cbc phys) = 329ea668
[   60.388783] Corrupted low memory at c0001cc0 (1cc0 phys) = 4151651f
[   60.388785] Corrupted low memory at c0001cc4 (1cc4 phys) = fd51a7b1
[   60.388787] Corrupted low memory at c0001cc8 (1cc8 phys) = 35281021
[   60.388788] Corrupted low memory at c0001ccc (1ccc phys) = 51c88812
[   60.388790] Corrupted low memory at c0001cd0 (1cd0 phys) = b4747ac0
[   60.388792] Corrupted low memory at c0001cd4 (1cd4 phys) = 86989927
[   60.388793] Corrupted low memory at c0001cd8 (1cd8 phys) = 711508aa
[   60.388795] Corrupted low memory at c0001cdc (1cdc phys) = 8996bad4
[   60.388797] Corrupted low memory at c0001ce0 (1ce0 phys) = 06a48ffe
[   60.388798] Corrupted low memory at c0001ce4 (1ce4 phys) = 18ba5926
[   60.388800] Corrupted low memory at c0001ce8 (1ce8 phys) = 6c08627d
[   60.388802] Corrupted low memory at c0001cec (1cec phys) = 6009e53b
[   60.388803] Corrupted low memory at c0001cf0 (1cf0 phys) = 6f58a889
[   60.388805] Corrupted low memory at c0001cf4 (1cf4 phys) = 21fff05a
[   60.388807] Corrupted low memory at c0001cf8 (1cf8 phys) = 78d0fc46
[   60.388808] Corrupted low memory at c0001cfc (1cfc phys) = 72338f19
[   60.388810] Corrupted low memory at c0001d00 (1d00 phys) = 9311493d
[   60.388812] Corrupted low memory at c0001d04 (1d04 phys) = c00805ac
[   60.388813] Corrupted low memory at c0001d08 (1d08 phys) = dfaaa3f0
[   60.388815] Corrupted low memory at c0001d0c (1d0c phys) = 3ac4218d
[   60.388817] Corrupted low memory at c0001d10 (1d10 phys) = 6178cd3e
[   60.388818] Corrupted low memory at c0001d14 (1d14 phys) = c710f0fa
[   60.388820] Corrupted low memory at c0001d18 (1d18 phys) = e9523bbf
[   60.388822] Corrupted low memory at c0001d1c (1d1c phys) = 6d5e7a89
[   60.388824] Corrupted low memory at c0001d20 (1d20 phys) = 870948f9
[   60.388825] Corrupted low memory at c0001d24 (1d24 phys) = 310aab18
[   60.388827] Corrupted low memory at c0001d28 (1d28 phys) = 643e7c01
[   60.388829] Corrupted low memory at c0001d2c (1d2c phys) = 78b1c953
[   60.388830] Corrupted low memory at c0001d30 (1d30 phys) = c4be827a
[   60.388832] Corrupted low memory at c0001d34 (1d34 phys) = ea777bf0
[   60.388834] Corrupted low memory at c0001d38 (1d38 phys) = d26f3350
[   60.388835] Corrupted low memory at c0001d3c (1d3c phys) = ec8d964c
[   60.388837] Corrupted low memory at c0001d40 (1d40 phys) = a9ba3176
[   60.388839] Corrupted low memory at c0001d44 (1d44 phys) = 33b03274
[   60.388840] Corrupted low memory at c0001d48 (1d48 phys) = aa2e17c5
[   60.388842] Corrupted low memory at c0001d4c (1d4c phys) = 885514d8
[   60.388844] Corrupted low memory at c0001d50 (1d50 phys) = 28c1d906
[   60.388845] Corrupted low memory at c0001d54 (1d54 phys) = 883cf91f
[   60.388847] Corrupted low memory at c0001d58 (1d58 phys) = ca22f9f5
[   60.388849] Corrupted low memory at c0001d5c (1d5c phys) = 40617d89
[   60.388850] Corrupted low memory at c0001d60 (1d60 phys) = 0ce06086
[   60.388852] Corrupted low memory at c0001d64 (1d64 phys) = 2dbb3825
[   60.388854] Corrupted low memory at c0001d68 (1d68 phys) = 399a237a
[   60.388855] Corrupted low memory at c0001d6c (1d6c phys) = 6b670540
[   60.388857] Corrupted low memory at c0001d70 (1d70 phys) = 2c751b26
[   60.388859] Corrupted low memory at c0001d74 (1d74 phys) = d18750db
[   60.388860] Corrupted low memory at c0001d78 (1d78 phys) = 94d83ba9
[   60.388862] Corrupted low memory at c0001d7c (1d7c phys) = 64a684c5
[   60.388864] Corrupted low memory at c0001d80 (1d80 phys) = 4a42f234
[   60.388865] Corrupted low memory at c0001d84 (1d84 phys) = 1a3532e8
[   60.388867] Corrupted low memory at c0001d88 (1d88 phys) = 68e4f498
[   60.388869] Corrupted low memory at c0001d8c (1d8c phys) = 4bab9b27
[   60.388871] Corrupted low memory at c0001d90 (1d90 phys) = 4e0b22f2
[   60.388872] Corrupted low memory at c0001d94 (1d94 phys) = 3652f59c
[   60.388874] Corrupted low memory at c0001d98 (1d98 phys) = 8c1d8f1d
[   60.388876] Corrupted low memory at c0001d9c (1d9c phys) = c8ce49f1
[   60.388877] Corrupted low memory at c0001da0 (1da0 phys) = 66a4cc5d
[   60.388879] Corrupted low memory at c0001da4 (1da4 phys) = 96612e5b
[   60.388881] Corrupted low memory at c0001da8 (1da8 phys) = 3f8c51b0
[   60.388882] Corrupted low memory at c0001dac (1dac phys) = 319ee256
[   60.388884] Corrupted low memory at c0001db0 (1db0 phys) = a512f274
[   60.388886] Corrupted low memory at c0001db4 (1db4 phys) = a3a302a9
[   60.388887] Corrupted low memory at c0001db8 (1db8 phys) = a7367341
[   60.388889] Corrupted low memory at c0001dbc (1dbc phys) = bd66f892
[   60.388891] Corrupted low memory at c0001dc0 (1dc0 phys) = db4e8048
[   60.388892] Corrupted low memory at c0001dc4 (1dc4 phys) = 146e9b1d
[   60.388894] Corrupted low memory at c0001dc8 (1dc8 phys) = 918f1c03
[   60.388896] Corrupted low memory at c0001dcc (1dcc phys) = b2745754
[   60.388897] Corrupted low memory at c0001dd0 (1dd0 phys) = bd05779a
[   60.388899] Corrupted low memory at c0001dd4 (1dd4 phys) = 250ed4b7
[   60.388901] Corrupted low memory at c0001dd8 (1dd8 phys) = 7406a258
[   60.388903] Corrupted low memory at c0001ddc (1ddc phys) = 86846e09
[   60.388904] Corrupted low memory at c0001de0 (1de0 phys) = 95fa24d5
[   60.388906] Corrupted low memory at c0001de4 (1de4 phys) = 31978215
[   60.388908] Corrupted low memory at c0001de8 (1de8 phys) = ff19e623
[   60.388909] Corrupted low memory at c0001dec (1dec phys) = 2111bd7a
[   60.388911] Corrupted low memory at c0001df0 (1df0 phys) = 89797204
[   60.388913] Corrupted low memory at c0001df4 (1df4 phys) = b0c818f1
[   60.388914] Corrupted low memory at c0001df8 (1df8 phys) = b8abc579
[   60.388916] Corrupted low memory at c0001dfc (1dfc phys) = e9340954
[   60.388918] Corrupted low memory at c0001e00 (1e00 phys) = 342566ac
[   60.388919] Corrupted low memory at c0001e04 (1e04 phys) = b34b1e3c
[   60.388921] Corrupted low memory at c0001e08 (1e08 phys) = 383a9cc5
[   60.388923] Corrupted low memory at c0001e0c (1e0c phys) = 73ec5faf
[   60.388924] Corrupted low memory at c0001e10 (1e10 phys) = c95e8d09
[   60.388926] Corrupted low memory at c0001e14 (1e14 phys) = 269c121b
[   60.388928] Corrupted low memory at c0001e18 (1e18 phys) = 13cc03a1
[   60.388929] Corrupted low memory at c0001e1c (1e1c phys) = d31a0092
[   60.388931] Corrupted low memory at c0001e20 (1e20 phys) = 1b449ea4
[   60.388933] Corrupted low memory at c0001e24 (1e24 phys) = 58c7f0c6
[   60.388934] Corrupted low memory at c0001e28 (1e28 phys) = 491843e7
[   60.388936] Corrupted low memory at c0001e2c (1e2c phys) = 87f43469
[   60.388938] Corrupted low memory at c0001e30 (1e30 phys) = d1918daa
[   60.388939] Corrupted low memory at c0001e34 (1e34 phys) = 50d3144f
[   60.388941] Corrupted low memory at c0001e38 (1e38 phys) = 06a98c42
[   60.388943] Corrupted low memory at c0001e3c (1e3c phys) = b19d90d1
[   60.388944] Corrupted low memory at c0001e40 (1e40 phys) = fbe618e8
[   60.388946] Corrupted low memory at c0001e44 (1e44 phys) = 46f03e1c
[   60.388948] Corrupted low memory at c0001e48 (1e48 phys) = a454f98b
[   60.388949] Corrupted low memory at c0001e4c (1e4c phys) = 53268f2e
[   60.388951] Corrupted low memory at c0001e50 (1e50 phys) = 6009e9c9
[   60.388953] Corrupted low memory at c0001e54 (1e54 phys) = 6f7d3418
[   60.388955] Corrupted low memory at c0001e58 (1e58 phys) = 0eae3b34
[   60.388956] Corrupted low memory at c0001e5c (1e5c phys) = f93a3959
[   60.388958] Corrupted low memory at c0001e60 (1e60 phys) = 24a13227
[   60.388960] Corrupted low memory at c0001e64 (1e64 phys) = 25ebd263
[   60.388961] Corrupted low memory at c0001e68 (1e68 phys) = ec9c981c
[   60.388963] Corrupted low memory at c0001e6c (1e6c phys) = e41ba788
[   60.388965] Corrupted low memory at c0001e70 (1e70 phys) = ac082423
[   60.388966] Corrupted low memory at c0001e74 (1e74 phys) = e696754e
[   60.388968] Corrupted low memory at c0001e78 (1e78 phys) = 913317ad
[   60.388970] Corrupted low memory at c0001e7c (1e7c phys) = 1cccda0f
[   60.388971] Corrupted low memory at c0001e80 (1e80 phys) = 25e3c452
[   60.388973] Corrupted low memory at c0001e84 (1e84 phys) = 82e3b4c1
[   60.388975] Corrupted low memory at c0001e88 (1e88 phys) = d6a93927
[   60.388976] Corrupted low memory at c0001e8c (1e8c phys) = 8d86f28b
[   60.388978] Corrupted low memory at c0001e90 (1e90 phys) = dea22010
[   60.388980] Corrupted low memory at c0001e94 (1e94 phys) = 2a64f4a9
[   60.388981] Corrupted low memory at c0001e98 (1e98 phys) = 9d4ac843
[   60.388983] Corrupted low memory at c0001e9c (1e9c phys) = 33b058a0
[   60.388985] Corrupted low memory at c0001ea0 (1ea0 phys) = 07909353
[   60.388987] Corrupted low memory at c0001ea4 (1ea4 phys) = 99960431
[   60.388988] Corrupted low memory at c0001ea8 (1ea8 phys) = 0c64a464
[   60.388990] Corrupted low memory at c0001eac (1eac phys) = e389aba3
[   60.388992] Corrupted low memory at c0001eb0 (1eb0 phys) = 8f927ebf
[   60.388993] Corrupted low memory at c0001eb4 (1eb4 phys) = a226a476
[   60.388995] Corrupted low memory at c0001eb8 (1eb8 phys) = 1e12e4f1
[   60.388997] Corrupted low memory at c0001ebc (1ebc phys) = a1879ac4
[   60.388998] Corrupted low memory at c0001ec0 (1ec0 phys) = 28e30cbf
[   60.389000] Corrupted low memory at c0001ec4 (1ec4 phys) = 54b1c942
[   60.389002] Corrupted low memory at c0001ec8 (1ec8 phys) = 8c9c3113
[   60.389003] Corrupted low memory at c0001ecc (1ecc phys) = 328d8c74
[   60.389005] Corrupted low memory at c0001ed0 (1ed0 phys) = 177c36b4
[   60.389007] Corrupted low memory at c0001ed4 (1ed4 phys) = 02cde343
[   60.389008] Corrupted low memory at c0001ed8 (1ed8 phys) = 6ca6f3c9
[   60.389010] Corrupted low memory at c0001edc (1edc phys) = 66b30e9b
[   60.389012] Corrupted low memory at c0001ee0 (1ee0 phys) = 32c336a9
[   60.389013] Corrupted low memory at c0001ee4 (1ee4 phys) = ce200118
[   60.389015] Corrupted low memory at c0001ee8 (1ee8 phys) = 0f2e80a7
[   60.389017] Corrupted low memory at c0001eec (1eec phys) = 6c53dac6
[   60.389019] Corrupted low memory at c0001ef0 (1ef0 phys) = dfc8e085
[   60.389020] Corrupted low memory at c0001ef4 (1ef4 phys) = 5b68b6c4
[   60.389022] Corrupted low memory at c0001ef8 (1ef8 phys) = a9580e81
[   60.389024] Corrupted low memory at c0001efc (1efc phys) = 589218b3
[   60.389025] Corrupted low memory at c0001f00 (1f00 phys) = 98e1f1be
[   60.389027] Corrupted low memory at c0001f04 (1f04 phys) = ae978b19
[   60.389029] Corrupted low memory at c0001f08 (1f08 phys) = cbcee0c2
[   60.389030] Corrupted low memory at c0001f0c (1f0c phys) = f865d088
[   60.389032] Corrupted low memory at c0001f10 (1f10 phys) = 3d025eb9
[   60.389034] Corrupted low memory at c0001f14 (1f14 phys) = 92ac6363
[   60.389035] Corrupted low memory at c0001f18 (1f18 phys) = 8b2763c6
[   60.389037] Corrupted low memory at c0001f1c (1f1c phys) = 33af1d97
[   60.389039] Corrupted low memory at c0001f20 (1f20 phys) = ac6cc7e8
[   60.389040] Corrupted low memory at c0001f24 (1f24 phys) = e555b42f
[   60.389042] Corrupted low memory at c0001f28 (1f28 phys) = b03f1d7c
[   60.389044] Corrupted low memory at c0001f2c (1f2c phys) = 13608489
[   60.389045] Corrupted low memory at c0001f30 (1f30 phys) = 82e9d993
[   60.389047] Corrupted low memory at c0001f34 (1f34 phys) = 0735e7df
[   60.389049] Corrupted low memory at c0001f38 (1f38 phys) = c58e9634
[   60.389050] Corrupted low memory at c0001f3c (1f3c phys) = 684421ee
[   60.389052] Corrupted low memory at c0001f40 (1f40 phys) = 9a058639
[   60.389054] Corrupted low memory at c0001f44 (1f44 phys) = 05d1d426
[   60.389055] Corrupted low memory at c0001f48 (1f48 phys) = fad3746a
[   60.389057] Corrupted low memory at c0001f4c (1f4c phys) = 6d365bb5
[   60.389059] Corrupted low memory at c0001f50 (1f50 phys) = 196d1cde
[   60.389061] Corrupted low memory at c0001f54 (1f54 phys) = df596d8b
[   60.389062] Corrupted low memory at c0001f58 (1f58 phys) = f0e8c372
[   60.389064] Corrupted low memory at c0001f5c (1f5c phys) = 632d96fa
[   60.389066] Corrupted low memory at c0001f60 (1f60 phys) = 6cb4b7eb
[   60.389067] Corrupted low memory at c0001f64 (1f64 phys) = 6479bb5e
[   60.389069] Corrupted low memory at c0001f68 (1f68 phys) = b1e6cd6c
[   60.389071] Corrupted low memory at c0001f6c (1f6c phys) = d86dd61b
[   60.389072] Corrupted low memory at c0001f70 (1f70 phys) = ad219f78
[   60.389074] Corrupted low memory at c0001f74 (1f74 phys) = 7194750e
[   60.389076] Corrupted low memory at c0001f78 (1f78 phys) = eecf53a9
[   60.389077] Corrupted low memory at c0001f7c (1f7c phys) = fb29394a
[   60.389079] Corrupted low memory at c0001f80 (1f80 phys) = 48241b07
[   60.389081] Corrupted low memory at c0001f84 (1f84 phys) = 746c736b
[   60.389082] Corrupted low memory at c0001f88 (1f88 phys) = 58e84729
[   60.389084] Corrupted low memory at c0001f8c (1f8c phys) = ddab6d21
[   60.389086] Corrupted low memory at c0001f90 (1f90 phys) = 753ac490
[   60.389087] Corrupted low memory at c0001f94 (1f94 phys) = 6d541c16
[   60.389089] Corrupted low memory at c0001f98 (1f98 phys) = 269900ae
[   60.389091] Corrupted low memory at c0001f9c (1f9c phys) = c853a48e
[   60.389092] Corrupted low memory at c0001fa0 (1fa0 phys) = 5a644f88
[   60.389094] Corrupted low memory at c0001fa4 (1fa4 phys) = c60e8770
[   60.389096] Corrupted low memory at c0001fa8 (1fa8 phys) = 9cb67133
[   60.389098] Corrupted low memory at c0001fac (1fac phys) = 8678a65b
[   60.389099] Corrupted low memory at c0001fb0 (1fb0 phys) = dc7e6810
[   60.389101] Corrupted low memory at c0001fb4 (1fb4 phys) = 1ece3a54
[   60.389103] Corrupted low memory at c0001fb8 (1fb8 phys) = e44b09a2
[   60.389104] Corrupted low memory at c0001fbc (1fbc phys) = c5e65c2d
[   60.389106] Corrupted low memory at c0001fc0 (1fc0 phys) = 8f54cd16
[   60.389108] Corrupted low memory at c0001fc4 (1fc4 phys) = d61a7bd3
[   60.389109] Corrupted low memory at c0001fc8 (1fc8 phys) = c4f153c3
[   60.389111] Corrupted low memory at c0001fcc (1fcc phys) = 15a43b68
[   60.389113] Corrupted low memory at c0001fd0 (1fd0 phys) = 98b8b2cd
[   60.389114] Corrupted low memory at c0001fd4 (1fd4 phys) = 13732243
[   60.389116] Corrupted low memory at c0001fd8 (1fd8 phys) = 0d812a73
[   60.389118] Corrupted low memory at c0001fdc (1fdc phys) = 553f6046
[   60.389119] Corrupted low memory at c0001fe0 (1fe0 phys) = ef5c9177
[   60.389121] Corrupted low memory at c0001fe4 (1fe4 phys) = 0f031b28
[   60.389123] Corrupted low memory at c0001fe8 (1fe8 phys) = 304a188b
[   60.389124] Corrupted low memory at c0001fec (1fec phys) = 2430c1b5
[   60.389126] Corrupted low memory at c0001ff0 (1ff0 phys) = 6825c0c5
[   60.389128] Corrupted low memory at c0001ff4 (1ff4 phys) = 8961e31e
[   60.389129] Corrupted low memory at c0001ff8 (1ff8 phys) = b1418fc0
[   60.389131] Corrupted low memory at c0001ffc (1ffc phys) = 2712322a
[   60.389151] ------------[ cut here ]------------
[   60.389158] WARNING: CPU: 0 PID: 698 at arch/x86/kernel/check.c:141 check_for_bios_corruption+0xbf/0xd0()
[   60.389160] Memory corruption detected in low memory
[   60.389161] Modules linked in: nf_conntrack_ftp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_log_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_NFLOG nfnetlink_log nfnetlink iptable_filter ip_tables nf_log_ipv6 nf_log_common xt_LOG xt_limit nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack xt_tcpudp xt_pkttype xt_owner xt_multiport ip6table_filter ip6_tables x_tables ipv6 nvidia_modeset(O) w83627ehf adt7475 hwmon_vid coretemp snd_hda_codec_realtek snd_hda_codec_generic vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc fuse hid_generic usbhid hid snd_usb_audio snd_usbmidi_lib snd_rawmidi uvcvideo videobuf2_core v4l2_common videodev videobuf2_vmalloc videobuf2_memops btusb btintel bluetooth snd_hda_codec_hdmi 8250 8250_base serial_core microcode pcspkr fan i2c_i801 sr_mod cdrom sg xhci_pci xhci_hcd ehci_pci ehci_hcd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd e1000e ptp pps_core nvidia(O) drm agpgart evdev
[   60.389225] CPU: 0 PID: 698 Comm: kworker/0:2 Tainted: G           O    4.3.3-ic #1
[   60.389227] Hardware name: System manufacturer System Product Name/P8P67 PRO, BIOS 3602 11/01/2012
[   60.389231] Workqueue: events check_corruption
[   60.389233]  00000000 00000000 ef32de88 c8fd1c54 ef32dec8 ef32deb8 c8e43913 c924a3c0
[   60.389238]  ef32dee4 000002ba c9243492 0000008d c8e389ff c8e389ff 00000000 c0010000
[   60.389244]  c9371a90 ef32ded0 c8e439ce 00000009 ef32dec8 c924a3c0 ef32dee4 ef32def8
[   60.389249] Call Trace:
[   60.389254]  [<c8fd1c54>] dump_stack+0x48/0x74
[   60.389258]  [<c8e43913>] warn_slowpath_common+0x83/0xc0
[   60.389262]  [<c8e389ff>] ? check_for_bios_corruption+0xbf/0xd0
[   60.389265]  [<c8e389ff>] ? check_for_bios_corruption+0xbf/0xd0
[   60.389268]  [<c8e439ce>] warn_slowpath_fmt+0x2e/0x30
[   60.389271]  [<c8e389ff>] check_for_bios_corruption+0xbf/0xd0
[   60.389274]  [<c8e38a1b>] check_corruption+0xb/0x40
[   60.389278]  [<c8e56065>] process_one_work+0xf5/0x2b0
[   60.389281]  [<c8e5633d>] worker_thread+0xed/0x3c0
[   60.389284]  [<c8e56250>] ? process_scheduled_works+0x30/0x30
[   60.389288]  [<c8e5ae96>] kthread+0x96/0xb0
[   60.389293]  [<c91ab0c1>] ret_from_kernel_thread+0x21/0x30
[   60.389297]  [<c8e5ae00>] ? kthread_freezable_should_stop+0x50/0x50
[   60.389300] ---[ end trace 3cad7d8ebaa721c0 ]---
[   74.367608] ata1.00: exception Emask 0x3 SAct 0x180000 SErr 0x3040400 action 0x6 frozen
[   74.367613] ata1.00: irq_stat 0x45000008
[   74.367617] ata1: SError: { Proto CommWake TrStaTrns UnrecFIS }
[   74.367621] ata1.00: failed command: READ FPDMA QUEUED
[   74.367627] ata1.00: cmd 60/00:98:00:89:21/07:00:04:00:00/40 tag 19 ncq 917504 in
[   74.367627]          res 40/00:98:00:89:21/00:00:04:00:00/40 Emask 0x3 (HSM violation)
[   74.367630] ata1.00: status: { DRDY }
[   74.367632] XXX cmd=ee9e0260 cmd_tbl=ee9ed600 ahci_sg=ee9ed680
[   74.367634] XXX opts=140005 st=0 addr=2e9ed600 addr_hi=0 rsvd=0:0:0:0
[   74.367637] XXX fis=00608027:40218900:07000004:08000098 00000000:00000000:00000000:00001fff
[   74.367639] XXX qc->n_elem=20 fis_len=5 prdtl=20
[   74.367641] XXX sg[0] = 29007000 0 8fff (36864)
[   74.367643] XXX sg[1] = 29218000 0 7fff (32768)
[   74.367645] XXX sg[2] = 29298000 0 7fff (32768)
[   74.367646] XXX sg[3] = 29ad8000 0 7fff (32768)
[   74.367648] XXX sg[4] = 29338000 0 7fff (32768)
[   74.367650] XXX sg[5] = 29370000 0 2fff (12288)
[   74.367652] XXX sg[6] = 219000 0 2fff (12288)
[   74.367653] XXX sg[7] = 230000 0 3fff (16384)
[   74.367655] XXX sg[8] = 29373000 0 4fff (20480)
[   74.367657] XXX sg[9] = 29130000 0 ffff (65536)
[   74.367659] XXX sg[10] = 29170000 0 ffff (65536)
[   74.367660] XXX sg[11] = 29280000 0 ffff (65536)
[   74.367662] XXX sg[12] = 29200000 0 ffff (65536)
[   74.367664] XXX sg[13] = 29320000 0 ffff (65536)
[   74.367666] XXX sg[14] = 29360000 0 ffff (65536)
[   74.367667] XXX sg[15] = 29340000 0 ffff (65536)
[   74.367669] XXX sg[16] = 29350000 0 ffff (65536)
[   74.367671] XXX sg[17] = 29300000 0 ffff (65536)
[   74.367672] XXX sg[18] = 29310000 0 ffff (65536)
[   74.367674] XXX sg[19] = 29020000 0 7fff (32768)
[   74.367677] ata1.00: failed command: READ FPDMA QUEUED
[   74.367682] ata1.00: cmd 60/90:a0:c0:fe:23/00:00:04:00:00/40 tag 20 ncq 73728 in
[   74.367682]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x7 (timeout)
[   74.367685] ata1.00: status: { DRDY }
[   74.367686] XXX cmd=ee9e0280 cmd_tbl=ee9ee100 ahci_sg=ee9ee180
[   74.367689] XXX opts=120005 st=0 addr=2e9ee100 addr_hi=0 rsvd=0:0:0:0
[   74.367691] XXX fis=90608027:4023fec0:00000004:080000a0 00000000:00000000:00000000:00000000
[   74.367693] XXX qc->n_elem=18 fis_len=5 prdtl=18
[   74.367695] XXX sg[0] = 29006000 0 fff (4096)
[   74.367696] XXX sg[1] = 29005000 0 fff (4096)
[   74.367698] XXX sg[2] = 29004000 0 fff (4096)
[   74.367700] XXX sg[3] = 29213000 0 fff (4096)
[   74.367701] XXX sg[4] = 29212000 0 fff (4096)
[   74.367703] XXX sg[5] = 29211000 0 fff (4096)
[   74.367705] XXX sg[6] = 29210000 0 fff (4096)
[   74.367707] XXX sg[7] = 29293000 0 fff (4096)
[   74.367708] XXX sg[8] = 29292000 0 fff (4096)
[   74.367710] XXX sg[9] = 29291000 0 fff (4096)
[   74.367712] XXX sg[10] = 29290000 0 fff (4096)
[   74.367713] XXX sg[11] = 29a67000 0 fff (4096)
[   74.367715] XXX sg[12] = 29a66000 0 fff (4096)
[   74.367717] XXX sg[13] = 29a65000 0 fff (4096)
[   74.367719] XXX sg[14] = 29a64000 0 fff (4096)
[   74.367720] XXX sg[15] = 29ad7000 0 fff (4096)
[   74.367722] XXX sg[16] = 29ad6000 0 fff (4096)
[   74.367724] XXX sg[17] = 29ad5000 0 fff (4096)
[   74.367728] ata1: hard resetting link
[   74.687286] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   74.687763] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[   74.687772] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[   74.688563] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[   74.688571] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[   74.688839] ata1.00: configured for UDMA/133
[   74.688958] ata1.00: device reported invalid CHS sector 0
[   74.688972] ata1: EH complete
[   74.763895] ata1.00: exception Emask 0x3 SAct 0x800000 SErr 0x3040400 action 0x6
[   74.763900] ata1.00: irq_stat 0x45000008
[   74.763903] ata1: SError: { Proto CommWake TrStaTrns UnrecFIS }
[   74.763907] ata1.00: failed command: READ FPDMA QUEUED
[   74.763913] ata1.00: cmd 60/98:b8:00:c9:20/03:00:04:00:00/40 tag 23 ncq 471040 in
[   74.763913]          res 40/00:b8:00:c9:20/00:00:04:00:00/40 Emask 0x3 (HSM violation)
[   74.763916] ata1.00: status: { DRDY }
[   74.763918] XXX cmd=ee9e02e0 cmd_tbl=ee9f0200 ahci_sg=ee9f0280
[   74.763921] XXX opts=160005 st=0 addr=2e9f0200 addr_hi=0 rsvd=0:0:0:0
[   74.763924] XXX fis=98608027:4020c900:03000004:080000b8 00000000:00000000:00000000:00000fff
[   74.763925] XXX qc->n_elem=22 fis_len=5 prdtl=22
[   74.763928] XXX sg[0] = 2ab1b000 0 fff (4096)
[   74.763929] XXX sg[1] = 2ab1e000 0 1fff (8192)
[   74.763931] XXX sg[2] = 2ab22000 0 1fff (8192)
[   74.763933] XXX sg[3] = 29968000 0 1fff (8192)
[   74.763935] XXX sg[4] = 29162000 0 3fff (16384)
[   74.763936] XXX sg[5] = 29216000 0 1fff (8192)
[   74.763938] XXX sg[6] = 29b8e000 0 1fff (8192)
[   74.763940] XXX sg[7] = 2909c000 0 1fff (8192)
[   74.763941] XXX sg[8] = 29984000 0 1fff (8192)
[   74.763943] XXX sg[9] = 292b4000 0 1fff (8192)
[   74.763945] XXX sg[10] = 29098000 0 3fff (16384)
[   74.763947] XXX sg[11] = 292b0000 0 3fff (16384)
[   74.763949] XXX sg[12] = 29280000 0 7fff (32768)
[   74.763950] XXX sg[13] = 29370000 0 7fff (32768)
[   74.763952] XXX sg[14] = 29338000 0 7fff (32768)
[   74.763954] XXX sg[15] = 29ad8000 0 7fff (32768)
[   74.763955] XXX sg[16] = 29298000 0 7fff (32768)
[   74.763957] XXX sg[17] = 29218000 0 7fff (32768)
[   74.763959] XXX sg[18] = 29008000 0 7fff (32768)
[   74.763961] XXX sg[19] = 29090000 0 7fff (32768)
[   74.763962] XXX sg[20] = 292a0000 0 7fff (32768)
[   74.763964] XXX sg[21] = 29170000 0 dfff (57344)
[   74.763968] ata1: hard resetting link
[   75.083579] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   75.084051] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[   75.084060] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[   75.084883] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[   75.084891] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[   75.085156] ata1.00: configured for UDMA/133
[   75.085283] ata1: EH complete



* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  5:23           ` Linus Torvalds
@ 2015-12-21  7:31             ` Artem S. Tashkinov
  2015-12-22  4:06             ` Artem S. Tashkinov
  1 sibling, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-21  7:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, linus971

On 2015-12-21 10:23, Linus Torvalds wrote:
> On Sun, Dec 20, 2015 at 8:47 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> 
>> That said, we obviously need to figure out this current problem
>> regardless first..
> 
> ... although maybe it *would* be interesting to hear what happens if
> you just compile a 64-bit kernel instead?
> 
> Do you still see the problem? Because if not, then we should look very
> specifically for some 32-bit PAE issue.
> 
> For example, maybe we use "unsigned long" somewhere where we should
> use "phys_addr_t". On x86-64, they obviously end up being the same. On
> normal non-PAE x86-32, they are also the same. But ..
> 
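
A minimal sketch of the failure mode described in the quote above,
assuming a hypothetical physical address above 4 GiB on a 32-bit PAE
kernel, where "unsigned long" is 32 bits wide and phys_addr_t is 64
bits wide. The helper name and the sample addresses are illustrative
only; nothing in this thread confirms that such a truncation is the
actual bug.

```python
# Model of storing a physical address in a fixed-width integer type.
# store_in_unsigned_long() is a hypothetical helper, not a kernel function.
UNSIGNED_LONG_BITS_X86_32 = 32


def store_in_unsigned_long(phys_addr, bits=UNSIGNED_LONG_BITS_X86_32):
    """Return phys_addr as it would read back from a `bits`-wide integer."""
    return phys_addr & ((1 << bits) - 1)


low = 0x29007000        # below 4 GiB: survives a 32-bit store intact
high = 0x129007000      # hypothetical PAE address above 4 GiB

print(hex(store_in_unsigned_long(low)))            # unchanged
print(hex(store_in_unsigned_long(high)))           # upper bit silently lost
print(hex(store_in_unsigned_long(high, bits=64)))  # 64-bit type keeps it
```

On x86-64 and on non-PAE x86-32 the two types have the same width, so
this kind of mistake would only show up on a PAE build.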

Let's wait for what Tejun Heo might say - I've applied his debugging 
patch and sent back the results.

Building an x86_64 kernel here involves installing a 64-bit Linux VM, so
I'd like it to be a last resort.


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-20 18:41   ` Linus Torvalds
  2015-12-20 23:36     ` Artem S. Tashkinov
@ 2015-12-21 11:21     ` Dan Aloni
  1 sibling, 0 replies; 45+ messages in thread
From: Dan Aloni @ 2015-12-21 11:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Kent Overstreet, Ming Lin, Jens Axboe,
	Artem S. Tashkinov, Steven Whitehouse, Tejun Heo, IDE-ML,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 497 bytes --]

On Sun, Dec 20, 2015 at 10:41:44AM -0800, Linus Torvalds wrote:
>[..]
> Sadly, without CONFIG_LOCALVERSION_AUTO, there's no way to match up
> the dmesg files (in the same bisection tar-file as the bisection log)
> with the actual versions.

Perhaps we can print the Git revision in a manner independent of
CONFIG_LOCALVERSION_AUTO, using the attached patch. It will be emitted
in the dmesg Linux banner (though not in /proc/version, that's more
interface-ish and may break things).

-- 
Dan Aloni

[-- Attachment #2: 0001-init-version.c-add-SCM_REVISION.patch --]
[-- Type: text/plain, Size: 3368 bytes --]

>From d2d4ab995911e59ba41153fade176ca805ca2db8 Mon Sep 17 00:00:00 2001
From: Dan Aloni <dan@kernelim.com>
Date: Mon, 21 Dec 2015 11:54:18 +0200
Subject: [PATCH] init/version.c: add SCM revision

This adds a 'SCM-*' string to the Linux banner if and only if
source control information was available during the compilation
(such as in the case of a Git bisect).
---
 Documentation/dontdiff  |  1 +
 Makefile                | 11 +++++++++++
 init/version.c          |  4 +++-
 scripts/setlocalversion |  9 +++++++++
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 8ea834f6b289..3dfda8883b7a 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -227,6 +227,7 @@ timeconst.h
 times.h*
 trix_boot.h
 utsrelease.h*
+scmrevision.h*
 vdso-syms.lds
 vdso.lds
 vdso32-int80-syms.lds
diff --git a/Makefile b/Makefile
index 4e2b18d56091..f966be40dafd 100644
--- a/Makefile
+++ b/Makefile
@@ -977,6 +977,7 @@ endif
 prepare2: prepare3 outputmakefile asm-generic
 
 prepare1: prepare2 $(version_h) include/generated/utsrelease.h \
+                   include/generated/scmrevision.h \
                    include/config/auto.conf
 	$(cmd_crmodverdir)
 
@@ -1003,6 +1004,13 @@ define filechk_utsrelease.h
 	(echo \#define UTS_RELEASE \"$(KERNELRELEASE)\";)
 endef
 
+define filechk_scmrevision.h
+	(SCMREV=`./scripts/setlocalversion --only-print`; \
+	 if [ "x$$SCMREV" = "x" ] ; \
+	    then echo \#define KERNEL_SCM_REVISION \"\" ; \
+	    else echo \#define KERNEL_SCM_REVISION \" SCM$$SCMREV\"; fi )
+endef
+
 define filechk_version.h
 	(echo \#define LINUX_VERSION_CODE $(shell                         \
 	expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 + 0$(SUBLEVEL)); \
@@ -1016,6 +1024,9 @@ $(version_h): $(srctree)/Makefile FORCE
 include/generated/utsrelease.h: include/config/kernel.release FORCE
 	$(call filechk,utsrelease.h)
 
+include/generated/scmrevision.h: include/config/kernel.release FORCE
+	$(call filechk,scmrevision.h)
+
 PHONY += headerdep
 headerdep:
 	$(Q)find $(srctree)/include/ -name '*.h' | xargs --max-args 1 \
diff --git a/init/version.c b/init/version.c
index fe41a63efed6..f9df083db7c8 100644
--- a/init/version.c
+++ b/init/version.c
@@ -11,6 +11,7 @@
 #include <linux/uts.h>
 #include <linux/utsname.h>
 #include <generated/utsrelease.h>
+#include <generated/scmrevision.h>
 #include <linux/version.h>
 #include <linux/proc_ns.h>
 
@@ -45,7 +46,8 @@ EXPORT_SYMBOL_GPL(init_uts_ns);
 /* FIXED STRINGS! Don't touch! */
 const char linux_banner[] =
 	"Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
-	LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";
+	LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION
+	KERNEL_SCM_REVISION "\n";
 
 const char linux_proc_banner[] =
 	"%s version %s"
diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index 63d91e22ed7c..be6a4cc6c348 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -20,6 +20,10 @@ if test "$1" = "--save-scmversion"; then
 	scm_only=true
 	shift
 fi
+if test "$1" = "--only-print"; then
+	only_print=true
+	shift
+fi
 if test $# -gt 0; then
 	srctree=$1
 	shift
@@ -132,6 +136,11 @@ collect_files()
 	echo "$res"
 }
 
+if $only_print; then
+	scm_version
+	exit
+fi
+
 if $scm_only; then
 	if test ! -e .scmversion; then
 		res=$(scm_version)
-- 
2.4.3



* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  7:25   ` Artem S. Tashkinov
@ 2015-12-21 19:35     ` Tejun Heo
  2015-12-21 20:07       ` Tejun Heo
  2015-12-21 22:51       ` Ming Lei
  0 siblings, 2 replies; 45+ messages in thread
From: Tejun Heo @ 2015-12-21 19:35 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

Hello, Artem.

On Mon, Dec 21, 2015 at 12:25:06PM +0500, Artem S. Tashkinov wrote:
> I've applied this patch on top of vanilla 4.3.3 kernel (without Linus'es
> revert). Hopefully it's how you intended it to be.
> 
> Here's the result (I skipped the beginning of dmesg - it's the same as
> always - see bugzilla).

I added some debug messages during init.  It isn't critical but it'd
be great if you attach the full log from now on.  Something seemingly
unrelated surprisingly often turns out to be an important clue.

> [   60.387407] Corrupted low memory at c0001000 (1000 phys) = cba3d25f
...
> [   60.389131] Corrupted low memory at c0001ffc (1ffc phys) = 2712322a

It looks like the controller shat on the entire second page, which is
really puzzling.  Looks like the controller is being fed a corrupt DMA
SG table.

...
> [   74.367608] ata1.00: exception Emask 0x3 SAct 0x180000 SErr 0x3040400 action 0x6 frozen
> [   74.367613] ata1.00: irq_stat 0x45000008

The interesting bit here is that the controller is indicating OVERFLOW
which means that it consumed all PRD entries (ahci's DMA SG table) for
the command but the disk is still sending data to the host.
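
A rough sketch of how the irq_stat value above breaks down into
individual flags. The bit positions are written from memory of the
PORT_IRQ_* definitions in drivers/ata/ahci.h and should be treated as
an assumption to check against the tree; decode_irq_stat() is a
hypothetical helper, not part of libata.

```python
# AHCI port interrupt status bits (assumed to match the PORT_IRQ_*
# constants in drivers/ata/ahci.h; verify before relying on them).
PORT_IRQ_BITS = {
    31: "COLD_PRES",
    30: "TF_ERR",
    29: "HBUS_ERR",
    28: "HBUS_DATA_ERR",
    27: "IF_ERR",
    26: "IF_NONFATAL",
    24: "OVERFLOW",       # transfer exhausted the available S/G entries
    23: "BAD_PMP",
    22: "PHYRDY",
    7: "DEV_ILCK",
    6: "CONNECT",
    5: "SG_DONE",
    4: "UNK_FIS",
    3: "SDB_FIS",
    2: "DMAS_FIS",
    1: "PIOS_FIS",
    0: "D2H_REG_FIS",
}


def decode_irq_stat(value):
    """Return the names of all set bits, highest bit first."""
    return [name
            for bit, name in sorted(PORT_IRQ_BITS.items(), reverse=True)
            if value & (1 << bit)]


print(decode_irq_stat(0x45000008))
# -> ['TF_ERR', 'IF_NONFATAL', 'OVERFLOW', 'SDB_FIS']
```

Under that assumption, bit 24 is the OVERFLOW flag referred to above.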

> [   74.367617] ata1: SError: { Proto CommWake TrStaTrns UnrecFIS }
> [   74.367621] ata1.00: failed command: READ FPDMA QUEUED
> [   74.367627] ata1.00: cmd 60/00:98:00:89:21/07:00:04:00:00/40 tag 19 ncq 917504 in
> [   74.367627]          res 40/00:98:00:89:21/00:00:04:00:00/40 Emask 0x3 (HSM violation)
> [   74.367630] ata1.00: status: { DRDY }

The following is the data fed to the controller as seen from the
CPU.

> [   74.367632] XXX cmd=ee9e0260 cmd_tbl=ee9ed600 ahci_sg=ee9ed680
> [   74.367634] XXX opts=140005 st=0 addr=2e9ed600 addr_hi=0 rsvd=0:0:0:0
> [   74.367637] XXX fis=00608027:40218900:07000004:08000098 00000000:00000000:00000000:00001fff
> [   74.367639] XXX qc->n_elem=20 fis_len=5 prdtl=20
> [   74.367641] XXX sg[0] = 29007000 0 8fff (36864)
> [   74.367643] XXX sg[1] = 29218000 0 7fff (32768)
> [   74.367645] XXX sg[2] = 29298000 0 7fff (32768)
> [   74.367646] XXX sg[3] = 29ad8000 0 7fff (32768)
> [   74.367648] XXX sg[4] = 29338000 0 7fff (32768)
> [   74.367650] XXX sg[5] = 29370000 0 2fff (12288)
> [   74.367652] XXX sg[6] = 219000 0 2fff (12288)
> [   74.367653] XXX sg[7] = 230000 0 3fff (16384)
> [   74.367655] XXX sg[8] = 29373000 0 4fff (20480)
> [   74.367657] XXX sg[9] = 29130000 0 ffff (65536)
> [   74.367659] XXX sg[10] = 29170000 0 ffff (65536)
> [   74.367660] XXX sg[11] = 29280000 0 ffff (65536)
> [   74.367662] XXX sg[12] = 29200000 0 ffff (65536)
> [   74.367664] XXX sg[13] = 29320000 0 ffff (65536)
> [   74.367666] XXX sg[14] = 29360000 0 ffff (65536)
> [   74.367667] XXX sg[15] = 29340000 0 ffff (65536)
> [   74.367669] XXX sg[16] = 29350000 0 ffff (65536)
> [   74.367671] XXX sg[17] = 29300000 0 ffff (65536)
> [   74.367672] XXX sg[18] = 29310000 0 ffff (65536)
> [   74.367674] XXX sg[19] = 29020000 0 7fff (32768)

And everything checks out.  Data lengths are consistent and all the
addresses look kosher - at least nothing should upset the data
transfer itself.
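
The length consistency can be re-checked by hand with a short sketch
like the one below. It copies the tag 19 entries from the "XXX sg[n]"
lines quoted above and assumes, based on the byte counts printed in
parentheses, that the third column holds the byte count minus one. The
names SG_TAG19, NCQ_BYTES_TAG19 and total_bytes are made up for this
illustration.

```python
# (dma_address, byte_count_minus_one) pairs for the failed tag 19 command,
# copied from the debug output quoted above.
SG_TAG19 = [
    (0x29007000, 0x8fff), (0x29218000, 0x7fff), (0x29298000, 0x7fff),
    (0x29ad8000, 0x7fff), (0x29338000, 0x7fff), (0x29370000, 0x2fff),
    (0x00219000, 0x2fff), (0x00230000, 0x3fff), (0x29373000, 0x4fff),
    (0x29130000, 0xffff), (0x29170000, 0xffff), (0x29280000, 0xffff),
    (0x29200000, 0xffff), (0x29320000, 0xffff), (0x29360000, 0xffff),
    (0x29340000, 0xffff), (0x29350000, 0xffff), (0x29300000, 0xffff),
    (0x29310000, 0xffff), (0x29020000, 0x7fff),
]

NCQ_BYTES_TAG19 = 917504   # from "tag 19 ncq 917504 in"
PRDTL_TAG19 = 20           # from "qc->n_elem=20 fis_len=5 prdtl=20"


def total_bytes(sg):
    """Sum the real byte counts (stored value + 1) of all entries."""
    return sum(len_minus_one + 1 for _addr, len_minus_one in sg)


print(len(SG_TAG19) == PRDTL_TAG19)              # entry count matches
print(total_bytes(SG_TAG19) == NCQ_BYTES_TAG19)  # byte total matches
```

Both comparisons hold for this command, which agrees with the
observation that the table itself looks consistent.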

> [   74.367677] ata1.00: failed command: READ FPDMA QUEUED
> [   74.367682] ata1.00: cmd 60/90:a0:c0:fe:23/00:00:04:00:00/40 tag 20 ncq 73728 in
> [   74.367682]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x7 (timeout)

This one looks like collateral damage.

...
> [   74.763895] ata1.00: exception Emask 0x3 SAct 0x800000 SErr 0x3040400 action 0x6
> [   74.763900] ata1.00: irq_stat 0x45000008
> [   74.763903] ata1: SError: { Proto CommWake TrStaTrns UnrecFIS }
> [   74.763907] ata1.00: failed command: READ FPDMA QUEUED
> [   74.763913] ata1.00: cmd 60/98:b8:00:c9:20/03:00:04:00:00/40 tag 23 ncq 471040 in
> [   74.763913]          res 40/00:b8:00:c9:20/00:00:04:00:00/40 Emask 0x3 (HSM violation)
> [   74.763916] ata1.00: status: { DRDY }
> [   74.763918] XXX cmd=ee9e02e0 cmd_tbl=ee9f0200 ahci_sg=ee9f0280
> [   74.763921] XXX opts=160005 st=0 addr=2e9f0200 addr_hi=0 rsvd=0:0:0:0
> [   74.763924] XXX fis=98608027:4020c900:03000004:080000b8 00000000:00000000:00000000:00000fff
> [   74.763925] XXX qc->n_elem=22 fis_len=5 prdtl=22
> [   74.763928] XXX sg[0] = 2ab1b000 0 fff (4096)
...
> [   74.763964] XXX sg[21] = 29170000 0 dfff (57344)

This is a separate failure and shares the same pattern as before.
Everything looks proper.

The thing is, ahci doesn't have many restrictions in terms of its DMA
capabilities.  It can digest pretty much anything.  The only
restriction is that each entry can't be larger than 4M - but our
segment maximum is 64k.  There's no noticeable boundary crossing
happening in either the target DMA regions or the command tables.  All
addresses are in the linearly mapped normal area.
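
The per-entry rules named above can be written down as a small check.
This is only a sketch under stated assumptions: the 4M and 64k limits
are the ones quoted in this mail, the boundary mask is passed in by
the caller, and the type and function names are hypothetical, not
taken from libata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRD_MAX_BYTES	(4u << 20)	/* per-entry limit quoted above */
#define SEG_MAX_BYTES	(64u << 10)	/* block layer segment maximum */

struct dma_seg {
	uint64_t addr;	/* bus address of the first byte */
	uint32_t len;	/* length in bytes */
};

/* True if the first and last byte fall on different sides of a boundary. */
static bool seg_crosses_boundary(const struct dma_seg *s, uint64_t mask)
{
	assert(s->len != 0);	/* len - 1 would wrap otherwise */

	uint64_t last = s->addr + s->len - 1;

	return (s->addr | mask) != (last | mask);
}

/* True if the entry respects every limit listed above. */
static bool seg_ok(const struct dma_seg *s, uint64_t mask)
{
	return s->len != 0 &&
	       s->len <= SEG_MAX_BYTES &&
	       s->len <= PRD_MAX_BYTES &&
	       !seg_crosses_boundary(s, mask);
}
```

Feeding it sg[0] from the dump (address 29007000, 36864 bytes) with a
4G boundary mask of 0xffffffff passes, which matches the reading that
the table itself looks fine.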

If the controller is seeing what the host is seeing in the command
area, I can't see why it would be declaring overflow or dumping stuff
into the lowest pages.

Ming Lei reported a similar issue on 32bit ARM w/ PAE.  I don't
understand why PAE is making any difference.  Native 64bit machines
should be generating IOs just as large.  Maybe iommu is hiding the
issue?

I'm afraid we'll have to go brute force with the problem - dump more
information on both 32 and 64bits and see where the differences lie.
At the moment, given that the DMA table looks completely fine and ahci
isn't picky about the shape of data areas, I think it's more likely
some obscure issue in address mapping under PAE or a silly bug in ahci
than block layer screwing up merging.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 19:35     ` Tejun Heo
@ 2015-12-21 20:07       ` Tejun Heo
  2015-12-21 21:08         ` Tejun Heo
                           ` (2 more replies)
  2015-12-21 22:51       ` Ming Lei
  1 sibling, 3 replies; 45+ messages in thread
From: Tejun Heo @ 2015-12-21 20:07 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

Hello, Artem.

Can you please apply the following patch on top and see whether
anything changes?  If it does make the issue go away, can you please
revert the ".can_queue" part and test again?

Thanks.

---
 drivers/ata/ahci.h    |    2 +-
 drivers/ata/libahci.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -365,7 +365,7 @@ extern struct device_attribute *ahci_sde
  */
 #define AHCI_SHT(drv_name)						\
 	ATA_NCQ_SHT(drv_name),						\
-	.can_queue		= AHCI_MAX_CMDS - 1,			\
+	.can_queue		= 1/*AHCI_MAX_CMDS - 1*/,		\
 	.sg_tablesize		= AHCI_MAX_SG,				\
 	.dma_boundary		= AHCI_DMA_BOUNDARY,			\
 	.shost_attrs		= ahci_shost_attrs,			\
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -420,7 +420,7 @@ void ahci_save_initial_config(struct dev
 		hpriv->saved_cap2 = cap2 = 0;
 
 	/* some chips have errata preventing 64bit use */
-	if ((cap & HOST_CAP_64) && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)) {
+	if ((cap & HOST_CAP_64)/* && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)*/) {
 		dev_info(dev, "controller can't do 64bit DMA, forcing 32bit\n");
 		cap &= ~HOST_CAP_64;
 	}

-- 
tejun

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 20:07       ` Tejun Heo
@ 2015-12-21 21:08         ` Tejun Heo
  2015-12-22  3:43           ` Kent Overstreet
                             ` (2 more replies)
  2015-12-22  5:10         ` Artem S. Tashkinov
  2015-12-22  5:20         ` Artem S. Tashkinov
  2 siblings, 3 replies; 45+ messages in thread
From: Tejun Heo @ 2015-12-21 21:08 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

Hello, again.

On Mon, Dec 21, 2015 at 03:07:21PM -0500, Tejun Heo wrote:
> Hello, Artem.
> 
> Can you please apply the following patch on top and see whether
> anything changes?  If it does make the issue go away, can you please
> revert the ".can_queue" part and test again?

If the patch doesn't change anything, can you please try the
following and see which one makes a difference?

1. Exclude memory above 4G line with boot param "max_addr=4G".
2. Disable highmem with "highmem=0".
3. Try booting 64bit kernel.

At the moment, the only thing I can think of which can explain the PAE
+ bio_get_nr_vecs() situation is that the bio split code which is
activated by the bio_get_nr_vecs() removal somehow messes up 64bit or
high addresses on 32bit kernels.  I scanned for the obvious but at the
bio layer, memory is represented by struct page, so nothing obvious
seems broken.

Note that for now I'm ignoring the debug dumps from the ahci debug
patch, which indicate that the passed-in addresses are all fine.  It
is possible that the controller gets confused by certain receiving
addresses and reports failure on later commands, or maybe there is a
different sequence of events that can explain both and that we don't
know of yet.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 19:35     ` Tejun Heo
  2015-12-21 20:07       ` Tejun Heo
@ 2015-12-21 22:51       ` Ming Lei
  1 sibling, 0 replies; 45+ messages in thread
From: Ming Lei @ 2015-12-21 22:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Artem S. Tashkinov, Kent Overstreet,
	Christoph Hellwig, Ming Lin, Jens Axboe, Linus Torvalds,
	Steven Whitehouse, IDE-ML, Linux Kernel Mailing List

On Tue, Dec 22, 2015 at 3:35 AM, Tejun Heo <tj@kernel.org> wrote:
>
>> [   74.367632] XXX cmd=ee9e0260 cmd_tbl=ee9ed600 ahci_sg=ee9ed680
>> [   74.367634] XXX opts=140005 st=0 addr=2e9ed600 addr_hi=0 rsvd=0:0:0:0
>> [   74.367637] XXX fis=00608027:40218900:07000004:08000098 00000000:00000000:00000000:00001fff
>> [   74.367639] XXX qc->n_elem=20 fis_len=5 prdtl=20
>> [   74.367641] XXX sg[0] = 29007000 0 8fff (36864)
>> [   74.367643] XXX sg[1] = 29218000 0 7fff (32768)
>> [   74.367645] XXX sg[2] = 29298000 0 7fff (32768)
>> [   74.367646] XXX sg[3] = 29ad8000 0 7fff (32768)
>> [   74.367648] XXX sg[4] = 29338000 0 7fff (32768)
>> [   74.367650] XXX sg[5] = 29370000 0 2fff (12288)
>> [   74.367652] XXX sg[6] = 219000 0 2fff (12288)
>> [   74.367653] XXX sg[7] = 230000 0 3fff (16384)
>> [   74.367655] XXX sg[8] = 29373000 0 4fff (20480)
>> [   74.367657] XXX sg[9] = 29130000 0 ffff (65536)
>> [   74.367659] XXX sg[10] = 29170000 0 ffff (65536)
>> [   74.367660] XXX sg[11] = 29280000 0 ffff (65536)
>> [   74.367662] XXX sg[12] = 29200000 0 ffff (65536)
>> [   74.367664] XXX sg[13] = 29320000 0 ffff (65536)
>> [   74.367666] XXX sg[14] = 29360000 0 ffff (65536)
>> [   74.367667] XXX sg[15] = 29340000 0 ffff (65536)
>> [   74.367669] XXX sg[16] = 29350000 0 ffff (65536)
>> [   74.367671] XXX sg[17] = 29300000 0 ffff (65536)
>> [   74.367672] XXX sg[18] = 29310000 0 ffff (65536)
>> [   74.367674] XXX sg[19] = 29020000 0 7fff (32768)
>
> And everything checks out.  Data lengths are consistent and all the
> addresses look kosher - at least nothing should upset the data
> transfer itself.

Maybe we can check more, such as whether the sg elements are correctly
merged from the bvecs. The following code should be useful for
checking that:

+static void ahci_dump_req(struct ata_queued_cmd *qc)
+{
+       struct scsi_cmnd *cmd = qc->scsicmd;
+       struct request *req = cmd->request;
+       struct req_iterator iter;
+       struct bio_vec bv;
+       int i = 0;
+       phys_addr_t paddr;
+
+       printk("%s:\n", __func__);
+       rq_for_each_segment(bv, req, iter) {
+               paddr = page_to_phys(bv.bv_page);
+               /* widen before printing: phys_addr_t may be 32 bits wide */
+               printk("\t %3d: %llx %x %u\n", i++,
+                       (unsigned long long)paddr,
+                       bv.bv_offset,
+                       bv.bv_len);
+       }
+}

Thanks,

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 21:08         ` Tejun Heo
@ 2015-12-22  3:43           ` Kent Overstreet
  2015-12-22  3:59           ` Kent Overstreet
  2015-12-22  4:45           ` Kent Overstreet
  2 siblings, 0 replies; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  3:43 UTC (permalink / raw)
  To: Tejun Heo, Artem S. Tashkinov
  Cc: Christoph Hellwig, Ming Lin, Jens Axboe, Linus Torvalds,
	Steven Whitehouse, IDE-ML, Linux Kernel Mailing List, Ming Lei

I just reproduced it - Artem, I'll let you know when we have a possible fix but
hopefully there won't be any need for you to beat up your hardware any more :)

On Mon, Dec 21, 2015 at 04:08:11PM -0500, Tejun Heo wrote:
> Hello, again.
> 
> On Mon, Dec 21, 2015 at 03:07:21PM -0500, Tejun Heo wrote:
> > Hello, Artem.
> > 
> > Can you please apply the following patch on top and see whether
> > anything changes?  If it does make the issue go away, can you please
> > revert the ".can_queue" part and test again?
> 
> If the patch doesn't change anything, can you please try the
> following and see which one makes a difference?
> 
> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> 2. Disable highmem with "highmem=0".
> 3. Try booting 64bit kernel.
> 
> At the moment, the only thing I can think of which can explain the PAE
> + bio_get_nr_vecs() situation is that the bio split code which is
> activated by the bio_get_nr_vecs() somehow messes up 64bit or high
> addresses on 32bit kernels.  I scanned for the obvious but at bio
> layer, memory is represented by struct page, so nothing obvious seems
> broken.
> 
> Note that for now I'm ignoring the debug dumps from the ahci debug
> patch which indicates that the passed in addresses are all fine.  It
> is possible that the controller gets confused with certain receiving
> addresses and reports failure on later commands or maybe there is a
> different sequence of events which can encompass both that we don't
> know of yet.

It repros pretty easily, I should be able to just write some test code that
sends down single specific IOs so no matter what the controller is doing we know
exactly what IO triggered the error.

I'm checking all the 64 bit/pae/etc. stuff now.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 21:08         ` Tejun Heo
  2015-12-22  3:43           ` Kent Overstreet
@ 2015-12-22  3:59           ` Kent Overstreet
  2015-12-22  5:26             ` Junichi Nomura
  2015-12-22  4:45           ` Kent Overstreet
  2 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  3:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On Mon, Dec 21, 2015 at 04:08:11PM -0500, Tejun Heo wrote:

reproduced it with 32 bit pae:

> 1. Exclude memory above 4G line with boot param "max_addr=4G".

doesn't work - max_addr=1G doesn't work either

> 2. Disable highmem with "highmem=0".

works!

> 3. Try booting 64bit kernel.

works

Ok, so maybe it actually is PAE specific... but like you noted the block layer
works entirely in terms of pages so...

The one idea I can think of is - maybe BIOVEC_PHYS_MERGEABLE() is broken in PAE
mode? I am unfamiliar with anything PAE though.

Where does the ahci code consume the sglist?
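
For readers unfamiliar with the merge test being suspected here, a
sketch of the idea follows. It assumes the check boils down to "end of
the first vector equals start of the second", and it only illustrates
the class of bug in question (addresses squeezed into 32 bits), not a
finding about the real macro. Function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Contiguity test on full-width physical addresses. */
static bool phys_mergeable64(uint64_t a_phys, uint32_t a_len,
			     uint64_t b_phys)
{
	return a_phys + a_len == b_phys;
}

/*
 * Same test after both addresses have been truncated to 32 bits.
 * Two pages 4G apart can then look adjacent.
 */
static bool phys_mergeable32(uint64_t a_phys, uint32_t a_len,
			     uint64_t b_phys)
{
	uint32_t a = (uint32_t)a_phys;
	uint32_t b = (uint32_t)b_phys;

	return (uint32_t)(a + a_len) == b;
}
```

A page at 0x129007000 followed by one at 0x29008000 is not contiguous,
yet the truncated variant reports it as mergeable. Truly adjacent
pages give the same answer in both variants.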

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21  5:23           ` Linus Torvalds
  2015-12-21  7:31             ` Artem S. Tashkinov
@ 2015-12-22  4:06             ` Artem S. Tashkinov
  1 sibling, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-22  4:06 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ming Lei, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Artem S. Tashkinov, Steven Whitehouse, Tejun Heo,
	IDE-ML, Linux Kernel Mailing List, linus971

On 2015-12-21 10:23, Linus Torvalds wrote:
> On Sun, Dec 20, 2015 at 8:47 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> 
>> That said, we obviously need to figure out this current problem
>> regardless first..
> 
> ... although maybe it *would* be interesting to hear what happens if
> you just compile a 64-bit kernel instead?

Under x86-64 I cannot reproduce this problem. It seems like it's PAE 
specific (Kent Overstreet says he has reproduced it).

> 
> Do you still see the problem? Because if not, then we should look very
> specifically for some 32-bit PAE issue.
> 
> For example, maybe we use "unsigned long" somewhere where we should
> use "phys_addr_t". On x86-64, they obviously end up being the same. On
> normal non-PAE x86-32, they are also the same. But ..
> 
>                  Linus
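
The "unsigned long" versus "phys_addr_t" concern quoted above can be
shown in a few lines of user-space C. This is a sketch with fixed-width
types standing in for the kernel ones: uint32_t models unsigned long on
32-bit x86, uint64_t models phys_addr_t under PAE, and the helper names
are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Shift done in a 32-bit type: anything at or above 4G wraps around. */
static uint32_t pfn_to_phys_narrow(uint32_t pfn)
{
	return pfn << PAGE_SHIFT;
}

/* Widen first, then shift: the full physical address survives. */
static uint64_t pfn_to_phys_wide(uint32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}
```

For page frame 0x129007 the wide version yields 0x129007000 while the
narrow one yields 0x29007000, an address that belongs to a different
page. Below 4G the two agree, which is why such a bug would stay
hidden on non-PAE 32-bit and on 64-bit kernels.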

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 21:08         ` Tejun Heo
  2015-12-22  3:43           ` Kent Overstreet
  2015-12-22  3:59           ` Kent Overstreet
@ 2015-12-22  4:45           ` Kent Overstreet
  2 siblings, 0 replies; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  4:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

So what I _really_ want to know is - how the fuck is the actual ATA command
itself malformed?

You told me at one point that the error code indicated the controller was
claiming it overran the end of the sglist - well, if that's the case we ought to
be able to prove it with an assertion (I already tried; qc->nbytes does match
the sglist, checking from ata_sg_setup()).

Ignore bvec merging, PAE, all that crap - the controller doesn't know about any
of that. _WHAT_ are we feeding it that it doesn't like?

There shouldn't even be that much stuff to check, since in theory the only
possible thing that could be at fault is the sglist. Maybe they're too big, too
small, misaligned, too many of them, god knows what, but somehow the sglist
we're feeding the device has to be at fault, right?

But the sglists in Artem's debugging output look pretty uninteresting (one
of them has _no_ merged segments - it's just 18 4k pages - how could THAT be an
issue?)

Gonna apply your debugging patch and start throwing stuff at the wall next, I
guess...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 20:07       ` Tejun Heo
  2015-12-21 21:08         ` Tejun Heo
@ 2015-12-22  5:10         ` Artem S. Tashkinov
  2015-12-22  5:20         ` Artem S. Tashkinov
  2 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-22  5:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei, Tejun Heo

On 2015-12-22 01:07, Tejun Heo wrote:
> Hello, Artem.
> 
> Can you please apply the following patch on top and see whether
> anything changes?  If it does make the issue go away, can you please
> revert the ".can_queue" part and test again?
> 
> Thanks.
> 
> ---
>  drivers/ata/ahci.h    |    2 +-
>  drivers/ata/libahci.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/ata/ahci.h
> +++ b/drivers/ata/ahci.h
> @@ -365,7 +365,7 @@ extern struct device_attribute *ahci_sde
>   */
>  #define AHCI_SHT(drv_name)						\
>  	ATA_NCQ_SHT(drv_name),						\
> -	.can_queue		= AHCI_MAX_CMDS - 1,			\
> +	.can_queue		= 1/*AHCI_MAX_CMDS - 1*/,		\
>  	.sg_tablesize		= AHCI_MAX_SG,				\
>  	.dma_boundary		= AHCI_DMA_BOUNDARY,			\
>  	.shost_attrs		= ahci_shost_attrs,			\
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -420,7 +420,7 @@ void ahci_save_initial_config(struct dev
>  		hpriv->saved_cap2 = cap2 = 0;
> 
>  	/* some chips have errata preventing 64bit use */
> -	if ((cap & HOST_CAP_64) && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)) {
> +	if ((cap & HOST_CAP_64)/* && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)*/) {
>  		dev_info(dev, "controller can't do 64bit DMA, forcing 32bit\n");
>  		cap &= ~HOST_CAP_64;
>  	}

This patch fixes the issue for me. Now rechecking without the
.can_queue part.

BTW, since I left debugging on, here's the part you wanted:

[    0.613851] XXX port 0 dma_sz=91392 mem=c0020000 mem_dma=00020000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280
[    0.613865] XXX port 1 dma_sz=91392 mem=eea00000 mem_dma=2ea00000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280
[    0.620464] XXX port 2 dma_sz=91392 mem=eea20000 mem_dma=2ea20000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280
[    0.627121] XXX port 3 dma_sz=91392 mem=eea40000 mem_dma=2ea40000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280
[    0.633791] XXX port 4 dma_sz=91392 mem=eea60000 mem_dma=2ea60000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280
[    0.640445] XXX port 5 dma_sz=91392 mem=eea80000 mem_dma=2ea80000 cmd_slot=0 rx_fis=1024 cmd_tbl=1280


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-21 20:07       ` Tejun Heo
  2015-12-21 21:08         ` Tejun Heo
  2015-12-22  5:10         ` Artem S. Tashkinov
@ 2015-12-22  5:20         ` Artem S. Tashkinov
  2 siblings, 0 replies; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-22  5:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Artem S. Tashkinov, Kent Overstreet, Christoph Hellwig, Ming Lin,
	Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 1355 bytes --]

On 2015-12-22 01:07, Tejun Heo wrote:
> Hello, Artem.
> 
> Can you please apply the following patch on top and see whether
> anything changes?  If it does make the issue go away, can you please
> revert the ".can_queue" part and test again?
> 
> Thanks.
> 
> ---
>  drivers/ata/ahci.h    |    2 +-
>  drivers/ata/libahci.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/ata/ahci.h
> +++ b/drivers/ata/ahci.h
> @@ -365,7 +365,7 @@ extern struct device_attribute *ahci_sde
>   */
>  #define AHCI_SHT(drv_name)						\
>  	ATA_NCQ_SHT(drv_name),						\
> -	.can_queue		= AHCI_MAX_CMDS - 1,			\
> +	.can_queue		= 1/*AHCI_MAX_CMDS - 1*/,		\
>  	.sg_tablesize		= AHCI_MAX_SG,				\
>  	.dma_boundary		= AHCI_DMA_BOUNDARY,			\
>  	.shost_attrs		= ahci_shost_attrs,			\
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -420,7 +420,7 @@ void ahci_save_initial_config(struct dev
>  		hpriv->saved_cap2 = cap2 = 0;
> 
>  	/* some chips have errata preventing 64bit use */
> -	if ((cap & HOST_CAP_64) && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)) {
> +	if ((cap & HOST_CAP_64)/* && (hpriv->flags & AHCI_HFLAG_32BIT_ONLY)*/) {
>  		dev_info(dev, "controller can't do 64bit DMA, forcing 32bit\n");
>  		cap &= ~HOST_CAP_64;
>  	}

With the ".can_queue" part left intact the bug resurfaced. Full dmesg 
output is attached.

[-- Attachment #2: dmesg.xz --]
[-- Type: application/x-xz, Size: 13616 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  3:59           ` Kent Overstreet
@ 2015-12-22  5:26             ` Junichi Nomura
  2015-12-22  5:37               ` Kent Overstreet
                                 ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Junichi Nomura @ 2015-12-22  5:26 UTC (permalink / raw)
  To: Kent Overstreet, Tejun Heo
  Cc: Artem S. Tashkinov, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On 12/22/15 12:59, Kent Overstreet wrote:
> reproduced it with 32 bit pae:
> 
>> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> 
> doesn't work - max_addr=1G doesn't work either
> 
>> 2. Disable highmem with "highmem=0".
> 
> works!
> 
>> 3. Try booting 64bit kernel.
> 
> works

blk_queue_bio() does the split first and the bounce second, so the
segment counting is based on the pages before bouncing and could go
wrong.

What do you think of a patch like this?

-- 
Jun'ichi Nomura, NEC Corporation

diff --git a/block/blk-core.c b/block/blk-core.c
index 5131993b..1d1c3c7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1689,8 +1689,6 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
 	struct request *req;
 	unsigned int request_count = 0;
 
-	blk_queue_split(q, &bio, q->bio_split);
-
 	/*
 	 * low level driver can indicate that it wants pages above a
 	 * certain limit bounced to low memory (ie for highmem, or even
@@ -1698,6 +1696,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
 	 */
 	blk_queue_bounce(q, &bio);
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
 		bio->bi_error = -EIO;
 		bio_endio(bio);
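
Why the ordering matters can be seen in a toy model. The sketch below
is user-space C with invented names; it assumes that physically
contiguous pages are counted as one segment and that bounce pages are
allocated one at a time, so they are generally not contiguous. It
ignores segment size limits.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Count physical segments: a page contiguous with its predecessor
 * extends the current segment instead of starting a new one. */
static size_t count_segments(const uint64_t *page_phys, size_t n)
{
	size_t segs = 0;

	for (size_t i = 0; i < n; i++)
		if (i == 0 || page_phys[i - 1] + PAGE_SIZE != page_phys[i])
			segs++;
	return segs;
}

/* Toy bounce step: every page at or above 'limit' is replaced by the
 * next page from 'pool', which must hold enough entries. */
static void bounce_pages(uint64_t *page_phys, size_t n, uint64_t limit,
			 const uint64_t *pool)
{
	size_t next = 0;

	for (size_t i = 0; i < n; i++)
		if (page_phys[i] >= limit)
			page_phys[i] = pool[next++];
}
```

Four contiguous high pages count as one segment before bouncing. Once
each is replaced by a scattered low page, the same bio needs four
segments, so a count taken before the bounce is too small for the
scatterlist that is built afterwards.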

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:26             ` Junichi Nomura
@ 2015-12-22  5:37               ` Kent Overstreet
  2015-12-22  5:38               ` Kent Overstreet
  2015-12-22 17:28               ` Jens Axboe
  2 siblings, 0 replies; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  5:37 UTC (permalink / raw)
  To: Junichi Nomura
  Cc: Tejun Heo, Artem S. Tashkinov, Artem S. Tashkinov,
	Christoph Hellwig, Ming Lin, Jens Axboe, Linus Torvalds,
	Steven Whitehouse, IDE-ML, Linux Kernel Mailing List, Ming Lei

On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
> On 12/22/15 12:59, Kent Overstreet wrote:
> > reproduced it with 32 bit pae:
> > 
> >> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> > 
> > doesn't work - max_addr=1G doesn't work either
> > 
> >> 2. Disable highmem with "highmem=0".
> > 
> > works!
> > 
> >> 3. Try booting 64bit kernel.
> > 
> > works
> 
> blk_queue_bio() does split then bounce, which makes the segment
> counting based on pages before bouncing and could go wrong.
> 
> What do you think of a patch like this?

Shit, you nailed it. Can't believe I didn't think to check that.

> 
> -- 
> Jun'ichi Nomura, NEC Corporation
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 5131993b..1d1c3c7 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1689,8 +1689,6 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>  	struct request *req;
>  	unsigned int request_count = 0;
>  
> -	blk_queue_split(q, &bio, q->bio_split);
> -
>  	/*
>  	 * low level driver can indicate that it wants pages above a
>  	 * certain limit bounced to low memory (ie for highmem, or even
> @@ -1698,6 +1696,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>  	 */
>  	blk_queue_bounce(q, &bio);
>  
> +	blk_queue_split(q, &bio, q->bio_split);
> +
>  	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
>  		bio->bi_error = -EIO;
>  		bio_endio(bio);

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:26             ` Junichi Nomura
  2015-12-22  5:37               ` Kent Overstreet
@ 2015-12-22  5:38               ` Kent Overstreet
  2015-12-22  5:52                 ` Artem S. Tashkinov
  2015-12-22 17:28               ` Jens Axboe
  2 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  5:38 UTC (permalink / raw)
  To: Junichi Nomura
  Cc: Tejun Heo, Artem S. Tashkinov, Artem S. Tashkinov,
	Christoph Hellwig, Ming Lin, Jens Axboe, Linus Torvalds,
	Steven Whitehouse, IDE-ML, Linux Kernel Mailing List, Ming Lei

On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
> On 12/22/15 12:59, Kent Overstreet wrote:
> > reproduced it with 32 bit pae:
> > 
> >> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> > 
> > doesn't work - max_addr=1G doesn't work either
> > 
> >> 2. Disable highmem with "highmem=0".
> > 
> > works!
> > 
> >> 3. Try booting 64bit kernel.
> > 
> > works
> 
> blk_queue_bio() does split then bounce, which makes the segment
> counting based on pages before bouncing and could go wrong.
> 
> What do you think of a patch like this?

Artem, can you give this patch a try?

> 
> -- 
> Jun'ichi Nomura, NEC Corporation
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 5131993b..1d1c3c7 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1689,8 +1689,6 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>  	struct request *req;
>  	unsigned int request_count = 0;
>  
> -	blk_queue_split(q, &bio, q->bio_split);
> -
>  	/*
>  	 * low level driver can indicate that it wants pages above a
>  	 * certain limit bounced to low memory (ie for highmem, or even
> @@ -1698,6 +1696,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>  	 */
>  	blk_queue_bounce(q, &bio);
>  
> +	blk_queue_split(q, &bio, q->bio_split);
> +
>  	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
>  		bio->bi_error = -EIO;
>  		bio_endio(bio);

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:38               ` Kent Overstreet
@ 2015-12-22  5:52                 ` Artem S. Tashkinov
  2015-12-22  5:55                   ` Kent Overstreet
  0 siblings, 1 reply; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-22  5:52 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Junichi Nomura, Tejun Heo, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On 2015-12-22 10:38, Kent Overstreet wrote:
> On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
>> On 12/22/15 12:59, Kent Overstreet wrote:
>> > reproduced it with 32 bit pae:
>> >
>> >> 1. Exclude memory above 4G line with boot param "max_addr=4G".
>> >
>> > doesn't work - max_addr=1G doesn't work either
>> >
>> >> 2. Disable highmem with "highmem=0".
>> >
>> > works!
>> >
>> >> 3. Try booting 64bit kernel.
>> >
>> > works
>> 
>> blk_queue_bio() does split then bounce, which makes the segment
>> counting based on pages before bouncing and could go wrong.
>> 
>> What do you think of a patch like this?
> 
> Artem, can you give this patch a try?


This patch ostensibly fixes the issue - at least I cannot immediately 
reproduce it. You can count me in as "Tested-by: Artem S. Tashkinov"

> 
>> 
>> --
>> Jun'ichi Nomura, NEC Corporation
>> 
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 5131993b..1d1c3c7 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1689,8 +1689,6 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>  	struct request *req;
>>  	unsigned int request_count = 0;
>> 
>> -	blk_queue_split(q, &bio, q->bio_split);
>> -
>>  	/*
>>  	 * low level driver can indicate that it wants pages above a
>>  	 * certain limit bounced to low memory (ie for highmem, or even
>> @@ -1698,6 +1696,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>  	 */
>>  	blk_queue_bounce(q, &bio);
>> 
>> +	blk_queue_split(q, &bio, q->bio_split);
>> +
>>  	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
>>  		bio->bi_error = -EIO;
>>  		bio_endio(bio);

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:52                 ` Artem S. Tashkinov
@ 2015-12-22  5:55                   ` Kent Overstreet
  2015-12-22  5:59                     ` Artem S. Tashkinov
  0 siblings, 1 reply; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  5:55 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Junichi Nomura, Tejun Heo, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On Tue, Dec 22, 2015 at 10:52:37AM +0500, Artem S. Tashkinov wrote:
> On 2015-12-22 10:38, Kent Overstreet wrote:
> >On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
> >>On 12/22/15 12:59, Kent Overstreet wrote:
> >>> reproduced it with 32 bit pae:
> >>>
> >>>> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> >>>
> >>> doesn't work - max_addr=1G doesn't work either
> >>>
> >>>> 2. Disable highmem with "highmem=0".
> >>>
> >>> works!
> >>>
> >>>> 3. Try booting 64bit kernel.
> >>>
> >>> works
> >>
> >>blk_queue_bio() does split then bounce, which makes the segment
> >>counting based on pages before bouncing and could go wrong.
> >>
> >>What do you think of a patch like this?
> >
> >Artem, can you give this patch a try?
> 
> 
> This patch ostensibly fixes the issue - at least I cannot immediately
> reproduce it. You can count me in as "Tested-by: Artem S. Tashkinov"

Let's all contemplate the fact that blk_segment_map_sg() _overrunning the end of
the provided sglist_ was this much of a clusterfuck to debug.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:55                   ` Kent Overstreet
@ 2015-12-22  5:59                     ` Artem S. Tashkinov
  2015-12-22  6:02                       ` Kent Overstreet
  0 siblings, 1 reply; 45+ messages in thread
From: Artem S. Tashkinov @ 2015-12-22  5:59 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Junichi Nomura, Tejun Heo, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On 2015-12-22 10:55, Kent Overstreet wrote:
> On Tue, Dec 22, 2015 at 10:52:37AM +0500, Artem S. Tashkinov wrote:
>> On 2015-12-22 10:38, Kent Overstreet wrote:
>> >On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
>> >>On 12/22/15 12:59, Kent Overstreet wrote:
>> >>> reproduced it with 32 bit pae:
>> >>>
>> >>>> 1. Exclude memory above 4G line with boot param "max_addr=4G".
>> >>>
>> >>> doesn't work - max_addr=1G doesn't work either
>> >>>
>> >>>> 2. Disable highmem with "highmem=0".
>> >>>
>> >>> works!
>> >>>
>> >>>> 3. Try booting 64bit kernel.
>> >>>
>> >>> works
>> >>
>> >>blk_queue_bio() does split then bounce, which makes the segment
>> >>counting based on pages before bouncing and could go wrong.
>> >>
>> >>What do you think of a patch like this?
>> >
>> >Artem, can you give this patch a try?
>> 
>> 
>> This patch ostensibly fixes the issue - at least I cannot immediately
>> reproduce it. You can count me in as "Tested-by: Artem S. Tashkinov"
> 
> Let's all contemplate the fact that blk_segment_map_sg() _overrunning the end of
> the provided sglist_ was this much of a clusterfuck to debug.

From the look of it, this fix has nothing to do with PAE, so why were
only PAE users like me affected by the original
(b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c) patch?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:59                     ` Artem S. Tashkinov
@ 2015-12-22  6:02                       ` Kent Overstreet
  0 siblings, 0 replies; 45+ messages in thread
From: Kent Overstreet @ 2015-12-22  6:02 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Junichi Nomura, Tejun Heo, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Jens Axboe, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On Tue, Dec 22, 2015 at 10:59:09AM +0500, Artem S. Tashkinov wrote:
> On 2015-12-22 10:55, Kent Overstreet wrote:
> >On Tue, Dec 22, 2015 at 10:52:37AM +0500, Artem S. Tashkinov wrote:
> >>On 2015-12-22 10:38, Kent Overstreet wrote:
> >>>On Tue, Dec 22, 2015 at 05:26:12AM +0000, Junichi Nomura wrote:
> >>>>On 12/22/15 12:59, Kent Overstreet wrote:
> >>>>> reproduced it with 32 bit pae:
> >>>>>
> >>>>>> 1. Exclude memory above 4G line with boot param "max_addr=4G".
> >>>>>
> >>>>> doesn't work - max_addr=1G doesn't work either
> >>>>>
> >>>>>> 2. Disable highmem with "highmem=0".
> >>>>>
> >>>>> works!
> >>>>>
> >>>>>> 3. Try booting 64bit kernel.
> >>>>>
> >>>>> works
> >>>>
> >>>>blk_queue_bio() does split then bounce, which makes the segment
> >>>>counting based on pages before bouncing and could go wrong.
> >>>>
> >>>>What do you think of a patch like this?
> >>>
> >>>Artem, can you give this patch a try?
> >>
> >>
> >>This patch ostensibly fixes the issue - at least I cannot immediately
> >>reproduce it. You can count me in as "Tested-by: Artem S. Tashkinov"
> >
> >Let's all contemplate the fact that blk_segment_map_sg() _overrunning the
> >end of the provided sglist_ was this much of a clusterfuck to debug.
> 
> > From the look of it, this fix has nothing to do with PAE, so why were
> > only PAE users like me affected by the original
> > (b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c) patch?

The amusing thing is that I doubt PAE actually requires bouncing - addressing
limits come from the device, not the cpu.

But evidently in PAE mode, the block layer is in fact bouncing bios. Probably
from some default setting in the queue limits that no one ever looks at.

The whole queue limits design is an atrocity, it leads to exactly this kind of
crap where no one can predict the actual behaviour of any given setup.
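
The test results quoted above (highmem=0 works, 64-bit works, max_addr=1G
does not) fit a bounce decision keyed on a lowmem boundary rather than on
what the device can address. What follows is only a toy model, not kernel
code: the function names are hypothetical, and the 896 MiB lowmem boundary
is an assumption about a typical 32-bit x86 memory split.

```python
PAGE_SHIFT = 12            # 4 KiB pages
LOWMEM_BYTES = 896 << 20   # assumed lowmem boundary on 32-bit x86


def last_pfn_below(limit_bytes):
    """Highest page frame number that lies entirely below limit_bytes."""
    return (limit_bytes >> PAGE_SHIFT) - 1


def needs_bounce(page_addr, bounce_pfn):
    """A page is bounced when its frame number exceeds the queue's limit."""
    return (page_addr >> PAGE_SHIFT) > bounce_pfn


# Hypothetical queue limits: one left at the lowmem boundary, one derived
# from a device that can address a full 64 bits.
default_limit = last_pfn_below(LOWMEM_BYTES)
device_limit = last_pfn_below(1 << 64)

highmem_page = 2 << 30      # page at 2 GiB: highmem on a 32-bit kernel
just_above_low = 960 << 20  # below 1 GiB, but still above the lowmem boundary
lowmem_page = 512 << 20     # page at 512 MiB: always lowmem

print(needs_bounce(highmem_page, default_limit))    # bounced under the lowmem limit
print(needs_bounce(highmem_page, device_limit))     # not bounced under the device limit
print(needs_bounce(just_above_low, default_limit))  # max_addr=1G still leaves such pages
print(needs_bounce(lowmem_page, default_limit))     # highmem=0 leaves only these
```

Under these assumptions, the only configurations that never bounce are
those with no pages above the boundary at all, which is what highmem=0 and
a 64-bit kernel provide.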


* Re: IO errors after "block: remove bio_get_nr_vecs()"
  2015-12-22  5:26             ` Junichi Nomura
  2015-12-22  5:37               ` Kent Overstreet
  2015-12-22  5:38               ` Kent Overstreet
@ 2015-12-22 17:28               ` Jens Axboe
  2 siblings, 0 replies; 45+ messages in thread
From: Jens Axboe @ 2015-12-22 17:28 UTC (permalink / raw)
  To: Junichi Nomura, Kent Overstreet, Tejun Heo
  Cc: Artem S. Tashkinov, Artem S. Tashkinov, Christoph Hellwig,
	Ming Lin, Linus Torvalds, Steven Whitehouse, IDE-ML,
	Linux Kernel Mailing List, Ming Lei

On 12/21/2015 10:26 PM, Junichi Nomura wrote:
> On 12/22/15 12:59, Kent Overstreet wrote:
>> reproduced it with 32 bit pae:
>>
>>> 1. Exclude memory above 4G line with boot param "max_addr=4G".
>>
>> doesn't work - max_addr=1G doesn't work either
>>
>>> 2. Disable highmem with "highmem=0".
>>
>> works!
>>
>>> 3. Try booting 64bit kernel.
>>
>> works
>
> blk_queue_bio() does split then bounce, which makes the segment
> counting based on pages before bouncing and could go wrong.

Good catch! The blk-mq parts aren't affected by this; the screw-up only
happened in the old IO path. I've applied this with the appropriate
Tested-by from Artem, CC'd stable, and listed the commit that broke it:

commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e
Author: Kent Overstreet <kent.overstreet@gmail.com>
Date:   Thu Apr 23 22:37:18 2015 -0700

     block: make generic_make_request handle arbitrarily sized bios

Thanks to all involved in nailing this down, it'll go out shortly.

-- 
Jens Axboe
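
The ordering problem described in the quoted text (segments counted at
split time, pages swapped by bouncing afterwards) can be illustrated with a
small toy model. This is a sketch under stated assumptions, not the kernel
implementation: a bio is reduced to a list of page frame numbers, a segment
is a run of consecutive frames, bounce pages are assumed to come back
non-contiguous, and every name and number below is hypothetical. The patch
itself is not quoted in this message, so the "bounce first" ordering shown
here is an assumption about the obvious reordering.

```python
def count_segments(pfns):
    """Count physical segments: runs of consecutive page frame numbers."""
    segments = 0
    prev = None
    for pfn in pfns:
        if prev is None or pfn != prev + 1:
            segments += 1
        prev = pfn
    return segments


def bounce(pfns, limit_pfn, free_low_pfns):
    """Replace every page above limit_pfn with the next free low page."""
    pool = iter(free_low_pfns)
    return [next(pool) if pfn > limit_pfn else pfn for pfn in pfns]


def sg_shortfall(pfns, limit_pfn, free_low_pfns, bounce_first):
    """Number of scatterlist entries needed beyond the counted budget."""
    if bounce_first:
        pfns = bounce(pfns, limit_pfn, free_low_pfns)
        budget = count_segments(pfns)   # counted on the final pages
    else:
        budget = count_segments(pfns)   # counted before the pages change
        pfns = bounce(pfns, limit_pfn, free_low_pfns)
    needed = count_segments(pfns)       # what mapping actually requires
    return max(0, needed - budget)


LIMIT_PFN = 0x37FFF                               # hypothetical bounce limit
HIGH_PAGES = [0x40000, 0x40001, 0x40002, 0x40003]  # contiguous, above the limit
SCATTERED_LOW = [0x1000, 0x2000, 0x3000, 0x5000]   # non-contiguous bounce pages

print(sg_shortfall(HIGH_PAGES, LIMIT_PFN, SCATTERED_LOW, bounce_first=False))
print(sg_shortfall(HIGH_PAGES, LIMIT_PFN, SCATTERED_LOW, bounce_first=True))
```

In this model, four contiguous high pages count as one segment before
bouncing and as four afterwards, so counting first leaves the scatterlist
three entries short, while bouncing first leaves no shortfall.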



end of thread, other threads:[~2015-12-22 17:28 UTC | newest]

Thread overview: 45+ messages
2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
2015-12-20 18:18 ` Christoph Hellwig
2015-12-20 18:41   ` Linus Torvalds
2015-12-20 23:36     ` Artem S. Tashkinov
2015-12-21 11:21     ` Dan Aloni
2015-12-20 18:44   ` Kent Overstreet
2015-12-20 23:41     ` Artem S. Tashkinov
2015-12-20 23:25   ` Artem S. Tashkinov
2015-12-20 23:42     ` Kent Overstreet
2015-12-20 23:49       ` Artem S. Tashkinov
2015-12-20 23:23 ` Artem S. Tashkinov
2015-12-21  1:38 ` Ming Lei
2015-12-21  1:50   ` Artem S. Tashkinov
2015-12-21  2:18     ` Ming Lei
2015-12-21  2:25       ` Artem S. Tashkinov
2015-12-21  2:32     ` Kent Overstreet
2015-12-21  3:21       ` Ming Lei
2015-12-21  3:36         ` Artem S. Tashkinov
2015-12-21  4:32     ` Linus Torvalds
2015-12-21  4:43       ` Artem S. Tashkinov
2015-12-21  4:47         ` Linus Torvalds
2015-12-21  5:23           ` Linus Torvalds
2015-12-21  7:31             ` Artem S. Tashkinov
2015-12-22  4:06             ` Artem S. Tashkinov
2015-12-21  4:26 ` Tejun Heo
2015-12-21  5:10   ` Linus Torvalds
2015-12-21  6:55 ` Tejun Heo
2015-12-21  7:25   ` Artem S. Tashkinov
2015-12-21 19:35     ` Tejun Heo
2015-12-21 20:07       ` Tejun Heo
2015-12-21 21:08         ` Tejun Heo
2015-12-22  3:43           ` Kent Overstreet
2015-12-22  3:59           ` Kent Overstreet
2015-12-22  5:26             ` Junichi Nomura
2015-12-22  5:37               ` Kent Overstreet
2015-12-22  5:38               ` Kent Overstreet
2015-12-22  5:52                 ` Artem S. Tashkinov
2015-12-22  5:55                   ` Kent Overstreet
2015-12-22  5:59                     ` Artem S. Tashkinov
2015-12-22  6:02                       ` Kent Overstreet
2015-12-22 17:28               ` Jens Axboe
2015-12-22  4:45           ` Kent Overstreet
2015-12-22  5:10         ` Artem S. Tashkinov
2015-12-22  5:20         ` Artem S. Tashkinov
2015-12-21 22:51       ` Ming Lei
