* Re: [Intel-gfx] Public i915 CI shardruns are disabled
       [not found]         ` <CAPM=9twngQ=T6WgJBVje9PUtYrSa4LyZgsMZKEykCRc_MObrHw@mail.gmail.com>
@ 2021-03-02 23:56           ` Linus Torvalds
  2021-03-03  0:15             ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2021-03-02 23:56 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Jens Axboe, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On Tue, Mar 2, 2021 at 3:38 PM Dave Airlie <airlied@gmail.com> wrote:
>
> Looks like Jens saw it at least, he posted this on twitter a few mins
> ago so I assume it'll be incoming soon.
>
> https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix

Ahh. You use a swap file. This might be the same thing that I think
the phoronix people hit as ext4 corruption this merge window.

Jens, if that can get confirmed, please send it my way asap.. Thanks,

               Linus



* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 23:56           ` [Intel-gfx] Public i915 CI shardruns are disabled Linus Torvalds
@ 2021-03-03  0:15             ` Jens Axboe
       [not found]               ` <f436251f-2eab-df40-7d0a-0f32b40f5996@kernel.dk>
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-03-03  0:15 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie
  Cc: Christoph Hellwig, Damien Le Moal, Johannes Thumshirn,
	Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On 3/2/21 4:56 PM, Linus Torvalds wrote:
> On Tue, Mar 2, 2021 at 3:38 PM Dave Airlie <airlied@gmail.com> wrote:
>>
>> Looks like Jens saw it at least, he posted this on twitter a few mins
>> ago so I assume it'll be incoming soon.
>>
>> https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix
> 
> Ahh. You use a swap file. This might be the same thing that I think
> the phoronix people hit as ext4 corruption this merge window.
> 
> Jens, if that can get confirmed, please send it my way asap.. Thanks,

Yep, it's the same issue indeed. Was made aware of it after lunch today
and emailed Christoph, but then decided to dig into it myself a few
hours later. Andrew already queued it up I just saw, but I noticed that
that version will break on !CONFIG_HIBERNATION.

Patch below if you just want to grab it.

commit e25b1010db005a59727e1ff5f43af889effd31a3
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Mar 2 14:53:21 2021 -0700

    swap: fix swapfile read/write offset
    
    We're not factoring in the start of the file for where to write and
    read the swapfile, which leads to very unfortunate side effects of
    writing where we should not be...
    
    Fixes: 48d15436fde6 ("mm: remove get_swap_bio")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 32f665b1ee85..4cc6ec3bf0ab 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -485,6 +485,7 @@ struct backing_dev_info;
 extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
+sector_t swap_page_sector(struct page *page);
 
 static inline void put_swap_device(struct swap_info_struct *si)
 {
diff --git a/mm/page_io.c b/mm/page_io.c
index 485fa5cca4a2..c493ce9ebcf5 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -254,11 +254,6 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 	return ret;
 }
 
-static sector_t swap_page_sector(struct page *page)
-{
-	return (sector_t)__page_file_index(page) << (PAGE_SHIFT - 9);
-}
-
 static inline void count_swpout_vm_event(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/mm/swapfile.c b/mm/swapfile.c
index f039745989d2..084a5b9a18e5 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -219,6 +219,19 @@ offset_to_swap_extent(struct swap_info_struct *sis, unsigned long offset)
 	BUG();
 }
 
+sector_t swap_page_sector(struct page *page)
+{
+	struct swap_info_struct *sis = page_swap_info(page);
+	struct swap_extent *se;
+	sector_t sector;
+	pgoff_t offset;
+
+	offset = __page_file_index(page);
+	se = offset_to_swap_extent(sis, offset);
+	sector = se->start_block + (offset - se->start_page);
+	return sector << (PAGE_SHIFT - 9);
+}
+
 /*
  * swap allocation tell device that a cluster of swap can now be discarded,
  * to allow the swap device to optimize its wear-levelling.

-- 
Jens Axboe
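
[Editor's illustration, not part of the original message.] The arithmetic in the patch above can be tried outside the kernel with a small user-space sketch. Everything here is an assumption made for illustration: `struct extent`, `find_extent()`, `sector_without_extent()` and `sector_with_extent()` are hypothetical names, the page size is assumed to be 4 KiB, and a linear search stands in for the kernel's real lookup in offset_to_swap_extent(). It only shows why shifting the raw swap offset gives a different sector than translating through the extent first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12  /* assumption: 4 KiB pages */
#define SECTOR_SHIFT 9   /* 512-byte sectors, the "9" in the patch */

/* Hypothetical, simplified stand-in for the kernel's struct swap_extent. */
struct extent {
	uint64_t start_page;   /* first swap page offset covered by the extent */
	uint64_t nr_pages;     /* number of pages in the extent */
	uint64_t start_block;  /* first backing block, in page-sized units */
};

/* Linear search for the extent covering 'offset' (illustrative only). */
static const struct extent *find_extent(const struct extent *map, size_t n,
					uint64_t offset)
{
	for (size_t i = 0; i < n; i++)
		if (offset >= map[i].start_page &&
		    offset - map[i].start_page < map[i].nr_pages)
			return &map[i];
	return NULL;
}

/* Pre-fix calculation: the swap offset is used as if it were the on-disk position. */
static uint64_t sector_without_extent(uint64_t offset)
{
	return offset << (PAGE_SHIFT - SECTOR_SHIFT);
}

/* Post-fix calculation: translate the offset through its extent first. */
static uint64_t sector_with_extent(const struct extent *map, size_t n,
				   uint64_t offset)
{
	const struct extent *se = find_extent(map, n, offset);

	assert(se != NULL);  /* the kernel lookup BUG()s in this case */
	return (se->start_block + (offset - se->start_page))
		<< (PAGE_SHIFT - SECTOR_SHIFT);
}
```

With a made-up map of two extents, {0, 4, 1000} and {4, 4, 5000}, swap offset 5 comes out as sector 40 without the extent lookup and as sector 40008 with it. For a swap file, the first number points near the start of the underlying device rather than into the file's own blocks.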




* Re: [Intel-gfx] Public i915 CI shardruns are disabled
       [not found]               ` <f436251f-2eab-df40-7d0a-0f32b40f5996@kernel.dk>
@ 2021-03-03  1:01                 ` Linus Torvalds
  2021-03-03  1:18                   ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2021-03-03  1:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> Or if you want a pull, just let me know. Have another misc patch to
> flush out anyway that doesn't belong in any of my usual branches.

Ok, if you have something else pending anyway, let's do that. Send me
the pull request, and I'll take it asap.

Thanks,
               Linus



* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  1:01                 ` Linus Torvalds
@ 2021-03-03  1:18                   ` Jens Axboe
  2021-03-03  2:48                     ` Linus Torvalds
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-03-03  1:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On 3/2/21 6:01 PM, Linus Torvalds wrote:
> On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Or if you want a pull, just let me know. Have another misc patch to
>> flush out anyway that doesn't belong in any of my usual branches.
> 
> Ok, if you have something else pending anyway, let's do that. Send me
> the pull request, and I'll take it asap.

Done

-- 
Jens Axboe




* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  1:18                   ` Jens Axboe
@ 2021-03-03  2:48                     ` Linus Torvalds
  0 siblings, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2021-03-03  2:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

Ok, slightly delayed by dinner, but commit caf6912f3f4a ("swap: fix
swapfile read/write offset") is out in my tree now.

Dave - can you check that the current -git works for your CI people?

Thanks,
              Linus

On Tue, Mar 2, 2021 at 5:18 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 3/2/21 6:01 PM, Linus Torvalds wrote:
> > On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> Or if you want a pull, just let me know. Have another misc patch to
> >> flush out anyway that doesn't belong in any of my usual branches.
> >
> > Ok, if you have something else pending anyway, let's do that. Send me
> > the pull request, and I'll take it asap.
>
> Done
>
> --
> Jens Axboe
>



* RE: [Intel-gfx] Public i915 CI shardruns are disabled
       [not found]       ` <CAHk-=whxZJXkuvX2j56QH6ANA_girjWK3nQCPJGuOWwfYEgtag@mail.gmail.com>
       [not found]         ` <CAPM=9twngQ=T6WgJBVje9PUtYrSa4LyZgsMZKEykCRc_MObrHw@mail.gmail.com>
@ 2021-03-03  9:38         ` Sarvela, Tomi P
  1 sibling, 0 replies; 6+ messages in thread
From: Sarvela, Tomi P @ 2021-03-03  9:38 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie, Jens Axboe, Christoph Hellwig,
	Damien Le Moal, Johannes Thumshirn, Chaitanya Kulkarni
  Cc: Linux Memory Management List, Andrew Morton

From my earlier message on the mailing list: 
[...] "Hitting the bug corrupts the underlying filesystem very thoroughly, wiping out a large amount of data from the beginning of the partition, which leaves fsck sad with thousands of items lost. Bisection of the IGT testlist was done with two root filesystems, where the testable kernel booted from the 2nd partition, and a copy of the 2nd partition was stored on the 1st partition and could be restored at will."

The CI public interface doesn't really show this: the hosts started testing, died, and then got stuck at the grub menu on boot because grub.cfg (or anything else) wasn't available on the root disk.

The decision to shut down the extended testing was mine, made when I saw ~1 host per shard dying each testing round (a couple of hosts per hour).

It's the kind of bug our CI does not handle well: on the catastrophic scale the effects are close to the maximum (where the maximum would be permanent hardware damage), and the cause is not related to i915 at all.

Regards,

Tomi Sarvela


> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wednesday, March 3, 2021 1:28 AM
> To: Dave Airlie <airlied@gmail.com>; Jens Axboe <axboe@kernel.dk>;
> Christoph Hellwig <hch@lst.de>; Damien Le Moal
> <damien.lemoal@wdc.com>; Johannes Thumshirn
> <johannes.thumshirn@wdc.com>; Chaitanya Kulkarni
> <chaitanya.kulkarni@wdc.com>
> Cc: Sarvela, Tomi P <tomi.p.sarvela@intel.com>; Linux Memory Management
> List <linux-mm@kvack.org>; Andrew Morton <akpm@linux-foundation.org>;
> intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] Public i915 CI shardruns are disabled
> 
> Adding the right people.
> 
> It seems that the three commits that needed reverting are
> 
>   f885056a48cc ("mm: simplify swapdev_block")
>   3e3126cf2a6d ("mm: only make map_swap_entry available for
> CONFIG_HIBERNATION")
>   48d15436fde6 ("mm: remove get_swap_bio")
> 
> and while they look very harmless to me, let's bring in Christoph and
> Jens who were actually involved with them.
> 
> I'm assuming that it's that third one that is the real issue (and the
> two other ones were to get to it), but it would also be good to know
> what the actual details of the regression actually were.
> 
> Maybe that's obvious to somebody who has more context about the i915
> CI runs and its web interface, but it sure isn't clear to me.
> 
> Jens, Christoph?
> 
>                   Linus
> 
> On Tue, Mar 2, 2021 at 11:31 AM Dave Airlie <airlied@gmail.com> wrote:
> >
> > On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@intel.com> wrote:
> > >
> > > The regression has been identified; Chris Wilson found commits touching
> > > swapfile.c, and reverting them the issue couldn’t be reproduced any more.
> > >
> > > https://patchwork.freedesktop.org/series/87549/
> > >
> > > This revert will be applied to core-for-CI branch. When new CI_DRM has
> > > been built, shard-testing will be enabled again.
> >
> > Just making sure this is on the radar upstream.
> >
> > Dave.

