From: "Sarvela, Tomi P" <tomi.p.sarvela@intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Dave Airlie <airlied@gmail.com>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: RE: [Intel-gfx] Public i915 CI shardruns are disabled
Date: Wed, 3 Mar 2021 09:38:25 +0000
Message-ID: <78d84056ae5d48a99b8fa8344e1e4fdc@intel.com>
In-Reply-To: <CAHk-=whxZJXkuvX2j56QH6ANA_girjWK3nQCPJGuOWwfYEgtag@mail.gmail.com>

From my earlier message on the mailing list: 
[...] "Hitting the bug corrupts the underlying filesystem very thoroughly, wiping out large amount of data from the beginning of the partition which leaves fsck sad with thousands of items lost. Bisection of the IGT testlist was done with two root filesystems, where testable kernel booted from 2. partition, and copy of the 2. partition was stored on 1. partition and could be restored at will."

The public CI interface doesn't really show this: the hosts started testing, died, and on reboot got stuck at the grub menu because grub.cfg (or anything else) wasn't available on the root disk.

The decision to shut down the extended testing was mine, made when I saw ~1 host per shard dying each testing round (a couple of hosts per hour).

It's a kind of bug our CI does not handle well: on the catastrophic scale the effects are close to the maximum (where the maximum would be permanent hw damage), and the cause is not related to i915 at all.

Regards,

Tomi Sarvela


> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wednesday, March 3, 2021 1:28 AM
> To: Dave Airlie <airlied@gmail.com>; Jens Axboe <axboe@kernel.dk>;
> Christoph Hellwig <hch@lst.de>; Damien Le Moal
> <damien.lemoal@wdc.com>; Johannes Thumshirn
> <johannes.thumshirn@wdc.com>; Chaitanya Kulkarni
> <chaitanya.kulkarni@wdc.com>
> Cc: Sarvela, Tomi P <tomi.p.sarvela@intel.com>; Linux Memory Management
> List <linux-mm@kvack.org>; Andrew Morton <akpm@linux-foundation.org>;
> intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] Public i915 CI shardruns are disabled
> 
> Adding the right people.
> 
> It seems that the three commits that needed reverting are
> 
>   f885056a48cc ("mm: simplify swapdev_block")
>   3e3126cf2a6d ("mm: only make map_swap_entry available for CONFIG_HIBERNATION")
>   48d15436fde6 ("mm: remove get_swap_bio")
> 
> and while they look very harmless to me, let's bring in Christoph and
> Jens who were actually involved with them.
> 
> I'm assuming that it's that third one that is the real issue (and the
> two other ones were to get to it), but it would also be good to know
> what the actual details of the regression actually were.
> 
> Maybe that's obvious to somebody who has more context about the i915
> CI runs and its web interface, but it sure isn't clear to me.
> 
> Jens, Christoph?
> 
>                   Linus
> 
> On Tue, Mar 2, 2021 at 11:31 AM Dave Airlie <airlied@gmail.com> wrote:
> >
> > On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@intel.com> wrote:
> > >
> > > The regression has been identified; Chris Wilson found commits touching
> > > swapfile.c, and reverting them the issue couldn’t be reproduced any more.
> > >
> > > https://patchwork.freedesktop.org/series/87549/
> > >
> > > This revert will be applied to core-for-CI branch. When new CI_DRM has
> > > been built, shard-testing will be enabled again.
> >
> > Just making sure this is on the radar upstream.
> >
> > Dave.
