* [Intel-gfx] Public i915 CI shardruns are disabled
@ 2021-03-02 11:37 Sarvela, Tomi P
  2021-03-02 13:50 ` Sarvela, Tomi P
  2021-03-03 18:28 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
  0 siblings, 2 replies; 20+ messages in thread
From: Sarvela, Tomi P @ 2021-03-02 11:37 UTC (permalink / raw)
  To: intel-gfx


Hello,

The Linux i915 CI shardruns have been disabled. This is due to the unfortunate
filesystem-corrupting bug first seen in linux-next 20210215, which has now been
merged into Linus's 5.12-rc1 and onward into drm-tip; the first instance was
seen in CI_DRM_9818. The last changes coming in were:

fb3b93df7979 drm-tip: 2021y-03m-01d-09h-36m-57s UTC integration manifest
3b3c4086295b drm-tip: 2021y-03m-01d-08h-49m-06s UTC integration manifest
fe07bfda2fb9 Linux 5.12-rc1

More information can be seen at:
https://phoronix.com/scan.php?page=news_item&px=Linux-5.12-Early-Buggy-Issue

I've seen this bug happen regularly with (but not limited to) IGT test:
igt@gem_tiled_swapping@non-threaded

The range for bisection spans linux-next 20210129 to 20210215, because the
kernels in between taint the kernel and our i915 testing was not done on them.
Hitting the bug corrupts the underlying filesystem very thoroughly, wiping out a
large amount of data from the beginning of the partition, which leaves fsck sad
with thousands of items lost. Bisection of the IGT testlist was done with two
root filesystems: the kernel under test booted from the second partition, and a
copy of the second partition was stored on the first partition and could be
restored at will.

I'll continue bisecting this bug on the linux-next tree. If someone has more
information on where this issue originates, help would be appreciated.

Regards,

Tomi Sarvela

--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 11:37 [Intel-gfx] Public i915 CI shardruns are disabled Sarvela, Tomi P
@ 2021-03-02 13:50 ` Sarvela, Tomi P
  2021-03-02 17:26   ` Sarvela, Tomi P
  2021-03-03 18:28 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
  1 sibling, 1 reply; 20+ messages in thread
From: Sarvela, Tomi P @ 2021-03-02 13:50 UTC (permalink / raw)
  To: intel-gfx


More information (excuse my top-posting):

- The issue happens in the mlocking phase of
igt@gem_tiled_swapping@non-threaded, before "starting subtest" appears.

- The filesystem that gets trashed is the one containing the swapfile.

- If swap is a partition, the swap signature appears to be intact even
after running the test, so for now I'm assuming that the issue is specific
to swapfiles.

- Bisection between 20210129 and 20210215 proved challenging, because
the kernels hang before init, leave no dmesg, and I don't have a console
on the testing host. Petri's suggestion to bisect between CI_DRM_9817
and 9818 might work better.

Regards,

Tomi Sarvela

^ permalink raw reply	[flat|nested] 20+ messages in thread
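
The swap-signature observation in the message above can be checked by hand. A minimal sketch in C, assuming the version-2 swap header layout in which the 10-byte magic string "SWAPSPACE2" occupies the last 10 bytes of the first page of the swap area. The function names and the caller-supplied page size are illustrative and do not come from the thread:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SWAP_MAGIC     "SWAPSPACE2"
#define SWAP_MAGIC_LEN 10

/* Return 1 if the buffer holding the first page ends with the swap magic. */
static int has_swap_signature(const unsigned char *first_page, size_t page_size)
{
    if (page_size < SWAP_MAGIC_LEN)
        return 0;
    return memcmp(first_page + page_size - SWAP_MAGIC_LEN,
                  SWAP_MAGIC, SWAP_MAGIC_LEN) == 0;
}

/*
 * Read the first page of a swap partition or swapfile and check it.
 * Returns 1 if the signature is present, 0 if it is absent, and -1 if
 * the page could not be read (hypothetical helper, assumed page size).
 */
static int check_swap_area(const char *path, size_t page_size)
{
    unsigned char *buf = malloc(page_size);
    FILE *f;
    int ret = -1;

    if (!buf)
        return -1;
    f = fopen(path, "rb");
    if (f) {
        if (fread(buf, 1, page_size, f) == page_size)
            ret = has_swap_signature(buf, page_size);
        fclose(f);
    }
    free(buf);
    return ret;
}
```

Reading a block device this way normally requires root. An intact signature only shows that the header page survived; it says nothing about the rest of the swap area.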

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 13:50 ` Sarvela, Tomi P
@ 2021-03-02 17:26   ` Sarvela, Tomi P
  2021-03-02 19:31     ` Dave Airlie
  2021-03-09  8:31     ` Sarvela, Tomi P
  0 siblings, 2 replies; 20+ messages in thread
From: Sarvela, Tomi P @ 2021-03-02 17:26 UTC (permalink / raw)
  To: intel-gfx


The regression has been identified: Chris Wilson found commits touching
swapfile.c, and with them reverted the issue could no longer be reproduced.

https://patchwork.freedesktop.org/series/87549/

This revert will be applied to the core-for-CI branch. Once a new CI_DRM has
been built, shard testing will be enabled again.

Regards,

Tomi Sarvela

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 17:26   ` Sarvela, Tomi P
@ 2021-03-02 19:31     ` Dave Airlie
  2021-03-02 23:27       ` Linus Torvalds
  2021-03-09  8:31     ` Sarvela, Tomi P
  1 sibling, 1 reply; 20+ messages in thread
From: Dave Airlie @ 2021-03-02 19:31 UTC (permalink / raw)
  To: Sarvela, Tomi P, Linus Torvalds, Linux Memory Management List,
	Andrew Morton
  Cc: intel-gfx

On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@intel.com> wrote:
>
> The regression has been identified; Chris Wilson found commits touching
> swapfile.c, and reverting them the issue couldn’t be reproduced any more.
>
> https://patchwork.freedesktop.org/series/87549/
>
> This revert will be applied to core-for-CI branch. When new CI_DRM has
> been built, shard-testing will be enabled again.

Just making sure this is on the radar upstream.

Dave.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 19:31     ` Dave Airlie
@ 2021-03-02 23:27       ` Linus Torvalds
  2021-03-02 23:38         ` Dave Airlie
  2021-03-03  9:38         ` Sarvela, Tomi P
  0 siblings, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2021-03-02 23:27 UTC (permalink / raw)
  To: Dave Airlie, Jens Axboe, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni
  Cc: Sarvela, Tomi P, Linux Memory Management List, Andrew Morton, intel-gfx

Adding the right people.

It seems that the three commits that needed reverting are

  f885056a48cc ("mm: simplify swapdev_block")
  3e3126cf2a6d ("mm: only make map_swap_entry available for CONFIG_HIBERNATION")
  48d15436fde6 ("mm: remove get_swap_bio")

and while they look very harmless to me, let's bring in Christoph and
Jens who were actually involved with them.

I'm assuming that it's that third one that is the real issue (and the
two other ones were to get to it), but it would also be good to know
what the actual details of the regression actually were.

Maybe that's obvious to somebody who has more context about the 9815
CI runs and its web interface, but it sure isn't clear to me.

Jens, Christoph?

                  Linus

On Tue, Mar 2, 2021 at 11:31 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@intel.com> wrote:
> >
> > The regression has been identified; Chris Wilson found commits touching
> > swapfile.c, and reverting them the issue couldn’t be reproduced any more.
> >
> > https://patchwork.freedesktop.org/series/87549/
> >
> > This revert will be applied to core-for-CI branch. When new CI_DRM has
> > been built, shard-testing will be enabled again.
>
> Just making sure this is on the radar upstream.
>
> Dave.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 23:27       ` Linus Torvalds
@ 2021-03-02 23:38         ` Dave Airlie
  2021-03-02 23:56             ` Linus Torvalds
  2021-03-03  9:38         ` Sarvela, Tomi P
  1 sibling, 1 reply; 20+ messages in thread
From: Dave Airlie @ 2021-03-02 23:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Damien Le Moal, Chaitanya Kulkarni, Sarvela, Tomi P,
	Johannes Thumshirn, intel-gfx, Linux Memory Management List,
	Andrew Morton, Christoph Hellwig

On Wed, 3 Mar 2021 at 09:28, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Adding the right people.
>
> It seems that the three commits that needed reverting are
>
>   f885056a48cc ("mm: simplify swapdev_block")
>   3e3126cf2a6d ("mm: only make map_swap_entry available for CONFIG_HIBERNATION")
>   48d15436fde6 ("mm: remove get_swap_bio")
>
> and while they look very harmless to me, let's bring in Christoph and
> Jens who were actually involved with them.
>
> I'm assuming that it's that third one that is the real issue (and the
> two other ones were to get to it), but it would also be good to know
> what the actual details of the regression actually were.
>
> Maybe that's obvious to somebody who has more context about the 9815
> CI runs and its web interface, but it sure isn't clear to me.
>
> Jens, Christoph?

Looks like Jens saw it at least, he posted this on twitter a few mins
ago so I assume it'll be incoming soon.

https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix

Dave.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 23:38         ` Dave Airlie
@ 2021-03-02 23:56             ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2021-03-02 23:56 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Jens Axboe, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On Tue, Mar 2, 2021 at 3:38 PM Dave Airlie <airlied@gmail.com> wrote:
>
> Looks like Jens saw it at least, he posted this on twitter a few mins
> ago so I assume it'll be incoming soon.
>
> https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix

Ahh. You use a swap file. This might be the same thing that I think
the phoronix people hit as ext4 corruption this merge window.

Jens, if that can get confirmed, please send it my way asap.. Thanks,

               Linus


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 23:56             ` Linus Torvalds
@ 2021-03-03  0:15               ` Jens Axboe
  -1 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2021-03-03  0:15 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie
  Cc: Christoph Hellwig, Damien Le Moal, Johannes Thumshirn,
	Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On 3/2/21 4:56 PM, Linus Torvalds wrote:
> On Tue, Mar 2, 2021 at 3:38 PM Dave Airlie <airlied@gmail.com> wrote:
>>
>> Looks like Jens saw it at least, he posted this on twitter a few mins
>> ago so I assume it'll be incoming soon.
>>
>> https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix
> 
> Ahh. You use a swap file. This might be the same thing that I think
> the phoronix people hit as ext4 corruption this merge window.
> 
> Jens, if that can get confirmed, please send it my way asap.. Thanks,

Yep, it's the same issue indeed. Was made aware of it after lunch today
and emailed Christoph, but then decided to dig into it myself a few
hours later. Andrew already queued it up I just saw, but I noticed that
that version will break on !CONFIG_HIBERNATION.

Patch below if you just want to grab it.

commit e25b1010db005a59727e1ff5f43af889effd31a3
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Mar 2 14:53:21 2021 -0700

    swap: fix swapfile read/write offset
    
    We're not factoring in the start of the file for where to write and
    read the swapfile, which leads to very unfortunate side effects of
    writing where we should not be...
    
    Fixes: 48d15436fde6 ("mm: remove get_swap_bio")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 32f665b1ee85..4cc6ec3bf0ab 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -485,6 +485,7 @@ struct backing_dev_info;
 extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
+sector_t swap_page_sector(struct page *page);
 
 static inline void put_swap_device(struct swap_info_struct *si)
 {
diff --git a/mm/page_io.c b/mm/page_io.c
index 485fa5cca4a2..c493ce9ebcf5 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -254,11 +254,6 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 	return ret;
 }
 
-static sector_t swap_page_sector(struct page *page)
-{
-	return (sector_t)__page_file_index(page) << (PAGE_SHIFT - 9);
-}
-
 static inline void count_swpout_vm_event(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/mm/swapfile.c b/mm/swapfile.c
index f039745989d2..084a5b9a18e5 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -219,6 +219,19 @@ offset_to_swap_extent(struct swap_info_struct *sis, unsigned long offset)
 	BUG();
 }
 
+sector_t swap_page_sector(struct page *page)
+{
+	struct swap_info_struct *sis = page_swap_info(page);
+	struct swap_extent *se;
+	sector_t sector;
+	pgoff_t offset;
+
+	offset = __page_file_index(page);
+	se = offset_to_swap_extent(sis, offset);
+	sector = se->start_block + (offset - se->start_page);
+	return sector << (PAGE_SHIFT - 9);
+}
+
 /*
  * swap allocation tell device that a cluster of swap can now be discarded,
  * to allow the swap device to optimize its wear-levelling.

-- 
Jens Axboe



^ permalink raw reply related	[flat|nested] 20+ messages in thread
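
The commit message above says the old code did not factor in where the swapfile starts. A user-space sketch of that arithmetic follows, assuming 4 KiB pages and a simplified `struct extent` standing in for the kernel's `struct swap_extent`. The struct, the linear lookup, and the function names are illustrative; the kernel's `offset_to_swap_extent()` is not reproduced here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, 512-byte sectors */

/* Simplified stand-in for the kernel's struct swap_extent. */
struct extent {
    uint64_t start_page;  /* first swap page offset covered by the extent */
    uint64_t nr_pages;    /* number of pages in the extent */
    uint64_t start_block; /* first on-disk block, in page-sized units */
};

/* Linear lookup of the extent covering a swap page offset, or NULL. */
static const struct extent *find_extent(const struct extent *ext, size_t n,
                                        uint64_t offset)
{
    for (size_t i = 0; i < n; i++)
        if (offset >= ext[i].start_page &&
            offset - ext[i].start_page < ext[i].nr_pages)
            return &ext[i];
    return NULL;
}

/* Pre-fix arithmetic: the swap page offset is used as the disk offset. */
static uint64_t sector_without_extent(uint64_t offset)
{
    return offset << (PAGE_SHIFT - 9);
}

/* Post-fix arithmetic: translate through the extent before shifting. */
static uint64_t sector_with_extent(const struct extent *ext, size_t n,
                                   uint64_t offset)
{
    const struct extent *se = find_extent(ext, n, offset);

    assert(se != NULL); /* the offset must lie inside a mapped extent */
    return (se->start_block + (offset - se->start_page)) << (PAGE_SHIFT - 9);
}
```

With a swapfile whose first extent begins at block 1000, swap page 1 maps to sector 8008, whereas the pre-fix formula yields sector 8, close to the start of the block device. That would be consistent with the reports of data being wiped from the beginning of the partition. When the only extent starts at block 0, which is presumably how a swap partition is mapped, both formulas agree.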

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  0:15               ` Jens Axboe
  (?)
@ 2021-03-03  0:36               ` Jens Axboe
  2021-03-03  1:01                   ` Linus Torvalds
  -1 siblings, 1 reply; 20+ messages in thread
From: Jens Axboe @ 2021-03-03  0:36 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie
  Cc: Damien Le Moal, Chaitanya Kulkarni, Sarvela, Tomi P,
	Johannes Thumshirn, intel-gfx, Linux Memory Management List,
	Andrew Morton, Christoph Hellwig

On 3/2/21 5:15 PM, Jens Axboe wrote:
> On 3/2/21 4:56 PM, Linus Torvalds wrote:
>> On Tue, Mar 2, 2021 at 3:38 PM Dave Airlie <airlied@gmail.com> wrote:
>>>
>>> Looks like Jens saw it at least, he posted this on twitter a few mins
>>> ago so I assume it'll be incoming soon.
>>>
>>> https://git.kernel.dk/cgit/linux-block/commit/?h=swap-fix
>>
>> Ahh. You use a swap file. This might be the same thing that I think
>> the phoronix people hit as ext4 corruption this merge window.
>>
>> Jens, if that can get confirmed, please send it my way asap.. Thanks,
> 
> Yep, it's the same issue indeed. Was made aware of it after lunch today
> and emailed Christoph, but then decided to dig into it myself a few
> hours later. Andrew already queued it up I just saw, but I noticed that
> that version will break on !CONFIG_HIBERNATION.
> 
> Patch below if you just want to grab it.

Or if you want a pull, just let me know. Have another misc patch to
flush out anyway that doesn't belong in any of my usual branches.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  0:36               ` Jens Axboe
@ 2021-03-03  1:01                   ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2021-03-03  1:01 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> Or if you want a pull, just let me know. Have another misc patch to
> flush out anyway that doesn't belong in any of my usual branches.

Ok, if you have something else pending anyway, let's do that. Send me
the pull request, and I'll take it asap.

Thanks,
               Linus


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  1:01                   ` Linus Torvalds
@ 2021-03-03  1:18                     ` Jens Axboe
  -1 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2021-03-03  1:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

On 3/2/21 6:01 PM, Linus Torvalds wrote:
> On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Or if you want a pull, just let me know. Have another misc patch to
>> flush out anyway that doesn't belong in any of my usual branches.
> 
> Ok, if you have something else pending anyway, let's do that. Send me
> the pull request, and I'll take it asap.

Done

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-03  1:18                     ` Jens Axboe
@ 2021-03-03  2:48                       ` Linus Torvalds
  -1 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2021-03-03  2:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Dave Airlie, Christoph Hellwig, Damien Le Moal,
	Johannes Thumshirn, Chaitanya Kulkarni, Sarvela, Tomi P,
	Linux Memory Management List, Andrew Morton, intel-gfx

Ok, slightly delayed by dinner, but commit caf6912f3f4a ("swap: fix
swapfile read/write offset") is out in my tree now.

Dave - can you check that the current -git works for your CI people?

Thanks,
              Linus

On Tue, Mar 2, 2021 at 5:18 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 3/2/21 6:01 PM, Linus Torvalds wrote:
> > On Tue, Mar 2, 2021 at 4:36 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> Or if you want a pull, just let me know. Have another misc patch to
> >> flush out anyway that doesn't belong in any of my usual branches.
> >
> > Ok, if you have something else pending anyway, let's do that. Send me
> > the pull request, and I'll take it asap.
>
> Done
>
> --
> Jens Axboe
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 23:27       ` Linus Torvalds
  2021-03-02 23:38         ` Dave Airlie
@ 2021-03-03  9:38         ` Sarvela, Tomi P
  1 sibling, 0 replies; 20+ messages in thread
From: Sarvela, Tomi P @ 2021-03-03  9:38 UTC (permalink / raw)
  To: Linus Torvalds, Dave Airlie, Jens Axboe, Christoph Hellwig,
	Damien Le Moal, Johannes Thumshirn, Chaitanya Kulkarni
  Cc: Linux Memory Management List, Andrew Morton

From my earlier message on the mailing list: 
[...] "Hitting the bug corrupts the underlying filesystem very thoroughly, wiping out large amount of data from the beginning of the partition which leaves fsck sad with thousands of items lost. Bisection of the IGT testlist was done with two root filesystems, where testable kernel booted from 2. partition, and copy of the 2. partition was stored on 1. partition and could be restored at will."

The CI public interface doesn't really show this: the hosts started testing, died, and on reboot got stuck at the grub menu because grub.cfg (or anything else) was no longer available on the root disk.

The decision to shut down the extended testing was mine, made when I saw about one host per shard dying each testing round (a couple of hosts per hour).

It's a kind of bug our CI does not handle well: on the catastrophe scale the effects are close to the maximum (where the maximum would be permanent hardware damage), and the cause is not related to i915 at all.

Regards,

Tomi Sarvela


> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Wednesday, March 3, 2021 1:28 AM
> To: Dave Airlie <airlied@gmail.com>; Jens Axboe <axboe@kernel.dk>;
> Christoph Hellwig <hch@lst.de>; Damien Le Moal
> <damien.lemoal@wdc.com>; Johannes Thumshirn
> <johannes.thumshirn@wdc.com>; Chaitanya Kulkarni
> <chaitanya.kulkarni@wdc.com>
> Cc: Sarvela, Tomi P <tomi.p.sarvela@intel.com>; Linux Memory Management
> List <linux-mm@kvack.org>; Andrew Morton <akpm@linux-foundation.org>;
> intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] Public i915 CI shardruns are disabled
> 
> Adding the right people.
> 
> It seems that the three commits that needed reverting are
> 
>   f885056a48cc ("mm: simplify swapdev_block")
>   3e3126cf2a6d ("mm: only make map_swap_entry available for CONFIG_HIBERNATION")
>   48d15436fde6 ("mm: remove get_swap_bio")
> 
> and while they look very harmless to me, let's bring in Christoph and
> Jens who were actually involved with them.
> 
> I'm assuming that it's that third one that is the real issue (and the
> two other ones were to get to it), but it would also be good to know
> what the actual details of the regression actually were.
> 
> Maybe that's obvious to somebody who has more context about the 9815
> CI runs and its web interface, but it sure isn't clear to me.
> 
> Jens, Christoph?
> 
>                   Linus
> 
> On Tue, Mar 2, 2021 at 11:31 AM Dave Airlie <airlied@gmail.com> wrote:
> >
> > On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@intel.com> wrote:
> > >
> > > The regression has been identified; Chris Wilson found commits touching
> > > swapfile.c, and reverting them the issue couldn’t be reproduced any more.
> > >
> > > https://patchwork.freedesktop.org/series/87549/
> > >
> > > This revert will be applied to core-for-CI branch. When new CI_DRM has
> > > been built, shard-testing will be enabled again.
> >
> > Just making sure this is on the radar upstream.
> >
> > Dave.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for Public i915 CI shardruns are disabled
  2021-03-02 11:37 [Intel-gfx] Public i915 CI shardruns are disabled Sarvela, Tomi P
  2021-03-02 13:50 ` Sarvela, Tomi P
@ 2021-03-03 18:28 ` Patchwork
  1 sibling, 0 replies; 20+ messages in thread
From: Patchwork @ 2021-03-03 18:28 UTC (permalink / raw)
  To: Jens Axboe; +Cc: intel-gfx


== Series Details ==

Series: Public i915 CI shardruns are disabled
URL   : https://patchwork.freedesktop.org/series/87558/
State : failure

== Summary ==

Applying: Public i915 CI shardruns are disabled
Using index info to reconstruct a base tree...
M	include/linux/swap.h
M	mm/page_io.c
M	mm/swapfile.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] Public i915 CI shardruns are disabled
  2021-03-02 17:26   ` Sarvela, Tomi P
  2021-03-02 19:31     ` Dave Airlie
@ 2021-03-09  8:31     ` Sarvela, Tomi P
  1 sibling, 0 replies; 20+ messages in thread
From: Sarvela, Tomi P @ 2021-03-09  8:31 UTC (permalink / raw)
  To: intel-gfx


It seems that the chops to be built has been re-defined several times in pipelines. Fixed.

https://github.com/intel-innersource/drivers.gpu.i915.ci.pipelines/commit/89d2f8174a15585c082b2f714551225ba6cafe08

Tomi

From: Sarvela, Tomi P
Sent: Tuesday, March 2, 2021 7:27 PM
To: 'intel-gfx@lists.freedesktop.org' <intel-gfx@lists.freedesktop.org>
Cc: Szwichtenberg, Radoslaw <radoslaw.szwichtenberg@intel.com>
Subject: RE: Public i915 CI shardruns are disabled

The regression has been identified; Chris Wilson found commits touching
swapfile.c, and reverting them the issue couldn't be reproduced any more.

https://patchwork.freedesktop.org/series/87549/

This revert will be applied to core-for-CI branch. When new CI_DRM has
been built, shard-testing will be enabled again.

Regards,

Tomi Sarvela

From: Sarvela, Tomi P
More information (excuse my top-posting):

- Issue happens in igt@gem_tiled_swapping@non-threaded Mlocking
phase, before "starting subtest" appears.

- Filesystem trashed is the one containing swapfile

- If swap is partition, it seems that the swap signature is correct even
after running the test, so for now I'm assuming that the issue has to do
with swapfile

- Bisection between 20210129 and 20210215 proved to be challenging,
because the kernels have pre-init hang, don't leave dmesg and I don't
have console on testing host. Petri's suggestion to bisect between
CI_DRM_9817 and 9818 might work better
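
The swapfile-versus-partition distinction in the points above can be checked on a test host by reading /proc/swaps. What follows is only a minimal sketch, assuming the usual /proc/swaps layout (a header line, then whitespace-separated columns with the path first and the type second); the function names and the sample text are hypothetical and are not taken from the CI tooling.

```python
def classify_swap_areas(proc_swaps_text):
    """Map each active swap area path to its type ("file" or "partition").

    Assumes the usual /proc/swaps layout: one header line followed by
    whitespace-separated columns, path first and type second.
    """
    areas = {}
    lines = proc_swaps_text.strip().splitlines()
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        if len(fields) >= 2:
            areas[fields[0]] = fields[1]
    return areas


def swapfile_paths(proc_swaps_text):
    """Return only the swap areas backed by a regular file."""
    return [path for path, kind in classify_swap_areas(proc_swaps_text).items()
            if kind == "file"]


# Illustrative sample only; on a real host, pass open("/proc/swaps").read().
SAMPLE = (
    "Filename    Type       Size     Used  Priority\n"
    "/swapfile   file       2097148  0     -2\n"
    "/dev/sda3   partition  8388604  0     -3\n"
)
```

Calling `swapfile_paths(SAMPLE)` picks out `/swapfile`; by the observations above, a host whose swap is backed by a file is the kind that saw its filesystem trashed, while partition-backed swap appeared unaffected.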

Regards,

Tomi Sarvela

From: Sarvela, Tomi P
Hello,

The linux i915 CI shardruns have been disabled. This is due to the unfortunate
filesystem-corrupting bug first seen in linux-next 20210215, which now has
been merged to linus 5.12-rc1 and further on to DRM-Tip, first instance seen
in CI_DRM_9818. Last changes coming in were:

fb3b93df7979 drm-tip: 2021y-03m-01d-09h-36m-57s UTC integration manifest
3b3c4086295b drm-tip: 2021y-03m-01d-08h-49m-06s UTC integration manifest
fe07bfda2fb9 Linux 5.12-rc1

More information can be seen at:
https://phoronix.com/scan.php?page=news_item&px=Linux-5.12-Early-Buggy-Issue

I've seen this bug happen regularly with (but not limited to) IGT test:
igt@gem_tiled_swapping@non-threaded

The range for bisection is linux-next 20210215 to 20210129 because the kernels
in-between taint the kernel and our i915 testing was not done. Hitting the bug
corrupts the underlying filesystem very thoroughly, wiping out large amount of
data from the beginning of the partition which leaves fsck sad with thousands of
items lost. Bisection of the IGT testlist was done with two root filesystems, where
testable kernel booted from 2. partition, and copy of the 2. partition was stored
on 1. partition and could be restored at will.

I'll continue bisecting this bug on the linux-next tree again. If someone has more
information where this issue originates from, help would be appreciated.

Regards,

Tomi Sarvela

--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-03-09  8:31 UTC | newest]

Thread overview: 20+ messages
2021-03-02 11:37 [Intel-gfx] Public i915 CI shardruns are disabled Sarvela, Tomi P
2021-03-02 13:50 ` Sarvela, Tomi P
2021-03-02 17:26   ` Sarvela, Tomi P
2021-03-02 19:31     ` Dave Airlie
2021-03-02 23:27       ` Linus Torvalds
2021-03-02 23:38         ` Dave Airlie
2021-03-02 23:56           ` Linus Torvalds
2021-03-02 23:56             ` Linus Torvalds
2021-03-03  0:15             ` Jens Axboe
2021-03-03  0:15               ` Jens Axboe
2021-03-03  0:36               ` Jens Axboe
2021-03-03  1:01                 ` Linus Torvalds
2021-03-03  1:01                   ` Linus Torvalds
2021-03-03  1:18                   ` Jens Axboe
2021-03-03  1:18                     ` Jens Axboe
2021-03-03  2:48                     ` Linus Torvalds
2021-03-03  2:48                       ` Linus Torvalds
2021-03-03  9:38         ` Sarvela, Tomi P
2021-03-09  8:31     ` Sarvela, Tomi P
2021-03-03 18:28 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
