All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Hubbard <jhubbard@nvidia.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
	Matthew Auld <matthew.auld@intel.com>,
	<intel-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>,
	LKML <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: Solved: [PATCH 0/4] mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()
Date: Thu, 21 May 2020 13:40:39 -0700	[thread overview]
Message-ID: <7d79c089-7b21-cf7f-66ea-078d44c5e007@nvidia.com> (raw)
In-Reply-To: <b907c1d5-b95a-3d00-cafa-0a321f0141d8@nvidia.com>

On 2020-05-21 12:11, John Hubbard wrote:
> On 2020-05-21 11:57, Chris Wilson wrote:
>> Quoting John Hubbard (2020-05-19 01:21:20)
>>> This needs to go through Andrew's -mm tree, due to adding a new gup.c
>>> routine. However, I would really love to have some testing from the
>>> drm/i915 folks, because I haven't been able to run-time test that part
>>> of it.
>>
>> CI hit
>>
>> <4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699 
>> internal_get_user_pages_fast+0x63a/0xac0


OK, what happened here is that it's WARN()'ing due to passing in the new
FOLL_FAST_ONLY flag, which was not added to the whitelist.

So the fix is easy, and should be applied to the refactoring patch. I'll
send out a v2 of the series, which will effectively have this applied:


diff --git a/mm/gup.c b/mm/gup.c
index 6cbe98c93466..4f0ca3f849d1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2696,7 +2696,8 @@ static int internal_get_user_pages_fast(unsigned long start, 
int nr_pages,
  	int nr_pinned = 0, ret = 0;

  	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
-				       FOLL_FORCE | FOLL_PIN | FOLL_GET)))
+				       FOLL_FORCE | FOLL_PIN | FOLL_GET |
+				       FOLL_FAST_ONLY)))
  		return -EINVAL;

  	start = untagged_addr(start) & PAGE_MASK;


>> <4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek 
>> snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel 
>> snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core 
>> ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei 
>> intel_lpss_pci prime_numbers
>> <4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G     U            
>> 5.7.0-rc5-CI-Patchwork_17704+ #1
>> <4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake 
>> U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
>> <4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
>> <4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44 
>> 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff 
>> ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
>> <4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
>> <4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
>> <4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
>> <4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
>> <4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
>> <4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
>> <4> [185.667806] FS:  00007f419e875e40(0000) GS:ffff88849fe00000(0000) 
>> knlGS:0000000000000000
>> <4> [185.667808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> <4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
>> <4> [185.667814] PKRU: 55555554
>> <4> [185.667816] Call Trace:
>> <4> [185.667912]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.667918]  ? mark_held_locks+0x49/0x70
>> <4> [185.667998]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.668073]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>>
>> and then panicked, across a range of systems.
>> -Chris
>>

btw, the panic seems to indicate an additional, pre-existing problem:
i915_gem_userptr_get_pages(), in this case at least, is not able to
recover from a get_user_pages/pin_user_pages failure.


thanks,
-- 
John Hubbard
NVIDIA

WARNING: multiple messages have this Message-ID
From: John Hubbard <jhubbard@nvidia.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	dri-devel@lists.freedesktop.org,
	Tvrtko Ursulin <tvrtko.ursulin@intel.com>,
	David Airlie <airlied@linux.ie>,
	intel-gfx@lists.freedesktop.org,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Souptick Joarder <jrdr.linux@gmail.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Matthew Auld <matthew.auld@intel.com>
Subject: Solved: [PATCH 0/4] mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()
Date: Thu, 21 May 2020 13:40:39 -0700	[thread overview]
Message-ID: <7d79c089-7b21-cf7f-66ea-078d44c5e007@nvidia.com> (raw)
In-Reply-To: <b907c1d5-b95a-3d00-cafa-0a321f0141d8@nvidia.com>

On 2020-05-21 12:11, John Hubbard wrote:
> On 2020-05-21 11:57, Chris Wilson wrote:
>> Quoting John Hubbard (2020-05-19 01:21:20)
>>> This needs to go through Andrew's -mm tree, due to adding a new gup.c
>>> routine. However, I would really love to have some testing from the
>>> drm/i915 folks, because I haven't been able to run-time test that part
>>> of it.
>>
>> CI hit
>>
>> <4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699 
>> internal_get_user_pages_fast+0x63a/0xac0


OK, what happened here is that it's WARN()'ing due to passing in the new
FOLL_FAST_ONLY flag, which was not added to the whitelist.

So the fix is easy, and should be applied to the refactoring patch. I'll
send out a v2 of the series, which will effectively have this applied:


diff --git a/mm/gup.c b/mm/gup.c
index 6cbe98c93466..4f0ca3f849d1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2696,7 +2696,8 @@ static int internal_get_user_pages_fast(unsigned long start, 
int nr_pages,
  	int nr_pinned = 0, ret = 0;

  	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
-				       FOLL_FORCE | FOLL_PIN | FOLL_GET)))
+				       FOLL_FORCE | FOLL_PIN | FOLL_GET |
+				       FOLL_FAST_ONLY)))
  		return -EINVAL;

  	start = untagged_addr(start) & PAGE_MASK;


>> <4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek 
>> snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel 
>> snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core 
>> ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei 
>> intel_lpss_pci prime_numbers
>> <4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G     U            
>> 5.7.0-rc5-CI-Patchwork_17704+ #1
>> <4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake 
>> U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
>> <4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
>> <4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44 
>> 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff 
>> ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
>> <4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
>> <4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
>> <4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
>> <4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
>> <4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
>> <4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
>> <4> [185.667806] FS:  00007f419e875e40(0000) GS:ffff88849fe00000(0000) 
>> knlGS:0000000000000000
>> <4> [185.667808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> <4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
>> <4> [185.667814] PKRU: 55555554
>> <4> [185.667816] Call Trace:
>> <4> [185.667912]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.667918]  ? mark_held_locks+0x49/0x70
>> <4> [185.667998]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.668073]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>>
>> and then panicked, across a range of systems.
>> -Chris
>>

btw, the panic seems to indicate an additional, pre-existing problem:
i915_gem_userptr_get_pages(), in this case at least, is not able to
recover from a get_user_pages/pin_user_pages failure.


thanks,
-- 
John Hubbard
NVIDIA
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID
From: John Hubbard <jhubbard@nvidia.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	dri-devel@lists.freedesktop.org, David Airlie <airlied@linux.ie>,
	intel-gfx@lists.freedesktop.org,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Souptick Joarder <jrdr.linux@gmail.com>,
	Matthew Auld <matthew.auld@intel.com>
Subject: [Intel-gfx] Solved: [PATCH 0/4] mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()
Date: Thu, 21 May 2020 13:40:39 -0700	[thread overview]
Message-ID: <7d79c089-7b21-cf7f-66ea-078d44c5e007@nvidia.com> (raw)
In-Reply-To: <b907c1d5-b95a-3d00-cafa-0a321f0141d8@nvidia.com>

On 2020-05-21 12:11, John Hubbard wrote:
> On 2020-05-21 11:57, Chris Wilson wrote:
>> Quoting John Hubbard (2020-05-19 01:21:20)
>>> This needs to go through Andrew's -mm tree, due to adding a new gup.c
>>> routine. However, I would really love to have some testing from the
>>> drm/i915 folks, because I haven't been able to run-time test that part
>>> of it.
>>
>> CI hit
>>
>> <4> [185.667750] WARNING: CPU: 0 PID: 1387 at mm/gup.c:2699 
>> internal_get_user_pages_fast+0x63a/0xac0


OK, what happened here is that it's WARN()'ing due to passing in the new
FOLL_FAST_ONLY flag, which was not added to the whitelist.

So the fix is easy, and should be applied to the refactoring patch. I'll
send out a v2 of the series, which will effectively have this applied:


diff --git a/mm/gup.c b/mm/gup.c
index 6cbe98c93466..4f0ca3f849d1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2696,7 +2696,8 @@ static int internal_get_user_pages_fast(unsigned long start, 
int nr_pages,
  	int nr_pinned = 0, ret = 0;

  	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
-				       FOLL_FORCE | FOLL_PIN | FOLL_GET)))
+				       FOLL_FORCE | FOLL_PIN | FOLL_GET |
+				       FOLL_FAST_ONLY)))
  		return -EINVAL;

  	start = untagged_addr(start) & PAGE_MASK;


>> <4> [185.667752] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek 
>> snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel 
>> snd_intel_dspcfg crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core 
>> ghash_clmulni_intel cdc_ether usbnet mii snd_pcm e1000e mei_me ptp pps_core mei 
>> intel_lpss_pci prime_numbers
>> <4> [185.667774] CPU: 0 PID: 1387 Comm: gem_userptr_bli Tainted: G     U            
>> 5.7.0-rc5-CI-Patchwork_17704+ #1
>> <4> [185.667777] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake 
>> U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
>> <4> [185.667782] RIP: 0010:internal_get_user_pages_fast+0x63a/0xac0
>> <4> [185.667785] Code: 24 40 08 48 39 5c 24 38 49 89 df 0f 85 74 fc ff ff 48 83 44 
>> 24 50 08 48 39 5c 24 58 49 89 dc 0f 85 e0 fb ff ff e9 14 fe ff ff <0f> 0b b8 ea ff 
>> ff ff e9 36 fb ff ff 4c 89 e8 48 21 e8 48 39 e8 0f
>> <4> [185.667789] RSP: 0018:ffffc90001133c38 EFLAGS: 00010206
>> <4> [185.667792] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8884999ee800
>> <4> [185.667795] RDX: 00000000000c0001 RSI: 0000000000000100 RDI: 00007f419e774000
>> <4> [185.667798] RBP: ffff888453dbf040 R08: 0000000000000000 R09: 0000000000000001
>> <4> [185.667800] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888453dbf380
>> <4> [185.667803] R13: ffff8884999ee800 R14: ffff888453dbf3e8 R15: 0000000000000040
>> <4> [185.667806] FS:  00007f419e875e40(0000) GS:ffff88849fe00000(0000) 
>> knlGS:0000000000000000
>> <4> [185.667808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> <4> [185.667811] CR2: 00007f419e873000 CR3: 0000000458bd2004 CR4: 0000000000760ef0
>> <4> [185.667814] PKRU: 55555554
>> <4> [185.667816] Call Trace:
>> <4> [185.667912]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.667918]  ? mark_held_locks+0x49/0x70
>> <4> [185.667998]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>> <4> [185.668073]  ? i915_gem_userptr_get_pages+0x1c6/0x290 [i915]
>>
>> and then panicked, across a range of systems.
>> -Chris
>>

btw, the panic seems to indicate an additional, pre-existing problem:
i915_gem_userptr_get_pages(), in this case at least, is not able to
recover from a get_user_pages/pin_user_pages failure.


thanks,
-- 
John Hubbard
NVIDIA
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2020-05-21 20:40 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-19  0:21 John Hubbard
2020-05-19  0:21 ` [Intel-gfx] " John Hubbard
2020-05-19  0:21 ` John Hubbard
2020-05-19  0:21 ` [PATCH 1/4] mm/gup: move __get_user_pages_fast() down a few lines in gup.c John Hubbard
2020-05-19  0:21   ` [Intel-gfx] " John Hubbard
2020-05-19  0:21   ` John Hubbard
2020-05-19  0:21 ` [PATCH 2/4] mm/gup: refactor and de-duplicate gup_fast() code John Hubbard
2020-05-19  0:21   ` [Intel-gfx] " John Hubbard
2020-05-19  0:21   ` John Hubbard
2020-05-19  0:21 ` [PATCH 3/4] mm/gup: introduce pin_user_pages_fast_only() John Hubbard
2020-05-19  0:21   ` [Intel-gfx] " John Hubbard
2020-05-19  0:21   ` John Hubbard
2020-05-19  0:21 ` [PATCH 4/4] drm/i915: convert get_user_pages() --> pin_user_pages() John Hubbard
2020-05-19  0:21   ` [Intel-gfx] " John Hubbard
2020-05-19  0:21   ` John Hubbard
2020-05-19  0:50 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages() Patchwork
2020-05-19  0:52 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-05-19  1:20 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-05-19  3:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2020-05-21 18:57 ` [PATCH 0/4] " Chris Wilson
2020-05-21 18:57   ` [Intel-gfx] " Chris Wilson
2020-05-21 18:57   ` Chris Wilson
2020-05-21 19:11   ` John Hubbard
2020-05-21 19:11     ` [Intel-gfx] " John Hubbard
2020-05-21 19:11     ` John Hubbard
2020-05-21 20:40     ` John Hubbard [this message]
2020-05-21 20:40       ` [Intel-gfx] Solved: " John Hubbard
2020-05-21 20:40       ` John Hubbard
2020-05-22 11:40 ` Souptick Joarder
2020-05-22 11:40   ` [Intel-gfx] " Souptick Joarder
2020-05-22 11:40   ` Souptick Joarder
2020-05-22 11:40   ` Souptick Joarder
2020-05-22 22:22   ` John Hubbard
2020-05-22 22:22     ` [Intel-gfx] " John Hubbard
2020-05-22 22:22     ` John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7d79c089-7b21-cf7f-66ea-078d44c5e007@nvidia.com \
    --to=jhubbard@nvidia.com \
    --cc=airlied@linux.ie \
    --cc=akpm@linux-foundation.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=jrdr.linux@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.auld@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=tvrtko.ursulin@intel.com \
    --cc=willy@infradead.org \
    --subject='Re: Solved: [PATCH 0/4] mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.