From: Chuck Zmudzinski <brchuckz@netscape.net>
To: Jan Beulich <jbeulich@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>,
	xen-devel@lists.xenproject.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, Juergen Gross <jgross@suse.com>
Subject: Re: [PATCH 2/2] x86/pat: add functions to query specific cache mode availability
Date: Fri, 20 May 2022 02:59:11 -0400
Message-ID: <8b1ebea5-7820-69c4-2e2b-9866d55bc180@netscape.net>
In-Reply-To: <a2e95587-418b-879f-2468-8699a6df4a6a@suse.com>

On 5/20/2022 2:05 AM, Jan Beulich wrote:
> On 20.05.2022 06:43, Chuck Zmudzinski wrote:
>> On 5/4/22 5:14 AM, Juergen Gross wrote:
>>> On 04.05.22 10:31, Jan Beulich wrote:
>>>> On 03.05.2022 15:22, Juergen Gross wrote:
>>>>> Some drivers are using pat_enabled() in order to test availability of
>>>>> special caching modes (WC and UC-). This will lead to false negatives
>>>>> in case the system was booted e.g. with the "nopat" variant and the
>>>>> BIOS did setup the PAT MSR supporting the queried mode, or if the
>>>>> system is running as a Xen PV guest.
>>>> ...
>>>>> Add test functions for those caching modes instead and use them at the
>>>>> appropriate places.
>>>>>
>>>>> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with
>>>>> pat_enabled()")
>>>>> Fixes: ae749c7ab475 ("PCI: Add arch_can_pci_mmap_wc() macro")
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ...
>>>>
>>>>> --- a/arch/x86/include/asm/pci.h
>>>>> +++ b/arch/x86/include/asm/pci.h
>>>>> @@ -94,7 +94,7 @@ int pcibios_set_irq_routing(struct pci_dev *dev,
>>>>> int pin, int irq);
>>>>>        #define HAVE_PCI_MMAP
>>>>> -#define arch_can_pci_mmap_wc()    pat_enabled()
>>>>> +#define arch_can_pci_mmap_wc()    x86_has_pat_wc()
>>>> Besides this and ...
>>>>
>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>>>> @@ -76,7 +76,7 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void
>>>>> *data,
>>>>>        if (args->flags & ~(I915_MMAP_WC))
>>>>>            return -EINVAL;
>>>>>    -    if (args->flags & I915_MMAP_WC && !pat_enabled())
>>>>> +    if (args->flags & I915_MMAP_WC && !x86_has_pat_wc())
>>>>>            return -ENODEV;
>>>>>          obj = i915_gem_object_lookup(file, args->handle);
>>>>> @@ -757,7 +757,7 @@ i915_gem_dumb_mmap_offset(struct drm_file *file,
>>>>>          if (HAS_LMEM(to_i915(dev)))
>>>>>            mmap_type = I915_MMAP_TYPE_FIXED;
>>>>> -    else if (pat_enabled())
>>>>> +    else if (x86_has_pat_wc())
>>>>>            mmap_type = I915_MMAP_TYPE_WC;
>>>>>        else if (!i915_ggtt_has_aperture(to_gt(i915)->ggtt))
>>>>>            return -ENODEV;
>>>>> @@ -813,7 +813,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device
>>>>> *dev, void *data,
>>>>>            break;
>>>>>          case I915_MMAP_OFFSET_WC:
>>>>> -        if (!pat_enabled())
>>>>> +        if (!x86_has_pat_wc())
>>>>>                return -ENODEV;
>>>>>            type = I915_MMAP_TYPE_WC;
>>>>>            break;
>>>>> @@ -823,7 +823,7 @@ i915_gem_mmap_offset_ioctl(struct drm_device
>>>>> *dev, void *data,
>>>>>            break;
>>>>>          case I915_MMAP_OFFSET_UC:
>>>>> -        if (!pat_enabled())
>>>>> +        if (!x86_has_pat_uc_minus())
>>>>>                return -ENODEV;
>>>>>            type = I915_MMAP_TYPE_UC;
>>>>>            break;
>>>> ... these uses there are several more. You say nothing on why those want
>>>> leaving unaltered. When preparing my earlier patch I did inspect them
>>>> and came to the conclusion that these all would also better observe the
>>>> adjusted behavior (or else I couldn't have left pat_enabled() as the
>>>> only
>>>> predicate). In fact, as said in the description of my earlier patch, in
>>>> my debugging I did find the use in i915_gem_object_pin_map() to be the
>>>> problematic one, which you leave alone.
>>> Oh, I missed that one, sorry.
>> That is why your patch would not fix my Haswell unless
>> it also touches i915_gem_object_pin_map() in
>> drivers/gpu/drm/i915/gem/i915_gem_pages.c
>>
>>> I wanted to be rather defensive in my changes, but I agree at least the
>>> case in arch_phys_wc_add() might want to be changed, too.
>> I think your approach needs to be more aggressive so it will fix
>> all the known false negatives introduced by bdd8b6c98239
>> such as the one in i915_gem_object_pin_map().
>>
>> I looked at Jan's approach and I think it would fix the issue
>> with my Haswell as long as I don't use the nopat option. I
>> really don't have a strong opinion on that question, but I
>> think the nopat option as a Linux kernel option, as opposed
>> to a hypervisor option, should only affect the kernel, and
>> if the hypervisor provides the pat feature, then the kernel
>> should not override that,
> Hmm, why would the kernel not be allowed to override that? Such
> an override would affect only the single domain where the
> kernel runs; other domains could take their own decisions.
>
> Also, for the sake of completeness: "nopat" used when running on
> bare metal has the same bad effect on system boot, so there
> pretty clearly is an error cleanup issue in the i915 driver. But
> that's orthogonal, and I expect the maintainers may not even care
> (but tell us "don't do that then").
>
> Jan
>
>> but because of the confusion,

As I wrote earlier, the confusion is over whether "nopat" means
the kernel drivers will not use PAT even if the firmware and
hypervisor provide it. I think you are correct to point out that
this is how the i915 driver behaved with the nopat option before
bdd8b6c98239 was applied, with the same bad effects on bare metal
as under the hypervisor. So dealing with the nopat option in order
to fix bdd8b6c98239 is perhaps a solution in search of a problem,
at least as regards the i915 driver.

The only problem we have, as I see it, is the false negative when
the nopat option is *not* given. The forced disabling of PAT in
Jan's patch when nopat is given is probably still needed, though,
if the goal of the patch is to preserve the behavior the i915
driver had before bdd8b6c98239 was applied.
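
To make the false negative concrete, here is a minimal userspace
sketch, not kernel code. It assumes only the architectural layout of
the PAT MSR (eight one-byte entries, memory type in the low three
bits). The helper names pat_entry() and pat_value_has_type() are
hypothetical, and the x86_has_pat_*() helpers in Juergen's patch may
well be implemented differently. The point is just that "is a given
caching mode present in the PAT value" is a separate question from
"did the kernel itself enable PAT":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Memory type encodings used in each PAT entry. */
enum pat_type {
    PAT_UC       = 0, /* uncacheable */
    PAT_WC       = 1, /* write-combining */
    PAT_WT       = 4, /* write-through */
    PAT_WP       = 5, /* write-protected */
    PAT_WB       = 6, /* write-back */
    PAT_UC_MINUS = 7, /* UC-, may be overridden by MTRRs */
};

/* Return the memory type held in one of the eight PAT entries. */
static unsigned int pat_entry(uint64_t pat, unsigned int slot)
{
    assert(slot < 8);
    return (pat >> (slot * 8)) & 0x7;
}

/* Hypothetical helper: does any PAT entry provide the requested type? */
static bool pat_value_has_type(uint64_t pat, enum pat_type type)
{
    for (unsigned int slot = 0; slot < 8; slot++)
        if (pat_entry(pat, slot) == (unsigned int)type)
            return true;
    return false;
}
```

With the power-on default value 0x0007040600070406 this reports UC-
as available but WC as missing, while a value that firmware or a
hypervisor has reprogrammed to contain a WC entry reports WC as
available, whatever a global "PAT enabled" flag says.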

In any case, and especially if we adopt Jan's more aggressive
approach of disabling PAT with the nopat option (preserving the
same bad behavior we had with nopat before bdd8b6c98239 was
applied), the i915 driver should log a warning when PAT is
disabled. Right now the driver hits the problem in
i915_gem_object_pin_map() and returns -ENODEV, but it does not log
an error. The only log message I get is from the add_taint_for_CI
in intel_gt_init, which was not much help in debugging this
problem. It was only the starting point of a longer debugging
process, precisely because the i915 driver logs no error here.
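
What I have in mind is roughly the following. This is a userspace
mock under stated assumptions, not a patch: enum map_type and
check_map_type() are made-up names, and fprintf() stands in for
whatever logging call the driver maintainers would prefer (perhaps
something like drm_warn()). It only shows the shape of "say why
before returning -ENODEV":

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the mapping types the driver distinguishes. */
enum map_type { MAP_WB, MAP_WC };

/*
 * Mock of a pin_map-style check: refuse a WC mapping when WC is not
 * available, but log the reason before returning -ENODEV.
 */
static int check_map_type(enum map_type type, bool wc_available)
{
    if (type == MAP_WC && !wc_available) {
        fprintf(stderr,
                "warning: WC mapping requested but PAT WC is unavailable\n");
        return -ENODEV;
    }
    return 0;
}
```

A single line like that in the log would have pointed straight at
the failing check instead of leaving only the taint message to go on.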

Chuck
