All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jani Nikula <jani.nikula@intel.com>
To: Jim Cromie <jim.cromie@gmail.com>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: ville.syrjala@linux.intel.com, daniel.vetter@ffwll.ch,
	seanpaul@chromium.org, robdclark@gmail.com,
	Jim Cromie <jim.cromie@gmail.com>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	Dave Airlie <airlied@gmail.com>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
Date: Fri, 03 Feb 2023 11:34:02 +0200	[thread overview]
Message-ID: <87a61v14ad.fsf@intel.com> (raw)
In-Reply-To: <20230125203743.564009-1-jim.cromie@gmail.com>

On Wed, 25 Jan 2023, Jim Cromie <jim.cromie@gmail.com> wrote:
> Hi everyone,
>
> In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> drivers at modprobe.

I realize we haven't actually addressed the regression in any way yet,
and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
trying to gather logs from users on v6.1 or later. It hampers debugging
pretty badly.

I appreciate the effort in fixing the problem properly here, but we'll
need a fix that we can backport to stable kernels.

Maybe just Ville's idea of

 config DRM_USE_DYNAMIC_DEBUG
        bool "use dynamic debug to implement drm.debug"
-       default y
+       default n
+       depends on BROKEN
        depends on DRM
        depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE

but we'll need that as a patch and merged and backported ASAP.

In the mean time, is there a workaround that the user could enable, say,
on the kernel command line, to enable drm debugs on driver kernel
modules, all the way from boot?


BR,
Jani.




>
> It is due to a chicken-egg problem loading modules; on `modprobe
> i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, this just worked.
>
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
> kparam callback on __drm_debug.  So with drm.ko loaded and initialized
> before the dependent modules, their debug callsites aren't yet present
> to be enabled.
>
> STATUS - v3
>
> not quite ready.
> rebased on -rc5, hopefully applies to patchwork head 
> still has RFC patch -> CI_ONLY temporary, to avoid panics
> boots on my amdgpu box, drm.debug=0x3ff works at boot-time
> the "toggled" warning is repeatable with test_dynamic_debug*.ko
> it also occurs on amdgpu, so not just artificial.
> v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cromie@gmail.com/
>
> OVERVIEW
>
> As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
> error-prone enough to call broken: sharing of a common classmap
> required identical classmap definitions in all modules using DRM_UT_*,
> which is inherently error-prone.  IOW, it muddled the K&R distinction
> between a (single) definition, and multiple references.
>
> So patches 10-13 split it into:
>
> DYNDBG_CLASSMAP_DEFINE	used once per subsystem to define each classmap.
> DYNDBG_CLASSMAP_USE	declare dependence on a DEFINEd classmap.
>
> DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
> (existing) __dyndbg_classes section, and exports the struct var
> (unlike DECLARE_DYNDBG_CLASSMAP).
>
> DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
> user-module-name, and a ref to the exported classmap var.
>
> The distinction allows separate treatment of classmaps and
> classmap-refs, the latter getting additional behavior to propagate
> parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 
>
> . lookup the classmap defn being referenced, and its module
> . find the module's kernel-params using the classmap
> . propagate kparam vals into the prdgs in module being added.
>
> It also makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary.
>
> Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.
>
> Patch-11 is the core of it; the separate treatment begins in
> ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
> handle class-defns; this adds ddebug_attach_client_module_classes(2)
> to handle class-refs, as they are found while modprobing drm
> drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
> referred classmap definition.
>
> (3) scans kernel-params owned by the module DEFINEing the classmap,
> either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.
>
> (4) looks for kparams which are wired to dyndbg's param-ops.  Those
> params have a struct ddebug_class_param attached, which has a classmap
> and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
> classmap is the same as from (2), then apply its state-var to the
> client module by calling ddebug_apply_class_bitmap().
>
> Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.
>
> Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
> symbols directly, not "DRM_UT_*" (their strings).  It adds new
> include/linux/map.h to support this.
>
> Patches 1-9 are prep, refactor, cleanup, tighten interfaces
>
> Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
> regression; it builds both test_dynamic_debug.ko and _submod.ko, with
> an ifdef to _DEFINE in the main module, and _USE in the submod.  This
> gives both modules identical set of prdbgs, which is helpful for
> comparing results.
>
> here it is, working properly:
>
> doing class DRM_UT_CORE -p
> [ 9904.961750] dyndbg: read 21 bytes from userspace
> [ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:*
> [ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p"
> [ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_CORE
> [ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE  module:drm nd:302 nc:1 nu:0
> [ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE  module:i915 nd:1659 nc:0 nu:1
> [ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE  module:amdgpu nd:4425 nc:0 nu:1
> [ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE  module:nouveau nd:103 nc:0 nu:1
> [ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs
> doing class DRM_UT_DRIVER +p
> [ 9906.151761] dyndbg: read 23 bytes from userspace
> [ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:*
> [ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p"
> [ 9906.153388] dyndbg: op='+' flags=0x1 maskp=0xffffffff
> [ 9906.153896] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_DRIVER
> [ 9906.154746] dyndbg: good-class: drm.DRM_UT_DRIVER  module:drm nd:302 nc:1 nu:0
> [ 9906.155433] dyndbg: class-ref: drm_kms_helper.DRM_UT_DRIVER  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9906.156267] dyndbg: class-ref: drm_display_helper.DRM_UT_DRIVER  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9906.157365] dyndbg: class-ref: i915.DRM_UT_DRIVER  module:i915 nd:1659 nc:0 nu:1
> [ 9906.163848] dyndbg: class-ref: amdgpu.DRM_UT_DRIVER  module:amdgpu nd:4425 nc:0 nu:1
> [ 9906.178963] dyndbg: class-ref: nouveau.DRM_UT_DRIVER  module:nouveau nd:103 nc:0 nu:1
> [ 9906.179934] dyndbg: processed 1 queries, with 1286 matches, 0 errs
>
>
> Patch-19 is a *workaround* for a panic: __jump_label_patch can "crash
> the box" when the jump-entry is in the wrong state.  The current code
> makes no distinction between a well-formed "toggled" state and an
> "insane" state.  Not for keeps.
>
> It fixes mis-initialization problems like this:
>
> [ 1594.032504] dyndbg: query 0: "class D2_DRIVER -p" mod:*
> [ 1594.032823] dyndbg: split into words: "class" "D2_DRIVER" "-p"
> [ 1594.033183] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 1594.033507] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=D2_DRIVER
> [ 1594.034014] dyndbg: good-class: test_dynamic_debug.D2_DRIVER  module:test_dynamic_debug nd:32 nc:4 nu:0
> [ 1594.034695] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug]do_cats p => _
> [ 1594.035304] dyndbg: class-ref: test_dynamic_debug_submod.D2_DRIVER  module:test_dynamic_debug_submod nd:32 nc:0 nu:4
> [ 1594.036052] jump_label: found toggled op at do_cats+0x16/0x180 [test_dynamic_debug_submod] [00000000ff2582ac] (0f 1f 44 00 00 != e9 e1 00 00 00)) size:5 type:0
> [ 1594.037036] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug_submod]do_cats p => _
> [ 1594.037604] dyndbg: processed 1 queries, with 2 matches, 0 errs
> [ 1594.037968] dyndbg: bit_1: 2 matches on class: D2_DRIVER -> 0x0
>
> These errors are reliably reproduced by a shell-func which modprobes
> (with the right args) the test mod & submod.ko (in the commit message).
>
> So this isnt really ready for inclusion, but Id like to send the whole
> set to the CI-gym for a workout.  The RFC/for-TESTING patch will
> mitigate panics, and still be detectable.
>
> Besides, Murphys law requires I publish some error before I can make progress.
>
>
> Jim Cromie (19):
>   test-dyndbg: fixup CLASSMAP usage error
>   test-dyndbg: show that DEBUG enables prdbgs at compiletime
>   dyndbg: replace classmap list with a vector
>   dyndbg: make ddebug_apply_class_bitmap more selective
>   dyndbg: split param_set_dyndbg_classes to inner/outer fns
>   dyndbg: drop NUM_TYPE_ARRAY
>   dyndbg: reduce verbose/debug clutter
>   dyndbg: tighten ddebug_class_name() 1st arg
>   dyndbg: constify ddebug_apply_class_bitmap args
>   dyndbg-API: split DECLARE_(DYNDBG_CLASSMAP) to $1(_DEFINE|_USE)
>   dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE)
>   dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args
>   dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements
>   drm_print: fix stale macro-name in comment
>   test-dyndbg: build test_dynamic_debug_submod
>   test-dyndbg: rename DD_SYS_WRAP to DYNDBG_CLASSMAP_PARAM
>   test-dyndbg: disable WIP dyndbg-trace params
>   test-dyndbg: tune sub-module behavior
>   jump_label: RFC / temporary for CI - tolerate toggled state
>
>  arch/x86/kernel/jump_label.c            |  24 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  14 +-
>  drivers/gpu/drm/display/drm_dp_helper.c |  14 +-
>  drivers/gpu/drm/drm_crtc_helper.c       |  14 +-
>  drivers/gpu/drm/drm_print.c             |  22 +-
>  drivers/gpu/drm/i915/i915_params.c      |  14 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  14 +-
>  include/asm-generic/vmlinux.lds.h       |   1 +
>  include/drm/drm_print.h                 |   6 +-
>  include/linux/dynamic_debug.h           |  57 ++++--
>  include/linux/map.h                     |  55 +++++
>  kernel/module/main.c                    |   3 +
>  lib/Makefile                            |   3 +-
>  lib/dynamic_debug.c                     | 258 ++++++++++++++++++------
>  lib/test_dynamic_debug.c                | 118 +++++++----
>  lib/test_dynamic_debug_submod.c         |  10 +
>  16 files changed, 437 insertions(+), 190 deletions(-)
>  create mode 100644 include/linux/map.h
>  create mode 100644 lib/test_dynamic_debug_submod.c

-- 
Jani Nikula, Intel Open Source Graphics Center

WARNING: multiple messages have this Message-ID (diff)
From: Jani Nikula <jani.nikula@intel.com>
To: Jim Cromie <jim.cromie@gmail.com>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: daniel.vetter@ffwll.ch, seanpaul@chromium.org,
	Thomas Zimmermann <tzimmermann@suse.de>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
Date: Fri, 03 Feb 2023 11:34:02 +0200	[thread overview]
Message-ID: <87a61v14ad.fsf@intel.com> (raw)
In-Reply-To: <20230125203743.564009-1-jim.cromie@gmail.com>

On Wed, 25 Jan 2023, Jim Cromie <jim.cromie@gmail.com> wrote:
> Hi everyone,
>
> In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> drivers at modprobe.

I realize we haven't actually addressed the regression in any way yet,
and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
trying to gather logs from users on v6.1 or later. It hampers debugging
pretty badly.

I appreciate the effort in fixing the problem properly here, but we'll
need a fix that we can backport to stable kernels.

Maybe just Ville's idea of

 config DRM_USE_DYNAMIC_DEBUG
        bool "use dynamic debug to implement drm.debug"
-       default y
+       default n
+       depends on BROKEN
        depends on DRM
        depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE

but we'll need that as a patch and merged and backported ASAP.

In the mean time, is there a workaround that the user could enable, say,
on the kernel command line, to enable drm debugs on driver kernel
modules, all the way from boot?


BR,
Jani.




>
> It is due to a chicken-egg problem loading modules; on `modprobe
> i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, this just worked.
>
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
> kparam callback on __drm_debug.  So with drm.ko loaded and initialized
> before the dependent modules, their debug callsites aren't yet present
> to be enabled.
>
> STATUS - v3
>
> not quite ready.
> rebased on -rc5, hopefully applies to patchwork head 
> still has RFC patch -> CI_ONLY temporary, to avoid panics
> boots on my amdgpu box, drm.debug=0x3ff works at boot-time
> the "toggled" warning is repeatable with test_dynamic_debug*.ko
> it also occurs on amdgpu, so not just artificial.
> v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cromie@gmail.com/
>
> OVERVIEW
>
> As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
> error-prone enough to call broken: sharing of a common classmap
> required identical classmap definitions in all modules using DRM_UT_*,
> which is inherently error-prone.  IOW, it muddled the K&R distinction
> between a (single) definition, and multiple references.
>
> So patches 10-13 split it into:
>
> DYNDBG_CLASSMAP_DEFINE	used once per subsystem to define each classmap.
> DYNDBG_CLASSMAP_USE	declare dependence on a DEFINEd classmap.
>
> DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
> (existing) __dyndbg_classes section, and exports the struct var
> (unlike DECLARE_DYNDBG_CLASSMAP).
>
> DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
> user-module-name, and a ref to the exported classmap var.
>
> The distinction allows separate treatment of classmaps and
> classmap-refs, the latter getting additional behavior to propagate
> parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 
>
> . lookup the classmap defn being referenced, and its module
> . find the module's kernel-params using the classmap
> . propagate kparam vals into the prdgs in module being added.
>
> It also makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary.
>
> Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.
>
> Patch-11 is the core of it; the separate treatment begins in
> ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
> handle class-defns; this adds ddebug_attach_client_module_classes(2)
> to handle class-refs, as they are found while modprobing drm
> drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
> referred classmap definition.
>
> (3) scans kernel-params owned by the module DEFINEing the classmap,
> either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.
>
> (4) looks for kparams which are wired to dyndbg's param-ops.  Those
> params have a struct ddebug_class_param attached, which has a classmap
> and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
> classmap is the same as from (2), then apply its state-var to the
> client module by calling ddebug_apply_class_bitmap().
>
> Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.
>
> Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
> symbols directly, not "DRM_UT_*" (their strings).  It adds new
> include/linux/map.h to support this.
>
> Patches 1-9 are prep, refactor, cleanup, tighten interfaces
>
> Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
> regression; it builds both test_dynamic_debug.ko and _submod.ko, with
> an ifdef to _DEFINE in the main module, and _USE in the submod.  This
> gives both modules identical set of prdbgs, which is helpful for
> comparing results.
>
> here it is, working properly:
>
> doing class DRM_UT_CORE -p
> [ 9904.961750] dyndbg: read 21 bytes from userspace
> [ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:*
> [ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p"
> [ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_CORE
> [ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE  module:drm nd:302 nc:1 nu:0
> [ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE  module:i915 nd:1659 nc:0 nu:1
> [ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE  module:amdgpu nd:4425 nc:0 nu:1
> [ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE  module:nouveau nd:103 nc:0 nu:1
> [ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs
> doing class DRM_UT_DRIVER +p
> [ 9906.151761] dyndbg: read 23 bytes from userspace
> [ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:*
> [ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p"
> [ 9906.153388] dyndbg: op='+' flags=0x1 maskp=0xffffffff
> [ 9906.153896] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_DRIVER
> [ 9906.154746] dyndbg: good-class: drm.DRM_UT_DRIVER  module:drm nd:302 nc:1 nu:0
> [ 9906.155433] dyndbg: class-ref: drm_kms_helper.DRM_UT_DRIVER  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9906.156267] dyndbg: class-ref: drm_display_helper.DRM_UT_DRIVER  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9906.157365] dyndbg: class-ref: i915.DRM_UT_DRIVER  module:i915 nd:1659 nc:0 nu:1
> [ 9906.163848] dyndbg: class-ref: amdgpu.DRM_UT_DRIVER  module:amdgpu nd:4425 nc:0 nu:1
> [ 9906.178963] dyndbg: class-ref: nouveau.DRM_UT_DRIVER  module:nouveau nd:103 nc:0 nu:1
> [ 9906.179934] dyndbg: processed 1 queries, with 1286 matches, 0 errs
>
>
> Patch-19 is a *workaround* for a panic: __jump_label_patch can "crash
> the box" when the jump-entry is in the wrong state.  The current code
> makes no distinction between a well-formed "toggled" state and an
> "insane" state.  Not for keeps.
>
> It fixes mis-initialization problems like this:
>
> [ 1594.032504] dyndbg: query 0: "class D2_DRIVER -p" mod:*
> [ 1594.032823] dyndbg: split into words: "class" "D2_DRIVER" "-p"
> [ 1594.033183] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 1594.033507] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=D2_DRIVER
> [ 1594.034014] dyndbg: good-class: test_dynamic_debug.D2_DRIVER  module:test_dynamic_debug nd:32 nc:4 nu:0
> [ 1594.034695] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug]do_cats p => _
> [ 1594.035304] dyndbg: class-ref: test_dynamic_debug_submod.D2_DRIVER  module:test_dynamic_debug_submod nd:32 nc:0 nu:4
> [ 1594.036052] jump_label: found toggled op at do_cats+0x16/0x180 [test_dynamic_debug_submod] [00000000ff2582ac] (0f 1f 44 00 00 != e9 e1 00 00 00)) size:5 type:0
> [ 1594.037036] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug_submod]do_cats p => _
> [ 1594.037604] dyndbg: processed 1 queries, with 2 matches, 0 errs
> [ 1594.037968] dyndbg: bit_1: 2 matches on class: D2_DRIVER -> 0x0
>
> These errors are reliably reproduced by a shell-func which modprobes
> (with the right args) the test mod & submod.ko (in the commit message).
>
> So this isnt really ready for inclusion, but Id like to send the whole
> set to the CI-gym for a workout.  The RFC/for-TESTING patch will
> mitigate panics, and still be detectable.
>
> Besides, Murphys law requires I publish some error before I can make progress.
>
>
> Jim Cromie (19):
>   test-dyndbg: fixup CLASSMAP usage error
>   test-dyndbg: show that DEBUG enables prdbgs at compiletime
>   dyndbg: replace classmap list with a vector
>   dyndbg: make ddebug_apply_class_bitmap more selective
>   dyndbg: split param_set_dyndbg_classes to inner/outer fns
>   dyndbg: drop NUM_TYPE_ARRAY
>   dyndbg: reduce verbose/debug clutter
>   dyndbg: tighten ddebug_class_name() 1st arg
>   dyndbg: constify ddebug_apply_class_bitmap args
>   dyndbg-API: split DECLARE_(DYNDBG_CLASSMAP) to $1(_DEFINE|_USE)
>   dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE)
>   dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args
>   dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements
>   drm_print: fix stale macro-name in comment
>   test-dyndbg: build test_dynamic_debug_submod
>   test-dyndbg: rename DD_SYS_WRAP to DYNDBG_CLASSMAP_PARAM
>   test-dyndbg: disable WIP dyndbg-trace params
>   test-dyndbg: tune sub-module behavior
>   jump_label: RFC / temporary for CI - tolerate toggled state
>
>  arch/x86/kernel/jump_label.c            |  24 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  14 +-
>  drivers/gpu/drm/display/drm_dp_helper.c |  14 +-
>  drivers/gpu/drm/drm_crtc_helper.c       |  14 +-
>  drivers/gpu/drm/drm_print.c             |  22 +-
>  drivers/gpu/drm/i915/i915_params.c      |  14 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  14 +-
>  include/asm-generic/vmlinux.lds.h       |   1 +
>  include/drm/drm_print.h                 |   6 +-
>  include/linux/dynamic_debug.h           |  57 ++++--
>  include/linux/map.h                     |  55 +++++
>  kernel/module/main.c                    |   3 +
>  lib/Makefile                            |   3 +-
>  lib/dynamic_debug.c                     | 258 ++++++++++++++++++------
>  lib/test_dynamic_debug.c                | 118 +++++++----
>  lib/test_dynamic_debug_submod.c         |  10 +
>  16 files changed, 437 insertions(+), 190 deletions(-)
>  create mode 100644 include/linux/map.h
>  create mode 100644 lib/test_dynamic_debug_submod.c

-- 
Jani Nikula, Intel Open Source Graphics Center

WARNING: multiple messages have this Message-ID (diff)
From: Jani Nikula <jani.nikula@intel.com>
To: Jim Cromie <jim.cromie@gmail.com>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: Jim Cromie <jim.cromie@gmail.com>,
	daniel.vetter@ffwll.ch, Maxime Ripard <mripard@kernel.org>,
	seanpaul@chromium.org, Thomas Zimmermann <tzimmermann@suse.de>,
	Greg KH <gregkh@linuxfoundation.org>,
	Dave Airlie <airlied@gmail.com>
Subject: Re: [Intel-gfx] [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
Date: Fri, 03 Feb 2023 11:34:02 +0200	[thread overview]
Message-ID: <87a61v14ad.fsf@intel.com> (raw)
In-Reply-To: <20230125203743.564009-1-jim.cromie@gmail.com>

On Wed, 25 Jan 2023, Jim Cromie <jim.cromie@gmail.com> wrote:
> Hi everyone,
>
> In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> drivers at modprobe.

I realize we haven't actually addressed the regression in any way yet,
and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
trying to gather logs from users on v6.1 or later. It hampers debugging
pretty badly.

I appreciate the effort in fixing the problem properly here, but we'll
need a fix that we can backport to stable kernels.

Maybe just Ville's idea of

 config DRM_USE_DYNAMIC_DEBUG
        bool "use dynamic debug to implement drm.debug"
-       default y
+       default n
+       depends on BROKEN
        depends on DRM
        depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE

but we'll need that as a patch and merged and backported ASAP.

In the mean time, is there a workaround that the user could enable, say,
on the kernel command line, to enable drm debugs on driver kernel
modules, all the way from boot?


BR,
Jani.




>
> It is due to a chicken-egg problem loading modules; on `modprobe
> i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, this just worked.
>
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
> kparam callback on __drm_debug.  So with drm.ko loaded and initialized
> before the dependent modules, their debug callsites aren't yet present
> to be enabled.
>
> STATUS - v3
>
> not quite ready.
> rebased on -rc5, hopefully applies to patchwork head 
> still has RFC patch -> CI_ONLY temporary, to avoid panics
> boots on my amdgpu box, drm.debug=0x3ff works at boot-time
> the "toggled" warning is repeatable with test_dynamic_debug*.ko
> it also occurs on amdgpu, so not just artificial.
> v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cromie@gmail.com/
>
> OVERVIEW
>
> As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
> error-prone enough to call broken: sharing of a common classmap
> required identical classmap definitions in all modules using DRM_UT_*,
> which is inherently error-prone.  IOW, it muddled the K&R distinction
> between a (single) definition, and multiple references.
>
> So patches 10-13 split it into:
>
> DYNDBG_CLASSMAP_DEFINE	used once per subsystem to define each classmap.
> DYNDBG_CLASSMAP_USE	declare dependence on a DEFINEd classmap.
>
> DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
> (existing) __dyndbg_classes section, and exports the struct var
> (unlike DECLARE_DYNDBG_CLASSMAP).
>
> DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
> user-module-name, and a ref to the exported classmap var.
>
> The distinction allows separate treatment of classmaps and
> classmap-refs, the latter getting additional behavior to propagate
> parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 
>
> . lookup the classmap defn being referenced, and its module
> . find the module's kernel-params using the classmap
> . propagate kparam vals into the prdgs in module being added.
>
> It also makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary.
>
> Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.
>
> Patch-11 is the core of it; the separate treatment begins in
> ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
> handle class-defns; this adds ddebug_attach_client_module_classes(2)
> to handle class-refs, as they are found while modprobing drm
> drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
> referred classmap definition.
>
> (3) scans kernel-params owned by the module DEFINEing the classmap,
> either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.
>
> (4) looks for kparams which are wired to dyndbg's param-ops.  Those
> params have a struct ddebug_class_param attached, which has a classmap
> and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
> classmap is the same as from (2), then apply its state-var to the
> client module by calling ddebug_apply_class_bitmap().
>
> Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.
>
> Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
> symbols directly, not "DRM_UT_*" (their strings).  It adds new
> include/linux/map.h to support this.
>
> Patches 1-9 are prep, refactor, cleanup, tighten interfaces
>
> Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
> regression; it builds both test_dynamic_debug.ko and _submod.ko, with
> an ifdef to _DEFINE in the main module, and _USE in the submod.  This
> gives both modules identical set of prdbgs, which is helpful for
> comparing results.
>
> here it is, working properly:
>
> doing class DRM_UT_CORE -p
> [ 9904.961750] dyndbg: read 21 bytes from userspace
> [ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:*
> [ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p"
> [ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_CORE
> [ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE  module:drm nd:302 nc:1 nu:0
> [ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE  module:i915 nd:1659 nc:0 nu:1
> [ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE  module:amdgpu nd:4425 nc:0 nu:1
> [ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE  module:nouveau nd:103 nc:0 nu:1
> [ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs
> doing class DRM_UT_DRIVER +p
> [ 9906.151761] dyndbg: read 23 bytes from userspace
> [ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:*
> [ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p"
> [ 9906.153388] dyndbg: op='+' flags=0x1 maskp=0xffffffff
> [ 9906.153896] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_DRIVER
> [ 9906.154746] dyndbg: good-class: drm.DRM_UT_DRIVER  module:drm nd:302 nc:1 nu:0
> [ 9906.155433] dyndbg: class-ref: drm_kms_helper.DRM_UT_DRIVER  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9906.156267] dyndbg: class-ref: drm_display_helper.DRM_UT_DRIVER  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9906.157365] dyndbg: class-ref: i915.DRM_UT_DRIVER  module:i915 nd:1659 nc:0 nu:1
> [ 9906.163848] dyndbg: class-ref: amdgpu.DRM_UT_DRIVER  module:amdgpu nd:4425 nc:0 nu:1
> [ 9906.178963] dyndbg: class-ref: nouveau.DRM_UT_DRIVER  module:nouveau nd:103 nc:0 nu:1
> [ 9906.179934] dyndbg: processed 1 queries, with 1286 matches, 0 errs
>
>
> Patch-19 is a *workaround* for a panic: __jump_label_patch can "crash
> the box" when the jump-entry is in the wrong state.  The current code
> makes no distinction between a well-formed "toggled" state and an
> "insane" state.  Not for keeps.
>
> It fixes mis-initialization problems like this:
>
> [ 1594.032504] dyndbg: query 0: "class D2_DRIVER -p" mod:*
> [ 1594.032823] dyndbg: split into words: "class" "D2_DRIVER" "-p"
> [ 1594.033183] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 1594.033507] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=D2_DRIVER
> [ 1594.034014] dyndbg: good-class: test_dynamic_debug.D2_DRIVER  module:test_dynamic_debug nd:32 nc:4 nu:0
> [ 1594.034695] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug]do_cats p => _
> [ 1594.035304] dyndbg: class-ref: test_dynamic_debug_submod.D2_DRIVER  module:test_dynamic_debug_submod nd:32 nc:0 nu:4
> [ 1594.036052] jump_label: found toggled op at do_cats+0x16/0x180 [test_dynamic_debug_submod] [00000000ff2582ac] (0f 1f 44 00 00 != e9 e1 00 00 00)) size:5 type:0
> [ 1594.037036] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug_submod]do_cats p => _
> [ 1594.037604] dyndbg: processed 1 queries, with 2 matches, 0 errs
> [ 1594.037968] dyndbg: bit_1: 2 matches on class: D2_DRIVER -> 0x0
>
> These errors are reliably reproduced by a shell-func which modprobes
> (with the right args) the test mod & submod.ko (in the commit message).
>
> So this isnt really ready for inclusion, but Id like to send the whole
> set to the CI-gym for a workout.  The RFC/for-TESTING patch will
> mitigate panics, and still be detectable.
>
> Besides, Murphys law requires I publish some error before I can make progress.
>
>
> Jim Cromie (19):
>   test-dyndbg: fixup CLASSMAP usage error
>   test-dyndbg: show that DEBUG enables prdbgs at compiletime
>   dyndbg: replace classmap list with a vector
>   dyndbg: make ddebug_apply_class_bitmap more selective
>   dyndbg: split param_set_dyndbg_classes to inner/outer fns
>   dyndbg: drop NUM_TYPE_ARRAY
>   dyndbg: reduce verbose/debug clutter
>   dyndbg: tighten ddebug_class_name() 1st arg
>   dyndbg: constify ddebug_apply_class_bitmap args
>   dyndbg-API: split DECLARE_(DYNDBG_CLASSMAP) to $1(_DEFINE|_USE)
>   dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE)
>   dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args
>   dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements
>   drm_print: fix stale macro-name in comment
>   test-dyndbg: build test_dynamic_debug_submod
>   test-dyndbg: rename DD_SYS_WRAP to DYNDBG_CLASSMAP_PARAM
>   test-dyndbg: disable WIP dyndbg-trace params
>   test-dyndbg: tune sub-module behavior
>   jump_label: RFC / temporary for CI - tolerate toggled state
>
>  arch/x86/kernel/jump_label.c            |  24 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  14 +-
>  drivers/gpu/drm/display/drm_dp_helper.c |  14 +-
>  drivers/gpu/drm/drm_crtc_helper.c       |  14 +-
>  drivers/gpu/drm/drm_print.c             |  22 +-
>  drivers/gpu/drm/i915/i915_params.c      |  14 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  14 +-
>  include/asm-generic/vmlinux.lds.h       |   1 +
>  include/drm/drm_print.h                 |   6 +-
>  include/linux/dynamic_debug.h           |  57 ++++--
>  include/linux/map.h                     |  55 +++++
>  kernel/module/main.c                    |   3 +
>  lib/Makefile                            |   3 +-
>  lib/dynamic_debug.c                     | 258 ++++++++++++++++++------
>  lib/test_dynamic_debug.c                | 118 +++++++----
>  lib/test_dynamic_debug_submod.c         |  10 +
>  16 files changed, 437 insertions(+), 190 deletions(-)
>  create mode 100644 include/linux/map.h
>  create mode 100644 lib/test_dynamic_debug_submod.c

-- 
Jani Nikula, Intel Open Source Graphics Center

WARNING: multiple messages have this Message-ID (diff)
From: Jani Nikula <jani.nikula@intel.com>
To: Jim Cromie <jim.cromie@gmail.com>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: Jim Cromie <jim.cromie@gmail.com>,
	daniel.vetter@ffwll.ch,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	robdclark@gmail.com, seanpaul@chromium.org,
	Thomas Zimmermann <tzimmermann@suse.de>,
	Greg KH <gregkh@linuxfoundation.org>,
	Dave Airlie <airlied@gmail.com>,
	ville.syrjala@linux.intel.com
Subject: Re: [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression
Date: Fri, 03 Feb 2023 11:34:02 +0200	[thread overview]
Message-ID: <87a61v14ad.fsf@intel.com> (raw)
In-Reply-To: <20230125203743.564009-1-jim.cromie@gmail.com>

On Wed, 25 Jan 2023, Jim Cromie <jim.cromie@gmail.com> wrote:
> Hi everyone,
>
> In v6.1 DRM_USE_DYNAMIC_DEBUG=y has a regression enabling drm.debug in
> drivers at modprobe.

I realize we haven't actually addressed the regression in any way yet,
and any distro enabling DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE will have
DRM_USE_DYNAMIC_DEBUG=y by default, and we're hitting the issue with
trying to gather logs from users on v6.1 or later. It hampers debugging
pretty badly.

I appreciate the effort in fixing the problem properly here, but we'll
need a fix that we can backport to stable kernels.

Maybe just Ville's idea of

 config DRM_USE_DYNAMIC_DEBUG
        bool "use dynamic debug to implement drm.debug"
-       default y
+       default n
+       depends on BROKEN
        depends on DRM
        depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE

but we'll need that as a patch and merged and backported ASAP.

In the mean time, is there a workaround that the user could enable, say,
on the kernel command line, to enable drm debugs on driver kernel
modules, all the way from boot?


BR,
Jani.




>
> It is due to a chicken-egg problem loading modules; on `modprobe
> i915`, drm is loaded 1st, and drm/parameters/debug is set.  When
> drm_debug_enabled() tested __drm_debug at runtime, this just worked.
>
> But with DRM_USE_DYNAMIC_DEBUG=y, the runtime test is replaced with a
> static_key for each drm_dbg/dyndbg callsite, enabled by dyndbg's
> kparam callback on __drm_debug.  So with drm.ko loaded and initialized
> before the dependent modules, their debug callsites aren't yet present
> to be enabled.
>
> STATUS - v3
>
> not quite ready.
> rebased on -rc5, hopefully applies to patchwork head 
> still has RFC patch -> CI_ONLY temporary, to avoid panics
> boots on my amdgpu box, drm.debug=0x3ff works at boot-time
> the "toggled" warning is repeatable with test_dynamic_debug*.ko
> it also occurs on amdgpu, so not just artificial.
> v2 is https://lore.kernel.org/lkml/20230113193016.749791-1-jim.cromie@gmail.com/
>
> OVERVIEW
>
> As Jani Nikula noted rather more gently, DECLARE_DYNDBG_CLASSMAP is
> error-prone enough to call broken: sharing of a common classmap
> required identical classmap definitions in all modules using DRM_UT_*,
> which is inherently error-prone.  IOW, it muddled the K&R distinction
> between a (single) definition, and multiple references.
>
> So patches 10-13 split it into:
>
> DYNDBG_CLASSMAP_DEFINE	used once per subsystem to define each classmap.
> DYNDBG_CLASSMAP_USE	declare dependence on a DEFINEd classmap.
>
> DYNDBG_CLASSMAP_DEFINE initializes the classmap, stores it into the
> (existing) __dyndbg_classes section, and exports the struct var
> (unlike DECLARE_DYNDBG_CLASSMAP).
>
> DYNDBG_CLASSMAP_USE initializes a class-ref struct, containing the
> user-module-name, and a ref to the exported classmap var.
>
> The distinction allows separate treatment of classmaps and
> classmap-refs, the latter getting additional behavior to propagate
> parent's kparam settings to USEr. (forex: drm.debug to drm-drivers) 
>
> . lookup the classmap defn being referenced, and its module
> . find the module's kernel-params using the classmap
> . propagate kparam vals into the prdgs in module being added.
>
> It also makes the weird coordinated-changes-by-identical-classmaps
> "feature" unnecessary.
>
> Patch-10 splits the DECLARE macro into DEFINE & USE, and updates uses.
>
> Patch-11 is the core of it; the separate treatment begins in
> ddebug_add_module().  It calls ddebug_attach_module_classes(1) to
> handle class-defns; this adds ddebug_attach_client_module_classes(2)
> to handle class-refs, as they are found while modprobing drm
> drivers. (2) calls ddebug_apply_parents_params(3) on each USEr's
> referred classmap definition.
>
> (3) scans kernel-params owned by the module DEFINEing the classmap,
> either builtin or loadable, calls ddebug_match_apply_kparam(4) on each.
>
> (4) looks for kparams which are wired to dyndbg's param-ops.  Those
> params have a struct ddebug_class_param attached, which has a classmap
> and a ref to a state-var (__drm_debug for DRM case).  If the kparam's
> classmap is the same as from (2), then apply its state-var to the
> client module by calling ddebug_apply_class_bitmap().
>
> Patch-12 cleans up DYNDBG_CLASSMAP_USE, dropping now unneeded args.
>
> Patch-13 improves DYNDBG_CLASSMAP_DEFINE, by accepting DRM_UT_*
> symbols directly, not "DRM_UT_*" (their strings).  It adds new
> include/linux/map.h to support this.
>
> Patches 1-9 are prep, refactor, cleanup, tighten interfaces
>
> Patches 15-18 extend test_dynamic_debug to recreate DRM's multi-module
> regression; it builds both test_dynamic_debug.ko and _submod.ko, with
> an ifdef to _DEFINE in the main module, and _USE in the submod.  This
> gives both modules identical set of prdbgs, which is helpful for
> comparing results.
>
> here it is, working properly:
>
> doing class DRM_UT_CORE -p
> [ 9904.961750] dyndbg: read 21 bytes from userspace
> [ 9904.962286] dyndbg: query 0: "class DRM_UT_CORE -p" mod:*
> [ 9904.962848] dyndbg: split into words: "class" "DRM_UT_CORE" "-p"
> [ 9904.963444] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 9904.963945] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_CORE
> [ 9904.964781] dyndbg: good-class: drm.DRM_UT_CORE  module:drm nd:302 nc:1 nu:0
> [ 9904.966411] dyndbg: class-ref: drm_kms_helper.DRM_UT_CORE  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9904.967265] dyndbg: class-ref: drm_display_helper.DRM_UT_CORE  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9904.968349] dyndbg: class-ref: i915.DRM_UT_CORE  module:i915 nd:1659 nc:0 nu:1
> [ 9904.969801] dyndbg: class-ref: amdgpu.DRM_UT_CORE  module:amdgpu nd:4425 nc:0 nu:1
> [ 9904.977079] dyndbg: class-ref: nouveau.DRM_UT_CORE  module:nouveau nd:103 nc:0 nu:1
> [ 9904.977830] dyndbg: processed 1 queries, with 507 matches, 0 errs
> doing class DRM_UT_DRIVER +p
> [ 9906.151761] dyndbg: read 23 bytes from userspace
> [ 9906.152241] dyndbg: query 0: "class DRM_UT_DRIVER +p" mod:*
> [ 9906.152793] dyndbg: split into words: "class" "DRM_UT_DRIVER" "+p"
> [ 9906.153388] dyndbg: op='+' flags=0x1 maskp=0xffffffff
> [ 9906.153896] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=DRM_UT_DRIVER
> [ 9906.154746] dyndbg: good-class: drm.DRM_UT_DRIVER  module:drm nd:302 nc:1 nu:0
> [ 9906.155433] dyndbg: class-ref: drm_kms_helper.DRM_UT_DRIVER  module:drm_kms_helper nd:95 nc:0 nu:1
> [ 9906.156267] dyndbg: class-ref: drm_display_helper.DRM_UT_DRIVER  module:drm_display_helper nd:150 nc:0 nu:1
> [ 9906.157365] dyndbg: class-ref: i915.DRM_UT_DRIVER  module:i915 nd:1659 nc:0 nu:1
> [ 9906.163848] dyndbg: class-ref: amdgpu.DRM_UT_DRIVER  module:amdgpu nd:4425 nc:0 nu:1
> [ 9906.178963] dyndbg: class-ref: nouveau.DRM_UT_DRIVER  module:nouveau nd:103 nc:0 nu:1
> [ 9906.179934] dyndbg: processed 1 queries, with 1286 matches, 0 errs
>
>
> Patch-19 is a *workaround* for a panic: __jump_label_patch can "crash
> the box" when the jump-entry is in the wrong state.  The current code
> makes no distinction between a well-formed "toggled" state and an
> "insane" state.  Not for keeps.
>
> It fixes mis-initialization problems like this:
>
> [ 1594.032504] dyndbg: query 0: "class D2_DRIVER -p" mod:*
> [ 1594.032823] dyndbg: split into words: "class" "D2_DRIVER" "-p"
> [ 1594.033183] dyndbg: op='-' flags=0x0 maskp=0xfffffffe
> [ 1594.033507] dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=D2_DRIVER
> [ 1594.034014] dyndbg: good-class: test_dynamic_debug.D2_DRIVER  module:test_dynamic_debug nd:32 nc:4 nu:0
> [ 1594.034695] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug]do_cats p => _
> [ 1594.035304] dyndbg: class-ref: test_dynamic_debug_submod.D2_DRIVER  module:test_dynamic_debug_submod nd:32 nc:0 nu:4
> [ 1594.036052] jump_label: found toggled op at do_cats+0x16/0x180 [test_dynamic_debug_submod] [00000000ff2582ac] (0f 1f 44 00 00 != e9 e1 00 00 00)) size:5 type:0
> [ 1594.037036] dyndbg: changed lib/test_dynamic_debug.c:156 [test_dynamic_debug_submod]do_cats p => _
> [ 1594.037604] dyndbg: processed 1 queries, with 2 matches, 0 errs
> [ 1594.037968] dyndbg: bit_1: 2 matches on class: D2_DRIVER -> 0x0
>
> These errors are reliably reproduced by a shell-func which modprobes
> (with the right args) the test mod & submod.ko (in the commit message).
>
> So this isnt really ready for inclusion, but Id like to send the whole
> set to the CI-gym for a workout.  The RFC/for-TESTING patch will
> mitigate panics, and still be detectable.
>
> Besides, Murphys law requires I publish some error before I can make progress.
>
>
> Jim Cromie (19):
>   test-dyndbg: fixup CLASSMAP usage error
>   test-dyndbg: show that DEBUG enables prdbgs at compiletime
>   dyndbg: replace classmap list with a vector
>   dyndbg: make ddebug_apply_class_bitmap more selective
>   dyndbg: split param_set_dyndbg_classes to inner/outer fns
>   dyndbg: drop NUM_TYPE_ARRAY
>   dyndbg: reduce verbose/debug clutter
>   dyndbg: tighten ddebug_class_name() 1st arg
>   dyndbg: constify ddebug_apply_class_bitmap args
>   dyndbg-API: split DECLARE_(DYNDBG_CLASSMAP) to $1(_DEFINE|_USE)
>   dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE)
>   dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args
>   dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements
>   drm_print: fix stale macro-name in comment
>   test-dyndbg: build test_dynamic_debug_submod
>   test-dyndbg: rename DD_SYS_WRAP to DYNDBG_CLASSMAP_PARAM
>   test-dyndbg: disable WIP dyndbg-trace params
>   test-dyndbg: tune sub-module behavior
>   jump_label: RFC / temporary for CI - tolerate toggled state
>
>  arch/x86/kernel/jump_label.c            |  24 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  14 +-
>  drivers/gpu/drm/display/drm_dp_helper.c |  14 +-
>  drivers/gpu/drm/drm_crtc_helper.c       |  14 +-
>  drivers/gpu/drm/drm_print.c             |  22 +-
>  drivers/gpu/drm/i915/i915_params.c      |  14 +-
>  drivers/gpu/drm/nouveau/nouveau_drm.c   |  14 +-
>  include/asm-generic/vmlinux.lds.h       |   1 +
>  include/drm/drm_print.h                 |   6 +-
>  include/linux/dynamic_debug.h           |  57 ++++--
>  include/linux/map.h                     |  55 +++++
>  kernel/module/main.c                    |   3 +
>  lib/Makefile                            |   3 +-
>  lib/dynamic_debug.c                     | 258 ++++++++++++++++++------
>  lib/test_dynamic_debug.c                | 118 +++++++----
>  lib/test_dynamic_debug_submod.c         |  10 +
>  16 files changed, 437 insertions(+), 190 deletions(-)
>  create mode 100644 include/linux/map.h
>  create mode 100644 lib/test_dynamic_debug_submod.c

-- 
Jani Nikula, Intel Open Source Graphics Center

  parent reply	other threads:[~2023-02-03  9:34 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-25 20:37 [PATCH v3 00/19] fix DRM_USE_DYNAMIC_DEBUG regression Jim Cromie
2023-01-25 20:37 ` Jim Cromie
2023-01-25 20:37 ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37 ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 01/19] test-dyndbg: fixup CLASSMAP usage error Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 02/19] test-dyndbg: show that DEBUG enables prdbgs at compiletime Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 03/19] dyndbg: replace classmap list with a vector Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 04/19] dyndbg: make ddebug_apply_class_bitmap more selective Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 05/19] dyndbg: split param_set_dyndbg_classes to inner/outer fns Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 06/19] dyndbg: drop NUM_TYPE_ARRAY Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 07/19] dyndbg: reduce verbose/debug clutter Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 08/19] dyndbg: tighten ddebug_class_name() 1st arg Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 09/19] dyndbg: constify ddebug_apply_class_bitmap args Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 10/19] dyndbg-API: split DECLARE_(DYNDBG_CLASSMAP) to $1(_DEFINE|_USE) Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 11/19] dyndbg-API: specialize DYNDBG_CLASSMAP_(DEFINE|USE) Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 12/19] dyndbg-API: DYNDBG_CLASSMAP_USE drop extra args Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 13/19] dyndbg-API: DYNDBG_CLASSMAP_DEFINE() improvements Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37 ` [PATCH v3 14/19] drm_print: fix stale macro-name in comment Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-02-11 19:24   ` jim.cromie
2023-02-11 19:24     ` jim.cromie
2023-02-11 19:24     ` [Intel-gfx] " jim.cromie
2023-02-11 19:24     ` jim.cromie
2023-01-25 20:37 ` [PATCH v3 15/19] test-dyndbg: build test_dynamic_debug_submod Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37 ` [PATCH v3 16/19] test-dyndbg: rename DD_SYS_WRAP to DYNDBG_CLASSMAP_PARAM Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37 ` [PATCH v3 17/19] test-dyndbg: disable WIP dyndbg-trace params Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37 ` [PATCH v3 18/19] test-dyndbg: tune sub-module behavior Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-25 20:37 ` [PATCH v3 19/19] jump_label: RFC / temporary for CI - tolerate toggled state Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` Jim Cromie
2023-01-25 20:37   ` [Intel-gfx] " Jim Cromie
2023-01-26  2:39 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for fix DRM_USE_DYNAMIC_DEBUG regression Patchwork
2023-01-26  2:57 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-01-26 13:27 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2023-02-03  9:34 ` Jani Nikula [this message]
2023-02-03  9:34   ` [PATCH v3 00/19] " Jani Nikula
2023-02-03  9:34   ` [Intel-gfx] " Jani Nikula
2023-02-03  9:34   ` Jani Nikula
2023-02-03 15:08   ` Stanislaw Gruszka
2023-02-03 15:08     ` Stanislaw Gruszka
2023-02-03 15:08     ` Stanislaw Gruszka
2023-02-03 15:08     ` [Intel-gfx] " Stanislaw Gruszka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a61v14ad.fsf@intel.com \
    --to=jani.nikula@intel.com \
    --cc=airlied@gmail.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-gvt-dev@lists.freedesktop.org \
    --cc=jim.cromie@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mripard@kernel.org \
    --cc=robdclark@gmail.com \
    --cc=seanpaul@chromium.org \
    --cc=tzimmermann@suse.de \
    --cc=ville.syrjala@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.