linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
@ 2019-02-10 11:09 Krzysztof Grygiencz
  2019-02-11 19:57 ` Jerome Glisse
  0 siblings, 1 reply; 13+ messages in thread
From: Krzysztof Grygiencz @ 2019-02-10 11:09 UTC (permalink / raw)
  To: dan.j.williams
  Cc: akpm, dri-devel, hch, jglisse, linux-kernel, linux-mm, logang,
	stable, torvalds

[-- Attachment #1: Type: text/plain, Size: 413 bytes --]

Dear Sir,

I'm using the Arch Linux distribution. After a kernel upgrade from 4.19.14 to
4.19.15 my X environment stopped working. I have an AMD HD3300 (RS780D)
graphics card. I bisected the kernel and found the failing commit:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.19.20&id=ec5471c92fb29ad848c81875840478be201eeb3f

I'm attaching the Xorg.0.log file.

Best Regards
Krzysztof Grygiencz

[-- Attachment #2: Xorg.0.log --]
[-- Type: text/x-log, Size: 12892 bytes --]

[    22.027] 
X.Org X Server 1.20.3
X Protocol Version 11, Revision 0
[    22.027] Build Operating System: Linux Arch Linux
[    22.027] Current Operating System: Linux h4xxx 4.19.15-1-lts #1 SMP Sun Jan 13 13:53:52 CET 2019 x86_64
[    22.027] Kernel command line: BOOT_IMAGE=../vmlinuz-linux-lts root=UUID=a6e22cbe-f0cd-4efc-b9c2-c9fe90e829d3 rw quiet loglevel=3 rd.systemd.show_status=auto rd.udev.log-priority=3 vt.global_cursor_default=0 net.ifnames=0 initrd=../initramfs-linux-lts.img
[    22.027] Build Date: 25 October 2018  04:42:32PM
[    22.027]  
[    22.027] Current version of pixman: 0.36.0
[    22.027] 	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
[    22.027] Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[    22.027] (==) Log file: "/var/log/Xorg.0.log", Time: Tue Jan 15 17:37:23 2019
[    22.027] (==) Using config directory: "/etc/X11/xorg.conf.d"
[    22.027] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[    22.028] (==) No Layout section.  Using the first Screen section.
[    22.028] (==) No screen section available. Using defaults.
[    22.028] (**) |-->Screen "Default Screen Section" (0)
[    22.028] (**) |   |-->Monitor "<default monitor>"
[    22.028] (==) No monitor specified for screen "Default Screen Section".
	Using a default monitor configuration.
[    22.028] (==) Automatically adding devices
[    22.028] (==) Automatically enabling devices
[    22.028] (==) Automatically adding GPU devices
[    22.028] (==) Automatically binding GPU devices
[    22.028] (==) Max clients allowed: 256, resource mask: 0x1fffff
[    22.028] (==) FontPath set to:
	/usr/share/fonts/misc,
	/usr/share/fonts/TTF,
	/usr/share/fonts/OTF,
	/usr/share/fonts/Type1,
	/usr/share/fonts/100dpi,
	/usr/share/fonts/75dpi
[    22.028] (==) ModulePath set to "/usr/lib/xorg/modules"
[    22.028] (II) The server relies on udev to provide the list of input devices.
	If no devices become available, reconfigure udev or disable AutoAddDevices.
[    22.028] (II) Module ABI versions:
[    22.028] 	X.Org ANSI C Emulation: 0.4
[    22.028] 	X.Org Video Driver: 24.0
[    22.028] 	X.Org XInput driver : 24.1
[    22.028] 	X.Org Server Extension : 10.0
[    22.029] (++) using VT number 7

[    22.029] (II) systemd-logind: logind integration requires -keeptty and -keeptty was not provided, disabling logind integration
[    22.031] (--) PCI:*(1@0:5:0) 1002:9614:1043:834d rev 0, Mem @ 0xd0000000/268435456, 0xfe0f0000/65536, 0xfdf00000/1048576, I/O @ 0x0000c000/256, BIOS @ 0x????????/131072
[    22.032] (II) Open ACPI successful (/var/run/acpid.socket)
[    22.032] (II) LoadModule: "glx"
[    22.032] (II) Loading /usr/lib/xorg/modules/extensions/libglx.so
[    22.034] (II) Module glx: vendor="X.Org Foundation"
[    22.034] 	compiled for 1.20.3, module version = 1.0.0
[    22.034] 	ABI class: X.Org Server Extension, version 10.0
[    22.034] (==) Matched ati as autoconfigured driver 0
[    22.034] (==) Matched modesetting as autoconfigured driver 1
[    22.034] (==) Matched fbdev as autoconfigured driver 2
[    22.034] (==) Matched vesa as autoconfigured driver 3
[    22.034] (==) Assigned the driver to the xf86ConfigLayout
[    22.034] (II) LoadModule: "ati"
[    22.034] (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so
[    22.034] (II) Module ati: vendor="X.Org Foundation"
[    22.034] 	compiled for 1.20.1, module version = 18.1.0
[    22.034] 	Module class: X.Org Video Driver
[    22.034] 	ABI class: X.Org Video Driver, version 24.0
[    22.034] (II) LoadModule: "radeon"
[    22.034] (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so
[    22.035] (II) Module radeon: vendor="X.Org Foundation"
[    22.035] 	compiled for 1.20.1, module version = 18.1.0
[    22.035] 	Module class: X.Org Video Driver
[    22.035] 	ABI class: X.Org Video Driver, version 24.0
[    22.035] (II) LoadModule: "modesetting"
[    22.035] (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
[    22.035] (II) Module modesetting: vendor="X.Org Foundation"
[    22.035] 	compiled for 1.20.3, module version = 1.20.3
[    22.035] 	Module class: X.Org Video Driver
[    22.035] 	ABI class: X.Org Video Driver, version 24.0
[    22.035] (II) LoadModule: "fbdev"
[    22.035] (WW) Warning, couldn't open module fbdev
[    22.035] (EE) Failed to load module "fbdev" (module does not exist, 0)
[    22.035] (II) LoadModule: "vesa"
[    22.035] (WW) Warning, couldn't open module vesa
[    22.035] (EE) Failed to load module "vesa" (module does not exist, 0)
[    22.035] (II) RADEON: Driver for ATI/AMD Radeon chipsets:
	ATI Radeon Mobility X600 (M24), ATI FireMV 2400,
	ATI Radeon Mobility X300 (M24), ATI FireGL M24 GL,
	ATI Radeon X600 (RV380), ATI FireGL V3200 (RV380),
	ATI Radeon IGP320 (A3), ATI Radeon IGP330/340/350 (A4),
	ATI Radeon 9500, ATI Radeon 9600TX, ATI FireGL Z1, ATI Radeon 9800SE,
	ATI Radeon 9800, ATI FireGL X2, ATI Radeon 9600, ATI Radeon 9600SE,
	ATI Radeon 9600XT, ATI FireGL T2, ATI Radeon 9650, ATI FireGL RV360,
	ATI Radeon 7000 IGP (A4+), ATI Radeon 8500 AIW,
	ATI Radeon IGP320M (U1), ATI Radeon IGP330M/340M/350M (U2),
	ATI Radeon Mobility 7000 IGP, ATI Radeon 9000/PRO, ATI Radeon 9000,
	ATI Radeon X800 (R420), ATI Radeon X800PRO (R420),
	ATI Radeon X800SE (R420), ATI FireGL X3 (R420),
	ATI Radeon Mobility 9800 (M18), ATI Radeon X800 SE (R420),
	ATI Radeon X800XT (R420), ATI Radeon X800 VE (R420),
	ATI Radeon X850 (R480), ATI Radeon X850 XT (R480),
	ATI Radeon X850 SE (R480), ATI Radeon X850 PRO (R480),
	ATI Radeon X850 XT PE (R480), ATI Radeon Mobility M7,
	ATI Mobility FireGL 7800 M7, ATI Radeon Mobility M6,
	ATI FireGL Mobility 9000 (M9), ATI Radeon Mobility 9000 (M9),
	ATI Radeon 9700 Pro, ATI Radeon 9700/9500Pro, ATI FireGL X1,
	ATI Radeon 9800PRO, ATI Radeon 9800XT,
	ATI Radeon Mobility 9600/9700 (M10/M11),
	ATI Radeon Mobility 9600 (M10), ATI Radeon Mobility 9600 (M11),
	ATI FireGL Mobility T2 (M10), ATI FireGL Mobility T2e (M11),
	ATI Radeon, ATI FireGL 8700/8800, ATI Radeon 8500, ATI Radeon 9100,
	ATI Radeon 7500, ATI Radeon VE/7000, ATI ES1000,
	ATI Radeon Mobility X300 (M22), ATI Radeon Mobility X600 SE (M24C),
	ATI FireGL M22 GL, ATI Radeon X800 (R423), ATI Radeon X800PRO (R423),
	ATI Radeon X800LE (R423), ATI Radeon X800SE (R423),
	ATI Radeon X800 XTP (R430), ATI Radeon X800 XL (R430),
	ATI Radeon X800 SE (R430), ATI Radeon X800 (R430),
	ATI FireGL V7100 (R423), ATI FireGL V5100 (R423),
	ATI FireGL unknown (R423), ATI Mobility FireGL V5000 (M26),
	ATI Mobility Radeon X700 XL (M26), ATI Mobility Radeon X700 (M26),
	ATI Radeon X550XTX, ATI Radeon 9100 IGP (A5),
	ATI Radeon Mobility 9100 IGP (U3), ATI Radeon XPRESS 200,
	ATI Radeon XPRESS 200M, ATI Radeon 9250, ATI Radeon 9200,
	ATI Radeon 9200SE, ATI FireMV 2200, ATI Radeon X300 (RV370),
	ATI Radeon X600 (RV370), ATI Radeon X550 (RV370),
	ATI FireGL V3100 (RV370), ATI FireMV 2200 PCIE (RV370),
	ATI Radeon Mobility 9200 (M9+), ATI Mobility Radeon X800 XT (M28),
	ATI Mobility FireGL V5100 (M28), ATI Mobility Radeon X800 (M28),
	ATI Radeon X850, ATI unknown Radeon / FireGL (R480),
	ATI Radeon X800XT (R423), ATI FireGL V5000 (RV410),
	ATI Radeon X700 XT (RV410), ATI Radeon X700 PRO (RV410),
	ATI Radeon X700 SE (RV410), ATI Radeon X700 (RV410),
	ATI Radeon X1800, ATI Mobility Radeon X1800 XT,
	ATI Mobility Radeon X1800, ATI Mobility FireGL V7200,
	ATI FireGL V7200, ATI FireGL V5300, ATI Mobility FireGL V7100,
	ATI FireGL V7300, ATI FireGL V7350, ATI Radeon X1600, ATI RV505,
	ATI Radeon X1300/X1550, ATI Radeon X1550, ATI M54-GL,
	ATI Mobility Radeon X1400, ATI Radeon X1550 64-bit,
	ATI Mobility Radeon X1300, ATI Radeon X1300, ATI FireGL V3300,
	ATI FireGL V3350, ATI Mobility Radeon X1450,
	ATI Mobility Radeon X2300, ATI Mobility Radeon X1350,
	ATI FireMV 2250, ATI Radeon X1650, ATI Mobility FireGL V5200,
	ATI Mobility Radeon X1600, ATI Radeon X1300 XT/X1600 Pro,
	ATI FireGL V3400, ATI Mobility FireGL V5250,
	ATI Mobility Radeon X1700, ATI Mobility Radeon X1700 XT,
	ATI FireGL V5200, ATI Radeon X2300HD, ATI Mobility Radeon HD 2300,
	ATI Radeon X1950, ATI Radeon X1900, ATI AMD Stream Processor,
	ATI RV560, ATI Mobility Radeon X1900, ATI Radeon X1950 GT, ATI RV570,
	ATI FireGL V7400, ATI Radeon 9100 PRO IGP,
	ATI Radeon Mobility 9200 IGP, ATI Radeon X1200, ATI RS740,
	ATI RS740M, ATI Radeon HD 2900 XT, ATI Radeon HD 2900 Pro,
	ATI Radeon HD 2900 GT, ATI FireGL V8650, ATI FireGL V8600,
	ATI FireGL V7600, ATI Radeon 4800 Series, ATI Radeon HD 4870 x2,
	ATI Radeon HD 4850 x2, ATI FirePro V8750 (FireGL),
	ATI FirePro V7760 (FireGL), ATI Mobility RADEON HD 4850,
	ATI Mobility RADEON HD 4850 X2, ATI FirePro RV770,
	AMD FireStream 9270, AMD FireStream 9250, ATI FirePro V8700 (FireGL),
	ATI Mobility RADEON HD 4870, ATI Mobility RADEON M98,
	ATI FirePro M7750, ATI M98, ATI Mobility Radeon HD 4650,
	ATI Radeon RV730 (AGP), ATI Mobility Radeon HD 4670,
	ATI FirePro M5750, ATI RV730XT [Radeon HD 4670], ATI RADEON E4600,
	ATI Radeon HD 4600 Series, ATI RV730 PRO [Radeon HD 4650],
	ATI FirePro V7750 (FireGL), ATI FirePro V5700 (FireGL),
	ATI FirePro V3750 (FireGL), ATI Mobility Radeon HD 4830,
	ATI Mobility Radeon HD 4850, ATI FirePro M7740, ATI RV740,
	ATI Radeon HD 4770, ATI Radeon HD 4700 Series, ATI RV610,
	ATI Radeon HD 2400 XT, ATI Radeon HD 2400 Pro,
	ATI Radeon HD 2400 PRO AGP, ATI FireGL V4000, ATI Radeon HD 2350,
	ATI Mobility Radeon HD 2400 XT, ATI Mobility Radeon HD 2400,
	ATI RADEON E2400, ATI FireMV 2260, ATI RV670, ATI Radeon HD3870,
	ATI Mobility Radeon HD 3850, ATI Radeon HD3850,
	ATI Mobility Radeon HD 3850 X2, ATI Mobility Radeon HD 3870,
	ATI Mobility Radeon HD 3870 X2, ATI Radeon HD3870 X2,
	ATI FireGL V7700, ATI Radeon HD3690, AMD Firestream 9170,
	ATI Radeon HD 4550, ATI Radeon RV710, ATI Radeon HD 4350,
	ATI Mobility Radeon 4300 Series, ATI Mobility Radeon 4500 Series,
	ATI FirePro RG220, ATI Mobility Radeon 4330, ATI RV630,
	ATI Mobility Radeon HD 2600, ATI Mobility Radeon HD 2600 XT,
	ATI Radeon HD 2600 XT AGP, ATI Radeon HD 2600 Pro AGP,
	ATI Radeon HD 2600 XT, ATI Radeon HD 2600 Pro, ATI Gemini RV630,
	ATI Gemini Mobility Radeon HD 2600 XT, ATI FireGL V5600,
	ATI FireGL V3600, ATI Radeon HD 2600 LE,
	ATI Mobility FireGL Graphics Processor, ATI Radeon HD 3470,
	ATI Mobility Radeon HD 3430, ATI Mobility Radeon HD 3400 Series,
	ATI Radeon HD 3450, ATI Radeon HD 3430, ATI FirePro V3700,
	ATI FireMV 2450, ATI Radeon HD 3600 Series, ATI Radeon HD 3650 AGP,
	ATI Radeon HD 3600 PRO, ATI Radeon HD 3600 XT,
	ATI Mobility Radeon HD 3650, ATI Mobility Radeon HD 3670,
	ATI Mobility FireGL V5700, ATI Mobility FireGL V5725,
	ATI Radeon HD 3200 Graphics, ATI Radeon 3100 Graphics,
	ATI Radeon HD 3300 Graphics, ATI Radeon 3000 Graphics, SUMO, SUMO2,
	ATI Radeon HD 4200, ATI Radeon 4100, ATI Mobility Radeon HD 4200,
	ATI Mobility Radeon 4100, ATI Radeon HD 4290, ATI Radeon HD 4250,
	AMD Radeon HD 6310 Graphics, AMD Radeon HD 6250 Graphics,
	AMD Radeon HD 6300 Series Graphics,
	AMD Radeon HD 6200 Series Graphics, PALM, CYPRESS,
	ATI FirePro (FireGL) Graphics Adapter, AMD Firestream 9370,
	AMD Firestream 9350, ATI Radeon HD 5800 Series,
	ATI Radeon HD 5900 Series, ATI Mobility Radeon HD 5800 Series,
	ATI Radeon HD 5700 Series, ATI Radeon HD 6700 Series,
	ATI Mobility Radeon HD 5000 Series, ATI Mobility Radeon HD 5570,
	ATI Radeon HD 5670, ATI Radeon HD 5570, ATI Radeon HD 5500 Series,
	REDWOOD, ATI Mobility Radeon Graphics, CEDAR, ATI FirePro 2270,
	ATI Radeon HD 5450, CAYMAN, AMD Radeon HD 6900 Series,
	AMD Radeon HD 6900M Series, Mobility Radeon HD 6000 Series, BARTS,
	AMD Radeon HD 6800 Series, AMD Radeon HD 6700 Series, TURKS, CAICOS,
	ARUBA, TAHITI, PITCAIRN, VERDE, OLAND, HAINAN, BONAIRE, KABINI,
	MULLINS, KAVERI, HAWAII
[    22.037] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[    22.038] (II) [KMS] drm report modesetting isn't supported.
[    22.038] (EE) open /dev/dri/card0: No such file or directory
[    22.038] (WW) Falling back to old probe method for modesetting
[    22.038] (EE) open /dev/dri/card0: No such file or directory
[    22.038] (EE) Screen 0 deleted because of no matching config section.
[    22.038] (II) UnloadModule: "radeon"
[    22.038] (EE) Screen 0 deleted because of no matching config section.
[    22.038] (II) UnloadModule: "modesetting"
[    22.038] (EE) Device(s) detected, but none match those in the config file.
[    22.038] (EE) 
Fatal server error:
[    22.038] (EE) no screens found(EE) 
[    22.038] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[    22.038] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[    22.038] (EE) 
[    22.038] (EE) Server terminated with error (1). Closing log file.


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2019-02-10 11:09 [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling Krzysztof Grygiencz
@ 2019-02-11 19:57 ` Jerome Glisse
  0 siblings, 0 replies; 13+ messages in thread
From: Jerome Glisse @ 2019-02-11 19:57 UTC (permalink / raw)
  To: Krzysztof Grygiencz
  Cc: dan.j.williams, akpm, dri-devel, hch, linux-kernel, linux-mm,
	logang, stable, torvalds

On Sun, Feb 10, 2019 at 12:09:08PM +0100, Krzysztof Grygiencz wrote:
> Dear Sir,
> 
> I'm using ArchLinux distribution. After kernel upgrade from 4.19.14 to
> 4.19.15 my X environment stopped working. I have AMD HD3300 (RS780D)
> graphics card. I have bisected kernel and found a failing commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.19.20&id=ec5471c92fb29ad848c81875840478be201eeb3f

This is a false positive, you should skip that commit. It will not impact
the GPU driver for your specific GPUs. My advice is to first bisect on
drivers/gpu/drm/radeon only.

Cheers,
Jérôme


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-30 22:34                   ` Logan Gunthorpe
@ 2018-11-30 22:47                     ` Dan Williams
  0 siblings, 0 replies; 13+ messages in thread
From: Dan Williams @ 2018-11-30 22:47 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates

On Fri, Nov 30, 2018 at 2:34 PM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2018-11-30 3:28 p.m., Dan Williams wrote:
> > On Fri, Nov 30, 2018 at 2:19 PM Logan Gunthorpe <logang@deltatee.com> wrote:
> >>
> >> Hey,
> >>
> >> On 2018-11-29 11:51 a.m., Dan Williams wrote:
> >>> Got it, let me see how bad moving arch_remove_memory() turns out,
> >>> sounds like a decent approach to coordinate multiple users of a single
> >>> ref.
> >>
> >> I've put together a patch set[1] that fixes all the users of
> >> devm_memremap_pages() without moving arch_remove_memory(). It's pretty
> >> clean except for the p2pdma case which is fairly tricky but I don't
> >> think there's an easy way around that.
> >
> > The solution I'm trying is to introduce a devm_memremap_pages_remove()
> > that each user can call after they have called percpu_ref_exit(), it's
> > just crashing for me currently...
>
> Ok, that's probably less of a clean up for other users, but sounds like
> it would be less tricky for p2pdma. I'd have to create a list of all
> pgmaps, but that's not so hard and doesn't create any nasty races to
> consider like my current solution.
>
> >> If you come up with a better solution that's great, otherwise let me
> >> know and I'll do some clean up and more testing and send this set to the
> >> lists. Though, we might need to wait for your patch to land before we
> >> can properly send the fix to it (the first patch in my series)...
> >
> > I'd say go ahead and send it. We can fix p2pdma as a follow-on. Send
> > it to Andrew as a patch relative to the current -next tree.
>
> Ok, though, how do I reference the current patch in Andrew's tree? Or
> does it matter?

I would just let Andrew know that this applies incrementally to
"mm-hmm-mark-hmm_devmem_add-add_resource-export_symbol_gpl.patch" in
his tree. You can't specify Fixes: tags for pending patches in -mm.
Andrew may choose to squash the change into the existing patch, which
may be the best outcome for not exposing a bisect regression point for
p2pdma.


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-30 22:28                 ` Dan Williams
@ 2018-11-30 22:34                   ` Logan Gunthorpe
  2018-11-30 22:47                     ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Logan Gunthorpe @ 2018-11-30 22:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates



On 2018-11-30 3:28 p.m., Dan Williams wrote:
> On Fri, Nov 30, 2018 at 2:19 PM Logan Gunthorpe <logang@deltatee.com> wrote:
>>
>> Hey,
>>
>> On 2018-11-29 11:51 a.m., Dan Williams wrote:
>>> Got it, let me see how bad moving arch_remove_memory() turns out,
>>> sounds like a decent approach to coordinate multiple users of a single
>>> ref.
>>
>> I've put together a patch set[1] that fixes all the users of
>> devm_memremap_pages() without moving arch_remove_memory(). It's pretty
>> clean except for the p2pdma case which is fairly tricky but I don't
>> think there's an easy way around that.
> 
> The solution I'm trying is to introduce a devm_memremap_pages_remove()
> that each user can call after they have called percpu_ref_exit(), it's
> just crashing for me currently...

Ok, that's probably less of a clean up for other users, but sounds like
it would be less tricky for p2pdma. I'd have to create a list of all
pgmaps, but that's not so hard and doesn't create any nasty races to
consider like my current solution.

>> If you come up with a better solution that's great, otherwise let me
>> know and I'll do some clean up and more testing and send this set to the
>> lists. Though, we might need to wait for your patch to land before we
>> can properly send the fix to it (the first patch in my series)...
> 
> I'd say go ahead and send it. We can fix p2pdma as a follow-on. Send
> it to Andrew as a patch relative to the current -next tree.

Ok, though, how do I reference the current patch in Andrew's tree? Or
does it matter?

Logan


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-30 22:19               ` Logan Gunthorpe
@ 2018-11-30 22:28                 ` Dan Williams
  2018-11-30 22:34                   ` Logan Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2018-11-30 22:28 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates

On Fri, Nov 30, 2018 at 2:19 PM Logan Gunthorpe <logang@deltatee.com> wrote:
>
> Hey,
>
> On 2018-11-29 11:51 a.m., Dan Williams wrote:
> > Got it, let me see how bad moving arch_remove_memory() turns out,
> > sounds like a decent approach to coordinate multiple users of a single
> > ref.
>
> I've put together a patch set[1] that fixes all the users of
> devm_memremap_pages() without moving arch_remove_memory(). It's pretty
> clean except for the p2pdma case which is fairly tricky but I don't
> think there's an easy way around that.

The solution I'm trying is to introduce a devm_memremap_pages_remove()
that each user can call after they have called percpu_ref_exit(), it's
just crashing for me currently...

> If you come up with a better solution that's great, otherwise let me
> know and I'll do some clean up and more testing and send this set to the
> lists. Though, we might need to wait for your patch to land before we
> can properly send the fix to it (the first patch in my series)...

I'd say go ahead and send it. We can fix p2pdma as a follow-on. Send
it to Andrew as a patch relative to the current -next tree.


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-29 18:51             ` Dan Williams
@ 2018-11-30 22:19               ` Logan Gunthorpe
  2018-11-30 22:28                 ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Logan Gunthorpe @ 2018-11-30 22:19 UTC (permalink / raw)
  To: Dan Williams
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates

Hey,

On 2018-11-29 11:51 a.m., Dan Williams wrote:
> Got it, let me see how bad moving arch_remove_memory() turns out,
> sounds like a decent approach to coordinate multiple users of a single
> ref.

I've put together a patch set[1] that fixes all the users of
devm_memremap_pages() without moving arch_remove_memory(). It's pretty
clean except for the p2pdma case which is fairly tricky but I don't
think there's an easy way around that.

If you come up with a better solution that's great, otherwise let me
know and I'll do some clean up and more testing and send this set to the
lists. Though, we might need to wait for your patch to land before we
can properly send the fix to it (the first patch in my series)...

Logan

[1] https://github.com/sbates130272/linux-p2pmem/ memremap_fix



* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-29 17:50           ` Logan Gunthorpe
@ 2018-11-29 18:51             ` Dan Williams
  2018-11-30 22:19               ` Logan Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2018-11-29 18:51 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates

On Thu, Nov 29, 2018 at 9:51 AM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2018-11-29 10:30 a.m., Dan Williams wrote:
> > Oh! Yes, nice find. We need to wait for the percpu-ref to be dead and
> > all outstanding references dropped before we can proceed to
> > arch_remove_memory(), and I think this problem has been there since
> > day one because the final exit was always after devm_memremap_pages()
> > release which means arch_remove_memory() was always racing any final
> > put_page(). I'll take a look, it seems the arch_remove_memory() call
> > needs to be moved out-of-line to its own context and wait for the
> > final exit of the percpu-ref.
>
> Ok, well I thought moving the wait_for_completion() into the kill() call
> was a pretty good solution to this.

True, it is...

> Though, if we move the
> arch_remove_memory() into a different context, it *may* help with the
> problem below...

Glad to see my over-engineered proposal in this case might be good for
something...

>
> >> Though, now that I look at it, the current change in question will be
> >> wrong if there are two devm_memremap_pages_release()s to call. Both need
> >> to drop their references before we can wait_for_completion() ;(. I guess
> >> I need multiple percpu_refs or more complex changes to
> >> devm_memremap_pages_release().
> >
> > Can you just have a normal device-level kref for this case? On final
> > device-level kref_put then kill the percpu_ref? I guess the problem is
> > devm semantics where p2pdma only gets one callback on a driver
> > ->remove() event. I'm not sure how to support multiple references of
> > the same pages without creating a non-devm version of
> > devm_memremap_pages(). I'm not opposed to that, but afaiu I don't
> > think p2pdma is compatible with devm as long as it supports N>1:1
> > mappings of the same range.
>
> Hmm, no I think you misunderstood what I said. I'm saying I need to have
> exactly one percpu_ref per call to devm_memremap_pages() and this is
> doable, just slightly annoying. Right now I have one percpu_ref for
> multiple calls to devm_memremap_pages() which doesn't work with the
> above fix because there will always be a wait_for_completion() before
> the last references are dropped in this way:
>
> 1) First devm_memremap_pages_release() is called, which drops its
> reference and calls wait_for_completion().
>
> 2) The second devm_memremap_pages_release() needs to be called to drop
> its reference, but can't because the first is still waiting, and therefore
> the percpu_ref never goes to zero and the wait_for_completion() never returns.
>

Got it, let me see how bad moving arch_remove_memory() turns out,
sounds like a decent approach to coordinate multiple users of a single
ref.


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-29 17:30         ` Dan Williams
@ 2018-11-29 17:50           ` Logan Gunthorpe
  2018-11-29 18:51             ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Logan Gunthorpe @ 2018-11-29 17:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates



On 2018-11-29 10:30 a.m., Dan Williams wrote:
> Oh! Yes, nice find. We need to wait for the percpu-ref to be dead and
> all outstanding references dropped before we can proceed to
> arch_remove_memory(), and I think this problem has been there since
> day one because the final exit was always after devm_memremap_pages()
> release which means arch_remove_memory() was always racing any final
> put_page(). I'll take a look, it seems the arch_remove_memory() call
> needs to be moved out-of-line to its own context and wait for the
> final exit of the percpu-ref.

Ok, well I thought moving the wait_for_completion() into the kill() call
was a pretty good solution to this. Though, if we move the
arch_remove_memory() into a different context, it *may* help with the
problem below...
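
The ordering under discussion (wait for every outstanding reference, and
only then remove the pages) can be modelled in user space. This is a rough
sketch, not kernel code: `model_pgmap`, `model_user_finish()`,
`model_wait_for_users()` and the two `teardown_*()` functions are
hypothetical names, a plain counter stands in for the percpu_ref, and the
wait is modelled by letting the remaining users finish one by one.

```c
#include <assert.h>
#include <stdbool.h>

struct model_pgmap {
	int refs;		/* outstanding references, including the initial one */
	bool pages_present;	/* false once the modelled arch_remove_memory() ran */
	bool used_removed_page;	/* set if a user touched a page after removal */
};

/* A user that took a reference earlier touches its page, then drops the ref. */
void model_user_finish(struct model_pgmap *p)
{
	if (!p->pages_present)
		p->used_removed_page = true;
	p->refs--;
}

/* Stand-in for wait_for_completion(): remaining users finish one by one. */
void model_wait_for_users(struct model_pgmap *p)
{
	while (p->refs > 0)
		model_user_finish(p);
}

/* Pages are removed first and the wait happens afterwards. */
bool teardown_remove_then_wait(int users)
{
	struct model_pgmap p = { .refs = 1 + users, .pages_present = true };

	p.refs--;			/* release drops the initial reference */
	p.pages_present = false;	/* modelled arch_remove_memory() */
	model_wait_for_users(&p);
	return p.used_removed_page;
}

/* The wait happens before the pages are removed. */
bool teardown_wait_then_remove(int users)
{
	struct model_pgmap p = { .refs = 1 + users, .pages_present = true };

	p.refs--;			/* release drops the initial reference */
	model_wait_for_users(&p);
	p.pages_present = false;	/* modelled arch_remove_memory() */
	return p.used_removed_page;
}
```

In this model a single outstanding user is enough for
`teardown_remove_then_wait(1)` to report a touch on a removed page, while
`teardown_wait_then_remove(1)` does not.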

>> Though, now that I look at it, the current change in question will be
>> wrong if there are two devm_memremap_pages_release()s to call. Both need
>> to drop their references before we can wait_for_completion() ;(. I guess
>> I need multiple percpu_refs or more complex changes to
>> devm_memremap_pages_release().
> 
> Can you just have a normal device-level kref for this case? On final
> device-level kref_put then kill the percpu_ref? I guess the problem is
> devm semantics where p2pdma only gets one callback on a driver
> ->remove() event. I'm not sure how to support multiple references of
> the same pages without creating a non-devm version of
> devm_memremap_pages(). I'm not opposed to that, but afaiu I don't
> think p2pdma is compatible with devm as long as it supports N>1:1
> mappings of the same range.

Hmm, no I think you misunderstood what I said. I'm saying I need to have
exactly one percpu_ref per call to devm_memremap_pages() and this is
doable, just slightly annoying. Right now I have one percpu_ref for
multiple calls to devm_memremap_pages() which doesn't work with the
above fix because there will always be a wait_for_completion() before
the last references are dropped in this way:

1) First devm_memremap_pages_release() is called, which drops its
reference and calls wait_for_completion().

2) The second devm_memremap_pages_release() needs to be called to drop
its reference, but can't because the first is still waiting, and therefore
the percpu_ref never goes to zero and the wait_for_completion() never returns.
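
The two-step hang above can be reproduced in a small user-space model. This
is only a sketch under stated assumptions: `model_ref`, `model_release()`,
`unwind_shared()` and `unwind_per_mapping()` are hypothetical names, a plain
counter stands in for the percpu_ref, and a release that would sleep forever
in wait_for_completion() is modelled by stopping the unwind.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a percpu_ref: a plain counter, no per-cpu mode. */
struct model_ref {
	int count;
};

/*
 * Models one devm_memremap_pages_release() call: drop this mapping's
 * reference, then report whether the wait could ever finish, i.e. whether
 * the count has reached zero.
 */
bool model_release(struct model_ref *ref)
{
	ref->count--;
	return ref->count == 0;
}

/*
 * One ref shared by all mappings. The releases run one after another, so a
 * release that blocks keeps every later release from running.
 * Returns how many releases completed.
 */
int unwind_shared(int nr_mappings)
{
	struct model_ref shared = { .count = nr_mappings };
	int done = 0;

	for (int i = 0; i < nr_mappings; i++) {
		if (!model_release(&shared))
			break;	/* would wait forever: later refs never drop */
		done++;
	}
	return done;
}

/* One ref per mapping: every release sees its own count reach zero. */
int unwind_per_mapping(int nr_mappings)
{
	int done = 0;

	for (int i = 0; i < nr_mappings; i++) {
		struct model_ref own = { .count = 1 };

		if (!model_release(&own))
			break;
		done++;
	}
	return done;
}
```

With two mappings, `unwind_shared(2)` completes no release at all, while
`unwind_per_mapping(2)` completes both, which is the case for having exactly
one percpu_ref per call to devm_memremap_pages().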

Logan


* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-29 17:06       ` Logan Gunthorpe
@ 2018-11-29 17:30         ` Dan Williams
  2018-11-29 17:50           ` Logan Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2018-11-29 17:30 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Maling list - DRI developers,
	Bjorn Helgaas, Stephen Bates

On Thu, Nov 29, 2018 at 9:07 AM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2018-11-28 8:10 p.m., Dan Williams wrote:
> > Yes, please send a proper patch.
>
> Ok, I'll send one shortly.
>
> > Although, I'm still not sure I see
> > the problem with the order of the percpu-ref kill. It's likely more
> > efficient to put the kill after the put_page() loop because the
> > percpu-ref will still be in "fast" per-cpu mode, but the kernel panic
> > should not be possible as long as there is a wait_for_completion()
> > before the exit, unless something else is wrong.
>
> The series of events looks something like this:
>
> 1) Some p2pdma user calls pci_alloc_p2pmem() to get some memory to DMA
> to, taking a reference to the pgmap.
> 2) Another process unbinds the underlying p2pdma driver and the devm
> chain starts to unwind.
> 3) devm_memremap_pages_release() is called and it kills the reference
> and drops its last reference.

Oh! Yes, nice find. We need to wait for the percpu-ref to be dead and
all outstanding references dropped before we can proceed to
arch_remove_memory(), and I think this problem has been there since
day one because the final exit was always after devm_memremap_pages()
release which means arch_remove_memory() was always racing any final
put_page(). I'll take a look, it seems the arch_remove_memory() call
needs to be moved out-of-line to its own context and wait for the
final exit of the percpu-ref.

> 4) arch_remove_memory() is called which will remove all the struct pages.
> 5) We eventually get to pci_p2pdma_release() where we wait for the
> completion indicating all the pages have been freed.
> 6) The user in (1) tries to use the page that has been removed,
> typically by calling pci_p2pdma_map_sg(), but the page doesn't exist so
> the kernel panics.
>
> So we really need the wait in (5) to occur before (4) but after (3) so
> that the pages continue to exist until the last reference is dropped.
>
> > Certainly you can't move the wait_for_completion() into your ->kill()
> > callback without switching the ordering, but I'm not on board with
> > that change until I understand a bit more about why you think
> > device-dax might be broken?
> >
> > I took a look at the p2pdma shutdown path and the:
> >
> >         if (percpu_ref_is_dying(ref))
> >                 return;
> > ...looks fishy. If multiple agents can overlap their requests for the
> > same range why not track that simply as additional refs? Could it be
> > the crash that you are seeing is a result of mis-accounting when it is
> > safe to assume the page allocation can be freed?
>
> Yeah, someone else mentioned the same thing during review but if I
> remove it, there can be a double kill() on a hypothetical driver that
> might call pci_p2pdma_add_resource() twice. The issue is we only have
> one percpu_ref per device not one per range/BAR.
>
> Though, now that I look at it, the current change in question will be
> wrong if there are two devm_memremap_pages_release()s to call. Both need
> to drop their references before we can wait_for_completion() ;(. I guess
> I need multiple percpu_refs or more complex changes to
> devm_memremap_pages_release().

Can you just have a normal device-level kref for this case? On final
device-level kref_put then kill the percpu_ref? I guess the problem is
devm semantics where p2pdma only gets one callback on a driver
->remove() event. I'm not sure how to support multiple references of
the same pages without creating a non-devm version of
devm_memremap_pages(). I'm not opposed to that, but afaiu I don't
think p2pdma is compatible with devm as long as it supports N>1:1
mappings of the same range.
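
A minimal userspace sketch of the device-level count suggested above, assuming one count per registered range and a single shared ref that must be killed exactly once. This is a toy model rather than kernel code: the toy_* names are hypothetical and a plain integer stands in for a kref.

```c
#include <assert.h>

/* Toy stand-in for per-device state; not the kernel's struct pci_p2pdma. */
struct toy_device_ref {
	int range_count;	/* one count per registered range, like a kref */
	int kill_calls;		/* how many times the shared ref was killed */
};

/* Each added range takes one device-level count. */
static void toy_add_range(struct toy_device_ref *d)
{
	d->range_count++;
}

/*
 * Called once from each range's release path. Only the final put
 * kills the shared ref, so no "is it already dying?" check is needed.
 */
static void toy_release_range(struct toy_device_ref *d)
{
	if (--d->range_count == 0)
		d->kill_calls++;
}

/* Register two ranges, release both, and report how often kill ran. */
static int toy_kills_after_two_ranges(void)
{
	struct toy_device_ref d = { 0, 0 };

	toy_add_range(&d);
	toy_add_range(&d);
	toy_release_range(&d);
	toy_release_range(&d);
	return d.kill_calls;
}
```

The sketch covers only the counting; it does not address the devm concern raised above, where the driver gets a single callback on ->remove().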

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-29  3:10     ` Dan Williams
@ 2018-11-29 17:06       ` Logan Gunthorpe
  2018-11-29 17:30         ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Logan Gunthorpe @ 2018-11-29 17:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Mailing list - DRI developers,
	Bjorn Helgaas, Stephen Bates



On 2018-11-28 8:10 p.m., Dan Williams wrote:
> Yes, please send a proper patch. 

Ok, I'll send one shortly.

> Although, I'm still not sure I see
> the problem with the order of the percpu-ref kill. It's likely more
> efficient to put the kill after the put_page() loop because the
> percpu-ref will still be in "fast" per-cpu mode, but the kernel panic
> should not be possible as long as there is a wait_for_completion()
> before the exit, unless something else is wrong.

The series of events looks something like this:

1) Some p2pdma user calls pci_alloc_p2pmem() to get some memory to DMA
to, taking a reference to the pgmap.
2) Another process unbinds the underlying p2pdma driver and the devm
chain starts to unwind.
3) devm_memremap_pages_release() is called and it kills the reference
and drops its last reference.
4) arch_remove_memory() is called which will remove all the struct pages.
5) We eventually get to pci_p2pdma_release() where we wait for the
completion indicating all the pages have been freed.
6) The user in (1) tries to use the page that has been removed,
typically by calling pci_p2pdma_map_sg(), but the page doesn't exist so
the kernel panics.

So we really need the wait in (5) to occur before (4) but after (3) so
that the pages continue to exist until the last reference is dropped.
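
The ordering problem can be sketched as a single-threaded userspace model. It is a toy under stated assumptions, not kernel code: the toy_* names are hypothetical, a plain integer stands in for the percpu_ref, and the blocking wait is replaced by a check of whether a real wait would return.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a pgmap; not the kernel's struct dev_pagemap. */
struct toy_pgmap {
	int refs;		/* outstanding references, including the initial one */
	bool dying;		/* set once the ref has been killed */
	bool pages_present;	/* false once the "struct pages" are removed */
};

static void toy_init(struct toy_pgmap *p)
{
	p->refs = 1;
	p->dying = false;
	p->pages_present = true;
}

/* (1) A user takes a reference; this fails once the ref is dying. */
static bool toy_tryget_live(struct toy_pgmap *p)
{
	if (p->dying)
		return false;
	p->refs++;
	return true;
}

static void toy_put(struct toy_pgmap *p)
{
	p->refs--;
}

/* (3) Kill blocks new users and drops the initial reference. */
static void toy_kill(struct toy_pgmap *p)
{
	p->dying = true;
	p->refs--;
}

/* (5) Stand-in for the wait: true when a real wait would return. */
static bool toy_wait_done(const struct toy_pgmap *p)
{
	return p->refs == 0;
}

/* (4) Stand-in for removing the struct pages. */
static void toy_remove_pages(struct toy_pgmap *p)
{
	p->pages_present = false;
}

/* Reported order: kill, remove pages, and only then wait. */
static bool toy_reported_order_hits_removed_page(void)
{
	struct toy_pgmap p;
	bool bug;

	toy_init(&p);
	toy_tryget_live(&p);		/* (1) user holds a reference */
	toy_kill(&p);			/* (3) */
	toy_remove_pages(&p);		/* (4) pages vanish under the user */
	bug = !p.pages_present;		/* (6) user touches a removed page */
	toy_put(&p);
	return bug;
}

/* Proposed order: kill, wait for the last put, then remove pages. */
static bool toy_proposed_order_hits_removed_page(void)
{
	struct toy_pgmap p;
	bool bug;

	toy_init(&p);
	toy_tryget_live(&p);		/* (1) */
	toy_kill(&p);			/* (3) */
	assert(!toy_wait_done(&p));	/* a real wait would block here */
	bug = !p.pages_present;		/* (6) pages still exist */
	toy_put(&p);			/* user drops the last reference */
	assert(toy_wait_done(&p));	/* now the wait can return */
	toy_remove_pages(&p);		/* (4) safe: nobody is left */
	return bug;
}
```

In this model the reported order lets the user reach a removed page, while the proposed order keeps the pages present until the last reference is dropped.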

> Certainly you can't move the wait_for_completion() into your ->kill()
> callback without switching the ordering, but I'm not on board with
> that change until I understand a bit more about why you think
> device-dax might be broken?
> 
> I took a look at the p2pdma shutdown path and the:
> 
>         if (percpu_ref_is_dying(ref))
>                 return;
> ...looks fishy. If multiple agents can overlap their requests for the
> same range why not track that simply as additional refs? Could it be
> the crash that you are seeing is a result of mis-accounting when it is
> safe to assume the page allocation can be freed?

Yeah, someone else mentioned the same thing during review but if I
remove it, there can be a double kill() on a hypothetical driver that
might call pci_p2pdma_add_resource() twice. The issue is we only have
one percpu_ref per device not one per range/BAR.

Though, now that I look at it, the current change in question will be
wrong if there are two devm_memremap_pages_release()s to call. Both need
to drop their references before we can wait_for_completion() ;(. I guess
I need multiple percpu_refs or more complex changes to
devm_memremap_pages_release().

Thanks

Logan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-27 21:43   ` Logan Gunthorpe
@ 2018-11-29  3:10     ` Dan Williams
  2018-11-29 17:06       ` Logan Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2018-11-29  3:10 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Andrew Morton, stable, Jérôme Glisse,
	Christoph Hellwig, Linus Torvalds, Linux MM,
	Linux Kernel Mailing List, Mailing list - DRI developers,
	Bjorn Helgaas, Stephen Bates

On Tue, Nov 27, 2018 at 1:44 PM Logan Gunthorpe <logang@deltatee.com> wrote:
>
> Hey Dan,
>
> On 2018-11-20 4:13 p.m., Dan Williams wrote:
> > The last step before devm_memremap_pages() returns success is to
> > allocate a release action, devm_memremap_pages_release(), to tear the
> > entire setup down. However, the result from devm_add_action() is not
> > checked.
> >
> > Checking the error from devm_add_action() is not enough. The api
> > currently relies on the fact that the percpu_ref it is using is killed
> > by the time the devm_memremap_pages_release() is run. Rather than
> > continue this awkward situation, offload the responsibility of killing
> > the percpu_ref to devm_memremap_pages_release() directly. This allows
> > devm_memremap_pages() to do the right thing relative to init failures
> > and shutdown.
> >
> > Without this change we could fail to register the teardown of
> > devm_memremap_pages(). The likelihood of hitting this failure is tiny as
> > small memory allocations almost always succeed. However, the impact of
> > the failure is large given any future reconfiguration, or
> > disable/enable, of an nvdimm namespace will fail forever as subsequent
> > calls to devm_memremap_pages() will fail to setup the pgmap_radix since
> > there will be stale entries for the physical address range.
> >
> > An argument could be made to require that the ->kill() operation be set
> > in the @pgmap arg rather than passed in separately. However, it helps
> > code readability, tracking the lifetime of a given instance, to be able
> > to grep the kill routine directly at the devm_memremap_pages() call
> > site.
> >
> > Cc: <stable@vger.kernel.org>
> > Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
> > Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
> > Reported-by: Logan Gunthorpe <logang@deltatee.com>
> > Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> I recently realized this patch, which was recently added to the mm tree,
> will break p2pdma. This is largely because the patch was written and
> reviewed before p2pdma was merged (in 4.20). Originally, I think we both
> expected this patch would be merged before p2pdma but that's not what
> happened.

Indeed, sorry I missed this.

>
> Also, while testing this, I found the teardown is still not quite
> correct. In p2pdma, the struct pages will be removed before all of the
> percpu references have been released, and if the device is unbound while pages
> are in use, there will be a kernel panic. This is because we wait on the
> completion that indicates all references have been freed after
> devm_memremap_pages_release() is called and the pages are removed. This
> is fairly easily fixed by waiting for the completion in the kill
> function and moving the call after the last put_page(). I suspect device
> DAX also has this problem but I'm not entirely certain if something else
> might be preventing us from hitting this bug.
>
> Ideally, as part of this patch we need to update the p2pdma call site
> for devm_memremap_pages() and fix the completion issue. The diff for all
> this is below, but if you'd like I can send a proper patch.

Yes, please send a proper patch. Although, I'm still not sure I see
the problem with the order of the percpu-ref kill. It's likely more
efficient to put the kill after the put_page() loop because the
percpu-ref will still be in "fast" per-cpu mode, but the kernel panic
should not be possible as long as there is a wait_for_completion()
before the exit, unless something else is wrong.

Certainly you can't move the wait_for_completion() into your ->kill()
callback without switching the ordering, but I'm not on board with
that change until I understand a bit more about why you think
device-dax might be broken?

I took a look at the p2pdma shutdown path and the:

        if (percpu_ref_is_dying(ref))
                return;

...looks fishy. If multiple agents can overlap their requests for the
same range why not track that simply as additional refs? Could it be
the crash that you are seeing is a result of mis-accounting when it is
safe to assume the page allocation can be freed?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-20 23:13 ` [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling Dan Williams
@ 2018-11-27 21:43   ` Logan Gunthorpe
  2018-11-29  3:10     ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Logan Gunthorpe @ 2018-11-27 21:43 UTC (permalink / raw)
  To: Dan Williams, akpm
  Cc: stable, Jérôme Glisse, Christoph Hellwig, torvalds,
	linux-mm, linux-kernel, dri-devel, Bjorn Helgaas, Stephen Bates

Hey Dan,

On 2018-11-20 4:13 p.m., Dan Williams wrote:
> The last step before devm_memremap_pages() returns success is to
> allocate a release action, devm_memremap_pages_release(), to tear the
> entire setup down. However, the result from devm_add_action() is not
> checked.
> 
> Checking the error from devm_add_action() is not enough. The api
> currently relies on the fact that the percpu_ref it is using is killed
> by the time the devm_memremap_pages_release() is run. Rather than
> continue this awkward situation, offload the responsibility of killing
> the percpu_ref to devm_memremap_pages_release() directly. This allows
> devm_memremap_pages() to do the right thing relative to init failures
> and shutdown.
> 
> Without this change we could fail to register the teardown of
> devm_memremap_pages(). The likelihood of hitting this failure is tiny as
> small memory allocations almost always succeed. However, the impact of
> the failure is large given any future reconfiguration, or
> disable/enable, of an nvdimm namespace will fail forever as subsequent
> calls to devm_memremap_pages() will fail to setup the pgmap_radix since
> there will be stale entries for the physical address range.
> 
> An argument could be made to require that the ->kill() operation be set
> in the @pgmap arg rather than passed in separately. However, it helps
> code readability, tracking the lifetime of a given instance, to be able
> to grep the kill routine directly at the devm_memremap_pages() call
> site.
> 
> Cc: <stable@vger.kernel.org>
> Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
> Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
> Reported-by: Logan Gunthorpe <logang@deltatee.com>
> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I recently realized this patch, which was recently added to the mm tree,
will break p2pdma. This is largely because the patch was written and
reviewed before p2pdma was merged (in 4.20). Originally, I think we both
expected this patch would be merged before p2pdma but that's not what
happened.

Also, while testing this, I found the teardown is still not quite
correct. In p2pdma, the struct pages will be removed before all of the
percpu references have been released, and if the device is unbound while pages
are in use, there will be a kernel panic. This is because we wait on the
completion that indicates all references have been freed after
devm_memremap_pages_release() is called and the pages are removed. This
is fairly easily fixed by waiting for the completion in the kill
function and moving the call after the last put_page(). I suspect device
DAX also has this problem but I'm not entirely certain if something else
might be preventing us from hitting this bug.

Ideally, as part of this patch we need to update the p2pdma call site
for devm_memremap_pages() and fix the completion issue. The diff for all
this is below, but if you'd like I can send a proper patch.

Thanks,

Logan

--


diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index ae3c5b25dcc7..1df7bdb45eab 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -82,9 +82,10 @@ static void pci_p2pdma_percpu_release(struct percpu_ref *ref)
        complete_all(&p2p->devmap_ref_done);
 }

-static void pci_p2pdma_percpu_kill(void *data)
+static void pci_p2pdma_percpu_kill(struct percpu_ref *ref)
 {
-       struct percpu_ref *ref = data;
+       struct pci_p2pdma *p2p =
+               container_of(ref, struct pci_p2pdma, devmap_ref);

        /*
         * pci_p2pdma_add_resource() may be called multiple times
@@ -96,6 +97,7 @@ static void pci_p2pdma_percpu_kill(void *data)
                return;

        percpu_ref_kill(ref);
+       wait_for_completion(&p2p->devmap_ref_done);
 }

 static void pci_p2pdma_release(void *data)
@@ -105,7 +107,6 @@ static void pci_p2pdma_release(void *data)
        if (!pdev->p2pdma)
                return;

-       wait_for_completion(&pdev->p2pdma->devmap_ref_done);
        percpu_ref_exit(&pdev->p2pdma->devmap_ref);

        gen_pool_destroy(pdev->p2pdma->pool);
@@ -198,6 +199,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
        pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
        pgmap->pci_p2pdma_bus_offset = pci_bus_address(pdev, bar) -
                pci_resource_start(pdev, bar);
+       pgmap->kill = pci_p2pdma_percpu_kill;

        addr = devm_memremap_pages(&pdev->dev, pgmap);
        if (IS_ERR(addr)) {
@@ -211,11 +213,6 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
        if (error)
                goto pgmap_free;

-       error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_percpu_kill,
-                                         &pdev->p2pdma->devmap_ref);
-       if (error)
-               goto pgmap_free;
-
        pci_info(pdev, "added peer-to-peer DMA memory %pR\n",
                 &pgmap->res);

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 5e45f0c327a5..dd9a953e796a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -88,9 +88,9 @@ static void devm_memremap_pages_release(void *data)
        resource_size_t align_start, align_size;
        unsigned long pfn;

-       pgmap->kill(pgmap->ref);
        for_each_device_pfn(pfn, pgmap)
                put_page(pfn_to_page(pfn));
+       pgmap->kill(pgmap->ref);

        /* pages are dead and unused, undo the arch mapping */
        align_start = res->start & ~(SECTION_SIZE - 1);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling
  2018-11-20 23:12 [PATCH v8 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Dan Williams
@ 2018-11-20 23:13 ` Dan Williams
  2018-11-27 21:43   ` Logan Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Dan Williams @ 2018-11-20 23:13 UTC (permalink / raw)
  To: akpm
  Cc: stable, Jérôme Glisse, Logan Gunthorpe,
	Logan Gunthorpe, Christoph Hellwig, torvalds, linux-mm,
	linux-kernel, dri-devel

The last step before devm_memremap_pages() returns success is to
allocate a release action, devm_memremap_pages_release(), to tear the
entire setup down. However, the result from devm_add_action() is not
checked.

Checking the error from devm_add_action() is not enough. The api
currently relies on the fact that the percpu_ref it is using is killed
by the time the devm_memremap_pages_release() is run. Rather than
continue this awkward situation, offload the responsibility of killing
the percpu_ref to devm_memremap_pages_release() directly. This allows
devm_memremap_pages() to do the right thing relative to init failures
and shutdown.

Without this change we could fail to register the teardown of
devm_memremap_pages(). The likelihood of hitting this failure is tiny as
small memory allocations almost always succeed. However, the impact of
the failure is large given any future reconfiguration, or
disable/enable, of an nvdimm namespace will fail forever as subsequent
calls to devm_memremap_pages() will fail to setup the pgmap_radix since
there will be stale entries for the physical address range.

An argument could be made to require that the ->kill() operation be set
in the @pgmap arg rather than passed in separately. However, it helps
code readability, tracking the lifetime of a given instance, to be able
to grep the kill routine directly at the devm_memremap_pages() call
site.

Cc: <stable@vger.kernel.org>
Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/pmem.c                |   14 +++-----------
 drivers/nvdimm/pmem.c             |   13 +++++--------
 include/linux/memremap.h          |    2 ++
 kernel/memremap.c                 |   30 ++++++++++++++----------------
 tools/testing/nvdimm/test/iomap.c |   15 ++++++++++++++-
 5 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index 99e2aace8078..2c1f459c0c63 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -48,9 +48,8 @@ static void dax_pmem_percpu_exit(void *data)
 	percpu_ref_exit(ref);
 }
 
-static void dax_pmem_percpu_kill(void *data)
+static void dax_pmem_percpu_kill(struct percpu_ref *ref)
 {
-	struct percpu_ref *ref = data;
 	struct dax_pmem *dax_pmem = to_dax_pmem(ref);
 
 	dev_dbg(dax_pmem->dev, "trace\n");
@@ -112,17 +111,10 @@ static int dax_pmem_probe(struct device *dev)
 	}
 
 	dax_pmem->pgmap.ref = &dax_pmem->ref;
+	dax_pmem->pgmap.kill = dax_pmem_percpu_kill;
 	addr = devm_memremap_pages(dev, &dax_pmem->pgmap);
-	if (IS_ERR(addr)) {
-		devm_remove_action(dev, dax_pmem_percpu_exit, &dax_pmem->ref);
-		percpu_ref_exit(&dax_pmem->ref);
+	if (IS_ERR(addr))
 		return PTR_ERR(addr);
-	}
-
-	rc = devm_add_action_or_reset(dev, dax_pmem_percpu_kill,
-							&dax_pmem->ref);
-	if (rc)
-		return rc;
 
 	/* adjust the dax_region resource to the start of data */
 	memcpy(&res, &dax_pmem->pgmap.res, sizeof(res));
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f7019294740c..bc2f700feef8 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -309,8 +309,11 @@ static void pmem_release_queue(void *q)
 	blk_cleanup_queue(q);
 }
 
-static void pmem_freeze_queue(void *q)
+static void pmem_freeze_queue(struct percpu_ref *ref)
 {
+	struct request_queue *q;
+
+	q = container_of(ref, typeof(*q), q_usage_counter);
 	blk_freeze_queue_start(q);
 }
 
@@ -402,6 +405,7 @@ static int pmem_attach_disk(struct device *dev,
 
 	pmem->pfn_flags = PFN_DEV;
 	pmem->pgmap.ref = &q->q_usage_counter;
+	pmem->pgmap.kill = pmem_freeze_queue;
 	if (is_nd_pfn(dev)) {
 		if (setup_pagemap_fsdax(dev, &pmem->pgmap))
 			return -ENOMEM;
@@ -427,13 +431,6 @@ static int pmem_attach_disk(struct device *dev,
 		memcpy(&bb_res, &nsio->res, sizeof(bb_res));
 	}
 
-	/*
-	 * At release time the queue must be frozen before
-	 * devm_memremap_pages is unwound
-	 */
-	if (devm_add_action_or_reset(dev, pmem_freeze_queue, q))
-		return -ENOMEM;
-
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);
 	pmem->virt_addr = addr;
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 0ac69ddf5fc4..55db66b3716f 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -111,6 +111,7 @@ typedef void (*dev_page_free_t)(struct page *page, void *data);
  * @altmap: pre-allocated/reserved memory for vmemmap allocations
  * @res: physical address range covered by @ref
  * @ref: reference count that pins the devm_memremap_pages() mapping
+ * @kill: callback to transition @ref to the dead state
  * @dev: host device of the mapping for debug
  * @data: private data pointer for page_free()
  * @type: memory type: see MEMORY_* in memory_hotplug.h
@@ -122,6 +123,7 @@ struct dev_pagemap {
 	bool altmap_valid;
 	struct resource res;
 	struct percpu_ref *ref;
+	void (*kill)(struct percpu_ref *ref);
 	struct device *dev;
 	void *data;
 	enum memory_type type;
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 99d14940acfa..5e45f0c327a5 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -88,14 +88,10 @@ static void devm_memremap_pages_release(void *data)
 	resource_size_t align_start, align_size;
 	unsigned long pfn;
 
+	pgmap->kill(pgmap->ref);
 	for_each_device_pfn(pfn, pgmap)
 		put_page(pfn_to_page(pfn));
 
-	if (percpu_ref_tryget_live(pgmap->ref)) {
-		dev_WARN(dev, "%s: page mapping is still live!\n", __func__);
-		percpu_ref_put(pgmap->ref);
-	}
-
 	/* pages are dead and unused, undo the arch mapping */
 	align_start = res->start & ~(SECTION_SIZE - 1);
 	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
@@ -116,7 +112,7 @@ static void devm_memremap_pages_release(void *data)
 /**
  * devm_memremap_pages - remap and provide memmap backing for the given resource
  * @dev: hosting device for @res
- * @pgmap: pointer to a struct dev_pgmap
+ * @pgmap: pointer to a struct dev_pagemap
  *
  * Notes:
  * 1/ At a minimum the res, ref and type members of @pgmap must be initialized
@@ -125,11 +121,8 @@ static void devm_memremap_pages_release(void *data)
  * 2/ The altmap field may optionally be initialized, in which case altmap_valid
  *    must be set to true
  *
- * 3/ pgmap.ref must be 'live' on entry and 'dead' before devm_memunmap_pages()
- *    time (or devm release event). The expected order of events is that ref has
- *    been through percpu_ref_kill() before devm_memremap_pages_release(). The
- *    wait for the completion of all references being dropped and
- *    percpu_ref_exit() must occur after devm_memremap_pages_release().
+ * 3/ pgmap->ref must be 'live' on entry and will be killed at
+ *    devm_memremap_pages_release() time, or if this routine fails.
  *
  * 4/ res is expected to be a host memory range that could feasibly be
  *    treated as a "System RAM" range, i.e. not a device mmio range, but
@@ -145,6 +138,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	pgprot_t pgprot = PAGE_KERNEL;
 	int error, nid, is_ram;
 
+	if (!pgmap->ref || !pgmap->kill)
+		return ERR_PTR(-EINVAL);
+
 	align_start = res->start & ~(SECTION_SIZE - 1);
 	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
 		- align_start;
@@ -170,12 +166,10 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (is_ram != REGION_DISJOINT) {
 		WARN_ONCE(1, "%s attempted on %s region %pr\n", __func__,
 				is_ram == REGION_MIXED ? "mixed" : "ram", res);
-		return ERR_PTR(-ENXIO);
+		error = -ENXIO;
+		goto err_array;
 	}
 
-	if (!pgmap->ref)
-		return ERR_PTR(-EINVAL);
-
 	pgmap->dev = dev;
 
 	error = xa_err(xa_store_range(&pgmap_array, PHYS_PFN(res->start),
@@ -217,7 +211,10 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 				align_size >> PAGE_SHIFT, pgmap);
 	percpu_ref_get_many(pgmap->ref, pfn_end(pgmap) - pfn_first(pgmap));
 
-	devm_add_action(dev, devm_memremap_pages_release, pgmap);
+	error = devm_add_action_or_reset(dev, devm_memremap_pages_release,
+			pgmap);
+	if (error)
+		return ERR_PTR(error);
 
 	return __va(res->start);
 
@@ -228,6 +225,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
  err_pfn_remap:
 	pgmap_array_delete(res);
  err_array:
+	pgmap->kill(pgmap->ref);
 	return ERR_PTR(error);
 }
 EXPORT_SYMBOL_GPL(devm_memremap_pages);
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index ed18a0cbc0c8..c6635fee27d8 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -104,13 +104,26 @@ void *__wrap_devm_memremap(struct device *dev, resource_size_t offset,
 }
 EXPORT_SYMBOL(__wrap_devm_memremap);
 
+static void nfit_test_kill(void *_pgmap)
+{
+	struct dev_pagemap *pgmap = _pgmap;
+
+	pgmap->kill(pgmap->ref);
+}
+
 void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 {
 	resource_size_t offset = pgmap->res.start;
 	struct nfit_test_resource *nfit_res = get_nfit_res(offset);
 
-	if (nfit_res)
+	if (nfit_res) {
+		int rc;
+
+		rc = devm_add_action_or_reset(dev, nfit_test_kill, pgmap);
+		if (rc)
+			return ERR_PTR(rc);
 		return nfit_res->buf + offset - nfit_res->res.start;
+	}
 	return devm_memremap_pages(dev, pgmap);
 }
 EXPORT_SYMBOL_GPL(__wrap_devm_memremap_pages);

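The commit message above turns on the difference between registering a release action with the result ignored and registering it with an "or reset" fallback that runs the action immediately when registration fails. A minimal userspace sketch of that difference follows. It is a toy model, not kernel code: the toy_* names are hypothetical, the failure is injected by a flag, and -12 merely stands in for -ENOMEM.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_action_t)(void *data);

/* Toy stand-in for a device's list of release actions (one slot only). */
struct toy_devres {
	toy_action_t action;
	void *data;
	int fail_registration;	/* set to simulate an allocation failure */
};

/* Register an action to run at teardown; may fail. */
static int toy_add_action(struct toy_devres *dr, toy_action_t action,
			  void *data)
{
	if (dr->fail_registration)
		return -12;	/* stand-in for -ENOMEM */
	dr->action = action;
	dr->data = data;
	return 0;
}

/* Same, but on failure run the action right away so nothing is left behind. */
static int toy_add_action_or_reset(struct toy_devres *dr, toy_action_t action,
				   void *data)
{
	int rc = toy_add_action(dr, action, data);

	if (rc)
		action(data);
	return rc;
}

/* Release action: marks the toy mapping as torn down. */
static void toy_release(void *data)
{
	*(int *)data = 0;
}

/*
 * Simulate a failed registration. Returns 1 if the mapping is left
 * stale (never torn down), 0 if it was cleaned up.
 */
static int toy_mapping_left_stale(int use_or_reset)
{
	struct toy_devres dr = { NULL, NULL, 1 };
	int mapped = 1;

	if (use_or_reset)
		toy_add_action_or_reset(&dr, toy_release, &mapped);
	else
		(void)toy_add_action(&dr, toy_release, &mapped); /* result ignored */
	return mapped;
}
```

With the result ignored, a failed registration leaves the mapping in place with no teardown ever scheduled, which corresponds to the stale-entry failure the commit message describes; the "or reset" variant tears it down on the spot and reports the error.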

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-02-11 19:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-10 11:09 [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling Krzysztof Grygiencz
2019-02-11 19:57 ` Jerome Glisse
  -- strict thread matches above, loose matches on Subject: below --
2018-11-20 23:12 [PATCH v8 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only Dan Williams
2018-11-20 23:13 ` [PATCH v8 3/7] mm, devm_memremap_pages: Fix shutdown handling Dan Williams
2018-11-27 21:43   ` Logan Gunthorpe
2018-11-29  3:10     ` Dan Williams
2018-11-29 17:06       ` Logan Gunthorpe
2018-11-29 17:30         ` Dan Williams
2018-11-29 17:50           ` Logan Gunthorpe
2018-11-29 18:51             ` Dan Williams
2018-11-30 22:19               ` Logan Gunthorpe
2018-11-30 22:28                 ` Dan Williams
2018-11-30 22:34                   ` Logan Gunthorpe
2018-11-30 22:47                     ` Dan Williams
