From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "Paul Menzel" <pmenzel@molgen.mpg.de>,
	"Christian König" <christian.koenig@amd.com>,
	"Qiang Yu" <qiang.yu@amd.com>,
	"Alex Deucher" <alexander.deucher@amd.com>
Subject: [PATCH 5.10 10/80] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag
Date: Mon, 28 Feb 2022 18:23:51 +0100
Message-ID: <20220228172312.712322810@linuxfoundation.org>
In-Reply-To: <20220228172311.789892158@linuxfoundation.org>

From: Qiang Yu <qiang.yu@amd.com>

commit c1a66c3bc425ff93774fb2f6eefa67b83170dd7e upstream.

Workstation application ANSA/META v21.1.4 logs this error in dmesg when
running the CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)

This is caused by the following sequence:
1. A 256MB buffer is created in invisible VRAM.
2. The CPU maps the buffer and accesses it, which causes a vm_fault
   that tries to move the buffer to visible VRAM.
3. Space is forced in visible VRAM by traversing all VRAM BOs to
   check whether evicting each BO is valuable.
4. When a VM BO (in invisible VRAM) is checked, amdgpu_vm_evictable()
   sets amdgpu_vm->evicting, but later, because the BO is not in
   visible VRAM, it is not really evicted and so is not added to
   amdgpu_vm->evicted.
5. Before the next CS clears amdgpu_vm->evicting, a user VM ops
   ioctl passes amdgpu_vm_ready() (which checks amdgpu_vm->evicted)
   but fails in amdgpu_vm_bo_update_mapping() (which checks
   amdgpu_vm->evicting) and produces this error log.
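
The sequence above can be sketched as a small user-space model. This is
not kernel code: every name below (vm_model, model_*) is hypothetical and
only stands in for the amdgpu fields and functions named in the steps,
and the -EBUSY return is an assumption taken from the -16 in the log.

```c
/* User-space model only; not kernel code. All names are hypothetical. */
#include <errno.h>
#include <stdbool.h>

struct vm_model {
	bool evicting;      /* stands in for amdgpu_vm->evicting */
	int  evicted_count; /* stands in for the length of amdgpu_vm->evicted */
};

/*
 * Step 4: the evictability check sets the flag, but the BO only ends up
 * on the evicted list if the eviction really happens.
 */
static void model_check_evictable(struct vm_model *vm, bool really_evict)
{
	vm->evicting = true;
	if (really_evict)
		vm->evicted_count++;
}

/* Old readiness test: looks only at the evicted list. */
static bool model_ready_old(const struct vm_model *vm)
{
	return vm->evicted_count == 0;
}

/* New readiness test: looks only at the evicting flag. */
static bool model_ready_new(const struct vm_model *vm)
{
	return !vm->evicting;
}

/* Stand-in for the mapping update, which refuses to run while evicting. */
static int model_update_mapping(const struct vm_model *vm)
{
	return vm->evicting ? -EBUSY : 0;
}

/*
 * Step 5: a VM op first asks whether the VM is ready and only then
 * updates the mapping. Returns 0 if the op was deferred or succeeded,
 * or a negative errno if the update was attempted and failed.
 */
static int model_vm_op(const struct vm_model *vm, bool use_new_check)
{
	bool ready = use_new_check ? model_ready_new(vm) : model_ready_old(vm);

	if (!ready)
		return 0; /* deferred quietly, nothing is logged */
	return model_update_mapping(vm);
}
```

In this model, after model_check_evictable(&vm, false) on a fresh VM,
model_vm_op(&vm, false) returns -EBUSY (the old check lets the op
through), while model_vm_op(&vm, true) returns 0 because the op is
deferred before the update is attempted.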

This error does not affect functionality, as the next CS will finish
the waiting VM ops. But it is better to get rid of the error log by
checking the amdgpu_vm->evicting flag in amdgpu_vm_ready(), so that
amdgpu_vm_bo_update_mapping() is not called later.

Another reason is that the amdgpu_vm->evicted list holds all BOs (both
user buffers and page tables), but only the eviction of page table BOs
prevents VM ops. The amdgpu_vm->evicting flag is set only for page
table BOs, so amdgpu_vm_ready() should use the evicting flag instead
of the evicted list.

The side effect of this change is that a previously blocked VM op (user
buffer in the "evicted" list but no page table in it) now gets done
immediately.
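
The difference between the two checks, including the side effect, can be
illustrated with another user-space sketch. The names (sketch_vm,
sketch_*) are hypothetical, and the evicted list is reduced to two
counters; it only mirrors the statement that the list holds both kinds
of BO while the flag is set for page table BOs only.

```c
/* User-space illustration only; all names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

enum sketch_bo_kind { SKETCH_USER_BUFFER, SKETCH_PAGE_TABLE };

struct sketch_vm {
	size_t evicted_user; /* user buffers on the "evicted" list */
	size_t evicted_pt;   /* page table BOs on the "evicted" list */
	bool   evicting;     /* set only when a page table BO is evicted */
};

/* Every evicted BO lands on the list; only page tables set the flag. */
static void sketch_evict(struct sketch_vm *vm, enum sketch_bo_kind kind)
{
	if (kind == SKETCH_PAGE_TABLE) {
		vm->evicted_pt++;
		vm->evicting = true;
	} else {
		vm->evicted_user++;
	}
}

/* Old behaviour: ready only if the evicted list is completely empty. */
static bool sketch_ready_by_list(const struct sketch_vm *vm)
{
	return vm->evicted_user + vm->evicted_pt == 0;
}

/* New behaviour: ready unless a page table eviction is in progress. */
static bool sketch_ready_by_flag(const struct sketch_vm *vm)
{
	return !vm->evicting;
}
```

With only a user buffer evicted, sketch_ready_by_list() reports not
ready while sketch_ready_by_flag() reports ready, which is the side
effect described above. Once a page table BO is evicted, both report
not ready.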

v2: update commit comments.

Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Qiang Yu <qiang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -715,11 +715,16 @@ int amdgpu_vm_validate_pt_bos(struct amd
  * Check if all VM PDs/PTs are ready for updates
  *
  * Returns:
- * True if eviction list is empty.
+ * True if VM is not evicting.
  */
 bool amdgpu_vm_ready(struct amdgpu_vm *vm)
 {
-	return list_empty(&vm->evicted);
+	bool ret;
+
+	amdgpu_vm_eviction_lock(vm);
+	ret = !vm->evicting;
+	amdgpu_vm_eviction_unlock(vm);
+	return ret;
 }
 
 /**


