* [RFC i-g-t 0/6] 21st century intel_gpu_top & Co.
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
IGT patches accompanying the similary named i915 series. Most notably to sketch
an improved intel_gpu_top which now, like the real top, can show the per client
engine utilisation:
intel-gpu-top - load avg 3.30, 1.51, 0.08; 949/ 949 MHz; 0% RC6; 14.66 Watts; 3605 irqs/s
IMC reads: 4651 MiB/s
IMC writes: 25 MiB/s
ENGINE BUSY Q r R MI_SEMA MI_WAIT
Render/3D/0 61.51% |█████████████████████████████████████████████▌ | 3 0 1 0% 0%
Blitter/0 0.00% | | 0 0 0 0% 0%
Video/0 60.86% |█████████████████████████████████████████████ | 1 0 1 0% 0%
Video/1 59.04% |███████████████████████████████████████████▋ | 1 0 1 0% 0%
VideoEnhance/0 0.00% | | 0 0 0 0% 0%
PID NAME Render/3D/0 Blitter/0 Video/0 Video/1 VideoEnhance/0
23373 gem_wsim |█████▎ || ||████████▍ ||█████▎ || |
23374 gem_wsim |███▉ || ||██▏ ||███ || |
23375 gem_wsim |███ || ||█▍ ||███▌ || |
Tvrtko Ursulin (6):
include: DRM uAPI headers update
intel-gpu-overlay: Add engine queue stats
intel-gpu-overlay: Show 1s, 30s and 15m GPU load
tests/perf_pmu: Add tests for engine queued/runnable/running stats
intel-gpu-top: Add queue depths and load average
intel-gpu-top: Support for client stats
include/drm-uapi/amdgpu_drm.h | 52 +++-
include/drm-uapi/drm.h | 16 ++
include/drm-uapi/drm_fourcc.h | 215 ++++++++++++++
include/drm-uapi/drm_mode.h | 26 +-
include/drm-uapi/etnaviv_drm.h | 6 +
include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++
include/drm-uapi/i915_drm.h | 49 +++-
include/drm-uapi/msm_drm.h | 2 +
include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++-
include/drm-uapi/v3d_drm.h | 194 +++++++++++++
include/drm-uapi/vc4_drm.h | 13 +-
include/drm-uapi/virtgpu_drm.h | 1 +
include/drm-uapi/vmwgfx_drm.h | 166 ++++++++---
overlay/gpu-top.c | 81 +++++-
overlay/gpu-top.h | 22 +-
overlay/overlay.c | 35 ++-
tests/perf_pmu.c | 259 +++++++++++++++++
tools/Makefile.am | 2 +-
tools/intel_gpu_top.c | 480 ++++++++++++++++++++++++++++++--
tools/meson.build | 2 +-
20 files changed, 2264 insertions(+), 89 deletions(-)
create mode 100644 include/drm-uapi/v3d_drm.h
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 0/6] 21st century intel_gpu_top & Co.
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
IGT patches accompanying the similary named i915 series. Most notably to sketch
an improved intel_gpu_top which now, like the real top, can show the per client
engine utilisation:
intel-gpu-top - load avg 3.30, 1.51, 0.08; 949/ 949 MHz; 0% RC6; 14.66 Watts; 3605 irqs/s
IMC reads: 4651 MiB/s
IMC writes: 25 MiB/s
ENGINE BUSY Q r R MI_SEMA MI_WAIT
Render/3D/0 61.51% |█████████████████████████████████████████████▌ | 3 0 1 0% 0%
Blitter/0 0.00% | | 0 0 0 0% 0%
Video/0 60.86% |█████████████████████████████████████████████ | 1 0 1 0% 0%
Video/1 59.04% |███████████████████████████████████████████▋ | 1 0 1 0% 0%
VideoEnhance/0 0.00% | | 0 0 0 0% 0%
PID NAME Render/3D/0 Blitter/0 Video/0 Video/1 VideoEnhance/0
23373 gem_wsim |█████▎ || ||████████▍ ||█████▎ || |
23374 gem_wsim |███▉ || ||██▏ ||███ || |
23375 gem_wsim |███ || ||█▍ ||███▌ || |
Tvrtko Ursulin (6):
include: DRM uAPI headers update
intel-gpu-overlay: Add engine queue stats
intel-gpu-overlay: Show 1s, 30s and 15m GPU load
tests/perf_pmu: Add tests for engine queued/runnable/running stats
intel-gpu-top: Add queue depths and load average
intel-gpu-top: Support for client stats
include/drm-uapi/amdgpu_drm.h | 52 +++-
include/drm-uapi/drm.h | 16 ++
include/drm-uapi/drm_fourcc.h | 215 ++++++++++++++
include/drm-uapi/drm_mode.h | 26 +-
include/drm-uapi/etnaviv_drm.h | 6 +
include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++
include/drm-uapi/i915_drm.h | 49 +++-
include/drm-uapi/msm_drm.h | 2 +
include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++-
include/drm-uapi/v3d_drm.h | 194 +++++++++++++
include/drm-uapi/vc4_drm.h | 13 +-
include/drm-uapi/virtgpu_drm.h | 1 +
include/drm-uapi/vmwgfx_drm.h | 166 ++++++++---
overlay/gpu-top.c | 81 +++++-
overlay/gpu-top.h | 22 +-
overlay/overlay.c | 35 ++-
tests/perf_pmu.c | 259 +++++++++++++++++
tools/Makefile.am | 2 +-
tools/intel_gpu_top.c | 480 ++++++++++++++++++++++++++++++--
tools/meson.build | 2 +-
20 files changed, 2264 insertions(+), 89 deletions(-)
create mode 100644 include/drm-uapi/v3d_drm.h
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC i-g-t 1/6] include: DRM uAPI headers update
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
include/drm-uapi/amdgpu_drm.h | 52 +++-
include/drm-uapi/drm.h | 16 ++
include/drm-uapi/drm_fourcc.h | 215 ++++++++++++++
include/drm-uapi/drm_mode.h | 26 +-
include/drm-uapi/etnaviv_drm.h | 6 +
include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++
include/drm-uapi/i915_drm.h | 49 +++-
include/drm-uapi/msm_drm.h | 2 +
include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++-
include/drm-uapi/v3d_drm.h | 194 +++++++++++++
include/drm-uapi/vc4_drm.h | 13 +-
include/drm-uapi/virtgpu_drm.h | 1 +
include/drm-uapi/vmwgfx_drm.h | 166 ++++++++---
13 files changed, 1415 insertions(+), 57 deletions(-)
create mode 100644 include/drm-uapi/v3d_drm.h
diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index 1816bd8200d1..370e9a5536ef 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -72,12 +72,41 @@ extern "C" {
#define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
#define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+/**
+ * DOC: memory domains
+ *
+ * %AMDGPU_GEM_DOMAIN_CPU System memory that is not GPU accessible.
+ * Memory in this pool could be swapped out to disk if there is pressure.
+ *
+ * %AMDGPU_GEM_DOMAIN_GTT GPU accessible system memory, mapped into the
+ * GPU's virtual address space via gart. Gart memory linearizes non-contiguous
+ * pages of system memory, allows GPU access system memory in a linezrized
+ * fashion.
+ *
+ * %AMDGPU_GEM_DOMAIN_VRAM Local video memory. For APUs, it is memory
+ * carved out by the BIOS.
+ *
+ * %AMDGPU_GEM_DOMAIN_GDS Global on-chip data storage used to share data
+ * across shader threads.
+ *
+ * %AMDGPU_GEM_DOMAIN_GWS Global wave sync, used to synchronize the
+ * execution of all the waves on a device.
+ *
+ * %AMDGPU_GEM_DOMAIN_OA Ordered append, used by 3D or Compute engines
+ * for appending data.
+ */
#define AMDGPU_GEM_DOMAIN_CPU 0x1
#define AMDGPU_GEM_DOMAIN_GTT 0x2
#define AMDGPU_GEM_DOMAIN_VRAM 0x4
#define AMDGPU_GEM_DOMAIN_GDS 0x8
#define AMDGPU_GEM_DOMAIN_GWS 0x10
#define AMDGPU_GEM_DOMAIN_OA 0x20
+#define AMDGPU_GEM_DOMAIN_MASK (AMDGPU_GEM_DOMAIN_CPU | \
+ AMDGPU_GEM_DOMAIN_GTT | \
+ AMDGPU_GEM_DOMAIN_VRAM | \
+ AMDGPU_GEM_DOMAIN_GDS | \
+ AMDGPU_GEM_DOMAIN_GWS | \
+ AMDGPU_GEM_DOMAIN_OA)
/* Flag that CPU access will be required for the case of VRAM domain */
#define AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED (1 << 0)
@@ -95,6 +124,10 @@ extern "C" {
#define AMDGPU_GEM_CREATE_VM_ALWAYS_VALID (1 << 6)
/* Flag that BO sharing will be explicitly synchronized */
#define AMDGPU_GEM_CREATE_EXPLICIT_SYNC (1 << 7)
+/* Flag that indicates allocating MQD gart on GFX9, where the mtype
+ * for the second page onward should be set to NC.
+ */
+#define AMDGPU_GEM_CREATE_MQD_GFX9 (1 << 8)
struct drm_amdgpu_gem_create_in {
/** the requested memory size */
@@ -473,7 +506,8 @@ struct drm_amdgpu_gem_va {
#define AMDGPU_HW_IP_UVD_ENC 5
#define AMDGPU_HW_IP_VCN_DEC 6
#define AMDGPU_HW_IP_VCN_ENC 7
-#define AMDGPU_HW_IP_NUM 8
+#define AMDGPU_HW_IP_VCN_JPEG 8
+#define AMDGPU_HW_IP_NUM 9
#define AMDGPU_HW_IP_INSTANCE_MAX_COUNT 1
@@ -482,6 +516,7 @@ struct drm_amdgpu_gem_va {
#define AMDGPU_CHUNK_ID_DEPENDENCIES 0x03
#define AMDGPU_CHUNK_ID_SYNCOBJ_IN 0x04
#define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
+#define AMDGPU_CHUNK_ID_BO_HANDLES 0x06
struct drm_amdgpu_cs_chunk {
__u32 chunk_id;
@@ -520,6 +555,10 @@ union drm_amdgpu_cs {
/* Preempt flag, IB should set Pre_enb bit if PREEMPT flag detected */
#define AMDGPU_IB_FLAG_PREEMPT (1<<2)
+/* The IB fence should do the L2 writeback but not invalidate any shader
+ * caches (L2/vL1/sL1/I$). */
+#define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
+
struct drm_amdgpu_cs_chunk_ib {
__u32 _pad;
/** AMDGPU_IB_FLAG_* */
@@ -618,6 +657,16 @@ struct drm_amdgpu_cs_chunk_data {
#define AMDGPU_INFO_FW_SOS 0x0c
/* Subquery id: Query PSP ASD firmware version */
#define AMDGPU_INFO_FW_ASD 0x0d
+ /* Subquery id: Query VCN firmware version */
+ #define AMDGPU_INFO_FW_VCN 0x0e
+ /* Subquery id: Query GFX RLC SRLC firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_CNTL 0x0f
+ /* Subquery id: Query GFX RLC SRLG firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_GPM_MEM 0x10
+ /* Subquery id: Query GFX RLC SRLS firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
+ /* Subquery id: Query DMCU firmware version */
+ #define AMDGPU_INFO_FW_DMCU 0x12
/* number of bytes moved for TTM migration */
#define AMDGPU_INFO_NUM_BYTES_MOVED 0x0f
/* the used VRAM size */
@@ -806,6 +855,7 @@ struct drm_amdgpu_info_firmware {
#define AMDGPU_VRAM_TYPE_GDDR5 5
#define AMDGPU_VRAM_TYPE_HBM 6
#define AMDGPU_VRAM_TYPE_DDR3 7
+#define AMDGPU_VRAM_TYPE_DDR4 8
struct drm_amdgpu_info_device {
/** PCI Device ID */
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index f0bd91de0cf9..85c685a2075e 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -674,6 +674,22 @@ struct drm_get_cap {
*/
#define DRM_CLIENT_CAP_ATOMIC 3
+/**
+ * DRM_CLIENT_CAP_ASPECT_RATIO
+ *
+ * If set to 1, the DRM core will provide aspect ratio information in modes.
+ */
+#define DRM_CLIENT_CAP_ASPECT_RATIO 4
+
+/**
+ * DRM_CLIENT_CAP_WRITEBACK_CONNECTORS
+ *
+ * If set to 1, the DRM core will expose special connectors to be used for
+ * writing back to memory the scene setup in the commit. Depends on client
+ * also supporting DRM_CLIENT_CAP_ATOMIC
+ */
+#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS 5
+
/** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */
struct drm_set_client_cap {
__u64 capability;
diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h
index e04613d30a13..139632b87181 100644
--- a/include/drm-uapi/drm_fourcc.h
+++ b/include/drm-uapi/drm_fourcc.h
@@ -30,11 +30,50 @@
extern "C" {
#endif
+/**
+ * DOC: overview
+ *
+ * In the DRM subsystem, framebuffer pixel formats are described using the
+ * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the
+ * fourcc code, a Format Modifier may optionally be provided, in order to
+ * further describe the buffer's format - for example tiling or compression.
+ *
+ * Format Modifiers
+ * ----------------
+ *
+ * Format modifiers are used in conjunction with a fourcc code, forming a
+ * unique fourcc:modifier pair. This format:modifier pair must fully define the
+ * format and data layout of the buffer, and should be the only way to describe
+ * that particular buffer.
+ *
+ * Having multiple fourcc:modifier pairs which describe the same layout should
+ * be avoided, as such aliases run the risk of different drivers exposing
+ * different names for the same data format, forcing userspace to understand
+ * that they are aliases.
+ *
+ * Format modifiers may change any property of the buffer, including the number
+ * of planes and/or the required allocation size. Format modifiers are
+ * vendor-namespaced, and as such the relationship between a fourcc code and a
+ * modifier is specific to the modifer being used. For example, some modifiers
+ * may preserve meaning - such as number of planes - from the fourcc code,
+ * whereas others may not.
+ *
+ * Vendors should document their modifier usage in as much detail as
+ * possible, to ensure maximum compatibility across devices, drivers and
+ * applications.
+ *
+ * The authoritative list of format modifier codes is found in
+ * `include/uapi/drm/drm_fourcc.h`
+ */
+
#define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \
((__u32)(c) << 16) | ((__u32)(d) << 24))
#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */
+/* Reserve 0 for the invalid format specifier */
+#define DRM_FORMAT_INVALID 0
+
/* color index */
#define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
@@ -183,6 +222,7 @@ extern "C" {
#define DRM_FORMAT_MOD_VENDOR_QCOM 0x05
#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
#define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07
+#define DRM_FORMAT_MOD_VENDOR_ARM 0x08
/* add more to the end as needed */
#define DRM_FORMAT_RESERVED ((1ULL << 56) - 1)
@@ -298,6 +338,19 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE fourcc_mod_code(SAMSUNG, 1)
+/*
+ * Qualcomm Compressed Format
+ *
+ * Refers to a compressed variant of the base format that is compressed.
+ * Implementation may be platform and base-format specific.
+ *
+ * Each macrotile consists of m x n (mostly 4 x 4) tiles.
+ * Pixel data pitch/stride is aligned with macrotile width.
+ * Pixel data height is aligned with macrotile height.
+ * Entire pixel data buffer is aligned with 4k(bytes).
+ */
+#define DRM_FORMAT_MOD_QCOM_COMPRESSED fourcc_mod_code(QCOM, 1)
+
/* Vivante framebuffer modifiers */
/*
@@ -384,6 +437,23 @@ extern "C" {
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \
fourcc_mod_code(NVIDIA, 0x15)
+/*
+ * Some Broadcom modifiers take parameters, for example the number of
+ * vertical lines in the image. Reserve the lower 32 bits for modifier
+ * type, and the next 24 bits for parameters. Top 8 bits are the
+ * vendor code.
+ */
+#define __fourcc_mod_broadcom_param_shift 8
+#define __fourcc_mod_broadcom_param_bits 48
+#define fourcc_mod_broadcom_code(val, params) \
+ fourcc_mod_code(BROADCOM, ((((__u64)params) << __fourcc_mod_broadcom_param_shift) | val))
+#define fourcc_mod_broadcom_param(m) \
+ ((int)(((m) >> __fourcc_mod_broadcom_param_shift) & \
+ ((1ULL << __fourcc_mod_broadcom_param_bits) - 1)))
+#define fourcc_mod_broadcom_mod(m) \
+ ((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) << \
+ __fourcc_mod_broadcom_param_shift))
+
/*
* Broadcom VC4 "T" format
*
@@ -405,6 +475,151 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED fourcc_mod_code(BROADCOM, 1)
+/*
+ * Broadcom SAND format
+ *
+ * This is the native format that the H.264 codec block uses. For VC4
+ * HVS, it is only valid for H.264 (NV12/21) and RGBA modes.
+ *
+ * The image can be considered to be split into columns, and the
+ * columns are placed consecutively into memory. The width of those
+ * columns can be either 32, 64, 128, or 256 pixels, but in practice
+ * only 128 pixel columns are used.
+ *
+ * The pitch between the start of each column is set to optimally
+ * switch between SDRAM banks. This is passed as the number of lines
+ * of column width in the modifier (we can't use the stride value due
+ * to various core checks that look at it , so you should set the
+ * stride to width*cpp).
+ *
+ * Note that the column height for this format modifier is the same
+ * for all of the planes, assuming that each column contains both Y
+ * and UV. Some SAND-using hardware stores UV in a separate tiled
+ * image from Y to reduce the column height, which is not supported
+ * with these modifiers.
+ */
+
+#define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(2, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(3, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(4, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(5, v)
+
+#define DRM_FORMAT_MOD_BROADCOM_SAND32 \
+ DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND64 \
+ DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND128 \
+ DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND256 \
+ DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(0)
+
+/* Broadcom UIF format
+ *
+ * This is the common format for the current Broadcom multimedia
+ * blocks, including V3D 3.x and newer, newer video codecs, and
+ * displays.
+ *
+ * The image consists of utiles (64b blocks), UIF blocks (2x2 utiles),
+ * and macroblocks (4x4 UIF blocks). Those 4x4 UIF block groups are
+ * stored in columns, with padding between the columns to ensure that
+ * moving from one column to the next doesn't hit the same SDRAM page
+ * bank.
+ *
+ * To calculate the padding, it is assumed that each hardware block
+ * and the software driving it knows the platform's SDRAM page size,
+ * number of banks, and XOR address, and that it's identical between
+ * all blocks using the format. This tiling modifier will use XOR as
+ * necessary to reduce the padding. If a hardware block can't do XOR,
+ * the assumption is that a no-XOR tiling modifier will be created.
+ */
+#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6)
+
+/*
+ * Arm Framebuffer Compression (AFBC) modifiers
+ *
+ * AFBC is a proprietary lossless image compression protocol and format.
+ * It provides fine-grained random access and minimizes the amount of data
+ * transferred between IP blocks.
+ *
+ * AFBC has several features which may be supported and/or used, which are
+ * represented using bits in the modifier. Not all combinations are valid,
+ * and different devices or use-cases may support different combinations.
+ */
+#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode)
+
+/*
+ * AFBC superblock size
+ *
+ * Indicates the superblock size(s) used for the AFBC buffer. The buffer
+ * size (in pixels) must be aligned to a multiple of the superblock size.
+ * Four lowest significant bits(LSBs) are reserved for block size.
+ */
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK 0xf
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL)
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 (2ULL)
+
+/*
+ * AFBC lossless colorspace transform
+ *
+ * Indicates that the buffer makes use of the AFBC lossless colorspace
+ * transform.
+ */
+#define AFBC_FORMAT_MOD_YTR (1ULL << 4)
+
+/*
+ * AFBC block-split
+ *
+ * Indicates that the payload of each superblock is split. The second
+ * half of the payload is positioned at a predefined offset from the start
+ * of the superblock payload.
+ */
+#define AFBC_FORMAT_MOD_SPLIT (1ULL << 5)
+
+/*
+ * AFBC sparse layout
+ *
+ * This flag indicates that the payload of each superblock must be stored at a
+ * predefined position relative to the other superblocks in the same AFBC
+ * buffer. This order is the same order used by the header buffer. In this mode
+ * each superblock is given the same amount of space as an uncompressed
+ * superblock of the particular format would require, rounding up to the next
+ * multiple of 128 bytes in size.
+ */
+#define AFBC_FORMAT_MOD_SPARSE (1ULL << 6)
+
+/*
+ * AFBC copy-block restrict
+ *
+ * Buffers with this flag must obey the copy-block restriction. The restriction
+ * is such that there are no copy-blocks referring across the border of 8x8
+ * blocks. For the subsampled data the 8x8 limitation is also subsampled.
+ */
+#define AFBC_FORMAT_MOD_CBR (1ULL << 7)
+
+/*
+ * AFBC tiled layout
+ *
+ * The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all
+ * superblocks inside a tile are stored together in memory. 8x8 tiles are used
+ * for pixel formats up to and including 32 bpp while 4x4 tiles are used for
+ * larger bpp formats. The order between the tiles is scan line.
+ * When the tiled layout is used, the buffer size (in pixels) must be aligned
+ * to the tile size.
+ */
+#define AFBC_FORMAT_MOD_TILED (1ULL << 8)
+
+/*
+ * AFBC solid color blocks
+ *
+ * Indicates that the buffer makes use of solid-color blocks, whereby bandwidth
+ * can be reduced if a whole superblock is a single color.
+ */
+#define AFBC_FORMAT_MOD_SC (1ULL << 9)
+
#if defined(__cplusplus)
}
#endif
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index 2c575794fb52..d3e0fe31efc5 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -93,6 +93,15 @@ extern "C" {
#define DRM_MODE_PICTURE_ASPECT_NONE 0
#define DRM_MODE_PICTURE_ASPECT_4_3 1
#define DRM_MODE_PICTURE_ASPECT_16_9 2
+#define DRM_MODE_PICTURE_ASPECT_64_27 3
+#define DRM_MODE_PICTURE_ASPECT_256_135 4
+
+/* Content type options */
+#define DRM_MODE_CONTENT_TYPE_NO_DATA 0
+#define DRM_MODE_CONTENT_TYPE_GRAPHICS 1
+#define DRM_MODE_CONTENT_TYPE_PHOTO 2
+#define DRM_MODE_CONTENT_TYPE_CINEMA 3
+#define DRM_MODE_CONTENT_TYPE_GAME 4
/* Aspect ratio flag bitmask (4 bits 22:19) */
#define DRM_MODE_FLAG_PIC_AR_MASK (0x0F<<19)
@@ -102,6 +111,10 @@ extern "C" {
(DRM_MODE_PICTURE_ASPECT_4_3<<19)
#define DRM_MODE_FLAG_PIC_AR_16_9 \
(DRM_MODE_PICTURE_ASPECT_16_9<<19)
+#define DRM_MODE_FLAG_PIC_AR_64_27 \
+ (DRM_MODE_PICTURE_ASPECT_64_27<<19)
+#define DRM_MODE_FLAG_PIC_AR_256_135 \
+ (DRM_MODE_PICTURE_ASPECT_256_135<<19)
#define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \
DRM_MODE_FLAG_NHSYNC | \
@@ -173,8 +186,9 @@ extern "C" {
/*
* DRM_MODE_REFLECT_<axis>
*
- * Signals that the contents of a drm plane is reflected in the <axis> axis,
+ * Signals that the contents of a drm plane is reflected along the <axis> axis,
* in the same way as mirroring.
+ * See kerneldoc chapter "Plane Composition Properties" for more details.
*
* This define is provided as a convenience, looking up the property id
* using the name->prop id lookup is the preferred method.
@@ -338,6 +352,7 @@ enum drm_mode_subconnector {
#define DRM_MODE_CONNECTOR_VIRTUAL 15
#define DRM_MODE_CONNECTOR_DSI 16
#define DRM_MODE_CONNECTOR_DPI 17
+#define DRM_MODE_CONNECTOR_WRITEBACK 18
struct drm_mode_get_connector {
@@ -363,7 +378,7 @@ struct drm_mode_get_connector {
__u32 pad;
};
-#define DRM_MODE_PROP_PENDING (1<<0)
+#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */
#define DRM_MODE_PROP_RANGE (1<<1)
#define DRM_MODE_PROP_IMMUTABLE (1<<2)
#define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */
@@ -598,8 +613,11 @@ struct drm_mode_crtc_lut {
};
struct drm_color_ctm {
- /* Conversion matrix in S31.32 format. */
- __s64 matrix[9];
+ /*
+ * Conversion matrix in S31.32 sign-magnitude
+ * (not two's complement!) format.
+ */
+ __u64 matrix[9];
};
struct drm_color_lut {
diff --git a/include/drm-uapi/etnaviv_drm.h b/include/drm-uapi/etnaviv_drm.h
index e9b997a0ef27..0d5c49dc478c 100644
--- a/include/drm-uapi/etnaviv_drm.h
+++ b/include/drm-uapi/etnaviv_drm.h
@@ -55,6 +55,12 @@ struct drm_etnaviv_timespec {
#define ETNAVIV_PARAM_GPU_FEATURES_4 0x07
#define ETNAVIV_PARAM_GPU_FEATURES_5 0x08
#define ETNAVIV_PARAM_GPU_FEATURES_6 0x09
+#define ETNAVIV_PARAM_GPU_FEATURES_7 0x0a
+#define ETNAVIV_PARAM_GPU_FEATURES_8 0x0b
+#define ETNAVIV_PARAM_GPU_FEATURES_9 0x0c
+#define ETNAVIV_PARAM_GPU_FEATURES_10 0x0d
+#define ETNAVIV_PARAM_GPU_FEATURES_11 0x0e
+#define ETNAVIV_PARAM_GPU_FEATURES_12 0x0f
#define ETNAVIV_PARAM_GPU_STREAM_COUNT 0x10
#define ETNAVIV_PARAM_GPU_REGISTER_MAX 0x11
diff --git a/include/drm-uapi/exynos_drm.h b/include/drm-uapi/exynos_drm.h
index a00116b5cc5c..7414cfd76419 100644
--- a/include/drm-uapi/exynos_drm.h
+++ b/include/drm-uapi/exynos_drm.h
@@ -135,6 +135,219 @@ struct drm_exynos_g2d_exec {
__u64 async;
};
+/* Exynos DRM IPP v2 API */
+
+/**
+ * Enumerate available IPP hardware modules.
+ *
+ * @count_ipps: size of ipp_id array / number of ipp modules (set by driver)
+ * @reserved: padding
+ * @ipp_id_ptr: pointer to ipp_id array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_res {
+ __u32 count_ipps;
+ __u32 reserved;
+ __u64 ipp_id_ptr;
+};
+
+enum drm_exynos_ipp_format_type {
+ DRM_EXYNOS_IPP_FORMAT_SOURCE = 0x01,
+ DRM_EXYNOS_IPP_FORMAT_DESTINATION = 0x02,
+};
+
+struct drm_exynos_ipp_format {
+ __u32 fourcc;
+ __u32 type;
+ __u64 modifier;
+};
+
+enum drm_exynos_ipp_capability {
+ DRM_EXYNOS_IPP_CAP_CROP = 0x01,
+ DRM_EXYNOS_IPP_CAP_ROTATE = 0x02,
+ DRM_EXYNOS_IPP_CAP_SCALE = 0x04,
+ DRM_EXYNOS_IPP_CAP_CONVERT = 0x08,
+};
+
+/**
+ * Get IPP hardware capabilities and supported image formats.
+ *
+ * @ipp_id: id of IPP module to query
+ * @capabilities: bitmask of drm_exynos_ipp_capability (set by driver)
+ * @reserved: padding
+ * @formats_count: size of formats array (in entries) / number of filled
+ * formats (set by driver)
+ * @formats_ptr: pointer to formats array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_caps {
+ __u32 ipp_id;
+ __u32 capabilities;
+ __u32 reserved;
+ __u32 formats_count;
+ __u64 formats_ptr;
+};
+
+enum drm_exynos_ipp_limit_type {
+ /* size (horizontal/vertial) limits, in pixels (min, max, alignment) */
+ DRM_EXYNOS_IPP_LIMIT_TYPE_SIZE = 0x0001,
+ /* scale ratio (horizonta/vertial), 16.16 fixed point (min, max) */
+ DRM_EXYNOS_IPP_LIMIT_TYPE_SCALE = 0x0002,
+
+ /* image buffer area */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_BUFFER = 0x0001 << 16,
+ /* src/dst rectangle area */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_AREA = 0x0002 << 16,
+ /* src/dst rectangle area when rotation enabled */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_ROTATED = 0x0003 << 16,
+
+ DRM_EXYNOS_IPP_LIMIT_TYPE_MASK = 0x000f,
+ DRM_EXYNOS_IPP_LIMIT_SIZE_MASK = 0x000f << 16,
+};
+
+struct drm_exynos_ipp_limit_val {
+ __u32 min;
+ __u32 max;
+ __u32 align;
+ __u32 reserved;
+};
+
+/**
+ * IPP module limitation.
+ *
+ * @type: limit type (see drm_exynos_ipp_limit_type enum)
+ * @reserved: padding
+ * @h: horizontal limits
+ * @v: vertical limits
+ */
+struct drm_exynos_ipp_limit {
+ __u32 type;
+ __u32 reserved;
+ struct drm_exynos_ipp_limit_val h;
+ struct drm_exynos_ipp_limit_val v;
+};
+
+/**
+ * Get IPP limits for given image format.
+ *
+ * @ipp_id: id of IPP module to query
+ * @fourcc: image format code (see DRM_FORMAT_* in drm_fourcc.h)
+ * @modifier: image format modifier (see DRM_FORMAT_MOD_* in drm_fourcc.h)
+ * @type: source/destination identifier (drm_exynos_ipp_format_flag enum)
+ * @limits_count: size of limits array (in entries) / number of filled entries
+ * (set by driver)
+ * @limits_ptr: pointer to limits array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_limits {
+ __u32 ipp_id;
+ __u32 fourcc;
+ __u64 modifier;
+ __u32 type;
+ __u32 limits_count;
+ __u64 limits_ptr;
+};
+
+enum drm_exynos_ipp_task_id {
+ /* buffer described by struct drm_exynos_ipp_task_buffer */
+ DRM_EXYNOS_IPP_TASK_BUFFER = 0x0001,
+ /* rectangle described by struct drm_exynos_ipp_task_rect */
+ DRM_EXYNOS_IPP_TASK_RECTANGLE = 0x0002,
+ /* transformation described by struct drm_exynos_ipp_task_transform */
+ DRM_EXYNOS_IPP_TASK_TRANSFORM = 0x0003,
+ /* alpha configuration described by struct drm_exynos_ipp_task_alpha */
+ DRM_EXYNOS_IPP_TASK_ALPHA = 0x0004,
+
+ /* source image data (for buffer and rectangle chunks) */
+ DRM_EXYNOS_IPP_TASK_TYPE_SOURCE = 0x0001 << 16,
+ /* destination image data (for buffer and rectangle chunks) */
+ DRM_EXYNOS_IPP_TASK_TYPE_DESTINATION = 0x0002 << 16,
+};
+
+/**
+ * Memory buffer with image data.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_BUFFER
+ * other parameters are same as for AddFB2 generic DRM ioctl
+ */
+struct drm_exynos_ipp_task_buffer {
+ __u32 id;
+ __u32 fourcc;
+ __u32 width, height;
+ __u32 gem_id[4];
+ __u32 offset[4];
+ __u32 pitch[4];
+ __u64 modifier;
+};
+
+/**
+ * Rectangle for processing.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_RECTANGLE
+ * @reserved: padding
+ * @x,@y: left corner in pixels
+ * @w,@h: width/height in pixels
+ */
+struct drm_exynos_ipp_task_rect {
+ __u32 id;
+ __u32 reserved;
+ __u32 x;
+ __u32 y;
+ __u32 w;
+ __u32 h;
+};
+
+/**
+ * Image tranformation description.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_TRANSFORM
+ * @rotation: DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* values
+ */
+struct drm_exynos_ipp_task_transform {
+ __u32 id;
+ __u32 rotation;
+};
+
+/**
+ * Image global alpha configuration for formats without alpha values.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_ALPHA
+ * @value: global alpha value (0-255)
+ */
+struct drm_exynos_ipp_task_alpha {
+ __u32 id;
+ __u32 value;
+};
+
+enum drm_exynos_ipp_flag {
+ /* generate DRM event after processing */
+ DRM_EXYNOS_IPP_FLAG_EVENT = 0x01,
+ /* dry run, only check task parameters */
+ DRM_EXYNOS_IPP_FLAG_TEST_ONLY = 0x02,
+ /* non-blocking processing */
+ DRM_EXYNOS_IPP_FLAG_NONBLOCK = 0x04,
+};
+
+#define DRM_EXYNOS_IPP_FLAGS (DRM_EXYNOS_IPP_FLAG_EVENT |\
+ DRM_EXYNOS_IPP_FLAG_TEST_ONLY | DRM_EXYNOS_IPP_FLAG_NONBLOCK)
+
+/**
+ * Perform image processing described by array of drm_exynos_ipp_task_*
+ * structures (parameters array).
+ *
+ * @ipp_id: id of IPP module to run the task
+ * @flags: bitmask of drm_exynos_ipp_flag values
+ * @reserved: padding
+ * @params_size: size of parameters array (in bytes)
+ * @params_ptr: pointer to parameters array or NULL
+ * @user_data: (optional) data for drm event
+ */
+struct drm_exynos_ioctl_ipp_commit {
+ __u32 ipp_id;
+ __u32 flags;
+ __u32 reserved;
+ __u32 params_size;
+ __u64 params_ptr;
+ __u64 user_data;
+};
+
#define DRM_EXYNOS_GEM_CREATE 0x00
#define DRM_EXYNOS_GEM_MAP 0x01
/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
@@ -147,6 +360,11 @@ struct drm_exynos_g2d_exec {
#define DRM_EXYNOS_G2D_EXEC 0x22
/* Reserved 0x30 ~ 0x33 for obsolete Exynos IPP ioctls */
+/* IPP - Image Post Processing */
+#define DRM_EXYNOS_IPP_GET_RESOURCES 0x40
+#define DRM_EXYNOS_IPP_GET_CAPS 0x41
+#define DRM_EXYNOS_IPP_GET_LIMITS 0x42
+#define DRM_EXYNOS_IPP_COMMIT 0x43
#define DRM_IOCTL_EXYNOS_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + \
DRM_EXYNOS_GEM_CREATE, struct drm_exynos_gem_create)
@@ -165,8 +383,20 @@ struct drm_exynos_g2d_exec {
#define DRM_IOCTL_EXYNOS_G2D_EXEC DRM_IOWR(DRM_COMMAND_BASE + \
DRM_EXYNOS_G2D_EXEC, struct drm_exynos_g2d_exec)
+#define DRM_IOCTL_EXYNOS_IPP_GET_RESOURCES DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_RESOURCES, \
+ struct drm_exynos_ioctl_ipp_get_res)
+#define DRM_IOCTL_EXYNOS_IPP_GET_CAPS DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_CAPS, struct drm_exynos_ioctl_ipp_get_caps)
+#define DRM_IOCTL_EXYNOS_IPP_GET_LIMITS DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_LIMITS, \
+ struct drm_exynos_ioctl_ipp_get_limits)
+#define DRM_IOCTL_EXYNOS_IPP_COMMIT DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_COMMIT, struct drm_exynos_ioctl_ipp_commit)
+
/* EXYNOS specific events */
#define DRM_EXYNOS_G2D_EVENT 0x80000000
+#define DRM_EXYNOS_IPP_EVENT 0x80000002
struct drm_exynos_g2d_event {
struct drm_event base;
@@ -177,6 +407,16 @@ struct drm_exynos_g2d_event {
__u32 reserved;
};
+struct drm_exynos_ipp_event {
+ struct drm_event base;
+ __u64 user_data;
+ __u32 tv_sec;
+ __u32 tv_usec;
+ __u32 ipp_id;
+ __u32 sequence;
+ __u64 reserved;
+};
+
#if defined(__cplusplus)
}
#endif
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..cf9dcf040321 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
- I915_SAMPLE_SEMA = 2
+ I915_SAMPLE_SEMA = 2,
+ I915_SAMPLE_QUEUED = 3,
+ I915_SAMPLE_RUNNABLE = 4,
+ I915_SAMPLE_RUNNING = 5,
};
+ /* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1000)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1000)
+#define I915_SAMPLE_RUNNING_DIVISOR (1000)
+
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
@@ -412,6 +429,14 @@ typedef struct drm_i915_irq_wait {
int irq_seq;
} drm_i915_irq_wait_t;
+/*
+ * Different modes of per-process Graphics Translation Table,
+ * see I915_PARAM_HAS_ALIASING_PPGTT
+ */
+#define I915_GEM_PPGTT_NONE 0
+#define I915_GEM_PPGTT_ALIASING 1
+#define I915_GEM_PPGTT_FULL 2
+
/* Ioctl to query kernel params:
*/
#define I915_PARAM_IRQ_ACTIVE 1
@@ -529,6 +554,28 @@ typedef struct drm_i915_irq_wait {
*/
#define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51
+/*
+ * Once upon a time we supposed that writes through the GGTT would be
+ * immediately in physical memory (once flushed out of the CPU path). However,
+ * on a few different processors and chipsets, this is not necessarily the case
+ * as the writes appear to be buffered internally. Thus a read of the backing
+ * storage (physical memory) via a different path (with different physical tags
+ * to the indirect write via the GGTT) will see stale values from before
+ * the GGTT write. Inside the kernel, we can for the most part keep track of
+ * the different read/write domains in use (e.g. set-domain), but the assumption
+ * of coherency is baked into the ABI, hence reporting its true state in this
+ * parameter.
+ *
+ * Reports true when writes via mmap_gtt are immediately visible following an
+ * lfence to flush the WCB.
+ *
+ * Reports false when writes via mmap_gtt are indeterminately delayed in an in
+ * internal buffer and are _not_ immediately visible to third parties accessing
+ * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
+ * communications channel when reporting false is strongly disadvised.
+ */
+#define I915_PARAM_MMAP_GTT_COHERENT 52
+
typedef struct drm_i915_getparam {
__s32 param;
/*
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index bbbaffad772d..c06d0a5bdd80 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -201,10 +201,12 @@ struct drm_msm_gem_submit_bo {
#define MSM_SUBMIT_NO_IMPLICIT 0x80000000 /* disable implicit sync */
#define MSM_SUBMIT_FENCE_FD_IN 0x40000000 /* enable input fence_fd */
#define MSM_SUBMIT_FENCE_FD_OUT 0x20000000 /* enable output fence_fd */
+#define MSM_SUBMIT_SUDO 0x10000000 /* run submitted cmds from RB */
#define MSM_SUBMIT_FLAGS ( \
MSM_SUBMIT_NO_IMPLICIT | \
MSM_SUBMIT_FENCE_FD_IN | \
MSM_SUBMIT_FENCE_FD_OUT | \
+ MSM_SUBMIT_SUDO | \
0)
/* Each cmdstream submit consists of a table of buffers involved, and
diff --git a/include/drm-uapi/tegra_drm.h b/include/drm-uapi/tegra_drm.h
index 12f9bf848db1..6c07919c04e9 100644
--- a/include/drm-uapi/tegra_drm.h
+++ b/include/drm-uapi/tegra_drm.h
@@ -32,143 +32,615 @@ extern "C" {
#define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
+/**
+ * struct drm_tegra_gem_create - parameters for the GEM object creation IOCTL
+ */
struct drm_tegra_gem_create {
+ /**
+ * @size:
+ *
+ * The size, in bytes, of the buffer object to be created.
+ */
__u64 size;
+
+ /**
+ * @flags:
+ *
+ * A bitmask of flags that influence the creation of GEM objects:
+ *
+ * DRM_TEGRA_GEM_CREATE_TILED
+ * Use the 16x16 tiling format for this buffer.
+ *
+ * DRM_TEGRA_GEM_CREATE_BOTTOM_UP
+ * The buffer has a bottom-up layout.
+ */
__u32 flags;
+
+ /**
+ * @handle:
+ *
+ * The handle of the created GEM object. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u32 handle;
};
+/**
+ * struct drm_tegra_gem_mmap - parameters for the GEM mmap IOCTL
+ */
struct drm_tegra_gem_mmap {
+ /**
+ * @handle:
+ *
+ * Handle of the GEM object to obtain an mmap offset for.
+ */
__u32 handle;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
+
+ /**
+ * @offset:
+ *
+ * The mmap offset for the given GEM object. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u64 offset;
};
+/**
+ * struct drm_tegra_syncpt_read - parameters for the read syncpoint IOCTL
+ */
struct drm_tegra_syncpt_read {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to read the current value from.
+ */
__u32 id;
+
+ /**
+ * @value:
+ *
+ * The current syncpoint value. Set by the kernel upon successful
+ * completion of the IOCTL.
+ */
__u32 value;
};
+/**
+ * struct drm_tegra_syncpt_incr - parameters for the increment syncpoint IOCTL
+ */
struct drm_tegra_syncpt_incr {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to increment.
+ */
__u32 id;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_syncpt_wait - parameters for the wait syncpoint IOCTL
+ */
struct drm_tegra_syncpt_wait {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to wait on.
+ */
__u32 id;
+
+ /**
+ * @thresh:
+ *
+ * Threshold value for which to wait.
+ */
__u32 thresh;
+
+ /**
+ * @timeout:
+ *
+ * Timeout, in milliseconds, to wait.
+ */
__u32 timeout;
+
+ /**
+ * @value:
+ *
+ * The new syncpoint value after the wait. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u32 value;
};
#define DRM_TEGRA_NO_TIMEOUT (0xffffffff)
+/**
+ * struct drm_tegra_open_channel - parameters for the open channel IOCTL
+ */
struct drm_tegra_open_channel {
+ /**
+ * @client:
+ *
+ * The client ID for this channel.
+ */
__u32 client;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
+
+ /**
+ * @context:
+ *
+ * The application context of this channel. Set by the kernel upon
+ * successful completion of the IOCTL. This context needs to be passed
+ * to the DRM_TEGRA_CHANNEL_CLOSE or the DRM_TEGRA_SUBMIT IOCTLs.
+ */
__u64 context;
};
+/**
+ * struct drm_tegra_close_channel - parameters for the close channel IOCTL
+ */
struct drm_tegra_close_channel {
+ /**
+ * @context:
+ *
+ * The application context of this channel. This is obtained from the
+ * DRM_TEGRA_OPEN_CHANNEL IOCTL.
+ */
__u64 context;
};
+/**
+ * struct drm_tegra_get_syncpt - parameters for the get syncpoint IOCTL
+ */
struct drm_tegra_get_syncpt {
+ /**
+ * @context:
+ *
+ * The application context identifying the channel for which to obtain
+ * the syncpoint ID.
+ */
__u64 context;
+
+ /**
+ * @index:
+ *
+ * Index of the client syncpoint for which to obtain the ID.
+ */
__u32 index;
+
+ /**
+ * @id:
+ *
+ * The ID of the given syncpoint. Set by the kernel upon successful
+ * completion of the IOCTL.
+ */
__u32 id;
};
+/**
+ * struct drm_tegra_get_syncpt_base - parameters for the get wait base IOCTL
+ */
struct drm_tegra_get_syncpt_base {
+ /**
+ * @context:
+ *
+ * The application context identifying for which channel to obtain the
+ * wait base.
+ */
__u64 context;
+
+ /**
+ * @syncpt:
+ *
+ * ID of the syncpoint for which to obtain the wait base.
+ */
__u32 syncpt;
+
+ /**
+ * @id:
+ *
+ * The ID of the wait base corresponding to the client syncpoint. Set
+ * by the kernel upon successful completion of the IOCTL.
+ */
__u32 id;
};
+/**
+ * struct drm_tegra_syncpt - syncpoint increment operation
+ */
struct drm_tegra_syncpt {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to operate on.
+ */
__u32 id;
+
+ /**
+ * @incrs:
+ *
+ * Number of increments to perform for the syncpoint.
+ */
__u32 incrs;
};
+/**
+ * struct drm_tegra_cmdbuf - structure describing a command buffer
+ */
struct drm_tegra_cmdbuf {
+ /**
+ * @handle:
+ *
+ * Handle to a GEM object containing the command buffer.
+ */
__u32 handle;
+
+ /**
+ * @offset:
+ *
+ * Offset, in bytes, into the GEM object identified by @handle at
+ * which the command buffer starts.
+ */
__u32 offset;
+
+ /**
+ * @words:
+ *
+ * Number of 32-bit words in this command buffer.
+ */
__u32 words;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_reloc - GEM object relocation structure
+ */
struct drm_tegra_reloc {
struct {
+ /**
+ * @cmdbuf.handle:
+ *
+ * Handle to the GEM object containing the command buffer for
+ * which to perform this GEM object relocation.
+ */
__u32 handle;
+
+ /**
+ * @cmdbuf.offset:
+ *
+ * Offset, in bytes, into the command buffer at which to
+ * insert the relocated address.
+ */
__u32 offset;
} cmdbuf;
struct {
+ /**
+ * @target.handle:
+ *
+ * Handle to the GEM object to be relocated.
+ */
__u32 handle;
+
+ /**
+ * @target.offset:
+ *
+ * Offset, in bytes, into the target GEM object at which the
+ * relocated data starts.
+ */
__u32 offset;
} target;
+
+ /**
+ * @shift:
+ *
+ * The number of bits by which to shift relocated addresses.
+ */
__u32 shift;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_waitchk - wait check structure
+ */
struct drm_tegra_waitchk {
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object containing a command stream on which to
+ * perform the wait check.
+ */
__u32 handle;
+
+ /**
+ * @offset:
+ *
+ * Offset, in bytes, of the location in the command stream to perform
+ * the wait check on.
+ */
__u32 offset;
+
+ /**
+ * @syncpt:
+ *
+ * ID of the syncpoint to wait check.
+ */
__u32 syncpt;
+
+ /**
+ * @thresh:
+ *
+ * Threshold value for which to check.
+ */
__u32 thresh;
};
+/**
+ * struct drm_tegra_submit - job submission structure
+ */
struct drm_tegra_submit {
+ /**
+ * @context:
+ *
+ * The application context identifying the channel to use for the
+ * execution of this job.
+ */
__u64 context;
+
+ /**
+ * @num_syncpts:
+ *
+ * The number of syncpoints operated on by this job. This defines the
+ * length of the array pointed to by @syncpts.
+ */
__u32 num_syncpts;
+
+ /**
+ * @num_cmdbufs:
+ *
+ * The number of command buffers to execute as part of this job. This
+ * defines the length of the array pointed to by @cmdbufs.
+ */
__u32 num_cmdbufs;
+
+ /**
+ * @num_relocs:
+ *
+ * The number of relocations to perform before executing this job.
+ * This defines the length of the array pointed to by @relocs.
+ */
__u32 num_relocs;
+
+ /**
+ * @num_waitchks:
+ *
+ * The number of wait checks to perform as part of this job. This
+ * defines the length of the array pointed to by @waitchks.
+ */
__u32 num_waitchks;
+
+ /**
+ * @waitchk_mask:
+ *
+ * Bitmask of valid wait checks.
+ */
__u32 waitchk_mask;
+
+ /**
+ * @timeout:
+ *
+ * Timeout, in milliseconds, before this job is cancelled.
+ */
__u32 timeout;
+
+ /**
+ * @syncpts:
+ *
+ * A pointer to an array of &struct drm_tegra_syncpt structures that
+ * specify the syncpoint operations performed as part of this job.
+ * The number of elements in the array must be equal to the value
+ * given by @num_syncpts.
+ */
__u64 syncpts;
+
+ /**
+ * @cmdbufs:
+ *
+ * A pointer to an array of &struct drm_tegra_cmdbuf structures that
+ * define the command buffers to execute as part of this job. The
+ * number of elements in the array must be equal to the value given
+ * by @num_syncpts.
+ */
__u64 cmdbufs;
+
+ /**
+ * @relocs:
+ *
+ * A pointer to an array of &struct drm_tegra_reloc structures that
+ * specify the relocations that need to be performed before executing
+ * this job. The number of elements in the array must be equal to the
+ * value given by @num_relocs.
+ */
__u64 relocs;
+
+ /**
+ * @waitchks:
+ *
+ * A pointer to an array of &struct drm_tegra_waitchk structures that
+ * specify the wait checks to be performed while executing this job.
+ * The number of elements in the array must be equal to the value
+ * given by @num_waitchks.
+ */
__u64 waitchks;
- __u32 fence; /* Return value */
- __u32 reserved[5]; /* future expansion */
+ /**
+ * @fence:
+ *
+ * The threshold of the syncpoint associated with this job after it
+ * has been completed. Set by the kernel upon successful completion of
+ * the IOCTL. This can be used with the DRM_TEGRA_SYNCPT_WAIT IOCTL to
+ * wait for this job to be finished.
+ */
+ __u32 fence;
+
+ /**
+ * @reserved:
+ *
+ * This field is reserved for future use. Must be 0.
+ */
+ __u32 reserved[5];
};
#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0
#define DRM_TEGRA_GEM_TILING_MODE_TILED 1
#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2
+/**
+ * struct drm_tegra_gem_set_tiling - parameters for the set tiling IOCTL
+ */
struct drm_tegra_gem_set_tiling {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to set the tiling parameters.
+ */
__u32 handle;
+
+ /**
+ * @mode:
+ *
+ * The tiling mode to set. Must be one of:
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_PITCH
+ * pitch linear format
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_TILED
+ * 16x16 tiling format
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_BLOCK
+ * 16Bx2 tiling format
+ */
__u32 mode;
+
+ /**
+ * @value:
+ *
+ * The value to set for the tiling mode parameter.
+ */
__u32 value;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_gem_get_tiling - parameters for the get tiling IOCTL
+ */
struct drm_tegra_gem_get_tiling {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to query the tiling parameters.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @mode:
+ *
+ * The tiling mode currently associated with the GEM object. Set by
+ * the kernel upon successful completion of the IOCTL.
+ */
__u32 mode;
+
+ /**
+ * @value:
+ *
+ * The tiling mode parameter currently associated with the GEM object.
+ * Set by the kernel upon successful completion of the IOCTL.
+ */
__u32 value;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
#define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0)
#define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP)
+/**
+ * struct drm_tegra_gem_set_flags - parameters for the set flags IOCTL
+ */
struct drm_tegra_gem_set_flags {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to set the flags.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @flags:
+ *
+ * The flags to set for the GEM object.
+ */
__u32 flags;
};
+/**
+ * struct drm_tegra_gem_get_flags - parameters for the get flags IOCTL
+ */
struct drm_tegra_gem_get_flags {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to query the flags.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @flags:
+ *
+ * The flags currently associated with the GEM object. Set by the
+ * kernel upon successful completion of the IOCTL.
+ */
__u32 flags;
};
@@ -193,7 +665,7 @@ struct drm_tegra_gem_get_flags {
#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)
-#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)
+#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_close_channel)
#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)
#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)
#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
new file mode 100644
index 000000000000..7b6627783608
--- /dev/null
+++ b/include/drm-uapi/v3d_drm.h
@@ -0,0 +1,194 @@
+/*
+ * Copyright © 2014-2018 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef _V3D_DRM_H_
+#define _V3D_DRM_H_
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#define DRM_V3D_SUBMIT_CL 0x00
+#define DRM_V3D_WAIT_BO 0x01
+#define DRM_V3D_CREATE_BO 0x02
+#define DRM_V3D_MMAP_BO 0x03
+#define DRM_V3D_GET_PARAM 0x04
+#define DRM_V3D_GET_BO_OFFSET 0x05
+
+#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
+#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
+#define DRM_IOCTL_V3D_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo)
+#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo)
+#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
+#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
+
+/**
+ * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
+ * engine.
+ *
+ * This asks the kernel to have the GPU execute an optional binner
+ * command list, and a render command list.
+ */
+struct drm_v3d_submit_cl {
+ /* Pointer to the binner command list.
+ *
+ * This is the first set of commands executed, which runs the
+ * coordinate shader to determine where primitives land on the screen,
+ * then writes out the state updates and draw calls necessary per tile
+ * to the tile allocation BO.
+ */
+ __u32 bcl_start;
+
+ /** End address of the BCL (first byte after the BCL) */
+ __u32 bcl_end;
+
+ /* Offset of the render command list.
+ *
+ * This is the second set of commands executed, which will either
+ * execute the tiles that have been set up by the BCL, or a fixed set
+ * of tiles (in the case of RCL-only blits).
+ */
+ __u32 rcl_start;
+
+ /** End address of the RCL (first byte after the RCL) */
+ __u32 rcl_end;
+
+ /** An optional sync object to wait on before starting the BCL. */
+ __u32 in_sync_bcl;
+ /** An optional sync object to wait on before starting the RCL. */
+ __u32 in_sync_rcl;
+ /** An optional sync object to place the completion fence in. */
+ __u32 out_sync;
+
+ /* Offset of the tile alloc memory
+ *
+ * This is optional on V3D 3.3 (where the CL can set the value) but
+ * required on V3D 4.1.
+ */
+ __u32 qma;
+
+ /** Size of the tile alloc memory. */
+ __u32 qms;
+
+ /** Offset of the tile state data array. */
+ __u32 qts;
+
+ /* Pointer to a u32 array of the BOs that are referenced by the job.
+ */
+ __u64 bo_handles;
+
+ /* Number of BO handles passed in (size is that times 4). */
+ __u32 bo_handle_count;
+
+ /* Pad, must be zero-filled. */
+ __u32 pad;
+};
+
+/**
+ * struct drm_v3d_wait_bo - ioctl argument for waiting for
+ * completion of the last DRM_V3D_SUBMIT_CL on a BO.
+ *
+ * This is useful for cases where multiple processes might be
+ * rendering to a BO and you want to wait for all rendering to be
+ * completed.
+ */
+struct drm_v3d_wait_bo {
+ __u32 handle;
+ __u32 pad;
+ __u64 timeout_ns;
+};
+
+/**
+ * struct drm_v3d_create_bo - ioctl argument for creating V3D BOs.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_v3d_create_bo {
+ __u32 size;
+ __u32 flags;
+ /** Returned GEM handle for the BO. */
+ __u32 handle;
+ /**
+ * Returned offset for the BO in the V3D address space. This offset
+ * is private to the DRM fd and is valid for the lifetime of the GEM
+ * handle.
+ *
+ * This offset value will always be nonzero, since various HW
+ * units treat 0 specially.
+ */
+ __u32 offset;
+};
+
+/**
+ * struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs.
+ *
+ * This doesn't actually perform an mmap. Instead, it returns the
+ * offset you need to use in an mmap on the DRM device node. This
+ * means that tools like valgrind end up knowing about the mapped
+ * memory.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_v3d_mmap_bo {
+ /** Handle for the object being mapped. */
+ __u32 handle;
+ __u32 flags;
+ /** offset into the drm node to use for subsequent mmap call. */
+ __u64 offset;
+};
+
+enum drm_v3d_param {
+ DRM_V3D_PARAM_V3D_UIFCFG,
+ DRM_V3D_PARAM_V3D_HUB_IDENT1,
+ DRM_V3D_PARAM_V3D_HUB_IDENT2,
+ DRM_V3D_PARAM_V3D_HUB_IDENT3,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT0,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT1,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT2,
+};
+
+struct drm_v3d_get_param {
+ __u32 param;
+ __u32 pad;
+ __u64 value;
+};
+
+/**
+ * Returns the offset for the BO in the V3D address space for this DRM fd.
+ * This is the same value returned by drm_v3d_create_bo, if that was called
+ * from this DRM fd.
+ */
+struct drm_v3d_get_bo_offset {
+ __u32 handle;
+ __u32 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _V3D_DRM_H_ */
diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h
index 4117117b4204..31f50de39acb 100644
--- a/include/drm-uapi/vc4_drm.h
+++ b/include/drm-uapi/vc4_drm.h
@@ -183,10 +183,17 @@ struct drm_vc4_submit_cl {
/* ID of the perfmon to attach to this job. 0 means no perfmon. */
__u32 perfmonid;
- /* Unused field to align this struct on 64 bits. Must be set to 0.
- * If one ever needs to add an u32 field to this struct, this field
- * can be used.
+ /* Syncobj handle to wait on. If set, processing of this render job
+ * will not start until the syncobj is signaled. 0 means ignore.
*/
+ __u32 in_sync;
+
+ /* Syncobj handle to export fence to. If set, the fence in the syncobj
+ * will be replaced with a fence that signals upon completion of this
+ * render job. 0 means ignore.
+ */
+ __u32 out_sync;
+
__u32 pad2;
};
diff --git a/include/drm-uapi/virtgpu_drm.h b/include/drm-uapi/virtgpu_drm.h
index 91a31ffed828..9a781f0611df 100644
--- a/include/drm-uapi/virtgpu_drm.h
+++ b/include/drm-uapi/virtgpu_drm.h
@@ -63,6 +63,7 @@ struct drm_virtgpu_execbuffer {
};
#define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */
+#define VIRTGPU_PARAM_CAPSET_QUERY_FIX 2 /* do we have the capset fix */
struct drm_virtgpu_getparam {
__u64 param;
diff --git a/include/drm-uapi/vmwgfx_drm.h b/include/drm-uapi/vmwgfx_drm.h
index 0bc784f5e0db..399f58317cff 100644
--- a/include/drm-uapi/vmwgfx_drm.h
+++ b/include/drm-uapi/vmwgfx_drm.h
@@ -40,6 +40,7 @@ extern "C" {
#define DRM_VMW_GET_PARAM 0
#define DRM_VMW_ALLOC_DMABUF 1
+#define DRM_VMW_ALLOC_BO 1
#define DRM_VMW_UNREF_DMABUF 2
#define DRM_VMW_HANDLE_CLOSE 2
#define DRM_VMW_CURSOR_BYPASS 3
@@ -68,6 +69,8 @@ extern "C" {
#define DRM_VMW_GB_SURFACE_REF 24
#define DRM_VMW_SYNCCPU 25
#define DRM_VMW_CREATE_EXTENDED_CONTEXT 26
+#define DRM_VMW_GB_SURFACE_CREATE_EXT 27
+#define DRM_VMW_GB_SURFACE_REF_EXT 28
/*************************************************************************/
/**
@@ -79,6 +82,9 @@ extern "C" {
*
* DRM_VMW_PARAM_OVERLAY_IOCTL:
* Does the driver support the overlay ioctl.
+ *
+ * DRM_VMW_PARAM_SM4_1
+ * SM4_1 support is enabled.
*/
#define DRM_VMW_PARAM_NUM_STREAMS 0
@@ -94,6 +100,8 @@ extern "C" {
#define DRM_VMW_PARAM_MAX_MOB_SIZE 10
#define DRM_VMW_PARAM_SCREEN_TARGET 11
#define DRM_VMW_PARAM_DX 12
+#define DRM_VMW_PARAM_HW_CAPS2 13
+#define DRM_VMW_PARAM_SM4_1 14
/**
* enum drm_vmw_handle_type - handle type for ref ioctls
@@ -356,9 +364,9 @@ struct drm_vmw_fence_rep {
/*************************************************************************/
/**
- * DRM_VMW_ALLOC_DMABUF
+ * DRM_VMW_ALLOC_BO
*
- * Allocate a DMA buffer that is visible also to the host.
+ * Allocate a buffer object that is visible also to the host.
* NOTE: The buffer is
* identified by a handle and an offset, which are private to the guest, but
* useable in the command stream. The guest kernel may translate these
@@ -366,27 +374,28 @@ struct drm_vmw_fence_rep {
* be zero at all times, or it may disappear from the interface before it is
* fixed.
*
- * The DMA buffer may stay user-space mapped in the guest at all times,
+ * The buffer object may stay user-space mapped in the guest at all times,
* and is thus suitable for sub-allocation.
*
- * DMA buffers are mapped using the mmap() syscall on the drm device.
+ * Buffer objects are mapped using the mmap() syscall on the drm device.
*/
/**
- * struct drm_vmw_alloc_dmabuf_req
+ * struct drm_vmw_alloc_bo_req
*
* @size: Required minimum size of the buffer.
*
- * Input data to the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Input data to the DRM_VMW_ALLOC_BO Ioctl.
*/
-struct drm_vmw_alloc_dmabuf_req {
+struct drm_vmw_alloc_bo_req {
__u32 size;
__u32 pad64;
};
+#define drm_vmw_alloc_dmabuf_req drm_vmw_alloc_bo_req
/**
- * struct drm_vmw_dmabuf_rep
+ * struct drm_vmw_bo_rep
*
* @map_handle: Offset to use in the mmap() call used to map the buffer.
* @handle: Handle unique to this buffer. Used for unreferencing.
@@ -395,50 +404,32 @@ struct drm_vmw_alloc_dmabuf_req {
* @cur_gmr_offset: Offset to use in the command stream when this buffer is
* referenced. See note above.
*
- * Output data from the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Output data from the DRM_VMW_ALLOC_BO Ioctl.
*/
-struct drm_vmw_dmabuf_rep {
+struct drm_vmw_bo_rep {
__u64 map_handle;
__u32 handle;
__u32 cur_gmr_id;
__u32 cur_gmr_offset;
__u32 pad64;
};
+#define drm_vmw_dmabuf_rep drm_vmw_bo_rep
/**
- * union drm_vmw_dmabuf_arg
+ * union drm_vmw_alloc_bo_arg
*
* @req: Input data as described above.
* @rep: Output data as described above.
*
- * Argument to the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Argument to the DRM_VMW_ALLOC_BO Ioctl.
*/
-union drm_vmw_alloc_dmabuf_arg {
- struct drm_vmw_alloc_dmabuf_req req;
- struct drm_vmw_dmabuf_rep rep;
-};
-
-/*************************************************************************/
-/**
- * DRM_VMW_UNREF_DMABUF - Free a DMA buffer.
- *
- */
-
-/**
- * struct drm_vmw_unref_dmabuf_arg
- *
- * @handle: Handle indicating what buffer to free. Obtained from the
- * DRM_VMW_ALLOC_DMABUF Ioctl.
- *
- * Argument to the DRM_VMW_UNREF_DMABUF Ioctl.
- */
-
-struct drm_vmw_unref_dmabuf_arg {
- __u32 handle;
- __u32 pad64;
+union drm_vmw_alloc_bo_arg {
+ struct drm_vmw_alloc_bo_req req;
+ struct drm_vmw_bo_rep rep;
};
+#define drm_vmw_alloc_dmabuf_arg drm_vmw_alloc_bo_arg
/*************************************************************************/
/**
@@ -1103,9 +1094,8 @@ union drm_vmw_extended_context_arg {
* DRM_VMW_HANDLE_CLOSE - Close a user-space handle and release its
* underlying resource.
*
- * Note that this ioctl is overlaid on the DRM_VMW_UNREF_DMABUF Ioctl.
- * The ioctl arguments therefore need to be identical in layout.
- *
+ * Note that this ioctl is overlaid on the deprecated DRM_VMW_UNREF_DMABUF
+ * Ioctl.
*/
/**
@@ -1119,7 +1109,107 @@ struct drm_vmw_handle_close_arg {
__u32 handle;
__u32 pad64;
};
+#define drm_vmw_unref_dmabuf_arg drm_vmw_handle_close_arg
+
+/*************************************************************************/
+/**
+ * DRM_VMW_GB_SURFACE_CREATE_EXT - Create a host guest-backed surface.
+ *
+ * Allocates a surface handle and queues a create surface command
+ * for the host on the first use of the surface. The surface ID can
+ * be used as the surface ID in commands referencing the surface.
+ *
+ * This new command extends DRM_VMW_GB_SURFACE_CREATE by adding version
+ * parameter and 64 bit svga flag.
+ */
+
+/**
+ * enum drm_vmw_surface_version
+ *
+ * @drm_vmw_surface_gb_v1: Corresponds to current gb surface format with
+ * svga3d surface flags split into 2, upper half and lower half.
+ */
+enum drm_vmw_surface_version {
+ drm_vmw_gb_surface_v1
+};
+
+/**
+ * struct drm_vmw_gb_surface_create_ext_req
+ *
+ * @base: Surface create parameters.
+ * @version: Version of surface create ioctl.
+ * @svga3d_flags_upper_32_bits: Upper 32 bits of svga3d flags.
+ * @multisample_pattern: Multisampling pattern when msaa is supported.
+ * @quality_level: Precision settings for each sample.
+ * @must_be_zero: Reserved for future usage.
+ *
+ * Input argument to the DRM_VMW_GB_SURFACE_CREATE_EXT Ioctl.
+ * Part of output argument for the DRM_VMW_GB_SURFACE_REF_EXT Ioctl.
+ */
+struct drm_vmw_gb_surface_create_ext_req {
+ struct drm_vmw_gb_surface_create_req base;
+ enum drm_vmw_surface_version version;
+ uint32_t svga3d_flags_upper_32_bits;
+ SVGA3dMSPattern multisample_pattern;
+ SVGA3dMSQualityLevel quality_level;
+ uint64_t must_be_zero;
+};
+
+/**
+ * union drm_vmw_gb_surface_create_ext_arg
+ *
+ * @req: Input argument as described above.
+ * @rep: Output argument as described above.
+ *
+ * Argument to the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ */
+union drm_vmw_gb_surface_create_ext_arg {
+ struct drm_vmw_gb_surface_create_rep rep;
+ struct drm_vmw_gb_surface_create_ext_req req;
+};
+
+/*************************************************************************/
+/**
+ * DRM_VMW_GB_SURFACE_REF_EXT - Reference a host surface.
+ *
+ * Puts a reference on a host surface with a given handle, as previously
+ * returned by the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ * A reference will make sure the surface isn't destroyed while we hold
+ * it and will allow the calling client to use the surface handle in
+ * the command stream.
+ *
+ * On successful return, the Ioctl returns the surface information given
+ * to and returned from the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ */
+/**
+ * struct drm_vmw_gb_surface_ref_ext_rep
+ *
+ * @creq: The data used as input when the surface was created, as described
+ * above at "struct drm_vmw_gb_surface_create_ext_req"
+ * @crep: Additional data output when the surface was created, as described
+ * above at "struct drm_vmw_gb_surface_create_rep"
+ *
+ * Output Argument to the DRM_VMW_GB_SURFACE_REF_EXT ioctl.
+ */
+struct drm_vmw_gb_surface_ref_ext_rep {
+ struct drm_vmw_gb_surface_create_ext_req creq;
+ struct drm_vmw_gb_surface_create_rep crep;
+};
+
+/**
+ * union drm_vmw_gb_surface_reference_ext_arg
+ *
+ * @req: Input data as described above at "struct drm_vmw_surface_arg"
+ * @rep: Output data as described above at
+ * "struct drm_vmw_gb_surface_ref_ext_rep"
+ *
+ * Argument to the DRM_VMW_GB_SURFACE_REF Ioctl.
+ */
+union drm_vmw_gb_surface_reference_ext_arg {
+ struct drm_vmw_gb_surface_ref_ext_rep rep;
+ struct drm_vmw_surface_arg req;
+};
#if defined(__cplusplus)
}
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 1/6] include: DRM uAPI headers update
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
include/drm-uapi/amdgpu_drm.h | 52 +++-
include/drm-uapi/drm.h | 16 ++
include/drm-uapi/drm_fourcc.h | 215 ++++++++++++++
include/drm-uapi/drm_mode.h | 26 +-
include/drm-uapi/etnaviv_drm.h | 6 +
include/drm-uapi/exynos_drm.h | 240 ++++++++++++++++
include/drm-uapi/i915_drm.h | 49 +++-
include/drm-uapi/msm_drm.h | 2 +
include/drm-uapi/tegra_drm.h | 492 ++++++++++++++++++++++++++++++++-
include/drm-uapi/v3d_drm.h | 194 +++++++++++++
include/drm-uapi/vc4_drm.h | 13 +-
include/drm-uapi/virtgpu_drm.h | 1 +
include/drm-uapi/vmwgfx_drm.h | 166 ++++++++---
13 files changed, 1415 insertions(+), 57 deletions(-)
create mode 100644 include/drm-uapi/v3d_drm.h
diff --git a/include/drm-uapi/amdgpu_drm.h b/include/drm-uapi/amdgpu_drm.h
index 1816bd8200d1..370e9a5536ef 100644
--- a/include/drm-uapi/amdgpu_drm.h
+++ b/include/drm-uapi/amdgpu_drm.h
@@ -72,12 +72,41 @@ extern "C" {
#define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
#define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+/**
+ * DOC: memory domains
+ *
+ * %AMDGPU_GEM_DOMAIN_CPU System memory that is not GPU accessible.
+ * Memory in this pool could be swapped out to disk if there is pressure.
+ *
+ * %AMDGPU_GEM_DOMAIN_GTT GPU accessible system memory, mapped into the
+ * GPU's virtual address space via gart. Gart memory linearizes non-contiguous
+ * pages of system memory, allows GPU access system memory in a linezrized
+ * fashion.
+ *
+ * %AMDGPU_GEM_DOMAIN_VRAM Local video memory. For APUs, it is memory
+ * carved out by the BIOS.
+ *
+ * %AMDGPU_GEM_DOMAIN_GDS Global on-chip data storage used to share data
+ * across shader threads.
+ *
+ * %AMDGPU_GEM_DOMAIN_GWS Global wave sync, used to synchronize the
+ * execution of all the waves on a device.
+ *
+ * %AMDGPU_GEM_DOMAIN_OA Ordered append, used by 3D or Compute engines
+ * for appending data.
+ */
#define AMDGPU_GEM_DOMAIN_CPU 0x1
#define AMDGPU_GEM_DOMAIN_GTT 0x2
#define AMDGPU_GEM_DOMAIN_VRAM 0x4
#define AMDGPU_GEM_DOMAIN_GDS 0x8
#define AMDGPU_GEM_DOMAIN_GWS 0x10
#define AMDGPU_GEM_DOMAIN_OA 0x20
+#define AMDGPU_GEM_DOMAIN_MASK (AMDGPU_GEM_DOMAIN_CPU | \
+ AMDGPU_GEM_DOMAIN_GTT | \
+ AMDGPU_GEM_DOMAIN_VRAM | \
+ AMDGPU_GEM_DOMAIN_GDS | \
+ AMDGPU_GEM_DOMAIN_GWS | \
+ AMDGPU_GEM_DOMAIN_OA)
/* Flag that CPU access will be required for the case of VRAM domain */
#define AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED (1 << 0)
@@ -95,6 +124,10 @@ extern "C" {
#define AMDGPU_GEM_CREATE_VM_ALWAYS_VALID (1 << 6)
/* Flag that BO sharing will be explicitly synchronized */
#define AMDGPU_GEM_CREATE_EXPLICIT_SYNC (1 << 7)
+/* Flag that indicates allocating MQD gart on GFX9, where the mtype
+ * for the second page onward should be set to NC.
+ */
+#define AMDGPU_GEM_CREATE_MQD_GFX9 (1 << 8)
struct drm_amdgpu_gem_create_in {
/** the requested memory size */
@@ -473,7 +506,8 @@ struct drm_amdgpu_gem_va {
#define AMDGPU_HW_IP_UVD_ENC 5
#define AMDGPU_HW_IP_VCN_DEC 6
#define AMDGPU_HW_IP_VCN_ENC 7
-#define AMDGPU_HW_IP_NUM 8
+#define AMDGPU_HW_IP_VCN_JPEG 8
+#define AMDGPU_HW_IP_NUM 9
#define AMDGPU_HW_IP_INSTANCE_MAX_COUNT 1
@@ -482,6 +516,7 @@ struct drm_amdgpu_gem_va {
#define AMDGPU_CHUNK_ID_DEPENDENCIES 0x03
#define AMDGPU_CHUNK_ID_SYNCOBJ_IN 0x04
#define AMDGPU_CHUNK_ID_SYNCOBJ_OUT 0x05
+#define AMDGPU_CHUNK_ID_BO_HANDLES 0x06
struct drm_amdgpu_cs_chunk {
__u32 chunk_id;
@@ -520,6 +555,10 @@ union drm_amdgpu_cs {
/* Preempt flag, IB should set Pre_enb bit if PREEMPT flag detected */
#define AMDGPU_IB_FLAG_PREEMPT (1<<2)
+/* The IB fence should do the L2 writeback but not invalidate any shader
+ * caches (L2/vL1/sL1/I$). */
+#define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
+
struct drm_amdgpu_cs_chunk_ib {
__u32 _pad;
/** AMDGPU_IB_FLAG_* */
@@ -618,6 +657,16 @@ struct drm_amdgpu_cs_chunk_data {
#define AMDGPU_INFO_FW_SOS 0x0c
/* Subquery id: Query PSP ASD firmware version */
#define AMDGPU_INFO_FW_ASD 0x0d
+ /* Subquery id: Query VCN firmware version */
+ #define AMDGPU_INFO_FW_VCN 0x0e
+ /* Subquery id: Query GFX RLC SRLC firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_CNTL 0x0f
+ /* Subquery id: Query GFX RLC SRLG firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_GPM_MEM 0x10
+ /* Subquery id: Query GFX RLC SRLS firmware version */
+ #define AMDGPU_INFO_FW_GFX_RLC_RESTORE_LIST_SRM_MEM 0x11
+ /* Subquery id: Query DMCU firmware version */
+ #define AMDGPU_INFO_FW_DMCU 0x12
/* number of bytes moved for TTM migration */
#define AMDGPU_INFO_NUM_BYTES_MOVED 0x0f
/* the used VRAM size */
@@ -806,6 +855,7 @@ struct drm_amdgpu_info_firmware {
#define AMDGPU_VRAM_TYPE_GDDR5 5
#define AMDGPU_VRAM_TYPE_HBM 6
#define AMDGPU_VRAM_TYPE_DDR3 7
+#define AMDGPU_VRAM_TYPE_DDR4 8
struct drm_amdgpu_info_device {
/** PCI Device ID */
diff --git a/include/drm-uapi/drm.h b/include/drm-uapi/drm.h
index f0bd91de0cf9..85c685a2075e 100644
--- a/include/drm-uapi/drm.h
+++ b/include/drm-uapi/drm.h
@@ -674,6 +674,22 @@ struct drm_get_cap {
*/
#define DRM_CLIENT_CAP_ATOMIC 3
+/**
+ * DRM_CLIENT_CAP_ASPECT_RATIO
+ *
+ * If set to 1, the DRM core will provide aspect ratio information in modes.
+ */
+#define DRM_CLIENT_CAP_ASPECT_RATIO 4
+
+/**
+ * DRM_CLIENT_CAP_WRITEBACK_CONNECTORS
+ *
+ * If set to 1, the DRM core will expose special connectors to be used for
+ * writing back to memory the scene setup in the commit. Depends on client
+ * also supporting DRM_CLIENT_CAP_ATOMIC
+ */
+#define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS 5
+
/** DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */
struct drm_set_client_cap {
__u64 capability;
diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h
index e04613d30a13..139632b87181 100644
--- a/include/drm-uapi/drm_fourcc.h
+++ b/include/drm-uapi/drm_fourcc.h
@@ -30,11 +30,50 @@
extern "C" {
#endif
+/**
+ * DOC: overview
+ *
+ * In the DRM subsystem, framebuffer pixel formats are described using the
+ * fourcc codes defined in `include/uapi/drm/drm_fourcc.h`. In addition to the
+ * fourcc code, a Format Modifier may optionally be provided, in order to
+ * further describe the buffer's format - for example tiling or compression.
+ *
+ * Format Modifiers
+ * ----------------
+ *
+ * Format modifiers are used in conjunction with a fourcc code, forming a
+ * unique fourcc:modifier pair. This format:modifier pair must fully define the
+ * format and data layout of the buffer, and should be the only way to describe
+ * that particular buffer.
+ *
+ * Having multiple fourcc:modifier pairs which describe the same layout should
+ * be avoided, as such aliases run the risk of different drivers exposing
+ * different names for the same data format, forcing userspace to understand
+ * that they are aliases.
+ *
+ * Format modifiers may change any property of the buffer, including the number
+ * of planes and/or the required allocation size. Format modifiers are
+ * vendor-namespaced, and as such the relationship between a fourcc code and a
+ * modifier is specific to the modifer being used. For example, some modifiers
+ * may preserve meaning - such as number of planes - from the fourcc code,
+ * whereas others may not.
+ *
+ * Vendors should document their modifier usage in as much detail as
+ * possible, to ensure maximum compatibility across devices, drivers and
+ * applications.
+ *
+ * The authoritative list of format modifier codes is found in
+ * `include/uapi/drm/drm_fourcc.h`
+ */
+
#define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \
((__u32)(c) << 16) | ((__u32)(d) << 24))
#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of little endian */
+/* Reserve 0 for the invalid format specifier */
+#define DRM_FORMAT_INVALID 0
+
/* color index */
#define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
@@ -183,6 +222,7 @@ extern "C" {
#define DRM_FORMAT_MOD_VENDOR_QCOM 0x05
#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
#define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07
+#define DRM_FORMAT_MOD_VENDOR_ARM 0x08
/* add more to the end as needed */
#define DRM_FORMAT_RESERVED ((1ULL << 56) - 1)
@@ -298,6 +338,19 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_SAMSUNG_64_32_TILE fourcc_mod_code(SAMSUNG, 1)
+/*
+ * Qualcomm Compressed Format
+ *
+ * Refers to a compressed variant of the base format that is compressed.
+ * Implementation may be platform and base-format specific.
+ *
+ * Each macrotile consists of m x n (mostly 4 x 4) tiles.
+ * Pixel data pitch/stride is aligned with macrotile width.
+ * Pixel data height is aligned with macrotile height.
+ * Entire pixel data buffer is aligned with 4k(bytes).
+ */
+#define DRM_FORMAT_MOD_QCOM_COMPRESSED fourcc_mod_code(QCOM, 1)
+
/* Vivante framebuffer modifiers */
/*
@@ -384,6 +437,23 @@ extern "C" {
#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \
fourcc_mod_code(NVIDIA, 0x15)
+/*
+ * Some Broadcom modifiers take parameters, for example the number of
+ * vertical lines in the image. Reserve the lower 32 bits for modifier
+ * type, and the next 24 bits for parameters. Top 8 bits are the
+ * vendor code.
+ */
+#define __fourcc_mod_broadcom_param_shift 8
+#define __fourcc_mod_broadcom_param_bits 48
+#define fourcc_mod_broadcom_code(val, params) \
+ fourcc_mod_code(BROADCOM, ((((__u64)params) << __fourcc_mod_broadcom_param_shift) | val))
+#define fourcc_mod_broadcom_param(m) \
+ ((int)(((m) >> __fourcc_mod_broadcom_param_shift) & \
+ ((1ULL << __fourcc_mod_broadcom_param_bits) - 1)))
+#define fourcc_mod_broadcom_mod(m) \
+ ((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) << \
+ __fourcc_mod_broadcom_param_shift))
+
/*
* Broadcom VC4 "T" format
*
@@ -405,6 +475,151 @@ extern "C" {
*/
#define DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED fourcc_mod_code(BROADCOM, 1)
+/*
+ * Broadcom SAND format
+ *
+ * This is the native format that the H.264 codec block uses. For VC4
+ * HVS, it is only valid for H.264 (NV12/21) and RGBA modes.
+ *
+ * The image can be considered to be split into columns, and the
+ * columns are placed consecutively into memory. The width of those
+ * columns can be either 32, 64, 128, or 256 pixels, but in practice
+ * only 128 pixel columns are used.
+ *
+ * The pitch between the start of each column is set to optimally
+ * switch between SDRAM banks. This is passed as the number of lines
+ * of column width in the modifier (we can't use the stride value due
+ * to various core checks that look at it , so you should set the
+ * stride to width*cpp).
+ *
+ * Note that the column height for this format modifier is the same
+ * for all of the planes, assuming that each column contains both Y
+ * and UV. Some SAND-using hardware stores UV in a separate tiled
+ * image from Y to reduce the column height, which is not supported
+ * with these modifiers.
+ */
+
+#define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(2, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(3, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(4, v)
+#define DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(v) \
+ fourcc_mod_broadcom_code(5, v)
+
+#define DRM_FORMAT_MOD_BROADCOM_SAND32 \
+ DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND64 \
+ DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND128 \
+ DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0)
+#define DRM_FORMAT_MOD_BROADCOM_SAND256 \
+ DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(0)
+
+/* Broadcom UIF format
+ *
+ * This is the common format for the current Broadcom multimedia
+ * blocks, including V3D 3.x and newer, newer video codecs, and
+ * displays.
+ *
+ * The image consists of utiles (64b blocks), UIF blocks (2x2 utiles),
+ * and macroblocks (4x4 UIF blocks). Those 4x4 UIF block groups are
+ * stored in columns, with padding between the columns to ensure that
+ * moving from one column to the next doesn't hit the same SDRAM page
+ * bank.
+ *
+ * To calculate the padding, it is assumed that each hardware block
+ * and the software driving it knows the platform's SDRAM page size,
+ * number of banks, and XOR address, and that it's identical between
+ * all blocks using the format. This tiling modifier will use XOR as
+ * necessary to reduce the padding. If a hardware block can't do XOR,
+ * the assumption is that a no-XOR tiling modifier will be created.
+ */
+#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6)
+
+/*
+ * Arm Framebuffer Compression (AFBC) modifiers
+ *
+ * AFBC is a proprietary lossless image compression protocol and format.
+ * It provides fine-grained random access and minimizes the amount of data
+ * transferred between IP blocks.
+ *
+ * AFBC has several features which may be supported and/or used, which are
+ * represented using bits in the modifier. Not all combinations are valid,
+ * and different devices or use-cases may support different combinations.
+ */
+#define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode) fourcc_mod_code(ARM, __afbc_mode)
+
+/*
+ * AFBC superblock size
+ *
+ * Indicates the superblock size(s) used for the AFBC buffer. The buffer
+ * size (in pixels) must be aligned to a multiple of the superblock size.
+ * Four lowest significant bits(LSBs) are reserved for block size.
+ */
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK 0xf
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 (1ULL)
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 (2ULL)
+
+/*
+ * AFBC lossless colorspace transform
+ *
+ * Indicates that the buffer makes use of the AFBC lossless colorspace
+ * transform.
+ */
+#define AFBC_FORMAT_MOD_YTR (1ULL << 4)
+
+/*
+ * AFBC block-split
+ *
+ * Indicates that the payload of each superblock is split. The second
+ * half of the payload is positioned at a predefined offset from the start
+ * of the superblock payload.
+ */
+#define AFBC_FORMAT_MOD_SPLIT (1ULL << 5)
+
+/*
+ * AFBC sparse layout
+ *
+ * This flag indicates that the payload of each superblock must be stored at a
+ * predefined position relative to the other superblocks in the same AFBC
+ * buffer. This order is the same order used by the header buffer. In this mode
+ * each superblock is given the same amount of space as an uncompressed
+ * superblock of the particular format would require, rounding up to the next
+ * multiple of 128 bytes in size.
+ */
+#define AFBC_FORMAT_MOD_SPARSE (1ULL << 6)
+
+/*
+ * AFBC copy-block restrict
+ *
+ * Buffers with this flag must obey the copy-block restriction. The restriction
+ * is such that there are no copy-blocks referring across the border of 8x8
+ * blocks. For the subsampled data the 8x8 limitation is also subsampled.
+ */
+#define AFBC_FORMAT_MOD_CBR (1ULL << 7)
+
+/*
+ * AFBC tiled layout
+ *
+ * The tiled layout groups superblocks in 8x8 or 4x4 tiles, where all
+ * superblocks inside a tile are stored together in memory. 8x8 tiles are used
+ * for pixel formats up to and including 32 bpp while 4x4 tiles are used for
+ * larger bpp formats. The order between the tiles is scan line.
+ * When the tiled layout is used, the buffer size (in pixels) must be aligned
+ * to the tile size.
+ */
+#define AFBC_FORMAT_MOD_TILED (1ULL << 8)
+
+/*
+ * AFBC solid color blocks
+ *
+ * Indicates that the buffer makes use of solid-color blocks, whereby bandwidth
+ * can be reduced if a whole superblock is a single color.
+ */
+#define AFBC_FORMAT_MOD_SC (1ULL << 9)
+
#if defined(__cplusplus)
}
#endif
diff --git a/include/drm-uapi/drm_mode.h b/include/drm-uapi/drm_mode.h
index 2c575794fb52..d3e0fe31efc5 100644
--- a/include/drm-uapi/drm_mode.h
+++ b/include/drm-uapi/drm_mode.h
@@ -93,6 +93,15 @@ extern "C" {
#define DRM_MODE_PICTURE_ASPECT_NONE 0
#define DRM_MODE_PICTURE_ASPECT_4_3 1
#define DRM_MODE_PICTURE_ASPECT_16_9 2
+#define DRM_MODE_PICTURE_ASPECT_64_27 3
+#define DRM_MODE_PICTURE_ASPECT_256_135 4
+
+/* Content type options */
+#define DRM_MODE_CONTENT_TYPE_NO_DATA 0
+#define DRM_MODE_CONTENT_TYPE_GRAPHICS 1
+#define DRM_MODE_CONTENT_TYPE_PHOTO 2
+#define DRM_MODE_CONTENT_TYPE_CINEMA 3
+#define DRM_MODE_CONTENT_TYPE_GAME 4
/* Aspect ratio flag bitmask (4 bits 22:19) */
#define DRM_MODE_FLAG_PIC_AR_MASK (0x0F<<19)
@@ -102,6 +111,10 @@ extern "C" {
(DRM_MODE_PICTURE_ASPECT_4_3<<19)
#define DRM_MODE_FLAG_PIC_AR_16_9 \
(DRM_MODE_PICTURE_ASPECT_16_9<<19)
+#define DRM_MODE_FLAG_PIC_AR_64_27 \
+ (DRM_MODE_PICTURE_ASPECT_64_27<<19)
+#define DRM_MODE_FLAG_PIC_AR_256_135 \
+ (DRM_MODE_PICTURE_ASPECT_256_135<<19)
#define DRM_MODE_FLAG_ALL (DRM_MODE_FLAG_PHSYNC | \
DRM_MODE_FLAG_NHSYNC | \
@@ -173,8 +186,9 @@ extern "C" {
/*
* DRM_MODE_REFLECT_<axis>
*
- * Signals that the contents of a drm plane is reflected in the <axis> axis,
+ * Signals that the contents of a drm plane is reflected along the <axis> axis,
* in the same way as mirroring.
+ * See kerneldoc chapter "Plane Composition Properties" for more details.
*
* This define is provided as a convenience, looking up the property id
* using the name->prop id lookup is the preferred method.
@@ -338,6 +352,7 @@ enum drm_mode_subconnector {
#define DRM_MODE_CONNECTOR_VIRTUAL 15
#define DRM_MODE_CONNECTOR_DSI 16
#define DRM_MODE_CONNECTOR_DPI 17
+#define DRM_MODE_CONNECTOR_WRITEBACK 18
struct drm_mode_get_connector {
@@ -363,7 +378,7 @@ struct drm_mode_get_connector {
__u32 pad;
};
-#define DRM_MODE_PROP_PENDING (1<<0)
+#define DRM_MODE_PROP_PENDING (1<<0) /* deprecated, do not use */
#define DRM_MODE_PROP_RANGE (1<<1)
#define DRM_MODE_PROP_IMMUTABLE (1<<2)
#define DRM_MODE_PROP_ENUM (1<<3) /* enumerated type with text strings */
@@ -598,8 +613,11 @@ struct drm_mode_crtc_lut {
};
struct drm_color_ctm {
- /* Conversion matrix in S31.32 format. */
- __s64 matrix[9];
+ /*
+ * Conversion matrix in S31.32 sign-magnitude
+ * (not two's complement!) format.
+ */
+ __u64 matrix[9];
};
struct drm_color_lut {
diff --git a/include/drm-uapi/etnaviv_drm.h b/include/drm-uapi/etnaviv_drm.h
index e9b997a0ef27..0d5c49dc478c 100644
--- a/include/drm-uapi/etnaviv_drm.h
+++ b/include/drm-uapi/etnaviv_drm.h
@@ -55,6 +55,12 @@ struct drm_etnaviv_timespec {
#define ETNAVIV_PARAM_GPU_FEATURES_4 0x07
#define ETNAVIV_PARAM_GPU_FEATURES_5 0x08
#define ETNAVIV_PARAM_GPU_FEATURES_6 0x09
+#define ETNAVIV_PARAM_GPU_FEATURES_7 0x0a
+#define ETNAVIV_PARAM_GPU_FEATURES_8 0x0b
+#define ETNAVIV_PARAM_GPU_FEATURES_9 0x0c
+#define ETNAVIV_PARAM_GPU_FEATURES_10 0x0d
+#define ETNAVIV_PARAM_GPU_FEATURES_11 0x0e
+#define ETNAVIV_PARAM_GPU_FEATURES_12 0x0f
#define ETNAVIV_PARAM_GPU_STREAM_COUNT 0x10
#define ETNAVIV_PARAM_GPU_REGISTER_MAX 0x11
diff --git a/include/drm-uapi/exynos_drm.h b/include/drm-uapi/exynos_drm.h
index a00116b5cc5c..7414cfd76419 100644
--- a/include/drm-uapi/exynos_drm.h
+++ b/include/drm-uapi/exynos_drm.h
@@ -135,6 +135,219 @@ struct drm_exynos_g2d_exec {
__u64 async;
};
+/* Exynos DRM IPP v2 API */
+
+/**
+ * Enumerate available IPP hardware modules.
+ *
+ * @count_ipps: size of ipp_id array / number of ipp modules (set by driver)
+ * @reserved: padding
+ * @ipp_id_ptr: pointer to ipp_id array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_res {
+ __u32 count_ipps;
+ __u32 reserved;
+ __u64 ipp_id_ptr;
+};
+
+enum drm_exynos_ipp_format_type {
+ DRM_EXYNOS_IPP_FORMAT_SOURCE = 0x01,
+ DRM_EXYNOS_IPP_FORMAT_DESTINATION = 0x02,
+};
+
+struct drm_exynos_ipp_format {
+ __u32 fourcc;
+ __u32 type;
+ __u64 modifier;
+};
+
+enum drm_exynos_ipp_capability {
+ DRM_EXYNOS_IPP_CAP_CROP = 0x01,
+ DRM_EXYNOS_IPP_CAP_ROTATE = 0x02,
+ DRM_EXYNOS_IPP_CAP_SCALE = 0x04,
+ DRM_EXYNOS_IPP_CAP_CONVERT = 0x08,
+};
+
+/**
+ * Get IPP hardware capabilities and supported image formats.
+ *
+ * @ipp_id: id of IPP module to query
+ * @capabilities: bitmask of drm_exynos_ipp_capability (set by driver)
+ * @reserved: padding
+ * @formats_count: size of formats array (in entries) / number of filled
+ * formats (set by driver)
+ * @formats_ptr: pointer to formats array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_caps {
+ __u32 ipp_id;
+ __u32 capabilities;
+ __u32 reserved;
+ __u32 formats_count;
+ __u64 formats_ptr;
+};
+
+enum drm_exynos_ipp_limit_type {
+ /* size (horizontal/vertial) limits, in pixels (min, max, alignment) */
+ DRM_EXYNOS_IPP_LIMIT_TYPE_SIZE = 0x0001,
+ /* scale ratio (horizonta/vertial), 16.16 fixed point (min, max) */
+ DRM_EXYNOS_IPP_LIMIT_TYPE_SCALE = 0x0002,
+
+ /* image buffer area */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_BUFFER = 0x0001 << 16,
+ /* src/dst rectangle area */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_AREA = 0x0002 << 16,
+ /* src/dst rectangle area when rotation enabled */
+ DRM_EXYNOS_IPP_LIMIT_SIZE_ROTATED = 0x0003 << 16,
+
+ DRM_EXYNOS_IPP_LIMIT_TYPE_MASK = 0x000f,
+ DRM_EXYNOS_IPP_LIMIT_SIZE_MASK = 0x000f << 16,
+};
+
+struct drm_exynos_ipp_limit_val {
+ __u32 min;
+ __u32 max;
+ __u32 align;
+ __u32 reserved;
+};
+
+/**
+ * IPP module limitation.
+ *
+ * @type: limit type (see drm_exynos_ipp_limit_type enum)
+ * @reserved: padding
+ * @h: horizontal limits
+ * @v: vertical limits
+ */
+struct drm_exynos_ipp_limit {
+ __u32 type;
+ __u32 reserved;
+ struct drm_exynos_ipp_limit_val h;
+ struct drm_exynos_ipp_limit_val v;
+};
+
+/**
+ * Get IPP limits for given image format.
+ *
+ * @ipp_id: id of IPP module to query
+ * @fourcc: image format code (see DRM_FORMAT_* in drm_fourcc.h)
+ * @modifier: image format modifier (see DRM_FORMAT_MOD_* in drm_fourcc.h)
+ * @type: source/destination identifier (drm_exynos_ipp_format_flag enum)
+ * @limits_count: size of limits array (in entries) / number of filled entries
+ * (set by driver)
+ * @limits_ptr: pointer to limits array or NULL
+ */
+struct drm_exynos_ioctl_ipp_get_limits {
+ __u32 ipp_id;
+ __u32 fourcc;
+ __u64 modifier;
+ __u32 type;
+ __u32 limits_count;
+ __u64 limits_ptr;
+};
+
+enum drm_exynos_ipp_task_id {
+ /* buffer described by struct drm_exynos_ipp_task_buffer */
+ DRM_EXYNOS_IPP_TASK_BUFFER = 0x0001,
+ /* rectangle described by struct drm_exynos_ipp_task_rect */
+ DRM_EXYNOS_IPP_TASK_RECTANGLE = 0x0002,
+ /* transformation described by struct drm_exynos_ipp_task_transform */
+ DRM_EXYNOS_IPP_TASK_TRANSFORM = 0x0003,
+ /* alpha configuration described by struct drm_exynos_ipp_task_alpha */
+ DRM_EXYNOS_IPP_TASK_ALPHA = 0x0004,
+
+ /* source image data (for buffer and rectangle chunks) */
+ DRM_EXYNOS_IPP_TASK_TYPE_SOURCE = 0x0001 << 16,
+ /* destination image data (for buffer and rectangle chunks) */
+ DRM_EXYNOS_IPP_TASK_TYPE_DESTINATION = 0x0002 << 16,
+};
+
+/**
+ * Memory buffer with image data.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_BUFFER
+ * other parameters are same as for AddFB2 generic DRM ioctl
+ */
+struct drm_exynos_ipp_task_buffer {
+ __u32 id;
+ __u32 fourcc;
+ __u32 width, height;
+ __u32 gem_id[4];
+ __u32 offset[4];
+ __u32 pitch[4];
+ __u64 modifier;
+};
+
+/**
+ * Rectangle for processing.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_RECTANGLE
+ * @reserved: padding
+ * @x,@y: left corner in pixels
+ * @w,@h: width/height in pixels
+ */
+struct drm_exynos_ipp_task_rect {
+ __u32 id;
+ __u32 reserved;
+ __u32 x;
+ __u32 y;
+ __u32 w;
+ __u32 h;
+};
+
+/**
+ * Image tranformation description.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_TRANSFORM
+ * @rotation: DRM_MODE_ROTATE_* and DRM_MODE_REFLECT_* values
+ */
+struct drm_exynos_ipp_task_transform {
+ __u32 id;
+ __u32 rotation;
+};
+
+/**
+ * Image global alpha configuration for formats without alpha values.
+ *
+ * @id: must be DRM_EXYNOS_IPP_TASK_ALPHA
+ * @value: global alpha value (0-255)
+ */
+struct drm_exynos_ipp_task_alpha {
+ __u32 id;
+ __u32 value;
+};
+
+enum drm_exynos_ipp_flag {
+ /* generate DRM event after processing */
+ DRM_EXYNOS_IPP_FLAG_EVENT = 0x01,
+ /* dry run, only check task parameters */
+ DRM_EXYNOS_IPP_FLAG_TEST_ONLY = 0x02,
+ /* non-blocking processing */
+ DRM_EXYNOS_IPP_FLAG_NONBLOCK = 0x04,
+};
+
+#define DRM_EXYNOS_IPP_FLAGS (DRM_EXYNOS_IPP_FLAG_EVENT |\
+ DRM_EXYNOS_IPP_FLAG_TEST_ONLY | DRM_EXYNOS_IPP_FLAG_NONBLOCK)
+
+/**
+ * Perform image processing described by array of drm_exynos_ipp_task_*
+ * structures (parameters array).
+ *
+ * @ipp_id: id of IPP module to run the task
+ * @flags: bitmask of drm_exynos_ipp_flag values
+ * @reserved: padding
+ * @params_size: size of parameters array (in bytes)
+ * @params_ptr: pointer to parameters array or NULL
+ * @user_data: (optional) data for drm event
+ */
+struct drm_exynos_ioctl_ipp_commit {
+ __u32 ipp_id;
+ __u32 flags;
+ __u32 reserved;
+ __u32 params_size;
+ __u64 params_ptr;
+ __u64 user_data;
+};
+
#define DRM_EXYNOS_GEM_CREATE 0x00
#define DRM_EXYNOS_GEM_MAP 0x01
/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
@@ -147,6 +360,11 @@ struct drm_exynos_g2d_exec {
#define DRM_EXYNOS_G2D_EXEC 0x22
/* Reserved 0x30 ~ 0x33 for obsolete Exynos IPP ioctls */
+/* IPP - Image Post Processing */
+#define DRM_EXYNOS_IPP_GET_RESOURCES 0x40
+#define DRM_EXYNOS_IPP_GET_CAPS 0x41
+#define DRM_EXYNOS_IPP_GET_LIMITS 0x42
+#define DRM_EXYNOS_IPP_COMMIT 0x43
#define DRM_IOCTL_EXYNOS_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + \
DRM_EXYNOS_GEM_CREATE, struct drm_exynos_gem_create)
@@ -165,8 +383,20 @@ struct drm_exynos_g2d_exec {
#define DRM_IOCTL_EXYNOS_G2D_EXEC DRM_IOWR(DRM_COMMAND_BASE + \
DRM_EXYNOS_G2D_EXEC, struct drm_exynos_g2d_exec)
+#define DRM_IOCTL_EXYNOS_IPP_GET_RESOURCES DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_RESOURCES, \
+ struct drm_exynos_ioctl_ipp_get_res)
+#define DRM_IOCTL_EXYNOS_IPP_GET_CAPS DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_CAPS, struct drm_exynos_ioctl_ipp_get_caps)
+#define DRM_IOCTL_EXYNOS_IPP_GET_LIMITS DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_GET_LIMITS, \
+ struct drm_exynos_ioctl_ipp_get_limits)
+#define DRM_IOCTL_EXYNOS_IPP_COMMIT DRM_IOWR(DRM_COMMAND_BASE + \
+ DRM_EXYNOS_IPP_COMMIT, struct drm_exynos_ioctl_ipp_commit)
+
/* EXYNOS specific events */
#define DRM_EXYNOS_G2D_EVENT 0x80000000
+#define DRM_EXYNOS_IPP_EVENT 0x80000002
struct drm_exynos_g2d_event {
struct drm_event base;
@@ -177,6 +407,16 @@ struct drm_exynos_g2d_event {
__u32 reserved;
};
+struct drm_exynos_ipp_event {
+ struct drm_event base;
+ __u64 user_data;
+ __u32 tv_sec;
+ __u32 tv_usec;
+ __u32 ipp_id;
+ __u32 sequence;
+ __u64 reserved;
+};
+
#if defined(__cplusplus)
}
#endif
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 16e452aa12d4..cf9dcf040321 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -110,9 +110,17 @@ enum drm_i915_gem_engine_class {
enum drm_i915_pmu_engine_sample {
I915_SAMPLE_BUSY = 0,
I915_SAMPLE_WAIT = 1,
- I915_SAMPLE_SEMA = 2
+ I915_SAMPLE_SEMA = 2,
+ I915_SAMPLE_QUEUED = 3,
+ I915_SAMPLE_RUNNABLE = 4,
+ I915_SAMPLE_RUNNING = 5,
};
+ /* Divide counter value by divisor to get the real value. */
+#define I915_SAMPLE_QUEUED_DIVISOR (1000)
+#define I915_SAMPLE_RUNNABLE_DIVISOR (1000)
+#define I915_SAMPLE_RUNNING_DIVISOR (1000)
+
#define I915_PMU_SAMPLE_BITS (4)
#define I915_PMU_SAMPLE_MASK (0xf)
#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
@@ -133,6 +141,15 @@ enum drm_i915_pmu_engine_sample {
#define I915_PMU_ENGINE_SEMA(class, instance) \
__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_RUNNABLE(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNABLE)
+
+#define I915_PMU_ENGINE_RUNNING(class, instance) \
+ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_RUNNING)
+
#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
#define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0)
@@ -412,6 +429,14 @@ typedef struct drm_i915_irq_wait {
int irq_seq;
} drm_i915_irq_wait_t;
+/*
+ * Different modes of per-process Graphics Translation Table,
+ * see I915_PARAM_HAS_ALIASING_PPGTT
+ */
+#define I915_GEM_PPGTT_NONE 0
+#define I915_GEM_PPGTT_ALIASING 1
+#define I915_GEM_PPGTT_FULL 2
+
/* Ioctl to query kernel params:
*/
#define I915_PARAM_IRQ_ACTIVE 1
@@ -529,6 +554,28 @@ typedef struct drm_i915_irq_wait {
*/
#define I915_PARAM_CS_TIMESTAMP_FREQUENCY 51
+/*
+ * Once upon a time we supposed that writes through the GGTT would be
+ * immediately in physical memory (once flushed out of the CPU path). However,
+ * on a few different processors and chipsets, this is not necessarily the case
+ * as the writes appear to be buffered internally. Thus a read of the backing
+ * storage (physical memory) via a different path (with different physical tags
+ * to the indirect write via the GGTT) will see stale values from before
+ * the GGTT write. Inside the kernel, we can for the most part keep track of
+ * the different read/write domains in use (e.g. set-domain), but the assumption
+ * of coherency is baked into the ABI, hence reporting its true state in this
+ * parameter.
+ *
+ * Reports true when writes via mmap_gtt are immediately visible following an
+ * lfence to flush the WCB.
+ *
+ * Reports false when writes via mmap_gtt are indeterminately delayed in an in
+ * internal buffer and are _not_ immediately visible to third parties accessing
+ * directly via mmap_cpu/mmap_wc. Use of mmap_gtt as part of an IPC
+ * communications channel when reporting false is strongly disadvised.
+ */
+#define I915_PARAM_MMAP_GTT_COHERENT 52
+
typedef struct drm_i915_getparam {
__s32 param;
/*
diff --git a/include/drm-uapi/msm_drm.h b/include/drm-uapi/msm_drm.h
index bbbaffad772d..c06d0a5bdd80 100644
--- a/include/drm-uapi/msm_drm.h
+++ b/include/drm-uapi/msm_drm.h
@@ -201,10 +201,12 @@ struct drm_msm_gem_submit_bo {
#define MSM_SUBMIT_NO_IMPLICIT 0x80000000 /* disable implicit sync */
#define MSM_SUBMIT_FENCE_FD_IN 0x40000000 /* enable input fence_fd */
#define MSM_SUBMIT_FENCE_FD_OUT 0x20000000 /* enable output fence_fd */
+#define MSM_SUBMIT_SUDO 0x10000000 /* run submitted cmds from RB */
#define MSM_SUBMIT_FLAGS ( \
MSM_SUBMIT_NO_IMPLICIT | \
MSM_SUBMIT_FENCE_FD_IN | \
MSM_SUBMIT_FENCE_FD_OUT | \
+ MSM_SUBMIT_SUDO | \
0)
/* Each cmdstream submit consists of a table of buffers involved, and
diff --git a/include/drm-uapi/tegra_drm.h b/include/drm-uapi/tegra_drm.h
index 12f9bf848db1..6c07919c04e9 100644
--- a/include/drm-uapi/tegra_drm.h
+++ b/include/drm-uapi/tegra_drm.h
@@ -32,143 +32,615 @@ extern "C" {
#define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
+/**
+ * struct drm_tegra_gem_create - parameters for the GEM object creation IOCTL
+ */
struct drm_tegra_gem_create {
+ /**
+ * @size:
+ *
+ * The size, in bytes, of the buffer object to be created.
+ */
__u64 size;
+
+ /**
+ * @flags:
+ *
+ * A bitmask of flags that influence the creation of GEM objects:
+ *
+ * DRM_TEGRA_GEM_CREATE_TILED
+ * Use the 16x16 tiling format for this buffer.
+ *
+ * DRM_TEGRA_GEM_CREATE_BOTTOM_UP
+ * The buffer has a bottom-up layout.
+ */
__u32 flags;
+
+ /**
+ * @handle:
+ *
+ * The handle of the created GEM object. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u32 handle;
};
+/**
+ * struct drm_tegra_gem_mmap - parameters for the GEM mmap IOCTL
+ */
struct drm_tegra_gem_mmap {
+ /**
+ * @handle:
+ *
+ * Handle of the GEM object to obtain an mmap offset for.
+ */
__u32 handle;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
+
+ /**
+ * @offset:
+ *
+ * The mmap offset for the given GEM object. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u64 offset;
};
+/**
+ * struct drm_tegra_syncpt_read - parameters for the read syncpoint IOCTL
+ */
struct drm_tegra_syncpt_read {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to read the current value from.
+ */
__u32 id;
+
+ /**
+ * @value:
+ *
+ * The current syncpoint value. Set by the kernel upon successful
+ * completion of the IOCTL.
+ */
__u32 value;
};
+/**
+ * struct drm_tegra_syncpt_incr - parameters for the increment syncpoint IOCTL
+ */
struct drm_tegra_syncpt_incr {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to increment.
+ */
__u32 id;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_syncpt_wait - parameters for the wait syncpoint IOCTL
+ */
struct drm_tegra_syncpt_wait {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to wait on.
+ */
__u32 id;
+
+ /**
+ * @thresh:
+ *
+ * Threshold value for which to wait.
+ */
__u32 thresh;
+
+ /**
+ * @timeout:
+ *
+ * Timeout, in milliseconds, to wait.
+ */
__u32 timeout;
+
+ /**
+ * @value:
+ *
+ * The new syncpoint value after the wait. Set by the kernel upon
+ * successful completion of the IOCTL.
+ */
__u32 value;
};
#define DRM_TEGRA_NO_TIMEOUT (0xffffffff)
+/**
+ * struct drm_tegra_open_channel - parameters for the open channel IOCTL
+ */
struct drm_tegra_open_channel {
+ /**
+ * @client:
+ *
+ * The client ID for this channel.
+ */
__u32 client;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
+
+ /**
+ * @context:
+ *
+ * The application context of this channel. Set by the kernel upon
+ * successful completion of the IOCTL. This context needs to be passed
+ * to the DRM_TEGRA_CHANNEL_CLOSE or the DRM_TEGRA_SUBMIT IOCTLs.
+ */
__u64 context;
};
+/**
+ * struct drm_tegra_close_channel - parameters for the close channel IOCTL
+ */
struct drm_tegra_close_channel {
+ /**
+ * @context:
+ *
+ * The application context of this channel. This is obtained from the
+ * DRM_TEGRA_OPEN_CHANNEL IOCTL.
+ */
__u64 context;
};
+/**
+ * struct drm_tegra_get_syncpt - parameters for the get syncpoint IOCTL
+ */
struct drm_tegra_get_syncpt {
+ /**
+ * @context:
+ *
+ * The application context identifying the channel for which to obtain
+ * the syncpoint ID.
+ */
__u64 context;
+
+ /**
+ * @index:
+ *
+ * Index of the client syncpoint for which to obtain the ID.
+ */
__u32 index;
+
+ /**
+ * @id:
+ *
+ * The ID of the given syncpoint. Set by the kernel upon successful
+ * completion of the IOCTL.
+ */
__u32 id;
};
+/**
+ * struct drm_tegra_get_syncpt_base - parameters for the get wait base IOCTL
+ */
struct drm_tegra_get_syncpt_base {
+ /**
+ * @context:
+ *
+ * The application context identifying for which channel to obtain the
+ * wait base.
+ */
__u64 context;
+
+ /**
+ * @syncpt:
+ *
+ * ID of the syncpoint for which to obtain the wait base.
+ */
__u32 syncpt;
+
+ /**
+ * @id:
+ *
+ * The ID of the wait base corresponding to the client syncpoint. Set
+ * by the kernel upon successful completion of the IOCTL.
+ */
__u32 id;
};
+/**
+ * struct drm_tegra_syncpt - syncpoint increment operation
+ */
struct drm_tegra_syncpt {
+ /**
+ * @id:
+ *
+ * ID of the syncpoint to operate on.
+ */
__u32 id;
+
+ /**
+ * @incrs:
+ *
+ * Number of increments to perform for the syncpoint.
+ */
__u32 incrs;
};
+/**
+ * struct drm_tegra_cmdbuf - structure describing a command buffer
+ */
struct drm_tegra_cmdbuf {
+ /**
+ * @handle:
+ *
+ * Handle to a GEM object containing the command buffer.
+ */
__u32 handle;
+
+ /**
+ * @offset:
+ *
+ * Offset, in bytes, into the GEM object identified by @handle at
+ * which the command buffer starts.
+ */
__u32 offset;
+
+ /**
+ * @words:
+ *
+ * Number of 32-bit words in this command buffer.
+ */
__u32 words;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_reloc - GEM object relocation structure
+ */
struct drm_tegra_reloc {
struct {
+ /**
+ * @cmdbuf.handle:
+ *
+ * Handle to the GEM object containing the command buffer for
+ * which to perform this GEM object relocation.
+ */
__u32 handle;
+
+ /**
+ * @cmdbuf.offset:
+ *
+ * Offset, in bytes, into the command buffer at which to
+ * insert the relocated address.
+ */
__u32 offset;
} cmdbuf;
struct {
+ /**
+ * @target.handle:
+ *
+ * Handle to the GEM object to be relocated.
+ */
__u32 handle;
+
+ /**
+ * @target.offset:
+ *
+ * Offset, in bytes, into the target GEM object at which the
+ * relocated data starts.
+ */
__u32 offset;
} target;
+
+ /**
+ * @shift:
+ *
+ * The number of bits by which to shift relocated addresses.
+ */
__u32 shift;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_waitchk - wait check structure
+ */
struct drm_tegra_waitchk {
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object containing a command stream on which to
+ * perform the wait check.
+ */
__u32 handle;
+
+ /**
+ * @offset:
+ *
+ * Offset, in bytes, of the location in the command stream to perform
+ * the wait check on.
+ */
__u32 offset;
+
+ /**
+ * @syncpt:
+ *
+ * ID of the syncpoint to wait check.
+ */
__u32 syncpt;
+
+ /**
+ * @thresh:
+ *
+ * Threshold value for which to check.
+ */
__u32 thresh;
};
+/**
+ * struct drm_tegra_submit - job submission structure
+ */
struct drm_tegra_submit {
+ /**
+ * @context:
+ *
+ * The application context identifying the channel to use for the
+ * execution of this job.
+ */
__u64 context;
+
+ /**
+ * @num_syncpts:
+ *
+ * The number of syncpoints operated on by this job. This defines the
+ * length of the array pointed to by @syncpts.
+ */
__u32 num_syncpts;
+
+ /**
+ * @num_cmdbufs:
+ *
+ * The number of command buffers to execute as part of this job. This
+ * defines the length of the array pointed to by @cmdbufs.
+ */
__u32 num_cmdbufs;
+
+ /**
+ * @num_relocs:
+ *
+ * The number of relocations to perform before executing this job.
+ * This defines the length of the array pointed to by @relocs.
+ */
__u32 num_relocs;
+
+ /**
+ * @num_waitchks:
+ *
+ * The number of wait checks to perform as part of this job. This
+ * defines the length of the array pointed to by @waitchks.
+ */
__u32 num_waitchks;
+
+ /**
+ * @waitchk_mask:
+ *
+ * Bitmask of valid wait checks.
+ */
__u32 waitchk_mask;
+
+ /**
+ * @timeout:
+ *
+ * Timeout, in milliseconds, before this job is cancelled.
+ */
__u32 timeout;
+
+ /**
+ * @syncpts:
+ *
+ * A pointer to an array of &struct drm_tegra_syncpt structures that
+ * specify the syncpoint operations performed as part of this job.
+ * The number of elements in the array must be equal to the value
+ * given by @num_syncpts.
+ */
__u64 syncpts;
+
+ /**
+ * @cmdbufs:
+ *
+ * A pointer to an array of &struct drm_tegra_cmdbuf structures that
+ * define the command buffers to execute as part of this job. The
+ * number of elements in the array must be equal to the value given
+ * by @num_syncpts.
+ */
__u64 cmdbufs;
+
+ /**
+ * @relocs:
+ *
+ * A pointer to an array of &struct drm_tegra_reloc structures that
+ * specify the relocations that need to be performed before executing
+ * this job. The number of elements in the array must be equal to the
+ * value given by @num_relocs.
+ */
__u64 relocs;
+
+ /**
+ * @waitchks:
+ *
+ * A pointer to an array of &struct drm_tegra_waitchk structures that
+ * specify the wait checks to be performed while executing this job.
+ * The number of elements in the array must be equal to the value
+ * given by @num_waitchks.
+ */
__u64 waitchks;
- __u32 fence; /* Return value */
- __u32 reserved[5]; /* future expansion */
+ /**
+ * @fence:
+ *
+ * The threshold of the syncpoint associated with this job after it
+ * has been completed. Set by the kernel upon successful completion of
+ * the IOCTL. This can be used with the DRM_TEGRA_SYNCPT_WAIT IOCTL to
+ * wait for this job to be finished.
+ */
+ __u32 fence;
+
+ /**
+ * @reserved:
+ *
+ * This field is reserved for future use. Must be 0.
+ */
+ __u32 reserved[5];
};
#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0
#define DRM_TEGRA_GEM_TILING_MODE_TILED 1
#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2
+/**
+ * struct drm_tegra_gem_set_tiling - parameters for the set tiling IOCTL
+ */
struct drm_tegra_gem_set_tiling {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to set the tiling parameters.
+ */
__u32 handle;
+
+ /**
+ * @mode:
+ *
+ * The tiling mode to set. Must be one of:
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_PITCH
+ * pitch linear format
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_TILED
+ * 16x16 tiling format
+ *
+ * DRM_TEGRA_GEM_TILING_MODE_BLOCK
+ * 16Bx2 tiling format
+ */
__u32 mode;
+
+ /**
+ * @value:
+ *
+ * The value to set for the tiling mode parameter.
+ */
__u32 value;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
+/**
+ * struct drm_tegra_gem_get_tiling - parameters for the get tiling IOCTL
+ */
struct drm_tegra_gem_get_tiling {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to query the tiling parameters.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @mode:
+ *
+ * The tiling mode currently associated with the GEM object. Set by
+ * the kernel upon successful completion of the IOCTL.
+ */
__u32 mode;
+
+ /**
+ * @value:
+ *
+ * The tiling mode parameter currently associated with the GEM object.
+ * Set by the kernel upon successful completion of the IOCTL.
+ */
__u32 value;
+
+ /**
+ * @pad:
+ *
+ * Structure padding that may be used in the future. Must be 0.
+ */
__u32 pad;
};
#define DRM_TEGRA_GEM_BOTTOM_UP (1 << 0)
#define DRM_TEGRA_GEM_FLAGS (DRM_TEGRA_GEM_BOTTOM_UP)
+/**
+ * struct drm_tegra_gem_set_flags - parameters for the set flags IOCTL
+ */
struct drm_tegra_gem_set_flags {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to set the flags.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @flags:
+ *
+ * The flags to set for the GEM object.
+ */
__u32 flags;
};
+/**
+ * struct drm_tegra_gem_get_flags - parameters for the get flags IOCTL
+ */
struct drm_tegra_gem_get_flags {
- /* input */
+ /**
+ * @handle:
+ *
+ * Handle to the GEM object for which to query the flags.
+ */
__u32 handle;
- /* output */
+
+ /**
+ * @flags:
+ *
+ * The flags currently associated with the GEM object. Set by the
+ * kernel upon successful completion of the IOCTL.
+ */
__u32 flags;
};
@@ -193,7 +665,7 @@ struct drm_tegra_gem_get_flags {
#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)
-#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)
+#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_close_channel)
#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)
#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)
#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)
diff --git a/include/drm-uapi/v3d_drm.h b/include/drm-uapi/v3d_drm.h
new file mode 100644
index 000000000000..7b6627783608
--- /dev/null
+++ b/include/drm-uapi/v3d_drm.h
@@ -0,0 +1,194 @@
+/*
+ * Copyright © 2014-2018 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef _V3D_DRM_H_
+#define _V3D_DRM_H_
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+#define DRM_V3D_SUBMIT_CL 0x00
+#define DRM_V3D_WAIT_BO 0x01
+#define DRM_V3D_CREATE_BO 0x02
+#define DRM_V3D_MMAP_BO 0x03
+#define DRM_V3D_GET_PARAM 0x04
+#define DRM_V3D_GET_BO_OFFSET 0x05
+
+#define DRM_IOCTL_V3D_SUBMIT_CL DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)
+#define DRM_IOCTL_V3D_WAIT_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)
+#define DRM_IOCTL_V3D_CREATE_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo)
+#define DRM_IOCTL_V3D_MMAP_BO DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo)
+#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
+#define DRM_IOCTL_V3D_GET_BO_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)
+
+/**
+ * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D
+ * engine.
+ *
+ * This asks the kernel to have the GPU execute an optional binner
+ * command list, and a render command list.
+ */
+struct drm_v3d_submit_cl {
+ /* Pointer to the binner command list.
+ *
+ * This is the first set of commands executed, which runs the
+ * coordinate shader to determine where primitives land on the screen,
+ * then writes out the state updates and draw calls necessary per tile
+ * to the tile allocation BO.
+ */
+ __u32 bcl_start;
+
+ /** End address of the BCL (first byte after the BCL) */
+ __u32 bcl_end;
+
+ /* Offset of the render command list.
+ *
+ * This is the second set of commands executed, which will either
+ * execute the tiles that have been set up by the BCL, or a fixed set
+ * of tiles (in the case of RCL-only blits).
+ */
+ __u32 rcl_start;
+
+ /** End address of the RCL (first byte after the RCL) */
+ __u32 rcl_end;
+
+ /** An optional sync object to wait on before starting the BCL. */
+ __u32 in_sync_bcl;
+ /** An optional sync object to wait on before starting the RCL. */
+ __u32 in_sync_rcl;
+ /** An optional sync object to place the completion fence in. */
+ __u32 out_sync;
+
+ /* Offset of the tile alloc memory
+ *
+ * This is optional on V3D 3.3 (where the CL can set the value) but
+ * required on V3D 4.1.
+ */
+ __u32 qma;
+
+ /** Size of the tile alloc memory. */
+ __u32 qms;
+
+ /** Offset of the tile state data array. */
+ __u32 qts;
+
+ /* Pointer to a u32 array of the BOs that are referenced by the job.
+ */
+ __u64 bo_handles;
+
+ /* Number of BO handles passed in (size is that times 4). */
+ __u32 bo_handle_count;
+
+ /* Pad, must be zero-filled. */
+ __u32 pad;
+};
+
+/**
+ * struct drm_v3d_wait_bo - ioctl argument for waiting for
+ * completion of the last DRM_V3D_SUBMIT_CL on a BO.
+ *
+ * This is useful for cases where multiple processes might be
+ * rendering to a BO and you want to wait for all rendering to be
+ * completed.
+ */
+struct drm_v3d_wait_bo {
+ __u32 handle;
+ __u32 pad;
+ __u64 timeout_ns;
+};
+
+/**
+ * struct drm_v3d_create_bo - ioctl argument for creating V3D BOs.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_v3d_create_bo {
+ __u32 size;
+ __u32 flags;
+ /** Returned GEM handle for the BO. */
+ __u32 handle;
+ /**
+ * Returned offset for the BO in the V3D address space. This offset
+ * is private to the DRM fd and is valid for the lifetime of the GEM
+ * handle.
+ *
+ * This offset value will always be nonzero, since various HW
+ * units treat 0 specially.
+ */
+ __u32 offset;
+};
+
+/**
+ * struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs.
+ *
+ * This doesn't actually perform an mmap. Instead, it returns the
+ * offset you need to use in an mmap on the DRM device node. This
+ * means that tools like valgrind end up knowing about the mapped
+ * memory.
+ *
+ * There are currently no values for the flags argument, but it may be
+ * used in a future extension.
+ */
+struct drm_v3d_mmap_bo {
+ /** Handle for the object being mapped. */
+ __u32 handle;
+ __u32 flags;
+ /** offset into the drm node to use for subsequent mmap call. */
+ __u64 offset;
+};
+
+enum drm_v3d_param {
+ DRM_V3D_PARAM_V3D_UIFCFG,
+ DRM_V3D_PARAM_V3D_HUB_IDENT1,
+ DRM_V3D_PARAM_V3D_HUB_IDENT2,
+ DRM_V3D_PARAM_V3D_HUB_IDENT3,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT0,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT1,
+ DRM_V3D_PARAM_V3D_CORE0_IDENT2,
+};
+
+struct drm_v3d_get_param {
+ __u32 param;
+ __u32 pad;
+ __u64 value;
+};
+
+/**
+ * Returns the offset for the BO in the V3D address space for this DRM fd.
+ * This is the same value returned by drm_v3d_create_bo, if that was called
+ * from this DRM fd.
+ */
+struct drm_v3d_get_bo_offset {
+ __u32 handle;
+ __u32 offset;
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _V3D_DRM_H_ */
diff --git a/include/drm-uapi/vc4_drm.h b/include/drm-uapi/vc4_drm.h
index 4117117b4204..31f50de39acb 100644
--- a/include/drm-uapi/vc4_drm.h
+++ b/include/drm-uapi/vc4_drm.h
@@ -183,10 +183,17 @@ struct drm_vc4_submit_cl {
/* ID of the perfmon to attach to this job. 0 means no perfmon. */
__u32 perfmonid;
- /* Unused field to align this struct on 64 bits. Must be set to 0.
- * If one ever needs to add an u32 field to this struct, this field
- * can be used.
+ /* Syncobj handle to wait on. If set, processing of this render job
+ * will not start until the syncobj is signaled. 0 means ignore.
*/
+ __u32 in_sync;
+
+ /* Syncobj handle to export fence to. If set, the fence in the syncobj
+ * will be replaced with a fence that signals upon completion of this
+ * render job. 0 means ignore.
+ */
+ __u32 out_sync;
+
__u32 pad2;
};
diff --git a/include/drm-uapi/virtgpu_drm.h b/include/drm-uapi/virtgpu_drm.h
index 91a31ffed828..9a781f0611df 100644
--- a/include/drm-uapi/virtgpu_drm.h
+++ b/include/drm-uapi/virtgpu_drm.h
@@ -63,6 +63,7 @@ struct drm_virtgpu_execbuffer {
};
#define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */
+#define VIRTGPU_PARAM_CAPSET_QUERY_FIX 2 /* do we have the capset fix */
struct drm_virtgpu_getparam {
__u64 param;
diff --git a/include/drm-uapi/vmwgfx_drm.h b/include/drm-uapi/vmwgfx_drm.h
index 0bc784f5e0db..399f58317cff 100644
--- a/include/drm-uapi/vmwgfx_drm.h
+++ b/include/drm-uapi/vmwgfx_drm.h
@@ -40,6 +40,7 @@ extern "C" {
#define DRM_VMW_GET_PARAM 0
#define DRM_VMW_ALLOC_DMABUF 1
+#define DRM_VMW_ALLOC_BO 1
#define DRM_VMW_UNREF_DMABUF 2
#define DRM_VMW_HANDLE_CLOSE 2
#define DRM_VMW_CURSOR_BYPASS 3
@@ -68,6 +69,8 @@ extern "C" {
#define DRM_VMW_GB_SURFACE_REF 24
#define DRM_VMW_SYNCCPU 25
#define DRM_VMW_CREATE_EXTENDED_CONTEXT 26
+#define DRM_VMW_GB_SURFACE_CREATE_EXT 27
+#define DRM_VMW_GB_SURFACE_REF_EXT 28
/*************************************************************************/
/**
@@ -79,6 +82,9 @@ extern "C" {
*
* DRM_VMW_PARAM_OVERLAY_IOCTL:
* Does the driver support the overlay ioctl.
+ *
+ * DRM_VMW_PARAM_SM4_1
+ * SM4_1 support is enabled.
*/
#define DRM_VMW_PARAM_NUM_STREAMS 0
@@ -94,6 +100,8 @@ extern "C" {
#define DRM_VMW_PARAM_MAX_MOB_SIZE 10
#define DRM_VMW_PARAM_SCREEN_TARGET 11
#define DRM_VMW_PARAM_DX 12
+#define DRM_VMW_PARAM_HW_CAPS2 13
+#define DRM_VMW_PARAM_SM4_1 14
/**
* enum drm_vmw_handle_type - handle type for ref ioctls
@@ -356,9 +364,9 @@ struct drm_vmw_fence_rep {
/*************************************************************************/
/**
- * DRM_VMW_ALLOC_DMABUF
+ * DRM_VMW_ALLOC_BO
*
- * Allocate a DMA buffer that is visible also to the host.
+ * Allocate a buffer object that is visible also to the host.
* NOTE: The buffer is
* identified by a handle and an offset, which are private to the guest, but
* useable in the command stream. The guest kernel may translate these
@@ -366,27 +374,28 @@ struct drm_vmw_fence_rep {
* be zero at all times, or it may disappear from the interface before it is
* fixed.
*
- * The DMA buffer may stay user-space mapped in the guest at all times,
+ * The buffer object may stay user-space mapped in the guest at all times,
* and is thus suitable for sub-allocation.
*
- * DMA buffers are mapped using the mmap() syscall on the drm device.
+ * Buffer objects are mapped using the mmap() syscall on the drm device.
*/
/**
- * struct drm_vmw_alloc_dmabuf_req
+ * struct drm_vmw_alloc_bo_req
*
* @size: Required minimum size of the buffer.
*
- * Input data to the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Input data to the DRM_VMW_ALLOC_BO Ioctl.
*/
-struct drm_vmw_alloc_dmabuf_req {
+struct drm_vmw_alloc_bo_req {
__u32 size;
__u32 pad64;
};
+#define drm_vmw_alloc_dmabuf_req drm_vmw_alloc_bo_req
/**
- * struct drm_vmw_dmabuf_rep
+ * struct drm_vmw_bo_rep
*
* @map_handle: Offset to use in the mmap() call used to map the buffer.
* @handle: Handle unique to this buffer. Used for unreferencing.
@@ -395,50 +404,32 @@ struct drm_vmw_alloc_dmabuf_req {
* @cur_gmr_offset: Offset to use in the command stream when this buffer is
* referenced. See note above.
*
- * Output data from the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Output data from the DRM_VMW_ALLOC_BO Ioctl.
*/
-struct drm_vmw_dmabuf_rep {
+struct drm_vmw_bo_rep {
__u64 map_handle;
__u32 handle;
__u32 cur_gmr_id;
__u32 cur_gmr_offset;
__u32 pad64;
};
+#define drm_vmw_dmabuf_rep drm_vmw_bo_rep
/**
- * union drm_vmw_dmabuf_arg
+ * union drm_vmw_alloc_bo_arg
*
* @req: Input data as described above.
* @rep: Output data as described above.
*
- * Argument to the DRM_VMW_ALLOC_DMABUF Ioctl.
+ * Argument to the DRM_VMW_ALLOC_BO Ioctl.
*/
-union drm_vmw_alloc_dmabuf_arg {
- struct drm_vmw_alloc_dmabuf_req req;
- struct drm_vmw_dmabuf_rep rep;
-};
-
-/*************************************************************************/
-/**
- * DRM_VMW_UNREF_DMABUF - Free a DMA buffer.
- *
- */
-
-/**
- * struct drm_vmw_unref_dmabuf_arg
- *
- * @handle: Handle indicating what buffer to free. Obtained from the
- * DRM_VMW_ALLOC_DMABUF Ioctl.
- *
- * Argument to the DRM_VMW_UNREF_DMABUF Ioctl.
- */
-
-struct drm_vmw_unref_dmabuf_arg {
- __u32 handle;
- __u32 pad64;
+union drm_vmw_alloc_bo_arg {
+ struct drm_vmw_alloc_bo_req req;
+ struct drm_vmw_bo_rep rep;
};
+#define drm_vmw_alloc_dmabuf_arg drm_vmw_alloc_bo_arg
/*************************************************************************/
/**
@@ -1103,9 +1094,8 @@ union drm_vmw_extended_context_arg {
* DRM_VMW_HANDLE_CLOSE - Close a user-space handle and release its
* underlying resource.
*
- * Note that this ioctl is overlaid on the DRM_VMW_UNREF_DMABUF Ioctl.
- * The ioctl arguments therefore need to be identical in layout.
- *
+ * Note that this ioctl is overlaid on the deprecated DRM_VMW_UNREF_DMABUF
+ * Ioctl.
*/
/**
@@ -1119,7 +1109,107 @@ struct drm_vmw_handle_close_arg {
__u32 handle;
__u32 pad64;
};
+#define drm_vmw_unref_dmabuf_arg drm_vmw_handle_close_arg
+
+/*************************************************************************/
+/**
+ * DRM_VMW_GB_SURFACE_CREATE_EXT - Create a host guest-backed surface.
+ *
+ * Allocates a surface handle and queues a create surface command
+ * for the host on the first use of the surface. The surface ID can
+ * be used as the surface ID in commands referencing the surface.
+ *
+ * This new command extends DRM_VMW_GB_SURFACE_CREATE by adding version
+ * parameter and 64 bit svga flag.
+ */
+
+/**
+ * enum drm_vmw_surface_version
+ *
+ * @drm_vmw_surface_gb_v1: Corresponds to current gb surface format with
+ * svga3d surface flags split into 2, upper half and lower half.
+ */
+enum drm_vmw_surface_version {
+ drm_vmw_gb_surface_v1
+};
+
+/**
+ * struct drm_vmw_gb_surface_create_ext_req
+ *
+ * @base: Surface create parameters.
+ * @version: Version of surface create ioctl.
+ * @svga3d_flags_upper_32_bits: Upper 32 bits of svga3d flags.
+ * @multisample_pattern: Multisampling pattern when msaa is supported.
+ * @quality_level: Precision settings for each sample.
+ * @must_be_zero: Reserved for future usage.
+ *
+ * Input argument to the DRM_VMW_GB_SURFACE_CREATE_EXT Ioctl.
+ * Part of output argument for the DRM_VMW_GB_SURFACE_REF_EXT Ioctl.
+ */
+struct drm_vmw_gb_surface_create_ext_req {
+ struct drm_vmw_gb_surface_create_req base;
+ enum drm_vmw_surface_version version;
+ uint32_t svga3d_flags_upper_32_bits;
+ SVGA3dMSPattern multisample_pattern;
+ SVGA3dMSQualityLevel quality_level;
+ uint64_t must_be_zero;
+};
+
+/**
+ * union drm_vmw_gb_surface_create_ext_arg
+ *
+ * @req: Input argument as described above.
+ * @rep: Output argument as described above.
+ *
+ * Argument to the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ */
+union drm_vmw_gb_surface_create_ext_arg {
+ struct drm_vmw_gb_surface_create_rep rep;
+ struct drm_vmw_gb_surface_create_ext_req req;
+};
+
+/*************************************************************************/
+/**
+ * DRM_VMW_GB_SURFACE_REF_EXT - Reference a host surface.
+ *
+ * Puts a reference on a host surface with a given handle, as previously
+ * returned by the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ * A reference will make sure the surface isn't destroyed while we hold
+ * it and will allow the calling client to use the surface handle in
+ * the command stream.
+ *
+ * On successful return, the Ioctl returns the surface information given
+ * to and returned from the DRM_VMW_GB_SURFACE_CREATE_EXT ioctl.
+ */
+/**
+ * struct drm_vmw_gb_surface_ref_ext_rep
+ *
+ * @creq: The data used as input when the surface was created, as described
+ * above at "struct drm_vmw_gb_surface_create_ext_req"
+ * @crep: Additional data output when the surface was created, as described
+ * above at "struct drm_vmw_gb_surface_create_rep"
+ *
+ * Output Argument to the DRM_VMW_GB_SURFACE_REF_EXT ioctl.
+ */
+struct drm_vmw_gb_surface_ref_ext_rep {
+ struct drm_vmw_gb_surface_create_ext_req creq;
+ struct drm_vmw_gb_surface_create_rep crep;
+};
+
+/**
+ * union drm_vmw_gb_surface_reference_ext_arg
+ *
+ * @req: Input data as described above at "struct drm_vmw_surface_arg"
+ * @rep: Output data as described above at
+ * "struct drm_vmw_gb_surface_ref_ext_rep"
+ *
+ * Argument to the DRM_VMW_GB_SURFACE_REF Ioctl.
+ */
+union drm_vmw_gb_surface_reference_ext_arg {
+ struct drm_vmw_gb_surface_ref_ext_rep rep;
+ struct drm_vmw_surface_arg req;
+};
#if defined(__cplusplus)
}
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC i-g-t 2/6] intel-gpu-overlay: Add engine queue stats
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Use new PMU engine queue stats (queued, runnable and running) and display
them per engine.
v2:
* Compact per engine stats. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 42 ++++++++++++++++++++++++++++++++++++++++++
overlay/gpu-top.h | 11 +++++++++++
overlay/overlay.c | 7 +++++++
3 files changed, 60 insertions(+)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 61b8f62fd78c..22e9badb22c1 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -72,6 +72,18 @@ static int perf_init(struct gpu_top *gt)
gt->fd) >= 0)
gt->have_sema = 1;
+ if (perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_queued = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_runnable = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_running = 1;
+
gt->ring[0].name = d->name;
gt->num_rings = 1;
@@ -93,6 +105,24 @@ static int perf_init(struct gpu_top *gt)
gt->fd) < 0)
return -1;
+ if (gt->have_queued &&
+ perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_runnable &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_running &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
gt->ring[gt->num_rings++].name = d->name;
}
@@ -298,6 +328,12 @@ int gpu_top_update(struct gpu_top *gt)
s->wait[n] = sample[m++];
if (gt->have_sema)
s->sema[n] = sample[m++];
+ if (gt->have_queued)
+ s->queued[n] = sample[m++];
+ if (gt->have_runnable)
+ s->runnable[n] = sample[m++];
+ if (gt->have_running)
+ s->running[n] = sample[m++];
}
if (gt->count == 1)
@@ -310,6 +346,12 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
if (gt->have_sema)
gt->ring[n].u.u.sema = (100 * (s->sema[n] - d->sema[n]) + d_time/2) / d_time;
+ if (gt->have_queued)
+ gt->ring[n].queued = (double)((s->queued[n] - d->queued[n])) / I915_SAMPLE_QUEUED_DIVISOR * 1e9 / d_time;
+ if (gt->have_runnable)
+ gt->ring[n].runnable = (double)((s->runnable[n] - d->runnable[n])) / I915_SAMPLE_RUNNABLE_DIVISOR * 1e9 / d_time;
+ if (gt->have_running)
+ gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index d3cdd779760f..cb4310c82a94 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -36,6 +36,9 @@ struct gpu_top {
int num_rings;
int have_wait;
int have_sema;
+ int have_queued;
+ int have_runnable;
+ int have_running;
struct gpu_top_ring {
const char *name;
@@ -47,6 +50,10 @@ struct gpu_top {
} u;
uint32_t payload;
} u;
+
+ double queued;
+ double runnable;
+ double running;
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -54,7 +61,11 @@ struct gpu_top {
uint64_t busy[MAX_RINGS];
uint64_t wait[MAX_RINGS];
uint64_t sema[MAX_RINGS];
+ uint64_t queued[MAX_RINGS];
+ uint64_t runnable[MAX_RINGS];
+ uint64_t running[MAX_RINGS];
} stat[2];
+
int count;
};
diff --git a/overlay/overlay.c b/overlay/overlay.c
index eae5ddfa8823..7087d5e5b24e 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -256,6 +256,13 @@ static void show_gpu_top(struct overlay_context *ctx, struct overlay_gpu_top *gt
len = sprintf(txt, "%s: %3d%% busy",
gt->gpu_top.ring[n].name,
gt->gpu_top.ring[n].u.u.busy);
+ if (gt->gpu_top.have_queued &&
+ gt->gpu_top.have_runnable &&
+ gt->gpu_top.have_running)
+ len += sprintf(txt + len, " (%.2f / %.2f / %.2f)",
+ gt->gpu_top.ring[n].queued,
+ gt->gpu_top.ring[n].runnable,
+ gt->gpu_top.ring[n].running);
if (gt->gpu_top.ring[n].u.u.wait)
len += sprintf(txt + len, ", %d%% wait",
gt->gpu_top.ring[n].u.u.wait);
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 2/6] intel-gpu-overlay: Add engine queue stats
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Use new PMU engine queue stats (queued, runnable and running) and display
them per engine.
v2:
* Compact per engine stats. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 42 ++++++++++++++++++++++++++++++++++++++++++
overlay/gpu-top.h | 11 +++++++++++
overlay/overlay.c | 7 +++++++
3 files changed, 60 insertions(+)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 61b8f62fd78c..22e9badb22c1 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -72,6 +72,18 @@ static int perf_init(struct gpu_top *gt)
gt->fd) >= 0)
gt->have_sema = 1;
+ if (perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_queued = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_runnable = 1;
+
+ if (perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class, d->inst),
+ gt->fd) >= 0)
+ gt->have_running = 1;
+
gt->ring[0].name = d->name;
gt->num_rings = 1;
@@ -93,6 +105,24 @@ static int perf_init(struct gpu_top *gt)
gt->fd) < 0)
return -1;
+ if (gt->have_queued &&
+ perf_i915_open_group(I915_PMU_ENGINE_QUEUED(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_runnable &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNABLE(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
+ if (gt->have_running &&
+ perf_i915_open_group(I915_PMU_ENGINE_RUNNING(d->class,
+ d->inst),
+ gt->fd) < 0)
+ return -1;
+
gt->ring[gt->num_rings++].name = d->name;
}
@@ -298,6 +328,12 @@ int gpu_top_update(struct gpu_top *gt)
s->wait[n] = sample[m++];
if (gt->have_sema)
s->sema[n] = sample[m++];
+ if (gt->have_queued)
+ s->queued[n] = sample[m++];
+ if (gt->have_runnable)
+ s->runnable[n] = sample[m++];
+ if (gt->have_running)
+ s->running[n] = sample[m++];
}
if (gt->count == 1)
@@ -310,6 +346,12 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
if (gt->have_sema)
gt->ring[n].u.u.sema = (100 * (s->sema[n] - d->sema[n]) + d_time/2) / d_time;
+ if (gt->have_queued)
+ gt->ring[n].queued = (double)((s->queued[n] - d->queued[n])) / I915_SAMPLE_QUEUED_DIVISOR * 1e9 / d_time;
+ if (gt->have_runnable)
+ gt->ring[n].runnable = (double)((s->runnable[n] - d->runnable[n])) / I915_SAMPLE_RUNNABLE_DIVISOR * 1e9 / d_time;
+ if (gt->have_running)
+ gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index d3cdd779760f..cb4310c82a94 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -36,6 +36,9 @@ struct gpu_top {
int num_rings;
int have_wait;
int have_sema;
+ int have_queued;
+ int have_runnable;
+ int have_running;
struct gpu_top_ring {
const char *name;
@@ -47,6 +50,10 @@ struct gpu_top {
} u;
uint32_t payload;
} u;
+
+ double queued;
+ double runnable;
+ double running;
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -54,7 +61,11 @@ struct gpu_top {
uint64_t busy[MAX_RINGS];
uint64_t wait[MAX_RINGS];
uint64_t sema[MAX_RINGS];
+ uint64_t queued[MAX_RINGS];
+ uint64_t runnable[MAX_RINGS];
+ uint64_t running[MAX_RINGS];
} stat[2];
+
int count;
};
diff --git a/overlay/overlay.c b/overlay/overlay.c
index eae5ddfa8823..7087d5e5b24e 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -256,6 +256,13 @@ static void show_gpu_top(struct overlay_context *ctx, struct overlay_gpu_top *gt
len = sprintf(txt, "%s: %3d%% busy",
gt->gpu_top.ring[n].name,
gt->gpu_top.ring[n].u.u.busy);
+ if (gt->gpu_top.have_queued &&
+ gt->gpu_top.have_runnable &&
+ gt->gpu_top.have_running)
+ len += sprintf(txt + len, " (%.2f / %.2f / %.2f)",
+ gt->gpu_top.ring[n].queued,
+ gt->gpu_top.ring[n].runnable,
+ gt->gpu_top.ring[n].running);
if (gt->gpu_top.ring[n].u.u.wait)
len += sprintf(txt + len, ", %d%% wait",
gt->gpu_top.ring[n].u.u.wait);
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC i-g-t 3/6] intel-gpu-overlay: Show 1s, 30s and 15m GPU load
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Show total GPU loads in the window banner.
Engine load is defined as total of runnable and running requests on an
engine.
Total, non-normalized, load is display. In other words if N engines are
busy with exactly one request, the load will be shown as N.
v2:
* Different flavour of load avg. (Chris Wilson)
* Simplify code. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 39 ++++++++++++++++++++++++++++++++++++++-
overlay/gpu-top.h | 11 ++++++++++-
overlay/overlay.c | 28 ++++++++++++++++++++++------
3 files changed, 70 insertions(+), 8 deletions(-)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 22e9badb22c1..501429b86379 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -28,6 +28,7 @@
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
+#include <math.h>
#include <errno.h>
#include <assert.h>
@@ -126,6 +127,10 @@ static int perf_init(struct gpu_top *gt)
gt->ring[gt->num_rings++].name = d->name;
}
+ gt->have_load_avg = gt->have_queued &&
+ gt->have_runnable &&
+ gt->have_running;
+
return 0;
}
@@ -290,17 +295,32 @@ static void mmio_init(struct gpu_top *gt)
}
}
-void gpu_top_init(struct gpu_top *gt)
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us)
{
+ const double period = (double)period_us / 1e6;
+ const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
+ const char *load_names[NUM_LOADS] = { "1s", "30s", "15m" };
+ unsigned int i;
+
memset(gt, 0, sizeof(*gt));
gt->fd = -1;
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load_name[i] = load_names[i];
+ gt->exp[i] = exp(-period / load_period[i]);
+ }
+
if (perf_init(gt) == 0)
return;
mmio_init(gt);
}
+static double update_load(double load, double exp, double val)
+{
+ return val + exp * (load - val);
+}
+
int gpu_top_update(struct gpu_top *gt)
{
uint32_t data[1024];
@@ -313,6 +333,8 @@ int gpu_top_update(struct gpu_top *gt)
struct gpu_top_stat *s = >->stat[gt->count++&1];
struct gpu_top_stat *d = >->stat[gt->count&1];
uint64_t *sample, d_time;
+ double gpu_qd = 0.0;
+ unsigned int i;
int n, m;
len = read(gt->fd, data, sizeof(data));
@@ -341,6 +363,8 @@ int gpu_top_update(struct gpu_top *gt)
d_time = s->time - d->time;
for (n = 0; n < gt->num_rings; n++) {
+ double qd = 0.0;
+
gt->ring[n].u.u.busy = (100 * (s->busy[n] - d->busy[n]) + d_time/2) / d_time;
if (gt->have_wait)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
@@ -353,6 +377,14 @@ int gpu_top_update(struct gpu_top *gt)
if (gt->have_running)
gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
+ qd = gt->ring[n].runnable + gt->ring[n].running;
+ gpu_qd += qd;
+
+ for (i = 0; i < NUM_LOADS; i++)
+ gt->ring[n].load[i] =
+ update_load(gt->ring[n].load[i],
+ gt->exp[i], qd);
+
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
gt->ring[n].u.u.busy = 100;
@@ -362,6 +394,11 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.sema = 100;
}
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load[i] = update_load(gt->load[i], gt->exp[i],
+ gpu_qd);
+ gt->norm_load[i] = gt->load[i] / gt->num_rings;
+ }
update = 1;
} else {
while ((len = read(gt->fd, data, sizeof(data))) > 0) {
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index cb4310c82a94..115ce8c482c1 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -26,6 +26,7 @@
#define GPU_TOP_H
#define MAX_RINGS 16
+#define NUM_LOADS 3
#include <stdint.h>
@@ -39,6 +40,12 @@ struct gpu_top {
int have_queued;
int have_runnable;
int have_running;
+ int have_load_avg;
+
+ double exp[NUM_LOADS];
+ double load[NUM_LOADS];
+ double norm_load[NUM_LOADS];
+ const char *load_name[NUM_LOADS];
struct gpu_top_ring {
const char *name;
@@ -54,6 +61,8 @@ struct gpu_top {
double queued;
double runnable;
double running;
+
+ double load[NUM_LOADS];
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -69,7 +78,7 @@ struct gpu_top {
int count;
};
-void gpu_top_init(struct gpu_top *gt);
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us);
int gpu_top_update(struct gpu_top *gt);
#endif /* GPU_TOP_H */
diff --git a/overlay/overlay.c b/overlay/overlay.c
index 7087d5e5b24e..1a1d01db082b 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -141,7 +141,8 @@ struct overlay_context {
};
static void init_gpu_top(struct overlay_context *ctx,
- struct overlay_gpu_top *gt)
+ struct overlay_gpu_top *gt,
+ unsigned int period_us)
{
const double rgba[][4] = {
{ 1, 0.25, 0.25, 1 },
@@ -153,7 +154,7 @@ static void init_gpu_top(struct overlay_context *ctx,
int n;
cpu_top_init(>->cpu_top);
- gpu_top_init(>->gpu_top);
+ gpu_top_init(>->gpu_top, period_us);
chart_init(>->cpu, "CPU", 120);
chart_set_position(>->cpu, PAD, PAD);
@@ -928,13 +929,13 @@ int main(int argc, char **argv)
debugfs_init();
- init_gpu_top(&ctx, &ctx.gpu_top);
+ sample_period = get_sample_period(&config);
+
+ init_gpu_top(&ctx, &ctx.gpu_top, sample_period);
init_gpu_perf(&ctx, &ctx.gpu_perf);
init_gpu_freq(&ctx, &ctx.gpu_freq);
init_gem_objects(&ctx, &ctx.gem_objects);
- sample_period = get_sample_period(&config);
-
i = 0;
while (1) {
ctx.time = time(NULL);
@@ -950,9 +951,24 @@ int main(int argc, char **argv)
show_gem_objects(&ctx, &ctx.gem_objects);
{
- char buf[80];
+ struct gpu_top *gt = &ctx.gpu_top.gpu_top;
cairo_text_extents_t extents;
+ char buf[256];
+
gethostname(buf, sizeof(buf));
+
+ if (gt->have_load_avg) {
+ int len = strlen(buf);
+
+ snprintf(buf + len, sizeof(buf) - len,
+ "%s; %u engines; load %s %.2f, %s %.2f, %s %.2f",
+ buf,
+ gt->num_rings,
+ gt->load_name[0], gt->load[0],
+ gt->load_name[1], gt->load[1],
+ gt->load_name[2], gt->load[2]);
+ }
+
cairo_set_source_rgb(ctx.cr, .5, .5, .5);
cairo_set_font_size(ctx.cr, PAD-2);
cairo_text_extents(ctx.cr, buf, &extents);
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 3/6] intel-gpu-overlay: Show 1s, 30s and 15m GPU load
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Show total GPU loads in the window banner.
Engine load is defined as total of runnable and running requests on an
engine.
Total, non-normalized, load is display. In other words if N engines are
busy with exactly one request, the load will be shown as N.
v2:
* Different flavour of load avg. (Chris Wilson)
* Simplify code. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
overlay/gpu-top.c | 39 ++++++++++++++++++++++++++++++++++++++-
overlay/gpu-top.h | 11 ++++++++++-
overlay/overlay.c | 28 ++++++++++++++++++++++------
3 files changed, 70 insertions(+), 8 deletions(-)
diff --git a/overlay/gpu-top.c b/overlay/gpu-top.c
index 22e9badb22c1..501429b86379 100644
--- a/overlay/gpu-top.c
+++ b/overlay/gpu-top.c
@@ -28,6 +28,7 @@
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
+#include <math.h>
#include <errno.h>
#include <assert.h>
@@ -126,6 +127,10 @@ static int perf_init(struct gpu_top *gt)
gt->ring[gt->num_rings++].name = d->name;
}
+ gt->have_load_avg = gt->have_queued &&
+ gt->have_runnable &&
+ gt->have_running;
+
return 0;
}
@@ -290,17 +295,32 @@ static void mmio_init(struct gpu_top *gt)
}
}
-void gpu_top_init(struct gpu_top *gt)
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us)
{
+ const double period = (double)period_us / 1e6;
+ const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
+ const char *load_names[NUM_LOADS] = { "1s", "30s", "15m" };
+ unsigned int i;
+
memset(gt, 0, sizeof(*gt));
gt->fd = -1;
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load_name[i] = load_names[i];
+ gt->exp[i] = exp(-period / load_period[i]);
+ }
+
if (perf_init(gt) == 0)
return;
mmio_init(gt);
}
+static double update_load(double load, double exp, double val)
+{
+ return val + exp * (load - val);
+}
+
int gpu_top_update(struct gpu_top *gt)
{
uint32_t data[1024];
@@ -313,6 +333,8 @@ int gpu_top_update(struct gpu_top *gt)
struct gpu_top_stat *s = >->stat[gt->count++&1];
struct gpu_top_stat *d = >->stat[gt->count&1];
uint64_t *sample, d_time;
+ double gpu_qd = 0.0;
+ unsigned int i;
int n, m;
len = read(gt->fd, data, sizeof(data));
@@ -341,6 +363,8 @@ int gpu_top_update(struct gpu_top *gt)
d_time = s->time - d->time;
for (n = 0; n < gt->num_rings; n++) {
+ double qd = 0.0;
+
gt->ring[n].u.u.busy = (100 * (s->busy[n] - d->busy[n]) + d_time/2) / d_time;
if (gt->have_wait)
gt->ring[n].u.u.wait = (100 * (s->wait[n] - d->wait[n]) + d_time/2) / d_time;
@@ -353,6 +377,14 @@ int gpu_top_update(struct gpu_top *gt)
if (gt->have_running)
gt->ring[n].running = (double)((s->running[n] - d->running[n])) / I915_SAMPLE_RUNNING_DIVISOR * 1e9 / d_time;
+ qd = gt->ring[n].runnable + gt->ring[n].running;
+ gpu_qd += qd;
+
+ for (i = 0; i < NUM_LOADS; i++)
+ gt->ring[n].load[i] =
+ update_load(gt->ring[n].load[i],
+ gt->exp[i], qd);
+
/* in case of rounding + sampling errors, fudge */
if (gt->ring[n].u.u.busy > 100)
gt->ring[n].u.u.busy = 100;
@@ -362,6 +394,11 @@ int gpu_top_update(struct gpu_top *gt)
gt->ring[n].u.u.sema = 100;
}
+ for (i = 0; i < NUM_LOADS; i++) {
+ gt->load[i] = update_load(gt->load[i], gt->exp[i],
+ gpu_qd);
+ gt->norm_load[i] = gt->load[i] / gt->num_rings;
+ }
update = 1;
} else {
while ((len = read(gt->fd, data, sizeof(data))) > 0) {
diff --git a/overlay/gpu-top.h b/overlay/gpu-top.h
index cb4310c82a94..115ce8c482c1 100644
--- a/overlay/gpu-top.h
+++ b/overlay/gpu-top.h
@@ -26,6 +26,7 @@
#define GPU_TOP_H
#define MAX_RINGS 16
+#define NUM_LOADS 3
#include <stdint.h>
@@ -39,6 +40,12 @@ struct gpu_top {
int have_queued;
int have_runnable;
int have_running;
+ int have_load_avg;
+
+ double exp[NUM_LOADS];
+ double load[NUM_LOADS];
+ double norm_load[NUM_LOADS];
+ const char *load_name[NUM_LOADS];
struct gpu_top_ring {
const char *name;
@@ -54,6 +61,8 @@ struct gpu_top {
double queued;
double runnable;
double running;
+
+ double load[NUM_LOADS];
} ring[MAX_RINGS];
struct gpu_top_stat {
@@ -69,7 +78,7 @@ struct gpu_top {
int count;
};
-void gpu_top_init(struct gpu_top *gt);
+void gpu_top_init(struct gpu_top *gt, unsigned int period_us);
int gpu_top_update(struct gpu_top *gt);
#endif /* GPU_TOP_H */
diff --git a/overlay/overlay.c b/overlay/overlay.c
index 7087d5e5b24e..1a1d01db082b 100644
--- a/overlay/overlay.c
+++ b/overlay/overlay.c
@@ -141,7 +141,8 @@ struct overlay_context {
};
static void init_gpu_top(struct overlay_context *ctx,
- struct overlay_gpu_top *gt)
+ struct overlay_gpu_top *gt,
+ unsigned int period_us)
{
const double rgba[][4] = {
{ 1, 0.25, 0.25, 1 },
@@ -153,7 +154,7 @@ static void init_gpu_top(struct overlay_context *ctx,
int n;
cpu_top_init(>->cpu_top);
- gpu_top_init(>->gpu_top);
+ gpu_top_init(>->gpu_top, period_us);
chart_init(>->cpu, "CPU", 120);
chart_set_position(>->cpu, PAD, PAD);
@@ -928,13 +929,13 @@ int main(int argc, char **argv)
debugfs_init();
- init_gpu_top(&ctx, &ctx.gpu_top);
+ sample_period = get_sample_period(&config);
+
+ init_gpu_top(&ctx, &ctx.gpu_top, sample_period);
init_gpu_perf(&ctx, &ctx.gpu_perf);
init_gpu_freq(&ctx, &ctx.gpu_freq);
init_gem_objects(&ctx, &ctx.gem_objects);
- sample_period = get_sample_period(&config);
-
i = 0;
while (1) {
ctx.time = time(NULL);
@@ -950,9 +951,24 @@ int main(int argc, char **argv)
show_gem_objects(&ctx, &ctx.gem_objects);
{
- char buf[80];
+ struct gpu_top *gt = &ctx.gpu_top.gpu_top;
cairo_text_extents_t extents;
+ char buf[256];
+
gethostname(buf, sizeof(buf));
+
+ if (gt->have_load_avg) {
+ int len = strlen(buf);
+
+ snprintf(buf + len, sizeof(buf) - len,
+ "%s; %u engines; load %s %.2f, %s %.2f, %s %.2f",
+ buf,
+ gt->num_rings,
+ gt->load_name[0], gt->load[0],
+ gt->load_name[1], gt->load[1],
+ gt->load_name[2], gt->load[2]);
+ }
+
cairo_set_source_rgb(ctx.cr, .5, .5, .5);
cairo_set_font_size(ctx.cr, PAD-2);
cairo_text_extents(ctx.cr, buf, &extents);
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC i-g-t 4/6] tests/perf_pmu: Add tests for engine queued/runnable/running stats
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Simple tests to check reported queue depths are correct.
v2:
* Improvements similar to ones from i915_query.c.
v3:
* Rebase for __igt_spin_batch_new.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 259 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 259 insertions(+)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 6ab2595b114d..6f064b605a4c 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -169,6 +169,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
#define TEST_RUNTIME_PM (8)
#define FLAG_LONG (16)
#define FLAG_HANG (32)
+#define TEST_CONTEXTS (64)
static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
{
@@ -959,6 +960,224 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
assert_within_epsilon(val[1], perf_slept[1], tolerance);
}
+static double calc_queued(uint64_t d_val, uint64_t d_ns)
+{
+ return (double)d_val * 1e9 / I915_SAMPLE_QUEUED_DIVISOR / d_ns;
+}
+
+static void
+queued(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const uint32_t bbe = MI_BATCH_BUFFER_END;
+ const unsigned int max_rq = 10;
+ double queued[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ uint32_t bo;
+ int fd;
+
+ igt_require_sw_sync();
+ if (flags & TEST_CONTEXTS)
+ gem_require_contexts(gem_fd);
+
+ memset(queued, 0, sizeof(queued));
+
+ bo = gem_create(gem_fd, 4096);
+ gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+ fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ IGT_CORK_FENCE(cork);
+ int fence = -1;
+
+ gem_quiescent_gpu(gem_fd);
+
+ if (n)
+ fence = igt_cork_plug(&cork, -1);
+
+ for (i = 0; i < n; i++) {
+ struct drm_i915_gem_exec_object2 obj = { };
+ struct drm_i915_gem_execbuffer2 eb = { };
+
+ obj.handle = bo;
+
+ eb.buffer_count = 1;
+ eb.buffers_ptr = to_user_pointer(&obj);
+
+ eb.flags = engine | I915_EXEC_FENCE_IN;
+ if (flags & TEST_CONTEXTS)
+ eb.rsvd1 = gem_context_create(gem_fd);
+ eb.rsvd2 = fence;
+
+ gem_execbuf(gem_fd, &eb);
+
+ if (flags & TEST_CONTEXTS)
+ gem_context_destroy(gem_fd, eb.rsvd1);
+ }
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ queued[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u queued=%.2f\n", n, queued[n]);
+
+ if (fence >= 0)
+ igt_cork_unplug(&cork);
+
+ for (i = 0; i < n; i++)
+ gem_sync(gem_fd, bo);
+ }
+
+ close(fd);
+
+ gem_close(gem_fd, bo);
+
+ for (i = 0; i <= max_rq; i++)
+ assert_within_epsilon(queued[i], i, tolerance);
+}
+
+static unsigned long __query_wait(igt_spin_t *spin, unsigned int n)
+{
+ struct timespec ts = { };
+ unsigned long t;
+
+ igt_nsec_elapsed(&ts);
+
+ if (spin->running) {
+ igt_spin_busywait_until_running(spin);
+ } else {
+ igt_debug("__spin_wait - usleep mode\n");
+ usleep(500e3); /* Better than nothing! */
+ }
+
+ t = igt_nsec_elapsed(&ts);
+
+ return spin->running ? t : 500e6 / n;
+}
+
+static void
+runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ bool contexts = gem_has_contexts(gem_fd);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double runnable[max_rq + 1];
+ uint32_t ctx[max_rq];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(runnable, 0, sizeof(runnable));
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ ctx[i] = gem_context_create(gem_fd);
+ }
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNABLE(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, ctx_, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd,
+ .ctx = ctx_,
+ .engine = engine);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ runnable[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u runnable=%.2f\n", n, runnable[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ gem_context_destroy(gem_fd, ctx[i]);
+ }
+
+ close(fd);
+
+ assert_within_epsilon(runnable[0], 0, tolerance);
+ igt_assert(runnable[max_rq] > 0.0);
+
+ if (contexts)
+ assert_within_epsilon(runnable[max_rq] - runnable[max_rq - 1],
+ 1, tolerance);
+}
+
+static void
+running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double running[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(running, 0, sizeof(running));
+ memset(spin, 0, sizeof(spin));
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNING(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, 0, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd,
+ .engine = engine);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ running[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u running=%.2f\n", n, running[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ close(fd);
+
+ assert_within_epsilon(running[0], 0, tolerance);
+ for (i = 1; i <= max_rq; i++)
+ igt_assert(running[i] > 0);
+}
+
/**
* Tests that i915 PMU corectly errors out in invalid initialization.
* i915 PMU is uncore PMU, thus:
@@ -1709,6 +1928,15 @@ igt_main
igt_subtest_f("init-sema-%s", e->name)
init(fd, e, I915_SAMPLE_SEMA);
+ igt_subtest_f("init-queued-%s", e->name)
+ init(fd, e, I915_SAMPLE_QUEUED);
+
+ igt_subtest_f("init-runnable-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNABLE);
+
+ igt_subtest_f("init-running-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNING);
+
igt_subtest_group {
igt_fixture {
gem_require_engine(fd, e->class, e->instance);
@@ -1814,6 +2042,27 @@ igt_main
igt_subtest_f("busy-hang-%s", e->name)
single(fd, e, TEST_BUSY | FLAG_HANG);
+
+ /**
+ * Test that queued metric works.
+ */
+ igt_subtest_f("queued-%s", e->name)
+ queued(fd, e, 0);
+
+ igt_subtest_f("queued-contexts-%s", e->name)
+ queued(fd, e, TEST_CONTEXTS);
+
+ /**
+ * Test that runnable metric works.
+ */
+ igt_subtest_f("runnable-%s", e->name)
+ runnable(fd, e);
+
+ /**
+ * Test that running metric works.
+ */
+ igt_subtest_f("running-%s", e->name)
+ running(fd, e);
}
/**
@@ -1906,6 +2155,16 @@ igt_main
e->name)
single(render_fd, e,
TEST_BUSY | TEST_TRAILING_IDLE);
+ igt_subtest_f("render-node-queued-%s", e->name)
+ queued(render_fd, e, 0);
+ igt_subtest_f("render-node-queued-contexts-%s",
+ e->name)
+ queued(render_fd, e, TEST_CONTEXTS);
+ igt_subtest_f("render-node-runnable-%s",
+ e->name)
+ runnable(render_fd, e);
+ igt_subtest_f("render-node-running-%s", e->name)
+ running(render_fd, e);
}
}
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 4/6] tests/perf_pmu: Add tests for engine queued/runnable/running stats
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Simple tests to check reported queue depths are correct.
v2:
* Improvements similar to ones from i915_query.c.
v3:
* Rebase for __igt_spin_batch_new.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tests/perf_pmu.c | 259 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 259 insertions(+)
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 6ab2595b114d..6f064b605a4c 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -169,6 +169,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
#define TEST_RUNTIME_PM (8)
#define FLAG_LONG (16)
#define FLAG_HANG (32)
+#define TEST_CONTEXTS (64)
static igt_spin_t * __spin_poll(int fd, uint32_t ctx, unsigned long flags)
{
@@ -959,6 +960,224 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
assert_within_epsilon(val[1], perf_slept[1], tolerance);
}
+static double calc_queued(uint64_t d_val, uint64_t d_ns)
+{
+ return (double)d_val * 1e9 / I915_SAMPLE_QUEUED_DIVISOR / d_ns;
+}
+
+static void
+queued(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const uint32_t bbe = MI_BATCH_BUFFER_END;
+ const unsigned int max_rq = 10;
+ double queued[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ uint32_t bo;
+ int fd;
+
+ igt_require_sw_sync();
+ if (flags & TEST_CONTEXTS)
+ gem_require_contexts(gem_fd);
+
+ memset(queued, 0, sizeof(queued));
+
+ bo = gem_create(gem_fd, 4096);
+ gem_write(gem_fd, bo, 4092, &bbe, sizeof(bbe));
+
+ fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ IGT_CORK_FENCE(cork);
+ int fence = -1;
+
+ gem_quiescent_gpu(gem_fd);
+
+ if (n)
+ fence = igt_cork_plug(&cork, -1);
+
+ for (i = 0; i < n; i++) {
+ struct drm_i915_gem_exec_object2 obj = { };
+ struct drm_i915_gem_execbuffer2 eb = { };
+
+ obj.handle = bo;
+
+ eb.buffer_count = 1;
+ eb.buffers_ptr = to_user_pointer(&obj);
+
+ eb.flags = engine | I915_EXEC_FENCE_IN;
+ if (flags & TEST_CONTEXTS)
+ eb.rsvd1 = gem_context_create(gem_fd);
+ eb.rsvd2 = fence;
+
+ gem_execbuf(gem_fd, &eb);
+
+ if (flags & TEST_CONTEXTS)
+ gem_context_destroy(gem_fd, eb.rsvd1);
+ }
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ queued[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u queued=%.2f\n", n, queued[n]);
+
+ if (fence >= 0)
+ igt_cork_unplug(&cork);
+
+ for (i = 0; i < n; i++)
+ gem_sync(gem_fd, bo);
+ }
+
+ close(fd);
+
+ gem_close(gem_fd, bo);
+
+ for (i = 0; i <= max_rq; i++)
+ assert_within_epsilon(queued[i], i, tolerance);
+}
+
+static unsigned long __query_wait(igt_spin_t *spin, unsigned int n)
+{
+ struct timespec ts = { };
+ unsigned long t;
+
+ igt_nsec_elapsed(&ts);
+
+ if (spin->running) {
+ igt_spin_busywait_until_running(spin);
+ } else {
+ igt_debug("__spin_wait - usleep mode\n");
+ usleep(500e3); /* Better than nothing! */
+ }
+
+ t = igt_nsec_elapsed(&ts);
+
+ return spin->running ? t : 500e6 / n;
+}
+
+static void
+runnable(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ bool contexts = gem_has_contexts(gem_fd);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double runnable[max_rq + 1];
+ uint32_t ctx[max_rq];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(runnable, 0, sizeof(runnable));
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ ctx[i] = gem_context_create(gem_fd);
+ }
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNABLE(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ uint32_t ctx_ = contexts ? ctx[i] : 0;
+
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, ctx_, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd,
+ .ctx = ctx_,
+ .engine = engine);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ runnable[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u runnable=%.2f\n", n, runnable[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ if (contexts) {
+ for (i = 0; i < max_rq; i++)
+ gem_context_destroy(gem_fd, ctx[i]);
+ }
+
+ close(fd);
+
+ assert_within_epsilon(runnable[0], 0, tolerance);
+ igt_assert(runnable[max_rq] > 0.0);
+
+ if (contexts)
+ assert_within_epsilon(runnable[max_rq] - runnable[max_rq - 1],
+ 1, tolerance);
+}
+
+static void
+running(int gem_fd, const struct intel_execution_engine2 *e)
+{
+ const unsigned long engine = e2ring(gem_fd, e);
+ const unsigned int max_rq = 10;
+ igt_spin_t *spin[max_rq + 1];
+ double running[max_rq + 1];
+ unsigned int n, i;
+ uint64_t val[2];
+ uint64_t ts[2];
+ int fd;
+
+ memset(running, 0, sizeof(running));
+ memset(spin, 0, sizeof(spin));
+
+ fd = open_pmu(I915_PMU_ENGINE_RUNNING(e->class, e->instance));
+
+ for (n = 0; n <= max_rq; n++) {
+ gem_quiescent_gpu(gem_fd);
+
+ for (i = 0; i < n; i++) {
+ if (i == 0)
+ spin[i] = __spin_poll(gem_fd, 0, engine);
+ else
+ spin[i] = __igt_spin_batch_new(gem_fd,
+ .engine = engine);
+ }
+
+ if (n)
+ usleep(__query_wait(spin[0], n) * n);
+
+ val[0] = __pmu_read_single(fd, &ts[0]);
+ usleep(batch_duration_ns / 1000);
+ val[1] = __pmu_read_single(fd, &ts[1]);
+
+ running[n] = calc_queued(val[1] - val[0], ts[1] - ts[0]);
+ igt_info("n=%u running=%.2f\n", n, running[n]);
+
+ for (i = 0; i < n; i++) {
+ end_spin(gem_fd, spin[i], FLAG_SYNC);
+ igt_spin_batch_free(gem_fd, spin[i]);
+ }
+ }
+
+ close(fd);
+
+ assert_within_epsilon(running[0], 0, tolerance);
+ for (i = 1; i <= max_rq; i++)
+ igt_assert(running[i] > 0);
+}
+
/**
* Tests that i915 PMU corectly errors out in invalid initialization.
* i915 PMU is uncore PMU, thus:
@@ -1709,6 +1928,15 @@ igt_main
igt_subtest_f("init-sema-%s", e->name)
init(fd, e, I915_SAMPLE_SEMA);
+ igt_subtest_f("init-queued-%s", e->name)
+ init(fd, e, I915_SAMPLE_QUEUED);
+
+ igt_subtest_f("init-runnable-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNABLE);
+
+ igt_subtest_f("init-running-%s", e->name)
+ init(fd, e, I915_SAMPLE_RUNNING);
+
igt_subtest_group {
igt_fixture {
gem_require_engine(fd, e->class, e->instance);
@@ -1814,6 +2042,27 @@ igt_main
igt_subtest_f("busy-hang-%s", e->name)
single(fd, e, TEST_BUSY | FLAG_HANG);
+
+ /**
+ * Test that queued metric works.
+ */
+ igt_subtest_f("queued-%s", e->name)
+ queued(fd, e, 0);
+
+ igt_subtest_f("queued-contexts-%s", e->name)
+ queued(fd, e, TEST_CONTEXTS);
+
+ /**
+ * Test that runnable metric works.
+ */
+ igt_subtest_f("runnable-%s", e->name)
+ runnable(fd, e);
+
+ /**
+ * Test that running metric works.
+ */
+ igt_subtest_f("running-%s", e->name)
+ running(fd, e);
}
/**
@@ -1906,6 +2155,16 @@ igt_main
e->name)
single(render_fd, e,
TEST_BUSY | TEST_TRAILING_IDLE);
+ igt_subtest_f("render-node-queued-%s", e->name)
+ queued(render_fd, e, 0);
+ igt_subtest_f("render-node-queued-contexts-%s",
+ e->name)
+ queued(render_fd, e, TEST_CONTEXTS);
+ igt_subtest_f("render-node-runnable-%s",
+ e->name)
+ runnable(render_fd, e);
+ igt_subtest_f("render-node-running-%s", e->name)
+ running(render_fd, e);
}
}
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC i-g-t 5/6] intel-gpu-top: Add queue depths and load average
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the driver now exporting various request queue depths for each
engine, we can display this information here. Both as raw counters,
and by adding a load average like metrics composed from number of
runnable and running requests in a given time period. 1s, 30s and 5m
periods are used for load average.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tools/Makefile.am | 2 +-
tools/intel_gpu_top.c | 122 +++++++++++++++++++++++++++++++++++++-----
tools/meson.build | 2 +-
3 files changed, 110 insertions(+), 16 deletions(-)
diff --git a/tools/Makefile.am b/tools/Makefile.am
index e7de4d90241c..e03842afce8d 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -29,7 +29,7 @@ intel_aubdump_la_LDFLAGS = -module -avoid-version -no-undefined
intel_aubdump_la_SOURCES = aubdump.c
intel_aubdump_la_LIBADD = $(top_builddir)/lib/libintel_tools.la -ldl
-intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la
+intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la -lm
bin_SCRIPTS = intel_aubdump
CLEANFILES = $(bin_SCRIPTS)
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index b923c3cfbe97..8990ef17b771 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -56,6 +56,10 @@ struct pmu_counter {
struct pmu_pair val;
};
+#define NUM_QUEUES (3)
+
+#define NUM_LOADS (3)
+
struct engine {
const char *name;
const char *display_name;
@@ -65,9 +69,13 @@ struct engine {
unsigned int num_counters;
+ double qd[3];
+ double load_avg[NUM_LOADS];
+
struct pmu_counter busy;
struct pmu_counter wait;
struct pmu_counter sema;
+ struct pmu_counter queue[NUM_QUEUES];
};
struct engines {
@@ -95,6 +103,11 @@ struct engines {
struct pmu_counter imc_reads;
struct pmu_counter imc_writes;
+ double qd_scale;
+
+ double load_exp[NUM_LOADS];
+ double load_avg[NUM_LOADS];
+
struct engine engine;
};
@@ -320,6 +333,13 @@ static double filename_to_double(const char *filename)
return v;
}
+#define I915_EVENT "/sys/devices/i915/events/"
+
+static double i915_qd_scale(void)
+{
+ return filename_to_double(I915_EVENT "rcs0-queued.scale");
+}
+
#define RAPL_ROOT "/sys/devices/power/"
#define RAPL_EVENT "/sys/devices/power/events/"
@@ -453,6 +473,8 @@ static int pmu_init(struct engines *engines)
engines->rc6.config = I915_PMU_RC6_RESIDENCY;
_open_pmu(engines->num_counters, &engines->rc6, engines->fd);
+ engines->qd_scale = i915_qd_scale();
+
for (i = 0; i < engines->num_engines; i++) {
struct engine *engine = engine_ptr(engines, i);
struct {
@@ -462,6 +484,9 @@ static int pmu_init(struct engines *engines)
{ .pmu = &engine->busy, .counter = "busy" },
{ .pmu = &engine->wait, .counter = "wait" },
{ .pmu = &engine->sema, .counter = "sema" },
+ { .pmu = &engine->queue[0], .counter = "queued" },
+ { .pmu = &engine->queue[1], .counter = "runnable" },
+ { .pmu = &engine->queue[2], .counter = "running" },
{ .pmu = NULL, .counter = NULL },
};
@@ -576,12 +601,11 @@ static void fill_str(char *buf, unsigned int bufsz, char c, unsigned int num)
*buf = 0;
}
-static void pmu_calc(struct pmu_counter *cnt,
- char *buf, unsigned int bufsz,
- unsigned int width, unsigned width_dec,
- double d, double t, double s)
+static void _pmu_calc(struct pmu_counter *cnt,
+ char *buf, unsigned int bufsz,
+ unsigned int width, unsigned width_dec,
+ double val)
{
- double val;
int len;
assert(bufsz >= (width + width_dec + 1));
@@ -591,8 +615,6 @@ static void pmu_calc(struct pmu_counter *cnt,
return;
}
- val = __pmu_calc(&cnt->val, d, t, s);
-
len = snprintf(buf, bufsz, "%*.*f", width + width_dec, width_dec, val);
if (len < 0 || len == bufsz) {
fill_str(buf, bufsz, 'X', width + width_dec);
@@ -600,6 +622,16 @@ static void pmu_calc(struct pmu_counter *cnt,
}
}
+static void pmu_calc(struct pmu_counter *cnt,
+ char *buf, unsigned int bufsz,
+ unsigned int width, unsigned width_dec,
+ double d, double t, double s)
+{
+ double val = __pmu_calc(&cnt->val, d, t, s);
+
+ _pmu_calc(cnt, buf, bufsz, width, width_dec, val);
+}
+
static uint64_t __pmu_read_single(int fd, uint64_t *ts)
{
uint64_t data[2] = { };
@@ -658,10 +690,14 @@ static void pmu_sample(struct engines *engines)
for (i = 0; i < engines->num_engines; i++) {
struct engine *engine = engine_ptr(engines, i);
+ unsigned int j;
update_sample(&engine->busy, val);
update_sample(&engine->sema, val);
update_sample(&engine->wait, val);
+
+ for (j = 0; j < NUM_QUEUES; j++)
+ update_sample(&engine->queue[j], val);
}
}
@@ -702,12 +738,19 @@ usage(const char *appname)
appname, DEFAULT_PERIOD_MS);
}
+static double update_load(double load, double exp, double val)
+{
+ return val + exp * (load - val);
+}
+
int main(int argc, char **argv)
{
unsigned int period_us = DEFAULT_PERIOD_MS * 1000;
+ const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
int con_w = -1, con_h = -1;
struct engines *engines;
unsigned int i;
+ double period;
int ret, ch;
/* Parse options */
@@ -741,10 +784,15 @@ int main(int argc, char **argv)
return 1;
}
+ /* Load average setup. */
+ period = (double)period_us / 1e6;
+ for (i = 0; i < NUM_LOADS; i++)
+ engines->load_exp[i] = exp(-period / load_period[i]);
+
pmu_sample(engines);
for (;;) {
- double t;
+ double t, qd = 0;
#define BUFSZ 16
char freq[BUFSZ];
char fact[BUFSZ];
@@ -754,6 +802,7 @@ int main(int argc, char **argv)
char reads[BUFSZ];
char writes[BUFSZ];
struct winsize ws;
+ unsigned int j;
int lines = 0;
/* Update terminal size. */
@@ -778,8 +827,44 @@ int main(int argc, char **argv)
pmu_calc(&engines->imc_writes, writes, BUFSZ, 6, 0, 1.0, t,
engines->imc_writes_scale);
+ for (i = 0; i < engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(engines, i);
+
+ if (!engine->num_counters)
+ continue;
+
+ for (j = 0; j < NUM_QUEUES; j++) {
+ if (!engine->queue[j].present)
+ continue;
+
+ engine->qd[j] =
+ __pmu_calc(&engine->queue[j].val, 1, t,
+ engines->qd_scale);
+ }
+
+ qd += engine->qd[1] + engine->qd[2];
+
+ for (j = 0; j < NUM_LOADS; j++) {
+ engine->load_avg[j] =
+ update_load(engine->load_avg[j],
+ engines->load_exp[j],
+ engine->qd[1] +
+ engine->qd[2]);
+ }
+ }
+
+ for (j = 0; j < NUM_LOADS; j++) {
+ engines->load_avg[j] =
+ update_load(engines->load_avg[j],
+ engines->load_exp[j],
+ qd);
+ }
+
if (lines++ < con_h)
- printf("intel-gpu-top - %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
+ printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
+ engines->load_avg[0],
+ engines->load_avg[1],
+ engines->load_avg[2],
fact, freq, rc6, power, engines->rapl_unit, irq);
if (lines++ < con_h)
@@ -803,7 +888,7 @@ int main(int argc, char **argv)
if (engine->num_counters && lines < con_h) {
const char *a = " ENGINE BUSY ";
- const char *b = " MI_SEMA MI_WAIT";
+ const char *b = "Q r R MI_SEMA MI_WAIT";
printf("\033[7m%s%*s%s\033[0m\n",
a,
@@ -817,6 +902,7 @@ int main(int argc, char **argv)
for (i = 0; i < engines->num_engines && lines < con_h; i++) {
struct engine *engine = engine_ptr(engines, i);
unsigned int max_w = con_w - 1;
+ char qdbuf[NUM_LOADS][BUFSZ];
unsigned int len;
char sema[BUFSZ];
char wait[BUFSZ];
@@ -827,14 +913,22 @@ int main(int argc, char **argv)
if (!engine->num_counters)
continue;
+ for (j = 0; j < NUM_QUEUES; j++)
+ _pmu_calc(&engine->queue[j], qdbuf[j], BUFSZ,
+ 3, 0, engine->qd[j]);
+
pmu_calc(&engine->sema, sema, BUFSZ, 3, 0, 1e9, t, 100);
pmu_calc(&engine->wait, wait, BUFSZ, 3, 0, 1e9, t, 100);
- len = snprintf(buf, sizeof(buf), " %s%% %s%%",
+
+ len = snprintf(buf, sizeof(buf),
+ " %s %s %s %s%% %s%%",
+ qdbuf[0], qdbuf[1], qdbuf[2],
sema, wait);
- pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t,
- 100);
- len += printf("%16s %s%% ", engine->display_name, busy);
+ pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t, 100);
+
+ len += printf("%16s %s%% ",
+ engine->display_name, busy);
val = __pmu_calc(&engine->busy.val, 1e9, t, 100);
print_percentage_bar(val, max_w - len);
diff --git a/tools/meson.build b/tools/meson.build
index e4517d667299..c1dd71fada79 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -97,7 +97,7 @@ shared_library('intel_aubdump', 'aubdump.c',
executable('intel_gpu_top', 'intel_gpu_top.c',
install : true,
install_rpath : bindir_rpathdir,
- dependencies : tool_deps + [ lib_igt_perf ])
+ dependencies : tool_deps + [ lib_igt_perf, math ])
conf_data = configuration_data()
conf_data.set('prefix', prefix)
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 5/6] intel-gpu-top: Add queue depths and load average
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the driver now exporting various request queue depths for each
engine, we can display this information here. Both as raw counters,
and by adding a load average like metrics composed from number of
runnable and running requests in a given time period. 1s, 30s and 5m
periods are used for load average.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tools/Makefile.am | 2 +-
tools/intel_gpu_top.c | 122 +++++++++++++++++++++++++++++++++++++-----
tools/meson.build | 2 +-
3 files changed, 110 insertions(+), 16 deletions(-)
diff --git a/tools/Makefile.am b/tools/Makefile.am
index e7de4d90241c..e03842afce8d 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -29,7 +29,7 @@ intel_aubdump_la_LDFLAGS = -module -avoid-version -no-undefined
intel_aubdump_la_SOURCES = aubdump.c
intel_aubdump_la_LIBADD = $(top_builddir)/lib/libintel_tools.la -ldl
-intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la
+intel_gpu_top_LDADD = $(top_builddir)/lib/libigt_perf.la -lm
bin_SCRIPTS = intel_aubdump
CLEANFILES = $(bin_SCRIPTS)
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index b923c3cfbe97..8990ef17b771 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -56,6 +56,10 @@ struct pmu_counter {
struct pmu_pair val;
};
+#define NUM_QUEUES (3)
+
+#define NUM_LOADS (3)
+
struct engine {
const char *name;
const char *display_name;
@@ -65,9 +69,13 @@ struct engine {
unsigned int num_counters;
+ double qd[3];
+ double load_avg[NUM_LOADS];
+
struct pmu_counter busy;
struct pmu_counter wait;
struct pmu_counter sema;
+ struct pmu_counter queue[NUM_QUEUES];
};
struct engines {
@@ -95,6 +103,11 @@ struct engines {
struct pmu_counter imc_reads;
struct pmu_counter imc_writes;
+ double qd_scale;
+
+ double load_exp[NUM_LOADS];
+ double load_avg[NUM_LOADS];
+
struct engine engine;
};
@@ -320,6 +333,13 @@ static double filename_to_double(const char *filename)
return v;
}
+#define I915_EVENT "/sys/devices/i915/events/"
+
+static double i915_qd_scale(void)
+{
+ return filename_to_double(I915_EVENT "rcs0-queued.scale");
+}
+
#define RAPL_ROOT "/sys/devices/power/"
#define RAPL_EVENT "/sys/devices/power/events/"
@@ -453,6 +473,8 @@ static int pmu_init(struct engines *engines)
engines->rc6.config = I915_PMU_RC6_RESIDENCY;
_open_pmu(engines->num_counters, &engines->rc6, engines->fd);
+ engines->qd_scale = i915_qd_scale();
+
for (i = 0; i < engines->num_engines; i++) {
struct engine *engine = engine_ptr(engines, i);
struct {
@@ -462,6 +484,9 @@ static int pmu_init(struct engines *engines)
{ .pmu = &engine->busy, .counter = "busy" },
{ .pmu = &engine->wait, .counter = "wait" },
{ .pmu = &engine->sema, .counter = "sema" },
+ { .pmu = &engine->queue[0], .counter = "queued" },
+ { .pmu = &engine->queue[1], .counter = "runnable" },
+ { .pmu = &engine->queue[2], .counter = "running" },
{ .pmu = NULL, .counter = NULL },
};
@@ -576,12 +601,11 @@ static void fill_str(char *buf, unsigned int bufsz, char c, unsigned int num)
*buf = 0;
}
-static void pmu_calc(struct pmu_counter *cnt,
- char *buf, unsigned int bufsz,
- unsigned int width, unsigned width_dec,
- double d, double t, double s)
+static void _pmu_calc(struct pmu_counter *cnt,
+ char *buf, unsigned int bufsz,
+ unsigned int width, unsigned width_dec,
+ double val)
{
- double val;
int len;
assert(bufsz >= (width + width_dec + 1));
@@ -591,8 +615,6 @@ static void pmu_calc(struct pmu_counter *cnt,
return;
}
- val = __pmu_calc(&cnt->val, d, t, s);
-
len = snprintf(buf, bufsz, "%*.*f", width + width_dec, width_dec, val);
if (len < 0 || len == bufsz) {
fill_str(buf, bufsz, 'X', width + width_dec);
@@ -600,6 +622,16 @@ static void pmu_calc(struct pmu_counter *cnt,
}
}
+static void pmu_calc(struct pmu_counter *cnt,
+ char *buf, unsigned int bufsz,
+ unsigned int width, unsigned width_dec,
+ double d, double t, double s)
+{
+ double val = __pmu_calc(&cnt->val, d, t, s);
+
+ _pmu_calc(cnt, buf, bufsz, width, width_dec, val);
+}
+
static uint64_t __pmu_read_single(int fd, uint64_t *ts)
{
uint64_t data[2] = { };
@@ -658,10 +690,14 @@ static void pmu_sample(struct engines *engines)
for (i = 0; i < engines->num_engines; i++) {
struct engine *engine = engine_ptr(engines, i);
+ unsigned int j;
update_sample(&engine->busy, val);
update_sample(&engine->sema, val);
update_sample(&engine->wait, val);
+
+ for (j = 0; j < NUM_QUEUES; j++)
+ update_sample(&engine->queue[j], val);
}
}
@@ -702,12 +738,19 @@ usage(const char *appname)
appname, DEFAULT_PERIOD_MS);
}
+static double update_load(double load, double exp, double val)
+{
+ return val + exp * (load - val);
+}
+
int main(int argc, char **argv)
{
unsigned int period_us = DEFAULT_PERIOD_MS * 1000;
+ const double load_period[NUM_LOADS] = { 1.0, 30.0, 900.0 };
int con_w = -1, con_h = -1;
struct engines *engines;
unsigned int i;
+ double period;
int ret, ch;
/* Parse options */
@@ -741,10 +784,15 @@ int main(int argc, char **argv)
return 1;
}
+ /* Load average setup. */
+ period = (double)period_us / 1e6;
+ for (i = 0; i < NUM_LOADS; i++)
+ engines->load_exp[i] = exp(-period / load_period[i]);
+
pmu_sample(engines);
for (;;) {
- double t;
+ double t, qd = 0;
#define BUFSZ 16
char freq[BUFSZ];
char fact[BUFSZ];
@@ -754,6 +802,7 @@ int main(int argc, char **argv)
char reads[BUFSZ];
char writes[BUFSZ];
struct winsize ws;
+ unsigned int j;
int lines = 0;
/* Update terminal size. */
@@ -778,8 +827,44 @@ int main(int argc, char **argv)
pmu_calc(&engines->imc_writes, writes, BUFSZ, 6, 0, 1.0, t,
engines->imc_writes_scale);
+ for (i = 0; i < engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(engines, i);
+
+ if (!engine->num_counters)
+ continue;
+
+ for (j = 0; j < NUM_QUEUES; j++) {
+ if (!engine->queue[j].present)
+ continue;
+
+ engine->qd[j] =
+ __pmu_calc(&engine->queue[j].val, 1, t,
+ engines->qd_scale);
+ }
+
+ qd += engine->qd[1] + engine->qd[2];
+
+ for (j = 0; j < NUM_LOADS; j++) {
+ engine->load_avg[j] =
+ update_load(engine->load_avg[j],
+ engines->load_exp[j],
+ engine->qd[1] +
+ engine->qd[2]);
+ }
+ }
+
+ for (j = 0; j < NUM_LOADS; j++) {
+ engines->load_avg[j] =
+ update_load(engines->load_avg[j],
+ engines->load_exp[j],
+ qd);
+ }
+
if (lines++ < con_h)
- printf("intel-gpu-top - %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
+ printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
+ engines->load_avg[0],
+ engines->load_avg[1],
+ engines->load_avg[2],
fact, freq, rc6, power, engines->rapl_unit, irq);
if (lines++ < con_h)
@@ -803,7 +888,7 @@ int main(int argc, char **argv)
if (engine->num_counters && lines < con_h) {
const char *a = " ENGINE BUSY ";
- const char *b = " MI_SEMA MI_WAIT";
+ const char *b = "Q r R MI_SEMA MI_WAIT";
printf("\033[7m%s%*s%s\033[0m\n",
a,
@@ -817,6 +902,7 @@ int main(int argc, char **argv)
for (i = 0; i < engines->num_engines && lines < con_h; i++) {
struct engine *engine = engine_ptr(engines, i);
unsigned int max_w = con_w - 1;
+ char qdbuf[NUM_LOADS][BUFSZ];
unsigned int len;
char sema[BUFSZ];
char wait[BUFSZ];
@@ -827,14 +913,22 @@ int main(int argc, char **argv)
if (!engine->num_counters)
continue;
+ for (j = 0; j < NUM_QUEUES; j++)
+ _pmu_calc(&engine->queue[j], qdbuf[j], BUFSZ,
+ 3, 0, engine->qd[j]);
+
pmu_calc(&engine->sema, sema, BUFSZ, 3, 0, 1e9, t, 100);
pmu_calc(&engine->wait, wait, BUFSZ, 3, 0, 1e9, t, 100);
- len = snprintf(buf, sizeof(buf), " %s%% %s%%",
+
+ len = snprintf(buf, sizeof(buf),
+ " %s %s %s %s%% %s%%",
+ qdbuf[0], qdbuf[1], qdbuf[2],
sema, wait);
- pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t,
- 100);
- len += printf("%16s %s%% ", engine->display_name, busy);
+ pmu_calc(&engine->busy, busy, BUFSZ, 6, 2, 1e9, t, 100);
+
+ len += printf("%16s %s%% ",
+ engine->display_name, busy);
val = __pmu_calc(&engine->busy.val, 1e9, t, 100);
print_percentage_bar(val, max_w - len);
diff --git a/tools/meson.build b/tools/meson.build
index e4517d667299..c1dd71fada79 100644
--- a/tools/meson.build
+++ b/tools/meson.build
@@ -97,7 +97,7 @@ shared_library('intel_aubdump', 'aubdump.c',
executable('intel_gpu_top', 'intel_gpu_top.c',
install : true,
install_rpath : bindir_rpathdir,
- dependencies : tool_deps + [ lib_igt_perf ])
+ dependencies : tool_deps + [ lib_igt_perf, math ])
conf_data = configuration_data()
conf_data.set('prefix', prefix)
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC i-g-t 6/6] intel-gpu-top: Support for client stats
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
@ 2018-10-03 12:07 ` Tvrtko Ursulin
-1 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tools/intel_gpu_top.c | 360 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 351 insertions(+), 9 deletions(-)
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index 8990ef17b771..02c5a7d77614 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -20,9 +20,6 @@
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*
- * Authors:
- * Eric Anholt <eric@anholt.net>
- * Eugeni Dodonov <eugeni.dodonov@intel.com>
*/
#include <stdio.h>
@@ -41,6 +38,8 @@
#include <errno.h>
#include <math.h>
#include <locale.h>
+#include <limits.h>
+#include <signal.h>
#include "igt_perf.h"
@@ -701,8 +700,300 @@ static void pmu_sample(struct engines *engines)
}
}
+enum client_status {
+ FREE = 0, /* mbz */
+ ALIVE,
+ PROBE
+};
+
+struct client {
+ enum client_status status;
+ unsigned int id;
+ unsigned int pid;
+ char name[128];
+ unsigned int samples;
+ unsigned long total;
+ struct engines *engines;
+ unsigned long *val;
+ uint64_t *last;
+};
+
+#define SYSFS_ENABLE "/sys/class/drm/card0/clients/enable_stats"
+
+bool __stats_enabled;
+
+static int __set_stats(bool val)
+{
+ int fd, ret;
+
+ fd = open(SYSFS_ENABLE, O_WRONLY);
+ if (fd < 0)
+ return -errno;
+
+ ret = write(fd, val ? "1" : "0", 2);
+ if (ret < 0)
+ return -errno;
+ else if (ret < 2)
+ return 1;
+
+ close(fd);
+
+ return 0;
+}
+
+static void __restore_stats(void)
+{
+ int ret;
+
+ if (__stats_enabled)
+ return;
+
+ ret = __set_stats(false);
+ if (ret)
+ fprintf(stderr, "Failed to disable per-client stats! (%d)\n",
+ ret);
+}
+
+static void __restore_stats_signal(int sig)
+{
+ exit(0);
+}
+
+static void enable_stats(void)
+{
+ int fd, ret;
+
+ fd = open(SYSFS_ENABLE, O_RDONLY);
+ if (fd < 0)
+ return;
+
+ close(fd);
+
+ __stats_enabled = filename_to_u64(SYSFS_ENABLE, 10);
+ if (__stats_enabled)
+ return;
+
+ ret = __set_stats(true);
+ if (!ret) {
+ if (atexit(__restore_stats))
+ fprintf(stderr, "Failed to register exit handler!");
+
+ if (signal(SIGINT, __restore_stats_signal))
+ fprintf(stderr, "Failed to register signal handler!");
+ } else {
+ fprintf(stderr, "Failed to enable per-client stats! (%d)\n",
+ ret);
+ }
+}
+
+#define SYSFS_CLIENTS "/sys/class/drm/card0/clients/"
+
+static struct client *clients;
+static unsigned int num_clients;
+
+#define for_each_client(c, tmp) \
+ for (tmp = num_clients, c = clients; tmp > 0; tmp--, c++)
+
+static uint64_t read_client_busy(unsigned int id, const char *engine)
+{
+ char buf[256];
+ ssize_t ret;
+
+ ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/busy/%s",
+ id, engine);
+ assert(ret > 0 && ret < sizeof(buf));
+ if (ret <= 0 || ret == sizeof(buf))
+ return 0;
+
+ return filename_to_u64(buf, 10);
+}
+
+static struct client *find_client(enum client_status status, unsigned int id)
+{
+ struct client *c;
+ unsigned int tmp;
+
+ for_each_client(c, tmp) {
+ if ((status == FREE && c->status == FREE) ||
+ (status == c->status && c->id == id))
+ return c;
+ }
+
+ return NULL;
+}
+
+static void update_client(struct client *c, unsigned int pid, char *name)
+{
+ uint64_t val[c->engines->num_engines];
+ unsigned int i;
+
+ if (c->pid != pid)
+ c->pid = pid;
+
+ if (strncmp(c->name, name, sizeof(c->name)))
+ strncpy(c->name, name, sizeof(c->name));
+
+ for (i = 0; i < c->engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(c->engines, i);
+
+ val[i] = read_client_busy(c->id, engine->name);
+ }
+
+ c->total = 0;
+
+ for (i = 0; i < c->engines->num_engines; i++) {
+ assert(val[i] >= c->last[i]);
+ c->val[i] = val[i] - c->last[i];
+ c->total += c->val[i];
+ c->last[i] = val[i];
+ }
+
+ c->samples++;
+ c->status = ALIVE;
+}
+
+static void
+add_client(unsigned int id, unsigned int pid, char *name,
+ struct engines *engines)
+{
+ struct client *c;
+
+ assert(!find_client(ALIVE, id));
+
+ c = find_client(FREE, 0);
+ if (!c) {
+ unsigned int idx = num_clients;
+
+ num_clients += (num_clients + 2) / 2;
+ clients = realloc(clients, num_clients * sizeof(*c));
+ assert(clients);
+ c = &clients[idx];
+ memset(c, 0, (num_clients - idx) * sizeof(*c));
+ }
+
+ c->id = id;
+ c->engines = engines;
+ c->val = calloc(engines->num_engines, sizeof(c->val));
+ c->last = calloc(engines->num_engines, sizeof(c->last));
+ assert(c->val && c->last);
+
+ update_client(c, pid, name);
+}
+
+static void free_client(struct client *c)
+{
+ free(c->val);
+ free(c->last);
+ memset(c, 0, sizeof(*c));
+}
+
+static char *read_client_sysfs(unsigned int id, const char *field)
+{
+ char buf[256];
+ ssize_t ret;
+
+ ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/%s", id, field);
+ assert(ret > 0 && ret < sizeof(buf));
+ if (ret <= 0 || ret == sizeof(buf))
+ return NULL;
+
+ ret = filename_to_buf(buf, buf, sizeof(buf));
+ assert(ret == 0);
+ if (ret)
+ return NULL;
+
+ return strdup(buf);
+}
+
+static void scan(struct engines *engines)
+{
+ struct dirent *dent;
+ struct client *c;
+ char *pid, *name;
+ unsigned int tmp;
+ unsigned int id;
+ DIR *d;
+
+ for_each_client(c, tmp) {
+ if (c->status == ALIVE)
+ c->status = PROBE;
+ }
+
+ d = opendir(SYSFS_CLIENTS);
+ if (!d)
+ return;
+
+ while ((dent = readdir(d)) != NULL) {
+ if (dent->d_type != DT_DIR)
+ continue;
+ if (!isdigit(dent->d_name[0]))
+ continue;
+
+ id = atoi(dent->d_name);
+
+ name = read_client_sysfs(id, "name");
+ assert(name);
+ if (!name)
+ continue;
+
+ pid = read_client_sysfs(id, "pid");
+ assert(pid);
+ if (!pid) {
+ free(name);
+ continue;
+ }
+
+ c = find_client(PROBE, id);
+ if (c) {
+ update_client(c, atoi(pid), name);
+ continue;
+ }
+
+ add_client(id, atoi(pid), name, engines);
+
+ free(name);
+ free(pid);
+ }
+
+ closedir(d);
+
+ for_each_client(c, tmp) {
+ if (c->status == PROBE)
+ free_client(c);
+ }
+}
+
+static int cmp(const void *_a, const void *_b)
+{
+ const struct client *a = _a;
+ const struct client *b = _b;
+ long tot_a = a->total;
+ long tot_b = b->total;
+
+ tot_a *= a->status == ALIVE && a->samples > 1;
+ tot_b *= b->status == ALIVE && b->samples > 1;
+
+ tot_b -= tot_a;
+
+ if (!tot_b)
+ return (int)b->id - a->id;
+
+ while (tot_b > INT_MAX || tot_b < INT_MIN)
+ tot_b /= 2;
+
+ return tot_b;
+}
+
static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" };
+static void n_spaces(const unsigned int n)
+{
+ unsigned int i;
+
+ for (i = 0; i < n; i++)
+ putchar(' ');
+}
+
static void
print_percentage_bar(double percent, int max_len)
{
@@ -716,8 +1007,10 @@ print_percentage_bar(double percent, int max_len)
if (i)
printf("%s", bars[i]);
- for (i = 0; i < (max_len - 2 - (bar_len + 7) / 8); i++)
- putchar(' ');
+ bar_len = max_len - 2 - (bar_len + 7) / 8;
+ if (bar_len > max_len)
+ bar_len = max_len;
+ n_spaces(bar_len);
putchar('|');
}
@@ -784,12 +1077,15 @@ int main(int argc, char **argv)
return 1;
}
+ enable_stats();
+
/* Load average setup. */
period = (double)period_us / 1e6;
for (i = 0; i < NUM_LOADS; i++)
engines->load_exp[i] = exp(-period / load_period[i]);
pmu_sample(engines);
+ scan(engines);
for (;;) {
double t, qd = 0;
@@ -802,7 +1098,8 @@ int main(int argc, char **argv)
char reads[BUFSZ];
char writes[BUFSZ];
struct winsize ws;
- unsigned int j;
+ unsigned int len, engine_w, j;
+ struct client *c;
int lines = 0;
/* Update terminal size. */
@@ -812,9 +1109,10 @@ int main(int argc, char **argv)
}
pmu_sample(engines);
- t = (double)(engines->ts.cur - engines->ts.prev) / 1e9;
+ scan(engines);
+ qsort(clients, num_clients, sizeof(*c), cmp);
- printf("\033[H\033[J");
+ t = (double)(engines->ts.cur - engines->ts.prev) / 1e9;
pmu_calc(&engines->freq_req, freq, BUFSZ, 4, 0, 1.0, t, 1);
pmu_calc(&engines->freq_act, fact, BUFSZ, 4, 0, 1.0, t, 1);
@@ -860,6 +1158,8 @@ int main(int argc, char **argv)
qd);
}
+ printf("\033[H\033[J");
+
if (lines++ < con_h)
printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
engines->load_avg[0],
@@ -903,7 +1203,6 @@ int main(int argc, char **argv)
struct engine *engine = engine_ptr(engines, i);
unsigned int max_w = con_w - 1;
char qdbuf[NUM_LOADS][BUFSZ];
- unsigned int len;
char sema[BUFSZ];
char wait[BUFSZ];
char busy[BUFSZ];
@@ -941,6 +1240,49 @@ int main(int argc, char **argv)
if (lines++ < con_h)
printf("\n");
+ if (lines++ < con_h) {
+ printf("\033[7m");
+ len = printf("%5s%16s", "PID", "NAME");
+
+ engine_w = (con_w - len) / engines->num_engines;
+ for (i = 0; i < engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(engines, i);
+ unsigned int name_len =
+ strlen(engine->display_name);
+ unsigned int pad = (engine_w - name_len) / 2;
+
+
+ n_spaces(pad);
+ printf("%s", engine->display_name);
+ n_spaces(engine_w - pad - name_len);
+ len += pad + name_len + (engine_w - pad -
+ name_len);
+ }
+ n_spaces(con_w - len);
+ printf("\033[0m\n");
+ }
+
+ for_each_client(c, i) {
+ if (lines++ > con_h)
+ break;
+
+ assert(c->status != PROBE);
+ if (c->status != ALIVE || c->samples < 2)
+ break;
+
+ printf("%5u%16s ", c->pid, c->name);
+
+ for (j = 0; j < engines->num_engines; j++) {
+ double pct;
+
+ pct = (double)c->val[j] / period_us / 1e3 * 100;
+
+ print_percentage_bar(pct, engine_w);
+ }
+
+ putchar('\n');
+ }
+
usleep(period_us);
}
--
2.17.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] [RFC i-g-t 6/6] intel-gpu-top: Support for client stats
@ 2018-10-03 12:07 ` Tvrtko Ursulin
0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-10-03 12:07 UTC (permalink / raw)
To: igt-dev; +Cc: Intel-gfx, Tvrtko Ursulin
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
tools/intel_gpu_top.c | 360 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 351 insertions(+), 9 deletions(-)
diff --git a/tools/intel_gpu_top.c b/tools/intel_gpu_top.c
index 8990ef17b771..02c5a7d77614 100644
--- a/tools/intel_gpu_top.c
+++ b/tools/intel_gpu_top.c
@@ -20,9 +20,6 @@
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
* DEALINGS IN THE SOFTWARE.
*
- * Authors:
- * Eric Anholt <eric@anholt.net>
- * Eugeni Dodonov <eugeni.dodonov@intel.com>
*/
#include <stdio.h>
@@ -41,6 +38,8 @@
#include <errno.h>
#include <math.h>
#include <locale.h>
+#include <limits.h>
+#include <signal.h>
#include "igt_perf.h"
@@ -701,8 +700,300 @@ static void pmu_sample(struct engines *engines)
}
}
+enum client_status {
+ FREE = 0, /* mbz */
+ ALIVE,
+ PROBE
+};
+
+struct client {
+ enum client_status status;
+ unsigned int id;
+ unsigned int pid;
+ char name[128];
+ unsigned int samples;
+ unsigned long total;
+ struct engines *engines;
+ unsigned long *val;
+ uint64_t *last;
+};
+
+#define SYSFS_ENABLE "/sys/class/drm/card0/clients/enable_stats"
+
+bool __stats_enabled;
+
+static int __set_stats(bool val)
+{
+ int fd, ret;
+
+ fd = open(SYSFS_ENABLE, O_WRONLY);
+ if (fd < 0)
+ return -errno;
+
+ ret = write(fd, val ? "1" : "0", 2);
+ if (ret < 0)
+ return -errno;
+ else if (ret < 2)
+ return 1;
+
+ close(fd);
+
+ return 0;
+}
+
+static void __restore_stats(void)
+{
+ int ret;
+
+ if (__stats_enabled)
+ return;
+
+ ret = __set_stats(false);
+ if (ret)
+ fprintf(stderr, "Failed to disable per-client stats! (%d)\n",
+ ret);
+}
+
+static void __restore_stats_signal(int sig)
+{
+ exit(0);
+}
+
+static void enable_stats(void)
+{
+ int fd, ret;
+
+ fd = open(SYSFS_ENABLE, O_RDONLY);
+ if (fd < 0)
+ return;
+
+ close(fd);
+
+ __stats_enabled = filename_to_u64(SYSFS_ENABLE, 10);
+ if (__stats_enabled)
+ return;
+
+ ret = __set_stats(true);
+ if (!ret) {
+ if (atexit(__restore_stats))
+ fprintf(stderr, "Failed to register exit handler!");
+
+ if (signal(SIGINT, __restore_stats_signal))
+ fprintf(stderr, "Failed to register signal handler!");
+ } else {
+ fprintf(stderr, "Failed to enable per-client stats! (%d)\n",
+ ret);
+ }
+}
+
+#define SYSFS_CLIENTS "/sys/class/drm/card0/clients/"
+
+static struct client *clients;
+static unsigned int num_clients;
+
+#define for_each_client(c, tmp) \
+ for (tmp = num_clients, c = clients; tmp > 0; tmp--, c++)
+
+static uint64_t read_client_busy(unsigned int id, const char *engine)
+{
+ char buf[256];
+ ssize_t ret;
+
+ ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/busy/%s",
+ id, engine);
+ assert(ret > 0 && ret < sizeof(buf));
+ if (ret <= 0 || ret == sizeof(buf))
+ return 0;
+
+ return filename_to_u64(buf, 10);
+}
+
+static struct client *find_client(enum client_status status, unsigned int id)
+{
+ struct client *c;
+ unsigned int tmp;
+
+ for_each_client(c, tmp) {
+ if ((status == FREE && c->status == FREE) ||
+ (status == c->status && c->id == id))
+ return c;
+ }
+
+ return NULL;
+}
+
+static void update_client(struct client *c, unsigned int pid, char *name)
+{
+ uint64_t val[c->engines->num_engines];
+ unsigned int i;
+
+ if (c->pid != pid)
+ c->pid = pid;
+
+ if (strncmp(c->name, name, sizeof(c->name)))
+ strncpy(c->name, name, sizeof(c->name));
+
+ for (i = 0; i < c->engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(c->engines, i);
+
+ val[i] = read_client_busy(c->id, engine->name);
+ }
+
+ c->total = 0;
+
+ for (i = 0; i < c->engines->num_engines; i++) {
+ assert(val[i] >= c->last[i]);
+ c->val[i] = val[i] - c->last[i];
+ c->total += c->val[i];
+ c->last[i] = val[i];
+ }
+
+ c->samples++;
+ c->status = ALIVE;
+}
+
+static void
+add_client(unsigned int id, unsigned int pid, char *name,
+ struct engines *engines)
+{
+ struct client *c;
+
+ assert(!find_client(ALIVE, id));
+
+ c = find_client(FREE, 0);
+ if (!c) {
+ unsigned int idx = num_clients;
+
+ num_clients += (num_clients + 2) / 2;
+ clients = realloc(clients, num_clients * sizeof(*c));
+ assert(clients);
+ c = &clients[idx];
+ memset(c, 0, (num_clients - idx) * sizeof(*c));
+ }
+
+ c->id = id;
+ c->engines = engines;
+ c->val = calloc(engines->num_engines, sizeof(c->val));
+ c->last = calloc(engines->num_engines, sizeof(c->last));
+ assert(c->val && c->last);
+
+ update_client(c, pid, name);
+}
+
+static void free_client(struct client *c)
+{
+ free(c->val);
+ free(c->last);
+ memset(c, 0, sizeof(*c));
+}
+
+static char *read_client_sysfs(unsigned int id, const char *field)
+{
+ char buf[256];
+ ssize_t ret;
+
+ ret = snprintf(buf, sizeof(buf), SYSFS_CLIENTS "/%u/%s", id, field);
+ assert(ret > 0 && ret < sizeof(buf));
+ if (ret <= 0 || ret == sizeof(buf))
+ return NULL;
+
+ ret = filename_to_buf(buf, buf, sizeof(buf));
+ assert(ret == 0);
+ if (ret)
+ return NULL;
+
+ return strdup(buf);
+}
+
+static void scan(struct engines *engines)
+{
+ struct dirent *dent;
+ struct client *c;
+ char *pid, *name;
+ unsigned int tmp;
+ unsigned int id;
+ DIR *d;
+
+ for_each_client(c, tmp) {
+ if (c->status == ALIVE)
+ c->status = PROBE;
+ }
+
+ d = opendir(SYSFS_CLIENTS);
+ if (!d)
+ return;
+
+ while ((dent = readdir(d)) != NULL) {
+ if (dent->d_type != DT_DIR)
+ continue;
+ if (!isdigit(dent->d_name[0]))
+ continue;
+
+ id = atoi(dent->d_name);
+
+ name = read_client_sysfs(id, "name");
+ assert(name);
+ if (!name)
+ continue;
+
+ pid = read_client_sysfs(id, "pid");
+ assert(pid);
+ if (!pid) {
+ free(name);
+ continue;
+ }
+
+ c = find_client(PROBE, id);
+ if (c) {
+ update_client(c, atoi(pid), name);
+ continue;
+ }
+
+ add_client(id, atoi(pid), name, engines);
+
+ free(name);
+ free(pid);
+ }
+
+ closedir(d);
+
+ for_each_client(c, tmp) {
+ if (c->status == PROBE)
+ free_client(c);
+ }
+}
+
+static int cmp(const void *_a, const void *_b)
+{
+ const struct client *a = _a;
+ const struct client *b = _b;
+ long tot_a = a->total;
+ long tot_b = b->total;
+
+ tot_a *= a->status == ALIVE && a->samples > 1;
+ tot_b *= b->status == ALIVE && b->samples > 1;
+
+ tot_b -= tot_a;
+
+ if (!tot_b)
+ return (int)b->id - a->id;
+
+ while (tot_b > INT_MAX || tot_b < INT_MIN)
+ tot_b /= 2;
+
+ return tot_b;
+}
+
static const char *bars[] = { " ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█" };
+static void n_spaces(const unsigned int n)
+{
+ unsigned int i;
+
+ for (i = 0; i < n; i++)
+ putchar(' ');
+}
+
static void
print_percentage_bar(double percent, int max_len)
{
@@ -716,8 +1007,10 @@ print_percentage_bar(double percent, int max_len)
if (i)
printf("%s", bars[i]);
- for (i = 0; i < (max_len - 2 - (bar_len + 7) / 8); i++)
- putchar(' ');
+ bar_len = max_len - 2 - (bar_len + 7) / 8;
+ if (bar_len > max_len)
+ bar_len = max_len;
+ n_spaces(bar_len);
putchar('|');
}
@@ -784,12 +1077,15 @@ int main(int argc, char **argv)
return 1;
}
+ enable_stats();
+
/* Load average setup. */
period = (double)period_us / 1e6;
for (i = 0; i < NUM_LOADS; i++)
engines->load_exp[i] = exp(-period / load_period[i]);
pmu_sample(engines);
+ scan(engines);
for (;;) {
double t, qd = 0;
@@ -802,7 +1098,8 @@ int main(int argc, char **argv)
char reads[BUFSZ];
char writes[BUFSZ];
struct winsize ws;
- unsigned int j;
+ unsigned int len, engine_w, j;
+ struct client *c;
int lines = 0;
/* Update terminal size. */
@@ -812,9 +1109,10 @@ int main(int argc, char **argv)
}
pmu_sample(engines);
- t = (double)(engines->ts.cur - engines->ts.prev) / 1e9;
+ scan(engines);
+ qsort(clients, num_clients, sizeof(*c), cmp);
- printf("\033[H\033[J");
+ t = (double)(engines->ts.cur - engines->ts.prev) / 1e9;
pmu_calc(&engines->freq_req, freq, BUFSZ, 4, 0, 1.0, t, 1);
pmu_calc(&engines->freq_act, fact, BUFSZ, 4, 0, 1.0, t, 1);
@@ -860,6 +1158,8 @@ int main(int argc, char **argv)
qd);
}
+ printf("\033[H\033[J");
+
if (lines++ < con_h)
printf("intel-gpu-top - load avg %5.2f, %5.2f, %5.2f; %s/%s MHz; %s%% RC6; %s %s; %s irqs/s\n",
engines->load_avg[0],
@@ -903,7 +1203,6 @@ int main(int argc, char **argv)
struct engine *engine = engine_ptr(engines, i);
unsigned int max_w = con_w - 1;
char qdbuf[NUM_LOADS][BUFSZ];
- unsigned int len;
char sema[BUFSZ];
char wait[BUFSZ];
char busy[BUFSZ];
@@ -941,6 +1240,49 @@ int main(int argc, char **argv)
if (lines++ < con_h)
printf("\n");
+ if (lines++ < con_h) {
+ printf("\033[7m");
+ len = printf("%5s%16s", "PID", "NAME");
+
+ engine_w = (con_w - len) / engines->num_engines;
+ for (i = 0; i < engines->num_engines; i++) {
+ struct engine *engine = engine_ptr(engines, i);
+ unsigned int name_len =
+ strlen(engine->display_name);
+ unsigned int pad = (engine_w - name_len) / 2;
+
+
+ n_spaces(pad);
+ printf("%s", engine->display_name);
+ n_spaces(engine_w - pad - name_len);
+ len += pad + name_len + (engine_w - pad -
+ name_len);
+ }
+ n_spaces(con_w - len);
+ printf("\033[0m\n");
+ }
+
+ for_each_client(c, i) {
+ if (lines++ > con_h)
+ break;
+
+ assert(c->status != PROBE);
+ if (c->status != ALIVE || c->samples < 2)
+ break;
+
+ printf("%5u%16s ", c->pid, c->name);
+
+ for (j = 0; j < engines->num_engines; j++) {
+ double pct;
+
+ pct = (double)c->val[j] / period_us / 1e3 * 100;
+
+ print_percentage_bar(pct, engine_w);
+ }
+
+ putchar('\n');
+ }
+
usleep(period_us);
}
--
2.17.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [igt-dev] ✓ Fi.CI.BAT: success for 21st century intel_gpu_top & Co.
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
` (6 preceding siblings ...)
(?)
@ 2018-10-03 16:57 ` Patchwork
-1 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-10-03 16:57 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: 21st century intel_gpu_top & Co.
URL : https://patchwork.freedesktop.org/series/50499/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4918 -> IGTPW_1901 =
== Summary - SUCCESS ==
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/50499/revisions/1/mbox/
== Known issues ==
Here are the changes found in IGTPW_1901 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@drv_module_reload@basic-reload-inject:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#106725, fdo#106248) +1
igt@gem_wait@basic-busy-all:
fi-skl-gvtdvm: PASS -> DMESG-WARN (fdo#105541)
igt@kms_cursor_legacy@basic-flip-before-cursor-atomic:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#105719)
igt@kms_flip@basic-flip-vs-modeset:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#106000)
igt@kms_flip@basic-flip-vs-wf_vblank:
fi-glk-j4005: PASS -> FAIL (fdo#100368)
igt@kms_flip@basic-plain-flip:
fi-glk-j4005: PASS -> DMESG-WARN (fdo#106097)
igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
fi-byt-clapper: PASS -> FAIL (fdo#103191, fdo#107362)
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#105541 https://bugs.freedesktop.org/show_bug.cgi?id=105541
fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
fdo#106248 https://bugs.freedesktop.org/show_bug.cgi?id=106248
fdo#106725 https://bugs.freedesktop.org/show_bug.cgi?id=106725
fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
== Participating hosts (48 -> 43) ==
Additional (1): fi-cfl-guc
Missing (6): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-u2 fi-snb-2520m fi-bdw-samus
== Build changes ==
* IGT: IGT_4662 -> IGTPW_1901
CI_DRM_4918: f595aba3a6e2f6972bb158eb8434b58d22d0e5f0 @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_1901: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1901/
IGT_4662: ebf6a1dd1795e2f014ff3c47fe2eb4d5255845bd @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
== Testlist changes ==
+igt@perf_pmu@init-queued-bcs0
+igt@perf_pmu@init-queued-rcs0
+igt@perf_pmu@init-queued-vcs0
+igt@perf_pmu@init-queued-vcs1
+igt@perf_pmu@init-queued-vecs0
+igt@perf_pmu@init-runnable-bcs0
+igt@perf_pmu@init-runnable-rcs0
+igt@perf_pmu@init-runnable-vcs0
+igt@perf_pmu@init-runnable-vcs1
+igt@perf_pmu@init-runnable-vecs0
+igt@perf_pmu@init-running-bcs0
+igt@perf_pmu@init-running-rcs0
+igt@perf_pmu@init-running-vcs0
+igt@perf_pmu@init-running-vcs1
+igt@perf_pmu@init-running-vecs0
+igt@perf_pmu@queued-bcs0
+igt@perf_pmu@queued-contexts-bcs0
+igt@perf_pmu@queued-contexts-rcs0
+igt@perf_pmu@queued-contexts-vcs0
+igt@perf_pmu@queued-contexts-vcs1
+igt@perf_pmu@queued-contexts-vecs0
+igt@perf_pmu@queued-rcs0
+igt@perf_pmu@queued-vcs0
+igt@perf_pmu@queued-vcs1
+igt@perf_pmu@queued-vecs0
+igt@perf_pmu@render-node-queued-bcs0
+igt@perf_pmu@render-node-queued-contexts-bcs0
+igt@perf_pmu@render-node-queued-contexts-rcs0
+igt@perf_pmu@render-node-queued-contexts-vcs0
+igt@perf_pmu@render-node-queued-contexts-vcs1
+igt@perf_pmu@render-node-queued-contexts-vecs0
+igt@perf_pmu@render-node-queued-rcs0
+igt@perf_pmu@render-node-queued-vcs0
+igt@perf_pmu@render-node-queued-vcs1
+igt@perf_pmu@render-node-queued-vecs0
+igt@perf_pmu@render-node-runnable-bcs0
+igt@perf_pmu@render-node-runnable-rcs0
+igt@perf_pmu@render-node-runnable-vcs0
+igt@perf_pmu@render-node-runnable-vcs1
+igt@perf_pmu@render-node-runnable-vecs0
+igt@perf_pmu@render-node-running-bcs0
+igt@perf_pmu@render-node-running-rcs0
+igt@perf_pmu@render-node-running-vcs0
+igt@perf_pmu@render-node-running-vcs1
+igt@perf_pmu@render-node-running-vecs0
+igt@perf_pmu@runnable-bcs0
+igt@perf_pmu@runnable-rcs0
+igt@perf_pmu@runnable-vcs0
+igt@perf_pmu@runnable-vcs1
+igt@perf_pmu@runnable-vecs0
+igt@perf_pmu@running-bcs0
+igt@perf_pmu@running-rcs0
+igt@perf_pmu@running-vcs0
+igt@perf_pmu@running-vcs1
+igt@perf_pmu@running-vecs0
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1901/issues.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 16+ messages in thread
* [igt-dev] ✗ Fi.CI.IGT: failure for 21st century intel_gpu_top & Co.
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
` (7 preceding siblings ...)
(?)
@ 2018-10-04 7:40 ` Patchwork
-1 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-10-04 7:40 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: igt-dev
== Series Details ==
Series: 21st century intel_gpu_top & Co.
URL : https://patchwork.freedesktop.org/series/50499/
State : failure
== Summary ==
= CI Bug Log - changes from IGT_4662_full -> IGTPW_1901_full =
== Summary - FAILURE ==
Serious unknown changes coming with IGTPW_1901_full absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in IGTPW_1901_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://patchwork.freedesktop.org/api/1.0/series/50499/revisions/1/mbox/
== Possible new issues ==
Here are the unknown changes that may have been introduced in IGTPW_1901_full:
=== IGT changes ===
==== Possible regressions ====
igt@perf_pmu@queued-contexts-bcs0:
shard-hsw: NOTRUN -> FAIL +43
igt@perf_pmu@render-node-runnable-vcs0:
shard-apl: NOTRUN -> FAIL +43
igt@perf_pmu@runnable-bcs0:
shard-snb: NOTRUN -> FAIL +24
igt@perf_pmu@running-vcs1:
shard-kbl: NOTRUN -> FAIL +54
== Known issues ==
Here are the changes found in IGTPW_1901_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@drv_suspend@shrink:
shard-snb: PASS -> INCOMPLETE (fdo#106886, fdo#105411)
shard-kbl: PASS -> INCOMPLETE (fdo#106886, fdo#103665)
igt@kms_available_modes_crc@available_mode_test_crc:
shard-apl: PASS -> FAIL (fdo#106641)
igt@kms_busy@extended-modeset-hang-newfb-with-reset-render-a:
shard-snb: NOTRUN -> DMESG-WARN (fdo#107956)
igt@kms_color@pipe-a-ctm-max:
shard-apl: PASS -> FAIL (fdo#108147)
igt@kms_cursor_crc@cursor-128x128-onscreen:
shard-kbl: PASS -> FAIL (fdo#103232)
shard-apl: PASS -> FAIL (fdo#103232)
igt@kms_cursor_crc@cursor-64x64-suspend:
shard-apl: PASS -> FAIL (fdo#103191, fdo#103232)
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-wc:
shard-apl: PASS -> FAIL (fdo#103167) +1
igt@kms_frontbuffer_tracking@fbc-1p-rte:
shard-apl: PASS -> DMESG-WARN (fdo#103558, fdo#108131)
igt@kms_plane_multiple@atomic-pipe-c-tiling-x:
shard-kbl: PASS -> FAIL (fdo#103166) +1
igt@kms_plane_multiple@atomic-pipe-c-tiling-y:
shard-apl: PASS -> FAIL (fdo#103166) +1
igt@kms_setmode@basic:
shard-apl: PASS -> FAIL (fdo#99912)
shard-kbl: PASS -> FAIL (fdo#99912)
igt@kms_vblank@pipe-c-ts-continuation-idle-hang:
shard-apl: PASS -> DMESG-WARN (fdo#105602, fdo#103558) +7
==== Possible fixes ====
igt@gem_pwrite@big-gtt-fbr:
shard-snb: INCOMPLETE (fdo#105411) -> PASS
igt@kms_available_modes_crc@available_mode_test_crc:
shard-snb: FAIL (fdo#106641) -> PASS
igt@kms_color@pipe-a-degamma:
shard-apl: FAIL (fdo#108145) -> PASS
igt@kms_cursor_crc@cursor-128x128-suspend:
shard-apl: FAIL (fdo#103191, fdo#103232) -> PASS
shard-kbl: FAIL (fdo#103191, fdo#103232) -> PASS
igt@kms_cursor_crc@cursor-256x85-random:
shard-apl: FAIL (fdo#103232) -> PASS +4
igt@kms_cursor_crc@cursor-64x21-onscreen:
shard-kbl: FAIL (fdo#103232) -> PASS +1
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-pwrite:
shard-apl: FAIL (fdo#103167) -> PASS +1
shard-kbl: FAIL (fdo#103167) -> PASS
igt@kms_plane_multiple@atomic-pipe-a-tiling-y:
shard-apl: FAIL (fdo#103166) -> PASS +1
igt@kms_rotation_crc@sprite-rotation-180:
shard-snb: FAIL (fdo#103925) -> PASS
igt@pm_rpm@system-suspend-modeset:
shard-kbl: INCOMPLETE (fdo#107807, fdo#103665) -> PASS
fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103232 https://bugs.freedesktop.org/show_bug.cgi?id=103232
fdo#103558 https://bugs.freedesktop.org/show_bug.cgi?id=103558
fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
fdo#106641 https://bugs.freedesktop.org/show_bug.cgi?id=106641
fdo#106886 https://bugs.freedesktop.org/show_bug.cgi?id=106886
fdo#107807 https://bugs.freedesktop.org/show_bug.cgi?id=107807
fdo#107956 https://bugs.freedesktop.org/show_bug.cgi?id=107956
fdo#108131 https://bugs.freedesktop.org/show_bug.cgi?id=108131
fdo#108145 https://bugs.freedesktop.org/show_bug.cgi?id=108145
fdo#108147 https://bugs.freedesktop.org/show_bug.cgi?id=108147
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
== Participating hosts (6 -> 4) ==
Missing (2): shard-skl shard-glk
== Build changes ==
* IGT: IGT_4662 -> IGTPW_1901
* Linux: CI_DRM_4915 -> CI_DRM_4918
CI_DRM_4915: 26e7a7d954a9c28b97af8ca7813f430fd9117232 @ git://anongit.freedesktop.org/gfx-ci/linux
CI_DRM_4918: f595aba3a6e2f6972bb158eb8434b58d22d0e5f0 @ git://anongit.freedesktop.org/gfx-ci/linux
IGTPW_1901: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1901/
IGT_4662: ebf6a1dd1795e2f014ff3c47fe2eb4d5255845bd @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1901/shards.html
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2018-10-04 7:40 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-03 12:07 [RFC i-g-t 0/6] 21st century intel_gpu_top & Co Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 1/6] include: DRM uAPI headers update Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 2/6] intel-gpu-overlay: Add engine queue stats Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 3/6] intel-gpu-overlay: Show 1s, 30s and 15m GPU load Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 4/6] tests/perf_pmu: Add tests for engine queued/runnable/running stats Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 5/6] intel-gpu-top: Add queue depths and load average Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 12:07 ` [RFC i-g-t 6/6] intel-gpu-top: Support for client stats Tvrtko Ursulin
2018-10-03 12:07 ` [igt-dev] " Tvrtko Ursulin
2018-10-03 16:57 ` [igt-dev] ✓ Fi.CI.BAT: success for 21st century intel_gpu_top & Co Patchwork
2018-10-04 7:40 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.